Note
Go to the end to download the full example code.
Analyzing Events Using ctapipe#
Initially presented @ LST Analysis Bootcamp in Padova, 26.11.2018 by Maximilian Nöthe (@maxnoe) & Kai A. Brügge (@mackaiver)
14 import tempfile
15 import timeit
16 from copy import deepcopy
17
18 import astropy.units as u
19 import matplotlib.pyplot as plt
20 import numpy as np
21 from astropy.coordinates import AltAz, angular_separation
22 from matplotlib.colors import ListedColormap
23 from scipy.sparse.csgraph import connected_components
24 from traitlets.config import Config
25
26 from ctapipe.calib import CameraCalibrator
27 from ctapipe.image import (
28     ImageProcessor,
29     camera_to_shower_coordinates,
30     concentration_parameters,
31     hillas_parameters,
32     leakage_parameters,
33     number_of_islands,
34     timing_parameters,
35     toymodel,
36 )
37 from ctapipe.image.cleaning import tailcuts_clean
38 from ctapipe.io import DataWriter, EventSource, TableLoader
39 from ctapipe.reco import ShowerProcessor
40 from ctapipe.utils.datasets import get_dataset_path
41 from ctapipe.visualization import ArrayDisplay, CameraDisplay
42
43 # %matplotlib inline
46 plt.rcParams["figure.figsize"] = (12, 8)
47 plt.rcParams["font.size"] = 14
48 plt.rcParams["figure.figsize"]
[12.0, 8.0]
General Information#
Design#
DL0 → DL3 analysis
Currently some R0 → DL2 code to be able to analyze simtel files
ctapipe is built upon the Scientific Python Stack, core dependencies are
numpy
scipy
astropy
numba
Development#
ctapipe is developed as Open Source Software (BSD 3-Clause License) at cta-observatory/ctapipe
We use the “Github-Workflow”:
Few people (e.g. @kosack, @maxnoe) have write access to the main repository
Contributors fork the main repository and work on branches
Pull Requests are merged after Code Review and automatic execution of the test suite
Early development stage ⇒ backwards-incompatible API changes might and will happen
What’s there?#
Reading simtel simulation files
Simple calibration, cleaning and feature extraction functions
Camera and Array plotting
Coordinate frames and transformations
Stereo-reconstruction using line intersections
What’s still missing?#
Good integration with machine learning techniques
IRF calculation
Documentation, e.g. formal definitions of coordinate frames
What can you do?#
Report issues
Hard to get started? Tell us where you are stuck
Tell user stories
Missing features
Start contributing
ctapipe needs more workpower
Implement new reconstruction features
A simple hillas analysis#
Reading in simtel files#
146 input_url = get_dataset_path("gamma_prod5.simtel.zst")
147
148 # EventSource() automatically detects what kind of file we are giving it,
149 # if already supported by ctapipe
150 source = EventSource(input_url, max_events=5)
151
152 print(type(source))
<class 'ctapipe.io.simteleventsource.SimTelEventSource'>
155 for event in source:
156     print(
157         "Id: {}, E = {:1.3f}, Telescopes: {}".format(
158             event.count, event.simulation.shower.energy, len(event.r0.tel)
159         )
160     )
Id: 0, E = 0.073 TeV, Telescopes: 3
Id: 1, E = 1.745 TeV, Telescopes: 14
Id: 2, E = 1.745 TeV, Telescopes: 4
Id: 3, E = 1.745 TeV, Telescopes: 20
Id: 4, E = 1.745 TeV, Telescopes: 31
Each event is a DataContainer holding several Fields of data,
which can be containers or just numbers. Let’s look a one event:
168 event
ctapipe.containers.ArrayEventContainer:
                       index.*: event indexing information with default None
                          r0.*: Raw Data with default None
                          r1.*: R1 Calibrated Data with default None
                         dl0.*: DL0 Data Volume Reduced Data with default None
                         dl1.*: DL1 Calibrated image with default None
                         dl2.*: DL2 reconstruction info with default None
                  simulation.*: Simulated Event Information with default None
                                with type <class
                                'ctapipe.containers.SimulatedEventContainer'>
                     trigger.*: central trigger information with default None
                         count: number of events processed with default 0
                  monitoring.*: container for event-wise monitoring data (MON)
                                with default None
                        muon.*: Container for muon analysis results with default
                                None
(CameraDescription(name=CHEC, geometry=CHEC, readout=CHEC), CameraDescription(name=FlashCam, geometry=FlashCam, readout=FlashCam), CameraDescription(name=LSTCam, geometry=LSTCam, readout=LSTCam), CameraDescription(name=NectarCam, geometry=NectarCam, readout=NectarCam))
174 len(event.r0.tel), len(event.r1.tel)
(31, 31)
Data calibration#
The CameraCalibrator calibrates the event (obtaining the dl1
images).
186 calibrator = CameraCalibrator(subarray=source.subarray)
189 calibrator(event)
Event displays#
Let’s use ctapipe’s plotting facilities to plot the telescope images
199 event.dl1.tel.keys()
dict_keys([3, 4, 8, 9, 14, 15, 23, 24, 25, 29, 32, 36, 42, 46, 52, 56, 103, 104, 109, 110, 118, 119, 120, 126, 127, 129, 130, 137, 139, 143, 161])
202 tel_id = 130
205 geometry = source.subarray.tel[tel_id].camera.geometry
206 dl1 = event.dl1.tel[tel_id]
207
208 geometry, dl1
(CameraGeometry(name='NectarCam', pix_type=PixelShape.HEXAGON, npix=1855, cam_rot=0.000 deg, pix_rot=40.893 deg, frame=<CameraFrame Frame (focal_length=16.445049285888672 m, rotation=0.0 rad, telescope_pointing=None, obstime=None, location=None)>), ctapipe.containers.DL1CameraContainer:
                         image: Numpy array of camera image, after waveform
                                extraction.Shape: (n_pixel) if n_channels is 1
                                or data is gain selectedelse: (n_channels,
                                n_pixel) with default None
                     peak_time: Numpy array containing position of the peak of
                                the pulse as determined by the extractor.Shape:
                                (n_pixel) if n_channels is 1 or data is gain
                                selectedelse: (n_channels, n_pixel) with default
                                None
                    image_mask: Boolean numpy array where True means the pixel
                                has passed cleaning. Shape: (n_pixel, ) with
                                default None as a 1-D array with dtype bool
                      is_valid: True if image extraction succeeded, False if
                                failed or in the case of TwoPass methods, that
                                the first pass only was returned. with default
                                False
                    parameters: Image parameters with default None with type
                                <class
                                'ctapipe.containers.ImageParametersContainer'>)
211 dl1.image
array([-0.3856395 ,  1.2851689 ,  1.403967  , ...,  2.6847053 ,
        0.04941979, -1.0566036 ], shape=(1855,), dtype=float32)
215 display = CameraDisplay(geometry)
216
217 # right now, there might be one image per gain channel.
218 # This will change as soon as
219 display.image = dl1.image
220 display.add_colorbar()

Image Cleaning#
229 # unoptimized cleaning levels
230 cleaning_level = {
231     "CHEC": (2, 4, 2),
232     "LSTCam": (3.5, 7, 2),
233     "FlashCam": (3.5, 7, 2),
234     "NectarCam": (4, 8, 2),
235 }
238 boundary, picture, min_neighbors = cleaning_level[geometry.name]
239
240 clean = tailcuts_clean(
241     geometry,
242     dl1.image,
243     boundary_thresh=boundary,
244     picture_thresh=picture,
245     min_number_picture_neighbors=min_neighbors,
246 )
249 fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 5))
250
251 d1 = CameraDisplay(geometry, ax=ax1)
252 d2 = CameraDisplay(geometry, ax=ax2)
253
254 ax1.set_title("Image")
255 d1.image = dl1.image
256 d1.add_colorbar(ax=ax1)
257
258 ax2.set_title("Pulse Time")
259 d2.image = dl1.peak_time - np.average(dl1.peak_time, weights=dl1.image)
260 d2.cmap = "RdBu_r"
261 d2.add_colorbar(ax=ax2)
262 d2.set_limits_minmax(-20, 20)
263
264 d1.highlight_pixels(clean, color="red", linewidth=1)

Image Parameters#
{'intensity': np.float64(303.9444303512573),
 'kurtosis': np.float64(2.4649477808816265),
 'length': <Quantity 0.14667835 m>,
 'length_uncertainty': <Quantity 0.00509155 m>,
 'phi': <Angle -1.11344351 rad>,
 'psi': <Angle -1.12308339 rad>,
 'psi_uncertainty': <Angle 0.01927114 rad>,
 'r': <Quantity 0.71788425 m>,
 'skewness': np.float64(0.360906379879541),
 'transverse_cog_uncertainty': <Quantity 0.00261556 m>,
 'width': <Quantity 0.02584731 m>,
 'width_uncertainty': <Quantity 0.00111159 m>,
 'x': <Quantity 0.31699941 m>,
 'y': <Quantity -0.64410339 m>}
278 display = CameraDisplay(geometry)
279
280 # set "unclean" pixels to 0
281 cleaned = dl1.image.copy()
282 cleaned[~clean] = 0.0
283
284 display.image = cleaned
285 display.add_colorbar()
286
287 display.overlay_moments(hillas, color="xkcd:red")

290 timing = timing_parameters(geometry, dl1.image, dl1.peak_time, hillas, clean)
291
292 print(timing)
{'deviation': 0.792151827050336,
 'intercept': np.float64(21.81257256241495),
 'slope': <Quantity 31.32320575 1 / m>}
295 long, trans = camera_to_shower_coordinates(
296     geometry.pix_x, geometry.pix_y, hillas.x, hillas.y, hillas.psi
297 )
298
299 plt.plot(long[clean], dl1.peak_time[clean], "o")
300 plt.plot(long[clean], timing.slope * long[clean] + timing.intercept)

[<matplotlib.lines.Line2D object at 0x79f09fcb8ca0>]
{'intensity_width_1': np.float32(0.0),
 'intensity_width_2': np.float32(0.0),
 'pixels_width_1': 0.0,
 'pixels_width_2': 0.0}
307 disp = CameraDisplay(geometry)
308 disp.image = dl1.image
309 disp.highlight_pixels(geometry.get_border_pixel_mask(1), linewidth=2, color="xkcd:red")
2
{'cog': np.float64(0.2571728392782083),
 'core': np.float64(0.4014042974365615),
 'pixel': np.float64(0.09514053791448862)}
Putting it all together / Stereo reconstruction#
All these steps are now unified in several components configurable through the config system, mainly:
CameraCalibrator for DL0 → DL1 (Images)
ImageProcessor for DL1 (Images) → DL1 (Parameters)
ShowerProcessor for stereo reconstruction of the shower geometry
DataWriter for writing data into HDF5
A command line tool doing these steps and writing out data in HDF5
format is available as ctapipe-process
338 image_processor_config = Config(
339     {
340         "ImageProcessor": {
341             "image_cleaner_type": "TailcutsImageCleaner",
342             "TailcutsImageCleaner": {
343                 "picture_threshold_pe": [
344                     ("type", "LST_LST_LSTCam", 7.5),
345                     ("type", "MST_MST_FlashCam", 8),
346                     ("type", "MST_MST_NectarCam", 8),
347                     ("type", "SST_ASTRI_CHEC", 7),
348                 ],
349                 "boundary_threshold_pe": [
350                     ("type", "LST_LST_LSTCam", 5),
351                     ("type", "MST_MST_FlashCam", 4),
352                     ("type", "MST_MST_NectarCam", 4),
353                     ("type", "SST_ASTRI_CHEC", 4),
354                 ],
355             },
356         }
357     }
358 )
359
360 input_url = get_dataset_path("gamma_prod5.simtel.zst")
361 source = EventSource(input_url)
362
363 calibrator = CameraCalibrator(subarray=source.subarray)
364 image_processor = ImageProcessor(
365     subarray=source.subarray, config=image_processor_config
366 )
367 shower_processor = ShowerProcessor(subarray=source.subarray)
368 horizon_frame = AltAz()
369
370 f = tempfile.NamedTemporaryFile(suffix=".hdf5")
371
372 with DataWriter(source, output_path=f.name, overwrite=True, write_dl2=True) as writer:
373     for event in source:
374         energy = event.simulation.shower.energy
375         n_telescopes_r1 = len(event.r1.tel)
376         event_id = event.index.event_id
377         print(f"Id: {event_id}, E = {energy:1.3f}, Telescopes (R1): {n_telescopes_r1}")
378
379         calibrator(event)
380         image_processor(event)
381         shower_processor(event)
382
383         stereo = event.dl2.stereo.geometry["HillasReconstructor"]
384         if stereo.is_valid:
385             print("  Alt: {:.2f}°".format(stereo.alt.deg))
386             print("  Az: {:.2f}°".format(stereo.az.deg))
387             print("  Hmax: {:.0f}".format(stereo.h_max))
388             print("  CoreX: {:.1f}".format(stereo.core_x))
389             print("  CoreY: {:.1f}".format(stereo.core_y))
390             print("  Multiplicity: {:d}".format(len(stereo.telescopes)))
391
392         # save a nice event for plotting later
393         if event.count == 3:
394             plotting_event = deepcopy(event)
395
396         writer(event)
Overwriting /tmp/tmp_n_6szrv.hdf5
Id: 4009, E = 0.073 TeV, Telescopes (R1): 3
  Alt: 70.18°
  Az: 0.41°
  Hmax: 14066 m
  CoreX: -113.2 m
  CoreY: 366.5 m
  Multiplicity: 3
Id: 5101, E = 1.745 TeV, Telescopes (R1): 14
  Alt: 70.00°
  Az: 359.95°
  Hmax: 10329 m
  CoreX: -739.5 m
  CoreY: -268.4 m
  Multiplicity: 10
Id: 5103, E = 1.745 TeV, Telescopes (R1): 4
  Alt: 70.21°
  Az: 0.50°
  Hmax: 11526 m
  CoreX: 832.1 m
  CoreY: 945.4 m
  Multiplicity: 2
Id: 5104, E = 1.745 TeV, Telescopes (R1): 20
  Alt: 70.16°
  Az: 0.07°
  Hmax: 10643 m
  CoreX: -518.3 m
  CoreY: -451.5 m
  Multiplicity: 15
Id: 5105, E = 1.745 TeV, Telescopes (R1): 31
  Alt: 70.02°
  Az: 0.13°
  Hmax: 10449 m
  CoreX: -327.7 m
  CoreY: 408.8 m
  Multiplicity: 25
Id: 5108, E = 1.745 TeV, Telescopes (R1): 3
  Alt: 69.95°
  Az: 359.79°
  Hmax: 10409 m
  CoreX: -1304.5 m
  CoreY: 205.6 m
  Multiplicity: 2
Id: 5109, E = 1.745 TeV, Telescopes (R1): 8
  Alt: 69.62°
  Az: 359.93°
  Hmax: 10886 m
  CoreX: -1067.6 m
  CoreY: 480.5 m
  Multiplicity: 3
400 loader = TableLoader(f.name)
401
402 events = loader.read_subarray_events()
405 theta = angular_separation(
406     events["HillasReconstructor_az"].quantity,
407     events["HillasReconstructor_alt"].quantity,
408     events["true_az"].quantity,
409     events["true_alt"].quantity,
410 )
411
412 plt.hist(theta.to_value(u.deg) ** 2, bins=25, range=[0, 0.3])
413 plt.xlabel(r"$\theta² / deg²$")
414 None

ArrayDisplay#
423 angle_offset = plotting_event.monitoring.pointing.array_azimuth
424
425 plotting_hillas = {
426     tel_id: dl1.parameters.hillas for tel_id, dl1 in plotting_event.dl1.tel.items()
427 }
428
429 plotting_core = {
430     tel_id: dl1.parameters.core.psi for tel_id, dl1 in plotting_event.dl1.tel.items()
431 }
432
433
434 disp = ArrayDisplay(source.subarray)
435
436 disp.set_line_hillas(plotting_hillas, plotting_core, 500)
437
438 plt.scatter(
439     plotting_event.simulation.shower.core_x,
440     plotting_event.simulation.shower.core_y,
441     s=200,
442     c="k",
443     marker="x",
444     label="True Impact",
445 )
446 plt.scatter(
447     plotting_event.dl2.stereo.geometry["HillasReconstructor"].core_x,
448     plotting_event.dl2.stereo.geometry["HillasReconstructor"].core_y,
449     s=200,
450     c="r",
451     marker="x",
452     label="Estimated Impact",
453 )
454
455 plt.legend()
456 # plt.xlim(-400, 400)
457 # plt.ylim(-400, 400)

<matplotlib.legend.Legend object at 0x79f08eef52d0>
Reading the LST dl1 data#
465 loader = TableLoader(f.name)
466
467 dl1_table = loader.read_telescope_events(
468     ["LST_LST_LSTCam"],
469     dl2=False,
470     true_parameters=False,
471 )
474 plt.scatter(
475     np.log10(dl1_table["true_energy"].quantity / u.TeV),
476     np.log10(dl1_table["hillas_intensity"]),
477 )
478 plt.xlabel("log10(E / TeV)")
479 plt.ylabel("log10(intensity)")
480 None

Isn’t python slow?#
Many of you might have heard: “Python is slow”.
That’s trueish.
All python objects are classes living on the heap, even integers.
Looping over lots of “primitives” is quite slow compared to other languages.
But: “Premature Optimization is the root of all evil” — Donald Knuth
So profile to find exactly what is slow.
Why use python then?#
Python works very well as glue for libraries of all kinds of languages
Python has a rich ecosystem for data science, physics, algorithms, astronomy
Example: Number of Islands#
Find all groups of pixels, that survived the cleaning
516 geometry = loader.subarray.tel[1].camera.geometry
Let’s create a toy images with several islands;
523 rng = np.random.default_rng(42)
524
525 image = np.zeros(geometry.n_pixels)
526
527
528 for i in range(9):
529     model = toymodel.Gaussian(
530         x=rng.uniform(-0.8, 0.8) * u.m,
531         y=rng.uniform(-0.8, 0.8) * u.m,
532         width=rng.uniform(0.05, 0.075) * u.m,
533         length=rng.uniform(0.1, 0.15) * u.m,
534         psi=rng.uniform(0, 2 * np.pi) * u.rad,
535     )
536
537     new_image, sig, bg = model.generate_image(
538         geometry, intensity=rng.uniform(1000, 3000), nsb_level_pe=5
539     )
540     image += new_image
543 clean = tailcuts_clean(
544     geometry,
545     image,
546     picture_thresh=10,
547     boundary_thresh=5,
548     min_number_picture_neighbors=2,
549 )
552 disp = CameraDisplay(geometry)
553 disp.image = image
554 disp.highlight_pixels(clean, color="xkcd:red", linewidth=1.5)
555 disp.add_colorbar()

559 def num_islands_python(camera, clean):
560     """A breadth first search to find connected islands of neighboring pixels in the cleaning set"""
561
562     # the camera geometry has a [n_pixel, n_pixel] boolean array
563     # that is True where two pixels are neighbors
564     neighbors = camera.neighbor_matrix
565
566     island_ids = np.zeros(camera.n_pixels)
567     current_island = 0
568
569     # a set to remember which pixels we already visited
570     visited = set()
571
572     # go only through the pixels, that survived cleaning
573     for pix_id in np.where(clean)[0]:
574         if pix_id not in visited:
575             # remember that we already checked this pixel
576             visited.add(pix_id)
577
578             # if we land in the outer loop again, we found a new island
579             current_island += 1
580             island_ids[pix_id] = current_island
581
582             # now check all neighbors of the current pixel recursively
583             to_check = set(np.where(neighbors[pix_id] & clean)[0])
584             while to_check:
585                 pix_id = to_check.pop()
586
587                 if pix_id not in visited:
588                     visited.add(pix_id)
589                     island_ids[pix_id] = current_island
590
591                     to_check.update(np.where(neighbors[pix_id] & clean)[0])
592
593     n_islands = current_island
594     return n_islands, island_ids
598 n_islands, island_ids = num_islands_python(geometry, clean)
602 cmap = plt.get_cmap("Paired")
603 cmap = ListedColormap(cmap.colors[:n_islands])
604 cmap.set_under("k")
605
606 disp = CameraDisplay(geometry)
607 disp.image = island_ids
608 disp.cmap = cmap
609 disp.set_limits_minmax(0.5, n_islands + 0.5)
610 disp.add_colorbar()

613 timeit.timeit(lambda: num_islands_python(geometry, clean), number=1000) / 1000
0.0013858335699987946
617 def num_islands_scipy(geometry, clean):
618     neighbors = geometry.neighbor_matrix_sparse
619
620     clean_neighbors = neighbors[clean][:, clean]
621     num_islands, labels = connected_components(clean_neighbors, directed=False)
622
623     island_ids = np.zeros(geometry.n_pixels)
624     island_ids[clean] = labels + 1
625
626     return num_islands, island_ids
630 n_islands_s, island_ids_s = num_islands_scipy(geometry, clean)
633 disp = CameraDisplay(geometry)
634 disp.image = island_ids_s
635 disp.cmap = cmap
636 disp.set_limits_minmax(0.5, n_islands_s + 0.5)
637 disp.add_colorbar()

640 timeit.timeit(lambda: num_islands_scipy(geometry, clean), number=10000) / 10000
0.00022730472770053894
A lot less code, and a factor 3 speed improvement
Finally, current ctapipe implementation is using numba:
652 timeit.timeit(lambda: number_of_islands(geometry, clean), number=100000) / 100000
1.5196565660007763e-05
Total running time of the script: (0 minutes 24.483 seconds)