Note
Go to the end to download the full example code.
IO: TableLoader and EventSources#
This hands-on was presented in October 2023 at the DPPS Meeting at CEA Paris-Saclay (J. Hackfeld).
Introduction#
ctapipe provides basically two different ways of accessing its data products:
event-wise (EventSource)
column-wise (TableLoader)
EventSource(s):
EventSources read input files and generate ctapipe.containers.ArrayEventContainer
instances when iterated over.
A new EventSource should be created for each type of event file read
into ctapipe, e.g. simtel files are read by the ctapipe.io.SimTelEventSource .
EventSource provides a common high-level interface for accessing event information from different data sources. Creating an EventSource for a new file format or other event source ensures that data can be accessed in a common way, regardless of the file format or data origin.
EventSource itself is an abstract class, but will create an
appropriate subclass if a compatible source is found for the given
input_url .
TableLoader:
Loads telescope-event or subarray-event data from ctapipe HDF5 files and returns
astropy.table .
See Astropy docs
or this video from A.Donath at the ESCAPE Summer School
2021.
This class provides high-level access to data stored in ctapipe HDF5 files,
such as created by the ctapipe-process tool ( ctapipe.tools.process.ProcessorTool ).
There are multiple TableLoader methods loading data from all relevant tables
(depending on the options) and joins them into single tables:
TableLoader.read_subarray_eventsTableLoader.read_telescope_eventsTableLoader.read_telescope_events_by_idTableLoader.read_telescope_events_by_type
The last one returns a dict with a table per telescope type, which is needed for e.g. DL1 image data that might have different shapes for each of the telescope types as tables do not support variable length columns.
It is recommended to use the TableLoader when loading data into Tables,
because its much faster than EventSources!
Code Examples#
First import some classes/modules and get the example dataset path:
65 from traitlets.config import Config
66
67 from ctapipe import utils
68 from ctapipe.calib import CameraCalibrator
69 from ctapipe.io import DataWriter, EventSource, TableLoader
70
71 simtel_path = utils.get_dataset_path("gamma_prod5.simtel.zst")
EventSource(s)#
The already implemented EventSources are:
  HDF5EventSource -- ctapipe.io.hdf5eventsource.HDF5EventSource
SimTelEventSource -- ctapipe.io.simteleventsource.SimTelEventSource
EventSource will create an appropriate subclass if a compatible source is found for
the given input_url :
87 source = EventSource(input_url=simtel_path, max_events=5)
88 print(source)
<ctapipe.io.simteleventsource.SimTelEventSource object at 0x79f08d436830>
You can now loop over the ctapipe.containers.ArrayEventContainer generated by the
source.
94 for event in source:
95     print(event.count)
0
1
2
3
4
99 print(repr(event))
ctapipe.containers.ArrayEventContainer:
                       index.*: event indexing information with default None
                          r0.*: Raw Data with default None
                          r1.*: R1 Calibrated Data with default None
                         dl0.*: DL0 Data Volume Reduced Data with default None
                         dl1.*: DL1 Calibrated image with default None
                         dl2.*: DL2 reconstruction info with default None
                  simulation.*: Simulated Event Information with default None
                                with type <class
                                'ctapipe.containers.SimulatedEventContainer'>
                     trigger.*: central trigger information with default None
                         count: number of events processed with default 0
                  monitoring.*: container for event-wise monitoring data (MON)
                                with default None
                        muon.*: Container for muon analysis results with default
                                None
Every time a new loop is started through the source,
it tries to restart from the first event, which might not be supported
by the event source. It is encouraged to use EventSource in a context manager
to ensure the correct cleanups are performed when you are finished with the source:
107 with EventSource(input_url=simtel_path) as source:
108     for event in source:
109         print(
110             f"Event Count: {event.count},"
111             f"Tels with trigger: {event.trigger.tels_with_trigger}"
112         )
Event Count: 0,Tels with trigger: [  9  14 104]
Event Count: 1,Tels with trigger: [ 26  27  29  47  53  59  69  73 121 122 124 148 162 166]
Event Count: 2,Tels with trigger: [ 82  92 175 179]
Event Count: 3,Tels with trigger: [ 17  26  27  29  43  47  53  57  69 112 121 122 124 125 128 140 144 148
 162 166]
Event Count: 4,Tels with trigger: [  3   4   8   9  14  15  23  24  25  29  32  36  42  46  52  56 103 104
 109 110 118 119 120 126 127 129 130 137 139 143 161]
Event Count: 5,Tels with trigger: [ 87  98 151]
Event Count: 6,Tels with trigger: [ 68  72  76  94  98 151 165 177]
You can hand in the ID’s of the telescopes to be included in the data
with the allowed_tels attribute. If given, only this subset of telescopes
will be present in the generated events. If None, all available telescopes are used.
119 eventsource_config = Config(
120     {"EventSource": {"max_events": 5, "allowed_tels": [3, 4, 9]}}
121 )
122
123 with EventSource(input_url=simtel_path, config=eventsource_config) as source:
124     for event in source:
125         print(
126             f"Event Count: {event.count},"
127             f"Tels with trigger: {event.trigger.tels_with_trigger}"
128         )
Event Count: 0,Tels with trigger: [9]
Event Count: 1,Tels with trigger: [3 4 9]
If you want to calibrate your data in the event loop and write it to an .h5 file with
the DataWriter :
135 source = EventSource(input_url=simtel_path, max_events=50)
136 calibrate = CameraCalibrator(subarray=source.subarray)
137
138 with DataWriter(
139     event_source=source,
140     output_path="events.dl1.h5",
141     write_dl1_parameters=False,
142     overwrite=True,
143     write_dl1_images=True,
144 ) as write_data:
145     for event in source:
146         calibrate(event)
147         write_data(event)
Alternatively doing it with ctapipe-process would look like this:
! ctapipe-process -i {simtel_path} -o events.dl1.h5 --overwrite --progress
TableLoader#
Create a TableLoader instance with the above created dl1 file:
163 loader = TableLoader(input_url="events.dl1.h5")
Alternatively using a config file:
168 tableloader_config = Config(
169     {
170         "TableLoader": {
171             "input_url": "events.dl1.h5",
172         }
173     }
174 )
175
176 loader = TableLoader(config=tableloader_config)
Reading subarray-based event information:
181 subarray_events = loader.read_subarray_events(
182     start=None,
183     stop=None,
184     dl2=False,
185     simulated=True,
186     observation_info=False,
187 )
188
189 subarray_events
Reading subarray-based event information in chunks:
194 subarray_events_chunked = loader.read_subarray_events_chunked(
195     chunk_size=3,
196     dl2=False,
197     simulated=True,
198     observation_info=False,
199 )
200
201 for chunk in subarray_events_chunked:
202     print(" \n", chunk)
 Chunk(start=0, stop=3, data=<Table length=3>
obs_id event_id ... true_starting_grammage true_shower_primary_id
                ...        g / cm2
int32   int64   ...        float64                 int64
------ -------- ... ---------------------- ----------------------
     1     4009 ...                    0.0                      0
     1     5101 ...                    0.0                      0
     1     5103 ...                    0.0                      0)
 Chunk(start=3, stop=6, data=<Table length=3>
obs_id event_id ... true_starting_grammage true_shower_primary_id
                ...        g / cm2
int32   int64   ...        float64                 int64
------ -------- ... ---------------------- ----------------------
     1     5104 ...                    0.0                      0
     1     5105 ...                    0.0                      0
     1     5108 ...                    0.0                      0)
 Chunk(start=6, stop=7, data=<Table length=1>
obs_id event_id ... true_starting_grammage true_shower_primary_id
                ...        g / cm2
int32   int64   ...        float64                 int64
------ -------- ... ---------------------- ----------------------
     1     5109 ...                    0.0                      0)
Reading just LST events:
207 lst_events = loader.read_telescope_events(
208     telescopes=[1, 2, 3, 4],
209     start=None,
210     stop=None,
211     dl1_images=True,
212     dl1_parameters=False,
213     dl1_muons=False,
214     dl2=False,
215     simulated=False,
216     true_images=True,
217     true_parameters=False,
218     instrument=True,
219     observation_info=False,
220 )
221
222 lst_events
Loading telescope events by type returns a dict with the different telescope types:
227 telescope_events_by_type = loader.read_telescope_events_by_type(
228     telescopes=["LST_LST_LSTCam", "MST_MST_FlashCam"],
229     start=None,
230     stop=None,
231     dl1_images=True,
232     dl1_parameters=False,
233     dl1_muons=False,
234     dl2=False,
235     simulated=False,
236     true_images=True,
237     true_parameters=False,
238     instrument=True,
239     observation_info=False,
240 )
241
242 for tel_type, table in telescope_events_by_type.items():
243     print(f"Telescope Type: {tel_type} \n", table, "\n")
Telescope Type: LST_LST_LSTCam
 obs_id event_id tel_id ...        time       tels_with_trigger event_type
                       ...
------ -------- ------ ... ----------------- ----------------- ----------
     1     5105      3 ... 1604409973.646828    False .. False         32
     1     5105      4 ... 1604409973.646828    False .. False         32
Telescope Type: MST_MST_FlashCam
 obs_id event_id tel_id ...        time       tels_with_trigger event_type
                       ...
------ -------- ------ ... ----------------- ----------------- ----------
     1     4009      9 ...  1604409937.84765    False .. False         32
     1     4009     14 ...  1604409937.84765    False .. False         32
     1     5101     26 ... 1604409953.115399    False .. False         32
     1     5101     27 ... 1604409953.115399    False .. False         32
     1     5101     29 ... 1604409953.115399    False .. False         32
     1     5104     17 ... 1604409965.387411    False .. False         32
     1     5104     26 ... 1604409965.387411    False .. False         32
     1     5104     27 ... 1604409965.387411    False .. False         32
     1     5104     29 ... 1604409965.387411    False .. False         32
     1     5104    125 ... 1604409965.387411    False .. False         32
     1     5105      8 ... 1604409973.646828    False .. False         32
     1     5105      9 ... 1604409973.646828    False .. False         32
     1     5105     14 ... 1604409973.646828    False .. False         32
     1     5105     15 ... 1604409973.646828    False .. False         32
     1     5105     23 ... 1604409973.646828    False .. False         32
     1     5105     24 ... 1604409973.646828    False .. False         32
     1     5105     25 ... 1604409973.646828    False .. False         32
     1     5105     29 ... 1604409973.646828    False .. False         32
     1     5105    126 ... 1604409973.646828    False .. False         32
     1     5105    127 ... 1604409973.646828    False .. False         32
Loading telescope events by ID returns a dict with the different telescope IDs:
248 telescope_events_by_id = loader.read_telescope_events_by_id(
249     telescopes=[3, 14],
250     start=None,
251     stop=None,
252     dl1_images=True,
253     dl1_parameters=False,
254     dl1_muons=False,
255     dl2=False,
256     simulated=False,
257     true_images=True,
258     true_parameters=False,
259     instrument=True,
260     observation_info=False,
261 )
262 for tel_id, table in telescope_events_by_id.items():
263     print(f"Telescope ID: {tel_id} \n", table, "\n")
Telescope ID: 3
 obs_id event_id tel_id ...        time       tels_with_trigger event_type
                       ...
------ -------- ------ ... ----------------- ----------------- ----------
     1     5105      3 ... 1604409973.646828    False .. False         32
Telescope ID: 14
 obs_id event_id tel_id ...        time       tels_with_trigger event_type
                       ...
------ -------- ------ ... ----------------- ----------------- ----------
     1     4009     14 ...  1604409937.84765    False .. False         32
     1     5105     14 ... 1604409973.646828    False .. False         32
read_telescope_events_chunkedread_telescope_events_by_type_chunkedread_telescope_events_by_id_chunked
are also available.
Reading e.g. simulation- or observation-information:
Now you have astropy.table s including all the relevant data you need
for your analyses.
Total running time of the script: (0 minutes 6.766 seconds)