CBM Online Meeting
Virtual
Zoom: see below
● Updates / Progress
Nothing to report.
● Multi-threading performance
(B. Sobol, see presentation)
The number of threads used is controlled by
- the environment variable OMP_NUM_THREADS,
- which is overridden by the command line argument --omp <number_of_threads>,
- which in turn is overridden for each algorithm by the use of a num_threads clause.
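For illustration, a minimal OpenMP sketch of this precedence (a generic example, not the actual online code):

    #include <cstdio>
    #include <omp.h>

    int main() {
      // The team size defaults to OMP_NUM_THREADS; the online binary
      // would apply --omp here via omp_set_num_threads() (assumption).
      #pragma omp parallel
      {
        #pragma omp single
        std::printf("default team size: %d\n", omp_get_num_threads());
      }

      // A num_threads clause overrides both the environment variable
      // and any omp_set_num_threads() call, for this region only.
      #pragma omp parallel num_threads(2)
      {
        #pragma omp single
        std::printf("overridden team size: %d\n", omp_get_num_threads());
      }
      return 0;
    }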
The run time of the online binary was investigated on a local machine with a varying number of threads (from 1 to 16). The dependence of the runtime on the thread count suggests that 75% of the code is multi-threaded, so that a maximal acceleration by a factor of 4 can be obtained. These findings were essentially confirmed by tests by D. Hutter on the Virgo cluster.
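This bound is Amdahl's law: with a parallel runtime fraction p executed on n threads, the speed-up is

    S(n) = \frac{1}{(1 - p) + p/n},
    \qquad
    \lim_{n \to \infty} S(n) = \frac{1}{1 - p} = \frac{1}{1 - 0.75} = 4,

so fitting the measured runtimes to this form yields the quoted parallel fraction.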
The runtime behaviour should be investigated separately for each processing step (unpacking, local reconstruction, tracking, trigger) for a better understanding.
● Comparison of ROOT and BOOST streamers
(J. de Cuveland, see presentation) The performance of the ROOT and BOOST streamers (plus the Cereal streamer) was investigated in a Bachelor thesis (Jonathan Werle, University Frankfurt) with CBM data classes (see materials). Contrary to our expectations, ROOT serialization was found to perform best among the investigated streamers. This holds for both the data throughput and the compression rate.
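For context, the Boost serialization pattern exercised in such comparisons looks roughly like the sketch below (the Digi class here is a hypothetical stand-in, not an actual CBM data class):

    #include <boost/archive/binary_iarchive.hpp>
    #include <boost/archive/binary_oarchive.hpp>
    #include <cstdint>
    #include <sstream>

    // Hypothetical stand-in for a CBM data class.
    struct Digi {
      int32_t address = 0;
      double time = 0.;
      double charge = 0.;

      // Intrusive Boost serialization: one function handles both
      // reading and writing.
      template <class Archive>
      void serialize(Archive& ar, const unsigned int /*version*/) {
        ar & address;
        ar & time;
        ar & charge;
      }
    };

    int main() {
      std::stringstream buffer;
      {
        boost::archive::binary_oarchive oa(buffer);
        Digi d{42, 123.4, 5.6};
        oa << d;  // serialize into the in-memory buffer
      }
      Digi restored;
      boost::archive::binary_iarchive ia(buffer);
      ia >> restored;  // deserialize, completing the round trip
      return 0;
    }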
There are several caveats: the tests were not performed on large binary objects such as timeslices. However, writing these to file is not part of our performance-critical main data path. The absolute throughput (around 100 MiB/s) appears low in comparison to the disk speed; however, data compression is already included in this number.
Based on these findings, we should consider writing ROOT files as the result of the online data processing. This would ease the integration with the offline, ROOT-based analysis. In the future, using RNTuple instead of TTree will further improve the performance.
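A sketch of what writing online results via RNTuple could look like (the interface is still experimental and header/namespace placement differs between ROOT versions; the field names are invented):

    #include <ROOT/RNTupleModel.hxx>
    #include <ROOT/RNTupleWriter.hxx>  // in older ROOT: <ROOT/RNTuple.hxx>

    void WriteOnlineOutput() {
      // Describe the on-disk schema; the fields here are invented.
      auto model = ROOT::Experimental::RNTupleModel::Create();
      auto time = model->MakeField<double>("time");
      auto charge = model->MakeField<float>("charge");

      // Recreate() opens the output file and takes ownership of the model.
      auto writer = ROOT::Experimental::RNTupleWriter::Recreate(
          std::move(model), "Hits", "online_output.root");

      for (int i = 0; i < 100; ++i) {
        *time = 0.5 * i;
        *charge = 1.f;
        writer->Fill();  // one entry per hit
      }
    }  // the writer's destructor commits the data to the file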
● Convention: Header names and inclusion
The underlying cause of the problem seems to be the use of pragma once instead of the conventional, manual include guards.
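For illustration, a conventional guard encodes the full path in the macro name and therefore stays unique even when the bare file name does not (the guard name below is illustrative):

    // algo/detectors/sts/Hit.h
    #ifndef ALGO_DETECTORS_STS_HIT_H
    #define ALGO_DETECTORS_STS_HIT_H

    // ... class definition ...

    #endif  // ALGO_DETECTORS_STS_HIT_H

    // #pragma once, in contrast, identifies the header by its file
    // identity on disk rather than by a path-derived macro.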
In general, we should either
- maintain the online (algo) and offline (cbmroot) code in different projects / repositories (disfavoured because of the additional maintenance overhead and synchronization effort), or
- agree on basic coding directives:
- allow or forbid non-unique filenames, e.g., algo/detectors/sts/Hit.h and algo/detectors/trd/Hit.h;
- have the relative path to the header file explicitly in the include statement, or specify the used folders in CMakeLists.txt;
- install the header files flatly (in a single directory) or reflect the directory structure of the source code in the installation.
In existing, professional projects, both approaches can be found, so there is no directive for the choice. cbmroot developed along the lines of ROOT, AliRoot and FairRoot, and thus reflects the full context in the file name, installs the header files flatly, and does not use relative paths in the include statements. The online code (algo directory) chose the opposite approach.
To be discussed at the next meeting and decided by the Computing Board.
● Other business
(D. Hutter) A script to run the online binary as a batch job on Virgo on a list of input (tsa) files is provided in /lustre/cbm/online/bin/start_run_async. It launches one instance of the binary per job.