Real-time data processing in the ALICE High Level Trigger at the LHC

At the Large Hadron Collider at CERN in Geneva, Switzerland, atomic nuclei are collided at ultra-relativistic energies. Many final-state particles are produced in each collision and their properties are measured by the ALICE detector. The detector signals induced by the produced particles are digitized, leading to data rates in excess of 48 GB/s. The ALICE High Level Trigger (HLT) system pioneered the use of FPGA- and GPU-based algorithms to reconstruct charged-particle trajectories and reduce the data size in real time. The results of the reconstruction of the collision events, available online, are used for high-level data-quality and detector-performance monitoring and for real-time, time-dependent detector calibration. The online data compression techniques developed and used in the ALICE HLT have more than quadrupled the amount of data that can be stored for offline event processing.

 

Comput. Phys. Commun. 242 (2019) 25-48
e-Print: arXiv:1812.08036
CERN-EP-2018-337

Figure 2

Visualization of a heavy-ion collision recorded in ALICE with tracks reconstructed in real time on the GPUs of the HLT.

Figure 3

The \mbox{ALICE} HLT in the data readout scheme during \run{2}. In the DAQ system the data flow through the local and the global data concentrators, LDC and GDC, respectively. In parallel, the HLT ships QA and calibration data via dedicated interfaces.

Figure 5

Contribution of the HLT production (dotted line) and development (double-dotted dashed line) clusters to the WLCG between March 2017 and January 2018, with the sum of both contributions shown as the solid line.

Figure 6

Schema of the HLT components. The colored boxes represent processes accelerated by GPU/FPGA (green), normal processes (blue), processes that produce HLT output that is stored (dark blue), entities that store data (purple), asynchronous failure-resilient processes (dark red), classical QA components that use the original HLT data flow (brown), input (orange), and sensor data (red). Incoming data are passed through by the \mbox{C-RORC} FPGA cards or processed internally. The input nodes locally merge data from all links belonging to one event. The compute nodes then merge all fragments belonging to one event and run the reconstruction. The bottom of the diagram shows the asynchronous online calibration chain with a feedback loop as described in Section \ref{sec:calibration}.
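
To illustrate the per-event fragment merging performed by the input and compute nodes, here is a minimal C++ sketch of event building keyed by an event identifier; the names (Fragment, EventBuilder, expectedLinks) and the layout are assumptions for illustration, not the actual HLT framework.

#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <utility>
#include <vector>

// Sketch only: collect data fragments arriving from individual readout links
// and hand out a complete event once all expected links have contributed.
struct Fragment {
    std::uint64_t eventId;      // identifier shared by all fragments of one event
    std::uint32_t linkId;       // readout link the fragment came from
    std::vector<char> payload;  // raw data of this link
};

class EventBuilder {
public:
    explicit EventBuilder(std::size_t expectedLinks) : expectedLinks_(expectedLinks) {}

    // Add one fragment; returns the fully merged event once all links are present.
    std::optional<std::vector<Fragment>> addFragment(Fragment&& f) {
        const std::uint64_t id = f.eventId;
        auto& parts = pending_[id];
        parts.push_back(std::move(f));
        if (parts.size() < expectedLinks_) return std::nullopt;  // still incomplete
        std::vector<Fragment> complete = std::move(parts);
        pending_.erase(id);
        return complete;
    }

private:
    std::size_t expectedLinks_;
    std::map<std::uint64_t, std::vector<Fragment>> pending_;
};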

Figure 7

Schematic representation of the geometry of a TPC sector. Local $y$ and $z$ coordinates of a charged-particle trajectory are measured at the $x$ positions of the 159 readout rows, providing a chain of spatial points (hits) along the trajectory.
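
For orientation, a minimal sketch of how such a space point could be represented in code; the struct and field names are illustrative assumptions and do not reproduce the HLT's actual cluster format.

#include <cstdint>

// Illustrative sketch of a TPC space point in sector-local coordinates.
// The local x position is fixed by the pad row; y and z are measured.
struct TPCSpacePoint {
    std::uint8_t padRow;  // 0..158, one of the 159 readout rows of a sector
    float y;              // local coordinate along the pad row (cm)
    float z;              // local coordinate along the drift direction (cm)
    float charge;         // total cluster charge, used e.g. for dE/dx
};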

Figure 8

Processing time of the hardware cluster finder and the offline cluster finder. The measurements were performed on an HLT node (circles, triangles, diamonds), on a newer Core-i7 6700K CPU (squares), and on the \mbox{C-RORC} (inverted triangles).

Figure 9

The upper panel shows the distribution of TPC track $\chi^2$ residuals from offline track reconstruction obtained using total cluster charges from the offline cluster finder (Offline CF) and different versions of the HLT hardware cluster finder (HWCF). Tracks, reconstructed using the TPC and ITS points, satisfy the following selection criteria: pseudorapidity $|\eta| < 0.8$ and $N_{\text{TPC clusters}} \geq 70$. The ratios of the distributions obtained using the offline cluster finder and the HLT cluster finder are shown in the lower panel.
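
For reference, a conventional least-squares reading of the track $\chi^2$ built from cluster residuals is sketched below; the reconstruction's exact error model and cluster weights are not reproduced here.
\[
\chi^{2} \;=\; \sum_{i=1}^{N_{\text{cl}}}\left[\frac{\big(y_i - y^{\text{trk}}(x_i)\big)^{2}}{\sigma_{y,i}^{2}} + \frac{\big(z_i - z^{\text{trk}}(x_i)\big)^{2}}{\sigma_{z,i}^{2}}\right]
\]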

Figure 10

Separation power ($S_{\pi-e}$) of pions and electrons (minimum ionizing particles, \ie pions at 0.3 to 0.6 GeV/$c$ versus electrons from gamma conversions at 0.35 to 0.5 GeV/$c$) as a function of the track momentum dip angle, where $\tan\lambda = p_{\mathrm{z}}/p_{\mathrm{T}}$. Comparison of d$E$/d$x$ separation power using total cluster charges from the Offline CF and different versions of the HWCF.
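
The caption does not spell out the definition of the separation power; a commonly used convention, given here only as a sketch, is the distance between the mean d$E$/d$x$ responses in units of the average resolution:
\[
S_{\pi\text{-}e} \;=\; \frac{\big|\langle \mathrm{d}E/\mathrm{d}x\rangle_{\pi} - \langle \mathrm{d}E/\mathrm{d}x\rangle_{e}\big|}{\tfrac{1}{2}\,(\sigma_{\pi}+\sigma_{e})}
\]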

Figure 11

Cellular automaton track seeding steps: a) Neighbor finder. Each cluster in row $k$ is linked to the best pair of its neighbors from the next and the previous row. b) Evolution step. Non-reciprocal links are removed; chains of reciprocal links define the tracklets.
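
The two steps can be illustrated with a schematic C++ sketch; the data layout, the straight-line neighbor criterion, and all names (Cluster, Links, findNeighbors, keepReciprocalLinks) are simplifying assumptions and not the GPU tracker's actual implementation.

#include <cstddef>
#include <vector>

struct Cluster { float y, z; };

struct Links {
    std::vector<int> up;    // index of linked cluster in the next row (-1 if none)
    std::vector<int> down;  // index of linked cluster in the previous row (-1 if none)
};

// Step a) Neighbor finder: for each cluster in row k, pick the pair of
// neighbors in rows k-1 and k+1 that best fits a straight line through it.
Links findNeighbors(const std::vector<Cluster>& prev,
                    const std::vector<Cluster>& curr,
                    const std::vector<Cluster>& next) {
    Links links{std::vector<int>(curr.size(), -1), std::vector<int>(curr.size(), -1)};
    for (std::size_t c = 0; c < curr.size(); ++c) {
        float best = 1e30f;
        for (std::size_t u = 0; u < next.size(); ++u) {
            for (std::size_t d = 0; d < prev.size(); ++d) {
                // Deviation of the middle cluster from the straight line
                // connecting the two candidate neighbors.
                const float dy = 0.5f * (next[u].y + prev[d].y) - curr[c].y;
                const float dz = 0.5f * (next[u].z + prev[d].z) - curr[c].z;
                const float dev = dy * dy + dz * dz;
                if (dev < best) {
                    best = dev;
                    links.up[c] = static_cast<int>(u);
                    links.down[c] = static_cast<int>(d);
                }
            }
        }
    }
    return links;
}

// Step b) Evolution: the link c -> u survives only if u links back down to c.
// Chains of surviving (reciprocal) links form the tracklets.
void keepReciprocalLinks(Links& rowK, const Links& rowKplus1) {
    for (std::size_t c = 0; c < rowK.up.size(); ++c) {
        const int u = rowK.up[c];
        if (u >= 0 && rowKplus1.down[static_cast<std::size_t>(u)] != static_cast<int>(c))
            rowK.up[c] = -1;  // non-reciprocal link is removed
    }
}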

Figure 12

Visualization of the pipelined GPU processing of the track reconstruction using multiple CPU cores to feed data to the GPU.
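
A minimal sketch of such a pipeline, with several CPU worker threads preparing event slices while a feeder thread keeps the GPU busy; the thread counts, the queueing scheme, and the processSliceOnGPU placeholder are assumptions for illustration only, not the HLT framework.

#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Illustrative producer/consumer pipeline: CPU threads prepare slices (e.g.
// convert raw clusters into the tracker's input format) while one consumer
// feeds them to the GPU. The GPU call is a placeholder stub.
struct Slice { std::vector<float> clusters; };

void processSliceOnGPU(const Slice&) { /* placeholder for the GPU tracking call */ }

int main() {
    std::queue<Slice> queue;
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
    const int nProducers = 4;
    const int nSlicesPerProducer = 8;

    // CPU workers: prepare slices concurrently and hand them to the GPU feeder.
    std::vector<std::thread> producers;
    for (int p = 0; p < nProducers; ++p) {
        producers.emplace_back([&] {
            for (int i = 0; i < nSlicesPerProducer; ++i) {
                Slice s{std::vector<float>(1024, 0.f)};  // stand-in for prepared data
                { std::lock_guard<std::mutex> lk(m); queue.push(std::move(s)); }
                cv.notify_one();
            }
        });
    }

    // GPU feeder: drains the queue so the accelerator never waits for input.
    std::thread consumer([&] {
        while (true) {
            std::unique_lock<std::mutex> lk(m);
            cv.wait(lk, [&] { return !queue.empty() || done; });
            if (queue.empty() && done) return;
            Slice s = std::move(queue.front());
            queue.pop();
            lk.unlock();
            processSliceOnGPU(s);  // overlaps with the producers preparing more data
        }
    });

    for (auto& t : producers) t.join();
    { std::lock_guard<std::mutex> lk(m); done = true; }
    cv.notify_all();
    consumer.join();
}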

Figure 13

Time required for execution of the tracking algorithm on CPUs and on GPUs as a function of the input data size expressed in terms of the number of TPC clusters. The lines represent linear fits to the distributions. The merging and refitting times are not included in the track finding time.

Figure 14

Speedup of the HLT tracking algorithm executed on GPUs and CPUs compared to the offline tracker, normalized to a single core and corrected for the serial processing part that the CPU contributes to GPU tracking, as a function of the input data size expressed in terms of the number of TPC clusters. The plus markers show the speedup as a function of the number of TPC clusters with the HLT tracking executed on the CPU. The cross (asterisk) markers show the speedup obtained with the tracking executed on an older (newer) GPU.
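
Read schematically (this is not a formula quoted from the paper), the plotted speedup compares the offline tracking time normalized to a single core with the HLT tracking time,
\[
S \;=\; \frac{t_{\text{offline}}^{\,\text{1 core}}}{t_{\text{HLT}}},
\]
with, in the GPU case, an additional correction for the serial processing part that the CPU contributes, as stated in the caption.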

Figure 15

Tracking efficiency of the HLT and offline trackers as a function of the transverse momentum, calculated as the ratio of reconstructed tracks and simulated tracks in HIJING-generated Pb--Pb events at $\sqrt{s_{\rm{NN}}} = 5.02$\,TeV, shown for tracks that are a) primary, b) secondary, c) findable primary, and d) findable secondary. Findable tracks are defined as reconstructed tracks that have at least $70$ clusters in the TPC.
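
Written out per transverse-momentum bin, the plotted efficiency is the ratio quoted in the caption:
\[
\varepsilon(p_{\mathrm{T}}) \;=\; \frac{N_{\text{reconstructed}}(p_{\mathrm{T}})}{N_{\text{simulated}}(p_{\mathrm{T}})}.
\]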

Figure 16

Mean values and resolutions of the track parameters of the HLT and offline trackers as a function of the transverse momentum, measured in HIJING-generated Pb--Pb events at $\sqrt{s_{\rm{NN}}} = 5.02$\,TeV. The resolutions of the a) $y$ and b) $z$ spatial positions, c) the azimuthal angle ($\phi$), d) the dip angle ($\lambda$), and e) the relative transverse momentum are shown.

Figure 17

Time required for the creation, serialization, and deserialization of the Flat ESD vs. the standard ESD for offline analysis as a function of the number of TPC tracks.
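
The performance difference stems from the data layout: a flat, pointer-free event structure can be copied as one contiguous memory block, while a conventional object tree must be serialized member by member. Below is a minimal C++ sketch of the idea; the names (FlatTrack, FlatEvent) and fields are illustrative assumptions and do not reproduce the actual ESD classes.

#include <cstddef>
#include <cstring>
#include <vector>

// Sketch of the "flat" layout idea: fixed-size, pointer-free records that can
// be written and read byte-for-byte.
struct FlatTrack {
    float y, z, sinPhi, tanLambda, qOverPt;  // track parameters, no pointers
};

struct FlatEvent {
    std::vector<FlatTrack> tracks;

    // "Serialization" is a single contiguous copy of plain-old-data records.
    std::vector<char> serialize() const {
        std::vector<char> buf(tracks.size() * sizeof(FlatTrack));
        if (!buf.empty()) std::memcpy(buf.data(), tracks.data(), buf.size());
        return buf;
    }

    static FlatEvent deserialize(const std::vector<char>& buf) {
        FlatEvent ev;
        ev.tracks.resize(buf.size() / sizeof(FlatTrack));
        if (!buf.empty()) std::memcpy(ev.tracks.data(), buf.data(), buf.size());
        return ev;
    }
};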

Figure 19

Average differences of the TPC cluster position along the $z$-axis calculated with drift-velocity correction factors from the online (HLT) and offline calibration. The differences are of the order of the intrinsic detector resolution. Calibrations of the forward and backward halves of the TPC are computed independently. The error bands represent the statistical error, along with the $r$- and $\varphi$-dependent differences of the online and offline calibration.
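
Schematically, and omitting the detector's actual sign conventions and offsets, the cluster $z$ position follows from the measured drift time and the calibrated drift velocity,
\[
z \;\simeq\; z_{\text{readout}} \pm v_{\text{drift}}\,(1+\delta)\,t_{\text{drift}},
\]
so a difference between the online and offline correction factors $\delta$ translates into a cluster shift $\Delta z \simeq (\delta_{\text{HLT}} - \delta_{\text{offline}})\,v_{\text{drift}}\,t_{\text{drift}}$, which is the quantity compared in this figure.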

Figure 20

Total HLT TPC data compression ratio, including the improved TPC online cluster finder and Huffman compression, in \run{2} on 2017 pp data as a function of the input data size expressed in terms of the number of TPC clusters.
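
As a rough illustration of what limits such a ratio, the C++ sketch below estimates the Shannon entropy of a stream of cluster-data symbols, which bounds the average code length a Huffman code can approach; the symbol type and function names are assumptions, and this is not the HLT's actual compression code.

#include <cmath>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Shannon entropy in bits/symbol of the symbol distribution; a Huffman code
// approaches this lower bound on the average code length.
double entropyBitsPerSymbol(const std::vector<std::uint16_t>& symbols) {
    std::map<std::uint16_t, std::size_t> counts;
    for (std::uint16_t s : symbols) ++counts[s];
    double h = 0.0;
    const double n = static_cast<double>(symbols.size());
    for (const auto& kv : counts) {
        const double p = static_cast<double>(kv.second) / n;
        h -= p * std::log2(p);
    }
    return h;
}

// Estimated compression ratio relative to storing each symbol with fixedBits bits.
double estimatedCompressionRatio(const std::vector<std::uint16_t>& symbols, double fixedBits) {
    const double h = entropyBitsPerSymbol(symbols);
    return h > 0.0 ? fixedBits / h : fixedBits;  // constant stream: degenerate case
}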

Figure 21

Number of data taking runs terminated due to a failure in the HLT during \run{1} and \run{2} since 2011, when the TPC data compression in the HLT was introduced. Missing months correspond to long shutdowns, end-of-year shutdowns, commissioning phases for the data compression, and recommissioning for the updated TPC readout. The yearly averages are shown as long tick marks along the right-side $y$-axis.