Transformer-Based Approach to Enhance Positron Tracking Performance in MEG II

Dec 22, 2025 · 7:41
hep-ex · physics.data-an

Abstract

We developed a Transformer-based pattern recognition method for positron track reconstruction in the MEG II experiment. The model acts as a classifier to remove pileup hits in the MEG II drift chamber, which operates under a high pileup occupancy of 35-50%. The trained model significantly improved hit purity, leading to enhancements in tracking efficiency and resolution by 15% and 5%, respectively, at a muon stopping rate of $5\times 10^7\ \mu$/sec. This improvement translates into an approximately 10% increase in the sensitivity of the $\mu\to e\gamma$ branching ratio measurement.

Summary

This paper introduces a Transformer-based machine learning method for positron track reconstruction in the MEG II experiment, which searches for the rare decay μ → eγ. The main challenge is the high pileup occupancy (35-50%) in the cylindrical drift chamber (CDCH), which degrades tracking efficiency.

The Transformer acts as a classifier that separates signal hits from pileup hits, improving the purity of the hits used for track reconstruction. It takes pTC (pixelated scintillation timing counter) cluster information and all CDCH hits as input and outputs, for each CDCH hit, the probability that the hit belongs to the positron track associated with the pTC cluster. The architecture uses separate attention blocks for pTC and CDCH hits, with cross-attention to match CDCH hits to pTC hits, so the model can exploit global patterns of pTC and CDCH hits across different turn segments of the positron trajectory.

At a muon stopping rate of 5×10^7 μ/sec, the trained model significantly enhances hit purity, improving tracking efficiency by 15% and resolution by 5%. The improved tracking performance allows the MEG II experiment to increase the muon stopping rate and to reprocess existing data, yielding an approximately 10% increase in the sensitivity of the μ → eγ branching ratio measurement. Model inference is implemented in C++ using ONNX and runs on CPUs; the hit filtering reduces the total computing time for the entire positron reconstruction sequence to 70-80% of that of the conventional approach.
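
The cross-attention step described in the summary can be illustrated with a toy, single-head version in which each CDCH hit (the query) attends to the pTC hits (keys and values), and a sigmoid turns a per-hit score into a probability. This is only a minimal sketch of the general technique, not the MEG II model: the feature vectors, the absence of learned projection matrices, and the scoring weights `w` are all hypothetical choices made for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention: every query attends to all keys."""
    scale = math.sqrt(len(keys[0]))
    contexts = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        # Weighted sum of the value vectors, one component at a time.
        ctx = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        contexts.append(ctx)
    return contexts

def hit_probabilities(cdch_hits, ptc_hits, w, bias=0.0):
    """Per-CDCH-hit probability from hit features plus attended pTC context.

    `w` is a hypothetical weight vector (length = hit dim + pTC dim);
    a trained model would learn it together with projection layers.
    """
    contexts = cross_attention(cdch_hits, ptc_hits, ptc_hits)
    probs = []
    for hit, ctx in zip(cdch_hits, contexts):
        logit = dot(w, hit + ctx) + bias  # concatenate hit and context features
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs

# Toy inputs: three CDCH hits and two pTC hits, each with two made-up features.
cdch = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
ptc = [[1.0, 0.0], [0.0, 1.0]]
print(hit_probabilities(cdch, ptc, w=[0.1, 0.2, 0.3, 0.4]))
```

In a setup like the one summarized here, the per-hit probabilities would then be thresholded to decide which CDCH hits are passed on to the conventional track finding and fitting.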

Key Insights

  • A Transformer-based architecture can effectively classify hits in a high-pileup environment within a particle detector, achieving 98% signal-hit efficiency while discarding 84% of pileup hits at a muon stopping rate of 5×10^7 μ/sec.
  • The model leverages global patterns between distant hits belonging to different turn segments of the positron trajectory, making it more robust against pileup compared to local algorithms that connect nearby CDCH hits on the same turn.
  • The input feature engineering includes novel elements like conformal coordinates and turn-segment-dependent parameters (z_turn, phi_turn) calibrated using Michel positron tracks to exploit the detector's rotational symmetry and improve attention between distant hits.
  • The model is not end-to-end; it only distinguishes signal hits from pileup hits and does not directly estimate track kinematics. The selected hits are then used with conventional track candidate construction and fitting algorithms.
  • Tracks reconstructed by both the ML-based and the conventional algorithm show a 10% resolution improvement, whereas tracks found only with the ML approach have a resolution 10% worse than the conventional average, suggesting that the newly recovered tracks are of lower quality.
  • The integration of the trained Transformer model into the C++-based offline reconstruction framework using ONNX allows for efficient CPU-based inference, contributing to a reduction in the total computing time.
  • The model was trained using a combination of MC samples (90%) and data samples (10%) to mitigate potential MC-specific biases, with fewer than 5% of the data samples being mislabeled.
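
To see how the quoted 98% signal-hit efficiency and 84% pileup rejection translate into hit purity, the short calculation below applies them to a single event. The hit counts (60 signal hits, 300 pileup hits) are made-up illustrative numbers, not MEG II values; only the two rates come from the text above.

```python
def hit_purity(n_signal, n_pileup):
    """Fraction of hits that are signal."""
    return n_signal / (n_signal + n_pileup)

def filtered_purity(n_signal, n_pileup, signal_eff, pileup_rejection):
    """Purity after a hit filter that keeps `signal_eff` of the signal hits
    and discards `pileup_rejection` of the pileup hits."""
    kept_signal = n_signal * signal_eff
    kept_pileup = n_pileup * (1.0 - pileup_rejection)
    return hit_purity(kept_signal, kept_pileup)

# Hypothetical event: 60 signal hits buried under 300 pileup hits.
before = hit_purity(60, 300)
after = filtered_purity(60, 300, signal_eff=0.98, pileup_rejection=0.84)
print(f"purity before: {before:.3f}, after: {after:.3f}")
# prints "purity before: 0.167, after: 0.551"
```

With these assumed counts, purity rises from about 17% to about 55% while only 2% of the signal hits are lost; the actual gain depends on the true signal-to-pileup ratio in each event.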
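
The summary does not define the conformal coordinates it mentions. A transform commonly used in track finding is u = x/(x² + y²), v = y/(x² + y²), which maps circles passing through the origin onto straight lines. The sketch below assumes that form (whether MEG II uses exactly this definition, or this choice of origin, is an assumption) and checks the straight-line property numerically for a circle centred at (a, b), whose image should satisfy 2au + 2bv = 1.

```python
import math

def conformal_map(x, y):
    """Map (x, y) to conformal coordinates (u, v) = (x, y) / (x^2 + y^2)."""
    r2 = x * x + y * y
    if r2 == 0.0:
        raise ValueError("conformal map is undefined at the origin")
    return x / r2, y / r2

def circle_through_origin(a, b, n=8):
    """Sample n points on the circle centred at (a, b) that passes through
    (0, 0), skipping the origin itself."""
    radius = math.hypot(a, b)
    phi0 = math.atan2(-b, -a)  # direction of the origin seen from the centre
    points = []
    for i in range(1, n + 1):
        phi = phi0 + 2.0 * math.pi * i / (n + 1)
        points.append((a + radius * math.cos(phi), b + radius * math.sin(phi)))
    return points

# Hypothetical circle centre; every mapped point should lie on 2au + 2bv = 1.
a, b = 3.0, 4.0
for x, y in circle_through_origin(a, b):
    u, v = conformal_map(x, y)
    print(f"u={u:+.4f}  v={v:+.4f}  2au+2bv={2 * a * u + 2 * b * v:.6f}")
```

Because hits from a circular trajectory through the origin become collinear in (u, v), such coordinates can make the relationship between distant hits on the same track easier for a model to pick up.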

Practical Implications

  • This research provides a valuable methodology for improving track reconstruction performance in high-pileup environments, applicable to other particle physics experiments facing similar challenges.
  • The Transformer-based hit filtering approach can be adapted to other detector technologies and experimental setups by adjusting the input features and model architecture to match the specific detector geometry and data characteristics.
  • The improved tracking efficiency and resolution achieved in the MEG II experiment directly translate to increased sensitivity in the search for the μ → eγ decay, potentially leading to new discoveries in particle physics.
  • The successful integration of a deep learning model into a production-level offline reconstruction pipeline, with CPU-based inference, demonstrates the feasibility of using ML techniques in the data processing of high-energy physics experiments.
  • Future research directions include developing an end-to-end Transformer model capable of directly estimating track kinematics and resolving hit-level ambiguities, which could further improve tracking performance and reduce computational cost.
