Optimization for pT5 building and for pLS-T5 cross cleaning needed #364

VourMa · 2024-01-25T10:50:57Z

Lately, I have been working on central configurations in CMSSW (|η| < 2.5). In this region, we expect LST to do the full building job, i.e. to deliver high efficiency and a low fake+duplicate rate, without any additional CKF aid. The following plot shows 4 different configurations, all of which apply duplicate cleaning for pLS TCs and include only quad pLS TCs:

BLUE: LST-only building without T5s
RED: LST-only building with T5s
BLACK: LST+CKF building without T5s
ORANGE: LST+CKF building with T5s

Comparing RED to ORANGE, the fake+duplicate rate appears to explode when LST-only building is used. The breakdown shows that this is due to an increased duplicate rate. Pixel track plots show that the quad pLSs, which are the only ones used here, have a very low duplicate rate. This means that the increased duplicate rate of LST comes from imperfect cross cleaning with OT objects.

The first suspects are the T5s, so BLUE and BLACK compare configurations without T5s. Ignoring the loss of efficiency, the fake+duplicate rate (according to the breakdown, specifically the duplicate rate) decreases substantially in this case, and in fact becomes better for LST-only building (BLUE) than for LST+CKF building (BLACK). This leads me to the following hypothesis:
We have a lot of pLS-T5 duplicates. When CKF is used to complement the building, the pLSs are merged into a single object with their duplicate T5s, which dramatically reduces the duplicate rate. On top of that, even though the efficiency is similar in the LST-only and the LST+CKF configurations, tracks are on average longer and have better resolution in the latter case. All of this is supported by the MTV plots. This implies that the pT5 building in LST can be improved.

The following plot shows how the physics performance changes when the ΔR^2 value used for the pLS-T5 cross cleaning changes from the default of 0.001 to 0.01. That implies that there is possibly some room for optimization also in the cross cleaning step.

slava77 · 2024-01-25T13:11:32Z

> The following plot shows how the physics performance changes when the ΔR^2 value used for the pLS-T5 cross cleaning changes from the default of 0.001 to 0.01. That implies that there is possibly some room for optimization also in the cross cleaning step.

What are the definitions of eta and phi in each case ... and where is the cross-cleaning code (just for a quick reference)? These have to be done at the same reference point.
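To illustrate why the reference point matters, here is a rough sketch under stated assumptions. It says nothing about how η and φ are actually defined for pLSs and T5s in LST. For a track produced at the beamline in a uniform solenoidal field, the azimuth of the momentum direction rotates by 2·asin(r/2R) between the origin and transverse radius r, where R = pT/(0.3·B) is the radius of curvature in metres for pT in GeV and B in tesla. The function names are hypothetical, and the field (3.8 T) and radius (0.25 m) quoted afterwards are assumed values chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Radius of curvature in metres for pT in GeV and B in tesla: R = pT / (0.2998 * B).
double curvatureRadius(double pt, double bField) {
  assert(pt > 0. && bField > 0.);
  return pt / (0.299792458 * bField);
}

// Rotation of the momentum azimuth between the beamline and transverse radius r
// (metres) for a track that starts at the origin: 2 * asin(r / (2R)).
double phiRotation(double pt, double bField, double r) {
  double R = curvatureRadius(pt, bField);
  assert(r <= 2. * R);  // the track must be able to reach radius r
  return 2. * std::asin(r / (2. * R));
}
```

With these assumed inputs, `phiRotation(1.0, 3.8, 0.25)` is about 0.29 rad and `phiRotation(10.0, 3.8, 0.25)` is about 0.028 rad. Both are comparable to or larger than ΔR ≈ 0.03, which is what a ΔR^2 cut of 0.001 corresponds to, so a φ evaluated at the beamline and a φ evaluated in the outer tracker cannot be compared directly with such a cut.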

VourMa · 2024-01-25T13:44:56Z

> where is the cross-cleaning code (just for a quick reference)?

TrackLooper/SDL/TrackCandidate.h

Lines 304 to 387 in 50d8a25

    
    struct crossCleanpLS {
      template <typename TAcc>
      ALPAKA_FN_ACC void operator()(TAcc const& acc,
                                    struct SDL::modules modulesInGPU,
                                    struct SDL::objectRanges rangesInGPU,
                                    struct SDL::pixelTriplets pixelTripletsInGPU,
                                    struct SDL::trackCandidates trackCandidatesInGPU,
                                    struct SDL::segments segmentsInGPU,
                                    struct SDL::miniDoublets mdsInGPU,
                                    struct SDL::hits hitsInGPU,
                                    struct SDL::quintuplets quintupletsInGPU) const {
        using Dim = alpaka::Dim<TAcc>;
        using Idx = alpaka::Idx<TAcc>;
        using Vec = alpaka::Vec<Dim, Idx>;
        Vec const globalThreadIdx = alpaka::getIdx<alpaka::Grid, alpaka::Threads>(acc);
        Vec const gridThreadExtent = alpaka::getWorkDiv<alpaka::Grid, alpaka::Threads>(acc);
        int pixelModuleIndex = *modulesInGPU.nLowerModules;
        unsigned int nPixels = segmentsInGPU.nSegments[pixelModuleIndex];
        for (int pixelArrayIndex = globalThreadIdx[2]; pixelArrayIndex < nPixels;
             pixelArrayIndex += gridThreadExtent[2]) {
          if (!segmentsInGPU.isQuad[pixelArrayIndex] || segmentsInGPU.isDup[pixelArrayIndex])
            continue;
          float eta1 = segmentsInGPU.eta[pixelArrayIndex];
          float phi1 = segmentsInGPU.phi[pixelArrayIndex];
          unsigned int prefix = rangesInGPU.segmentModuleIndices[pixelModuleIndex];
          int nTrackCandidates = *(trackCandidatesInGPU.nTrackCandidates);
          for (int trackCandidateIndex = globalThreadIdx[1]; trackCandidateIndex < nTrackCandidates;
               trackCandidateIndex += gridThreadExtent[1]) {
            short type = trackCandidatesInGPU.trackCandidateType[trackCandidateIndex];
            unsigned int innerTrackletIdx = trackCandidatesInGPU.objectIndices[2 * trackCandidateIndex];
            if (type == 4)  // T5
            {
              unsigned int quintupletIndex = innerTrackletIdx;  // T5 index
              float eta2 = __H2F(quintupletsInGPU.eta[quintupletIndex]);
              float phi2 = __H2F(quintupletsInGPU.phi[quintupletIndex]);
              float dEta = alpaka::math::abs(acc, eta1 - eta2);
              float dPhi = SDL::calculate_dPhi(phi1, phi2);
              float dR2 = dEta * dEta + dPhi * dPhi;
              if (dR2 < 1e-3f)
                segmentsInGPU.isDup[pixelArrayIndex] = true;
            }
            if (type == 5)  // pT3
            {
              int pLSIndex = pixelTripletsInGPU.pixelSegmentIndices[innerTrackletIdx];
              int npMatched = checkPixelHits(prefix + pixelArrayIndex, pLSIndex, mdsInGPU, segmentsInGPU, hitsInGPU);
              if (npMatched > 0)
                segmentsInGPU.isDup[pixelArrayIndex] = true;
              int pT3Index = innerTrackletIdx;
              float eta2 = __H2F(pixelTripletsInGPU.eta_pix[pT3Index]);
              float phi2 = __H2F(pixelTripletsInGPU.phi_pix[pT3Index]);
              float dEta = alpaka::math::abs(acc, eta1 - eta2);
              float dPhi = SDL::calculate_dPhi(phi1, phi2);
              float dR2 = dEta * dEta + dPhi * dPhi;
              if (dR2 < 0.000001f)
                segmentsInGPU.isDup[pixelArrayIndex] = true;
            }
            if (type == 7)  // pT5
            {
              unsigned int pLSIndex = innerTrackletIdx;
              int npMatched = checkPixelHits(prefix + pixelArrayIndex, pLSIndex, mdsInGPU, segmentsInGPU, hitsInGPU);
              if (npMatched > 0) {
                segmentsInGPU.isDup[pixelArrayIndex] = true;
              }
              float eta2 = segmentsInGPU.eta[pLSIndex - prefix];
              float phi2 = segmentsInGPU.phi[pLSIndex - prefix];
              float dEta = alpaka::math::abs(acc, eta1 - eta2);
              float dPhi = SDL::calculate_dPhi(phi1, phi2);
              float dR2 = dEta * dEta + dPhi * dPhi;
              if (dR2 < 0.000001f)
                segmentsInGPU.isDup[pixelArrayIndex] = true;
            }
          }
        }
      }
    };

The ΔR^2 I changed is here:

TrackLooper/SDL/TrackCandidate.h

Line 347 in 50d8a25

if (dR2 < 1e-3f)
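To put the two cut values in perspective: a ΔR^2 of 0.001 corresponds to ΔR ≈ 0.032, and 0.01 to ΔR = 0.1. Below is a minimal host-side sketch of the same comparison with the cut passed in as a parameter. `deltaPhi` is only a stand-in for `SDL::calculate_dPhi` (whose implementation is not shown in this thread), `isDuplicate` is a hypothetical helper, and the η/φ values used afterwards are made up.

```cpp
#include <cassert>
#include <cmath>

// Stand-in for SDL::calculate_dPhi (assumption): wrap phi1 - phi2 into [-pi, pi].
float deltaPhi(float phi1, float phi2) {
  const float kPi = 3.14159265358979f;
  float dPhi = phi1 - phi2;
  while (dPhi > kPi) dPhi -= 2.f * kPi;
  while (dPhi < -kPi) dPhi += 2.f * kPi;
  return dPhi;
}

// Same quantity as in the kernel: dEta^2 + dPhi^2.
float deltaR2(float eta1, float phi1, float eta2, float phi2) {
  float dEta = std::fabs(eta1 - eta2);
  float dPhi = deltaPhi(phi1, phi2);
  return dEta * dEta + dPhi * dPhi;
}

// Hypothetical helper: the cut is a parameter instead of a hard-coded 1e-3f.
bool isDuplicate(float dR2, float dR2Cut) {
  assert(dR2Cut > 0.f);
  return dR2 < dR2Cut;
}
```

For a made-up pLS at (η, φ) = (1.00, 0.50) and a T5 at (1.03, 0.54), `deltaR2` gives about 0.0025, so the pLS survives the default cut of 0.001 and is flagged as a duplicate at 0.01.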

> What are the definitions of eta and phi in each case?

That's a good question, and one I will need to look around to answer, as the code is probably scattered across multiple files. If anyone knows offhand, please come to the rescue; otherwise I will look for it later today.

VourMa · 2024-04-03T00:27:43Z

Another way to deal with this issue would be to run another linking iteration between the T5s and pLSs that are about to be added to the TC collection, with selections loosened relative to the original pT5 creation selections.
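A toy host-side sketch of what such a second pass could look like, purely as an illustration. All types and function names here are hypothetical, and since the actual pT5 creation selections are not shown in this thread, a plain dEta^2 + dPhi^2 cut stands in for the "loosened selections".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a pLS or a T5 track candidate.
struct Candidate {
  float eta;
  float phi;
  bool linked;  // already merged with a partner
};

struct Link {
  std::size_t plsIndex;
  std::size_t t5Index;
};

// Placeholder selection (assumption): not the real pT5 criteria.
// phi wrap-around is ignored for brevity.
bool passesLooseSelection(const Candidate& pls, const Candidate& t5, float looseCut) {
  float dEta = pls.eta - t5.eta;
  float dPhi = pls.phi - t5.phi;
  return dEta * dEta + dPhi * dPhi < looseCut;
}

// Second pass: link each still-unlinked T5 to the first unlinked pLS that passes
// the loose selection, so the pair can be merged instead of kept as two candidates.
std::vector<Link> relinkPass(std::vector<Candidate>& plss, std::vector<Candidate>& t5s, float looseCut) {
  assert(looseCut > 0.f);
  std::vector<Link> links;
  for (std::size_t i = 0; i < t5s.size(); ++i) {
    if (t5s[i].linked) continue;
    for (std::size_t j = 0; j < plss.size(); ++j) {
      if (plss[j].linked || !passesLooseSelection(plss[j], t5s[i], looseCut)) continue;
      plss[j].linked = true;
      t5s[i].linked = true;
      links.push_back({j, i});
      break;
    }
  }
  return links;
}
```

The design choice in this sketch is to merge matched pairs rather than discard one of them, which mirrors the observation above that merging pLSs with their duplicate T5s lowers the duplicate rate while keeping the longer track.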

VourMa added the "enhancement" (New feature or request) label · Jan 25, 2024

VourMa added the "good first issue" (Good for newcomers) label · Apr 3, 2024
