Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Candcloner bugfixes and follow up mitigation #102

Merged
merged 14 commits into from
Aug 18, 2022
Merged

Conversation

slava77
Copy link

@slava77 slava77 commented Jun 25, 2022

  • consistent handling of seeds sorting, which affected iterations that do not have seed merging (pixelLess)
  • Fixes in the candCloner:
    • properly trace the overlap hits, to not be used in the Kalman updates (these were out of order and were used for Kalman updates)
    • do not add overlap hit chi2 to the candidate score (the overlap hits are meant to be auxiliary during building)

Based on the validation plots http://uaf-10.t2.ucsd.edu/~legianni/check-clonerFIX2/ ,

  • out-of-the-box (OOB) there impact is rather minimal
    • the efficiency is almost unchanged (small increase at very low pt)
    • fakes are up a bit (likely via pixelLess); duplicates are a bit down
    • tracks are slightly longer (more visible in the barrel)
  • the initialStep built tracks are almost unchanged ; average fakes are lower (–5% in some bins) at low pt, but the average vs eta has an increase in the barrel (and some decrease in the endcaps). Tracks are longer by about 0.25 hits
  • highPtTriplet built tracks have up to 5% fractional increase in the efficiency at lower pt, but fakes are up by a few % absolute scale this iteration comparison is using trackDNN candidate/builtTracks selection (need a check without it). Tracks are longer by 0.06 hits on average (the distribution has an increase in the shorter tracks)
  • detachedQuad built tracks are mostly unchanged. Tracks are longer by 0.15 hits
  • detachedTriplet built tracks have up to 10% more (fractional) efficiency below 1 GeV, the fake rate is up by about 5% at low pt as well (it is on average higher for eta < 2, and is lower for eta > 2); the fake rate increase is most visible for d0>0.1 cm. Tracks are longer by 0.13 hits, but there is an increased component of short tracks (5-7 hits).
  • pixelLess is the most affected this iteration comparison is using trackDNN candidate/builtTracks selection (need a check without it):
    • the efficiency is up by a few % below 2 GeV; by about 15% for large displacements (this region is known to be depressed by the DNN selections )
    • the fake rate is up by about 20%
    • the track length is down by 0.2 on average; there is a small component of more longer tracks, but the average is overtaken by a factor of a few more short tracks

So, based on the observations, I think that this PR can not go as is

  • (already clear) we need some mitigation updates in the pixelLess selections to reduce the number of fakes.
  • MTV plots for running without candidate DNN are needed
  • need to get comparisons in the softQCD sample
  • extended comparisons are needed to see more details

osschar added 4 commits June 22, 2022 12:52
1. Loop over all available hits, until max-cand output slots are filled.

2. Make sure Kalman update is done on the main selected hit (not overlap).

3. Do not bump candidate chi2 when adding overlap hits.
@leonardogiannini
Copy link

I'm attaching TTbar and softQCD comparisons without cand. level DNN (in high pT triplet and pixelLess)

blue is CKF, red is baseline without cand DNN and black is after the fix in cand cloner.

*chi2 is also propagated from mkFit to the cand level DNN https://github.com/cms-sw/cmssw/blob/master/RecoTracker/MkFit/plugins/MkFitOutputConverter.cc#L693.

@slava77
Copy link
Author

slava77 commented Jun 27, 2022

  • highPtTriplet (now without the cand DNN) built tracks have fairly small change, perhaps about 0.5-1% higher efficiency for pt< 2 GeV and no substantive change in the fake rate. The tracks are 0.23 hits longer and there is no component with increasing short hits.
    • "byOriginalAlgo" is still almost unchanged
    • recall that the changes were much larger with trackDNN, which points to a larger impact on the candidates (before the final fit)
  • pixelLess built tracks have
    • no clear change in eff vs pt or eta; t
    • he fake rate goes up by 1-2% for pt<2 GeV (the total number of fake tracks is up by about 5% in the same range; the largest increase is around eta 1.2 with 30% more fake tracks)
    • tracks still show an increase in the longer tracks, but also in the short tracks (quality filters need review?)

…o the Kalman update plane in the barrel (can be controlled by KFO_Local_Cov)
@slava77
Copy link
Author

slava77 commented Jul 27, 2022

I pushed the "local" covariance updates to this PR f8f6ada...eecfaba

This effectively applies a restriction of the track state covariance to the plane where the Kalman update is made. The procedure is to rotate the space coordinates to have x orthogonal to the update plane (the same angle of rotation as the one used in the Kalman projection matrix), then zero-out the x component of the covariance and rotate back. The implementation in the code is after the explicit expansion and can be controlled with KFO_Local_Cov flag.

This affects only the barrel Kalman updates; in the endcap the update is already done properly in the xy plane.
In the barrel the propagated covariance is apparently constrained in the tangent plane (fixed path) instead of fixed R. The difference decreases with increasing pt.

In the following, the impact of these commits can be seen between black and orange:

image

image

image

@slava77
Copy link
Author

slava77 commented Jul 28, 2022

it turns out that while transferring from my debugging/testing setup in 12_4_0 to the mergeable in CMSSW version in 12_5_0_pre2, I missed a switch to enable the covariance matrix restriction on the backward fit.
(the summary plots were from a correctly working variant in 12_4_0; the longer story is that the initial validation plots were for a code where the constraint was accidentally enabled ignoring the control flag and then while making it properly controlled by the flag I missed the MkFinder calls for bw fit)

This is now pushed in 7f5d194 in this PR.

@mmasciov

@slava77
Copy link
Author

slava77 commented Aug 3, 2022

as noted on Monday, I have pushed a bugfix in the propToR Jacobian f71fb49.

Looking back, the effective flip of the sign happened in trackreco/mkFit#46 in trackreco/mkFit@a55225a#diff-12bbb3341241b082ec368e3bc565eddc0d144431034349ce475245bffd52ddebL453-R452

Effectively, only the first step of the 5-iteration propagation contributed to the Jacobian derivatives computation. After the fix, each iteration step contributes to the accumulation of affected derivatives.

One side effect is that cov used to compute chi2 (small step propagation from the nominal radius) and cov used in update (large step propagation from the previous hit layer) was inconsistent.

For one specific "friend" (pixelLess candidate with state params x y z 1/pt phi theta: -12.012802 41.732376 -14.565144 1.726228 2.292912 1.974368) prop errors xx, xy, yy:

  • before the fix (notice the quite significant differences in each step)
    • during chi2 comp: 0.051847 0.018698 0.006746
    • during Kalman update, before local projection: 0.0324688 0.0245153 0.0185123
    • after local projection from e3f2c8c earlier in this PR: 4.101098e-02 1.168057e-02 3.326809e-03
  • after the fix:
    • during chi2 comp: 5.755034e-02 1.638089e-02 4.662616e-03
    • during Kalman update before local projection: 5.896209e-02 1.577486e-02 4.220664e-03
    • during Kalman update after local projection: 5.842577e-02 1.664057e-02 4.739495e-03

After this fix the "friend" backward propagation has reasonable chi2 (max at 3.4 per hit), while before the fix some hits were going over chi2 of 200.

Even though the change seems significant, the changes in physics are fairly small, as seen in the incremental comparison with 6-iter mkFit (with DNN on candidates disabled) and (blue is before the fix)

  • ttbar PU50
    • OOB all tracks relatively unchanged (tiny increase in eff also has a small increase in fakes):

image

- pixelLess built tracks look a bit better (more eff, lower fake)

image

are a bit longer

image

.. and probably because of increase in length the fake rate instead increases after DNN selection in allTracksByOriginalAlgo

image

- [SoftQCD_XiFilter](https://slava77.web.cern.ch/slava77/mic/CMSSW_12_4_0-pixmkf/SoftQCD_XiFilter/mkfit-6-nocdnn-candcloner-3-fix-covLocal-noD0filt-43b4019_5e0cebb/)

@slava77
Copy link
Author

slava77 commented Aug 4, 2022

one more commit 4b7c60f ( fix matched layer counting in nHitsByTypeEncoded and nLayersByTypeEncoded). in one example of a candidate in the barrel with valid hits on layers 4 5 6 7 8 9 10 11, which should be 5 matched layers, we counted before the fix only 3 layers and filtered out quite a few candidates.

This fix can technically affect the detachedTriplets, but there the effect seems almost invisible. Most of the visible change is in the pixelLess step and in particular in cases without attached pixel hits. Validation plots with candidate DNN disabled and using our 6-iter mkFit setup; comparing CKF (blue) with baseline as in CMSSW_12_4_0 (red), next-to-last commit (black), and the last commit (orange), without changes in the candidate filters:

  • ttbar PU50
    • pixelLess built tracks vs displacement about a half of inefficiency is recovered without much increase in fakes image
  • SoftQCD
    • OOB all tracks show an improvement in efficiency even with the current DNN image
    • pixelLess built tracks are better at low pt, but still have more work to do image
    • pixelLess built tracks vs vtx position show very significant improvement for large displacement (black->orange) image

@slava77
Copy link
Author

slava77 commented Aug 8, 2022

One more commit is added aad62ce, to account for dphi spread corresponding to the layer thickness. It is used in the selectHitIndices and is enabled only if no holes are allowed in the trajectory. This dphi is still clamped by the range defined with the MC truth "training".

I tried to apply this more generally, but could not find a positive effect:

  • in the barrel still with zero holes almost nothing changes (ttbar and SoftQCD blue: before changes, red endcap, black both barrel and endcap) I'm not sure if this is a feature of the tracks or something is out of place in the phi windows.
  • if the cut is enabled for all iterations, the performance degrades a bit: more fakes and even a bot lower efficiency (incremental diffs: blue before changes, red with zero holes [just pixelLess] and black enabled everywhere : ttbar SoftQCD )
    • e.g. in ttbar detachedQuad image

Plots for the last commit are (comparing ckf blue, mkfit in the release red, before last commit in black, and this commit in orange):

  • ttbar
    • pixelLess builtTracks hit distributions show longer tracks image
    • pixelLess builtTracks eff and fake has a small increase in efficiency below 1 GeV and even some decrease in fakes in that same range image
    • pixelLess built track fakes at large d0 are a bit lower image
  • SoftQCD
    • in pixelLess builtTracks there is some noticeable increase of efficiency below 0.9 GeV image

@slava77 slava77 marked this pull request as ready for review August 13, 2022 19:49
@slava77
Copy link
Author

slava77 commented Aug 13, 2022

I have merged the commit from @mmasciov
and changed this to no-draft.

@slava77
Copy link
Author

slava77 commented Aug 13, 2022

the last commits include:

  • [f229157 and 9f5b62e] relaxed requirement in the pixelLess trajectory filtering to allow tracks with >= 6 matched hits/layers. Soft QCD pixelLess built tracks show the effect of this change from black to orange (compared to CKF in blue and red for what we have without this PR) image
  • [08be050] disable cand DNN selection in the pixelLess iteration. The justification is a bit biased by the physics performance and looking for a default setup that allows to monitor built tracks without this DNN in view of a retraining. The impact is most visible in the SoftQCD efficiency vs vtx displacement in the pixelLess iteration allTracks for sim tracks with pt<0.9 GeV pink has cand DNN, orange doesn't (red is before this PR and blue/black are with CKF pixelLess before/after this PR )image

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants