Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

speedup SiStripClusterizer(FromRaw) using ThreeThresholdAlgorithm #47061

Merged
merged 4 commits into from
Jan 9, 2025

Conversation

slava77
Copy link
Contributor

@slava77 slava77 commented Jan 8, 2025

The primary goal was to speedup the full-unpacking configuration of SiStripClusterizerFromRaw.

The following relatively straightforward updates are made:

  • concrete ThreeThresholdAlgorithm is passed using templates and most methods of ThreeThresholdAlgorithm are inlined
  • per-strip noise and quality bit values are precomputed in creation of SiStripClusterizerConditions (the downside here is around 32MB of memory in conditions data and the cost of precomputation equal to around 10 events of strip unpacking cost)

Overall SiStripClusterizerFromRaw is faster by 27% on ttbar relval Run3 MC, running with full unpacking (measured with callgrind on 300 events on the HLT config in CMSSW_14_1_4_patch5):

  • inlines and templates give 14%
  • conditions precomputation another 13%

No differences in physics results are expected.

In case backports are needed, the commits are made so that the backport is trivial down to 14_1_X.

@mmasciov

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 8, 2025

cms-bot internal usage

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 8, 2025

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 8, 2025

A new Pull Request was created by @slava77 for master.

It involves the following packages:

  • CalibFormats/SiStripObjects (alca)
  • RecoLocalTracker/SiStripClusterizer (reconstruction)

@atpathak, @cmsbuild, @consuegs, @jfernan2, @mandrenguyen, @perrotta can you please review it and eventually sign? Thanks.
@GiacomoSguazzoni, @VinInn, @VourMa, @alesaggio, @echabert, @felicepantaleo, @gbenelli, @gpetruc, @jlidrych, @missirol, @mmusich, @mtosi, @robervalwalsh, @rovere, @rsreds, @threus, @tocheng, @yduhm, @yuanchao this is something you requested to watch as well.
@antoniovilela, @mandrenguyen, @rappoccio, @sextonkennedy you are the release manager for this.

cms-bot commands are listed here

@slava77
Copy link
Contributor Author

slava77 commented Jan 8, 2025

@cmsbuild please test

@mmusich
Copy link
Contributor

mmusich commented Jan 8, 2025

type performance-improvements

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 8, 2025

+1

Size: This PR adds an extra 32KB to repository
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-06aba9/43681/summary.html
COMMIT: 49e294b
CMSSW: CMSSW_15_0_X_2025-01-08-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/47061/43681/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

@slava77
Copy link
Contributor Author

slava77 commented Jan 8, 2025

Reco comparison results: 351 differences found in the comparisons

DQMHistoTests: Total failures: 6113

these are apparently from non-reproducible workflows: .7 (all-mkfit), and phase-2

@jfernan2
Copy link
Contributor

jfernan2 commented Jan 9, 2025

+1

@perrotta
Copy link
Contributor

perrotta commented Jan 9, 2025

+alca

  • Memory increase in SiStripClusterizerConditions is 16x768 bool's and 16x768 uint16_t's, which is manageable I guess

@cmsbuild
Copy link
Contributor

cmsbuild commented Jan 9, 2025

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @sextonkennedy, @antoniovilela, @rappoccio, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2)

bool allBadBetween(uint16_t L, const uint16_t& R) const {
while (++L < R && bad(L)) {
};
return L == R;
}
static constexpr uint16_t kMaxStrips = 768;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could one think about sparing some empty entries by using vectors and resizing at construction time with the correct number of strips (4 APVs vs 6 APVs)? The size is known via the detid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mmusich
do you happen to know if the number of APVs is strictly connected to the already available variables in the SiStripClusterizerConditions::emplace_back method: invGains or connections ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think invGains.size() should know about it:

std::vector<float> invGains;
invGains.reserve(6);
std::transform(
gainRange.first, gainRange.second, std::back_inserter(invGains), [](auto gain) { return 1.f / gain; });

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using vectors

I expect that access to vector<bool> is slower than array<bool. I guess I can use vector<uint8_t> or char to have a matching performance

@mandrenguyen
Copy link
Contributor

+1
Perhaps we take note of the following comment from Marco as a potential follow-up improvement:
#47061 (comment)

@cmsbuild cmsbuild merged commit d288c81 into cms-sw:master Jan 9, 2025
11 checks passed
@slava77
Copy link
Contributor Author

slava77 commented Jan 9, 2025

Memory increase in SiStripClusterizerConditions is 16x768 bool's and 16x768 uint16_t's, which is manageable I guess

I'm not sure why 16x; there are 14.5K strip modules.

@mandrenguyen
Copy link
Contributor

@mmusich
Copy link
Contributor

mmusich commented Jan 10, 2025

Could the following failure be related to this PR?

nope, see #46853 (comment)

@mandrenguyen
Copy link
Contributor

Could the following failure be related to this PR?

nope, see #46853 (comment)

Ah, thanks. Sorry for the noise.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants