-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove use of thread_local from ModuleAllocMonitor #46606
Conversation
Problems were seen related to thread local algorithm being called recursively. Replaced thread local with static allocation assignment using thread ids.
please test |
cms-bot internal usage |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46606/42504 |
A new Pull Request was created by @Dr15Jones for master. It involves the following packages:
@Dr15Jones, @makortel, @smuzaffar can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
+1 Size: This PR adds an extra 16KB to repository Comparison SummarySummary:
|
I see the approach is similar to #45150, but not exactly the same. Why does I also wonder if it would feasible to share some of these complicated codes between here and |
The In this PR, each thread must have its own info for a long period of time (i.e. the time period for a module processing an event). If I were to use a mutex to resolve any possible hash collisions of the thread ids, then we could potentially slow down the job a great deal as threads would have to wait a long time. Also, if for some reason an end service signal did not fire so the lock was never released the job would probably deadlock. Therefore assigning permanent indices into the allocation info containers for each thread id seemed to be the potentially faster and safer option. In this code the first event to hash into the container gets the 'O(1)' access and any subsequent threads that have the same hash get relegated to the linear lookup access. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also add a note in the README.md that the AllocMonitor implementations should not use thread_local
, maybe with a pointer to ModuleAllocMonitor
if per-thread storage is needed? (although if that happens I'd like to consider stronger a shared implementation)
v = kUnusedEntry; | ||
} | ||
for (auto& v : hashed_threads_) { | ||
v = ++entry; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why these free "hashed threads" are marked differently from the "extra threads"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this special value for a free "hashed thread" mean there is an assumption the thread ID can not be smaller than kHashedEntries
? (probably it is a good assumption, but nevertheless)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the hashed thread all the thread ids have been modulo-ed by kHashedEntries
. The value initially assigned to the hashed_threads_
entry is actually a value that is impossible to be allowed in that spot with such a modulo (i.e. it is always 1 larger than the smallest allowed value in that spot).
Given the extra thread is linearly search, there is notvalue that can be safely inserted as the initial value for an entry. I assumed that no thread would be assigned ~entry_type(0)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, got it. Could you add these comments to code itself (for the next reader)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the comments.
README.md was updated with a warning about |
please test |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46606/42535 |
Pull request #46606 was updated. @Dr15Jones, @cmsbuild, @makortel, @smuzaffar can you please check and sign again. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-46606/42537 |
please test |
Pull request #46606 was updated. @Dr15Jones, @makortel, @smuzaffar can you please check and sign again. |
+1 Size: This PR adds an extra 20KB to repository Comparison SummarySummary:
|
Comparison differences are related to #46416 |
+core |
This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @rappoccio, @sextonkennedy, @antoniovilela, @mandrenguyen (and backports should be raised in the release meeting by the corresponding L2) |
I ran the same setup that resulted the failure reported in #45964 description with this PR a few times, and saw no failures. |
+1 |
PR description:
Problems were seen related to thread local algorithm being called recursively. Replaced thread local with static allocation assignment using thread ids.
PR validation:
Tested using step1 of workflow 11834.21 which previously had a segmentation fault when used with ModuleAllocMonitor and multiple threads. Now it worked.
fixes #45964