Skip to content
This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Move initialization of YieldProcessorNormalized to the finalizer thread #14058

Merged
merged 2 commits into from
Sep 19, 2017

Conversation

kouvel
Copy link
Member

@kouvel kouvel commented Sep 19, 2017

Fixes https://github.com/dotnet/coreclr/issues/13984

  • Also moved relevant functions out of the Thread class as requested in the issue
  • For some reason, after moving the functions out of the Thread class, YieldProcessorNormalized was not getting inlined anymore. It seems to be important to have it be inlined such that the memory loads are hoisted out of outer loops. To remove the dependency on the compiler to do it (even with forceinline it's not possible to hoist sometimes, for instance InterlockedCompareExchnage loops), changed the signatures to do what is intended.

@kouvel kouvel added this to the 2.1.0 milestone Sep 19, 2017
@kouvel kouvel self-assigned this Sep 19, 2017
Fixes https://github.com/dotnet/coreclr/issues/13984
- Also moved relevant functions out of the Thread class as requested in the issue
- For some reason, after moving the functions out of the Thread class, YieldProcessorNormalized was not getting inlined anymore. It seems to be important to have it be inlined such that the memory loads are hoisted out of outer loops. To remove the dependency on the compiler to do it (even with forceinline it's not possible to hoist sometimes, for instance InterlockedCompareExchnage loops), changed the signatures to do what is intended.

// Defaults are for when InitializeYieldProcessorNormalized has not yet been called or when no measurement is done, and are
// tuned for Skylake processors
int g_yieldsPerNormalizedYield = 1; // 9 for pre-Skylake
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this "9 for pre-Skylake" comment mean?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For pre-Skylake processors it would be about 9 yields per normalized yield

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. It'd be helpful to expand the comment to highlight that this is an estimate for what the value will be, not that we're somewhere setting this to a hardcoded "9".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -744,6 +744,8 @@ DWORD WINAPI FinalizerThread::FinalizerThreadStart(void *args)
#endif
GetFinalizerThread()->SetBackground(TRUE);

EnsureYieldProcessorNormalizedInitialized();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I understanding the change correctly that until the finalizer thread gets to this point, spinning will be done using a default value, and then once this runs, that value will be updated appropriately based on the system? I think that makes sense, just wanted to confirm.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's right

stephentoub
stephentoub previously approved these changes Sep 19, 2017
@kouvel
Copy link
Member Author

kouvel commented Sep 19, 2017

I just realized that the scheme doesn't work for the back-off version, I'll change it around a bit

@kouvel
Copy link
Member Author

kouvel commented Sep 19, 2017

Going to revoke your approval as a result

@kouvel kouvel dismissed stephentoub’s stale review September 19, 2017 15:11

Need to change it a bit

@stephentoub
Copy link
Member

Thanks

@tarekgh
Copy link
Member

tarekgh commented Sep 19, 2017

return optimalMaxNormalizedYieldsPerSpinIteration;

any reason why we cannot just return g_optimalMaxNormalizedYieldsPerSpinIteration?


Refers to: src/vm/comsynchronizable.cpp:1641 in ffab10f. [](commit_id = ffab10f, deletion_comment = False)

@kouvel
Copy link
Member Author

kouvel commented Sep 19, 2017

any reason why we cannot just return g_optimalMaxNormalizedYieldsPerSpinIteration?

I think it's ok to do that for this case, the other QCALL functions are organized in this way I think to make sure all of the time-consuming work happens between BEGIN_QCALL and END_QCALL so I just followed the same pattern.

@kouvel
Copy link
Member Author

kouvel commented Sep 19, 2017

@dotnet-bot test Windows_NT x64 Release Priority 1 Build and Test

@kouvel kouvel merged commit ca01314 into dotnet:master Sep 19, 2017
@kouvel kouvel deleted the MoveYpNormalized branch September 19, 2017 20:54
kouvel added a commit to kouvel/coreclr that referenced this pull request Sep 19, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants