Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify the behavior of @threads for #44168

Merged
merged 6 commits into from
Mar 3, 2022
Merged

Conversation

tkf
Copy link
Member

@tkf tkf commented Feb 14, 2022

The introduction of the new scheduler for @threads #43919 triggered some discussions:

which made me realize that the current docstring does not explain what it does clearly and accurately. So, I tried to clarify the docstring and also define the properties of the :dynamic scheduler a bit more to help users understand when to use it. Along the way, I also re-organized the docstring to hopefully make it easy to navigate.

ping @carstenbauer @Moelf

@tkf tkf added the multithreading Base.Threads and related functionality label Feb 14, 2022

See also: [`@spawn`](@ref Threads.@spawn),
`pmap` in [`Distributed`](@ref man-distributed), and
`BLAS.set_num_threads` in [`LinearAlgebra`](@ref man-linalg).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure the reference to set_num_threads is helpful?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought it was weird but I assumed someone wanted to add it. But looking at "git blame", it was added in #38606 which was a rather very broad "catch-all" patch. So I think dropping here makes sense.

@mcabbott Do you have some particular reasons to keep it?

One way to make it relevant is to link to @carstenbauer's https://carstenbauer.github.io/ThreadPinning.jl/dev/explanations/blas/ or add something similar to the manual.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong feelings. The intention was to point out that what's inside your loop may already be multi-threaded, and that you may want to disable that. But perhaps just a link doesn't help much unless you already know? Something like a manual section might be clearer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tkf I think that something along the lines of what I wrote there should, in some form, go into the manual. AFAIK, this is also @ViralBShah's opinion (see here).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I opened issue #44201 for tracking this. We can add a link once we add a manual section.

@vchuravy vchuravy added the docs This change adds or pertains to documentation label Feb 14, 2022
@vchuravy vchuravy added this to the 1.8 milestone Feb 14, 2022
base/threadingconstructs.jl Outdated Show resolved Hide resolved
@KristofferC KristofferC added the backport 1.8 Change should be backported to release-1.8 label Feb 16, 2022
@IanButterworth
Copy link
Member

IanButterworth commented Feb 20, 2022

Reading this, I am again wondering whether :dynamic is the right name..

@tkf your :uniform suggestion is making more sense now that this new behavior is default. i.e. if it wasn't default, I think the name isn't evocative enough, but as default it doesn't need to be.

So thinking ahead:

  • :static - legacy, and makes sense
  • :uniform - the new (default) behavior, for uniform division of work
  • :dynamic - reserve for potential future load-balanced option?

@tkf
Copy link
Member Author

tkf commented Feb 23, 2022

I prefer to focus on documentation in this PR, although I tried to structure it such that adding a new scheduler is easy. I'm not planning to do it myself, but I think we can make :dynamic more dynamic in a separate PR.

Comment on lines +152 to +156
iteration space. Thus, `@threads :dynamic for x in xs; f(x); end` is typically more
efficient than `@sync for x in xs; @spawn f(x); end` if `length(xs)` is significantly
larger than the number of the worker threads and the run-time of `f(x)` is relatively
smaller than the cost of spawning and synchronizaing a task (typically less than 10
microseconds).
Copy link
Member Author

@tkf tkf Feb 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"typically less than 10 microseconds" is from

julia> Threads.nthreads()
2

julia> @benchmark wait(Threads.@spawn nothing)
BenchmarkTools.Trial: 10000 samples with 4 evaluations.
 Range (min  max):  1.455 μs  53.389 μs  ┊ GC (min  max): 0.00%  0.00%
 Time  (median):     9.047 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   9.004 μs ±  1.242 μs  ┊ GC (mean ± σ):  0.00% ± 0.00%

                                        ██▁   ▁▄▄▂
  ▂▁▁▂▁▁▁▁▂▁▂▁▁▁▂▁▂▂▁▁▁▂▂▂▂▂▂▂▂▂▃▃▄▃▆▆▄▆███▆▅▆█████▆▅▄▄▅▅▅▄▃ ▃
  1.46 μs        Histogram: frequency by time        11.5 μs <

 Memory estimate: 480 bytes, allocs estimate: 5.

I'm looking at the median (and the mean, which is not very different from the median) as a "typical" case. It's more useful than min since the multi-queue is a random and concurrent algorithm.

This is on Linux. Can someone check it with other OSes? Note that it's important to use nthreads() > 1 since nthreads() == 1 is special-cased.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't the second part of the statement be reversed?

and the run-time of f(x) is relatively smaller larger than the cost of spawning and synchronizing a task (typically less than 10 microseconds).

Copy link
Contributor

@Moelf Moelf Mar 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it's correct as it is. If the cost of f(x) is larger, then the cost of @spawn would be negligible, so you won't see @threads :dynamic for being more efficient.

@KristofferC
Copy link
Member

Bump, can this be merged?

@oscardssmith
Copy link
Member

LGTM

@oscardssmith oscardssmith merged commit 2f67b51 into JuliaLang:master Mar 3, 2022
KristofferC pushed a commit that referenced this pull request Mar 4, 2022
* Clarify the behavior of `@threads for`

Co-authored-by: Ian Butterworth <[email protected]>
(cherry picked from commit 2f67b51)
KristofferC pushed a commit that referenced this pull request Mar 7, 2022
* Clarify the behavior of `@threads for`

Co-authored-by: Ian Butterworth <[email protected]>
(cherry picked from commit 2f67b51)
@KristofferC KristofferC removed the backport 1.8 Change should be backported to release-1.8 label Mar 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs This change adds or pertains to documentation multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants