Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds interpolation syntax to Threads.@spawn, to evaluate arguments immediately #33119

Merged
merged 4 commits into from
Dec 18, 2019

Conversation

NHDaly
Copy link
Member

@NHDaly NHDaly commented Aug 30, 2019

Adds $-interpolation syntax to @async and Threads.@spawn, to evaluate arguments immediately.

Add the ability to evaluate some parts of a @spawn/@async
immediately, in the current thread context.

This prevents variables being "boxed" in order to capture them in the
closure, exactly the same as wrapping them in a let-block locally.

For example, $x expands like this:

julia> @macroexpand @async $x + 2
quote
    #= task.jl:361 =#
    let var"##454" = x
        #= task.jl:362 =#
        local var"#9#task" = Base.Task((()->begin
                            #= task.jl:358 =#
                            var"##454" + 2
                        end))
        #= task.jl:363 =#
        if $(Expr(:islocal, Symbol("##sync#95")))
            #= task.jl:364 =#
            Base.push!(var"##sync#95", var"#9#task")
        end
        #= task.jl:366 =#
        Base.schedule(var"#9#task")
        #= task.jl:367 =#
        var"#9#task"
    end
end

Fixes #30896

Add the ability to evaluate some parts of a @spawn immediately, in the current thread context.

Adds a new macro, Threads.@spawncall, which evaluates its arguments immediately, then @spawns the closure with the argument values.

This acts more like a "parallel function call", where the arguments are evaluated immediately, in the current thread, and the function is then invoked in any available thread with the stored argument values.

EDIT: Updated to add $-interpolation syntax support to @spawn to allow you to manually specify which parts of the expression to evaluate early.

This prevents variables being "boxed" when capturing them in the closure.

This mirrors the behavior of golangs go command, which requires it be used with a function call.

Fixes #30896

@NHDaly
Copy link
Member Author

NHDaly commented Aug 30, 2019

(If we move forward with this PR, we'll probably want to add an equivalent for @async. I don't love the name @asynccall because the consecutive 'c's make it hard to read. There's @async_call but it's a bit ugly...)

@vchuravy
Copy link
Member

I am not the biggest fan of adding a new name/verb with subtlety different behavior.

I liked Jeff's idea of passing arguments to be evaluated, which seems similar to let

@NHDaly
Copy link
Member Author

NHDaly commented Aug 30, 2019

I am not the biggest fan of adding a new name/verb with subtlety different behavior.

Yeah, i agree, i think that's a reasonable concern. Thanks.

I liked Jeff's idea of passing arguments to be evaluated, which seems similar to let

My concerns with passing arguments to be evaluated are two-fold:

  1. If we require users to specify which arguments to evaluate, it's not functionally much different than just having users use a let block. My main motivation for introducing something like this is to enable users to super easily write multithreaded function calls (as easy as it is in go) without having to understand the intricacies of scoping and closure capture. I think that if we had a mechanism that was specific to function calls, and we encourage its use, then people will just know "oh i call functions with this verb, and I spawn arbitrary blocks of code with this other verb." And then their code will just work as expected without having to understand why sometimes variables are boxed.
  2. I think the syntax is strange and non-obvious. If I just read @async i j k println(i, j, k), i would definitely not understand what the first few arguments were doing there, and i wouldn't know when I was supposed to put them there or not.
    • Maybe this could be improved if it was a more explicit kwarg? Something like @async capture=[i,j,k] println(i, j, k) or something?

But yeah, i do agree that it could be non-ideal to have these two verbs that are pretty similar yet subtly different...

What if this also had a different syntax, not just a different name? That could help distinguish them better?
Maybe something silly like:

function spawnfunc(f)
    (args...; kwargs...) -> @spawn f(args...; kwargs...)
end
julia> spawnfunc(println)(1)
1
Task (done) @0x0000000114d958d0

julia> fetch(spawnfunc(sort)([3 2; 1 0], dims=2))
2×2 Array{Int64,2}:
 2  3
 0  1

EDIT: One problem with doing it this way, I think, is that enclosing @sync blocks stop working with the @spawn, because the ##sync#72 variable is out of scope.. It would probably work if spawnfunc were a macro as well, something like @spawnfunc(f)(args...) but i'd need to think about it more.

@NHDaly
Copy link
Member Author

NHDaly commented Aug 30, 2019

  • Maybe this could be improved if it was a more explicit kwarg? Something like @async capture=[i,j,k] println(i, j, k) or something?

... Or, as another idea, maybe we should just use $ for this? It's more or less consistent with the concept of interpolation, and basically matches its use in other contexts...?

So it'd be something like @async println($i, $j, $k). Which actually seems not terrible to me? Anything being interpolated is evaluated in the current Task, and anything not interpolated is evaluated as part of the closure, in the spawned Task.

@quinnj
Copy link
Member

quinnj commented Aug 30, 2019

I love the interpolation idea.

@NHDaly
Copy link
Member Author

NHDaly commented Aug 30, 2019

It's growing on me as well.

Oh, haha, i just realized i already suggested it back in January as well, in the original issue... haha it's funny how our brains are little automata that respond the same way to the same situations... 😅😬

#30896 (comment)

Here's what I said then:

  • We could consider allowing @async to use the $ interpolation syntax within its expression body to indicate pass-by-value vs pass-by-reference semantics. This would be kind of like what @btime does.
    In my example, that would become:

        @async println($i)

    Sadly this might be breaking, though, for any expressions that already contained quoting and interpolation inside the @async body. 😦 (e.g. @async println(:($x + 5))

I'm less worried about that caveat now than I was then.. I think all the cases where this would start to take affect would currently be syntax: "$" expression outside quote errors, so this should be available syntax! :) I think it would be fairly straightforward to parse the $ -- it'd be the same as is done for @eval, and I don't think there'd be any ambiguous cases.

While i'm here, i'll copy over the other references from the original issue:

@vchuravy
Copy link
Member

IIUC the semantics of Cilk and Go agree here, they both use spawn/go on function calls and evaluate the arguments of the function call before spawning. https://www.cilkplus.org/sites/default/files/open_specifications/Intel_Cilk_plus_lang_spec_1.2.htm#keywd.spawn

The general closure syntax of @async/@spawn makes this harder, but I rather like the pun upon the interpolation syntax (similar to BenchmarkTools).

@NHDaly
Copy link
Member Author

NHDaly commented Sep 10, 2019

Since there haven't been any further comments on this PR, i've gone ahead and implemented the $-interpolation approach.

It seems pretty good to me! And i am 95% sure it is a non-breaking change, since it's only weakening the pre-conditions on valid input. :)

The original example now works, via this interpolation:

    begin
        outs = zeros(5)
        @sync begin
            local i = 1
            while i <= 5
                Threads.@spawn setindex!(outs, $i, $i)
                i += 1
            end
        end
        @test outs == 1:5
    end

If we're happy with this, we can move the implementation to a shared common location, and use it in @async as well. (and clean up the implementation.) :)

@NHDaly
Copy link
Member Author

NHDaly commented Sep 10, 2019

I guess one very real problem with the current approach is that the $ has a slightly different meaning than it has in some other macro contexts, since we are simply moving the variables to be evaluated earlier, rather than actually evaling the interpolated expression and splicing the result in, in-place:

julia> @eval $(Expr(:call, :+, x, x))
4

julia> fetch(@spawn $(Expr(:call, :+, x, x)))
:(2 + 2)

julia> @btime $(Expr(:call, :+, x, x))
  0.036 ns (0 allocations: 0 bytes)
4

I'm afraid this might be confusing. EDIT: Though maybe not... See below:


That said, after thinking about it more, I guess actually the behavior is consistent with the meaning of $-interpolation.. It's just that both @eval and @btime promise to eval the resultant expression, whereas @spawn does not, which is why the Expression persists unchanged. This is the same as you'd get when interpolating into a string or a quote, for example:

julia> "$(Expr(:call, :+, x, x))"
"2 + 2"

julia> :($(Expr(:call, :+, x, x)))
:(2 + 2)

So maybe there's nothing wrong here after-all? It does seem a bit mysterious though, because people might be used to interpolating into an @eval call.

@vchuravy vchuravy changed the title Adds Threads.@spawncall, which evaluates arguments immediately Adds interpolation syntax to Threads.@spawn, to evaluate arguments immediately Sep 11, 2019
@StefanKarpinski
Copy link
Member

Bump, this seems like a really nice idea. cc @JeffBezanson, @vtjnash

@fredrikekre fredrikekre added the multithreading Base.Threads and related functionality label Oct 1, 2019
Copy link
Member

@vchuravy vchuravy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Just a bit of polish and adding news and I think this is good to go.

@@ -103,22 +103,28 @@ Create and run a [`Task`](@ref) on any available thread. To wait for the task to
finish, call [`wait`](@ref) on the result of this macro, or call [`fetch`](@ref)
to wait and then obtain its return value.

(Values can be interpolated into `@spawn` via `\$` to evaluate them in the current task.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a !!! compat note

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks. Is it okay that I added the compat note below like this, or should it go above the section on $-interpolation? Should I move that paragraph below the compat notes?

base/task.jl Show resolved Hide resolved
base/expr.jl Outdated Show resolved Hide resolved
test/threads_exec.jl Outdated Show resolved Hide resolved
@fredrikekre fredrikekre added the needs news A NEWS entry is required for this change label Oct 2, 2019
Copy link
Member Author

@NHDaly NHDaly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review, @vchuravy, and sorry for the delay! :)

EDIT: also added NEWS. (Unclear why the macos test is failing, but seems unrelated?)

base/expr.jl Outdated Show resolved Hide resolved
@@ -103,22 +103,28 @@ Create and run a [`Task`](@ref) on any available thread. To wait for the task to
finish, call [`wait`](@ref) on the result of this macro, or call [`fetch`](@ref)
to wait and then obtain its return value.

(Values can be interpolated into `@spawn` via `\$` to evaluate them in the current task.)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks. Is it okay that I added the compat note below like this, or should it go above the section on $-interpolation? Should I move that paragraph below the compat notes?

base/task.jl Show resolved Hide resolved
test/threads_exec.jl Outdated Show resolved Hide resolved
@NHDaly
Copy link
Member Author

NHDaly commented Dec 15, 2019

Friendly ping. I think this was mostly ready to go? :)

@vchuravy
Copy link
Member

Can you squash and rebase?

@JeffBezanson
Copy link
Member

Yes, let's get this rebased and merged.

@JeffBezanson JeffBezanson removed the needs news A NEWS entry is required for this change label Dec 16, 2019
This acts more like a "parallel function call", where the arguments are
evaluated immediately, in the current thread, and the function is then
invoked in any available thread with the stored argument values.

This prevents variables being "boxed" in order to capture them in the
closure.
This acts more like a "parallel function call", where the arguments are
evaluated immediately, in the current thread, and the function is then
invoked in any available thread with the stored argument values.

This prevents variables being "boxed" in order to capture them in the
closure.

Fix at-spawncall to work w/ kwargs; add unit tests

Rough implementation of `$`-interpolation in at-spawn
This acts more like a "parallel function call", where the arguments are
evaluated immediately, in the current thread, and the function is then
invoked in any available thread with the stored argument values.

This prevents variables being "boxed" in order to capture them in the
closure.

Fix at-spawncall to work w/ kwargs; add unit tests

Rough implementation of `$`-interpolation in at-spawn

Update spawncall tests to exercise the interpolation

Fix my basic "lift $" function for $ inside :()

Add string-interp test

Escape the `$` reference in at-spawn docstring

Oops, remove spurious "end" at end of file

Factor out `$`-lifting; share b/w `@async` & `@spawn`
@NHDaly
Copy link
Member Author

NHDaly commented Dec 18, 2019 via email

…uate arguments immediately.

Add the ability to evaluate some parts of a `@spawn`/`@async`
immediately, in the current thread context.

This prevents variables being "boxed" in order to capture them in the
closure, exactly the same as wrapping them in a let-block locally.

For example, `$x` expands like this:
```julia
julia> @macroexpand @async $x + 2
quote
    #= task.jl:361 =#
    let var"#JuliaLang#454" = x
        #= task.jl:362 =#
        local var"JuliaLang#9#task" = Base.Task((()->begin
                            #= task.jl:358 =#
                            var"#JuliaLang#454" + 2
                        end))
        #= task.jl:363 =#
        if $(Expr(:islocal, Symbol("##sync#95")))
            #= task.jl:364 =#
            Base.push!(var"##sync#95", var"JuliaLang#9#task")
        end
        #= task.jl:366 =#
        Base.schedule(var"JuliaLang#9#task")
        #= task.jl:367 =#
        var"JuliaLang#9#task"
    end
end
```
@NHDaly
Copy link
Member Author

NHDaly commented Dec 18, 2019

Okay looks good! :) It looks like the windows test failure is a flake? Thanks for the review

@JeffBezanson JeffBezanson merged commit 0df3fe7 into JuliaLang:master Dec 18, 2019
@NHDaly NHDaly deleted the spawncall branch December 19, 2019 06:00
NHDaly added a commit to NHDaly/julia that referenced this pull request Dec 19, 2019
Introduce synchronization (via a `Channel()`) to force spawned tasks to
run after the local variables are updated, showcasing the problem and
the solution with `$`-interpolation.
NHDaly added a commit to NHDaly/julia that referenced this pull request Dec 19, 2019
Introduce synchronization (via a `Channel()`) to force spawned tasks to
run after the local variables are updated, showcasing the problem and
the solution with `$`-interpolation.
vchuravy pushed a commit that referenced this pull request Dec 20, 2019
Introduce synchronization (via a `Channel()`) to force spawned tasks to
run after the local variables are updated, showcasing the problem and
the solution with `$`-interpolation.
KristofferC pushed a commit that referenced this pull request Apr 11, 2020
…immediately (#33119)

Adds $-interpolation syntax to `@async` and `Threads.@spawn`, to evaluate arguments immediately.

Add the ability to evaluate some parts of a `@spawn`/`@async`
immediately, in the current thread context.

This prevents variables being "boxed" in order to capture them in the
closure, exactly the same as wrapping them in a let-block locally.

For example, `$x` expands like this:
```julia
julia> @macroexpand @async $x + 2
quote
    #= task.jl:361 =#
    let var"##454" = x
        #= task.jl:362 =#
        local var"#9#task" = Base.Task((()->begin
                            #= task.jl:358 =#
                            var"##454" + 2
                        end))
        #= task.jl:363 =#
        if $(Expr(:islocal, Symbol("##sync#95")))
            #= task.jl:364 =#
            Base.push!(var"##sync#95", var"#9#task")
        end
        #= task.jl:366 =#
        Base.schedule(var"#9#task")
        #= task.jl:367 =#
        var"#9#task"
    end
end
```
KristofferC pushed a commit that referenced this pull request Apr 11, 2020
Introduce synchronization (via a `Channel()`) to force spawned tasks to
run after the local variables are updated, showcasing the problem and
the solution with `$`-interpolation.
@bramtayl
Copy link
Contributor

I wonder if this could be made more general? Perhaps marking any closure with $ could essentially "freeze" it? It would be a much nicer syntax than the let-block trick currently used to get around #15276

@bramtayl
Copy link
Contributor

As a bonus, $ currently errors at top-level, so it could be added pre-2.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multithreading Base.Threads and related functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Boxed variables in closures passed to a Task can cause race condition
7 participants