Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to using const Ref instead of mutable global #829

Closed
omus opened this issue Jun 24, 2020 · 13 comments
Closed

Switch to using const Ref instead of mutable global #829

omus opened this issue Jun 24, 2020 · 13 comments

Comments

@omus
Copy link
Contributor

omus commented Jun 24, 2020

I noticed in https://github.com/JuliaBinaryWrappers/MozillaCACerts_jll.jl/blob/12060f50fe9948363c300167506f26aa6c75415d/src/wrappers/any.jl#L18 that cacert is a mutable global. It seems like it would be more performant to change the code to be:

const cacert = Ref{String}()

function __init__()
    ...
    cacert[] = "..."
end

I'm guessing the rational for using a mutable global is purely for convenience.

@fingolfin
Copy link
Member

But what about code that accesses MozillaCACerts_jll.cacert? It would be broken by this, wouldn't it? I don't know if such code exists, but I definitely know that it exists for the foo_path variables...

At the very least, I think there should be strict benchmarks that conclusively prove there is something to be gained here. E.g. similar to what I did in PR #812.

@omus
Copy link
Contributor Author

omus commented Jun 25, 2020

Updating packages to use Ref would be breaking and we'd need to tag the new JLL packages appropriately.

@giordano
Copy link
Member

we'd need to tag the new JLL packages appropriately.

Which is not going to be an easy task as we use the upstream version

@vchuravy
Copy link
Member

Maybe we can use @deprecate_binding?

julia> r = Ref{Int64}()
Base.RefValue{Int64}(140208815775824)

julia> Base.@deprecate_binding sym r[] false

julia> sym
WARNING: Main.sym is deprecated, use r[] instead.
  likely near REPL[6]:1
140208815775824

@fingolfin
Copy link
Member

So I tried to see what benefit this brings, if any, by using the same setup as in PR #812, and tested using FFMPEG_jll, which pulls in a load of other JLLs.

First, I set up a fresh Julia depot with just that JLL (and its dependencies in it), and then took a fresh baseline timing (see the end of this comment for the results).

Then I modified all those JLLs to imitate the changes proposed here (as I understand them), limited to all the _path variables (since there are lot of them, I figured that if there is a measurable effect, then just changing them should reflect that already). I used these three commands to do that:

perl -pi -e 's/(global [a-zA-Z_]+_path) = /\1\[\] = /' */*/src/wrappers/*.jl
perl -pi -e 's/^([a-zA-Z_]+_path) = ""$/const \1 = Ref{String}()/' */*/src/wrappers/*.jl
perl -pi -e 's/\(([a-zA-Z_]+_path)\)/(\1\[\])/' */*/src/wrappers/*.jl

This results in diffs like this (let me know if that correctly captures the gist of the idea proposed here):

diff --git a/Zlib_jll/hR4qW/src/wrappers/aarch64-linux-gnu.jl b/Zlib_jll/hR4qW/src/wrappers/aarch64-linux-gnu.jl
index e9b35b6..127a3ed 100644
--- a/Zlib_jll/hR4qW/src/wrappers/aarch64-linux-gnu.jl
+++ b/Zlib_jll/hR4qW/src/wrappers/aarch64-linux-gnu.jl
@@ -11,7 +11,7 @@ LIBPATH_default = ""
 const libz_splitpath = ["lib", "libz.so"]

 # This will be filled out by __init__() for all products, as it must be done at runtime
-libz_path = ""
+const libz_path = Ref{String}()

 # libz-specific global declaration
 # This will be filled out by __init__()
@@ -29,12 +29,12 @@ function __init__()

     # Initialize PATH and LIBPATH environment variable listings
     global PATH_list, LIBPATH_list
-    global libz_path = normpath(joinpath(artifact_dir, libz_splitpath...))
+    global libz_path[] = normpath(joinpath(artifact_dir, libz_splitpath...))

     # Manually `dlopen()` this right now so that future invocations
     # of `ccall` with its `SONAME` will find this path immediately.
-    global libz_handle = dlopen(libz_path)
-    push!(LIBPATH_list, dirname(libz_path))
+    global libz_handle = dlopen(libz_path[])
+    push!(LIBPATH_list, dirname(libz_path[]))

     # Filter out duplicate and empty entries in our PATH and LIBPATH entries
     filter!(!isempty, unique!(PATH_list))

Now to the timings. First, the baseline (with default optimization level / -O1 / -O0; note that just the Julia startup is 90-95 ms, as in PR #812, so that should be subtracted from all numbers below)

seress:/tmp$ JULIA_DEPOT_PATH=/tmp/juliadepot hyperfine 'julia-1.5.0-beta1 --startup-file=no -e "using FFMPEG_jll"' --warmup 2 -r 20
Benchmark #1: julia-1.5.0-beta1 --startup-file=no -e "using FFMPEG_jll"
  Time (mean ± σ):      1.132 s ±  0.003 s    [User: 1.264 s, System: 0.419 s]
  Range (min … max):    1.128 s …  1.137 s    20 runs

seress:/tmp$ JULIA_DEPOT_PATH=/tmp/juliadepot hyperfine 'julia-1.5.0-beta1 --startup-file=no -O1 -e "using FFMPEG_jll"' --warmup 2 -r 20
Benchmark #1: julia-1.5.0-beta1 --startup-file=no -O1 -e "using FFMPEG_jll"
  Time (mean ± σ):     914.4 ms ±   3.6 ms    [User: 1.038 s, System: 0.427 s]
  Range (min … max):   907.7 ms … 921.9 ms    20 runs

seress:/tmp$ JULIA_DEPOT_PATH=/tmp/juliadepot hyperfine 'julia-1.5.0-beta1 --startup-file=no -O0 -e "using FFMPEG_jll"' --warmup 2 -r 20
Benchmark #1: julia-1.5.0-beta1 --startup-file=no -O0 -e "using FFMPEG_jll"
  Time (mean ± σ):     894.7 ms ±   2.9 ms    [User: 1.024 s, System: 0.423 s]
  Range (min … max):   891.2 ms … 902.3 ms    20 runs

Now the results with the change proposed here:

seress:/tmp$ JULIA_DEPOT_PATH=/tmp/juliadepot hyperfine 'julia-1.5.0-beta1 --startup-file=no -e "using FFMPEG_jll"' --warmup 2 -r 20
Benchmark #1: julia-1.5.0-beta1 --startup-file=no -e "using FFMPEG_jll"
  Time (mean ± σ):      1.114 s ±  0.002 s    [User: 1.242 s, System: 0.423 s]
  Range (min … max):    1.110 s …  1.120 s    20 runs

seress:/tmp$ JULIA_DEPOT_PATH=/tmp/juliadepot hyperfine 'julia-1.5.0-beta1 --startup-file=no -O1 -e "using FFMPEG_jll"' --warmup 2 -r 20
Benchmark #1: julia-1.5.0-beta1 --startup-file=no -O1 -e "using FFMPEG_jll"
  Time (mean ± σ):     900.2 ms ±   1.0 ms    [User: 1.036 s, System: 0.415 s]
  Range (min … max):   898.3 ms … 901.8 ms    20 runs

seress:/tmp$ JULIA_DEPOT_PATH=/tmp/juliadepot hyperfine 'julia-1.5.0-beta1 --startup-file=no -O0 -e "using FFMPEG_jll"' --warmup 2 -r 20
Benchmark #1: julia-1.5.0-beta1 --startup-file=no -O0 -e "using FFMPEG_jll"
  Time (mean ± σ):     882.9 ms ±   2.4 ms    [User: 995.8 ms, System: 438.2 ms]
  Range (min … max):   878.2 ms … 889.2 ms    20 runs

Conclusion: Things get faster! Albeit just in the order of 1-2%.

@PallHaraldsson
Copy link
Contributor

Yes, I also got only small speedup, but something is better than nothing... and at least here no downside. I wasn't really expecting a big speedup, as you're not compiling a lot of code, given binaries... but note this adds up. I was looking into FFMPEG also myself, and all dependencies, and thereof etc. need the same treatment for same speedup.

I would want/propose something doing it for all JLL packages...

@fingolfin
Copy link
Member

But there is a downside: this introduces an incompatibility, code which accesses MozillaCACerts_jll.cacert would need to be updated (and possibly be adapted to deal with both older and newer versions of MozillaCACerts_jll.

@PallHaraldsson
Copy link
Contributor

PallHaraldsson commented Jul 14, 2020

Maybe I or you misunderstand. The change I made (if applied to a JLL package) works on 1.5+ but also on older. It just doesn't have any effect on older.

Also, as is, you need to opt into this for a JLL package, by releasing a new version. Your old version still works, and packages depending on it. And if you upgrade the minimum requirements to the newer then should be ok, as I explained. You can wait with doing that if you do not trust me.

@fingolfin
Copy link
Member

If the type of, say, cacert changes from String to Ref{String}, then I need to adjust all my code which access cacert to use cacert[] instead. This is difficult to protect against with versioning alone: if MozillaCACerts_jll is updated, it will changes its version from something like 2020.0.0-01-01+0 to 2020.0.0-01-01+1. No regular packages depending on a JLL won't have a version dependncy like that (i.e. <= 2020.0.0-01-01+0). Hence the switch wold break any packages which accesses cacert, with no good way to protect against that (I mean, you could try to first update all package using the JLL to be safe with regards to this change, e.g. by having code like mycacert = cacert isa Ref{String} ? cacert[] : cacert; but you'd have to update and release all of them first. Any you don't update will become broken.

Of course nothing may be directly accessing cacert in MozillaCACerts_jll right now; I just used those as stand-ins as that's what this issue started with. But I do have JLLs where I do access corresponding fields, and a change in BinaryBuilder would thus break them, at least the next time the JLL gets regenerated.

@omus
Copy link
Contributor Author

omus commented Jul 28, 2020

We just need to use @deprecate_binding as pointed out previously. For your example we could have:

julia> cacert_ref = Ref{String}("foo")
Base.RefValue{String}("foo")

julia> Base.@deprecate_binding cacert cacert_ref[] false

julia> cacert
WARNING: Main.cacert is deprecated, use cacert_ref[] instead.
  likely near REPL[7]:1
"foo"

@staticfloat
Copy link
Member

We're going to be changing everything to be functions in the near future, once ccall() can handle non-const library names. This is both for future-proofing (much easier to deprecate functions) and for necessary functionality (we want to lazily-load libraries, which means that the first time you call LibFoo_jll.libfoo() it will ensure that all dependencies of libfoo.so are already loaded, which requires LibFoo_jll.libfoo to be a function, not a variable).

So we're going to have a breaking release of JLL packages soon anyway. Our plan is to by default provide the old bindings on Julia < 1.6, and provide the new, faster and more feature-complete bindings on Julia >= 1.6. We're going to try and come up with a good way for packages to tell JLL packages that they want to opt-out of the fast bindings and still use slow bindings (for package authors that want to support Julia 1.3-1.6 with one codebase, for instance), possibly using Preferences (e.g. packages set the use_old_bindingstyle=1 preference for the JLL UUID that they are loading), but I'm not 100% sure on the specifics yet. In any case, I think the performance gains from the new bindings are really going to help push users to just jumping to 1.6+ anyway, since the difference is really quite impressive.

In summary, your LibraryProducts will go from:

ccall((:fooify, libfoo), ...)

to:

ccall((:fooify, libfoo()), ...)

your FileProducts will go from:

open(foo_conf, "r") do
	...
end

to:

open(foo_conf(), "r") do
	...
end

I haven't 100% finalized this yet, but I'm pretty sure we're going to change ExecutableProducts to, instead of using with_env() to just return a Cmd object that has the proper environment set already, so e.g. instead of this:

fooifier() do fooifier_path
	run(`$(fooifier_path) ...`)
end

You'll instead just do:

run(`$(fooifier()) ...`)

But this will require another fix to Julia such that environment blocks properly merge their setenv() results.

@fingolfin
Copy link
Member

I believe this issue has been addressed now with the new JLLs that are based on JLLWrappers.jl.

@giordano
Copy link
Member

Not entirely since there are still some mutable globals, but if the main concern of this ticket was performance, this has been largely addressed by JLLWrappers, yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants