Slow startup #109

Closed
PallHaraldsson opened this issue Jun 8, 2020 · 13 comments

@PallHaraldsson

PallHaraldsson commented Jun 8, 2020

I've been tracking down slow startup, and this package slows the startup of many others, including parts of the plotting ecosystem.

It seems to have many invalidations (I infer this from the second using below), and as I understand it, SnoopCompile.jl is the solution, and/or changing it to a JLL package.

julia> @time using DoubleFloats
  6.160255 seconds (10.22 M allocations: 524.213 MiB, 4.72% gc time)

julia> @time using DoubleFloats
  0.136392 seconds (224.05 k allocations: 11.463 MiB)

julia> @time using DoubleFloats
  0.000334 seconds (274 allocations: 16.984 KiB)

This package is the main factor slowing down VegaDatasets.jl (I'm not sure it needs this one), through its dependency TextParse.jl.

I was originally looking into why using Queryverse, a meta-package that includes the above, is slow. People complain that plotting (the first plot) is slow, and part of that traces back to this only remotely related package... I do think it's really cool, though.

@PallHaraldsson
Author

PallHaraldsson commented Jun 8, 2020

FYI: Startup CAN be much faster, though I guess this change isn't wanted as-is:

$ ~/julia-1.6-DEV-latest-7c980c6af5/bin/julia --startup-file=no -O0 --compile=min

julia> @time using DoubleFloats
  1.252497 seconds (1.58 M allocations: 102.014 MiB, 0.56% gc time)

With only -O0 or -O1, runtime is possibly not slower? If so, the lower optimization level can be set at the module level, like here (where I added it to the wrong package, as that one wasn't causing the slowdown): https://github.com/queryverse/VegaDatasets.jl/pull/33/files
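A minimal sketch of what module-level optimization control can look like, assuming Julia 1.5 or newer, where Base.Experimental.@optlevel is available. The module and function names are hypothetical and are not taken from the linked PR. The setting applies to code defined in that module, and the effective level is the minimum of the module setting and the -O flag given on the command line.

```julia
module LowOptDemo

# Assumption: Julia 1.5+; the macro lives in Base.Experimental and may change.
# Lowers the optimization level for code defined in this module only.
Base.Experimental.@optlevel 1

# Hypothetical function standing in for real package code.
sumsq(xs) = sum(x -> x * x, xs)

end # module

# The function behaves the same; only the effort spent compiling it changes.
LowOptDemo.sumsq([1.0, 2.0, 3.0])
```

Whether a lower level hurts runtime has to be measured per package; for numeric code such as double-double arithmetic it plausibly does, so benchmark before adopting it.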

@JeffreySarnoff
Member

I would need much more detailed guidance (hand-holding, ride-sharing), being entirely unfamiliar with how to approach this efficiently and cleanly.

@PallHaraldsson
Author

PallHaraldsson commented Jun 8, 2020

It may not even be about your package [EDIT: it seems to be]; I just stopped looking there, as I thought it was a likely candidate (and even if it isn't, people may blame yours). First find out what the cause is; I checked some of your dependencies:

julia> @time using Polynomials
  0.540127 seconds (646.08 k allocations: 39.580 MiB)

julia> @time using Printf   # note: this one is likely not to blame, as I got 0.000847 seconds later, and in that case it didn't say it was precompiling (so I'm not sure why it was slow here)
  1.416696 seconds (2.36 M allocations: 118.137 MiB, 2.61% gc time)

Whether or not your package is to blame, a JLL might be a good idea; I'm just completely new to that.

SnoopCompile.jl has instructions, which I tried to read through; you may want to do the same, or first find the root cause.

@PallHaraldsson
Author

What you could do, as it's interesting, is see what the lowest optimization level is that you can get away with, starting with -O0. You would know more about what to time. I use @btime (or hyperfine from the shell).

Some more dependencies:

julia> @time using Quadmath
  0.507507 seconds (431.16 k allocations: 23.585 MiB)

julia> @time using SpecialFunctions
  0.463260 seconds (418.82 k allocations: 24.849 MiB, 1.57% gc time)

@PallHaraldsson
Author

PallHaraldsson commented Jun 8, 2020

Actually, looking at your dependencies (and Quadmath's), it seems this may be (mostly) because of your package, unless I missed some.

@PallHaraldsson
Author

You (or your users) can actually compile it, similar to what's done here: https://julialang.github.io/PackageCompiler.jl/dev/examples/plots/

I thought JLLs might apply, but I'm not sure they do, unless you could make them work with the above:
https://juliapackaging.github.io/BinaryBuilder.jl/dev/jll/

@JeffreySarnoff
Member

What are "invalidations" and how do I identify specific ones?

@JeffreySarnoff
Member

This package is not a likely candidate for jll treatment. There is no underlying other library. To apply the module level optimization control would require divvying up the package into submodules and that design reworking is only worth considering if we could ascertain which files were gulping the compile time. Is that possible to determine easily? I am not surprised about the "invalidations"; the package design and development did not involve avoiding "invalidations". How is that done?
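As a rough illustration of what an invalidation is: when a new method is defined that applies to a call already compiled into other code, Julia has to discard that compiled code and recompile it later. The sketch below uses hypothetical names (scale, apply) and only plain Julia. SnoopCompile's documentation describes tooling for recording invalidations while a package loads; the macro names have changed between releases, so check the version you have installed.

```julia
module InvalidationDemo

# Hypothetical names, used only to illustrate the mechanism.
scale(x) = 1                 # generic fallback method
apply() = scale(1)           # calls scale with an Int argument

# The first call compiles apply() against the only method available, scale(::Any).
const before = apply()

# A more specific method now matches the call inside apply(). The compiled
# code for apply() is stale, so Julia invalidates it and recompiles on next use.
scale(x::Int) = 2

const after = apply()

end # module

println((InvalidationDemo.before, InvalidationDemo.after))
```

A package that adds methods to widely used Base functions can trigger the same effect on a much larger scale, which shows up as extra compilation work after loading.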

@PallHaraldsson
Author

https://timholy.github.io/SnoopCompile.jl/stable/#Who-should-use-this-package-1

Finally, another alternative that reduces latency without any modifications to package files is Revise. It can be used in conjunction with SnoopCompile.

We do have a profiler, but we would really need a profiler or a mode showing only the proportion of time of each dependency (not sure SnoopCompile does that), as hunting down the worst dependency (recursively) is tedious.

Otherwise these are excellent tools:
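One way to make the per-dependency hunt less tedious is to time each using in a fresh Julia process, so that a package loaded earlier in the session cannot hide the cost. The following is a minimal sketch under that assumption; the helper names (cold_load_seconds, rank_load_times) are hypothetical, and each timing includes the package's own dependencies, so shared dependencies are counted more than once.

```julia
# Time `using <pkg>` inside a new Julia process and return the seconds taken.
# `flags` lets you pass extra command-line options such as -O0 --compile=min.
function cold_load_seconds(pkg::AbstractString; flags::Cmd = ``)
    script = "t0 = time_ns(); using $pkg; print((time_ns() - t0) / 1e9)"
    cmd = `$(Base.julia_cmd()) --startup-file=no $flags -e $script`
    return parse(Float64, read(cmd, String))
end

# Time several packages independently and list the slowest first.
function rank_load_times(pkgs; flags::Cmd = ``)
    timings = [(pkg, cold_load_seconds(pkg; flags = flags)) for pkg in pkgs]
    return sort(timings; by = last, rev = true)
end

# Example call (requires the packages to be installed in the active environment):
# rank_load_times(["DoubleFloats", "Quadmath", "SpecialFunctions", "Polynomials"])
```

On Julia 1.8 and newer, InteractiveUtils.@time_imports prints a per-package breakdown for a single using statement, which covers much of the same ground; it did not exist when this thread was written.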

julia> using Profile, ProfileView

julia> @profview using DoubleFloats

or as a two-step process if you need those options:

julia> @profile using Quadmath

julia> ProfileView.view(C = true, fontsize=40)

@PallHaraldsson
Author

PallHaraldsson commented Jun 9, 2020

It was easier than I thought:

julia> SnoopCompile.@snoopc "/tmp/DoubleFloats.log" begin
         using DoubleFloats, Pkg
         include(joinpath(dirname(dirname(pathof(DoubleFloats))), "test", "runtests.jl"))
       end
Launching new julia process to run commands...
Test Summary:                    | Pass  Total
maxintfloat DoubleFloat{Float16} |    4      4
[..]
julia> data = SnoopCompile.read("/tmp/DoubleFloats.log")

julia> pc = SnoopCompile.parcel(reverse!(data[2]))

julia> SnoopCompile.write("/tmp/precompile", pc)

I end up with lots of files, such as this one, which is what you should add to the project:

shell> wc /tmp/precompile/precompile_DoubleFloats.jl
  95  247 8236 /tmp/precompile/precompile_DoubleFloats.jl
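As a rough sketch of how such a generated file plugs into a package: the files SnoopCompile wrote at the time typically define a _precompile_() function containing precompile(...) directives, and the package includes the file and calls that function once. The module and function names below (ToyDoubles, two_sum) are hypothetical stand-ins, not DoubleFloats code, and the directive is written inline here rather than in a separate included file.

```julia
module ToyDoubles

# Hypothetical stand-in for package code: error-free addition of two Float64s,
# returning the rounded sum and the rounding error.
function two_sum(a::Float64, b::Float64)
    s = a + b
    bb = s - a
    err = (a - (s - bb)) + (b - bb)
    return s, err
end

# In a real package this function would come from the generated file, e.g.
# include("precompile_DoubleFloats.jl"), placed near the end of the module.
function _precompile_()
    # Do the work only while Julia is writing a precompile cache file.
    ccall(:jl_generating_output, Cint, ()) == 1 || return nothing
    precompile(two_sum, (Float64, Float64))
    return nothing
end

_precompile_()

end # module

ToyDoubles.two_sum(1.0, 2.0)
```

Precompile directives cache inference results for the listed signatures; they do not remove invalidations caused by other packages, so the two problems are worth measuring separately.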

The other (inference) option is cryptic to me, and for now I'm not looking into it; I'm not sure I need to:

julia> @snoopi using DoubleFloats
27-element Array{Tuple{Float64,Core.MethodInstance},1}:
 (0.0002739429473876953, MethodInstance for (::Quadmath.var"#9#12")())
 (0.0003180503845214844, MethodInstance for (::Pkg.BinaryPlatforms.var"#32#34"{Pkg.BinaryPlatforms.Linux})(::Pkg.BinaryPlatforms.FreeBSD))
 (0.0003352165222167969, MethodInstance for (::Pkg.BinaryPlatforms.var"#32#34"{Pkg.BinaryPlatforms.Linux})(::Pkg.BinaryPlatforms.MacOS))
 (0.0003609657287597656, MethodInstance for haskey(::Dict{Base.UUID,Dict{String,Union{Base.SHA1, String}}}, ::Base.UUID))
 (0.0003750324249267578, MethodInstance for Dates.DatePart{'z'}(::Int64, ::Bool))
 (0.00038909912109375, MethodInstance for _array_for(::Type{Symbol}, ::Array{Any,1}, ::Base.HasShape{1}))
 (0.00041294097900390625, MethodInstance for Dates.DateFormat{Symbol("yyyy-mm-ddTHH:MM:SS.ssszzz"),Tuple{Dates.DatePart{'y'},Dates.Delim{Char,1},Dates.DatePart{'m'},Dates.Delim{Char,1},Dates.DatePart{'d'},Dates.Delim{Char,1},Dates.DatePart{'H'},Dates.Delim{Char,1},Dates.DatePart{'M'},Dates.Delim{Char,1},Dates.DatePart{'S'},Dates.Delim{Char,1},Dates.DatePart{'s'},Dates.DatePart{'z'}}}(::Tuple{Dates.DatePart{'y'},Dates.Delim{Char,1},Dates.DatePart{'m'},Dates.Delim{Char,1},Dates.DatePart{'d'},Dates.Delim{Char,1},Dates.DatePart{'H'},Dates.Delim{Char,1},Dates.DatePart{'M'},Dates.Delim{Char,1},Dates.DatePart{'S'},Dates.Delim{Char,1},Dates.DatePart{'s'},Dates.DatePart{'z'}}, ::Dates.DateLocale))
 (0.0005068778991699219, MethodInstance for (::Pkg.BinaryPlatforms.var"#32#34"{Pkg.BinaryPlatforms.Linux})(::Pkg.BinaryPlatforms.Windows))
 (0.0005519390106201172, MethodInstance for NamedTuple{(:libgfortran_version, :libstdcxx_version, :cxxstring_abi),T} where T<:Tuple(::Tuple{VersionNumber,Nothing,Nothing}))
 (0.0005879402160644531, MethodInstance for Base.Generator(::Quadmath.var"#2#5", ::Array{Any,1}))
 (0.0006208419799804688, MethodInstance for Base.Generator(::Quadmath.var"#3#6", ::Array{Any,1}))
 (0.0006911754608154297, MethodInstance for Base.Generator(::Quadmath.var"#1#4", ::Array{Any,1}))
 (0.0006990432739257812, MethodInstance for Pair(::Pkg.BinaryPlatforms.FreeBSD, ::Dict{String,Any}))
 (0.0009012222290039062, MethodInstance for foreach(::OpenSpecFun_jll.var"#8#10", ::Tuple{Array{String,1}}))
 (0.001043081283569336, MethodInstance for foreach(::OpenSpecFun_jll.var"#7#9", ::Tuple{Array{String,1}}))
 (0.0012671947479248047, MethodInstance for Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Axes,F,Args} where Args<:Tuple where F where Axes(::typeof(esc), ::Tuple{Array{Symbol,1}}))
 (0.001867055892944336, MethodInstance for withnotifications(::String, ::Vararg{Any,N} where N))
 (0.0029449462890625, MethodInstance for collect_to_with_first!(::Array{Symbol,1}, ::Symbol, ::Base.Generator{Array{Any,1},Quadmath.var"#2#5"}, ::Int64))
 (0.0030679702758789062, MethodInstance for collect_to_with_first!(::Array{Symbol,1}, ::Symbol, ::Base.Generator{Array{Any,1},Quadmath.var"#3#6"}, ::Int64))
 (0.0038328170776367188, MethodInstance for all(::Base.Generator{Array{Any,1},Quadmath.var"#1#4"}))
 (0.004256010055541992, MethodInstance for (::Core.var"#Type##kw")(::NamedTuple{(:libgfortran_version, :libstdcxx_version, :cxxstring_abi),Tuple{VersionNumber,Nothing,Nothing}}, ::Type{Pkg.BinaryPlatforms.CompilerABI}))
 (0.004559993743896484, MethodInstance for ht_keyindex(::Dict{Base.PkgId,Array{Function,1}}, ::Base.PkgId))
 (0.006371021270751953, MethodInstance for collect(::Base.Generator{Array{Any,1},Quadmath.var"#2#5"}))
 (0.0064220428466796875, MethodInstance for collect(::Base.Generator{Array{Any,1},Quadmath.var"#3#6"}))
 (0.011010885238647461, MethodInstance for setindex!(::Dict{Pkg.BinaryPlatforms.Platform,Dict{String,Any}}, ::Dict{String,Any}, ::Pkg.BinaryPlatforms.FreeBSD))
 (0.018218994140625, MethodInstance for materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{1},Nothing,typeof(esc),Tuple{Array{Symbol,1}}}))
 (0.02099299430847168, MethodInstance for (::Core.var"#Type##kw")(::NamedTuple{(:libc, :compiler_abi),Tuple{Nothing,Pkg.BinaryPlatforms.CompilerABI}}, ::Type{Pkg.BinaryPlatforms.FreeBSD}, ::Symbol))

@JeffreySarnoff
Member

Unfortunately, Revise.jl is required as part of a workaround for a problem with SpecialFunctions on Windows.

@PallHaraldsson
Author

PallHaraldsson commented Jul 9, 2020

This package is not a likely candidate for jll treatment. There is no underlying other library.

I'm not so sure. Yes, it seems JLLs are only for binaries, and a system image is better and could greatly help:

https://julialang.github.io/PackageCompiler.jl/dev/examples/plots/

Despite the name, it's meant to compile Apps too (Viral suggested renaming the package, and the issue was closed as that discussion had been had before). But it's also for Packages, as the name says; I just can't find how to generate JLLs. Maybe it's coming later, or the package is a useful intermediate step, after which you use BinaryBuilder.jl or something. I don't have much experience with all this; I have only compiled an App.

For system image:

You can save loaded packages and compiled functions

It seems the name should be PackageSCompiler.jl and you can only save all your packageS, as there's only one system image you can load (at a time)?

Since your package, and this double-double idea, is awesome, should it be integrated in Julia's Base? That would side-step the problem. If your package is relatively stable now, it could be an option, and turn this package into a no-op (or keep it for future add-on development).

@JeffreySarnoff
Member

It is unlikely that this package would be integrated into Base because there is a good deal of other packages' overhead involved, and, frankly, while quite stable, the code base is too massive for that. I am delighted to know that you find it to be so useful. In my spare time, I am considering reworking, simplifying and perhaps improving an implementation of Double64s -- it will be a while :)
