Pkg3 stdlib #26141

Merged
merged 292 commits into from
Feb 23, 2018

Conversation

KristofferC
Member

Let's see if this passes CI.

Internalizing packages was necessary to avoid a conflict between
the versions of TOML and TerminalMenus loaded by Pkg3 and thus
precompiled during Pkg3 loading and precompilation after Pkg3 is
already loaded.
When the user runs `pkg> status` in a non-git-tracked environment,
print a clear error message instead of erroring on a Void env.git
field (#4).
A few fixes noticed when bringing Pkg3 into stdlib
@KristofferC KristofferC force-pushed the kc/pkg3_stdlib branch 3 times, most recently from 1b2035b to 2f0db6d Compare February 22, 2018 21:13
@KristofferC
Member Author

So a bit of an update here.

The good news. This branch passes tests and incorporates the new package manager. It is fast and awesome.

For Pkg3 to be pleasant to use, it requires precompile statements. As an example, the current branch has the following timings when calling two different functions twice each (the string passed to the function is equivalent to entering it in the Pkg3 REPL mode):

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "st")
  ...
  0.529426 seconds (560.79 k allocations: 31.929 MiB, 1.22% gc time)

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "st")
 ...
  0.005712 seconds (10.55 k allocations: 739.734 KiB)

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "add JSON")
  ...
  1.568578 seconds (1.89 M allocations: 110.119 MiB, 2.22% gc time)

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "add JSON")
  ...
  0.113276 seconds (498.01 k allocations: 29.760 MiB, 5.69% gc time)

So the first time is slower, but the absolute time is such that, for an interactive package, it is pretty ok. Removing all the precompile statements, we instead have:

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "st")
  ...
  5.919937 seconds (6.83 M allocations: 392.043 MiB, 3.08% gc time)

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "st")
  ...
  0.005653 seconds (10.61 k allocations: 806.234 KiB)

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "add JSON")
 ...
 17.872648 seconds (16.79 M allocations: 933.106 MiB, 3.47% gc time)

julia> @time Pkg3.REPLMode.do_cmd(Base.active_repl, "add JSON")
  ...
  0.145271 seconds (498.02 k allocations: 29.745 MiB, 7.76% gc time)

The time for both first calls went up by roughly a factor of 10 and, in fact, Pkg3.add is then almost the same speed as the current Pkg.add.
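As a quick check of the "factor of 10" figure, the first-call timings can be compared directly. This is only a minimal sketch: the variable names are illustrative, and the numbers are copied from the `@time` output quoted in this comment.

```julia
# First-call timings in seconds, copied from the @time output above.
with_precompile    = Dict("st" => 0.529426, "add JSON" => 1.568578)
without_precompile = Dict("st" => 5.919937, "add JSON" => 17.872648)

# Slowdown of the first call when the precompile statements are removed.
slowdown = Dict(cmd => without_precompile[cmd] / with_precompile[cmd]
                for cmd in keys(with_precompile))

for cmd in ("st", "add JSON")
    println(rpad(cmd, 10), round(slowdown[cmd], digits = 1), "x slower")
end
```

Both ratios come out a little above 11, while the second-call timings are essentially unchanged.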
The bad news is the following:

  • When generating machine code with the Pkg3 precompile statements included, the julia process exceeds the memory limit on 32-bit systems, so 32-bit CI crashes with an OOM error.
  • Build time is ~30-40 seconds longer.
  • Sys image is larger:

Without precompile:

-rwxr-xr-x  1 kristoffer  staff  141040636 Feb 22 22:09 sys.dylib
-rw-r--r--  1 kristoffer  staff  160521512 Feb 22 22:09 sys.o

With precompile:

-rwxr-xr-x  1 kristoffer  staff  171086044 Feb 22 21:46 sys.dylib
-rw-r--r--  1 kristoffer  staff  197437996 Feb 22 21:46 sys.o
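To put the two listings side by side, the growth can be worked out from the byte counts. A small sketch, with the sizes copied from the `ls` output above and a helper name (`growth`) that is purely illustrative:

```julia
# File sizes in bytes, copied from the ls output above:
# (name, without precompile, with precompile)
sizes = [
    ("sys.dylib", 141040636, 171086044),
    ("sys.o",     160521512, 197437996),
]

# Absolute (MiB) and relative (%) growth from adding the precompile statements.
growth(base, precompiled) = (mib = (precompiled - base) / 2^20,
                             percent = 100 * (precompiled - base) / base)

for (name, base, precompiled) in sizes
    g = growth(base, precompiled)
    println(name, ": +", round(g.mib, digits = 1), " MiB (+",
            round(g.percent, digits = 1), "%)")
end
```

Both files grow by somewhat more than 20%.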

I am quite surprised that the effect of the precompile statements is so large. Pkg3 is about 6k lines of pretty normal Julia code (no generated functions or excessive codegen). Anyway, some possible ways forward:

  • Put Pkg3 in stdlib without precompile statement. Add a warning or something and say that the slow first call is known and will be handled. While this sounds good I am slightly against this for a few reasons. First, a lot of work has gone into Pkg3 being fast, it is just sad to have it be slow. Secondly, letting users get their hands on the new shiny package manager and having them run a few basic commands and feel that it is so slow will probably be anticlimactic and disappointing for them. Thirdly, the performance of tools changes how you use them. With Revise.jl, the way I code has changed. Having Pkg3 be very slow for the first call will change how people use it. We want people to make new Projects often and try stuff out etc. A slow Pkg3 will not be used in the same way as a fast one, so the user reports we get out of it will be biased.
  • Try to find ways to make Pkg3 less costly for the compiler. I spent some time benchmarking, and for the Pkg3.add command, around 12 seconds is spent in type inference, 10 seconds in compilation (and 200 ms running it). I haven't been able to spot any huge bottlenecks and I lack the experience with the profiling tools needed to pinpoint more carefully why it takes so long.
  • Somehow change the way we emit the precompiled code with LLVM to make it use less memory / be faster. Is it some specific optimization pass that is slow? Is it possible to opt out of it for parts of the code? Etc.
  • Disable running Pkg3 precompilation statements on the CI that are failing with them (introduce some LOW_MEMORY_ENVIRONMENT environment variable which can perhaps be used to determine what level of precompilation should be used).
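The last option could look roughly like the following. This is a sketch only: `LOW_MEMORY_ENVIRONMENT` is the variable name floated above, while the accepted values, the level names, and the helper `precompile_level` are assumptions made for illustration and are not part of this branch.

```julia
# Hypothetical helper: decide how much precompilation to run during the build.
# The accepted flag values and the :minimal / :full level names are assumptions.
function precompile_level(env = ENV)
    flag = lowercase(get(env, "LOW_MEMORY_ENVIRONMENT", "0"))
    return flag in ("1", "true", "yes") ? :minimal : :full
end

# Example use in a build script: skip the expensive Pkg3 statements on
# memory-constrained machines such as the 32-bit CI workers.
if precompile_level() == :full
    println("running all precompile statements")
else
    println("skipping Pkg3 precompile statements")
end
```

Taking the environment as an argument keeps the decision testable without touching the real `ENV`.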

Comments / opinions welcome.

@vchuravy
Member

I agree that Pkg3 needs to be fast and snappy, this means for me that we do need #25324 (comment)

I would like to understand the OOM situation better. Is Julia really using more than 4 GB of memory during the compilation step? Have you tried turning on:

# - dmesg

The last build on i686 Linux seemed to time out https://travis-ci.org/JuliaLang/julia/jobs/344988033

OT: Can we move TOML to be its own independent stdlib project?

@StefanKarpinski
Member

Can we move TOML to be its own independent stdlib project?

Ultimately, yes, but for now we don't want to commit to it being official. It works fine as an internal part of Pkg3 but needs quite a bit of work as a general TOML parsing library.

@KristofferC
Member Author

KristofferC commented Feb 22, 2018

I looked at Activity Monitor and it peaked at about 2.8 GB on my Mac. I thought 32-bit systems were limited to 2 GB. The OOM error is thrown when the PassManager is running in this function here: https://github.com/JuliaLang/julia/blob/master/src/jitlayers.cpp#L1060.

Can we move TOML to be its own independent stdlib project?

Most likely, but let's have that discussion later imo. It is not what is urgent right now.

@vchuravy
Member

Travis has a memory limit of about 4G, and 32-bit OSs are limited to slightly less than 4G.

I just built it and peaked at around 3.4-3.7G of RES (memory resident in RAM).

@KristofferC
Member Author

@vchuravy
Member

vchuravy commented Feb 22, 2018

Maximum resident set size (kbytes): 3625240

	Command being timed: "/home/vchuravy/src/julia/usr/bin/julia -O3 -C skylake,-rdrnd,-rdseed,-rtm --output-o /home/vchuravy/src/julia/usr/lib/julia/sys.o.tmp --startup-file=no --warn-overwrite=yes --sysimage /home/vchuravy/src/julia/usr/lib/julia/basecompiler.ji sysimg.jl"
	User time (seconds): 520.16
	System time (seconds): 2.44
	Percent of CPU this job got: 99%
	Elapsed (wall clock) time (h:mm:ss or m:ss): 8:47.68
	Average shared text size (kbytes): 0
	Average unshared data size (kbytes): 0
	Average stack size (kbytes): 0
	Average total size (kbytes): 0
	Maximum resident set size (kbytes): 3625240
	Average resident set size (kbytes): 0
	Major (requiring I/O) page faults: 3
	Minor (reclaiming a frame) page faults: 1090952
	Voluntary context switches: 47
	Involuntary context switches: 41648
	Swaps: 0
	File system inputs: 22016
	File system outputs: 453480
	Socket messages sent: 0
	Socket messages received: 0
	Signals delivered: 0
	Page size (bytes): 4096
	Exit status: 0

With https://github.com/JuliaLang/julia/tree/vc/pkg3_mem
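To relate this figure to the limits discussed earlier, the peak resident set size can be converted to GiB. A minimal sketch, assuming the "kbytes" reported by `time` are units of 1024 bytes; the variable names are illustrative:

```julia
# Peak resident set size reported by the time output above, in kbytes.
max_rss_kb = 3625240

# Convert to GiB (1 GiB = 2^20 KiB) and compare with the 4 GiB that a
# 32-bit address space can address at most.
max_rss_gib = max_rss_kb / 2^20
headroom_gib = 4 - max_rss_gib

println("peak RSS: ", round(max_rss_gib, digits = 2), " GiB")
println("headroom below 4 GiB: ", round(headroom_gib, digits = 2), " GiB")
```

This puts the peak at the lower end of the 3.4-3.7G range mentioned above, which leaves only about half a GiB below the 4 GiB ceiling.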

@quinnj
Member

quinnj commented Feb 22, 2018

Obviously any improvements in compile time/footprint are great, but I'd vote for disabling precompile on low-memory systems for the moment and getting this in.

@KristofferC
Member Author

KristofferC commented Feb 23, 2018

Ok, so I did the following: I disabled the Pkg3 precompile statements by default. At the first command, Pkg3 prints a small warning that precompilation is off and that the first call is known to be slow. The warning also says how to build julia with the precompile statements on (set an env flag and rebuild), which disables the warning. The warning can also be disabled by another env flag.
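The behaviour described here could be sketched as below. Both environment variable names (`PKG3_PRECOMPILE_ENABLED`, `PKG3_SILENCE_WARNING`) and the function name are hypothetical placeholders, since this comment does not name the actual flags. The sketch also checks the precompile flag at run time for simplicity, whereas the real flag is read when julia is built.

```julia
# Sketch of a one-time "first call is slow" warning.
# Both environment variable names are hypothetical placeholders.
const WARNED = Ref(false)

function maybe_warn_slow_first_call(io::IO = stderr; env = ENV)
    precompiled = get(env, "PKG3_PRECOMPILE_ENABLED", "0") == "1"
    silenced    = get(env, "PKG3_SILENCE_WARNING", "0") == "1"
    # Warn at most once, and never when precompiled or explicitly silenced.
    if precompiled || silenced || WARNED[]
        return false
    end
    WARNED[] = true
    println(io, "Pkg3 precompile statements are off, so the first command is slow. ",
                "Rebuild julia with PKG3_PRECOMPILE_ENABLED=1 to enable them.")
    return true
end
```

The function returns whether it printed, so a caller such as the REPL mode entry point can invoke it unconditionally before running the first command.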

I think this is best for now. Get Pkg3 into stdlib so we can start working with it, it can optionally be fast by rebuilding, and people will at least see that there is a reason for it to be slow if they run it without precompile statements.

@mauro3
Contributor

mauro3 commented Feb 23, 2018

What do I need to do to transition my julia-master to use Pkg3?

@StefanKarpinski
Member

Build Julia and type ] and you'll get a Pkg3 REPL prompt. There is also `using Pkg3`, which gives you a Pkg3 API similar to the old Pkg API (but not exactly the same).

@KristofferC
Member Author

The Pkg3 REPL mode has pretty good help with pkg> ? but we of course need proper docs. One step at a time. This was a large one.

@StefanKarpinski StefanKarpinski deleted the kc/pkg3_stdlib branch February 23, 2018 17:23
@Sacha0
Member

Sacha0 commented Feb 23, 2018

Cheers!

vchuravy pushed a commit to JuliaPackaging/LazyArtifacts.jl that referenced this pull request Oct 2, 2023
Keno pushed a commit that referenced this pull request Jun 5, 2024