[Julep/WIP] Standalone AOT compilation mode #32273

tshort · 2019-06-10T00:12:14Z

This mode of compilation aims to statically compile Julia code to libraries or executables that do not need a system image. This will allow Julia to support more use cases:

Smaller standalone executables with faster startup.
Compilation to standalone libraries. For example, R or Python packages could link to Julia binary libraries.
Cross compilation to more limited systems. This could be an embedded system or WebAssembly for web apps.

To support these modes, the following compilation targets could be supported:

A shared library that links to the libjulia shared library.
An executable that links to the libjulia shared library.
An object file meant to dynamically link to the libjulia shared library.

In addition to these, we'd also like to support these same targets, but statically link to libjulia.a for smaller standalone executables or libraries.

My main interest is compilation to WebAssembly (see this issue). See here for a simple web app compiled using this branch of Julia.

Approach

This is based on @vtjnash's work on jn/codegen-norecursion. That capability will be great to have for codegen work. Hopefully, that can be merged soon.

This approach works by introducing a standalone-aot-mode into Julia's code generation process. This is similar to the imaging-mode. The main differences are:

ccall -- foreigncalls are normally converted to calls through function pointers. In standalone-aot-mode, they are compiled to normal external function calls that are resolved at link time.
cglobal -- As with ccalls, these are compiled to normal external references.
Global variables -- This is a tricky part. Global variables (symbols, strings, and Julia global variables) are serialized to a "mini image" (a binary array). An initialization function is provided to restore the global variables upon startup. The serialization code reuses the machinery in "src/dump.c". Some non-core structs and types are converted to tuples or other types that have the same memory layout.
Initialization -- This is another tricky part. Initialization includes a simplified version of jl_init that does not load the standard library. It initializes many types, including some defined in base/boot.jl.
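
To illustrate the "mini image" idea only, here is a toy sketch in plain C. It is not the code in this branch (which reuses src/dump.c and serializes real Julia objects). The image layout and every name in it (mini_image, restore_globals, init_lib_sketch, g_greeting, g_answer) are assumptions made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "mini image": a length-prefixed string followed by a
   32-bit integer, both little-endian. */
static const uint8_t mini_image[] = {
    5, 0, 0, 0, 'h', 'e', 'l', 'l', 'o',  /* string global  */
    42, 0, 0, 0                           /* integer global */
};

/* Globals that compiled code would reference; unset until init runs. */
static char *g_greeting = NULL;
static int32_t g_answer = 0;

/* Read a little-endian u32 without assuming host byte order or alignment. */
static uint32_t read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Walk the image and restore each global.
   Returns 0 on success, -1 on a truncated image or allocation failure. */
int restore_globals(const uint8_t *img, size_t len)
{
    assert(img != NULL);
    size_t pos = 0;
    if (len < 4) return -1;
    uint32_t n = read_u32(img + pos);
    pos += 4;
    if (n > len - pos) return -1;
    char *s = malloc((size_t)n + 1);
    if (s == NULL) return -1;
    memcpy(s, img + pos, n);
    s[n] = '\0';
    pos += n;
    if (len - pos < 4) { free(s); return -1; }
    free(g_greeting);               /* allow re-initialization */
    g_greeting = s;
    g_answer = (int32_t)read_u32(img + pos);
    return 0;
}

/* Plays the role of the exported initialization function. */
int init_lib_sketch(void)
{
    return restore_globals(mini_image, sizeof mini_image);
}
```

A caller would run init_lib_sketch() once before touching g_greeting or g_answer, in the same way the generated library expects its init function to run before any compiled function is called.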

Miscellaneous notes

Generic code that uses jl_invoke() or jl_apply_generic() isn't supported. A warning is currently issued for code that is compiled with either of these. This often includes error-handling code.
cfunction isn't supported. I'm not sure how to handle that.
The tests target Linux. The tests currently use julia-debug.
There's a garbage-collection bug lurking somewhere. For at least the rand() test, it crashes unless GC is disabled.

Feedback / next Steps

I'm looking forward to guidance on steps needed to get this into Julia as an experimental feature. This includes tests and code cleanups. If anyone sees any big gotchas or problems with the approach, that discussion would help, too.

c42f

Having a working wasm target would be amazing ❤️

Here's a few naive impressions. Hopefully they are more useful than distracting (I am not an expert on this part of the code).

About jl_apply_generic — what might you hope for this to do? Would it be acceptable to embed the julia IR for such functions and run it in the interpreter? I suppose I'm a bit confused about overall aim here, other than "make wasm work well". Is the goal to

Avoid runtime codegen?
Avoid paying for the size of the standard sysimage?
Allow embedding a sysimage?
Avoid paying for the size of libjulia?

c42f · 2019-06-10T05:12:23Z

src/aotcompile.cpp

+        // }
+    Module *M = data->M.get();
+    Function* init_lib_f = cast<Function>(
+        M->getOrInsertFunction("init_lib", Type::getVoidTy(Context), NULL));


Should have a jl_ prefix? jl_init_lib()?

c42f · 2019-06-10T05:46:05Z

test/standalone-aot/IRGen.jl

+        return LLVMNativeCode(native_code)
+    catch e
+        ccall(:jl_clear_standalone_aot_mode, Nothing, ())
+        throw(e)


(1) Use rethrow() rather than throw(e). The latter will duplicate the exception on the exception stack. (2) But instead, you could just put the call to jl_clear_standalone_aot_mode in a finally block. (3) Instead of both of those... this global setting seems kind of icky anyway. Is it possible to put it in CodegenParams?

Putting it in CodegenParams is probably the way to go. Will work on it.

c42f · 2019-06-10T05:50:59Z

src/init.c

@@ -832,6 +832,180 @@ void _julia_init(JL_IMAGE_SEARCH rel)
        jl_install_sigint_handler();
 }

+void jl_init_types2(void) JL_GC_DISABLED
+{
+    jl_module_t *core = NULL; // will need to be assigned later


Is this still TODO or is "later" elsewhere in the diff? What's special about this set of types and how does it relate to the init cycle? Could we handle them in a way that is more similar to the usual system?

I should remove the comment. It's confusing. Later is somewhere else. These types are created in base/boot.jl and then assigned in C in init.c during post_boot_hooks. Without a system image, none of that happens, so this extra init step fixes up a few more basic types.

c42f · 2019-06-10T06:01:44Z

src/init.c

+                                             jl_perm_symsvec(1, "msg"), jl_svec(1, jl_string_type), 0, 0, 1);
+}
+
+// Basic initialization that doesn't load a system image


There seems to be some duplicate logic going on here. Could you generalize _julia_init to take the sysimage via a resource interface rather than expecting to find it in a file? Various resource loaders could then be plugged in, for example:

In the usual case, loading from file via some search paths

An embedded binary blob like your mini sysimage

Loading it over the network (maybe this could have benefits for wasm, or maybe it's just crazy talk.)

It might allow you to avoid duplicating the init logic so much?

There is some duplicate logic, but I tried to minimize that by using the code in 'dump.c'. Your interface idea is interesting. I don't see how it minimizes duplication, though.

Well I noticed that jl_init_basics shares some 100 lines of code with _julia_init. It seems like this could possibly be factored back together with some more flags or factored apart by extracting some of the shared code.

Yeah, "factored apart" might be best. I'm worried about too many if (standalone_aot_mode) statements.

I'm still struggling with the resource interface. I understand what you mean, but I'm not sure how to code it, yet.
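
One possible shape for such a resource interface, as a minimal sketch in plain C. Everything here (image_source, blob_source, file_source, init_from_source) is hypothetical and is not part of this branch or of libjulia. The point is only that the shared init path sees an interface, while the embedded-blob and file cases differ in the loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical interface: init code asks a source for the image bytes and
   does not care where they come from. */
typedef struct image_source {
    /* Returns 0 and sets *data / *len on success, -1 on failure. */
    int (*load)(struct image_source *self, const unsigned char **data, size_t *len);
    /* Releases anything load() allocated. */
    void (*release)(struct image_source *self);
    void *state;
} image_source;

/* Source 1: a blob embedded in the binary (like the mini image). */
typedef struct { const unsigned char *data; size_t len; } blob_state;

static int blob_load(image_source *self, const unsigned char **data, size_t *len)
{
    blob_state *b = self->state;
    *data = b->data;
    *len = b->len;
    return 0;
}
static void blob_release(image_source *self) { (void)self; }

image_source blob_source(blob_state *b)
{
    image_source s = { blob_load, blob_release, b };
    return s;
}

/* Source 2: a file on disk (like the usual sysimage search). */
typedef struct { const char *path; unsigned char *buf; } file_state;

static int file_load(image_source *self, const unsigned char **data, size_t *len)
{
    file_state *f = self->state;
    FILE *fp = fopen(f->path, "rb");
    if (fp == NULL) return -1;
    if (fseek(fp, 0, SEEK_END) != 0) { fclose(fp); return -1; }
    long n = ftell(fp);
    if (n < 0) { fclose(fp); return -1; }
    rewind(fp);
    f->buf = malloc(n > 0 ? (size_t)n : 1);
    if (f->buf == NULL) { fclose(fp); return -1; }
    if (fread(f->buf, 1, (size_t)n, fp) != (size_t)n) {
        free(f->buf);
        f->buf = NULL;
        fclose(fp);
        return -1;
    }
    fclose(fp);
    *data = f->buf;
    *len = (size_t)n;
    return 0;
}
static void file_release(image_source *self)
{
    file_state *f = self->state;
    free(f->buf);
    f->buf = NULL;
}

image_source file_source(file_state *f)
{
    image_source s = { file_load, file_release, f };
    return s;
}

/* Stand-in for the shared init path: it only sees the interface. */
int init_from_source(image_source *src, size_t *image_len)
{
    assert(src != NULL && image_len != NULL);
    const unsigned char *data = NULL;
    size_t len = 0;
    if (src->load(src, &data, &len) != 0) return -1;
    /* A real implementation would deserialize `data` here. */
    *image_len = len;
    src->release(src);
    return 0;
}
```

With this split, a network loader would be a third pair of load/release functions, and the "if (standalone_aot_mode)" checks would move out of the init code and into the choice of source.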

c42f · 2019-06-10T06:21:04Z

test/standalone-aot/runtests.lua

+void hello();
+]]
+lib.init_lib()
+lib.hello()


This is cool. Ultimately better to do this test from C though?

Yes. I'm not really sure how C tests fit in with Julia's testing infrastructure, though (not that Lua helps with that--Lua was just easy to try).

There's the test/embedding directory which seems quite similar in concept.

c42f · 2019-06-10T06:22:52Z

src/aotcompile.cpp

@@ -239,6 +331,113 @@ static void makeSafeName(GlobalObject &G)
        G.setName(StringRef(SafeName.data(), SafeName.size()));
 }

+bool isinlibjulia(std::string name) {


I'm not quite sure what's going on here, but this and jl_name_from_type look fishy :-)

By which I mean - having these lists written out makes me think "there must be a better way" ;-)

tshort · 2019-06-10T10:57:28Z

About jl_apply_generic — what might you hope for this to do? Would it be acceptable to embed the julia IR for such functions and run it in the interpreter?

Running in an interpreter might be an option. I'm not sure how complex that would be to handle or how much overhead it would add. For now, I'm planning to just not support it.

I suppose I'm a bit confused about overall aim here, other than "make wasm work well". Is the goal to

Avoid runtime codegen?

Yes.

Avoid paying for the size of the standard sysimage?

Yes.

Allow embedding a sysimage?

No. There's a mini image embedded that holds global variables, but it doesn't hold code, and it's very limited.

Avoid paying for the size of libjulia?

Maybe. libjulia isn't that big. But, if you use static linking, you might be able to strip out unused parts of libjulia. The same is true of other C/C++ libraries that compiled Julia code uses.

andyferris · 2019-06-13T13:52:38Z

I just wanted to say thank you for looking at this - this could immensely expand where I could use Julia (e.g. at work we were discussing the difficulty of using Julia in AWS Lambda functions; smallish precompiled binaries would make this feasible, same for responsive CLI tools).

c42f · 2019-06-14T05:30:33Z

Looking more at this, I still feel like the concepts of sysimage and mini sysimage might not really be that different and could share more code.

_julia_init and jl_init_basics are quite similar
jl_save_incremental and jl_save_mini_image_to_stream share many similarities
_jl_restore_incremental and jl_restore_mini_sysimg are similar

What are the essential points of difference, and can we make things neater by closing the gap a bit? For example

Generalizing the code which locates the image data (cf comments about "resource" data above)
Allowing a few more things to go into the mini image to possibly avoid some special cases like jl_init_types2

tshort · 2019-06-14T18:00:12Z

You're right on about the overlap, @c42f. I'll spend some time looking for ways to bridge the gap.

JeffBezanson · 2019-06-18T16:26:32Z

src/dump.c

+            dt = jl_uint32_type;
+        }
+        else if (dt->size == 8) {  // change the type to a UInt64
+            dt = jl_uint64_type;


I don't really understand this. Saving the wrong type tag for something doesn't seem useful?

It was useful in the sense that it could make some code compile and run where it wouldn't compile before. The wrong type is often not a problem because the compiled code doesn't really use the type information, it just needs to get the size right (I know that this isn't always true, so it'd be nice not to do this). I was having problems where saving some types would cause the mini-image to explode in size as it tried to pull in more dependent modules and types. That is likely a sign of another problem, and maybe (hopefully) this is just a temporary band aid, but I haven't found the right approach.

JeffBezanson · 2019-06-18T16:27:31Z

src/dump.c

+            dt = jl_uint64_type;
+        }
+        else if (dt->size > 0) {  // change the type to a primitive type with correct size
+            dt = jl_new_primitivetype(jl_symbol("BitsTypeX"), jl_core_module, jl_any_type, jl_emptysvec, dt->size * 8);


If this code doesn't have a fixed repertoire of types --- such that it can handle this new primitive type --- then why not just save the correct type to begin with?

JeffBezanson · 2019-06-18T16:33:46Z

I don't understand the notion that jl_apply_generic can't work. It's a perfectly normal C-callable function. It just does a table lookup, gets a function pointer from that, and calls it. It's also going to be very hard to get any significant piece of julia code working without it. I agree with sometimes wanting to remove the JIT, possibly wanting to remove eval, wanting to remove large parts of Base/stdlib, and maybe even removing the GC. But removing jl_apply_generic would need a very unusual and restrictive context indeed --- an architecture with no indirect call instruction perhaps?
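
To make the "table lookup, then indirect call" description concrete, here is a toy model in plain C. It is not Julia's method table or cache, and all names (value, method_entry, double_methods, apply_generic) are invented for the example. It only shows that a dispatch table can be built ahead of time and stored as ordinary data next to precompiled functions.

```c
#include <assert.h>
#include <stddef.h>

/* Toy type tags standing in for runtime types. */
typedef enum { T_INT, T_FLOAT } type_tag;

typedef struct {
    type_tag tag;
    union { long i; double f; } v;
} value;

typedef value (*method_fn)(value);

/* One precompiled specialization per argument type. */
static value double_int(value x)   { x.v.i *= 2;   return x; }
static value double_float(value x) { x.v.f *= 2.0; return x; }

/* The "method table": built ahead of time, stored as plain data. */
typedef struct { type_tag arg; method_fn fn; } method_entry;

static const method_entry double_methods[] = {
    { T_INT,   double_int   },
    { T_FLOAT, double_float },
};

/* Generic dispatch: find the entry matching the argument's runtime type,
   then call through the stored function pointer.
   Returns 0 on success, -1 if no method matches. */
int apply_generic(const method_entry *table, size_t n, value arg, value *out)
{
    assert(table != NULL && out != NULL);
    for (size_t k = 0; k < n; k++) {
        if (table[k].arg == arg.tag) {
            *out = table[k].fn(arg);   /* the indirect call */
            return 0;
        }
    }
    return -1;  /* loosely analogous to a MethodError */
}
```

Nothing in this path needs a JIT. The open question in the thread is how to make sure every function the table points to was compiled into the output.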

tshort · 2019-06-19T03:03:30Z

I don't understand the notion that jl_apply_generic can't work.

I think of it more as "can't easily work" (at least by me at my state of understanding). I don't understand how to generate or store the table and the functions it points to. I don't think the functions that are pointed to are compiled as part of jl_create_native(). I'm probably missing something...

JeffBezanson · 2019-06-19T15:57:33Z

I don't understand how to generate or store the table and the functions it points to.

We already do that in the system image, so it's possible...

JeffBezanson · 2019-06-19T16:13:06Z

Let me address some of the goals of this:

Avoid runtime codegen?

There are three nascent mechanisms related to this that could be developed further:

You can pass --compile=no or --compile=min to disable the JIT.
You can build a system image with --compile=all and we'll attempt to exhaustively compile everything, such that --compile=no can work in a subsequent run.
You can change the build-time variable JULIACODEGEN to exclude LLVM codegen from libjulia entirely. This is not tested so probably needs some attention.

Avoid paying for the size of the standard sysimage?

https://www.youtube.com/watch?v=4NHJqGA6fTw
"Since the dawn of time mankind hath sought to make things smaller"

The easiest way to do this currently is to remove the stdlibs from the sysimg build (base/sysimg.jl). To improve further, I suspect we need some kind of tree-shaking mechanism that tries to remove everything that won't be used at run time (e.g. global bindings that are never referenced). Of course that can't work in general (e.g. if a program calls eval) but can be addressed per-application when needed.

tshort · 2019-06-19T17:23:50Z

On avoiding the runtime codegen, @Keno did all that in julia-wasm.

Regarding size, the cut-down "PackageCompiler" approach is interesting. It's not clear to me how the tree shaking would work. Another issue with that is if/how it would support cross-compilation. With the jl_create_native() approach, that is straightforward a la CUDAnative.

tshort · 2019-06-27T01:13:32Z

Closing as core developers have suggested that a better approach is through a PackageCompiler / static compilation approach. A key to small code size will be the tree shaking.

tkoolen · 2019-06-27T01:23:34Z

Thanks for meta-shaking the AOT compilation tree anyway!

datnamer · 2019-07-01T21:58:46Z

@tshort can that path also potentially support minimal runtime targets like WASM and embedded?

Keno · 2019-07-01T22:00:22Z

WASM is not necessarily a minimal runtime target. In any case, that'll be the right place to start. We can come back here and add any missing features to Julia as necessary.

tshort added 2 commits June 5, 2019 21:24

Cherry pick standalone-mode

b75fe71

Cleanups

c5b3599

c42f reviewed Jun 10, 2019


Petr-Hlavenka mentioned this pull request Jun 13, 2019

Pkg + BinaryProvider JuliaLang/Pkg.jl#841

Closed

Remove debug line

28162fb

JeffBezanson reviewed Jun 18, 2019


tshort mentioned this pull request Jun 19, 2019

Instructions tshort/julia#2

Open

tshort closed this Jun 27, 2019

tshort mentioned this pull request Aug 25, 2019

Starting points on cross compiling / static compiling Julia-Embedded/Julia-Embedded-Master#1

Open

tshort mentioned this pull request Oct 24, 2019

Tree shaking / reducing custom system images #33670

Open
