Additional file extensions for deploying julia scripts/apps #34759

Open
rapus95 opened this issue Feb 14, 2020 · 37 comments
Labels
speculative Whether the change will be implemented is speculative

Comments

@rapus95
Contributor

rapus95 commented Feb 14, 2020

"Taking end user deployability seriously" would've been a proper name for it aswell 😛.

I'd like to propose adding two more file extensions to the repertoire of Julia.

Current situation: Extension .JL
That's the extension for Julia source files. Since it is currently the only file extension, all Julia files have it, and it is probably bound to an editor of the user's choice.


Case 1: Deploy julia scripts to "clicky" end users (e.g. Windows)

Let them install Julia and send them the script file. Then they have to open the REPL and call include, or be instructed to pass the file as the first command line argument.

Proposal 1: New Extension .JLR (=julia runnable)

This is just an ordinary julia source file which is bound to be run by the julia executable.
That way we don't need to mess with the binding of .jl files which most people probably bound to an editor of their choice. We'd also be able to ship the "run with Julia" by default with all kinds of installers. As such those files would naturally mark script-like entry points into running julia programs.

Sidenote: This file extension was used by the Juliar language for its source files until late 2019. Since that language was renamed to Juka, its source files were renamed to .JUK, freeing the extension again.


Case 2: Provide a more standalone feeling similar to that you get from Java

To run a Java program you need to install the JVM and from then on you can run arbitrary .JAR files which mostly are packed and precompiled code repositories together with their dependencies.

Proposal 2: New Extension .JLB (=julia binary)

While technically not a binary it would be very similar to the .JAR format of Java as it'd also hold the necessary data in a packed format and be bound to be run with the runtime executable. For now it would just be an archive consisting of the corresponding Julia packages (and maybe a runnable).
The actual binary format can change at any time since its sole use is to work out of the box on any platform which has a Julia installation (call it the JuliaVM). No internet required.
It's some sort of the batteries included approach which only relies on a preexisting runtime.
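
A minimal sketch of the packing side of this idea, assuming a Julia version where Tar is a standard library (1.6 and later) and assuming the bundle is nothing more than a tarball of a project directory. The helper name `pack_bundle` and the `main.jl` entry point convention are hypothetical; nothing like this exists in Julia itself.

```julia
using Tar

# Hypothetical helper: pack a project directory (Project.toml, Manifest.toml,
# main.jl, vendored packages) into a single archive with a .jlb extension.
function pack_bundle(project_dir::AbstractString, out::AbstractString)
    isfile(joinpath(project_dir, "main.jl")) ||
        error("expected an entry point at main.jl")
    Tar.create(project_dir, out)   # plain, uncompressed tarball
    return out
end

# Usage: build a toy project in a temp dir and pack it.
project = mktempdir()
write(joinpath(project, "main.jl"), "println(\"hello from the bundle\")\n")
bundle = pack_bundle(project, joinpath(mktempdir(), "demo.jlb"))
```

The extension carries no meaning for the archive itself; it only gives the operating system something to associate with the Julia executable.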

That'd increase the spread of Julia a lot, for much the same reason it works for JAR files. Most users of Java programs aren't programmers. Nevertheless, running a Java program is dead simple (once the VM installer has been run). With this approach we'd have a very similar dead simple solution which is straightforward to install.

So while we keep reaching for the stars, i.e. true AOT statically compiled shared libraries/executable binaries, this approach would be a good intermediate solution for creating conveniently shareable programs in Julia.

@fredrikekre changed the title from "Taking end user deployability seriously" to "Additional file extensions for deploying julia scripts/apps" on Feb 14, 2020
@ndgnuh

ndgnuh commented Feb 14, 2020

I have some ideas:

  • julie - Julia.Instant.Executable 🤣
  • jlp - Juliapp
  • jlx - "executable", again

But still, jl scripts are executable too, so a name based on "executable" doesn't mean much (although jlx sounds kinda cool).

@rapus95
Contributor Author

rapus95 commented Feb 14, 2020

But still, jl scripts are executable too, so a name based on "executable" doesn't mean much

While true, it allows for a different file binding and thus for different default behaviour when doubleclicking it.

@StefanKarpinski
Member

I do like .jlr, but it seems a bit odd for applications to have an extension at all. The extension is generally meant to indicate something to the user or the system about what's in something, but for an executable, why does it matter to either that Julia is the implementation language?

@rapus95
Contributor Author

rapus95 commented Feb 14, 2020

I do like .jlr, but it seems a bit odd for applications to have an extension at all. The extension is generally meant to indicate something to the user or the system about what's in something, but for an executable, why does it matter to either that Julia is the implementation language?

I could ask the same about .JAR 😛 or .EXE or .BAT or .SH and so on. The latter ones are technically just .txt files.

@ghost

ghost commented Feb 14, 2020

In gnome, executable text files show a prompt:
[Screenshot from 2020-02-14 19-56-36]

@StefanKarpinski
Member

StefanKarpinski commented Feb 14, 2020

The common convention seems to be a .jl file with a shebang line and/or in a bin directory. There is also the executable bit to indicate whether a file is meant/allowed to be run as a program or not. Regarding other extensions:

  • .jar is a distinct format, not an executable Java source file
  • .exe is a holdover from DOS, long before executable permission bits existed
  • .bat and .sh are more to tell applications how to edit files, not for the user

The tendency on modern systems is to hide the extension from the user so they don't have to notice details like "oh, this program is implemented with a batch script". Rather than precedents that exist by historical accident like this, how about the present day motivation: what is the benefit (to user or system) of having an extension for executable Julia files?
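
The shebang convention mentioned at the top of this comment can be sketched as follows. The file name `greet.jl` and the function `greeting` are made up for illustration; the pattern assumes a Unix-like system where `julia` is on the PATH and the file has been marked executable with `chmod +x`.

```julia
#!/usr/bin/env julia
# greet.jl: a hypothetical script. On Unix-like systems, `chmod +x greet.jl`
# lets it be started as `./greet.jl Ada`.

# Keep the logic in a function so the file can also be `include`d.
function greeting(args::Vector{String})
    name = isempty(args) ? "world" : first(args)
    return "Hello, $(name)!"
end

# Run only when invoked as a script, not when included from elsewhere.
if abspath(PROGRAM_FILE) == @__FILE__
    println(greeting(ARGS))
end
```

Julia treats the first line as an ordinary comment, so the same file still works with `julia greet.jl` on systems that ignore shebang lines.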

@rapus95
Contributor Author

rapus95 commented Feb 14, 2020

@StefanKarpinski
The runnable (jlr) is comparable to .bat/.sh, as it is also a script which just calls the right stuff in the right order. It would mostly be used for that sort of convenient glue code. If you have larger code bases, you'd go for the more complete solution (jlb) anyway.

For more complex programs which would be packed into the jlb, the comparison to .jar is quite good, as a .jar holds Java bytecode (and that is decompilable into the Java source). As such it can be considered bijectively equivalent to a source file archive/a tarball. Also, the actual spec of the binary format would change anyway once we have AOT static compilation. Augmenting the tarballs with entry points and packing the dependencies into them as well is just a good solution in the meantime. Also, using a jlr file as the entry point would be quite consistent here, as you'd fall back to the jlr workflow in case you unpack the tarball-style binary.

And regarding the executable flag: that might be true for Linux, but the whole system of runnable/binary deployables as I proposed it aims at users who aren't programmers in the first place. Many of those are not on Linux but on Windows. There the executable bit doesn't matter (not even sure if it even exists). On Windows it just depends on which application a given filetype has been bound to. Even if we could first open a dialog which asks whether to run or open, that could be more confusing than hunting for the .jlr file, at least to those who just need to run programs and have no clue how to write them.

Sure, often the file endings are hidden, but even then, the default setting is a "filetype" column:
[image]
(For security reasons)
Now consider you indeed have a single jlr file within a bunch of jl files. You can see how easy it is to spot the jlr file just by looking at the icon. (The text sheet is the icon for vscode btw)

@c42f
Member

c42f commented Feb 17, 2020

what is the benefit (to user or system) of having an extension for executable Julia files

I think the idea here is that it would be nice for non-technical windows users to be able to run a julia application by clicking on it. FWIW I agree that:

  • This would be very useful for julia programmers who support non-technical end users
  • It's very easy to implement on windows with a file extension association. Other methods are a lot less obvious.

But I also think there's other ways to achieve the goal of "deploy a complete application and have a great end-user experience". Perhaps PackageCompiler is a better way forward, for example.

@rapus95
Contributor Author

rapus95 commented Feb 17, 2020

But I also think there's other ways to achieve the goal of "deploy a complete application and have a great end-user experience". Perhaps PackageCompiler is a better way forward, for example.

That's the elegance in my proposal. If we define a "binary" format for it we are able to design it freely and progressively. By that we could simply go a route like that:

binaryformat@v1: combined tarballs packed in an archive
binaryformat@v2: combined tarballs packed in an archive & augmented with a sysimg from PackageCompiler.jl
binaryformat@v3: standalone executable once we have a solution to generate those

In those cases the only requirement we will have is an existing julia installation but won't require internet (for v3 even julia is optional).

From the non-julia-affine user perspective those all work the same, so I'd consider the actually used version an implementation detail. Thus, it makes a perfect solution for deploying julia programs in a progressively enhancing way. In each version step you'd get a performance boost with the outlined order above.
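
One way to picture the versioned format is a small header inside the bundle that tells the loader which strategy to use. The sketch below is purely illustrative: the header file, its field names (`format_version`, `entry`) and the strategy symbols are assumptions, not part of any existing tool. It uses the TOML standard library (Julia 1.6 and later).

```julia
using TOML

# Hypothetical header stored inside a bundle, e.g. as `Bundle.toml`.
# The field names are assumptions made for this sketch.
const EXAMPLE_HEADER = """
format_version = 2
entry = "main.jl"
"""

# Map a format version to the loading strategy from the list above.
function load_strategy(header::AbstractString)
    meta = TOML.parse(header)
    version = meta["format_version"]
    if version == 1
        return :source_only            # unpack sources, then compile as usual
    elseif version == 2
        return :source_plus_sysimage   # as v1, but start from a bundled sysimage
    elseif version == 3
        return :standalone             # native executable, no Julia required
    else
        error("unsupported bundle format version: $version")
    end
end

load_strategy(EXAMPLE_HEADER)
```

Because the version lives inside the file, the extension and the double-click workflow could stay the same while the contents evolve.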

@KristofferC added the speculative label (Whether the change will be implemented is speculative) on Feb 17, 2020
@StefanKarpinski
Member

I still don't see why the file extension is necessary. The entire benefit seems to come from being able to produce something that the system considers to be an application (which has different formats on different systems). How does having a special extension for "Julia application" help?

@c42f
Member

c42f commented Feb 18, 2020

Agreed, this seems like a windows-specific deployment issue which would be better served by having a really easy way to create YourApp.exe where the source happens to be Julia code. Is there a reason that PackageCompiler can't solve these problems?

@rapus95
Contributor Author

rapus95 commented Feb 18, 2020

@StefanKarpinski @c42f in that case, could anyone explain to me the purpose of .JAR files? Why not turn any .JAR into a .exe? Same holds for .deb, .rpm, .msi, well, in fact all of those formats which are meant for sharing code/program bundles.

Even when adding support for compiling into .exe (which will be the last step of the roadmap I outlined above and which is currently rather far away due to several issues when code isn't available anymore) we won't have a consistent way to pack and share julia programs across distributions, except by advising others to unpack the manually packed source hierarchy, search for the entry point, and then run it with the given commands.

So that is by no means a platform dependent issue. A language which has source as their sharing language (as opposed to binaries) should at least have a suggested & comfortable way to share programs without juggling lots of files or requiring internet. Touching the Pkg organized local .julia & packing what is needed manually can't be the suggested way to share programs IMO.

.vsix (vscode extension format) is a good example as well. Registered versions are automatically pulled from the internet once their installation is requested. But in case you want to test an unregistered version or don't have internet at your place you can just import a .vsix extension file. Those are cross-platform btw.

I currently don't see PackageCompiler (or any other solution) handling relocatability & cross-compilation in a way that can replace sharing the source within the next few years. That's a fundamental issue, not one which could be solved easily by cloning @KristofferC et al. The only languages which managed to do so run in a VM or are interpreted, thus, run in a VM. So to have true platform independence without requiring a VM we would have to bundle multiple platform dependent versions and share those bundles. Those then again lack a cross-platform entry point. Thus, we need a platform dependent entry point anyway. I'd call for taking the code as macro-lifted platform dependent code, bundling it into a .JLB, and letting julia.exe (or .sh) be the platform dependent VM that lifts the actual platform dependency. Once we find a solution for the former problem we can still introduce a 2nd version for the same format extension.

In general, why are you trying to prevent claiming even a 2nd file extension? Literally every program which is compiled at some stage exists as multiple file formats. Usually one for code and one for executable. Maybe even more formats for intermediate stages. In case of Java whose portability actually is very good, there's .java (source), .class (compiled), .jar (packed runnable). We currently (!!) don't have a real compiled format (at least not file based) so that falls out anyway but having a portable runnable with all batteries included seems to be a good thing no matter what.

@StefanKarpinski
Member

StefanKarpinski commented Feb 18, 2020

in that case, could anyone explain to me the purpose of .JAR files? Why not turn any .JAR into a .exe? Same holds for .deb, .rpm, .msi, well, in fact all of those formats which are meant for sharing code/program bundles.

Those are all data formats for consumption by specific applications, not applications themselves. A .jar file is a data file that the system knows should be opened by a JVM. The system does not natively know how to execute a .jar file. A .deb file is a Debian package. An .rpm file is a Red Hat package. None of these are executables, they're bundles of data in a specific format.

we won't have a consistent way to pack and share julia programs across distributions.

I'm not sure what this means. What, specifically do you want to do?

In general, why are you trying to prevent claiming even a 2nd file extension?

I'm not preventing anything. Feel free to tweet all day about how .jlr and .jlb are whatever you want to claim that they are. You are trying to convince me and others here that we should also claim this. I am unconvinced and I'm asking you to convince me that this makes sense. I don't, as a general rule, go around claiming things that I don't understand that don't make sense to me.

We currently (!!) don't have a real compiled format (at least not file based) so that falls out anyway but having a portable runnable with all batteries included seems to be a good thing no matter what.

This seems to be getting at something concrete. Is what you want that a bundle of Julia source files can have a specific extension that can be associated with some program (maybe julia, maybe something else, like a JVM) that knows how to invoke a pre-installed julia binary on that bundle and run it as an application?

@rapus95
Contributor Author

rapus95 commented Feb 18, 2020

Finally, we're making progress, I found where we didn't share assumptions. 😄

Those are all data formats for consumption by specific applications, not applications themselves. A .jar file is a data file that the system knows should be opened by a JVM. The system does not natively know how to execute a .jar file.

That's exactly what I want for .JLB as well! It won't run by itself but needs julia.exe to be executed, i.e. julia -jlb myapp.jlb would execute it, for example.

EDIT: you found it first 😆

This seems to be getting at something concrete. Is what you want that a bundle of Julia source files can have a specific extension that can be associated with some program (maybe julia, maybe something else, like a JVM) that knows how to invoke a pre-installed julia binary on that bundle and run it as an application?

That's it!

@c42f
Member

c42f commented Feb 19, 2020

I completely agree that easier deployment for end users is a desirable goal. The problem with this issue is that we've jumped immediately from agreeing on a high level goal to disagreeing about an implementation detail (ie, which, if any, file extensions should be involved). I think the discussion would be more productive if we could first agree on the exact problem we're trying to solve here.

From my point of view, a relevant "user story" (from my previous job) would be thus:

I'm a developer of julia applications for signal processing and I'd like to deploy these to my non-technical windows users without having to maintain any server infrastructure. Many of these users don't understand command line interfaces and it's not always a good use of their time — or mine — to expect them to learn. Instead, they should be able to click on an icon on their desktop (or menu) and have some options presented to them in a graphical interface. For example, to select input and output files. The application may depend on several packages, including those which call C code like FFTW. I want to be able to fix package versions using Project/Manifest so that deployment is reproducible.

@c42f
Member

c42f commented Feb 19, 2020

Note that the story above is probably way too broad, but it exposes several difficulties which don't have a lot to do with file extensions:

  • Desiring reproducible deployment has led to a naively multi-file situation where we'd like to have Project/Manifest available somehow; perhaps embedded, or have the environment pre-packaged so that instantiation happens at build-time. It's somewhat possible to have the environment embedded in a script, but this is kind of a hack. If we're going to do "runnable apps" properly we should encourage reproducible deployment: we should require a Manifest and make it really easy to provide one. Here I suspect we've already gone beyond what a simple renaming of julia source .jl into .jlr could achieve.
  • Calling C libs from Julia is quite typical which makes having truly cross-platform bundles (.jlb like .jar) less attractive. Apparently you can do it with jar files but it's kind of messy and certainly bloats the package above what would be required for a platform-specific one. Platform-specific bundles are an alternative, but if we go in that direction why not just provide good support for native platform-specific bundling mechanisms?

Overall, I think the .jlb idea is potentially interesting (I say this somewhat naively; for example I'm no expert on the internals of Pkg nor PackageCompiler). There's a lot of detail to work out though, and I think this work would best be done in a separate package to start with. This is purely a practical thing: experimenting in a package gives you a lot more freedom to move fast and break things :-)

@rapus95
Contributor Author

rapus95 commented Feb 19, 2020

@c42f I'm not sure if you followed our last comments regarding .jlb files not being stand-alone. They'd still require a preinstalled julia binary (julia.exe for windows) which knows how to execute those bundles. That would also manage platform dependencies. In my suggested first iteration of this idea the bundle would just contain all the julia source files, and "executing the bundle" in fact just does the same as if they were proper packages, including bundle-internal dependency resolution, precompilation & compilation. In a further iteration we could include a sysimage for a faster startup. And in the final iteration we'd remove the requirement of any preinstalled VM (like julia.exe) but would gain strict platform dependency. By then it'd be a tradeoff which sometimes is definitely worthwhile and sometimes is not. So having both options at the end will be beneficial.

@KristofferC
Member

KristofferC commented Feb 19, 2020

What's the advantage of shipping all the packages sources when you still have to go get the artifacts etc? You might as well get the packages in the same way during this "build step" of the bundle (with the exception of locally developed packages that would, of course, need to be bundled) and you might as well share that with the local environment because maybe many of those packages/artifacts are already installed.

It seems that the whole thing just folds into a Project/Manifest and a file (main.jl) plus eventual locally developed code that does

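using Pkg

Pkg.activate(@__DIR__)   # use the Project/Manifest next to this file
Pkg.instantiate()        # install the exact versions recorded in the Manifest

using DataFrames
using LocalPackage # tracked by relative path in Manifest

function run_my_code()
   ...
end

run_my_code()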

that you run with julia main.jl?

@rapus95
Contributor Author

rapus95 commented Feb 19, 2020

Well, I'd bundle the artifacts together with it. Also, there are a lot of packages which don't require artifacts. Especially from the linear algebra part. Those would already benefit. And as a last resort if there's some really desperate need for an artifact we could still require the user to install it manually or if that artifact is needed widely enough, it could be added to the julia distribution (similar to blas etc). Though, I'd not really like the latter option and would simply bundle them together.

It seems that the whole thing just folds into a Project/Manifest and a file (main.jl) plus eventual locally developed code

That's almost it. But then again, that file would be within the jlb archive and thus couldn't be run. So as a matter of fact, that snippet of code had to live in the julia executable and be executed whenever someone calls julia with a jlb file.
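
A rough sketch of what such a built-in launcher might do, assuming the bundle is a plain tarball with a Project.toml and a main.jl at its root. The function name `run_bundle` and the layout are hypothetical, and the sketch relies on the Tar standard library (Julia 1.6 and later), which did not ship with Julia at the time of this discussion.

```julia
using Pkg, Tar

# Hypothetical launcher: what `julia myapp.jlb` could do under the hood.
# Assumes a plain tarball with Project.toml and main.jl at its root.
function run_bundle(bundle::AbstractString)
    dir = Tar.extract(bundle)   # unpack into a fresh temp directory
    Pkg.activate(dir)           # use the bundled Project/Manifest
    # Pkg.instantiate() would go here; a sealed bundle would have to resolve
    # everything from the archive instead of the network.
    return include(joinpath(dir, "main.jl"))
end

# Usage: build a toy bundle and run it.
src = mktempdir()
write(joinpath(src, "Project.toml"), "[deps]\n")
write(joinpath(src, "main.jl"), "6 * 7\n")
bundle = Tar.create(src, joinpath(mktempdir(), "demo.jlb"))
result = run_bundle(bundle)
```

The toy bundle has no dependencies, so the sketch says nothing about the hard part raised in this thread: where artifacts and binary dependencies come from.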

Also, there's currently no way to conveniently setup a sealed bundle, i.e. one which only resolves dependencies within the provided packages and without touching the local storage managed by Pkg.

Tbh I don't even know how to drop in packages when my local julia install doesn't have internet. Can I simply put a packages/T directory copied from somewhere else into my local packages/ directory? Or would I have to drop in a clone of the general registry as well, etc.? Seems more cumbersome than a sealed bundle which already contains everything that's needed. Similar to a docker container with everything set up, just without the julia distribution and without the platform. Only Julia code and its direct dependencies.

@KristofferC
Member

Well, I'd bundle the artifacts together with it.

Then you need to make sure that you bundle the correct artifact for the correct system etc which I kind of thought the point was not to have to do :

They'd still require a preinstalled julia binary (julia.exe for windows) which knows how to execute those bundles. That would also manage platform dependencies.

You also need to deal with packages that use BinaryProvider which is more difficult.

Also, there's currently no way to conveniently setup a sealed bundle, i.e. one which only resolves dependencies within the provided packages and without touching the local storage managed by Pkg.

Don't really understand what this means but isn't DEPOT_PATH what you are talking about here?

Can I simply put a packages/T directory copied from somewhere else into my local packages/ directory?

Sure, if you then Pkg.develop that path.
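
To illustrate the DEPOT_PATH suggestion: the sketch below temporarily points Julia at a single depot directory, so that package operations inside the callback only see that directory. The helper name `with_sealed_depot` is hypothetical, and whether this amounts to a "sealed bundle" in the sense asked about is an open question.

```julia
# Hypothetical helper: run `f` with DEPOT_PATH restricted to one directory,
# then restore the previous depots even if `f` throws.
function with_sealed_depot(f, bundle_depot::AbstractString)
    saved = copy(DEPOT_PATH)
    try
        empty!(DEPOT_PATH)                # forget ~/.julia and system depots
        push!(DEPOT_PATH, bundle_depot)   # packages/, artifacts/, ... live here
        return f()
    finally
        empty!(DEPOT_PATH)
        append!(DEPOT_PATH, saved)
    end
end

# Usage: inside the callback only the bundled depot is visible.
depot = mktempdir()
seen = with_sealed_depot(() -> copy(DEPOT_PATH), depot)
```

The same effect can be had from outside the process with the `JULIA_DEPOT_PATH` environment variable. Inside the callback, a copied package directory could then be added with `Pkg.develop(path=...)` as described above.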

@rapus95
Contributor Author

rapus95 commented Feb 19, 2020

I may describe the problem from the end user's perspective:

  1. A friend sends me a program.jlb which I wanna run. I know that it requires Julia.
  2. (Only has to be done once) Go to julialang.org/downloads and download the appropriate distribution for my platform. Install that.
  3. Double click the .jlb file.
  4. Profit

The fact that it's only a small snippet of code which would be needed to support that first iteration of the format, is what makes it so appealing IMO.

You also need to deal with packages that use BinaryProvider which is more difficult.

If you don't have internet you're out of luck in that case anyway. BinaryProvider would need an update, I guess, to first search in the bundle. Or we just offer that bundle packing only on Julia 1.3 and upwards.

There's another advantage as well. Even if you have internet, without bundling you'll never be able to achieve the convenience outlined above if any dependency is private. Except, of course, if you set up a script which either has an SSH key or token inlined in plaintext, or which first authenticates against a third server to get an SSH access key. That authentication needs to be stored somewhere as well, though, so access ends up in plain text again. I know, if I bundle private repositories in the bundle then it's not secure either, but it doesn't force me to make my package public. Sharing the snapshot would currently definitely be the most convenient way to work with private packages on a computer where the user doesn't understand much more than "install julia".

@KristofferC
Member

KristofferC commented Feb 19, 2020

I may describe the problem from the end user's perspective:
...

Yeah, that process is pretty much what you get with PackageCompiler.jl and the create_app function (except it doesn't create a .jlb file, it creates a folder with an executable in it). And as a bonus, you don't even have to install Julia.
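
For readers unfamiliar with it, the entry point that `create_app` looks for is a `julia_main()::Cint` function in the package's top-level module. A minimal sketch, where the module name `MyApp` and the paths in the trailing comment are placeholders:

```julia
# Hypothetical package module; PackageCompiler's `create_app` expects the
# package to define `julia_main()::Cint` as its entry point.
module MyApp

function julia_main()::Cint
    try
        println("running with arguments: ", join(ARGS, " "))
    catch err
        showerror(stderr, err)
        return 1   # non-zero exit code signals failure
    end
    return 0       # exit code of the compiled executable
end

end # module

# With PackageCompiler installed and MyApp laid out as a package directory,
# the app folder would be built roughly like this (not run here):
#   using PackageCompiler
#   create_app("path/to/MyApp", "MyAppCompiled")
```

The result is a directory containing an executable plus the libraries it needs, which is the difference from the single-file bundle discussed in this thread.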

Now the reason against PackageCompiler was relocatability (FWIW, I don't think the statement below is true, _jll packages are quickly being used and these are relocatable):

I currently don't see PackageCompiler (or any other solution) handling relocatability & cross-compilation in a way that can replace sharing the source within the next few years.

The whole reason the strategy here (using the source files) would help with relocatability is if you allow the build step to run on the user machine so the Julia machinery gets to do its "normal thing" where it runs Julia code to download dependencies and set up paths etc.
You cannot remove this step because then you have the same problem with relocatability as PackageCompiler. The build step is actually modifying the source files (so each user has a different set of "sources", which is discussed in https://julialang.github.io/PackageCompiler.jl/dev/apps/#Relocatability-1).

If you don't have internet you're out of luck in that case anyway. BinaryProvider would need an update, I guess, to first search in the bundle. Or we just offer that bundle packing only on Julia 1.3 and upwards.

That sounds exactly like what is already implemented in PackageCompiler for artifacts.

Even if you have internet, without bundling, you'll never be able to achieve the convenience outlined above if any dependency is private.

Yes, which is why I said:

(with the exception of locally developed packages that would, of course, need to be bundled)

I still don't see any argument why this is not just inherently #34759 (comment).

@rapus95
Contributor Author

rapus95 commented Feb 19, 2020

The whole reason the strategy here (using the source files) would help with relocatability is if you allow the build step to run on the user machine so the Julia machinery gets to do its "normal thing" where it runs Julia code to download dependencies and set up paths etc.

I know, and that's why I propose this source bundling.

@KristofferC
Member

But at the same time, you say

Well, I'd bundle the artifacts together with it

How would that work with relocatability then? What would the build step do?

@rapus95
Contributor Author

rapus95 commented Feb 19, 2020

Yeah, that process is pretty much what you get with PackageCompiler.jl and the create_app function (except it doesn't create a .jlb file, it creates a folder with an executable in it).

That's explicitly what I don't want.
In your case I need to send a directory with some hierarchy and stuff. Inside that directory there's a file which shall be executed.
In my case there's a single file which you double click.
This difference is the sole reason for self-extracting archives to exist 😅
Regarding the bonus of not having to install julia, I'm fine to install julia once and from then on be able to run .jlb files whenever I get them.

(with the exception of locally developed packages that would, of course, need to be bundled)

If we need bundles for locally developed (or privately hosted) packages anyway, then IMO we should support all kinds of packages in those bundles, no matter if they are developed locally. 😁

How would that work with relocatability then? What would the build step do?

extract the artifact for the correct platform out of the bundle and use that in the build step.
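
As a sketch of that selection step: assuming the bundle stores one artifact directory per platform under a made-up naming scheme such as `linux-x86_64`, the launcher could pick the right one from `Sys.KERNEL` and `Sys.ARCH`. The layout and the function names are hypothetical; Julia's own artifact system describes platforms differently, in Artifacts.toml.

```julia
# Hypothetical bundle layout: one subdirectory per platform, e.g.
#   artifacts/linux-x86_64/, artifacts/windows-x86_64/, artifacts/macos-aarch64/
# The naming scheme is an assumption made for this sketch.
function platform_key(; os::Symbol = Sys.KERNEL, arch::Symbol = Sys.ARCH)
    osname = os === :Linux  ? "linux" :
             os === :Darwin ? "macos" :
             os in (:Windows, :NT) ? "windows" :
             error("unsupported OS: $os")
    return "$(osname)-$(arch)"
end

# Directory inside the unpacked bundle that holds this machine's binaries.
artifact_dir(bundle_root; kwargs...) =
    joinpath(bundle_root, "artifacts", platform_key(; kwargs...))
```

For example, `artifact_dir("bundle"; os = :NT, arch = :x86_64)` points at the `windows-x86_64` subdirectory, while calling it without keywords uses the host platform.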

And to repeat myself:

Having both options at the end will be beneficial.

Options are: binary deployment which is platform dependent and source bundle deployment which is platform independent but needs a VM-style executable preinstalled and artifacts for all required platforms included. That's batteries included.
Binary deployment comes in the platform specific executable format (.exe for windows) while source deployment always comes as .jlb

@KristofferC
Member

In your case I need to send a directory with some hierarchy and stuff. Inside that directory there's a file which shall be executed.
In my case there's a single file which you double click.

That's the easy part once you have the first part working well.

source bundle deployment which is platform independent but needs a VM-style executable preinstalled and artifacts for all required platforms included.

Okay, so you want to bundle all artifacts for all platforms in the .jlb? So if someone uses FFTW you would bundle everything in https://github.com/JuliaBinaryWrappers/FFTW_jll.jl/blob/master/Artifacts.toml? That just seems like a non-starter...
Also, the build step in Julia runs the build.jl file with arbitrary Julia code in its glory, that's it. And the build.jl file looks like https://github.com/JuliaIO/CodecZlib.jl/blob/master/deps/build.jl. Hooking into that to redirect looking for libraries somewhere else is not easy and it would in large part replicate the artifact system we already have.

Anyway, I think the next step here is for someone interested in this to provide some proof of concept or MVP of the thing so it can be tried out and potential issues can be discovered / fixed.

@rapus95
Contributor Author

rapus95 commented Feb 19, 2020

Okay, so you want to bundle all artifacts for all platforms in the .jlb?

Exactly.

So if someone uses FFTW you would bundle everything in https://github.com/JuliaBinaryWrappers/FFTW_jll.jl/blob/master/Artifacts.toml? That just seems like a non-starter...

That's yet another reason why we should offer both binary & source deployment. For cases where you heavily depend on platform dependent libraries, source deployment will indeed be a non-starter. But in cases where your code only has a single, small platform dependency (or none at all), source deployment can be very convenient. The more artifacts get replaced by pure Julia rewrites, the broader the use case for source deployment becomes.

I envision a future where we have both approaches even mixed, like:
You deploy a source bundle which, once on the target system, is then compiled into a platform dependent executable.
That way you get maximum relocatability (because you do the build step on the target system) as well as maximum startup and overall performance (because you compile it into a binary).

Also, the build step in Julia runs the build.jl file with arbitrary Julia code in its glory, that's it. And the build.jl file looks like https://github.com/JuliaIO/CodecZlib.jl/blob/master/deps/build.jl. Hooking into that to redirect looking for libraries somewhere else is not easy and it would in large part replicate the artifact system we already have.

That's why I'd suggest supporting only Julia 1.3 onwards and giving the artifact system the explicit capability to look up everything inside the bundle when it is started from a bundle.

Once we have that, we could simply pack stdlib as a bundle and distribute that by default. That would also better reflect the immutable state of stdlib in pre-built Julia binaries. Also, once we can start a sysimage from a bundle, we will have a clean separation between stdlib and the more intrinsic parts of Julia. We could even offer different bundles for other use cases (e.g. a focus on plotting, machine learning etc.).
Thus, double-clicking stdlib.jlb would just start the default Julia REPL.

@StefanKarpinski
Member

StefanKarpinski commented Feb 19, 2020

@rapus95, I think you're under the impression that something like JAR files make sense for Julia, which I don't believe that they do. Let me try to explain why I don't think they make sense.

Why do JAR files work for Java?

Java has, from its inception, actively rejected applications having native dependencies. The whole premise of Java from the beginning has been that everything runs inside of the platform independent JVM. They've made it about as hard as possible to call native code. The security model does not allow you to ship applications with external binary dependencies. Therefore, the nature of a typical Java app is that it only depends on Java code and the JVM.

How well does shipping apps as JAR files work for Java?

In the beginning, everyone was really amped about Java's "write once, run everywhere" promise. It was anticipated that web applications would be Java "Applets" running in a JVM embedded in the browser and that people would write desktop applications in Java and then send the same JAR file to people on any OS. But none of that happened. JavaScript beat Java in the browser and native apps beat Java on the desktop. No one writes browser apps in Java anymore and everyone hates Java desktop apps. The only place Java did well is on servers. There are many reasons, but before pushing for the JAR model for Julia, shouldn't one stop and consider whether it was a successful model in the first place?

Would something like JAR work well for Julia?

Unlike Java, Julia very much embraces native libraries. It's pretty much unavoidable in the technical computing space. You need BLAS, LAPACK, GMP, MPFR, dSFMT, FFTW, and tons more. Sure, we ship with some of those, but we'd like to reduce the ones we ship and provide them as JLLs as much as possible. There are 253 JLL packages already with more being added all the time. If you want a GUI and you use Gtk, that's already 73 JLLs.

Your position seems to be that the pre-installed Julia will handle the JLLs. But then why not just let it handle all the dependencies, not just the binary ones? After all, there's no such thing as "a bit self-contained", something either is or isn't self-contained. If I send someone an app that needs to download and install 73 JLLs to work, why would they care if it also downloads and installs some pure Julia packages? On the other hand, if I'm going to send them something that's self-contained with all the Julia packages and binaries it needs, why not also include the Julia binary itself?

The middle ground of shipping a bunch of Julia source files without Julia and without the necessary binary dependencies just doesn't make sense to me. Unlike Java, there are hardly any non-trivial Julia applications, especially ones with GUIs, that don't have additional binary dependencies.

In summary:

  1. the JAR model only ever made sense for VM-based systems like JVM and CLR;
  2. it wasn't even very successful as a way to distribute apps for those VM-based systems;
  3. it offers no benefits to a system like Julia over these already-available options:
    • share apps as projects, require Julia be installed, let Julia install dependencies
    • ship applications as standalone platform-specific executables

@rapus95
Contributor Author

rapus95 commented Feb 19, 2020

The middle ground of shipping a bunch of Julia source files without Julia and without the necessary binary dependencies just doesn't make sense to me.

I feel like we still don't have the same perception of what I am suggesting. So, to clarify again: a bundle would be created as follows. You select an entry script to bind to execution on double click, plus all platforms you want the bundle to support. Then a file is created that contains the necessary source files and the necessary artifacts for those platforms. You can now pass that file to any of the previously selected platforms. There, when calling julia.exe -jlb myfile.jlb the julia exe can fetch all dependencies from inside the bundle and starts ordinary work within that closed system. Thus, it does JIT compilation and linking to the provided binaries etc.

The other format is a true system executable. That will be created by PackageCompiler and is fully compiled and self-contained and won't depend on julia.exe to exist. In particular, it has literally no startup time.

While I agree that Java is often a pain, neither of your two options gives the clicky experience, i.e. having a default for Julia where files are interpreted/executed on double click without sacrificing any capabilities. Recall the original post: I suggested 2 formats which should be runnable by double-clicking them while in fact still just calling include on a format-dependent set of scripts.

I'm simply missing the ability to treat packages as modules which I can move around freely (and which are relocatable between platforms). There's no recommended (and explicitly supported) way to do manual module management without being familiar with the REPL or Pkg. I have some history in Minecraft mod development, and while it was DEFINITELY a pain to write mods building upon each other, once they worked it was a very smooth experience. Just drop all the mods you want into a common directory and fire up Minecraft (a .jar file which relied upon a library which handled binary dependencies). Put in your core mod and your own extension modules. It will just work.

Instead of saying "Java failed at providing convenient binary interop and JAR was made for truly platform-independent applications", I'd rather call JAR a very good idea which was flawed because they failed to provide good binary interop (due to security concerns). Julia doesn't have those concerns and has very good binary interop. Now imagine if we also had a working JAR model. I have tried to outline what that means several times now. So if no one chimes in to rephrase what I'm trying to explain, I probably have to come up with a more concrete implementation. Hopefully a working one.

@oscardssmith
Member

What @StefanKarpinski is saying is that the only reason jars make sense as a format is that java doesn't handle binary interop. To make a package that interops with binary code (which would be basically every package), you would need to bundle a version of the binary package for every type of system you want it to be able to run on. Such a bundle would both be massive, and not run-anywhere (just run anywhere for which you have included binary dependencies). A bundle is great, but if even a simple graphics app is 1GB, that starts becoming pretty unappealing.

@StefanKarpinski
Member

There, when calling julia.exe -jlb myfile.jlb the julia exe can fetch all dependencies from inside the bundle and starts ordinary work within that closed system. Thus, it does JIT compilation and linking to the provided binaries etc.

The other format is a true system executable. That will be created by PackageCompiler and is fully compiled and self-contained and won't depend on julia.exe to exist. In particular, it has literally no startup time.

What benefit does the former (invoking Julia on a .jlb) have over the latter (executable apps)?

There's no recommended (and explicitly supported) way to do manual module management without being familiar with the REPL or Pkg.

There is: just clone repos or unpack tarballs of packages in a directory that's in LOAD_PATH. Yes, the package manager will do this for you, but it's trivial to do manually as well.
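A minimal sketch of that manual route, assuming a hypothetical package named MyTool. The temporary directory stands in for wherever a repo was cloned or a tarball was unpacked, and the package content is made up for illustration:

```julia
# Stand-in for a cloned repo or unpacked tarball: a hypothetical package "MyTool"
# laid out as <dir>/MyTool/src/MyTool.jl inside a scratch directory.
package_dir = mktempdir()
mkpath(joinpath(package_dir, "MyTool", "src"))
write(joinpath(package_dir, "MyTool", "src", "MyTool.jl"), """
module MyTool
greet() = "hello from MyTool"
end
""")

# Make the directory visible to `using`/`import`; no Pkg command is involved.
push!(LOAD_PATH, package_dir)

using MyTool
println(MyTool.greet())
```

Removing the directory from LOAD_PATH (or deleting the folder) undoes the "installation", which is about as manual as module management gets.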

You seem to be doing a bait-and-switch about what you want to make possible: is it running applications or passing packages around?

@rapus95
Contributor Author

rapus95 commented Feb 20, 2020

If it feels like a bait-and-switch, I didn't mean it that way. I want solutions for both scenarios. For example, I have a runnable which does some job, and I can simply extend its functionality by placing a given package next to it.

Regarding the advantages of running julia over running an executable, in the former case you have the full power of Julia JIT compilation for example to allow the users to supply customized user functions.

E. g. I have a game with some entities running around. Now I make a source bundle of that. The user now may just plug new user created content as packages next to it but can also utilize coding as part of solving some puzzles in the game.

@martinholters
Member

Suppose Julia could access files in an archive just as if they were in the LOAD_PATH (like Java can load classes from JARs). And it had an option that said: "use this archive as project, instantiate it, run some standardized script file in it." To what extent would that solve your problem? Oh, assuming we agree on an extension for those archives and associate it with Julia, of course. The first start might be slow due to the instantiate taking place, but it would fulfill the clicky experience thing.
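As a rough sketch of what such an option could do today, emulated by unpacking the archive first: the function name run_bundle and the entry-script name main.jl are assumptions for illustration, not an existing Julia feature, and the Tar standard library is assumed to be available (Julia 1.6 or later).

```julia
using Pkg, Tar

# Hypothetical helper: `run_bundle` and the default entry name "main.jl" are
# made up for this sketch; Julia itself has no such option.
function run_bundle(archive::AbstractString; entry = "main.jl", instantiate = true)
    project_dir = Tar.extract(archive)   # unpack into a fresh temporary directory
    Pkg.activate(project_dir)            # use the bundled Project.toml as the active project
    instantiate && Pkg.instantiate()     # may download dependencies, hence the slow first start
    return include(joinpath(project_dir, entry))
end
```

Associating a file extension with something like `julia -e 'run_bundle(ARGS[1])'` would then give the double-click behaviour, at the cost of unpacking on every start.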

@StefanKarpinski
Member

StefanKarpinski commented Feb 20, 2020

Regarding the advantages of running julia over running an executable, in the former case you have the full power of Julia JIT compilation for example to allow the users to supply customized user functions.

We seem to be getting at some kind of disconnect here. There's nothing about shipping an executable app that bundles Julia with it that prevents JIT compilation or limits the power of Julia in any way. It makes no difference if the Julia executable is inside or outside of the application. An application can choose not to allow JIT, but that's totally unrelated.

For example, I have a runnable which does some job, and I can simply extend its functionality by placing a given package next to it.

E. g. I have a game with some entities running around. Now I make a source bundle of that. The user now may just plug new user created content as packages next to it but can also utilize coding as part of solving some puzzles in the game.

If you put a package in the LOAD_PATH and load it, that works. We could allow loading code from tarballs of Julia source except for packages that have deps/build.jl files, since those expect to live in writable directories. A tarball of Julia source code could have a special extension or format, but that's not really necessary. They could just be tarballs whose contents are Julia packages.

@c42f
Member

c42f commented Feb 21, 2020

We could allow loading code from tarballs of Julia source except for packages that have deps/build.jl

This sounds like a pretty good idea especially for systems which struggle with a lot of files. Currently I've got 94133 files in .julia/packages of which 20516 are .jl files. But that would be only 1109 files if we had an archive per package. It would require packages to access their non-jl "resource" files through an abstraction but that might be a good thing anyway.
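For anyone who wants to check their own numbers, a small sketch follows. The function name is made up, and it assumes the usual packages/<Name>/<slug> layout, so that "one archive per package" is counted as one per installed package version:

```julia
# Hypothetical helper for counting files below a packages directory.
function package_file_stats(root::AbstractString)
    total = 0
    jl = 0
    for (_, _, files) in walkdir(root)
        total += length(files)                       # every regular file
        jl += count(f -> endswith(f, ".jl"), files)  # Julia source files only
    end
    # Assumed layout: <root>/<Name>/<slug>, one directory per installed version.
    versions = 0
    for name in readdir(root)
        pkg = joinpath(root, name)
        isdir(pkg) || continue
        versions += count(slug -> isdir(joinpath(pkg, slug)), readdir(pkg))
    end
    return (total = total, jl = jl, versions = versions)
end
```

Calling `package_file_stats(joinpath(first(DEPOT_PATH), "packages"))` reports the three counts for the default depot, provided that directory exists.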

Recently I was working on an old HPC system where the home directory was mounted on an old SGI DMF. I had to link .julia to another mount point because the home dir was slooow. Reducing the number of files may have helped.

@StefanKarpinski
Member

Windows also has issues with lots of files. Immutable package installation has been a long-time goal, but we have only recently built the artifact system that makes it feasible. With artifacts, there's no reason we can't just load packages directly from tarballs.

@rapus95
Contributor Author

rapus95 commented Feb 23, 2020

There's nothing about shipping an executable app that bundles Julia with it that prevents JIT compilation or limits the power of Julia in any way.

Except that it would force all such executable apps to bundle Julia, separately for each platform (wouldn't it?).

@martinholters that would perfectly solve it.

use this archive as project, instantiate it, run some standardized script file in it.

Instantiation should have a flag to take the archive into account exclusively.

It would require packages to access their non-jl "resource" files through an abstraction but that might be a good thing anyway.

This would definitely help with writing relocatable packages by default.

A tarball of Julia source code could have a special extension or format, but that's not really necessary

For loading files from there, I agree. But for the default entry point (the runnable case) it would definitely be convenient, thanks to existing mechanisms on Windows (default file handlers etc.). Those files should still be loadable as plain archives, though.
