Additional file extensions for deploying julia scripts/apps #34759
Comments
I have some ideas:
But still, |
While true, it allows for a different file binding and thus for different default behaviour when doubleclicking it. |
I do like .jlr, but it seems a bit odd for applications to have an extension at all. The extension is generally meant to indicate something to the user or the system about what's in something, but for an executable, why does it matter to either of them that Julia is the implementation language? |
I could ask the same about .JAR 😛 or .EXE or .BAT or .SH and so on. The latter ones are technically just .txt files. |
The common convention seems to be a
The tendency on modern systems is to hide the extension from the user so they don't have to notice details like "oh, this program is implemented with a batch script". Rather than precedents that exist by historical accident like this, how about the present day motivation: what is the benefit (to user or system) of having an extension for executable Julia files? |
@StefanKarpinski For more complex programs which would be packed into the jlb, the comparison to .jar is quite good, as a .jar holds Java bytecode (and that is decompilable into the Java source). As such it can be considered bijectively equivalent to a source file archive/a tarball. Also, the actual spec of the binary format would change anyway once we have AOT static compilation. Augmenting the tarballs with entry points and packing the dependencies into it as well is just a good solution in the meantime. Also, using a jlr file as the entry point would be quite consistent here, as you'd fall back to the jlr workflow in case you unpack the tarball-style binary. And regarding the executable flag: that might be true for Linux, but the whole system of runnable/binary deployables as I proposed it aims at users who aren't programmers in the first place. And as such, many of those are not on Linux but on Windows. There the executable bit doesn't matter (not even sure if it even exists). On Windows it just depends on which application a given filetype has been bound to. Even if we could open a dialog first which asks whether to run or open, that could be more confusing than hunting the .jlr file. At least to those who just need to run programs and have no clue how to write them. Sure, often the file endings are hidden, but even then, the default setting is a "filetype" column: |
I think the idea here is that it would be nice for non-technical windows users to be able to run a julia application by clicking on it. FWIW I agree that:
But I also think there are other ways to achieve the goal of "deploy a complete application and have a great end-user experience". Perhaps PackageCompiler is a better way forward, for example. |
That's the elegance in my proposal. If we define a "binary" format for it we are able to design it freely and progressively. That way we could simply go a route like this:
binaryformat@v1: combined tarballs packed in an archive
In those cases the only requirement we will have is an existing julia installation but won't require internet (for v3 even julia is optional). From the non-julia-affine user perspective those all work the same, so I'd consider the actually used version an implementation detail. Thus, it makes a perfect solution for deploying julia programs in a progressively enhancing way. In each version step you'd get a performance boost with the outlined order above. |
I still don't see why the file extension is necessary. The entire benefit seems to come from being able to produce something that the system considers to be an application (which has different formats on different systems). How does having a special extension for "Julia application" help? |
Agreed, this seems like a windows-specific deployment issue which would be better served by having a really easy way to create |
@StefanKarpinski @c42f in that case could anyone explain to me the purpose of .JAR files? Why not turn any .JAR into a .exe? The same holds for .deb, .rpm, .msi, well, in fact all of those formats which are meant for sharing code/program bundles. Even when adding support for compiling into .exe (which will be the last step of the roadmap I outlined above and which is currently rather far away due to several issues when code isn't available anymore) we won't have a consistent way to pack and share julia programs across distributions. So that is by no means a platform dependent issue. A language which has source as its sharing language (as opposed to binaries) should at least have a suggested & comfortable way to share programs without juggling lots of files or requiring internet. Touching the Pkg organized local .julia & packing what is needed manually can't be the suggested way to share programs IMO. .vsix (vscode extension format) is a good example as well. Registered versions are automatically pulled from the internet once their installation is requested. But in case you want to test an unregistered version or don't have internet at your place you can just import a .vsix extension file. Those are cross-platform btw. I currently don't see PackageCompiler (or any other solution) handling relocatability & cross-compilation in a way that can replace sharing the source within the next few years. That's a fundamental issue, not one which could be solved easily by cloning @KristofferC et al. The only languages which managed to do so run in a VM or are interpreted, thus, run in a VM. So to have true platform independence without requiring a VM we would have to bundle multiple platform dependent versions and share those bundles. Those then again lack a cross-platform entry point. Thus, we need a platform dependent entry point anyway. 
I'd call for taking the code as macro-lifted platform dependent code and bundling it into a .JLB, and letting julia.exe (or .sh) be the platform dependent VM that lifts the actual platform dependency. Once we find a solution for the former problem we can still introduce a 2nd version for the same format extension. In general, why are you trying to prevent claiming even a 2nd file extension? Literally every program which is compiled at some stage exists as multiple file formats. Usually one for code and one for executable. Maybe even more formats for intermediate stages. In the case of Java, whose portability actually is very good, there's .java (source), .class (compiled), .jar (packed runnable). We currently (!!) don't have a real compiled format (at least not file based) so that falls out anyway, but having a portable runnable with all batteries included seems to be a good thing no matter what. |
Those are all data formats for consumption by specific applications, not applications themselves. A .jar file is a data file that the system knows should be opened by a JVM. The system does not natively know how to execute a .jar file. A .deb file is a Debian package. An .rpm file is a Red Hat package. None of these are executables, they're bundles of data in a specific format.
I'm not sure what this means. What, specifically do you want to do?
I'm not preventing anything. Feel free to tweet all day about how
This seems to be getting at something concrete. Is what you want that a bundle of Julia source files can have a specific extension that can be associated with some program (maybe |
Finally, we're making progress, I found where we didn't share assumptions. 😄
That's exactly what I want for .JLB aswell! It won't run by itself but needs julia.exe to be executed, i.e. EDIT: you found it first 😆
That's it! |
I completely agree that easier deployment for end users is a desirable goal. The problem with this issue is that we've jumped immediately from agreeing on a high level goal to disagreeing about an implementation detail (ie, which, if any, file extensions should be involved). I think the discussion would be more productive if we could first agree on the exact problem we're trying to solve here. From my point of view, a relevant "user story" (from my previous job) would be thus:
|
Note that the story above is probably way too broad, but it exposes several difficulties which don't have a lot to do with file extensions:
Overall, I think the |
@c42f I'm not sure if you followed our last comments regarding .jlb files not being stand-alone. They'd still require a preinstalled julia binary (julia.exe for windows) which knows how to execute those bundles. That would also manage platform dependencies. In my suggested first iteration of this idea the bundle would just contain all the julia source files, and "executing the bundle" in fact just does the same as if they were proper packages, including bundle-internal dependency resolution, precompilation & compilation. In a further iteration we could include a sysimage for a faster startup. And in the final iteration we'd remove the requirement of any preinstalled VM (like julia.exe) but would gain strict platform dependency. By then it'd be a tradeoff which sometimes is definitely worthwhile and sometimes is not. So having both options at the end will be beneficial. |
What's the advantage of shipping all the packages sources when you still have to go get the artifacts etc? You might as well get the packages in the same way during this "build step" of the bundle (with the exception of locally developed packages that would, of course, need to be bundled) and you might as well share that with the local environment because maybe many of those packages/artifacts are already installed. It seems that the whole thing just folds into a Project/Manifest and a file (
that you run with |
Well, I'd bundle the artifacts together with it. Also, there are a lot of packages which don't require artifacts. Especially from the linear algebra part. Those would already benefit. And as a last resort if there's some really desperate need for an artifact we could still require the user to install it manually or if that artifact is needed widely enough, it could be added to the julia distribution (similar to blas etc). Though, I'd not really like the latter option and would simply bundle them together.
That's almost it. But then again, that file would be within the jlb archive and thus couldn't be run. So as a matter of fact, that snippet of code would have to live in the julia executable and be executed whenever someone calls julia with a jlb file. Also, there's currently no way to conveniently set up a sealed bundle, i.e. one which only resolves dependencies within the provided packages and without touching the local storage managed by Pkg. Tbh I don't even know how to drop in packages when my local julia install doesn't have internet. Can I simply put a packages/T copied from somewhere else into my local packages/ directory? Or would I have to drop in a clone of the general registry as well etc.? That seems more cumbersome than a sealed bundle which already contains everything that's needed. Similar to a docker container with everything set up, just without the julia distribution and without the platform. Only Julia code and its direct dependencies. |
Then you need to make sure that you bundle the correct artifact for the correct system etc which I kind of thought the point was not to have to do :
You also need to deal with packages that use BinaryProvider which is more difficult.
Don't really understand what this means but isn't
Sure, if you then |
I may describe the problem from the end users perspective:
The fact that it's only a small snippet of code which would be needed to support that first iteration of the format, is what makes it so appealing IMO.
If you don't have internet you're out of luck in that case anyway. BinaryProvider would need an update, I guess, to first search in the bundle. Or we just offer that bundle packing only on Julia 1.3 and upwards. There's another advantage as well. Even if you have internet, without bundling, you'll never be able to achieve the convenience outlined above if any dependency is private. Except of course if you set up a script which either has an ssh key or token inlined in plaintext, or a script which authenticates on a 3rd server first to get an ssh access key. That authentication needs to be stored somewhere as well though, so again access is in plain text. I know, if I bundle private repositories in the bundle then it's not secure either, but it doesn't force me to make my package public. Sharing the snapshot currently would definitely be the most convenient variant to work with private packages on a computer where the user doesn't understand much more than "install julia". |
Yeah, that process is pretty much what you get with PackageCompiler.jl and the Now the reason against PackageCompiler was relocatability (FWIW, I don't think the statement below is true,
The whole reason the strategy here (using the source files) would help with relocatability is if you allow the build step to run on the user machine so the Julia machinery gets to do its "normal thing" where it runs Julia code to download dependencies and set up paths etc.
That sounds exactly like what is already implemented in PackageCompiler for artifacts.
Yes, which is why I said:
I still don't see any argument why this is not just inherently #34759 (comment). |
I know, and that's why I propose this source bundling. |
But at the same time, you say
How would that work with relocatability then? What would the build step do? |
That's what I explicitly wanna not have.
If we need bundles for locally developed (or privately hosted) packages anyway, then IMO we should support all kinds of packages in those bundles, no matter if they are developed locally. 😁
extract the artifact for the correct platform out of the bundle and use that in the build step. And to repeat myself:
Options are: binary deployment which is platform dependent and source bundle deployment which is platform independent but needs a VM-style executable preinstalled and artifacts for all required platforms included. That's batteries included. |
That's the easy part once you have the first part working well.
Okay, so you want to bundle all artifacts for all platforms in the Anyway, I think the next step here is for someone interested in this to provide some proof of concept or MVP of the thing so it can be tried out and potential issues can be discovered / fixed. |
Exactly.
That's yet another reason why we should offer both binary & source deployment. For cases where you heavily depend on platform dependent libraries, source deployment will indeed be a non-starter. But in cases where your code only has a single and small platform dependency (or none at all) the source deployment can be very convenient. The more artifacts find their way into a pure julia rewrite, the broader the use case of source deployment will become. I envision a future where we have both approaches even mixed, like:
That's why I'd probably suggest to just only support 1.3 onwards and then just give the artifact system the explicit capability that it looks up everything in the bundle if started from the bundle. Once we have that, we could simply pack stdlib as a bundle and distribute that by default. That'd also better reflect the immutable state of stdlib in pre-built julia binaries. Also, once we can start a sysimage from a bundle we will have a perfect separation of stdlib and more julia intrinsic. We can even offer different bundles for other use cases (e.g. focus on plotting, machine learning etc). |
@rapus95, I think you're under the impression that something like JAR files make sense for Julia, which I don't believe that they do. Let me try to explain why I don't think they make sense.

Why do JAR files work for Java?

Java has, from its inception, actively rejected applications having native dependencies. The whole premise of Java from the beginning has been that everything runs inside of the platform independent JVM. They've made it about as hard as possible to call native code. The security model does not allow you to ship applications with external binary dependencies. Therefore, the nature of a typical Java app is that it only depends on Java code and the JVM.

How well does shipping apps as JAR files work for Java?

In the beginning, everyone was really amped about Java's "write once, run everywhere" promise. It was anticipated that web applications would be Java "Applets" running in a JVM embedded in the browser and that people would write desktop applications in Java and then send the same JAR file to people on any OS. But none of that happened. JavaScript beat Java in the browser and native apps beat Java on the desktop. No one writes browser apps in Java anymore and everyone hates Java desktop apps. The only place Java did well is on servers. There are many reasons, but before pushing for the JAR model for Julia, shouldn't one stop and consider whether it was a successful model in the first place?

Would something like JAR work well for Julia?

Unlike Java, Julia very much embraces native libraries. It's pretty much unavoidable in the technical computing space. You need BLAS, LAPACK, GMP, MPFR, dSFMT, FFTW, and tons more. Sure, we ship with some of those, but we'd like to reduce the ones we ship and provide them as JLLs as much as possible. There are 253 JLL packages already with more being added all the time. If you want a GUI and you use Gtk, that's already 73 JLLs. Your position seems to be that the pre-installed Julia will handle the JLLs. But then why not just let it handle all the dependencies, not just the binary ones? After all, there's no such thing as "a bit self-contained", something either is or isn't self-contained. If I send someone an app that needs to download and install 73 JLLs to work, why would they care if it also downloads and installs some pure Julia packages? On the other hand, if I'm going to send them something that's self-contained with all the Julia packages and binaries it needs, why not also include the Julia binary itself? The middle ground of shipping a bunch of Julia source files without Julia and without the necessary binary dependencies just doesn't make sense to me. Unlike Java, there are hardly any non-trivial Julia applications, especially ones with GUIs, that don't have additional binary dependencies.

In summary:
|
I feel like we still don't have the same perception of what I am suggesting. So, again, to clarify it: For me, a bundle will be created with the following capabilities: You select an entry script which you wanna bind to execution on double click and all platforms you want the bundle to support. Then, a file is created which contains the necessary source files and the necessary artifacts for those platforms. Now you can pass that file to any previously selected platform. There, when calling julia.exe -jlb myfile.jlb the julia exe can fetch all dependencies from inside the bundle and starts ordinary work within that closed system. Thus, it does JIT compilation and linking to the provided binaries etc. The other format is a true system executable. That will be created by PackageCompiler and is fully compiled and self-contained and won't depend on julia.exe to exist. In particular, that has literally no startup time. While I agree that Java often is a pain, still, neither of your two options gives the clicky experience, i.e. having a default for Julia where files are interpreted/executed on double clicking them without sacrificing any capabilities. Recall the original post: I suggested 2 formats which should be runnable by double clicking them while still in fact just I'm simply missing the capability to see packages as modules which I can move around freely (and which are relocatable between platforms). There's no recommended (and explicitly supported) way to do manual module management without requiring familiarity with the REPL or Pkg. I have some history in Minecraft Mod development and while it was DEFINITELY a pain to write mods building upon each other, once they worked it was a very smooth experience. Just drop all mods you want into a common directory and fire up minecraft (a .jar file which relied upon a library which handled binary dependencies). Put in your core mod and your own extension modules. It will just work. 
Instead of saying "java failed on providing convenient binary interop and JAR was made for truly platform independent applications" I'd rather call JAR a very good idea which was flawed because they failed to provide good binary interop (due to security concerns). Julia doesn't have those concerns and has very good binary interop. Now, imagine if we also had a working JAR model. What that means I have tried to outline several times now. So if no one chimes in to rephrase what I'm trying to explain, I probably have to come up with a more concrete implementation. Hopefully a working one. |
What @StefanKarpinski is saying is that the only reason jars make sense as a format is that java doesn't handle binary interop. To make a package that interops with binary code (which would be basically every package), you would need to bundle a version of the binary package for every type of system you want it to be able to run on. Such a bundle would both be massive, and not run-anywhere (just run anywhere for which you have included binary dependencies). A bundle is great, but if even a simple graphics app is 1GB, that starts becoming pretty unappealing. |
What benefit does the former (invoking Julia on a
There is: just clone repos or unpack tarballs of packages in a directory that's in LOAD_PATH. Yes, the package manager will do this for you, but it's trivial to do manually as well. You seem to be doing a bait-and-switch about what you want to make possible: is it running applications or passing packages around? |
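The manual route described above can be sketched roughly as follows. The package name `HelloBundle` and the temporary directory are made up for illustration; the sketch assumes a package without dependencies of its own, laid out as `<dir>/HelloBundle/src/HelloBundle.jl`.

```julia
# Hypothetical package, created in a temporary directory to stand in for
# "a cloned repo or an unpacked tarball".
depot = mktempdir()
pkgsrc = joinpath(depot, "HelloBundle", "src")
mkpath(pkgsrc)
write(joinpath(pkgsrc, "HelloBundle.jl"), """
module HelloBundle
greet() = "hello from a manually placed package"
end
""")

# Adding the parent directory to LOAD_PATH makes the package folders in it loadable.
push!(LOAD_PATH, depot)

using HelloBundle
println(HelloBundle.greet())
```

Packages that have dependencies or artifacts of their own would need those to be available too, which is the part the rest of this discussion is about.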
If it feels like doing a bait-and-switch, I didn't mean to. I want solutions for both scenarios. Like I have a runnable which will do some job where I can simply extend the functionality by placing a given package next to it. Regarding the advantages of running julia over running an executable, in the former case you have the full power of Julia JIT compilation, for example to allow the users to supply customized user functions. E.g. I have a game with some entities running around. Now I make a source bundle of that. The user now may just plug new user created content as packages next to it but can also utilize coding as part of solving some puzzles in the game. |
Suppose Julia could access files in an archive just as if they were in the LOAD_PATH (like Java can load classes from JARs). And it had an option that said: "use this archive as project, instantiate it, run some standardized script file in it." To what extent would that solve your problem? Oh, assuming we agree on an extension for those archives and associate it with Julia, of course. The first start might be slow due to the instantiate taking place, but it would fulfill the clicky experience thing. |
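A rough sketch of that idea, not an existing Julia feature: the helper names `make_bundle` and `run_bundle` and the entry script name `main.jl` are hypothetical, and the sketch assumes Julia 1.6 or later, where `Tar` is a standard library. Instead of reading from the archive in place, it simply unpacks it to a temporary directory first.

```julia
using Tar   # standard library in Julia 1.6 and later (assumed Julia version)
using Pkg

# Hypothetical helper: pack a project directory into a single archive.
make_bundle(project_dir::AbstractString, archive::AbstractString) =
    Tar.create(project_dir, archive)

# Hypothetical helper: unpack the archive, use it as the active project,
# optionally instantiate it, then run the agreed-upon entry script.
function run_bundle(archive::AbstractString; entry::AbstractString="main.jl",
                    instantiate::Bool=true)
    dir = Tar.extract(archive)        # extracts into a fresh temporary directory
    Pkg.activate(dir)                 # note: changes the globally active project
    instantiate && Pkg.instantiate()  # may need network access for missing dependencies
    return include(joinpath(dir, entry))
end
```

With a file association in place, double-clicking an archive would amount to something like `run_bundle("app.tar")`; as noted above, the first start could be slow because of the instantiate step.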
We seem to be getting at some kind of disconnect here. There's nothing about shipping an executable app that bundles Julia with it that prevents JIT compilation or limits the power of Julia in any way. It makes no difference if the Julia executable is inside or outside of the application. An application can choose not to allow JIT, but that's totally unrelated.
If you put a package in the LOAD_PATH and load it, that works. We could allow loading code from tarballs of Julia source except for packages that have |
This sounds like a pretty good idea especially for systems which struggle with a lot of files. Currently I've got 94133 files in .julia/packages of which 20516 are .jl files. But that would be only 1109 files if we had an archive per package. It would require packages to access their non-jl "resource" files through an abstraction but that might be a good thing anyway. Recently I was working on an old HPC system where the home directory was mounted on an old SGI DMF. I had to link .julia to another mount point because the home dir was slooow. Reducing the number of files may have helped. |
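Counts like these can be reproduced with a few lines of Julia. This is only a sketch: the helper `count_files` is made up here, and the path assumes the default depot location under the home directory.

```julia
# Count all files and all `.jl` files below a directory tree.
function count_files(root::AbstractString)
    total = 0
    jl = 0
    for (_, _, files) in walkdir(root)
        total += length(files)
        jl += count(f -> endswith(f, ".jl"), files)
    end
    return (total = total, jl = jl)
end

# Assumed default depot location; adjust if JULIA_DEPOT_PATH points elsewhere.
pkgdir = joinpath(homedir(), ".julia", "packages")
if isdir(pkgdir)
    println(count_files(pkgdir))
    println("package directories: ", length(readdir(pkgdir)))
end
```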
Windows also has issues with lots of files. It's been a long-time goal, but we only recently have built the artifact system that makes immutable package installation feasible. With artifacts, there's no reason we can't just load packages directly from tarballs. |
Except that it'd force all such executable apps to bundle Julia. Platform-dependently.
@martinholters that would perfectly solve it.
Instantiation should have a flag to take the archive into account exclusively.
This would definitely help writing relocatable packages by default.
For loading files from there I agree. But for the default entry point runnable case it would definitely be convenient, thanks to the existing mechanisms on Windows (default file handlers etc.). Though those should also be loadable as simple archives. |
"Taking end user deployability seriously" would've been a proper name for it as well 😛.
I'd like to propose adding two more file extensions to the repertoire of Julia.
Current situation: Extension .JL
That's the extension for the source files of Julia. As it currently is the only file extension, all julia files have it, and the extension is probably bound to a custom editor preferred by the user.
Case 1: Deploy julia scripts to "clicky" end users (e.g. Windows)
Let them install Julia and send them the script file. Then, just open the REPL & call include or instruct them to pass the file as the first command line argument.
Proposal 1: New Extension .JLR (=julia runnable)
This is just an ordinary julia source file which is bound to be run by the julia executable.
That way we don't need to mess with the binding of .jl files which most people probably bound to an editor of their choice. We'd also be able to ship the "run with Julia" by default with all kinds of installers. As such those files would naturally mark script-like entry points into running julia programs.
Sidenote: This file extension was used by the Juliar language until late 2019 for its source files. Since that language got renamed to Juka, its source files were renamed to .JUK, thus freeing the extension again.
Case 2: Provide a more standalone feeling similar to that you get from Java
To run a Java program you need to install the JVM and from then on you can run arbitrary .JAR files which mostly are packed and precompiled code repositories together with their dependencies.
Proposal 2: New Extension .JLB (=julia binary)
While technically not a binary it would be very similar to the .JAR format of Java as it'd also hold the necessary data in a packed format and be bound to be run with the runtime executable. For now it would just be an archive consisting of the corresponding Julia packages (and maybe a runnable).
The actual binary format can change at any time since its sole use is to work out of the box on any platform which has a Julia installation (call it the JuliaVM). No internet required.
It's some sort of the batteries included approach which only relies on a preexisting runtime.
That'd increase the spread of Julia a lot, for much the same reason that it works for JAR files. Most users of Java programs aren't programmers. Nevertheless, running a Java program is dead simple (once the VM installer was run). With that approach we have a very similar dead simple solution which is straightforward to install.
So while we keep reaching for the stars, i.e. true AOT statically compiled shared libraries/executable binaries, this approach would be a good intermediate solution for creating conveniently shareable programs in Julia.