Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: add wrapper .exe to work around lack of RPATH #35629

Closed
wants to merge 3 commits into from

Conversation

staticfloat
Copy link
Member

Windows has no concept of RPATH, which makes shipping binaries with
relatively-located dependencies quite challenging. Despite
investigating many potential avenues of RPATH emulation, (including
but not limited to Application Manifests, PE file patching, and bundling
the true Julia executable as a resource inside of a launcher) the most
reliable (And least breaking for external workflows) was found to be to
create a launcher executable that invokes the true Julia executable from
within the libexec directory.

@staticfloat staticfloat requested a review from vtjnash April 28, 2020 19:00
@staticfloat
Copy link
Member Author

staticfloat commented Apr 28, 2020

Note that I'm carrying another commit in here that is part of another PR; that will be rebased out once #35626 is merged, this implementation uses some of that scaffolding.

EDIT It was merged, this no longer carries any other commits.

@staticfloat staticfloat force-pushed the sf/win_exe_wrapper branch 2 times, most recently from 991a386 to e62f006 Compare April 29, 2020 16:42
contrib/windows/exe_path_wrapper.c Outdated Show resolved Hide resolved
contrib/windows/exe_path_wrapper.c Outdated Show resolved Hide resolved
Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the main concern here is that adding anything to PATH will break spawning cmd for some people due to https://software.intel.com/en-us/articles/limitation-to-the-length-of-the-system-path-variable. (And perhaps that it'll take slightly longer to launch Julia and annoy people who are looking in task manager.)

contrib/windows/exe_path_wrapper.c Outdated Show resolved Hide resolved
Copy link
Member

@vtjnash vtjnash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From CI, seems like we also need to reconfigure all the other dependency objects to move to the libexec folder also?

Seems somewhat tedious to have two "julia.exe" named things also (esp. since we expose unsafe_string(Base.JLOptions().julia_bin)). Hm :/

@staticfloat
Copy link
Member Author

From CI, seems like we also need to reconfigure all the other dependency objects to move to the libexec folder also?

No, I was just using a $(PATHSEP) value that was rebased out during me splitting things apart. :)

Seems somewhat tedious to have two "julia.exe" named things also (esp. since we expose unsafe_string(Base.JLOptions().julia_bin)). Hm :/

Yes, it's slightly annoying. Luckily, however, as long as unsafe_string(Base.JLOptions().julia_bin)) is used by subprocesses that have inherited PATH, that will continue to work. It's only if someone, for instance, ships that value to another process over IPC or something, that they would run into trouble.

@staticfloat staticfloat force-pushed the sf/win_exe_wrapper branch 3 times, most recently from 26dc2c9 to d4e371d Compare May 4, 2020 21:35
@staticfloat staticfloat force-pushed the sf/win_exe_wrapper branch from d4e371d to 2c7942e Compare May 14, 2020 21:31
@staticfloat
Copy link
Member Author

@vtjnash is this good to merge?

Comment on lines 50 to 51
if (MAX_PATH_LEN < totalPathLen) {
fprintf(stderr, "ERROR: Cannot append entries to PATH: not enough space in environment block. Reduce size of PATH!");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just concerned that it may not be unusual for some user paths to be already near this limit due to the system configuration (that the user might not be able to alter)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any stats on this so you may be right, but I don't know what we can do about it; I haven't been able to come up with any valid alternatives.

@davidanthoff
Copy link
Contributor

Is the motivation for this that we want to add some directory that contains a julia.exe to the PATH, so that users can start julia from the command line, without putting all the DLLs we ship onto the PATH as well?

Note that #36153 would remove the need to put anything on the PATH: we declare an execution alias in the MSIX file, and with that julia becomes available automatically when the app is installed. The way this is implemented in MSIX is that their app installer adds one folder to PATH that is shared for all MSIX programs, and then whenever you install an MSIX that has an execution alias, it will add a wrapper or symbolic link (not sure) to that folder that points to the exe in your private install folder.

@staticfloat
Copy link
Member Author

Is the motivation for this that we want to add some directory that contains a julia.exe to the PATH, so that users can start julia from the command line, without putting all the DLLs we ship onto the PATH as well?

Yes, specifically, the JLL stdlib effort. See #33973 and #35193

Using an MSIX won't solve this for all users; we need a solution that doesn't depend on how you install Julia and that works everywhere.

@staticfloat
Copy link
Member Author

I've altered this PR to not append to PATH, but to replace it and store the true path in a separate variable (JULIA_ORIGINAL_PATH) that then gets restored at the earliest possible time. This should dodge all concerns about PATH length limits and should make this even more transparent.

@davidanthoff
Copy link
Contributor

If I understand this correctly, then this whole PATH modifying part is only used so that the true julia.exe can find libjulia.dll?

If that is so, then I have to admit this seems unnecessarily complicated :) Can't you just put julia.exe and libjulia.dll into the same folder (not bin, some other folder name and that folder does not get added to PATH), and then add a julia.cmd to the bin folder that starts the julia.exe? And if folks want to have julia on the PATH they would add the bin folder with the julia.cmd to their PATH? That seems to be the pattern that most programs follow on Windows, including from MS.

Or is this code also adding other folders to PATH for the time when the true julia.exe starts?

@staticfloat
Copy link
Member Author

If I understand this correctly, then this whole PATH modifying part is only used so that the true julia.exe can find libjulia.dll?

Not just libjulia.dll, but libLLVM.dll, libgcc_s_seh-1.dll, etc... all of which will be in separate folders once they are provided via artifacts. This is a part of the work of splitting these things up so that we are no longer just dumping DLL files into a single large bin directory, but instead having each installed in its own artifact and allowing Julia to do the work of locating and loading each separately. Unfortunately, for these special cases, the ones that julia.exe needs before it can run a single instruction, we need to tell the OS how to find these libraries.

I investigated using a wrapper .cmd file, but there are too many workflows out there already that assume that invoking julia.exe is the right way to get a Julia process started. I also can't just put a julia.cmd file into the bin folder and hope that a user that runs julia will pick up the .cmd instead of the .exe because windows (by default) searches for .exe before .cmd. And finally, I also want the outward-facing API of invoking Julia to be as consistent across platforms as possible.

@tkf
Copy link
Member

tkf commented Jul 1, 2020

What happens when you use libjulia directly in Windows? Do people have to replicate the logic in exe_path_wrapper.c to load libjulia in Windows?

In particular, I wonder if this hack in PyJulia would stop working:

https://github.com/JuliaPy/pyjulia/blob/ca04b008136a592c6234be752c271fc366c37d99/src/julia/libjulia.py#L219-L236

@staticfloat
Copy link
Member Author

What happens when you use libjulia directly in Windows?

Right now, nothing changes since everything is still in bin. In the future, anyone linking against libjulia will need to ensure that they can find all dependent libraries of libjulia.

In a dynamic language like Python, this is not a problem; you just search for and dlopen() the dependencies first, before loading libjulia. Doing that explicitly should obviate the need for the workaround in pyjulia, since that is just using the fact that the current directory is implicitly on the load path.

In a static language, when linking directly against libjulia through something like gcc -ljulia, it gets a little harder. You will need to provide an appropriate PATH when launching your executable, likely through some kind of .cmd wrapper or even an .exe wrapper. Luckily, as the application developer, you generally have control over this.

@tkf
Copy link
Member

tkf commented Jul 1, 2020

Thanks for the clarifications.

you just search for and dlopen() the dependencies first

Are the DLLs in one place? I thought the plan was to move them to individual artifact directories but my worry was that it's hard to find them.

Or will they be listed in Base.DL_LOAD_PATH? I guess I can just manually expand it? It'd be a bit cumbersome but not crazy.

@staticfloat
Copy link
Member Author

Are the DLLs in one place? I thought the plan was to move them to individual artifact directories but my worry was that it's hard to find them.

Eventually; no, they will not be in one place. From an already-running Julia REPL session, you'll be able to get their locations through things like using Libdl; dlpath(dlopen("libLLVM")), or even better, using Base.libLLVM_jll: libLLVM_path; println(libLLVM_path).

You will be able to get it from the build system via things like make print-LIBLLVM_SHLIBDIR and make print-CSL_SHLIBDIR.

@tkf
Copy link
Member

tkf commented Jul 1, 2020

For PyJulia, it'd be nice if I can just call (say) Libdl.expand_dl_load_path() or something to get a list of all load paths I have to add before dlopening libjulia. (For PyJulia, it's OK if this is in-Julia API because I need to launch a julia process once anyway to get the path of the default sysimage.) Is something like this implementable with the direction of this PR?

@davidanthoff
Copy link
Contributor

This PATH hack is only needed for shared libraries that are load time dynamically linked, right? Anything that is run-time dynamically linked you wouldn't need any of this? I would have assumed that anything that is moved into an artifact load story is run-time dynamically linked in any case? Or is the plan to move even things like libjulia and libgcc_s_seh-1.dll into an artifact (I'm assuming that those have to be load time dynamically linked)?

Might using isolated applications and side-by-side assemblies be an alternative way to achieve something similar?

I think I also still don't understand fully what the final directory architecture goal here is, i.e. which files are meant to go into which folder in the final design.

I also can't just put a julia.cmd file into the bin folder and hope that a user that runs julia will pick up the .cmd instead of the .exe because windows (by default) searches for .exe before .cmd

Well, the whole idea is that you put the .cmd file into one folder and the .exe into a different folder (together with the dlls), and you add only the folder that has the .cmd file to the PATH. In that case you don't have any problem that Windows would load the .exe first, right?

@staticfloat staticfloat force-pushed the sf/win_exe_wrapper branch from 23d82c5 to fdbbfcd Compare July 1, 2020 05:17
@staticfloat
Copy link
Member Author

This PATH hack is only needed for shared libraries that are load time dynamically linked, right?

Correct. So that means libLLVM, libgcc, etc... libjulia won't be moved, but everything else will.

Might using isolated applications and side-by-side assemblies be an alternative way to achieve something similar?

No, unfortunately, they will not work. Native Assemblies do not have the ability to store a path to them within themselves, so you still need to add things onto PATH.

I think I also still don't understand fully what the final directory architecture goal here is, i.e. which files are meant to go into which folder in the final design.

The end goal is for the Julia build system to use JLL packages for its binary dependencies. So everything having to do with llvm would live within usr/share/julia/stdlib/vX.Y/artifacts/<llvm_artifact_hash>. You can read more about what the ramifications of that are in these two issues: #33973 and #35193

Well, the whole idea is that you put the .cmd file into one folder and the .exe into a different folder (together with the dlls), and you add only the folder that has the .cmd file to the PATH

The problem with this is the number of tools that are already hardcoded to look for julia in bin/julia.exe; we don't want developers to all have to go and change their code to instead say bin/julia.cmd or put_me_on_path/julia.cmd. It would break a lot of workflows; better to be compatible.

Windows has no concept of `RPATH`, which makes shipping binaries with
relatively-located dependencies quite challenging.  Despite
investigating many potential avenues of `RPATH` emulation, (including
but not limited to Application Manifests, PE file patching, and bundling
the true Julia executable as a resource inside of a launcher) the most
reliable (And least breaking for external workflows) was found to be to
create a launcher executable that invokes the true Julia executable from
within the `libexec` directory.  In order to not run afoul of PATH
length limits, we store the original path in a separate sidechannel
environment variable (`JULIA_ORIGINAL_PATH`) then restore it after
startup is complete.
@staticfloat staticfloat force-pushed the sf/win_exe_wrapper branch from af73805 to a0d2a92 Compare July 2, 2020 00:25
@staticfloat
Copy link
Member Author

This is passing all tests now, so it's ready to merge on my end. The only downside of this now is that we have to alter two tests:

  1. The getpid() test in spawn.jl, since the result of getpid(run("julia")) is now the PID of the wrapper process, not the Julia process itself.

  2. The environment test in InteractiveUtils, since our wrapper always sets JULIA_BINDIR to the proper location (so that it doesn't think it's actually at libexec.

These are acceptable downsides in my eyes; if anyone disagrees speak up now, otherwise I'll merge this sometime this weekend.

@StefanKarpinski
Copy link
Member

  1. The getpid() test in spawn.jl, since the result of getpid(run("julia")) is now the PID of the wrapper process, not the Julia process itself.

I have not been following along here, but is it possible to do the setup and then exec the true Julia executable in the same process instead of starting it as a grandchild? That way it would have the expected relationship (child) and pid?

@vtjnash
Copy link
Member

vtjnash commented Jul 2, 2020

Yeah, I agree those are a problem, and I liked where this was headed now. But I think we can fix those too. We just may need to make this wrapper extremely lightweight(*), use LoadLibrary (instead of spawning a new whole process) and then bootstrap up from there (using the PATH tricks here, but then there's also no length limit).

(*) For example, see https://github.com/vtjnash/libwhich/blob/be900cd7ba35dd8357f08766b4f778d7243b5982/Makefile#L8-L14 and https://github.com/vtjnash/libwhich/blob/87cffe10080c98e7b5786c5166e420bf1ada1d41/libwhich.c#L100

@staticfloat
Copy link
Member Author

I've looked for an exec-like syscall on Windows, but I haven't been able to find one.

use LoadLibrary (instead of spawning a new whole process) and then bootstrap up from there

I'm not sure what you mean by this. Which libraries would I be loading?

@vtjnash
Copy link
Member

vtjnash commented Jul 2, 2020

Which libraries would I be loading?

All of them. The links I sent show how you can make a binary that doesn't link against anything, then you can pull in each library as you need it for the next one. Not quite as good as exec, but maybe close.

@staticfloat
Copy link
Member Author

I think what you're saying is to change the main julia.exe executable to use LoadLibrary() to load in libLLVM and libgcc and whatnot; that way we can run code before needing to load those libraries. Is that correct?

If so, that will break a lot of code because ccall(:symbol, ...) won't work; the default NULL handle to dlsym() won't pick up symbols from libjulia or libLLVM anymore. I looked into this approach a few months ago and the changes needed to not only Base but other packages was extensive.

@vtjnash
Copy link
Member

vtjnash commented Jul 2, 2020

the default NULL handle to dlsym() won't pick up symbols from libjulia or libLLVM anymore

I was pretty sure it couldn't do that before either, since the OS has no concept of a default or NULL handle (we shim it internally to search our list of standard places)

@vtjnash
Copy link
Member

vtjnash commented Jul 2, 2020

that way we can run code before needing to load those libraries

Correct. The default behavior is for the compiler and kernel to cooperate on injecting code at start that calls LoadLibrary. We can intercept that by not populating its list of files, running our code, then calling LoadLibrary on the list just like it did.

@musm
Copy link
Contributor

musm commented Jul 6, 2020

I'm trying to fully understand the folder architecture in this PR:

Does julia-install-path/bin hold the wrapper julia.exe, which points to julia-install-path/libexecdir/julia.exe ?

In other words, we can thus safely add to path julia-install-path/bin


int wmain(int argc, wchar_t *argv[], wchar_t *envp[]) {
// Determine absolute path to true julia.exe sitting in `libexec/`
WCHAR currFileDir[MAX_PATH_LEN];
Copy link
Contributor

@musm musm Jul 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be MAX_PATH and same for the variable below if using PathCombine

A pointer to a buffer that, when this function returns successfully, receives the combined path string. You must set the size of this buffer to MAX_PATH to ensure that it is large enough to hold the returned string.

https://docs.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-pathcombinew

exit(1);
}
PathRemoveFileSpec(currFileDir);
PathCombine(juliaPath, currFileDir, TEXT(JULIA_EXE_PATH));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PathCombine is not recommended from Microsoft:

Note This function, PathCchCombineEx, or PathAllocCombine should be used in place of PathCombine to prevent the possibility of a buffer overrun.

https://docs.microsoft.com/en-us/windows/win32/api/shlwapi/nf-shlwapi-pathcombinew

// within the callee.
LPWSTR pathVal = (LPWSTR) malloc(MAX_PATH_LEN*sizeof(WCHAR));
DWORD dwRet = GetEnvironmentVariable(TEXT("PATH"), pathVal, MAX_PATH_LEN);
if (dwRet > 0 && dwRet < MAX_PATH_LEN) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be !dwRet && dwRet <= MAX_PATH_LEN

However looking at the documentation for GetEnvironmentVariable. we should handle resizing the buffer instead of assuming MAX_PATH_LEN

@GunnarFarneback
Copy link
Contributor

After reading this I'm still unclear on the implications and I certainly can't interpret the PR itself, so I had better check this specifically.

I have an application that effectively uses Julia as a plugin module. On Windows it relies on the fact that having the Julia bin directory in PATH implies that libjulia can be loaded at run-time with

    const char *library_name = "libjulia.dll";
    HMODULE libjulia = LoadLibrary(library_name);

and that's all the loading that is necessary. Will this change in any way with this PR?

@davidanthoff
Copy link
Contributor

I have to say this PATH hack makes me very nervous :) Here are some reasons:

  • Won't this throw julia back into the pre dll safeload days, where if I start julia.exe from a folder that has say a libjulia.dll in it, it will load that and not the version that was shipped with the julia.exe one is starting? PATH I think has the lowest priority in the search order for DLLs, even after the current working folder.
  • I think in general it would be very weird if their is a julia.exe around that one actually can't start, because it needs a wrapper. I don't mind the wrapper that goes into some folder that can be added to PATH so that one can start julia more easily from the command line without adding all the DLLs to the PATH, but I think a wrapper that is necessary to start the main julia binary is just a really weird setup.
  • I'm almost finished with a MSIX installer that would hopefully allow us to get Julia into the Windows Store. I don't really know, but I'd be very worried that we wouldn't get through store approval with non-standard things like this PATH proposal, it just seems so clearly not how MS wants folks to load DLLs.

Wouldn't another option be to just ship julia.exe, libjulia.dll and whatever other DLLs have to be load time dynamically linked, move some other DLLs that don't need to be load time dynamically linked from that folder into artifacts folders (which should work, without much of a problem?) and then if there is still a need for the artifacts system to give access to some of the DLLs that are still located in the same folder as the julia.exe, just special case those situations in the artifacts code?

My sense from reading a fair bit on this is that at the end of the day the platform wants us to put load time dynamically linked DLLs into the same folder as the julia.exe, and trying to get around that will almost certainly always be a hack that can cause trouble.

@staticfloat
Copy link
Member Author

closing in favor of #36588

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants