Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support relative paths in package manager #14603

Merged
merged 5 commits into from
Sep 29, 2023
Merged

Conversation

AdamGoertz
Copy link
Contributor

@AdamGoertz AdamGoertz commented Feb 9, 2023

Closes #14339 and #14825.

This PR adds support in the package manager for fetching dependencies from relative paths. For example, a project with the following dependency structure successfully compiles:

base/
├── build.zig
├── build.zig.zon
├── src
│   └── . . .
dep/
├── build.zig
├── build.zig.zon
├── dep2
│   ├── build.zig
│   └── src
│       └── . . .
└── src
    └── . . .

base/build.zig.zon:

.{
    .name = "base",
    .version = "0.0.0",
    .dependencies = .{
        .dep = .{
            .path = "../dep",
        },
    },
}

dep/build.zig.zon:

.{
    .name = "dep",
    .version = "0.0.0",
    .dependencies = .{
        .dep2 = .{
            .path = "./dep2",
        }
    }
}

Other tests

These projects all build successfully

File dependencies can be provided using the new .path field. These paths can refer to either a directory containing a build.zig or to a file that the package manager is capable on unpacking, such as a .tar.gz.

In the case of a file that must be unpacked, such as a .tar, it is unpacked into the global cache in the same way as if the file was fetched from the internet. However, if the provided path points to a directory, no caching is performed and the build system instead references the files using the provided paths. Relative paths are resolved relative to the build.zig of the package that declared the dependency.

When a dependency references a directory, no hash field is required, and an error will be emitted if the user provides one.

Unlike the original issue which this PR addresses, file:/// URIs are not allowed. For example, the following build.zig.zon produces an error:

.{
    .name = "base",
    .version = "0.0.0",
    .dependencies = .{
        .dep = .{
            .url = "file:///home/adam/dep",
        },
    },
}
Fetch Packages... dep... /home/adam/base/build.zig.zon:6:20: error: 'file' scheme is not allowed for URLs. Use '.path' instead
            .url = "file:///home/adam/dep",
                   ^~~~~~~~~~~~~~~~~~~~~~~

Absolute paths are also disallowed:

.{
    .name = "base",
    .version = "0.0.0",
    .dependencies = .{
        .dep = .{
            .path = "/home/adam/dep",
        },
    },
}
Fetch Packages... dep... /home/adam/base/build.zig.zon:6:21: error: Absolute paths are not allowed. Use a relative path instead
            .path = "/home/adam/dep",
                    ^~~~~~~~~~~~~~~~

Let me know if you have any feedback or suggestions.

@AdamGoertz AdamGoertz changed the title #14339 Support file:/// URIs and relative paths in package manager Support file:/// URIs and relative paths in package manager Feb 9, 2023
@komuw
Copy link

komuw commented Feb 15, 2023

Apologies, I'm not that familiar with Zig(so take the following question with that caveat).

Is this prone to a path traversal attack? If I clone a git repo that has a build.zig.zon with .url = "file:///etc/passwd" and then I run zig build, will it actually read that file?

@praschke
Copy link
Contributor

relative paths are not URIs, i believe it is inappropriate to add this special casing to std.Uri

@leecannon
Copy link
Contributor

leecannon commented Feb 15, 2023

This type of functionality is useful for projects like https://github.com/SerenityOS/serenity.

In that single project tree are ~60+ libraries and ~80+ applications all with dependencies on each other i.e. appA -> libA -> libB.

Needing hashes, references to archive files or needing absolute paths (how can that work on my machine and yours?) flies in the face of developing multiple things in a single source tree.

@praschke
Copy link
Contributor

praschke commented Feb 15, 2023

@leecannon i am not saying relative paths should not be supported, i am saying that std.Uri is not the place to implement that support.

additionally, this PR does nothing about the onerous hash requirement.

@AdamGoertz
Copy link
Contributor Author

AdamGoertz commented Feb 16, 2023

I agree that parsing relative paths inside std.Uri is a bit weird. It does make the subsequent logic a bit easier, since we don't need to handle a special case for relative paths, since absolute paths are already a valid URI scheme.

Regarding the hash requirement, my opinion is that it is important to ensure that the hashes referenced in build.zig.zon stay in-sync with code changes. This problem might be best solved by a build flag, something like -Dauto-update-hashes that automatically edits the hash entries in build.zig.zon with the new hash values for any changed packages.

@AdamGoertz
Copy link
Contributor Author

AdamGoertz commented Feb 17, 2023

Paths are now a possible field for dependencies inside build.zig.zon. So the example above would now look like this:

.{
    .name = "base",
    .version = "0.0.0",
    .dependencies = .{
        .dep = .{
            .path = "/home/adam/dep/",
            // alternatively, .url = "file:///home/adam/dep/",
            .hash = "1220ebfc0074d94ecdab83cef08dae264eeb0c8ae6ee9734fd9bcd3fe83a6236e8a1",
        },
    },
}
.{
    .name = "dep",
    .version = "0.0.0",
    .dependencies = .{
        .dep2 = .{
            .path = "dep2",
            .hash = "1220abdec7efb3042b27db55066eae97088b527be45ab7dda0fe706d6a6f64d3cf60",
        }
    }
}

This removes the special casing inside std.Uri and has the added benefit of allowing relative paths without a leading ..

@AdamGoertz
Copy link
Contributor Author

Apologies, I'm not that familiar with Zig(so take the following question with that caveat).

Is this prone to a path traversal attack? If I clone a git repo that has a build.zig.zon with .url = "file:///etc/passwd" and then I run zig build, will it actually read that file?

This depends on the permissions the zig compiler is running with. For example, if I run zig build as a normal, non-root, user, attempting to access files like /etc/passwd or /etc/shadow fails with error.AccessDenied. If for some reason you were running a build as root, then it may be able to open and read them. Either way, currently the build system doesn't do anything with open files other than compute their hash. It's possible there's a way of extracting some useful information or primitive from that, but I'm not knowledgeable enough about cybersecurity to say. What it does not do is copy the listed files anywhere more easily accessible, such as the global cache.

@TUSF
Copy link
Contributor

TUSF commented Feb 18, 2023

Yeah, you can right now try to access system files like /etc/passwd from a build.zig, and do much worse than whatever having them in a .zon will do. Hence, you shouldn't be running builds as root.

@praschke
Copy link
Contributor

praschke commented Feb 18, 2023

this discussion should take place on #14339.

  • running zig as root is obviously an issue on least privilege grounds, but there are plenty of user-readable files with user secrets, several at well-known paths modulo your username
  • dependencies that are files/directories/archives without any zig or zon in them should be made available to build.zig in a manipulable way, or the package manager isn't actually a replacement for git submodules, see package management: support downloading arbitrary archives ala git submodules #14488
  • build.zig currently can do these things without build.zig.zon, but absolute paths in build.zig.zon might become a way to escape sandboxing if it is implemented, see Run build.zig logic in a WebAssembly sandbox #14286

@jiacai2050
Copy link
Contributor

@andrewrk and @Vexu Would you mind give this a review?

I think local path support is very important for users to try package manager.

@andrewrk
Copy link
Member

andrewrk commented Mar 3, 2023

I will have a look soon.

@buzmeg
Copy link

buzmeg commented Mar 7, 2023

I think local path support is very important for users to try package manager.

If there are issues with recursion and directory trees, it would be good to at least allow file:/// access to archive files (.tar.gz, .tgz, .bgz, etc.) which shouldn't allow people to escape sandboxing.

@buzmeg
Copy link

buzmeg commented Mar 25, 2023

I will have a look soon.

Can we get this pulled into the main branch? Given all the changes in the build system, it really needs to get pulled in so people can start trying it out.

@andrewrk
Copy link
Member

Sorry, not until I look at it. The timing is a bit unfortunate since I was focusing on the build-parallel branch and now I am having a much needed vacation. Please be patient.

@AdamGoertz
Copy link
Contributor Author

Take your time. I don’t want this merged until it’s been reviewed either. We appreciate all the work you’ve been doing, and you deserve a break.

@buzmeg
Copy link

buzmeg commented Mar 26, 2023

Sorry, not until I look at it. The timing is a bit unfortunate since I was focusing on the build-parallel branch and now I am having a much needed vacation. Please be patient.

Understandable.

I tried to pull the fork to try it out, but I clearly just don't understand the build.zig.zon stuff clearly enough to provide credible feedback. :(

@haze
Copy link
Contributor

haze commented May 22, 2023

As a solution to the last bullet point in the OP, does it make sense to have a switch to ignore the hash check when referencing a directory (either absolute or relative) when building? I can see the hash check being important for people that vendor their dependencies, but for someone working in a monorepo style (as noted) it can become a hassle quickly. I can't think of a reason to have the same switch for tarball/remote packages, but I digress.

EDIT: just realized I regurgitated #14339 (comment)

@buzmeg
Copy link

buzmeg commented May 22, 2023

At this point, we need to get something into people's hands so everybody can start iterating on it. It's not going to be perfectly correct, that's fine. But this is currently holding up the possibility of exploring the space as well as people sending in patches to make things work better as well as people writing tutorials about it.

@HTGAzureX1212
Copy link

HTGAzureX1212 commented Jun 16, 2023

Any plans to merge this pull request into master? I've tried to compile Zig with this patch to no avail and quite some trouble as well - it would be nice to have this be merged into master and for an official Zig build to be provided that supports this very-well needed feature.

Thanks in advance (the conflicts should also be resolved, too).

@AdamGoertz
Copy link
Contributor Author

Any plans to merge this pull request into master? I've tried to compile Zig with this patch to no avail and quite some trouble as well - it would be nice to have this be merged into master and for an official Zig build to be provided that supports this very-well needed feature.

Thanks in advance (the conflicts should also be resolved, too).

I should have a chance to resolve the conflicts this coming week (currently traveling). It still needs review before it can be merged, hopefully now that intern pool is landed and SYCL is over, the core team will have some more time.

@HTGAzureX1212
Copy link

@AdamGoertz how;'s everything coming along?

@HTGAzureX1212
Copy link

The CI seemingly is failing on Windows regarding the usage of allocator.dupeZ

@HTGAzureX1212
Copy link

The Windows release workflow failed with a weird AccessDenied error. Perhaps rerun that workflow?

@AdamGoertz AdamGoertz force-pushed the file-uris branch 2 times, most recently from d39cc1c to 7d279c9 Compare July 12, 2023 02:46
@buzmeg
Copy link

buzmeg commented Jul 19, 2023

Did this get integrated? It really needs to be in 0.11 so people can start banging on it ...

@getcisher
Copy link

I'm old rust dev but new here.
I just wonder why we dont use Toml/yaml standard format for fetch depencecies instead of weird *.zon format?
In the future when new feature added and much more complex for zig ecosystem especially when come to depend on contrib deps + config file on building process and be less scare for new face like me.

@AdamGoertz AdamGoertz force-pushed the file-uris branch 3 times, most recently from 1cc26fc to 48e95b9 Compare September 24, 2023 22:38
@jmrico01
Copy link
Contributor

I'm old rust dev but new here. I just wonder why we dont use Toml/yaml standard format for fetch depencecies instead of weird *.zon format?

@getcisher Probably best not to use this PR to discuss higher-level Zig decisions like this. But if you wanna read more, issue #14290 and the linked PR #14265 document some of this decision process.

@AdamGoertz AdamGoertz force-pushed the file-uris branch 3 times, most recently from 551628a to af1e5bf Compare September 26, 2023 03:45
@andrewrk andrewrk mentioned this pull request Sep 26, 2023
src/Package.zig Outdated
Comment on lines 502 to 514
/// The absolute path to a file or directory.
/// This may be a file that requires unpacking (such as a .tar.gz),
/// or the path to the root directory of a package.
file: []const u8,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolute paths are not portable, please avoid absolute paths entirely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolute paths have been squashed 👍

src/Package.zig Outdated
Comment on lines 1129 to 1147
/// Make a OS-specific file system path
/// This performs the inverse operation of normalizePath,
/// converting forward slashes into backslashes on Windows
fn unnormalizePath(arena: Allocator, fs_path: []const u8) ![]const u8 {
const canonical_sep = '/';

const unnormalized = try arena.dupe(u8, fs_path);
if (fs.path.sep == canonical_sep)
return unnormalized;

for (unnormalized) |*byte| {
switch (byte.*) {
canonical_sep => byte.* = fs.path.sep,
else => continue,
}
}
return unnormalized;
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this function should be used. Why is this needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The question here is whether we want relative paths to have a consistent path separator across platforms or not. Do windows users need to write all of their paths as ‘path\to\some\package’ while non-windows users write ‘path/to/some/package’? This will make dependency paths platform-dependent. Unless there’s some other way this is handled already that I’m not aware of.

Copy link
Member

@andrewrk andrewrk Sep 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think forward slashes should always be used in build.zig.zon files. What happens if you don't call this function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. On windows, this function converts forward slashes (from build.zig.zon) to backslashes (for opening the file).

If you don’t call this function, the file will fail to open on windows because the path will be invalid.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how that can be the case, since Windows supports forward slashes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is that the case? I thought I encountered this issue during my original tests on windows, but I can double check and remove this if it’s not needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsurprisingly, you are correct. Must have been something absolute path or URI related that I was remembering. Regardless, unnecessary path modifications have been eliminated.

@andrewrk
Copy link
Member

andrewrk commented Sep 29, 2023

I appreciate the conflict fix but, don't worry, you can just let this sit on green and I'll take care of the merge 👍

(I'm going to rebase it so I have to redo the conflict fix anyway)

@AdamGoertz
Copy link
Contributor Author

Ok, sounds good, I’ll let you handle it. Thanks!

@andrewrk
Copy link
Member

Nice work, thank you for being patient with the long review process on this one. I think we landed somewhere much better than originally planned.

I've done a more thorough review and built the changes locally and ran a few tests, and I think I'm ready to call this good enough to take for a spin in master branch.

However, I want to discuss briefly what's going on with populating the Dependency.hash field based on a hash of the (absolute?) file path. This seems a bit strange to me since there is no hash, and populating the field in this way is mutating something that I think should be immutable (the inputs from the build manifest file). On the other hand, I see that in std.Build.dependency there is this dependency on a "hash" which it is incorrectly calling a "name". I think that a few different contributors have already taken this logic in different directions that don't quite mesh perfectly together, and so we're already looking at some technical debt here.

So, I think I'm fine with merging this, and merging a few more contributions to the package manager code, why not, and then I'll need to do a deep dive and audit all the package management code and likely do some reworking to make this stuff coherent. That's fine, that's my job here anyway. It'll be a bit of a roundabout way to get there, but eventually we'll end up with something that continues to solve everyone's use cases, and also is more maintainable and ready for some of the more advanced toolchain features that are planned.

This commit makes the following changes:
* Disallow file:/// URIs
* Allow only relative paths in the .path field of build.zig.zon
* Remote now-unneeded shlwapi dependency
No need to create paths with windows-specific path separators
In addition to improving readability, this also fixes a subtle bug
where the Progress node could display the wrong total number of packages to fetch.
@TUSF
Copy link
Contributor

TUSF commented Sep 29, 2023

I had been checking up on this PR on and off, and missed when absolute paths were dropped, but I'll point out that if packages can only reference each other by relative paths, then on Windows, packages on different drive letters (C:, D:, etc…) are pretty much inaccessible to each other.

Granted, I think this might be a fairly narrow edge case, but maybe still something to keep in mind, given I'm sure there's probably someone out there that likes keeping all their code organized across 5+ different drives.

@buzmeg
Copy link

buzmeg commented Oct 1, 2023

So, what is the mechanism for fetching from a local git repository if we don't have any absolute paths?

There seems to have been a bifurcation of "path" and "url" and what their purposes are.

I would really like to have some way to access "file://some/local/path/gitarchive.tgz" short of standing up a full blown http server just to get to "http://localhost/some/local/path/gitarchive.tgz". ie. "url" is effectively always an absolute path anyway so it being able to do that with "file:" instead of "http:" really isn't much of a shift.

@andrewrk
Copy link
Member

andrewrk commented Oct 1, 2023

Can you explain more this use case where you want to hard-code paths to tarballs on your own file system? How would you coordinate working on your own project on different computers, or collaborate with someone else?

@buzmeg
Copy link

buzmeg commented Oct 2, 2023

By having a known root? And then either installing stuff into that root via replication or network mount?

It is not at all uncommon to have something like "C:\sdks" on Windows and "/home/sdks" on Linux/OS X and then reference everything off of them for embedded development. And, if you get a particularly annoying vendor, they will build chunks of the library assuming that root exists and reference everything to it.

And, while I said embedded, even projects like Java and Vulkan do things like this and define an SDK root. So, it doesn't seem to be limited to sprawling embedded projects like Zephyr.

I know some of this discussion hovers around security. However, I'd like to assert something to challenge one of the baseline assumptions: making people stand up a web server/repo server to access a filesystem local repository actually decreases security WAY more than giving people a way to access the file system from a build file. Local access means you can only do what the user can do. However, forcing a random end-user to stand up a web server almost always results in something that can turn into a remote exploit. An end user will invariably do the easiest thing--not the safest.

I would like to have actual filesystem access, but, barring that, at the very least the ability to let the package manager pull in a verifiable "archive file" from Git, Mercurial, etc. on the local filesystem is a sufficient workaround (it can be checked, it can be hashed, it can be signed, etc.).

Side note: I also have strong opinions about how forcing people through this kind of hoop entrenches and privileges the "centralized access model" (GitHub, GitLab, etc.) rather than encouraging a "federated model", but that's a complicated discussion that should be had in person over alcoholic beverages.

@ikskuh
Copy link
Contributor

ikskuh commented Oct 2, 2023

Can you explain more this use case where you want to hard-code paths to tarballs on your own file system? How would you coordinate working on your own project on different computers, or collaborate with someone else?

One use case would be rewriting the URLs in a nix derivation so you can smoothly integrate zig packages into the nix ecosystem, where the paths are the same on each machine

@andrewrk
Copy link
Member

andrewrk commented Oct 2, 2023

#17364

@plajjan
Copy link

plajjan commented Dec 13, 2023

@andrewrk FWIW, I am using zig build in a way where I think absolute paths would be beneficial.

The use case is as the low-level build system of the Acton programming language. The Acton compiler generates C code (one day we'll generate zig code instead 😄) which is then compiled with zig to libraries & executable binaries. The zig build is invoked at build time of Acton itself, like when we build stdlib of Acton we produce a .a library and ship in the Acton distribution so it can be quickly linked in with Acton programs when the Acton user compiles their Acton program. However, the Acton distribution also ships with source code for e.g. stdlib which is used for cross-compilation or when using custom compilation options.

Acton is normally installed in /usr/lib/acton, so like the stdlib C code is in /usr/lib/acton/base/out/types/. Let's say the user has an Acton project in ~/work/my-code/something/blargh/foo, so we invoke zig build with working directory set to that directory. Now we have anonymousDependencies which uses a relative path to get up to the root and then go to /usr/lib/acton/base/out/types, i.e. we have something like ../../../../../../../../../../usr/lib/acton/base/out/types. It would just feel more straightforward with an absolute path, I think. Also note how this path, which we call the "system path", is an input argument to the zig build system that is fed in by the Acton compiler.

I recognize this is a very niche use case, so I'm fine with absolute paths not being supported, but at least I'd like to add a datapoint where I think absolute paths do make sense 😄

@andrewrk
Copy link
Member

@plajjan can you open a new issue with this use case explanation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support package dependencies via paths relative to the build root