Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support fetching dependencies over git+http(s) #17277

Merged
merged 1 commit into from
Oct 1, 2023

Conversation

ianprime0509
Copy link
Contributor

Closes #14298

This commit adds support for fetching dependencies over git+http(s) using a minimal implementation of the Git protocols and formats relevant to fetching repository data.

Git URLs can be specified in build.zig.zon as follows:

.xml = .{
    .url = "git+https://github.com/ianprime0509/zig-xml#7380d59d50f1cd8460fd748b5f6f179306679e2f",
    .hash = "122085c1e4045fa9cb69632ff771c56acdb6760f34ca5177e80f70b0b92cd80da3e9",
},

The fragment part of the URL may specify a commit ID (SHA1 hash), branch name, or tag. It is an error to omit the fragment: if this happens, the compiler will prompt the user to add it, using the commit ID of the HEAD commit of the repository (that is, the latest commit of the default branch):

Fetch Packages... xml... /var/home/ian/src/zig-gobject/build.zig.zon:6:20: error: url field is missing an explicit ref
            .url = "git+https://github.com/ianprime0509/zig-xml",
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: try .url = "git+https://github.com/ianprime0509/zig-xml#dfdc044f3271641c7d428dc8ec8cd46423d8b8b6",

This implementation currently supports only version 2 of Git's wire protocol (documented in
protocol-v2), which was first introduced in Git 2.19 (2018) and made the default in 2.26 (2020).

The wire protocol behaves similarly when used over other transports, such as SSH and the "Git protocol" (git:// URLs), so it should be reasonably straightforward to support fetching dependencies from such URLs if the necessary transports are implemented (e.g. #14295).


The linked issue suggests implementing this first as a third-party fetch plugin, but that capability does not yet exist. If that part is mandatory, it should be straightforward to adapt this change to that framework once it's available.

Some projects where this has been tested:

Since all these projects use GitHub dependencies, I also tried out fetching random repositories from a few other Git hosts, including GitLab, SourceHut, and Gitea, to identify any protocol differences between them (fortunately, the only one found was in the initial capabilities response, and is documented in the comments of getCapabilities).

Other projects which would be useful to test with this (large numbers of dependencies, large dependency sizes, uncommon Git hosts, etc.) would be welcome.

@andrewrk
Copy link
Member

Wow, nice work! Zig can definitely carry a few thousand lines of code if that's all it takes to support downloading from the git protocol! Looking forward to reviewing this soon...

@andrewrk
Copy link
Member

I know you just rebased but I merged #14603 before this one since it was open for quite some time. Would you mind rebasing once more?

Closes ziglang#14298

This commit adds support for fetching dependencies over git+http(s)
using a minimal implementation of the Git protocols and formats relevant
to fetching repository data.

Git URLs can be specified in `build.zig.zon` as follows:

```zig
.xml = .{
    .url = "git+https://github.com/ianprime0509/zig-xml#7380d59d50f1cd8460fd748b5f6f179306679e2f",
    .hash = "122085c1e4045fa9cb69632ff771c56acdb6760f34ca5177e80f70b0b92cd80da3e9",
},
```

The fragment part of the URL may specify a commit ID (SHA1 hash), branch
name, or tag. It is an error to omit the fragment: if this happens, the
compiler will prompt the user to add it, using the commit ID of the HEAD
commit of the repository (that is, the latest commit of the default
branch):

```
Fetch Packages... xml... /var/home/ian/src/zig-gobject/build.zig.zon:6:20: error: url field is missing an explicit ref
            .url = "git+https://github.com/ianprime0509/zig-xml",
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: try .url = "git+https://github.com/ianprime0509/zig-xml#dfdc044f3271641c7d428dc8ec8cd46423d8b8b6",
```

This implementation currently supports only version 2 of Git's wire
protocol (documented in
[protocol-v2](https://git-scm.com/docs/protocol-v2)), which was first
introduced in Git 2.19 (2018) and made the default in 2.26 (2020).

The wire protocol behaves similarly when used over other transports,
such as SSH and the "Git protocol" (git:// URLs), so it should be
reasonably straightforward to support fetching dependencies from such
URLs if the necessary transports are implemented (e.g. ziglang#14295).
@ianprime0509
Copy link
Contributor Author

No worries, I'm definitely happy to see #14603 land 😄 I just rebased once more on top of that, fitting the new Git dependency scheme into the framework that change created.

@andrewrk
Copy link
Member

I'm playing with this now - trying it out on git+https://git.handmade.network/hmn/orca.git#90dd12a69fe8cfeb0834971c3e0c0131f880c295 but I ran into this problem:

error: SymlinkNotSupported
/home/andy/Downloads/zig/src/git.zig:101:29: 0x650e759 in checkoutTree (zig)
                .symlink => return error.SymlinkNotSupported,
                            ^

Couple notes here -

  1. it would be good to handle errors here:
/home/andy/Downloads/zig/src/Package.zig:739:38: 0x65294cf in unpack (zig)
                        .git_pack => try unpackGitPack(allocator, &prog_reader, git.parseOid(rr.path) catch unreachable, tmp_directory.handle),
                                     ^

However, I see that unpackTarball has the same lack of error handling, so that could be done as a follow-up enhancement.

  1. let's chat about symlinks (related std.tar: handle tar files with symbolic links #16678)

Looking at that issue, I think I spelled it out pretty reasonably in #16678 (comment). So, once again I think that should not block this PR. I should go implement symlink support in package management, and add it to tar file extraction, and then it should be fairly obvious what to do with this git.zig code.

Still playing with it / reviewing the code :-)

Copy link
Member

@andrewrk andrewrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! I really like the way you implemented only the bits of the protocol needed to fetch files for the selected commit, and you handled the case of the user not making a stable URL. This is ready to land.

@@ -548,7 +555,7 @@ const FetchLocation = union(enum) {
pub fn deinit(f: *FetchLocation, gpa: Allocator) void {
switch (f.*) {
inline .file, .directory => |path| gpa.free(path),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not your code, I know, but this inline is not required and should be removed.

@@ -617,7 +690,7 @@ const ReadableResource = struct {
pkg_prog_node: *std.Progress.Node,
) !PackageLocation {
switch (rr.resource) {
inline .file, .http_request => |*r| {
inline .file, .http_request, .git_fetch_stream => |*r| {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a bit concerned with how much code is inside this inline switch case, causing bloat, and that sent me on a wild yak chase, ultimately resulting in #17344 which got blocked by #17343.

Conclusion? idk. This can be merged, but I want to spend some time de-bloatifying things later.

@andrewrk andrewrk merged commit 9a001e1 into ziglang:master Oct 1, 2023
@andrewrk
Copy link
Member

andrewrk commented Oct 3, 2023

playing with this a little, and I noticed that the current working directory changed:

andy@ark ~/D/orca-ui-hello-zig (main)> zig-dev build -Dtarget=aarch64-macos
Fetch PFetch Packages... orca [8MiB] Checkout... /home/andy/Downloads/orca-ui-hello-zig/build.zig.zon:7:20: error: dependency is missing hash field
            .url = "git+https://git.handmade.network/hmn/orca.git#6e4943c2fa7bebcd6682a7b589d680d08634db77",
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
note: expected .hash = "122055608ef2919e346ca49d43daf962a552ab49ee8a66ab4e07bed6b77946a26600",
andy@ark ~ [1]> 

How could this happen? I don't see any calls to std.os.chdir or std.process.changeCurDir

@ianprime0509
Copy link
Contributor Author

That's quite odd, I don't recall seeing anything like that happening when I was testing. It's particularly strange that it's changing the working directory of your shell, since even chdir (if used) should be limited in scope to the current process (zig) and shouldn't affect the parent process. I'll try this again from my end when I get back from work today and let you know if I find anything out.

@andrewrk
Copy link
Member

andrewrk commented Oct 3, 2023

Could very well be something unrelated to your changes too, I haven't figured it out yet.

@ianprime0509
Copy link
Contributor Author

I tried again with a build of 0.12.0-dev.758+4df7f7c86 on the latest version of this change: kristoff-it/orca-ui-hello-zig#1 (commit 4dcab4fc731ddd30e3f8933547d422b234293722) and unfortunately I wasn't able to reproduce anything similar to what you're experiencing. I tried both the scenario where the package was not already present in the global cache and where it is, just in case that made a difference, with no change.

I also tried using strace to see if there were any suspicious-looking unlink or rename syscalls happening, in case something weird might be happening to the containing directory which could confuse the shell, but I didn't notice anything out of the ordinary at a glance. No calls to chdir reported in the trace either.

For what it's worth, I tested this on Linux, though it looks like that's what you're using too. Does your zig-dev alias/symlink/binary apply any other special configuration (e.g. environment variables) that I should try as well to see if I can get closer to reproducing this?

@andrewrk
Copy link
Member

andrewrk commented Oct 3, 2023

Thanks for taking a look! It's gotta be something weird with my environment. Maybe a bug in my terminal?? For me zig-dev is just a symlink to ~/dev/zig/build-release/stage3/bin/zig

@ianprime0509
Copy link
Contributor Author

Cool, that's the same setup I have with my zig command as well, so at least that difference is ruled out. I did try with both Bash and ZSH as the shell and that didn't make a difference, and for completeness, I'm using GNOME Terminal. The main "weird" thing about my setup is that I'm using Fedora Silverblue, so all my actual development is done inside a container; hopefully that doesn't make any relevant difference to this issue.

@mitchellh
Copy link
Contributor

mitchellh commented Oct 4, 2023

I know Andrew at least sometimes uses my terminal (Ghostty). I don't see how its possible for the directory to change at the terminal emulator layer, but the first step in debugging is denial so maybe? It seems more likely it was somehow the shell... Either way if you find its somehow the terminal I am here paying attention 😄

@melhindi
Copy link

Hey guys, just a quick question: should it also be possible to fetch from private repositories?
At the moment I get the following: error: unable to discover remote git server capabilities: ProtocolError

@ianprime0509
Copy link
Contributor Author

@melhindi no, that isn't currently supported, as the Zig package manager doesn't have a way to obtain the necessary credentials securely (hard-coding them into the URL technically works, but is not at all secure). I haven't seen any proposal for such a design; it would require being able to securely associate HTTP basic auth credentials (username and password) to a package URL. Another option would be to implement #14295, but that is considerably more involved and would also raise its own design questions (e.g. password-protected keys).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ability to fetch dependencies via git+http and git+https protocol
4 participants