git: spawn a separate git process for network operations #5228
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
That seems okay. It's also good to reorganize the current error enum as needed.
I think it's best to add a config knob to turn the shelling-out backend on. If the implementation gets stable enough, we can change the default, and eventually remove the git2 backend. I think the flag can be checked by the CLI layer, but that's not a requirement. Regarding the code layout, maybe it's better to add a new module for the shelling-out implementation? The current git.rs isn't small, and there wouldn't be much code to share. Thanks!
Hi! Thanks for the feedback! In the meantime, I was trying to make CI pass and did away with the feature flag (which was always my intention anyway, but I wanted to receive feedback first).
EDIT: I misread a comment and thought it was saying no config knob. Sorry about that.
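The config knob discussed above could be wired up roughly as follows. This is a minimal sketch, not the PR's code: `GitBackend` and `select_backend` are hypothetical names, and the `Option<bool>` stands in for whatever the settings layer returns for the `git.subprocess` key.

```rust
/// Which implementation performs network operations (fetch/push).
/// Hypothetical type, introduced only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum GitBackend {
    /// The existing in-process git2 (libgit2) implementation.
    Git2,
    /// Shell out to the `git` executable.
    Subprocess,
}

/// Picks a backend from the value of the `git.subprocess` setting.
/// `None` means the key is unset, in which case the current default
/// (git2) is kept, so the new backend stays opt-in.
fn select_backend(subprocess_setting: Option<bool>) -> GitBackend {
    match subprocess_setting {
        Some(true) => GitBackend::Subprocess,
        Some(false) | None => GitBackend::Git2,
    }
}
```

Changing the default later would then only mean flipping the `None` arm, which matches the staged rollout suggested in the comment.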
Can you also squash it down to one commit and change the commit message/PR title to
Getting rid of libgit2 is likely inevitable at this rate. It is only used for push/pull and is a large 3rd party dependency that we have probably outgrown by now. I'm almost certain the number one most-repeated bug report is "push does not work", with many users compiling jj themselves to fix it. In reality I think there will probably remain a long tail of issues, things that won't work right, because the existing Git code, tools, and installers have a lot of quirks figured out for many various OS/workflow combinations, e.g. something like Git for Windows Credential Manager, or auto-ssh key unlock via secure enclaves, etc. In the future, if libgit2 supported this stuff really well, bringing back the code probably wouldn't be too bad. Or maybe Gitoxide will get good support, which would be even better. But in the meantime, assuming this improves the user experience now, and assuming we turn it on by default at some point, it's probably no longer worth keeping git2 around.
Thanks for taking up this task! I’ll try to do a more thorough review later but for now I wanted to mention that this should use the fancy Git (Also +1 to getting rid of
Exciting work!
I looked into this, but am unsure this completely mimics the behaviour as it stands right now. The patch uses
I agree. In fact, there is a current clippy warning about a function with too many arguments that is hard to go around without it. PS: I intend to work on the remaining issues on top of
The version with the argument lets you specify the exact expected commit for each remote ref, so you should be able to implement the current logic on top of it. I think
daa498e to 5696825
Does this share functionality with #4759? At a surface level, it seems similar.
I don't think it does. This PR is about using
Re: The particular test case that fails covers pushing a branch deletion when the branch was already deleted on the remote (and as such, is not at the expected commit).
Would be great to get perspective on what to do here! cc @emilazy
9a733d2 to bb5e58c
(Just dropping random comments. I haven't reviewed the code thoroughly.)
The issue with clippy is that `jj_lib::git::fetch` takes too many arguments. I think there are three ways to fix this:
1. Silence the warning
2. Add a `GitFetchArgs` struct that has the logical things we want to find and fetch (the remote and the branch names). This would get it back under the limit
3. Do this on top of "git: update `jj git clone|fetch` to use new GitFetch api directly." #4960
I think 1 is good assuming it's temporary. Thanks.
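For reference, the `GitFetchArgs` option from the list above might look something like the sketch below. Only the struct name and the idea of grouping the remote and the branch names come from the comment; the field names and the `requested_refs` helper are hypothetical and not taken from the PR.

```rust
/// Groups the "what to fetch" arguments so that a fetch function can take
/// one value instead of several loose parameters.
/// Field names are illustrative assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
struct GitFetchArgs<'a> {
    /// Name of the remote to fetch from, e.g. "origin".
    remote_name: &'a str,
    /// Branch names to look for on that remote.
    branch_names: &'a [&'a str],
}

/// Hypothetical consumer of the struct: lists the fully qualified refs on
/// the remote that the fetch would ask for.
fn requested_refs(args: &GitFetchArgs<'_>) -> Vec<String> {
    args.branch_names
        .iter()
        .map(|branch| format!("refs/heads/{branch}"))
        .collect()
}
```

A caller would build the struct once and pass `&args` around, which keeps the argument count of the fetch entry point under clippy's limit without silencing the lint.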
9fbd714 to d455d71
I'm not done reviewing, but this is a pretty large PR so I'll have to do it in increments
9a0ab4d to 271c05d
for full_refspec in refspecs {
    let remote_ref = full_refspec.split(":").last();
    let expected_remote_location = remote_ref.and_then(|remote_ref| {
        qualified_remote_refs_expected_locations
            .get(remote_ref)
            .and_then(|x| x.map(|y| y.to_string()))
    });
    git_ctx.spawn_push(
        remote_name,
        remote_ref,
        full_refspec,
        expected_remote_location.as_deref(),
        &mut failed_ref_matches,
    )?;
    if let Some(remote_ref) = remote_ref {
        remaining_remote_refs.remove(remote_ref);
    }
}
Same as for `git fetch` above: This seems to run one `git push` per refspec. We should be able to call it just once for all updates, right?
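A single-invocation push could be assembled along these lines. This is a sketch under assumptions, not the PR's implementation: `RefUpdate` and `push_args` are hypothetical names (the field names mirror the variables in the loop above), and using `--force-with-lease=<refname>:<expect>` to carry the expected remote commit is my assumption about how the per-ref expectations could be expressed.

```rust
/// One ref update: the refspec to push, the remote ref it targets, and the
/// commit the remote ref is expected to point at right now (if any).
struct RefUpdate<'a> {
    full_refspec: &'a str,
    remote_ref: &'a str,
    expected_remote_location: Option<&'a str>,
}

/// Builds the argument list for ONE `git push` covering all updates.
fn push_args(remote_name: &str, updates: &[RefUpdate<'_>]) -> Vec<String> {
    let mut args = vec!["push".to_string(), "--porcelain".to_string()];
    for update in updates {
        // An empty <expect> tells git the ref must not exist on the remote yet.
        let expect = update.expected_remote_location.unwrap_or("");
        args.push(format!(
            "--force-with-lease={}:{}",
            update.remote_ref, expect
        ));
    }
    args.push(remote_name.to_string());
    // All refspecs go into the same invocation, so only one connection
    // (and one SSH authentication) is needed.
    args.extend(updates.iter().map(|u| u.full_refspec.to_string()));
    args
}
```

The resulting vector can be handed to `std::process::Command::new("git").args(...)`. Collecting the per-ref failures then has to happen by inspecting the output of that one call rather than the exit status of many.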
With `git push`, calling it once per ref makes it much easier to collect the failing refs into the vec.
I think it’ll result in one SSH auth per ref in many common configurations, which doesn’t seem shippable. I think we need to find a way to do the fetches and pushes in one go.
I agree with the sentiment, but my experience was that it is fundamentally hard to match the existing behaviour when doing it in one go.
However, now that the codebase is in a better state, maybe it's time to look at it again (I tried with fetch and it wasn't great).
One possible strategy is to reach for `--atomic`, parse the error to figure out which branches are failing, and then retry without those. But I'll look into it.
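The "parse the error to figure out which branches are failing" step could be sketched as below, assuming the push is run with `--porcelain`, whose ref lines have the form `<flag>\t<from>:<to>\t<summary> (<reason>)` with `!` as the flag for a rejected ref. `RejectedRef` and `rejected_refs` are hypothetical names, and this is not the PR's parser.

```rust
/// A ref that `git push --porcelain` reported as rejected.
#[derive(Debug, PartialEq, Eq)]
struct RejectedRef {
    /// Destination ref on the remote, e.g. "refs/heads/main".
    remote_ref: String,
    /// Raw summary text, e.g. "[rejected] (stale info)".
    summary: String,
}

/// Extracts the rejected refs from porcelain push output.
/// Lines that are not tab-separated ref lines ("To <url>", "Done") are skipped.
fn rejected_refs(porcelain_stdout: &str) -> Vec<RejectedRef> {
    porcelain_stdout
        .lines()
        .filter_map(|line| {
            let mut fields = line.splitn(3, '\t');
            let flag = fields.next()?;
            let from_to = fields.next()?;
            let summary = fields.next().unwrap_or("");
            if flag.trim() != "!" {
                return None;
            }
            // For deletions <from> is empty, so only <to> is kept.
            let (_from, to) = from_to.split_once(':')?;
            Some(RejectedRef {
                remote_ref: to.to_string(),
                summary: summary.to_string(),
            })
        })
        .collect()
}
```

With `--atomic`, refs that were fine on their own may also show up as rejected, so a retry strategy would likely need to look at the reason in `summary` to tell the real failures apart; that part is left out here.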
I personally think slightly different semantics are better than multiple SSH connections, as long as the semantics don’t seem too unreasonable.
for refspec in refspecs {
    git_ctx.spawn_fetch(remote_name, self.depth, &refspec, &mut prunes)?;
}
This seems to run `git fetch` once per branch. We should be able to fetch all requested branches in a single call, right?
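Fetching all requested branches in one call amounts to passing several refspecs to a single `git fetch`. A minimal sketch, assuming plain branch names and a conventional `refs/remotes/<remote>/<branch>` destination; `fetch_args` is a hypothetical helper, not the PR's `spawn_fetch`:

```rust
/// Builds the argument list for ONE `git fetch` that covers every branch.
/// `depth` mirrors the optional shallow-fetch depth used in the loop above.
fn fetch_args(remote_name: &str, depth: Option<u32>, branches: &[&str]) -> Vec<String> {
    let mut args = vec!["fetch".to_string()];
    if let Some(depth) = depth {
        args.push(format!("--depth={depth}"));
    }
    args.push(remote_name.to_string());
    for branch in branches {
        // The leading `+` allows non-fast-forward updates of the tracking ref.
        args.push(format!(
            "+refs/heads/{branch}:refs/remotes/{remote_name}/{branch}"
        ));
    }
    args
}
```

One wrinkle with this shape: `git fetch` fails as a whole when an explicitly named ref does not exist on the remote, so a branch missing from one remote needs separate handling.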
So, here's what I've tried:
- getting all the refs in one `git fetch`
- adding `--atomic`
They all fail in `test_git_fetch_bookmarks_some_missing`, particularly when we're fetching two branches (`rem1` and `rem2`), each only existing on one of two remotes present (named after the branches). I can't say I understand the particulars of how this works exactly; I would be happy to push a separate commit with this for people to try out.
Log of `cargo test test_git_fetch`:
[...]
failures:
---- test_git_fetch::test_git_fetch_bookmarks_some_missing::spawn_a_git_subprocess_for_remote_calls stdout ----
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Snapshot Summary ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Snapshot: git_fetch_bookmarks_some_missing
Source: cli/tests/test_git_fetch.rs:1064
─────────────────────────────────────────────────────────────────────────────────────────────────────────
Expression: stderr
─────────────────────────────────────────────────────────────────────────────────────────────────────────
-old snapshot
+new results
────────────┬────────────────────────────────────────────────────────────────────────────────────────────
0 │-bookmark: rem1@rem1 [new] tracked
1 │-bookmark: rem2@rem2 [new] tracked
0 │+Warning: No branch matching `rem1` found on any specified/configured remote
1 │+Warning: No branch matching `rem2` found on any specified/configured remote
2 │+Nothing changed.
────────────┴────────────────────────────────────────────────────────────────────────────────────────────
7b73aa3 to b85e3cc
if e.kind() == std::io::ErrorKind::NotFound {
    if self.git_path.is_absolute() {
        GitSubprocessError::GitCommandNotFound(self.git_path.to_path_buf())
    } else {
        GitSubprocessError::GitCommandNotFoundInPath(self.git_path.to_path_buf())
    }
} else {
    GitSubprocessError::Spawn(e)
}
nit: another simpler (but less user-friendly-message) option is:
#[error("Failed to execute git command at path {path}")]
GitSubprocessError::Spawn {
path: PathBuf,
// this will be printed in the next line
#[source] error: io::Error,
}
I think it's important to be as user friendly with this error as possible: this will undoubtedly cause a lot of pain on systems where the git installation is peculiar (e.g. Windows). The error message should be as crisp as possible, and the current error structure enables that. The tradeoff of a slightly more complex error variant seems worth it.
We often add a user-friendly error message as a hint in `command_error.rs`. It might be nicer than splitting the same error into three variants.
I've been trying this out, and by the time we get to handling this error in the CLI, the error is a variant of SubprocessError, which seems weird to parse out.
The main thing I think is important here is being able to tell the user whether PATH resolution failed or a particular path did not work.
Hmm, I don't think it's bad to generate an error hint based on a property of the inner error, but I don't have a strong feeling that we should deduplicate error variants, either.
FWIW, it's better to add the executable path to the `Spawn` error because execution may fail for other path-related reasons (permission, executable format, etc.)
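Putting the two positions in this thread together, one possible shape keeps the separate "not found" variants and also records the path on `Spawn`. The sketch below uses only the standard library, with a hand-written `Display` in place of the `#[error]` attributes; `classify_spawn_error` is a hypothetical helper and the message texts are illustrative.

```rust
use std::fmt;
use std::io;
use std::path::{Path, PathBuf};

#[derive(Debug)]
enum GitSubprocessError {
    /// An absolute path was configured but nothing exists there.
    GitCommandNotFound(PathBuf),
    /// A bare command name could not be resolved through PATH.
    GitCommandNotFoundInPath(PathBuf),
    /// Any other spawn failure (permission, executable format, ...).
    Spawn { path: PathBuf, error: io::Error },
}

impl fmt::Display for GitSubprocessError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::GitCommandNotFound(path) => {
                write!(f, "Could not find git at {}", path.display())
            }
            Self::GitCommandNotFoundInPath(path) => {
                write!(f, "Could not find {} in PATH", path.display())
            }
            Self::Spawn { path, .. } => {
                write!(f, "Failed to execute git command at path {}", path.display())
            }
        }
    }
}

impl std::error::Error for GitSubprocessError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            // The underlying io::Error is exposed as the error source.
            Self::Spawn { error, .. } => Some(error),
            _ => None,
        }
    }
}

/// Maps the io::Error from a failed spawn to the matching variant.
fn classify_spawn_error(git_path: &Path, error: io::Error) -> GitSubprocessError {
    if error.kind() == io::ErrorKind::NotFound {
        if git_path.is_absolute() {
            GitSubprocessError::GitCommandNotFound(git_path.to_path_buf())
        } else {
            GitSubprocessError::GitCommandNotFoundInPath(git_path.to_path_buf())
        }
    } else {
        GitSubprocessError::Spawn { path: git_path.to_path_buf(), error }
    }
}
```

It would typically be used as `command.spawn().map_err(|e| classify_spawn_error(&git_path, e))`, so every variant can name the executable that was tried.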
e0075e2 to fd77709
Reasoning:

`jj` fails to push/fetch over ssh depending on the system. Issue jj-vcs#4979 lists over 20 related issues on this and proposes spawning a `git` subprocess for tasks related to the network (in fact, just push/fetch are enough).

This PR implements this. Users can enable shelling out to git in a config file:

```toml
[git]
subprocess = true
```

Implementation Details:

This PR implements shelling out to `git` via `std::process::Command`. There are 2 sharp edges with the patch:
- it relies on having to parse out git errors to match the error codes (and parsing git2's errors in one particular instance to match the error behaviour). This seems mostly unavoidable
- to ensure matching behaviour with git2, the tests are maintained across the two implementations. This is done using test_case, as with the rest of the codebase

Testing:

Run the rust tests:
```
$ cargo test
```
Build:
```
$ cargo build
```
Clone a private repo:
```
$ path/to/jj git clone --shell <REPO_SSH_URL>
```
Create new commit and push
```
$ echo "TEST" > this_is_a_test_file.txt
$ path/to/jj describe -m 'test commit'
$ path/to/jj git push --shell -b <branch>
```

Issues Closed

With a grain of salt, but most of these problems should be fixed (or at least checked if they are fixed). They are the ones listed in jj-vcs#4979.

SSH:
- jj-vcs#63
- jj-vcs#440
- jj-vcs#1455
- jj-vcs#1507
- jj-vcs#2931
- jj-vcs#2958
- jj-vcs#3322
- jj-vcs#4101
- jj-vcs#4333
- jj-vcs#4386
- jj-vcs#4488
- jj-vcs#4591
- jj-vcs#4802
- jj-vcs#4870
- jj-vcs#4937
- jj-vcs#4978
- jj-vcs#5120
- jj-vcs#5166

Clone/fetch/push/pull:
- jj-vcs#360
- jj-vcs#1278
- jj-vcs#1957
- jj-vcs#2295
- jj-vcs#3851
- jj-vcs#4177
- jj-vcs#4682
- jj-vcs#4719
- jj-vcs#4889
- jj-vcs#5147
- jj-vcs#5238

Notable Holdouts:
- Interactive HTTP authentication (jj-vcs#401, jj-vcs#469)
- libssh2-sys dependency on windows problem (can only be removed if/when we get rid of libgit2): jj-vcs#3984
Reasoning:
`jj` fails to push/fetch over ssh depending on the system. Issue #4979 lists over 20 related issues on this and proposes spawning a `git` subprocess for tasks related to the network (in fact, just push/fetch are enough).
This PR implements this.
Users can enable shelling out to git in a config file:
Implementation Details:
This PR implements shelling out to `git` via `std::process::Command`. There are 2 sharp edges with the patch:
- it relies on having to parse out git errors to match the error codes (and parsing git2's errors in one particular instance to match the error behaviour). This seems mostly unavoidable
- to ensure matching behaviour with git2, the tests are maintained across the two implementations. This is done using test_case, as with the rest of the codebase
Testing:
Run the rust tests:
Build:
Clone a private repo:
Create new commit and push
Issues Closed
With a grain of salt, but most of these problems should be fixed (or at least checked if they are fixed). They are the ones listed in #4979.
SSH:
- `ssh://` remote paths not supported. #2931
- `jj git push` is not working for me on Windows #3322
- `jj` can't set up new git `credential.helper` entries #4101
- `jj git fetch ...` / `jj git clone ...` / etc. with a FIDO2 (resident) key #4591
- `jj git push` to GitHub repository: can't authenticate on macOS #4870
Clone/fetch/push/pull:
- `jj git push` #1957
- `jj git fetch --branch main` should fetch tags #2295
- `SSL error: unknown error` #3851
- `sso` origin fail with `Error: invalid argument: 'port'; class=Invalid (3)` #4177
Notable Holdouts:
Checklist
If applicable:
CHANGELOG.md