-
Notifications
You must be signed in to change notification settings - Fork 97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Sparse index: integrate with git stash
#428
Conversation
I have reproduced this on my machine. Very strange. I'll look into it a bit more tomorrow. |
Fix wildcard in pathspec reset test
I re-ran the tests with I'll start looking into why the |
False alarm: the build directories for |
I figured it out! It's a bug in |
The GIT_TEST_INSTALLED was moved from perf-lib.sh to run in df0f502 (perf-lib.sh: remove GIT_TEST_INSTALLED from perf-lib.sh, 2019-05-07) and that included a change to how it inspected the existence of a bin-wrappers directory. However, that included a typo that made the match of bin-wrappers never work. Specifically, the assignment was mydir_abs_wrappers="$mydir_abs_wrappers/bin-wrappers" which uses the same variable before it is initialized. By changing it to mydir_abs_wrappers="$mydir_abs/bin-wrappers" We can correctly use the bin-wrappers directory. This is critical to successfully computing performance of commands that execute subcommands. The bin-wrappers ensure that the --exec-path is set correctly. Signed-off-by: Derrick Stolee <[email protected]>
Signed-off-by: Victoria Dye <[email protected]>
Most of the functionality of `git stash` exists in already-sparse aware subcommands, so enabling sparse index in `cmd_stash` allows the index to remain sparse in most usages of `git stash`. Tests added to `t1092` reflect commonly-used commands that are already sparse index-compatible. Signed-off-by: Victoria Dye <[email protected]>
The `merge_recursive_generic` function is a wrapper that allows running a merge on trees (rather than commits) by wrapping the trees as virtual commits. This functionality is valuable not only for use with `merge_recursive`, but also `merge_ort_recursive`. Having the merge function used by `merge_recursive_generic` be an argument allows greater flexibility of usage. Signed-off-by: Victoria Dye <[email protected]>
Because `merge_recursive` forces the index to always be expanded, switching to `merge_ort_recursive` is needed to allow the index to remain sparse when applying a stash. Signed-off-by: Victoria Dye <[email protected]>
Strangely, none of the other builtin changes seem to have had an affect, sadly:
But at least we are getting that huge performance benefit at the end. Edit: That second |
That is strange, and I confirmed it on my end as well. Digging into
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like your commit ordering here. I left a note on the one about the function pointer, which is a great strategy for now. We will want to present that to the mailing list as something we are less sure about. They might want to do a fully-configurable approach, but they might also want to do a full replacement in one go. It's unclear.
repo_hold_locked_index(opt->repo, &lock, LOCK_DIE_ON_ERROR); | ||
clean = merge_recursive(opt, head_commit, next_commit, ca, | ||
result); | ||
clean = merge_fn(opt, head_commit, next_commit, ca, result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is odd that this doesn't already allow the ORT strategy if configured to do so.
I went hunting for how this was configured in builtin/merge.c
, but that seems like a very complicated procedure.
We could use something like the pull.twohead
config here as a way to allow a user to modify the merge strategy. That gives us a safety valve in case the ORT strategy has a problem in one of these consumers.
The other approach could be to just use the ORT algorithm here no matter what. It's a big change, but it might improve things more rapidly.
After having all of these thoughts I think you've found a good middle ground: the function pointer presents a way to use ORT only when we really need it. This works especially well for our early adoption of sparse index in microsoft/git
, but we should call special attention to this when upstreaming the work.
@@ -74,7 +74,7 @@ set_git_test_installed () { | |||
mydir=$1 | |||
|
|||
mydir_abs=$(cd $mydir && pwd) | |||
mydir_abs_wrappers="$mydir_abs_wrappers/bin-wrappers" | |||
mydir_abs_wrappers="$mydir_abs/bin-wrappers" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just an FYI: I sent this upstream via gitgitgadget#1044, since it is so simple.
See https://lore.kernel.org/git/[email protected]/T/#u for the discussion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case, should I remove it from this branch? Or will it disappear once vfs-2.33.0
is rebased?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will disappear during the rebase onto v2.34.0
(assuming it gets in quickly).
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Sparse index: integrate with `git stash`
Changes
reset with wildcard pathspec
test int1092
*
needed to be escapedt/perf/run
(courtesy of @derrickstolee)ensure_not_expanded
tests forgit stash
merge_recursive_generic
to allow a custom merge functionmerge_recursive_generic
, swapped out the merge function used ingit stash apply
/git stash pop
cmd_stash
Performance
Measuring this via performance tests yielded some confusing results. When
p2000
was run on just this branch, there was a clear improvement in sparse vs full index (5 timing trials per test):However, when run comparing different commits (also with 5 trials each), there was basically no difference in the last column (testing the same as above):
The above tested commits are:
git reset
with sparse indexgit checkout-index
with sparse indexgit update-index
with sparse indexgit read-tree
with sparse indexThe
t1092
tests confirm that the index should not be expanded (and a manual check of the trace2 results from running the commands in thep2000
test do not show the index being expanded), so I'm not sure what's going on there. In any event, the index does not appear to be expanded, so in practice that should probably match up with what happens in the first set of performance test results (I think).