Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make repo-level merging rebase descendants of rewritten/abandoned commits #111

Closed
martinvonz opened this issue Mar 9, 2022 · 3 comments
Closed
Assignees

Comments

@martinvonz
Copy link
Member

Let's say you have commit A, which you rewrite as A'. You then add commit B on top. You then realize you want to undo the A->A' operation. Undoing that currently results in divergence because you get A back but A' is still visible because B is on top of it. I think we should rebase B onto A instead.

Another case is when you have pushed a branch to GitHub to create a PR. The PR then gets merged and the branch gets deleted from the remote. When you now jj git fetch from the remote, we pull down the new main branch and delete the old feature branch locally. However, we leave the commits in place and you have to manually abandon them. I think we should instead remove the commits and rebase any commits onto their closest visible ancestor.

These two cases are perhaps not obviously related, but I think they will both implemented in the repo-level merge. That means you'll get the same automatic rebasing from A' to A when you fetch that update from a remote, and the same automatic rebasing to ancestors if you undo an operation that added commits that your new branch is based on.

Note that the interaction with remotes will work much more smoothly once we can preserve Change IDs throughout the roundtrip to a remote. One option for achieving that would be to add the Change ID to the commit message like Gerrit does with its Change-Id: footer. If we do that, we'll want to make it optional so non-Jujutsu users won't be bothered by it. For now, we should do the automatic rebasing when we notice that a branch was moved in Git.

This is something I've meant to do for a year or something, but it's become more important to me personally now that I use GitHub PRs.

@martinvonz martinvonz self-assigned this Mar 9, 2022
@martinvonz
Copy link
Member Author

martinvonz commented Mar 9, 2022

A related cleanup I've meant to make is to the internal representation of branches and remotes. A branch currently knows where it points locally and on all remotes. I'm hoping to change that so the branch doesn't know about its target on remotes and to instead have a per-remote view object containing the branches on that remote.

After restructuring it like that, I think it will be easier and more natural to implement pull (fetch) as a 3-way repo-level merge: take the local repo's state and apply the diff between the remote's previously recorded state and the remote's new state. The same applies when importing from an underlying Git repo.

I'm documenting these thoughts here in the hope that it's useful to others, so please ask if the above didn't make any sense to you. I'll try to find time to document it in the repo as well some day...

martinvonz added a commit that referenced this issue Mar 20, 2022
I want to make it so when we apply a repo-level change that removes a
head, we rebase descendants of that head onto the closes visible
ancestor, or onto its successor if the head has been rewritten (see
#111 for details). The view itself doesn't have enough information for
that; we need repo-level information (to figure out relationships
between commits).

The view doesn't have enough information to do the.
martinvonz added a commit that referenced this issue Mar 20, 2022
…111)

We had a few lines of similar code where we added a new of the
operation log and then removed the old heads. By moving that code into
a new type, we prepare for further refactorings.
martinvonz added a commit that referenced this issue Mar 20, 2022
It's useful for the UI layer to know that there's been concurrent
operations, so it can inform the user that that happened. It'll be
even more useful when we soon start making the resolution involve
rebasing commits, since that's even more important for the UI layer to
present to the user. This patch gets us a bit closer to that by moving
the resolution to the repo level.
martinvonz added a commit that referenced this issue Mar 23, 2022
I want to make it so when we apply a repo-level change that removes a
head, we rebase descendants of that head onto the closes visible
ancestor, or onto its successor if the head has been rewritten (see
#111 for details). The view itself doesn't have enough information for
that; we need repo-level information (to figure out relationships
between commits).

The view doesn't have enough information to do the.
martinvonz added a commit that referenced this issue Mar 23, 2022
…111)

We had a few lines of similar code where we added a new of the
operation log and then removed the old heads. By moving that code into
a new type, we prepare for further refactorings.
martinvonz added a commit that referenced this issue Mar 23, 2022
It's useful for the UI layer to know that there's been concurrent
operations, so it can inform the user that that happened. It'll be
even more useful when we soon start making the resolution involve
rebasing commits, since that's even more important for the UI layer to
present to the user. This patch gets us a bit closer to that by moving
the resolution to the repo level.
martinvonz added a commit that referenced this issue Mar 23, 2022
Certain commands should never rewrite commits, or they take care of
rebasing descendants themselves. We have an optimization in
`commands.rs` for those commands, so they skip the usual automatic
rebasing before committing the transaction. That's risky to have to
remember and `MutableRepo` already knows if any commits have been
rewritten (that wasn't the case before, in the Evolution-based
code). So let's just have `MutableRepo` do the check instead.
martinvonz added a commit that referenced this issue Mar 24, 2022
I want to make it so when we apply a repo-level change that removes a
head, we rebase descendants of that head onto the closes visible
ancestor, or onto its successor if the head has been rewritten (see
#111 for details). The view itself doesn't have enough information for
that; we need repo-level information (to figure out relationships
between commits).

The view doesn't have enough information to do the.
martinvonz added a commit that referenced this issue Mar 24, 2022
…111)

We had a few lines of similar code where we added a new of the
operation log and then removed the old heads. By moving that code into
a new type, we prepare for further refactorings.
martinvonz added a commit that referenced this issue Mar 24, 2022
It's useful for the UI layer to know that there's been concurrent
operations, so it can inform the user that that happened. It'll be
even more useful when we soon start making the resolution involve
rebasing commits, since that's even more important for the UI layer to
present to the user. This patch gets us a bit closer to that by moving
the resolution to the repo level.
martinvonz added a commit that referenced this issue Mar 24, 2022
Certain commands should never rewrite commits, or they take care of
rebasing descendants themselves. We have an optimization in
`commands.rs` for those commands, so they skip the usual automatic
rebasing before committing the transaction. That's risky to have to
remember and `MutableRepo` already knows if any commits have been
rewritten (that wasn't the case before, in the Evolution-based
code). So let's just have `MutableRepo` do the check instead.
martinvonz added a commit that referenced this issue Mar 24, 2022
)

If we have recorded in `MutableRepo` that commits have been abandoned
or rewritten, we should always rebase descendants before committing
the transaction (otherwise there's no reason to record the
rewrites). That's not much of a risk in the CLI because we already
have that logic in a central place there (`finish_transaction()`), but
other users of the library crate could easily miss it. Perhaps we
should automatically do any necessary rebasing we commit the
transaction in the library crate instead, but for now let's just have
a check for that to catch such bugs.
martinvonz added a commit that referenced this issue Mar 24, 2022
This is yet another refactoring step to allow the UI layer to inspect
and control how concurrent operations are resolved.
@martinvonz
Copy link
Member Author

The same code is used for resolving conflicts between concurrent operations. For example, if you create commit B on top of commit A on one machine and rewrite commit A as A' on another machine, and you then rsync the two repos, you should now (with the commits I just pushed) see B automatically get rebased on top of A' next time you run a command.

Similarly, if you create commit C on top of B on one machine and abandon commit B on another machine, you'll see C get automatically rebased onto A (B's parent) next time you run a command.

That's quite unorthodox, but I think it's what the user wants. I think we just need to inform them that the rebase happened. I think it'll be neat when it behaves the same way whether you ran two concurrent commands on the same machine, on two different machines (after e.g. rsync), or on two different clones (after fetching from the other repo).

martinvonz added a commit that referenced this issue Mar 26, 2022
…111)

We already have the operation objects, and keeping them will allow
some cleanup (see next change).
martinvonz added a commit that referenced this issue Mar 26, 2022
It's the transaction's job to create a new operation, and that's where
the knowledge of parent operations is. By moving the logic for merging
in another operation there, we can make it contiuously update its set
of parent operations. That removes the risk of forgetting to add the
merged-in operation as a parent. It also makes it easier to reuse the
function from the CLI so we can inform the user about the process
(which is what I was investigating when I noticed that this cleanup
was possible).
martinvonz added a commit that referenced this issue Mar 26, 2022
It's cheap to sort them, so let's have them always sorted, so all
callers can rely on that order.
martinvonz added a commit that referenced this issue Mar 26, 2022
The function is now pretty simple, and there's only one caller, so
let's inline it. It probably makes sense to move the code out of
`repo.rs` at some point.
martinvonz added a commit that referenced this issue Mar 27, 2022
)

This involved copying `UnresolvedHeadRepo::resolve()` into the CLI
crate (and modifying it a bit to print number of rebased commit),
which is unfortunate.
martinvonz added a commit that referenced this issue Mar 27, 2022
…111)

We already have the operation objects, and keeping them will allow
some cleanup (see next change).
martinvonz added a commit that referenced this issue Mar 27, 2022
It's the transaction's job to create a new operation, and that's where
the knowledge of parent operations is. By moving the logic for merging
in another operation there, we can make it contiuously update its set
of parent operations. That removes the risk of forgetting to add the
merged-in operation as a parent. It also makes it easier to reuse the
function from the CLI so we can inform the user about the process
(which is what I was investigating when I noticed that this cleanup
was possible).
martinvonz added a commit that referenced this issue Mar 27, 2022
It's cheap to sort them, so let's have them always sorted, so all
callers can rely on that order.
martinvonz added a commit that referenced this issue Mar 27, 2022
The function is now pretty simple, and there's only one caller, so
let's inline it. It probably makes sense to move the code out of
`repo.rs` at some point.
martinvonz added a commit that referenced this issue Mar 27, 2022
…111)

I want to do the rebasing a bit later, so the CLI can do it and print
how many commits got rebased.
martinvonz added a commit that referenced this issue Mar 27, 2022
)

This involved copying `UnresolvedHeadRepo::resolve()` into the CLI
crate (and modifying it a bit to print number of rebased commit),
which is unfortunate.
martinvonz added a commit that referenced this issue Mar 27, 2022
I want to make it so when we apply a repo-level change that removes a
head, we rebase descendants of that head onto the closes visible
ancestor, or onto its successor if the head has been rewritten (see
#111 for details). The view itself doesn't have enough information for
that; we need repo-level information (to figure out relationships
between commits).

The view doesn't have enough information to do the.
martinvonz added a commit that referenced this issue Mar 27, 2022
…111)

We had a few lines of similar code where we added a new of the
operation log and then removed the old heads. By moving that code into
a new type, we prepare for further refactorings.
martinvonz added a commit that referenced this issue Mar 27, 2022
It's useful for the UI layer to know that there's been concurrent
operations, so it can inform the user that that happened. It'll be
even more useful when we soon start making the resolution involve
rebasing commits, since that's even more important for the UI layer to
present to the user. This patch gets us a bit closer to that by moving
the resolution to the repo level.
martinvonz added a commit that referenced this issue Mar 27, 2022
Certain commands should never rewrite commits, or they take care of
rebasing descendants themselves. We have an optimization in
`commands.rs` for those commands, so they skip the usual automatic
rebasing before committing the transaction. That's risky to have to
remember and `MutableRepo` already knows if any commits have been
rewritten (that wasn't the case before, in the Evolution-based
code). So let's just have `MutableRepo` do the check instead.
martinvonz added a commit that referenced this issue Mar 27, 2022
)

If we have recorded in `MutableRepo` that commits have been abandoned
or rewritten, we should always rebase descendants before committing
the transaction (otherwise there's no reason to record the
rewrites). That's not much of a risk in the CLI because we already
have that logic in a central place there (`finish_transaction()`), but
other users of the library crate could easily miss it. Perhaps we
should automatically do any necessary rebasing we commit the
transaction in the library crate instead, but for now let's just have
a check for that to catch such bugs.
martinvonz added a commit that referenced this issue Mar 27, 2022
This is yet another refactoring step to allow the UI layer to inspect
and control how concurrent operations are resolved.
martinvonz added a commit that referenced this issue Mar 27, 2022
…111)

We already have the operation objects, and keeping them will allow
some cleanup (see next change).
martinvonz added a commit that referenced this issue Mar 27, 2022
It's the transaction's job to create a new operation, and that's where
the knowledge of parent operations is. By moving the logic for merging
in another operation there, we can make it contiuously update its set
of parent operations. That removes the risk of forgetting to add the
merged-in operation as a parent. It also makes it easier to reuse the
function from the CLI so we can inform the user about the process
(which is what I was investigating when I noticed that this cleanup
was possible).
martinvonz added a commit that referenced this issue Mar 27, 2022
It's cheap to sort them, so let's have them always sorted, so all
callers can rely on that order.
martinvonz added a commit that referenced this issue Mar 27, 2022
The function is now pretty simple, and there's only one caller, so
let's inline it. It probably makes sense to move the code out of
`repo.rs` at some point.
martinvonz added a commit that referenced this issue Mar 27, 2022
…111)

I want to do the rebasing a bit later, so the CLI can do it and print
how many commits got rebased.
martinvonz added a commit that referenced this issue Mar 27, 2022
)

This involved copying `UnresolvedHeadRepo::resolve()` into the CLI
crate (and modifying it a bit to print number of rebased commit),
which is unfortunate.
@martinvonz
Copy link
Member Author

The repo-level merging improvements are done, so I'll close this. I still need to make jj git fetch (and jj git import) take advantage of it, but I should probably create a different issue for that since there are different considerations there.

martinvonz added a commit that referenced this issue Dec 2, 2022
…ible

It seems that we didn't have a test for this simple case. I wrote this
test case while working on #111 but I don't know why I didn't push it
back then.
martinvonz added a commit that referenced this issue Dec 2, 2022
…sible

It seems that we didn't have a test for this simple case. I wrote this
test case while working on #111 but I don't know why I didn't push it
back then.
martinvonz added a commit that referenced this issue Dec 2, 2022
…sible

It seems that we didn't have a test for this simple case. I wrote this
test case while working on #111 but I don't know why I didn't push it
back then.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant