Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When pulling from remote, propagate rewritten/abandoned commits #241

Open
martinvonz opened this issue Apr 25, 2022 · 5 comments
Open

When pulling from remote, propagate rewritten/abandoned commits #241

martinvonz opened this issue Apr 25, 2022 · 5 comments

Comments

@martinvonz
Copy link
Member

Description

Old commits from a remote remain in the repo after they have been deleted on the remote.

Steps to Reproduce the Problem

  1. Pull from a remote (jj git fetch)
  2. Create new commits off of some branch pulled from the remote
  3. Go to the remote repo and update or delete the branch (i.e. in the remote repo)
  4. Pull from the remote again

Expected Behavior

The commits that were removed from the remote (because no branches point to them anymore) should be abandoned locally as well. The new commits that were created off of the updated/deleted branch should be rebased to the new location (if updated) or to the closes still-existing commit (if deleted).

This is closely related to #111.

Actual Behavior

The old commits stay in place.

Specifications

  • Platform: All
  • Version: 0.4.0
@martinvonz martinvonz self-assigned this Apr 25, 2022
martinvonz added a commit that referenced this issue Apr 28, 2022
Let's say we have a simple history like this:

```
B C D
 \|/
  A
```

Branch `main` initially points to commit B. Two concurrent operations
then move the branch to commits C and D. When the two concurrent
operations get merged, the branch will be recorded as pointing to
"C+D-B". If a subsequent operation now abandons commit B, we would
update the "removed" side of the branch conflict. That seems a little
dishonest. I think the reason I did it that way was in order to not
keep B visible back when having it present in the "removed" side would
keep it visible (before 33bf6ce).

I noticed this issue while working on #241 because
`test_import_refs_reimport()` started failing. That test case is
pretty much exactly the case above.
martinvonz added a commit that referenced this issue Apr 28, 2022
Now that I'm using GitHub PRs instead of pushing directly to the main
branch, it's quite annoying to have to abandon the old commits after
GitHub rebases them. This patch makes it so we compare the remote's
previous heads to the new heads and abandons any commits that were
removed on the remote. As usual, that means that descendants get
rebased onto the closest remaining commit.

This is half of #241. The other half is to detect rewritten branches
and rebase on top.
martinvonz added a commit that referenced this issue Apr 28, 2022
Now that I'm using GitHub PRs instead of pushing directly to the main
branch, it's quite annoying to have to abandon the old commits after
GitHub rebases them. This patch makes it so we compare the remote's
previous heads to the new heads and abandons any commits that were
removed on the remote. As usual, that means that descendants get
rebased onto the closest remaining commit.

This is half of #241. The other half is to detect rewritten branches
and rebase on top.
martinvonz added a commit that referenced this issue Apr 28, 2022
Let's say we have a simple history like this:

```
B C D
 \|/
  A
```

Branch `main` initially points to commit B. Two concurrent operations
then move the branch to commits C and D. When the two concurrent
operations get merged, the branch will be recorded as pointing to
"C+D-B". If a subsequent operation now abandons commit B, we would
update the "removed" side of the branch conflict. That seems a little
dishonest. I think the reason I did it that way was in order to not
keep B visible back when having it present in the "removed" side would
keep it visible (before 33bf6ce).

I noticed this issue while working on #241 because
`test_import_refs_reimport()` started failing. That test case is
pretty much exactly the case above.
martinvonz added a commit that referenced this issue Apr 28, 2022
Now that I'm using GitHub PRs instead of pushing directly to the main
branch, it's quite annoying to have to abandon the old commits after
GitHub rebases them. This patch makes it so we compare the remote's
previous heads to the new heads and abandons any commits that were
removed on the remote. As usual, that means that descendants get
rebased onto the closest remaining commit.

This is half of #241. The other half is to detect rewritten branches
and rebase on top.
martinvonz added a commit that referenced this issue Apr 28, 2022
Let's say we have a simple history like this:

```
B C D
 \|/
  A
```

Branch `main` initially points to commit B. Two concurrent operations
then move the branch to commits C and D. When the two concurrent
operations get merged, the branch will be recorded as pointing to
"C+D-B". If a subsequent operation now abandons commit B, we would
update the "removed" side of the branch conflict. That seems a little
dishonest. I think the reason I did it that way was in order to not
keep B visible back when having it present in the "removed" side would
keep it visible (before 33bf6ce).

I noticed this issue while working on #241 because
`test_import_refs_reimport()` started failing. That test case is
pretty much exactly the case above.
martinvonz added a commit that referenced this issue Apr 28, 2022
Now that I'm using GitHub PRs instead of pushing directly to the main
branch, it's quite annoying to have to abandon the old commits after
GitHub rebases them. This patch makes it so we compare the remote's
previous heads to the new heads and abandons any commits that were
removed on the remote. As usual, that means that descendants get
rebased onto the closest remaining commit.

This is half of #241. The other half is to detect rewritten branches
and rebase on top.
@martinvonz martinvonz removed their assignment Nov 2, 2022
@martinvonz
Copy link
Member Author

The current status is that we rebase descendants off of commits that were removed from the remote, but we don't rebase onto the new branch target if the branch had been rewritten.

@joyously
Copy link

Does this mean that a deletion removes the history of it ever existing? Is that desirable?

@martinvonz
Copy link
Member Author

I'm not sure what you mean by "removes the history of it ever existing", so let me give an example. Let's say we clone a repo and get this history:

o aaa222 feature bbb222
| add a feature
o aaa111 main bbb111
:
o 000000 root 000000

So feature is a branch with a single commit on top of main. The first hash is the change id and the second hash is the commit id, similar to jj log output,

We then decide to fix a bug based on feature and end up with this:

@ aaa333 bbb333
| fix a bug
o aaa222 feature bbb222
| add a feature
o aaa111 main bbb111
:
o 000000 root 000000

Now the remote repo decided to delete the feature branch for some reason. After we jj git fetch, our local history will look like this:

@ aaa333 bbb444
| fix a bug
o aaa111 main bbb111
:
o 000000 root 000000

I.e., change aaa333 was rebased off of the commits in the deleted feature branch, resulting in commit id bbb444. Of course, if our bug fix actually depended on the feature, then we'll have conflicts in bbb444.

Does that answer your question? If not, can you clarify in the context of the example?

@joyously
Copy link

I suppose from the perspective that jj does rebasing, which changes history, making it reflect the remote is the desired outcome.
From the perspective of a Git user, where nothing is ever actually deleted, it seems odd.
In Git, branches can come and go, but not commits?
Would the op log show all the intervening steps, or just the fetch? In other words, can you revert (undo?) the fetch or the delete?

@martinvonz
Copy link
Member Author

From the perspective of a Git user, where nothing is ever actually deleted, it seems odd.
In Git, branches can come and go, but not commits?

Git has garbage collection and will delete commits that are no longer referenced. However, references from reflogs count, and the reflog for HEAD gets updated quite often, so many commits are kept after losing branches pointing to them. Reflogs entries expire after 90 days by default. Note that deleting a branch leads to the reflog also getting deleted.

Would the op log show all the intervening steps, or just the fetch? In other words, can you revert (undo?) the fetch or the delete?

Yes, the operation log will still including references to deleted commits, so you can go back as many steps as you like. We actually don't have any support for GC yet (#12), so there's no expiration period like in Git.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants