Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend :Gbrowse to support line anchors for commits #1171

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

odnoletkov
Copy link
Contributor

Look for and use path of the +++ side when it exists. For removals it is
/dev/null, so use path of the --- side. For the line range only line1
is set, corresponding to the first line of selected region. If this line
is in the removed part of the hunk, the number is relative to the ---
side and is negative to indicate that to adapters.

@odnoletkov
Copy link
Contributor Author

A few things I don't like:

  • The logic deciding which side to pick for the path is hardcoded in vim-fugitive. It matches what Github and Bitbucket currently use but other providers can do something else
  • Ideally s:cfile should be updated to support this use case – work for context lines and provide required path. That would allow to jump not only from hunk contents, but from various meta lines.
  • Negating line number to indicate base diff side is kinda ugly

Fixing this would require expanding the handlers interface, not sure what is right here

@odnoletkov
Copy link
Contributor Author

Relevant vim-rhubarb pull request: tpope/vim-rhubarb#37

@shumphrey
Copy link
Contributor

If it's useful, here's a gitlab commit url
https://gitlab.com/shumphrey/fugitive-gitlab.vim/commit/935fc68e88630ea3ededa53f73b82fe615b91f75#bf0da39e5391a3e3059698066351b6fabb26a3af_104_104

@odnoletkov
Copy link
Contributor Author

It seems like Gitlab format is:

#<SHA1 of path>_L_R

where L and R are left and right line numbers. For ambiguous line numbers like L for added lines the (last + 1) number is used. It looks like it doesn't take any other format and we have to always give these two numbers exactly. Which in turn means abusing line1 like I did wouldn't work and we need to pass both line numbers to adapters and figure out a different way to indicate which side to use.

@tpope
Copy link
Owner

tpope commented Feb 1, 2019

Probably for the best, I think jamming things into existing fields was a bit too clever.

I'm not sure I understand how opposing line number is computed when you click a + or - line. Can we solve that first?

@odnoletkov
Copy link
Contributor Author

I've made a sample repo and summarized format for GitHub, Bitbucket and GitLab.

$PATH is result of applying the path function to the path from the specified side of the diff
$LN is left line number (corresponds to the --- side of the diff)
$RN is right line number (corresponds to the +++ side of the diff)

GitHub

Path function: MD5

File Anchor Path side
format #diff-$PATH
added #diff-e5ffe45809e26a5936f4b5aedd3f9abc +++
removed #diff-f6af5b3f285d6bdf2c72ca6b275d4b33 ---
renamed #diff-949e4e0aa2c9c41c5b6da230e51e42a2 +++
modified #diff-3c1b078733a5ed0277a059b90eecf685 +++
Line Anchor
format #diff-$PATH(L$LN|R$RN)
added #diff-3c1b078733a5ed0277a059b90eecf685R18
removed #diff-3c1b078733a5ed0277a059b90eecf685L16
context #diff-3c1b078733a5ed0277a059b90eecf685L12
#diff-3c1b078733a5ed0277a059b90eecf685R14

Bitbucket

Path function: literal string

File Anchor Path side
format #chg-$PATH
added #chg-added-file +++
removed #chg-removed-file ---
renamed #chg-renamed-file +++
modified #chg-modified-file +++
Line Anchor
format #L$PATH[F$LN][T$RN]
added #Lmodified-fileT18
removed #Lmodified-fileF16
context #Lmodified-fileF12T14

GitLab

Path function: SHA1

File Anchor Path side
format #$PATH
added #3eebb48e8a65993c8af2d734b7283af29e4e66a7 +++
removed #4832278151f916b39769b7b63c5035529cb3e3d4 ---
renamed #1b9342021c949c8885eb0bb11bf0d7d7912542bb +++
modified #87ad87a1bcca8a41e8bbf17d7a0047f098ab4ea2 +++
Line Anchor
format #$PATH_$LN_$RN
added #87ad87a1bcca8a41e8bbf17d7a0047f098ab4ea2_18_18
removed #87ad87a1bcca8a41e8bbf17d7a0047f098ab4ea2_16_17
context #87ad87a1bcca8a41e8bbf17d7a0047f098ab4ea2_12_14

@odnoletkov
Copy link
Contributor Author

So all three use the same logic for picking the side for the path: pick +++ if it is not /dev/null, otherwise pick ---. The inverse logic (pick --- if available first, then +++) also produces unambiguous results for addressing files in commits so in theory can be used in another provider. Or any other scheme you can think of. If we want to support that, passing a single path is not enough, and we need to pass both paths to providers. We can add something like old_path to have a old_path/path pair or add both old_path/new_path.

As for line numbers, to support GitLab we need to pass both – line number in the --- side and line number in the +++ side. Can use line1 and line2 for that but they are used for ranges in other URL types so maybe it's better to not mix them up.

With passing two line numbers a few things need to be resolved. First is for some providers (GitHub and Bitbucket) we still need to pass which side line is selected. Can add extra field like side with values ' '/'-'/'+' for that.

Another problem is for added or removed lines the number on the opposite side is kinda ambiguous – can pick valid number before or after the region. Whatever strategy you pick if the change region is at the beginning or at the end of the file you end up with line numbers out of bounds (0 or NR+1).

GitLab logic for ambiguous line numbers is very bad. In general it picks numbers after the region, even if the region is in the end and the number will be out of bounds. But for some reason it special cases completely added and removed files and for corresponding lines uses 0 instead of 1 which would be a correct value with the logic above.

removed line, +++ line number is after the region
added line, --- line number is after the region
removed line in the end region, +++ line number is after the region
added line in the end region, --- line number is after the region
removed line the removed file, +++ line number is 0
added line in the new file, --- line number is 0

We can pass line numbers using the after logic – having the side parameter from above even before providers can calculate properly. And for that special casing in GitLab for added/removed files it looks like we are forced to pass both +++ and --- paths explicitly as suggested above.

So to summarize, we need

  • both old (---) and new (+++) paths
  • both line numbers
  • a field to indicate which side was actually selected - old/new/context

@tpope
Copy link
Owner

tpope commented Feb 4, 2019

One last case to look at: If you rename a file, and then add a new file with the old name but completely different content, how does that look? I'm guessing the reason the +++ filename is preferred is because the --- filename might in fact refer to a different file. (Or more likely, it will never show a rename diff in that case, but it's still theoretically possible.) For this reason, I think we're probably okay to just send one filename.

Thanks for all the research! I don't think I want to commit to a new API until after 3.0 (hopefully this month), but don't let that stop you from going ahead and taking a stab at adding these new fields.

@odnoletkov
Copy link
Contributor Author

If you rename a file, and then add a new file with the old name but completely different content, how does that look?

This is displayed as plain modification and addition by all providers – GitHub, Bitbucket, GitLab – and by Git itself:

$ git diff --name-status 201483^@
M       old-name-file
A       renamed-file

No heuristics work for that case by default. This is documented in great detail in git help diffcore. Basically only rename detection is enabled by default and it only considers removed paths. To include modified paths into consideration copy detection must be explicitly enabled with -C:

$ git diff --name-status -C 201483^@
M       old-name-file
C100    old-name-file   renamed-file

None of the providers enable that. In general it looks like they try to be as close to the original diff format as possible – at least I haven't noticed any discrepancies – which makes sense.

Anyway, as I described above GitLab has special case for ambiguous line numbers in removed and added paths (uses 0 instead of 1 for them) for some reason. Passing both --- and +++ paths is one way to cover that case and also feels like a safe bet to cover other possible provider quirks.

@tpope
Copy link
Owner

tpope commented Feb 6, 2019

So two different hypotheticals: a provider that supports -C (kind of cool) and a provider that favors the --- filename (kind of dumb). Guess which one I'd use for inspiration.

I think using 0 to mean "this side of the diff doesn't exist" is actually kind of elegant, and I would propose adapting it ourselves.

@odnoletkov
Copy link
Contributor Author

I was afraid of other hypotheticals which can combine paths in other unpredicted ways. Went through the first list of Git web hostings I could find. The first which worked for me is Beanstalk and it actually doesn't have rename detection in commits at all:
image

So supporting this one would definitely require both paths. Missing path could be represented by either missing key or empty value for the key.

As for line numbers, do you want to re-use existing 'line1' and 'line2' keys despite they have different meaning or add new keys?

@tpope
Copy link
Owner

tpope commented Feb 7, 2019

This screenshot doesn't tell me much, but I take it renames show up as a thousand deleted lines in the old filename and a thousand added lines in the new filename. And if you try to :.Gbrowse on a - line, you want the old filename so you can link to one of those thousand deleted lines. This seems utterly worthless but also the technically correct behavior so congrats I guess.

I'm not sure what the API should look like (hence my reluctance to rush it out the door) other than we should not reuse the path key, because current implementations will all take that to mean link to a blob instead. Just use your best judgment, I assume it will be easy to rename a field or two later.

@odnoletkov odnoletkov changed the title Provide path and line to :Gbrowse providers for commits Extend :Gbrowse to support line anchors for commits Feb 10, 2019
@tpope tpope mentioned this pull request Feb 21, 2019
@odnoletkov
Copy link
Contributor Author

Now this should support line ranges: https://github.blog/changelog/2019-07-23-multi-line-diff-selection

@tpope
Copy link
Owner

tpope commented Jul 23, 2019

Is this supposed to work by shift clicking a second line while viewing a commit? Because that's not working for me.

@odnoletkov
Copy link
Contributor Author

Yes, shift-clicking.

Works in:
Commit in pull request: https://github.com/tpope/vim-fugitive/pull/1171/commits/171bd8a18b485a81a544f68466a3ffff9fbd1cf6#diff-4c8459efc2f0fea43a9d26047a9e5e06R3974-R3977 (interesting that Github replaces this link to a standalone commit link when rendering comments)
All changes in pull request: https://github.com/tpope/vim-fugitive/pull/1171/files#diff-4c8459efc2f0fea43a9d26047a9e5e06R3974-R3977

Doesn't work in:
Standalone commit page: 994d1b5
Compare page: master...blame_color

Apparently not everything is 'diff view'.

I guess until it is not supported on the commit page it's not very relevant for us. Should we still future-proof the adapters interface and include lines range there?

Also this makes me wonder if 'commit in pull request' page may be more useful than 'standalone commit' page for :Gbrowse (when it exists). I wonder if it can be even found given just commit name.

@tpope
Copy link
Owner

tpope commented Jul 24, 2019

Yes, shift-clicking.

Works in:
Commit in pull request: https://github.com/tpope/vim-fugitive/pull/1171/commits/171bd8a18b485a81a544f68466a3ffff9fbd1cf6#diff-4c8459efc2f0fea43a9d26047a9e5e06R3974-R3977 (interesting that Github replaces this link to a standalone commit link when rendering comments)
All changes in pull request: https://github.com/tpope/vim-fugitive/pull/1171/files#diff-4c8459efc2f0fea43a9d26047a9e5e06R3974-R3977

Doesn't work in:
Standalone commit page: 994d1b5
Compare page: master...blame_color

Apparently not everything is 'diff view'.

I guess until it is not supported on the commit page it's not very relevant for us. Should we still future-proof the adapters interface and include lines range there?

If we've decided to send 2 filenames, may as well send 4 line numbers.

Also this makes me wonder if 'commit in pull request' page may be more useful than 'standalone commit' page for :Gbrowse (when it exists). I wonder if it can be even found given just commit name.

I think it should be consistent, not a "use this one page if it exists and this other page if not" deal.

@odnoletkov
Copy link
Contributor Author

Works in:
Commit in pull request: https://github.com/tpope/vim-fugitive/pull/1171/commits/171bd8a18b485a81a544f68466a3ffff9fbd1cf6#diff-4c8459efc2f0fea43a9d26047a9e5e06R3974-R3977 (interesting that Github replaces this link to a standalone commit link when rendering comments)
All changes in pull request: https://github.com/tpope/vim-fugitive/pull/1171/files#diff-4c8459efc2f0fea43a9d26047a9e5e06R3974-R3977

Doesn't work in:
Standalone commit page: 994d1b5
Compare page: master...blame_color

It looks like this has been fixed since, all these Github pages now support line ranges in url anchors

@odnoletkov
Copy link
Contributor Author

odnoletkov commented Aug 22, 2019

TODO:

  • support selection spanning multiple hunks in single file

autoload/fugitive.vim Outdated Show resolved Hide resolved
@odnoletkov
Copy link
Contributor Author

@tpope any other fixes here or is it good to go?

@tpope
Copy link
Owner

tpope commented Sep 18, 2019

Been struggling to put it into words but my gut reaction is that this change feels very alien, like it was blindly bolted on with zero knowledge of the existing API or of Fugitive in general. All 5 child keys are identifiers that appear nowhere else in the source code. It triples the size of the API, which I would frown upon even if it was necessary, but it doesn't even feel necessary.

I think a better design would be to use the "new" file to fill out the path, line1, and line2 fields, and put the same 3 attributes for the "old" file onto a child object, with a name like diff or against. I think "new" as the primary file makes sense because that's the file as it stands in the commit. (The compare view is then very naturally represented by adding a "commit" to the child object, though that's probably out of scope for this iteration.)

This proposal has 2 obvious consequences, both of which I am fine with:

  • Existing handlers will generate a link to the blob of the new file rather than the commit. I think this is fine because the old behavior was undefined.
  • We can't have a range that spans multiple files. This seems like a silly use case, and to my knowledge none of the existing hosting sites support this anyways.

@odnoletkov
Copy link
Contributor Author

As described above, having just paths and line numbers for both sides is not enough to unambiguously define position in a diff for all providers. We also need the side of the diff for the line – currently stored in the 'cursor' field with +, - and values. I can't think of a good way to represent it with your scheme.

I also feel like a separate and independent 'position in a diff' entity might have other uses – as a uniform way to address a line in a diff. But you know better of course.

@tpope
Copy link
Owner

tpope commented Sep 19, 2019

As described above, having just paths and line numbers for both sides is not enough to unambiguously define position in a diff for all providers. We also need the side of the diff for the line – currently stored in the 'cursor' field with +, - and values. I can't think of a good way to represent it with your scheme.

Yes, I was assuming cursor would stay, though I hadn't taken into account how it's actually 2 attributes. I'm using sigil elsewhere, so I guess line1_sigil and line2_sigil?

I also feel like a separate and independent 'position in a diff' entity might have other uses – as a uniform way to address a line in a diff. But you know better of course.

If it wasn't a potentially reusable abstraction I think I'd object less to it. The design smells weird with its "new" and "old" pseudo-nesting, but I can't really pinpoint how to fix it without another use case or two. So, I don't want the first use case to be an official API.

@odnoletkov
Copy link
Contributor Author

What about the internal use case in s:cfile()? It could look something like:

diff --git a/autoload/fugitive.vim b/autoload/fugitive.vim
index 551baa7..21a2fd2 100644
--- a/autoload/fugitive.vim
+++ b/autoload/fugitive.vim
@@ -5705,19 +5705,15 @@ function! s:cfile() abort
       elseif getline('.') =~# '^[+-]\{3\} [abciow12]\=/'
         let ref = getline('.')[4:]
 
-      elseif getline('.') =~# '^[+-]' && search('^@@ -\d\+\%(,\d\+\)\= +\d\+','bnW')
-        let type = getline('.')[0]
-        let lnum = line('.') - 1
-        let offset = 0
-        while getline(lnum) !~# '^@@ -\d\+\%(,\d\+\)\= +\d\+'
-          if getline(lnum) =~# '^[ '.type.']'
-            let offset += 1
-          endif
-          let lnum -= 1
-        endwhile
-        let offset += matchstr(getline(lnum), type.'\zs\d\+')
-        let ref = getline(search('^'.type.'\{3\} [abciow12]/','bnW'))[4:-1]
-        let dcmds = [offset, 'normal!zv']
+      elseif getline('.') =~# '^[+-]'
+        let pos = s:DiffPosition(line('.'))
+        if type(pos) == type({})
+          let side = {'+':'new','-':'old'}[pos.cursor]
+          let ref = {'+':'b/','-':'a/'}[pos.cursor] . pos[side . 'path']
+          let dcmds = [pos[side . 'line'], 'normal!zv']
+        else
+          let ref = ''
+        endif
 
       elseif getline('.') =~# '^rename from '
         let ref = 'a/'.getline('.')[12:]

This makes the new/old pseudo-nesting problem you mentioned stand out.

What about more nested structure (for single position):

{
  "sigil": "+",
  "+": {
    "path": "path/to/new/file",
    "line": 1
  },
  "-": {
    "path": "path/to/old/file",
    "line": 2
  }
}

This one would simplify use in both c:file and providers.

@tdroxler
Copy link

tdroxler commented Apr 7, 2020

Hola, any news or update about that feature?
Would be an awesome feature to quickly leave a comment when doing a PR review from vim.
Cheers.

@tpope
Copy link
Owner

tpope commented Apr 7, 2020

It's very much still on the agenda. It's prioritized below other Fugitive changes I've been banging my head against. This particular PR is too heavy and consequential to consider jumping the queue.

@tpope
Copy link
Owner

tpope commented Apr 5, 2021

Is this sufficient?

{
  "path": "new_path.txt",
  "line1": 5,
  "line2": 10,
  "diff": {
    "path": "old_path.txt",
    "line1": 4,
    "line2": 0,
  }
}

Instead of cursor or sigil fields, send a line number of 0 if there's no corresponding line on that side of the diff. Added/removed files would get 0 for both line number fields.

@odnoletkov
Copy link
Contributor Author

Yes this is sufficient, updated both PRs.

One remaining issue I see is when the range spans multiple files in the diff – we don't detect file name for the line2 properly. Need more sophisticated approach – probably extend iteration logic which calculates offsets to also look further for the file name?

@tpope
Copy link
Owner

tpope commented Apr 6, 2021

To be clear, we don't actually want to do anything with that second filename, correct? We just want to avoid pretending that second file's line number belongs to the first.

@tpope
Copy link
Owner

tpope commented Apr 6, 2021

Now that I think about it, a better key name than diff would be ancestor. That way there's zero ambiguity with any future compare view between 2 arbitrary commits.

Either that, or add a key like "commit": "whatever^" to the inner dictionary, but I don't want to design a generalized compare API without actually implementing it.

@odnoletkov
Copy link
Contributor Author

To be clear, we don't actually want to do anything with that second filename, correct? We just want to avoid pretending that second file's line number belongs to the first.

Yes, I don't think we can do anything reasonable with multiple files. We just need to bail out clearly in that case, fallback to main commit view same as :.GBrowse outside of diff hunk

Now that I think about it, a better key name than diff would be ancestor

Done

@tpope
Copy link
Owner

tpope commented Apr 8, 2021

I plan to finally tackle converting commit buffers to the fugitive file type in the near future. When that's done, we can hopefully handle this by building on s:Selection().

In commit buffers :GBrowse with range will include 'diff' dictionary
with 'path', 'line1' and 'line2' values for the range on the '---' side
of the diff. Root 'path', 'line1' and 'line2' values specify range on
the '+++' side. Line number is 0 when position on the corresponding side
of the diff is ambiguous because it is inside deleted hunk.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants