Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve diff readability for eg "Bob," -> "Bob" #7

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

laughinghan
Copy link

@laughinghan laughinghan commented May 4, 2021

If a hunk consists of only deleting or only adding at the beginning
or end of a word, then combine them into one hunk.

Examples:

  • Changes from simplediff:

    - Alice, Bob and Charlie
    + Alice, Bob, and Charlie

    simplediff: Alice, BobBob, Charlie
    this diff: Alice, Bob, Charlie

    - Alice Bob Charlie's Angels And David
    + Alice Bob Charlie David

    simplediff: Alice Bob Charlie's Angels AndCharlie David
    this diff: Alice Bob Charlie's Angels And David

  • Same as simplediff:

    • hunks you wouldn't want simplified:

      - Alice Bob Charlie
      + Alice Robert Charlie

      diff: Alice Bob Robert Charlie

    • if the change isn't only at the beginning or end:

      - Alice Bob Charlie
      + Alice Blob Charlie

      diff: Alice Bob Blob Charlie

      - Alice Bobby Charlie
      + Alice bb Charlie

      diff: Alice Bobby bb Charlie

      - Alice Zeneca Charlie
      + Alice AstraZeneca Charlie's

      diff: Alice Zeneca Charlie AstraZeneca Charlie's

If a hunk consists of only deleting or only adding at the beginning
or end of a word, then combine them into one hunk.

Examples:
  Changes from simplediff:
    - Alice, Bob and Charlie
    + Alice, Bob, and Charlie
    simplediff: Alice, <del>Bob</del><ins>Bob,</ins> Charlie
    this diff:  Alice, Bob<del>,</del> Charlie

    - Alice Bob Charlie's Angels And David
    + Alice Bob Charlie David
    simplediff: Alice Bob <del>Charlie's Angels And</del><ins>Charlie</ins> David
    this diff:  Alice Bob Charlie<del>'s Angels And</del> David

    Same as simplediff:
      hunks you wouldn't want simplified:
        - Alice Bob Charlie
        + Alice Robert Charlie
        diff: Alice <del>Bob</del> <ins>Robert</ins> Charlie

        if the change isn't only at the beginning or end:
          - Alice Bob Charlie
          + Alice Blob Charlie
          diff: Alice <del>Bob</del> <ins>Bob</ins> Charlie

          - Alice Bobby Charlie
          + Alice bb Charlie
          diff: Alice <del>Bobby</del> <ins>bb</ins>

          - Alice Zeneca Charlie
          + Alice AstraZeneca Charlie's
          diff: Alice <del>Zeneca Charlie</del> <ins>AstraZeneca Charlie's</ins>
@j-e-d
Copy link
Owner

j-e-d commented May 5, 2021

Thanks for the PR @laughinghan, let me find some time to test it locally and I will merge it if it looks good.

@j-e-d
Copy link
Owner

j-e-d commented May 5, 2021

@laughinghan the problem I have with your proposed patch is with cases like the following:

old = "Provide them Clues"
new = "Provide the Clues"
    
assert html_diff_original(old, new) == 'Provide <del>them</del> <ins>the</ins> Clues'
assert html_diff_proposed(old, new) == 'Provide <del>them</del> <ins>the</ins> Clues'

The assert with the current html_diff is what I would expect, the word was changed from "them" to "the" and I want the diff to show the whole word, what I get when I run that test with the proposed change is:

E       AssertionError: assert 'Provide the<...m</del> Clues' == 'Provide <del...e</ins> Clues'
E         - Provide <del>them</del> <ins>the</ins> Clues
E         + Provide the<del>m</del> Clues

I would be OK and like that the algorithm didn't consider punctuation marks as part of a word, but I don't want words to be cut into parts when they are changed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants