unittest assertEqual difference output foiled by newlines #68968

Closed
cjerdonek opened this issue Aug 2, 2015 · 18 comments
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life easy stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@cjerdonek
Member

cjerdonek commented Aug 2, 2015

BPO 24780
Nosy @tim-one, @loewis, @rbtcollins, @ezio-melotti, @bitdancer, @voidspace, @cjerdonek, @elenaoat, @AnishShah, @tirkarthi, @nanjekyejoannah
PRs
  • bpo-24780: unittest assertEqual difference output foiled by newlines #11548
Files
  • test.py: Test case to reproduce the bug
  • test2.py
  • issue24780.patch
  • fix_24780.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-08-02.16:49:29.240>
    labels = ['3.7', '3.8', 'type-bug', 'library', 'easy']
    title = 'unittest assertEqual difference output foiled by newlines'
    updated_at = <Date 2019-01-14.15:30:04.394>
    user = 'https://github.com/cjerdonek'

    bugs.python.org fields:

    activity = <Date 2019-01-14.15:30:04.394>
    actor = 'xtreak'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2015-08-02.16:49:29.240>
    creator = 'chris.jerdonek'
    dependencies = []
    files = ['41636', '41639', '41782', '44679']
    hgrepos = []
    issue_num = 24780
    keywords = ['patch', 'easy']
    message_count = 14.0
    messages = ['247883', '247947', '258132', '258466', '259375', '259403', '261713', '275942', '276633', '333277', '333289', '333295', '333584', '333627']
    nosy_count = 13.0
    nosy_names = ['tim.peters', 'loewis', 'rbcollins', 'ezio.melotti', 'r.david.murray', 'michael.foord', 'chris.jerdonek', 'Elena.Oat', 'anish.shah', 'pynewbie', 'adchanw', 'xtreak', 'nanjekyejoannah']
    pr_nums = ['11548', '11548', '11548']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue24780'
    versions = ['Python 2.7', 'Python 3.7', 'Python 3.8']

    @cjerdonek
    Member Author

    When newlines are present, the error message displayed by unittest's self.assertEqual() to show where strings differ can be nonsensical. For example, the caret symbol can show up in a strange location.

    The first example below shows a case where things work correctly. The second shows a newline case with the confusing display.

    ======================================================================
    FAIL: test1
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/Users/chris/***/test.py", line 66, in test1
        self.assertEqual("abc", "abd")
    AssertionError: 'abc' != 'abd'
    - abc
    ?   ^
    + abd
    ?   ^

    ======================================================================
    FAIL: test2
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "/Users/chris/***/test.py", line 69, in test2
        self.assertEqual("\nabcx", "\nabdx")
    AssertionError: '\nabcx' != '\nabdx'
      
    - abcx?   ^
    + abdx?   ^

    @cjerdonek cjerdonek added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Aug 2, 2015
    @bitdancer
    Member

    In this particular case the problem is the lack of a trailing newline on the input string. It might be possible to improve the extended display algorithm by making sure there is a newline before the caret line, but care must be taken to account for the cases where one string ends with a newline and the other doesn't. I think this problem only applies to strings that have no trailing newline.
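
The effect of the trailing newline can be seen outside unittest by calling difflib.ndiff directly. This is only a sketch: the helper name ndiff_text is made up for illustration, and it assumes the strings are split with str.splitlines(keepends=True) before being diffed.

```python
import difflib


def ndiff_text(first, second):
    """Diff two strings line by line, keeping line endings as they are."""
    return "".join(
        difflib.ndiff(first.splitlines(keepends=True),
                      second.splitlines(keepends=True))
    )


if __name__ == "__main__":
    print(ndiff_text("\nabcx", "\nabdx"))      # no trailing newline
    print(ndiff_text("\nabcx\n", "\nabdx\n"))  # trailing newline
```

In the first call the "?" guide line ends up glued to the end of the "- abcx" line, because that line carries no newline of its own. In the second call each line is terminated and the guide line appears underneath.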

    @pynewbie
    Mannequin

    pynewbie mannequin commented Jan 13, 2016

    There is another case where the error message displayed by self.assertEqual() is weird.

    ======================================================================
    FAIL: test_newline_1 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 9, in test_newline_1
        self.assertEqual("\n abc", "\n abd")
    AssertionError: '\n abc' != '\n abd'
      
    -  abc?    ^
    +  abd?    ^

    ======================================================================
    FAIL: test_newline_2 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 12, in test_newline_2
        self.assertEqual("\nabc", "\nabd")
    AssertionError: '\nabc' != '\nabd'
      
    - abc+ abd

    The inputs differ only in the space after the newline ("\n abc" versus "\nabc"), yet the output differs: carets appear in the first case but not in the second.

    @Vgr255 Vgr255 mannequin added the tests Tests in the Lib/test dir label Jan 17, 2016
    @elenaoat
    Mannequin

    elenaoat mannequin commented Jan 17, 2016

    The issue is not related only to the caret. In fact, as the output below shows, the issue occurs anytime there is a newline character at the beginning or in the middle of the strings being compared.

    In short, if a newline is present at the beginning or in the middle of the string, a newline should also be put at the end of the string; this makes the output look sensible. If no newline is present at the end, the output is not really readable (a line break is missing).

    As we (Manvi B. and I) understand it, the caret appears in the output only when the strings are similar enough, i.e. their similarity ratio is high enough. Otherwise, the compare function does not show carets at the places of difference. This can also be seen in test case test_trailingnewline_2.
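
The similarity ratio can be checked with difflib.SequenceMatcher. The sketch below uses a made-up helper name, similarity, and it rests on an assumption: that the intraline caret display in difflib.Differ is triggered at a ratio of about 0.75, which is my reading of the implementation rather than a documented guarantee.

```python
import difflib


def similarity(a, b):
    """Return the difflib similarity ratio, 2*M/T, for two lines."""
    return difflib.SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    pairs = [("abc", "abd"), (" abc", " abd"),
             ("bcdf", "bddf"), ("bcdf", "bddg")]
    for a, b in pairs:
        print(repr(a), repr(b), round(similarity(a, b), 3))
```

The pairs that show carets in this thread (" abc" / " abd" and "bcdf" / "bddf") score 0.75, while the pairs without carets score lower. A trailing newline counts as a matching character too, which would explain why "abc\n" versus "abd\n" gets carets while "abc" versus "abd" as a bare last line does not.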

    This issue probably occurs because of the use of the splitlines method.

    FFFFFFFF
    ======================================================================
    FAIL: test_notrailingnewline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 8, in test_notrailingnewline_0
        self.assertEqual("abcDefehiJkl", "abcdefGhijkl")
    AssertionError: 'abcDefehiJkl' != 'abcdefGhijkl'
    - abcDefehiJkl
    ?    ^  ^  ^
    + abcdefGhijkl
    ?    ^  ^  ^

    ======================================================================
    FAIL: test_notrailingnewline_1 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 14, in test_notrailingnewline_1
        self.assertEqual("a\nbcdf", "a\nbddf")
    AssertionError: 'a\nbcdf' != 'a\nbddf'
      a
    - bcdf?  ^
    + bddf?  ^

    ======================================================================
    FAIL: test_notrailingnewline_2 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 18, in test_notrailingnewline_2
        self.assertEqual("a\nbcdf", "a\nbddg")
    AssertionError: 'a\nbcdf' != 'a\nbddg'
      a
    - bcdf+ bddg

    ======================================================================
    FAIL: test_starting_and_ending_newline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 12, in test_starting_and_ending_newline_0
        self.assertEqual("\nabcDefehiJkl\n", "\nabcdefGhijkl\n")
    AssertionError: '\nabcDefehiJkl\n' != '\nabcdefGhijkl\n'
      
    - abcDefehiJkl
    ?    ^  ^  ^
    + abcdefGhijkl
    ?    ^  ^  ^

    ======================================================================
    FAIL: test_startingnewline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 10, in test_startingnewline_0
        self.assertEqual("\nabcDefehiJkl", "\nabcdefGhijkl")
    AssertionError: '\nabcDefehiJkl' != '\nabcdefGhijkl'
      
    - abcDefehiJkl?    ^  ^  ^
    + abcdefGhijkl?    ^  ^  ^

    ======================================================================
    FAIL: test_trailingnewline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 6, in test_trailingnewline_0
        self.assertEqual("abcDefehiJkl\n", "abcdefGhijkl\n")
    AssertionError: 'abcDefehiJkl\n' != 'abcdefGhijkl\n'
    - abcDefehiJkl
    ?    ^  ^  ^
    + abcdefGhijkl
    ?    ^  ^  ^

    ======================================================================
    FAIL: test_trailingnewline_1 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 16, in test_trailingnewline_1
        self.assertEqual("a\nbcdf\n", "a\nbddf\n")
    AssertionError: 'a\nbcdf\n' != 'a\nbddf\n'
      a
    - bcdf
    ?  ^
    + bddf
    ?  ^

    ======================================================================
    FAIL: test_trailingnewline_2 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 20, in test_trailingnewline_2
        self.assertEqual("a\nbcdf\n", "a\nbddg\n")
    AssertionError: 'a\nbcdf\n' != 'a\nbddg\n'
      a
    - bcdf
    + bddg

    Ran 8 tests in 0.007s

    FAILED (failures=8)

    @AnishShah
    Mannequin

    AnishShah mannequin commented Feb 2, 2016

    I would like to work on this.

    @AnishShah
    Mannequin

    AnishShah mannequin commented Feb 2, 2016

    The problem is in the difflib.ndiff function. When the string does not have a trailing newline, we get unreadable output.
    After applying my patch, the following is the output of test2.py (submitted by Elena.Oat).

    FFFFFFFF
    ======================================================================
    FAIL: test_notrailingnewline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 8, in test_notrailingnewline_0
        self.assertEqual("abcDefehiJkl", "abcdefGhijkl")
    AssertionError: 'abcDefehiJkl' != 'abcdefGhijkl'
    - abcDefehiJkl
    ?    ^  ^  ^
    + abcdefGhijkl
    ?    ^  ^  ^

    ======================================================================
    FAIL: test_notrailingnewline_1 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 14, in test_notrailingnewline_1
        self.assertEqual("a\nbcdf", "a\nbddf")
    AssertionError: 'a\nbcdf' != 'a\nbddf'
      a
    - bcdf
    ?  ^
    + bddf
    ?  ^

    ======================================================================
    FAIL: test_notrailingnewline_2 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 18, in test_notrailingnewline_2
        self.assertEqual("a\nbcdf", "a\nbddg")
    AssertionError: 'a\nbcdf' != 'a\nbddg'
      a
    - bcdf
    + bddg

    ======================================================================
    FAIL: test_starting_and_ending_newline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 12, in test_starting_and_ending_newline_0
        self.assertEqual("\nabcDefehiJkl\n", "\nabcdefGhijkl\n")
    AssertionError: '\nabcDefehiJkl\n' != '\nabcdefGhijkl\n'
      
    - abcDefehiJkl
    ?    ^  ^  ^
    + abcdefGhijkl
    ?    ^  ^  ^

    ======================================================================
    FAIL: test_startingnewline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 10, in test_startingnewline_0
        self.assertEqual("\nabcDefehiJkl", "\nabcdefGhijkl")
    AssertionError: '\nabcDefehiJkl' != '\nabcdefGhijkl'
      
    - abcDefehiJkl
    ?    ^  ^  ^
    + abcdefGhijkl
    ?    ^  ^  ^

    ======================================================================
    FAIL: test_trailingnewline_0 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 6, in test_trailingnewline_0
        self.assertEqual("abcDefehiJkl\n", "abcdefGhijkl\n")
    AssertionError: 'abcDefehiJkl\n' != 'abcdefGhijkl\n'
    - abcDefehiJkl
    ?    ^  ^  ^
    + abcdefGhijkl
    ?    ^  ^  ^

    ======================================================================
    FAIL: test_trailingnewline_1 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 16, in test_trailingnewline_1
        self.assertEqual("a\nbcdf\n", "a\nbddf\n")
    AssertionError: 'a\nbcdf\n' != 'a\nbddf\n'
      a
    - bcdf
    ?  ^
    + bddf
    ?  ^

    ======================================================================
    FAIL: test_trailingnewline_2 (main.AssertEqualTest)
    ----------------------------------------------------------------------

    Traceback (most recent call last):
      File "test.py", line 20, in test_trailingnewline_2
        self.assertEqual("a\nbcdf\n", "a\nbddg\n")
    AssertionError: 'a\nbcdf\n' != 'a\nbddg\n'
      a
    - bcdf
    + bddg

    Ran 8 tests in 0.004s

    FAILED (failures=8)

    @rbtcollins
    Member

    Thanks for the patch; reviewed in Rietveld.

    @adchanw
    Mannequin

    adchanw mannequin commented Sep 12, 2016

    Is this still being worked on? I have a potential fix for this.

    @pppery pppery mannequin removed the tests Tests in the Lib/test dir label Sep 15, 2016
    @pppery pppery mannequin changed the title unittest assertEqual difference output foiled by newlines difflib.ndiff produces unreadable output when input missing trailing newline Sep 15, 2016
    @adchanw
    Mannequin

    adchanw mannequin commented Sep 15, 2016

    I've attached a potential fix for this issue.

    While trying to fix this, I noticed that I couldn't simply ensure that each line has a newline. If I always ensure each line in difflines has a newline, then the fourth test in testAssertMultilineEqual (in Lib/unittest/test/test_assertions.py) fails, because standardMsg in assertMultiLineEqual in Lib/unittest/case.py is just one line without a newline. To sidestep this problem, I ensure there is a newline for each line only if there is more than one line in difflines. However, I'm not sure that I can assume there should be a newline in cases similar to the fourth test (where longMessage is set to true and a 'msg' is passed) in testAssertMultilineEqual but where there is more than one line in standardMsg in assertMultiLineEqual.
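
The approach described above, terminating the ndiff output lines only when there is more than one of them, might look roughly like this. It is a sketch of the idea as described in the comment, not the attached patch: the function names are hypothetical and the real change lives inside unittest.

```python
import difflib


def join_difflines(difflines):
    """Join ndiff output, terminating every line only when there are several."""
    if len(difflines) > 1:
        difflines = [line if line.endswith("\n") else line + "\n"
                     for line in difflines]
    return "".join(difflines)


def diff_text(first, second):
    """Diff two strings and repair unterminated lines in the output."""
    difflines = list(difflib.ndiff(first.splitlines(keepends=True),
                                   second.splitlines(keepends=True)))
    return join_difflines(difflines)


if __name__ == "__main__":
    print(diff_text("\nabcx", "\nabdx"))
```

A single-element list is returned unchanged, which mirrors the special case mentioned for the one-line standardMsg.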

    @tirkarthi tirkarthi added 3.7 (EOL) end of life 3.8 (EOL) end of life labels Jan 9, 2019
    @tirkarthi
    Member

    Looking at the patch and the relevant function, this doesn't seem to be a problem with difflib.ndiff but with unittest's display algorithm. The current title causes confusion about the issue, and I propose changing the subject to reflect this unless the difflib maintainers think this is an issue with ndiff.

    @cjerdonek
    Member Author

    When I first created the issue, the title I chose was about unittest ("unittest assertEqual difference output foiled by newlines"), but someone else changed it for some reason. You're welcome to change it back to something more like the original.

    @tirkarthi
    Member

    Thanks @chris.jerdonek. I have reverted the title to the original report. Since CPython now accepts PRs, it would be great if one of the original authors could convert their patch to a PR with tests.

    @tirkarthi tirkarthi changed the title difflib.ndiff produces unreadable output when input missing trailing newline unittest assertEqual difference output foiled by newlines Jan 9, 2019
    @nanjekyejoannah
    Contributor

    I have opened a PR for this.

    @tirkarthi
    Member

    Sorry, I just stumbled upon bpo-2142, which is a similar report about unified_diff producing wrong output due to missing trailing newlines and may have been the reason the title was changed. But since there is now a PR towards adding a newline, I think it's good to fix this on the unittest side.

    @mblahay
    Contributor

    mblahay commented Apr 26, 2023

    The root cause of the issue is the way lines of text must be submitted to the difflib.ndiff function. The documentation states that ndiff is to receive two lists, each containing the lines to be compared. What is not stated, but rather inferred from the example provided with the documentation, is that each line MUST end with a newline. For what we are doing here with assertEqual, the problem appears when comparing strings composed of two or more lines where the final line is not terminated by a newline.

    Why doesn't this happen when comparing single lines? Surprisingly, when there is one line and the newline is missing, a newline is added. The addition of the newline simply doesn't occur when there is more than one line; I don't know why. The fix will be to change the logic to check the final line of each group for the newline and add one if necessary.
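
A rough sketch of that idea, normalizing the input before it reaches ndiff. The helper names lines_for_ndiff and diff_text are hypothetical, and this is not the code that was eventually merged; it only illustrates checking the final line of each group.

```python
import difflib


def lines_for_ndiff(text):
    """Split text into lines and make sure the final line ends with a newline."""
    lines = text.splitlines(keepends=True)
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"
    return lines


def diff_text(first, second):
    """Diff two strings after terminating the last line of each."""
    return "".join(difflib.ndiff(lines_for_ndiff(first),
                                 lines_for_ndiff(second)))


if __name__ == "__main__":
    print(diff_text("\nabc", "\nabd"))
```

One caveat with this naive version: "abc" and "abc\n" become identical after normalization, so a real fix has to take care when only one of the two strings ends with a newline, as noted earlier in the thread.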

    But why does the comparison of '\nabc' and '\nabd' result in garbled output? I know it may be hard to believe, but each of those strings represents two lines. Let's take a closer look at '\nabc': if you split the text using the str.splitlines function with the keepends=True option, you end up with '\n' and 'abc'. Because there are two lines, the missing newline is not appended to 'abc', and garbling ensues.
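
The split described above is easy to check interactively. The wrapper name split_keepends is only there for illustration.

```python
def split_keepends(text):
    """Thin wrapper around str.splitlines(keepends=True) for illustration."""
    return text.splitlines(keepends=True)


if __name__ == "__main__":
    for s in ["abc", "abc\n", "\nabc", "\nabc\n"]:
        print(repr(s), "->", split_keepends(s))
```

For '\nabc' the result has two elements, and the second one has no line ending, which is the line that ndiff then emits without a terminator.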

    I do hope this helps with the understanding of the issue.

    @mblahay
    Contributor

    mblahay commented Apr 26, 2023

    @adchanw, I took a look at your proposed fix. You are checking each line of the ndiff output and adding a newline if one is missing. Since the root of the problem is the input to the ndiff function, it is better to handle the problem there.

    @mblahay
    Contributor

    mblahay commented Apr 26, 2023

    @AnishShah, your proposed fix seems to be more in line with what the fix should be. I'm going to create a pull request to accomplish something similar. There are opportunities to optimize your changes.

    carljm added a commit to carljm/cpython that referenced this issue May 5, 2023
    * main: (61 commits)
      pythongh-64595: Argument Clinic: Touch source file if any output file changed (python#104152)
      pythongh-64631: Test exception messages in cloned Argument Clinic funcs (python#104167)
      pythongh-68395: Avoid naming conflicts by mangling variable names in Argument Clinic (python#104065)
      pythongh-64658: Expand Argument Clinic return converter docs (python#104175)
      pythonGH-103092: port `_asyncio` freelist to module state (python#104196)
      pythongh-104051: fix crash in test_xxtestfuzz with -We (python#104052)
      pythongh-104190: fix ubsan crash (python#104191)
      pythongh-104106: Add gcc fallback of mkfifoat/mknodat for macOS (pythongh-104129)
      pythonGH-104142: Fix _Py_RefcntAdd to respect immortality (pythonGH-104143)
      pythongh-104112: link from cached_property docs to method-caching FAQ (python#104113)
      pythongh-68968: Correcting message display issue with assertEqual (python#103937)
      pythonGH-103899: Provide a hint when accidentally calling a module (pythonGH-103900)
      pythongh-103963: fix 'make regen-opcode' in out-of-tree builds (python#104177)
      pythongh-102500: Add PEP 688 and 698 to the 3.12 release highlights (python#104174)
      pythonGH-81079: Add case_sensitive argument to `pathlib.Path.glob()` (pythonGH-102710)
      pythongh-91896: Deprecate collections.abc.ByteString (python#102096)
      pythongh-99593: Add tests for Unicode C API (part 2) (python#99868)
      pythongh-102500: Document PEP 688 (python#102571)
      pythongh-102500: Implement PEP 688 (python#102521)
      pythongh-96534: socketmodule: support FreeBSD divert(4) socket (python#96536)
      ...
    @itamaro
    Contributor

    itamaro commented Sep 10, 2023

    It seems gh-103937 fixes this issue.
    Closing, but please reopen if there's additional work to be done here!
