unittest assertEqual difference output foiled by newlines #68968
When newlines are present, the error message displayed by unittest's self.assertEqual() to show where strings differ can be nonsensical. For example, the caret symbol can show up in a strange location. The first example below shows a case where things work correctly. The second shows a newline case with the confusing display.
======================================================================
Traceback (most recent call last):
File "/Users/chris/***/test.py", line 66, in test1
self.assertEqual("abc", "abd")
AssertionError: 'abc' != 'abd'
- abc
? ^
+ abd
? ^
======================================================================
Traceback (most recent call last):
File "/Users/chris/***/test.py", line 69, in test2
self.assertEqual("\nabcx", "\nabdx")
AssertionError: '\nabcx' != '\nabdx'
- abcx? ^
+ abdx? ^
In this particular case the problem is the lack of a trailing newline on the input string. It might be possible to improve the extended display algorithm by making sure there is a newline before the caret line, but care must be taken to account for cases where one string ends with a newline and the other doesn't. I think this problem only applies to strings that have no trailing newline.
There is another case where the error message displayed by self.assertEqual() is weird.
======================================================================
Traceback (most recent call last):
File "test.py", line 9, in test_newline_1
self.assertEqual("\n abc", "\n abd")
AssertionError: '\n abc' != '\n abd'
- abc? ^
+ abd? ^
======================================================================
Traceback (most recent call last):
File "test.py", line 12, in test_newline_2
self.assertEqual("\nabc", "\nabd")
AssertionError: '\nabc' != '\nabd'
- abc+ abd
There is a difference between "\nabc" and "\n abc", hence the difference in the output.
The issue is not limited to the caret. As the output below shows, it occurs whenever there is a newline character at the beginning or in the middle of the strings being compared. In short, if a string contains a newline at the beginning or in the middle, it needs a newline at the end as well for the output to look sensible; without the trailing newline, the output is hard to read because a line break is missing. As we (Manvi B. and I) understand it, the caret appears in the output only when the strings are similar enough, i.e. their similarity ratio is high enough. Otherwise, the compare function doesn't show carets at the points of difference. This can also be seen in test case test_trailingnewline_2. The issue is probably caused by the use of the splitlines method.
FFFFFFFF
Traceback (most recent call last):
File "test.py", line 8, in test_notrailingnewline_0
self.assertEqual("abcDefehiJkl", "abcdefGhijkl")
AssertionError: 'abcDefehiJkl' != 'abcdefGhijkl'
- abcDefehiJkl
? ^ ^ ^
+ abcdefGhijkl
? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 14, in test_notrailingnewline_1
self.assertEqual("a\nbcdf", "a\nbddf")
AssertionError: 'a\nbcdf' != 'a\nbddf'
a
- bcdf? ^
+ bddf? ^
======================================================================
Traceback (most recent call last):
File "test.py", line 18, in test_notrailingnewline_2
self.assertEqual("a\nbcdf", "a\nbddg")
AssertionError: 'a\nbcdf' != 'a\nbddg'
a
- bcdf+ bddg
======================================================================
Traceback (most recent call last):
File "test.py", line 12, in test_starting_and_ending_newline_0
self.assertEqual("\nabcDefehiJkl\n", "\nabcdefGhijkl\n")
AssertionError: '\nabcDefehiJkl\n' != '\nabcdefGhijkl\n'
- abcDefehiJkl
? ^ ^ ^
+ abcdefGhijkl
? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 10, in test_startingnewline_0
self.assertEqual("\nabcDefehiJkl", "\nabcdefGhijkl")
AssertionError: '\nabcDefehiJkl' != '\nabcdefGhijkl'
- abcDefehiJkl? ^ ^ ^
+ abcdefGhijkl? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 6, in test_trailingnewline_0
self.assertEqual("abcDefehiJkl\n", "abcdefGhijkl\n")
AssertionError: 'abcDefehiJkl\n' != 'abcdefGhijkl\n'
- abcDefehiJkl
? ^ ^ ^
+ abcdefGhijkl
? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 16, in test_trailingnewline_1
self.assertEqual("a\nbcdf\n", "a\nbddf\n")
AssertionError: 'a\nbcdf\n' != 'a\nbddf\n'
a
- bcdf
? ^
+ bddf
? ^
======================================================================
Traceback (most recent call last):
File "test.py", line 20, in test_trailingnewline_2
self.assertEqual("a\nbcdf\n", "a\nbddg\n")
AssertionError: 'a\nbcdf\n' != 'a\nbddg\n'
a
- bcdf
+ bddg
Ran 8 tests in 0.007s
FAILED (failures=8)
I would like to work on this.
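The similarity-ratio observation above can be checked directly with difflib.SequenceMatcher. This is a minimal sketch, assuming the cutoff of roughly 0.75 that CPython's difflib applies internally when deciding whether to emit a "?" guide line; that cutoff is an implementation detail, not a documented API, and the `similarity` helper is hypothetical.

```python
import difflib


def similarity(a: str, b: str) -> float:
    """Return difflib's similarity ratio for two strings (0.0 to 1.0)."""
    return difflib.SequenceMatcher(None, a, b).ratio()


# Final lines from test_notrailingnewline_1: one character differs.
close = similarity("bcdf", "bddf")

# Final lines from test_notrailingnewline_2: two characters differ.
far = similarity("bcdf", "bddg")

print(close, far)
```

The first pair scores 0.75 and the second 0.5, which is consistent with carets appearing for test_notrailingnewline_1 but not for test_notrailingnewline_2.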
The problem is in
FFFFFFFF
Traceback (most recent call last):
File "test.py", line 8, in test_notrailingnewline_0
self.assertEqual("abcDefehiJkl", "abcdefGhijkl")
AssertionError: 'abcDefehiJkl' != 'abcdefGhijkl'
- abcDefehiJkl
? ^ ^ ^
+ abcdefGhijkl
? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 14, in test_notrailingnewline_1
self.assertEqual("a\nbcdf", "a\nbddf")
AssertionError: 'a\nbcdf' != 'a\nbddf'
a
- bcdf
? ^
+ bddf
? ^
======================================================================
Traceback (most recent call last):
File "test.py", line 18, in test_notrailingnewline_2
self.assertEqual("a\nbcdf", "a\nbddg")
AssertionError: 'a\nbcdf' != 'a\nbddg'
a
- bcdf
+ bddg
======================================================================
Traceback (most recent call last):
File "test.py", line 12, in test_starting_and_ending_newline_0
self.assertEqual("\nabcDefehiJkl\n", "\nabcdefGhijkl\n")
AssertionError: '\nabcDefehiJkl\n' != '\nabcdefGhijkl\n'
- abcDefehiJkl
? ^ ^ ^
+ abcdefGhijkl
? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 10, in test_startingnewline_0
self.assertEqual("\nabcDefehiJkl", "\nabcdefGhijkl")
AssertionError: '\nabcDefehiJkl' != '\nabcdefGhijkl'
- abcDefehiJkl
? ^ ^ ^
+ abcdefGhijkl
? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 6, in test_trailingnewline_0
self.assertEqual("abcDefehiJkl\n", "abcdefGhijkl\n")
AssertionError: 'abcDefehiJkl\n' != 'abcdefGhijkl\n'
- abcDefehiJkl
? ^ ^ ^
+ abcdefGhijkl
? ^ ^ ^
======================================================================
Traceback (most recent call last):
File "test.py", line 16, in test_trailingnewline_1
self.assertEqual("a\nbcdf\n", "a\nbddf\n")
AssertionError: 'a\nbcdf\n' != 'a\nbddf\n'
a
- bcdf
? ^
+ bddf
? ^
======================================================================
Traceback (most recent call last):
File "test.py", line 20, in test_trailingnewline_2
self.assertEqual("a\nbcdf\n", "a\nbddg\n")
AssertionError: 'a\nbcdf\n' != 'a\nbddg\n'
a
- bcdf
+ bddg
Ran 8 tests in 0.004s
FAILED (failures=8)
Thanks for the patch; reviewed in Rietveld.
Is this still being worked on? I have a potential fix for this.
I've attached a potential fix for this issue. While working on it, I found that I couldn't simply ensure that every line has a newline. If I always ensure that each line in difflines ends with a newline, the fourth test in testAssertMultilineEqual (in Lib/unittest/test/test_assertions.py) fails, because standardMsg in assertMultiLineEqual in Lib/unittest/case.py is just one line without a newline. To sidestep this, I add the newline to each line only if there is more than one line in difflines. However, I'm not sure I can assume there should be a newline in cases similar to the fourth test (where longMessage is set to true and a 'msg' is passed) but where standardMsg in assertMultiLineEqual has more than one line.
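The output-side approach described in this comment could look roughly like the sketch below. It is not the attached patch: the `join_difflines` helper is hypothetical, and it only illustrates the "terminate each diff line, but only when there is more than one" rule, using difflib.ndiff directly rather than unittest internals.

```python
import difflib


def join_difflines(difflines):
    """Hypothetical helper: newline-terminate each diff line, but only
    when the diff has more than one line."""
    if len(difflines) > 1:
        difflines = [line if line.endswith("\n") else line + "\n"
                     for line in difflines]
    return "".join(difflines)


# Multi-line input whose last line has no trailing newline.
difflines = list(difflib.ndiff("a\nbcdf".splitlines(keepends=True),
                               "a\nbddg".splitlines(keepends=True)))
multi = join_difflines(difflines)

# A single-line message is left untouched.
single = join_difflines(["- only line"])

print(multi)
```

With this rule the "- bcdf" and "+ bddg" lines no longer run together, while a one-line message keeps its original form.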
Looking at the patch and the relevant function, this doesn't seem to be a problem with difflib.ndiff but with unittest's display algorithm. This causes confusion about the issue, and I propose changing the subject to reflect this unless the difflib maintainers think this is an issue with ndiff.
When I first created the issue, the title I chose was about unittest ("unittest assertEqual difference output foiled by newlines"), but someone else changed it for some reason. You're welcome to change it back to something more like the original.
Thanks @chris.jerdonek. I have reverted the title to the original report. Since CPython now accepts PRs, it would be great if any of the original authors could convert their patch to a PR with tests.
I have opened a PR for this.
Sorry, I just stumbled upon bpo-2142, which is a similar report about unified_diff producing wrong output due to missing trailing newlines and could have been the original reason the title was changed. But since there is now a PR that adds a newline, I think it's good to fix this on the unittest side.
The root cause of the issue is the way lines of text must be submitted to the difflib.ndiff function. The documentation states that ndiff receives two lists, each containing the lines to be compared. What is not stated, but can be inferred from the example in the documentation, is that each line MUST end with a newline. For what we are doing here with assertEqual, the problem appears when comparing strings made up of two or more lines where the final line is not terminated by a newline. Why doesn't this happen when comparing single lines? Surprisingly, when there is one line and the newline is missing, a newline is added. That addition simply doesn't occur when there is more than one line; I don't know why. The fix will be to change the logic to check the final line of each group for the newline and add it if necessary. But why does the comparison of '\nabc' and '\nabd' result in garbled output? It may be hard to believe, but each of those strings represents two lines. Let's take a closer look at '\nabc': if you split the text using the str.splitlines function with the keepends=True option, you end up with '\n' and 'abc'. Because there are two lines, the missing newline is not appended to 'abc', and garbling ensues. I hope this helps with understanding the issue.
@adchanw, I took a look at your proposed fix. You are checking each line of the ndiff output and adding a newline if it is missing. Since the root of the problem is the input to the ndiff function, it is better to handle the problem there.
@AnishShah, your proposed fix seems to be more in line with what the fix should be. I'm going to create a pull request to accomplish something similar. There are opportunities to optimize your changes.
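The explanation above can be reproduced outside unittest with difflib alone. This is a minimal sketch that feeds the split lines straight to difflib.ndiff and joins the result; it illustrates the mechanism only and is not the code path in Lib/unittest/case.py.

```python
import difflib

first, second = "\nabc", "\nabd"

# Each string is two lines: an empty first line and an unterminated last line.
first_lines = first.splitlines(keepends=True)    # ['\n', 'abc']
second_lines = second.splitlines(keepends=True)  # ['\n', 'abd']

# ndiff prefixes each line with a two-character tag but does not append a
# newline, so the unterminated lines run together when joined.
diff = "".join(difflib.ndiff(first_lines, second_lines))
print(repr(diff))
```

The joined result contains "- abc+ abd" on a single line, matching the garbled display reported earlier in this thread.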
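Handling the problem on the input side, as suggested here, might be sketched as follows. The `split_terminated` helper is hypothetical and is not the change that was eventually merged; it only shows the idea of terminating the final line before calling difflib.ndiff.

```python
import difflib


def split_terminated(text: str) -> list[str]:
    """Hypothetical helper: split into lines, each ending with a newline."""
    lines = text.splitlines(keepends=True)
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"
    return lines


diff = "".join(difflib.ndiff(split_terminated("a\nbcdf"),
                             split_terminated("a\nbddg")))
print(diff)
```

One trade-off of this sketch: a string that ends with a newline and one that does not become indistinguishable after splitting, which is the case an earlier comment warns must be accounted for.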
* main: (61 commits) pythongh-64595: Argument Clinic: Touch source file if any output file changed (python#104152) pythongh-64631: Test exception messages in cloned Argument Clinic funcs (python#104167) pythongh-68395: Avoid naming conflicts by mangling variable names in Argument Clinic (python#104065) pythongh-64658: Expand Argument Clinic return converter docs (python#104175) pythonGH-103092: port `_asyncio` freelist to module state (python#104196) pythongh-104051: fix crash in test_xxtestfuzz with -We (python#104052) pythongh-104190: fix ubsan crash (python#104191) pythongh-104106: Add gcc fallback of mkfifoat/mknodat for macOS (pythongh-104129) pythonGH-104142: Fix _Py_RefcntAdd to respect immortality (pythonGH-104143) pythongh-104112: link from cached_property docs to method-caching FAQ (python#104113) pythongh-68968: Correcting message display issue with assertEqual (python#103937) pythonGH-103899: Provide a hint when accidentally calling a module (pythonGH-103900) pythongh-103963: fix 'make regen-opcode' in out-of-tree builds (python#104177) pythongh-102500: Add PEP 688 and 698 to the 3.12 release highlights (python#104174) pythonGH-81079: Add case_sensitive argument to `pathlib.Path.glob()` (pythonGH-102710) pythongh-91896: Deprecate collections.abc.ByteString (python#102096) pythongh-99593: Add tests for Unicode C API (part 2) (python#99868) pythongh-102500: Document PEP 688 (python#102571) pythongh-102500: Implement PEP 688 (python#102521) pythongh-96534: socketmodule: support FreeBSD divert(4) socket (python#96536) ...
It seems gh-103937 fixes this issue.