-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-92734: Add indentation feature to reprlib.Repr #92735
Conversation
Add an indentation feature to the reprlib.Repr class. This includes the actual implementation as well as added tests and an attribute description for the docs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't found any issues with the code or the tests. The docs are my only concern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I provided a suggested revision of the documentation section, as you requested on Discourse.
Additionally, some quick testing of a few simple cases locally reveals several apparent issues and odd behaviors, which really should be handled properly (and captured in the tests, which currently are very limited):
-
Setting
indent = 0
makes any recursive objects silently not print anything, instead of being a natural extension of the positive integer case as it should be and I'd expect, with line breaks but no indents (or at the absolute minimum raising an error). Example:>>> example = [1, 'spam', {'eggs': 1, 'ham': 2}] >>> reprlib.aRepr.indent = 0 >>> print(reprlib.repr(example)) []
What I'd expect instead, of course, is
[ 1, 'spam', { 'eggs': 1, 'ham': 2, }, ]
-
Even more unexpectedly, the same happens for an empty string
''
, which from the perspective of a user is even more confusing, and likewise should work as one would expect -
Furthermore, negative integers also produce the same behavior, instead of either being an extension of the non-negative case (indenting less nested objects more to the left, with the indent level zero being the most deeply nested object—which I don't think would be useful enough to justify the implementation complexity) or raising an error at any point (which it probably should).
-
Also, passing
False
produces the same result, whileTrue
is equivalent to the integer1
—presuming the former case is fixed to work like0
, a seasoned Python user might anticipate and understand this, but I worry that it might confuse users who expectFalse
to act likeNone
andTrue
to use some default/automatic value, e.g. 4 spaces. My suggestion would be to raise an explicit error in this edge case rather than trying to guess at the user's intent, though I'm not sure what's most idiomatic here. -
Objects that clearly won't work for
indent
, like[]
,{"test": 1}
,4.2
and arbitraryobject()
s produce two different possibleTypeError
s (one for objects that don't support__mul__
with anint
, and one for those that can but cannot be__add__
ed with astr
), neither of which are necessarily clear to the user as to the root cause, and only when actually used to indent. IMO, it would be better to catch this error andraise from
a more helpful one, e.g.Repr.indent must be int or str, not {type(indent)}
. You could also use a@property
to do validation onindent
at set-time, but that's not very idiomatic, at least for the stdlib.
Thank you very much for the extensive feedback.
I agree. Do you have a suggestion in which way to add more cases? Because the best solution I can come up with, would be to simply add more explicit tests for sample cases and edge cases and compare the expected string to the return value. You can see that I tried to add a more "sophisticated" test, based on the idea that after removing whitespace and commas the output should be identical to the output produced with
Good catch! :) I didn't think of this edge case.
The other attributes don't have any explicit error handling either. That's why I kept it out of the code, but I am in favor of adding it, due to exactly the situations you described. 👍 If
I agree with this, too. Therefore, I wrote "non-negative int" in the |
Co-authored-by: CAM Gerlach <[email protected]> Co-authored-by: CAM Gerlach <[email protected]>
To be honest, every other project I work on uses Pytest, where I'd just use the That said, here's a stab at it, based on skimming over a few other examples in the CPython test suite (e.g. Putting it all together, that would look something like this: import textwrap
...
def test_valid_indent(self):
test_obj = [1, 'spam', {'eggs': True, 'ham': []}]
test_cases = {
None: """[1, 'spam', {'eggs': True, 'ham': []}]""",
4: ("""
[
1,
'spam',
{
'eggs': True,
'ham': [],
},
]
"""),
'-->': ("""
[
-->1,
-->'spam',
-->{
-->-->'eggs': True,
-->-->'ham': [],
-->},
]
"""),
# Etc.
}
for indent, expected_repr in test_cases.items():
with self.subTest(indent=indent):
repr_obj = Repr()
repr_obj.indent = indent
actual_repr = repr_obj.repr(test_obj).strip('\n')
expected_repr = textwrap.dedent(expected_repr).strip('\n')
self.assertEqual(actual_repr, expected_repr)
def test_invalid_indent(self):
test_obj = [1, 'spam', {'eggs': True, 'ham': []}]
test_cases = {
-2: (ValueError, '[Nn]egative|[Pp]ositive'),
(4,): (TypeError, None),
# Etc.
}
for indent, (expected_error, expected_msg) in test_cases.items():
with self.subTest(indent=indent):
repr_obj = Repr()
repr_obj.indent = indent
expected_msg = expected_msg or f'{indent!r}'
with self.assertRaisesRegex(expected_error, expected_msg):
repr_obj.repr(test_obj)
Its really helpful to get in the habit of doing so. Pretty much any time you're testing functions like this, even just as a quick manual sanity check, you should try common cases like
Indeed; LGTM in manual testing.
Why not? They are two very different cases that need to be handled separately, an invalid type (TypeError) and an invalid value (ValueError), and you already have a |
Co-authored-by: CAM Gerlach <[email protected]>
I think we can reasonably skip this, in accordance with
But I don't feel strongly about it. So if you disagree, I'm happy to open a thread and ask for opinions if there's a different idiomatic solution that would be best to implement. (For both
I apologize for the misunderstanding. I am aware of My question was meant more like this: Theoretically, we could reverse the indentation process and test if the result matches the result that regular # Results are the same after removing whitespace and commas
y = [1, [2, "foo", b"bar", {"a": 1, "b": "abc def ghi", "c": {1: 2, 3: 4, 5: [], 6: {}}}], 3]
self.assertNotEqual(r1.repr(y), r2.repr(y))
self.assertEqual(
"".join(r1.repr(y).replace(",", "").split()),
"".join(r2.repr(y).replace(",", "").split())
) But, like I said before, I don't think it's worth it. Explicit tests of example cases should be fine? I was simply inquiring if you agreed with this or if you think I should work on the "indentation reversal test". :) Anyway, I've taken your suggestions and added some tests in 1d652a5. I also split it into |
I think you're right; its consistent with the behavior elsewhere and its not that big of a deal. You should, however, include
No need to apologize!
All looks good; aside from including |
👍 Tests added in 7cc9fc8. I assume, the rationale behind testing this undefined behavior - since the docs don't mention
The example in
That's a very good suggestion, thank you. :) Implemented in 61a6b5a. |
Looks good. It has a dual purpose, in my mind—it both tests the current behavior, and serves as de-facto CPython-developer-facing, machine-checkable verification of it; if it is changed, intentionally or not, then the test would provide clear indication of that and would ensure the tests are updated to cover the new behavior instead.
I don't have a problem with the size, though personally I'd separate the test data from the test code to make it cleaner to follow (in Pytest, I'd made it a fixture or at least a module-level constant). I was mostly concerned about any objections from others with more authority about it, since I'm not a core dev myself. But its ultimately up to you and whatever core dev does the final review and merge here. Otherwise, LGTM! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all your hard work and responsiveness to feedback, @finefoot . LGTM now and looking forward to see this merged; this should be ready for core dev review now (with one potential concern about mixing test data and code outstanding, though that can be left to a core dev for final judgment).
Tagging @rhettinger for review, since there's no one listed in the Experts Index or CODEOWNERS and he seems to be de-facto responsible for this module given his code contributions and his review of the most recent significant PR affecting it. |
Alright, makes sense. Let's see. :) @CAM-Gerlach Thanks for your patience, explanations and helpful feedback. I learned a lot 👍 |
Thank you to, so did I! I'm pretty new to most of this stuff as well and have only just started contributing to CPython proper in the past couple weeks, and pretty much all of that has been docs stuff so far. I'm also not even a "real" software developer, I just play one on the internet—in real life, I'm a NASA satellite scientist, heh. |
Thanks, and sorry for the delay; I've reached out to the core devs to take a look at this... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I find it icky that indent
is a field here and not an argument to the repr()
method. However, since the rest of the functionality in reprlib
provides a precedent for this design, I think we can live with that.
That being said, I have a few comments around the current implementation.
Misc/NEWS.d/next/Core and Builtins/2022-05-12-15-19-00.gh-issue-92734.d0wjDt.rst
Outdated
Show resolved
Hide resolved
…e-92734.d0wjDt.rst Co-authored-by: Łukasz Langa <[email protected]>
…e-92734.d0wjDt.rst to Misc/NEWS.d/next/Library/2022-05-12-15-19-00.gh-issue-92734.d0wjDt.rst
I guess this would make sense to add to the docs once it's getting close to getting merged and the Python version is foreseeable?
It doesn't get added automatically, right? |
Oops, I completely forgot the mention that—great catch. No, its not automatic, and as this is essentially certain to go in Python 3.12 now, we should just go ahead and added it. I'll make a suggestion to that effect. |
Co-authored-by: CAM Gerlach <[email protected]>
FYI, right now things are very intense due to the planned release of 3.11rc1, so if @ambv doesn't respond in a week or two, ping again and I can ping him or another core dev if there still isn't a response. |
#92734