gh-92734: Add indentation feature to reprlib.Repr #92735

finefoot · 2022-05-12T15:23:57Z

Add an indentation feature to the reprlib.Repr class. This includes the actual implementation as well as added tests and an attribute description for the docs.

zmievsa

I haven't found any issues with the code or the tests. The docs are my only concern.

Doc/library/reprlib.rst

CAM-Gerlach

I provided a suggested revision of the documentation section, as you requested on Discourse.

Additionally, some quick testing of a few simple cases locally reveals several apparent issues and odd behaviors, which really should be handled properly (and captured in the tests, which currently are very limited):

Setting indent = 0 makes any recursive objects silently not print anything, instead of being a natural extension of the positive integer case as it should be and I'd expect, with line breaks but no indents (or at the absolute minimum raising an error). Example:
```
>>> example = [1, 'spam', {'eggs': 1, 'ham': 2}]
>>> reprlib.aRepr.indent = 0
>>> print(reprlib.repr(example))
[]
```
What I'd expect instead, of course, is
```
[
1,
'spam',
{
'eggs': 1,
'ham': 2,
},
]
```
Even more unexpectedly, the same happens for an empty string '', which from the perspective of a user is even more confusing, and likewise should work as one would expect
Furthermore, negative integers also produce the same behavior, instead of either being an extension of the non-negative case (indenting less nested objects more to the left, with the indent level zero being the most deeply nested object—which I don't think would be useful enough to justify the implementation complexity) or raising an error at any point (which it probably should).
Also, passing False produces the same result, while True is equivalent to the integer 1—presuming the former case is fixed to work like 0, a seasoned Python user might anticipate and understand this, but I worry that it might confuse users who expect False to act like None and True to use some default/automatic value, e.g. 4 spaces. My suggestion would be to raise an explicit error in this edge case rather than trying to guess at the user's intent, though I'm not sure what's most idiomatic here.
Objects that clearly won't work for indent, like [], {"test": 1}, 4.2 and arbitrary object()s produce two different possible TypeErrors (one for objects that don't support __mul__ with an int, and one for those that can but cannot be __add__ed with a str), neither of which are necessarily clear to the user as to the root cause, and only when actually used to indent. IMO, it would be better to catch this error and raise from a more helpful one, e.g. Repr.indent must be int or str, not {type(indent)}. You could also use a @property to do validation on indent at set-time, but that's not very idiomatic, at least for the stdlib.

Doc/library/reprlib.rst

Lib/test/test_reprlib.py

finefoot · 2022-05-14T19:17:16Z

Thank you very much for the extensive feedback.

and captured in the tests, which currently are very limited

I agree. Do you have a suggestion in which way to add more cases? Because the best solution I can come up with, would be to simply add more explicit tests for sample cases and edge cases and compare the expected string to the return value. You can see that I tried to add a more "sophisticated" test, based on the idea that after removing whitespace and commas the output should be identical to the output produced with Repr.indent = None. But I guess that's not really helping much and I can just remove it again, in favor of more explicit tests?

Setting indent = 0 makes any recursive objects silently not print anything

Good catch! :) I didn't think of this edge case. Repr.indent = "" (and subsequently Repr.indent = 0) should be fixed by f8a2db1. The reason was -len(indent) evaluating to -0, which is 0, which obviously doesn't mean "cut zero characters from end of sequence" but "cut everything until character position 0". Instead, None works for zero-length indentation strings.

IMO, it would be better to catch this error and raise from a more helpful one, e.g. Repr.indent must be int or str, not {type(indent)}. You could also use a @property to do validation on indent at set-time, but that's not very idiomatic, at least for the stdlib.

The other attributes don't have any explicit error handling either. That's why I kept it out of the code, but I am in favor of adding it, due to exactly the situations you described. 👍 If @property isn't idiomatic for the stdlib as you say, I've implemented simple LBYL error handling in 721357c. I noticed that you mentioned raise from, yet I don't think it's necessary (i.e. helpful) to catch the exception from an EAFP approach; and we'd have to LBYL due to disallowing negative integers anyway. Am I overlooking something?

Furthermore, negative integers (...) indenting less nested objects more to the left, with the indent level zero being the most deeply nested object—which I don't think would be useful enough to justify the implementation complexity

I agree with this, too. Therefore, I wrote "non-negative int" in the ValueError message, above.

Co-authored-by: CAM Gerlach <[email protected]> Co-authored-by: CAM Gerlach <[email protected]>

Lib/reprlib.py

CAM-Gerlach · 2022-05-15T01:49:16Z

I agree. Do you have a suggestion in which way to add more cases?

To be honest, every other project I work on uses Pytest, where I'd just use the @pytest.parameterize() decorator for this, and I'm not terribly familiar with the idiomatic unittest approach—especially since the construct that I believe is most appropriate here, subtests, is the one main unittest feature that Pytest does not natively support.

That said, here's a stab at it, based on skimming over a few other examples in the CPython test suite (e.g. test_codecs.py). To summarize, you want methods for each major distinction in behavior (e.g. formatted output, raising an exception), and within each test method, you can use unittest subtests, with one subtest per test case. iterating over a mapping of your inputs to expected outputs, with a key you can use to name the subtests.

Putting it all together, that would look something like this:

import textwrap

...

    def test_valid_indent(self):
        test_obj = [1, 'spam', {'eggs': True, 'ham': []}]
        test_cases = {
            None: """[1, 'spam', {'eggs': True, 'ham': []}]""",
            4: ("""
                [
                    1,
                    'spam',
                    {
                        'eggs': True,
                        'ham': [],
                    },
                ]
                """),
            '-->': ("""
                [
                -->1,
                -->'spam',
                -->{
                -->-->'eggs': True,
                -->-->'ham': [],
                -->},
                ]
                """),
            # Etc.
        }

        for indent, expected_repr in test_cases.items():
            with self.subTest(indent=indent):
                repr_obj = Repr()
                repr_obj.indent = indent
                actual_repr = repr_obj.repr(test_obj).strip('\n')
                expected_repr = textwrap.dedent(expected_repr).strip('\n')
                self.assertEqual(actual_repr, expected_repr)

    def test_invalid_indent(self):
        test_obj = [1, 'spam', {'eggs': True, 'ham': []}]
        test_cases = {
            -2: (ValueError, '[Nn]egative|[Pp]ositive'),
            (4,): (TypeError, None),
            # Etc.
        }
        for indent, (expected_error, expected_msg) in test_cases.items():
            with self.subTest(indent=indent):
                repr_obj = Repr()
                repr_obj.indent = indent
                expected_msg = expected_msg or f'{indent!r}'
                with self.assertRaisesRegex(expected_error, expected_msg):
                    repr_obj.repr(test_obj)

Good catch! :) I didn't think of this edge case.

Its really helpful to get in the habit of doing so. Pretty much any time you're testing functions like this, even just as a quick manual sanity check, you should try common cases like 1, 0, -1, True, False, floats, empty strings, empty collections, arbitrary objects, etc. as appropriate for your use case, and make sure they behave how you expect (whether that's normal output or raising an appropriate error).

Repr.indent = "" (and subsequently Repr.indent = 0) should be fixed by f8a2db1.

Indeed; LGTM in manual testing.

I noticed that you mentioned raise from, yet I don't think it's necessary (i.e. helpful) to catch the exception from an EAFP approach; and we'd have to LBYL due to disallowing negative integers anyway. Am I overlooking something?

Why not? They are two very different cases that need to be handled separately, an invalid type (TypeError) and an invalid value (ValueError), and you already have a if isinstance(indent, int) block anyway for the invalid value (negative int) case. See above for a better approach.

Co-authored-by: CAM Gerlach <[email protected]>

Lib/reprlib.py

Co-authored-by: CAM Gerlach <[email protected]>

finefoot · 2022-05-15T22:22:36Z

Also, passing False produces the same result, while True is equivalent to the integer 1—presuming the former case is fixed to work like 0, a seasoned Python user might anticipate and understand this, but I worry that it might confuse users who expect False to act like None and True to use some default/automatic value, e.g. 4 spaces. My suggestion would be to raise an explicit error in this edge case rather than trying to guess at the user's intent, though I'm not sure what's most idiomatic here.

I think we can reasonably skip this, in accordance with json.dumps, for example:

>>> print(json.dumps([[1]], indent=True))
[
 [
  1
 ]
]
>>> print(json.dumps([[1]], indent=False))
[
[
1
]
]

But I don't feel strongly about it. So if you disagree, I'm happy to open a thread and ask for opinions if there's a different idiomatic solution that would be best to implement. (For both json.dumps and reprlib.Repr in that case, I guess.)

I agree. Do you have a suggestion in which way to add more cases?

To be honest, every other project I work on uses Pytest, where I'd just use the @pytest.parameterize() decorator for this (...) That said, here's a stab at it, based on skimming over a few other examples in the CPython test suite

I apologize for the misunderstanding. I am aware of unittest.TestCase.subTest but simply didn't use it because I wanted to keep the indent tests in accordance with the preexisting tests in test_reprlib.py. I was under the impression that document coherence is more important. I am fine with using subtests. 👍

My question was meant more like this: Theoretically, we could reverse the indentation process and test if the result matches the result that regular Repr.indent = None produces. The rough idea was like this:

# Results are the same after removing whitespace and commas
y = [1, [2, "foo", b"bar", {"a": 1, "b": "abc def ghi", "c": {1: 2, 3: 4, 5: [], 6: {}}}], 3]
self.assertNotEqual(r1.repr(y), r2.repr(y))
self.assertEqual(
    "".join(r1.repr(y).replace(",", "").split()),
    "".join(r2.repr(y).replace(",", "").split())
)

But, like I said before, I don't think it's worth it. Explicit tests of example cases should be fine? I was simply inquiring if you agreed with this or if you think I should work on the "indentation reversal test". :)

Anyway, I've taken your suggestions and added some tests in 1d652a5. I also split it into test_valid_indent and test_invalid_indent, like you proposed.

CAM-Gerlach · 2022-05-16T00:43:45Z

I think we can reasonably skip this, in accordance with json.dumps, for example:

I think you're right; its consistent with the behavior elsewhere and its not that big of a deal. You should, however, include True and False in your indent value test cases, which otherwise look fairly complete.

I apologize for the misunderstanding.

No need to apologize!

Anyway, I've taken your suggestions and added some tests in 1d652a5. I also split it into test_valid_indent and test_invalid_indent, like you proposed.

All looks good; aside from including True and False as test cases for indent, to avoid readability impacts due to the space it consumes, I might might your cases cases to a module-level variable, and you could also use somewhat smaller test objects, though its fine if others don't complain. You could also consider using nested subtests (object ---> settings). Other than that, LGTM at least from my end.

finefoot · 2022-05-16T10:47:39Z

You should, however, include True and False in your indent value test cases,

👍 Tests added in 7cc9fc8. I assume, the rationale behind testing this undefined behavior - since the docs don't mention bool for Repr.indent - is less about verifying that indent = True produces the same as indent = 1, but more about verifying that no error happens if a user unknowingly uses said value producing undefined behavior?

to avoid readability impacts due to the space it consumes, I might might your cases cases to a module-level variable, and you could also use somewhat smaller test objects, though its fine if others don't complain

The example in test_pprint.QueryTestCase.test_set_of_sets_reprs is fairly big, too, for example. I guess that means, it's okay?

You could also consider using nested subtests (object ---> settings)

That's a very good suggestion, thank you. :) Implemented in 61a6b5a.

CAM-Gerlach · 2022-05-16T19:38:16Z

+1 Tests added in 7cc9fc8. I assume, the rationale behind testing this undefined behavior - since the docs don't mention bool for Repr.indent - is less about verifying that indent = True produces the same as indent = 1, but more about verifying that no error happens if a user unknowingly uses said value producing undefined behavior?

Looks good. It has a dual purpose, in my mind—it both tests the current behavior, and serves as de-facto CPython-developer-facing, machine-checkable verification of it; if it is changed, intentionally or not, then the test would provide clear indication of that and would ensure the tests are updated to cover the new behavior instead.

The example in test_pprint.QueryTestCase.test_set_of_sets_reprs is fairly big, too, for example. I guess that means, it's okay?

I don't have a problem with the size, though personally I'd separate the test data from the test code to make it cleaner to follow (in Pytest, I'd made it a fixture or at least a module-level constant). I was mostly concerned about any objections from others with more authority about it, since I'm not a core dev myself. But its ultimately up to you and whatever core dev does the final review and merge here. Otherwise, LGTM!

CAM-Gerlach

Thanks for all your hard work and responsiveness to feedback, @finefoot . LGTM now and looking forward to see this merged; this should be ready for core dev review now (with one potential concern about mixing test data and code outstanding, though that can be left to a core dev for final judgment).

CAM-Gerlach · 2022-05-16T19:43:17Z

Tagging @rhettinger for review, since there's no one listed in the Experts Index or CODEOWNERS and he seems to be de-facto responsible for this module given his code contributions and his review of the most recent significant PR affecting it.

finefoot · 2022-05-16T21:47:01Z

with one potential concern about mixing test data and code outstanding, though that can be left to a core dev for final judgment

Alright, makes sense. Let's see. :)

@CAM-Gerlach Thanks for your patience, explanations and helpful feedback. I learned a lot 👍

CAM-Gerlach · 2022-05-16T22:17:13Z

Thank you to, so did I! I'm pretty new to most of this stuff as well and have only just started contributing to CPython proper in the past couple weeks, and pretty much all of that has been docs stuff so far. I'm also not even a "real" software developer, I just play one on the internet—in real life, I'm a NASA satellite scientist, heh.

CAM-Gerlach · 2022-06-28T05:36:37Z

Thanks, and sorry for the delay; I've reached out to the core devs to take a look at this...

ambv

In general I find it icky that indent is a field here and not an argument to the repr() method. However, since the rest of the functionality in reprlib provides a precedent for this design, I think we can live with that.

That being said, I have a few comments around the current implementation.

Doc/library/reprlib.rst

Misc/NEWS.d/next/Core and Builtins/2022-05-12-15-19-00.gh-issue-92734.d0wjDt.rst

…e-92734.d0wjDt.rst Co-authored-by: Łukasz Langa <[email protected]>

…e-92734.d0wjDt.rst to Misc/NEWS.d/next/Library/2022-05-12-15-19-00.gh-issue-92734.d0wjDt.rst

…prlib.aRepr

finefoot · 2022-06-28T16:54:39Z

I guess this would make sense to add to the docs once it's getting close to getting merged and the Python version is foreseeable?

Add .. versionadded:: 3.xyz to docs

It doesn't get added automatically, right?

CAM-Gerlach · 2022-06-29T08:00:53Z

I guess this would make sense to add to the docs once it's getting close to getting merged and the Python version is foreseeable?

Add .. versionadded:: 3.xyz to docs

It doesn't get added automatically, right?

Oops, I completely forgot the mention that—great catch. No, its not automatic, and as this is essentially certain to go in Python 3.12 now, we should just go ahead and added it. I'll make a suggestion to that effect.

Doc/library/reprlib.rst

Co-authored-by: CAM Gerlach <[email protected]>

CAM-Gerlach · 2022-08-05T05:51:53Z

FYI, right now things are very intense due to the planned release of 3.11rc1, so if @ambv doesn't respond in a week or two, ping again and I can ping him or another core dev if there still isn't a response.

bedevere-bot added the awaiting review label May 12, 2022

finefoot force-pushed the reprlib-indent branch from 4fcb99f to c422a79 Compare May 12, 2022 18:20

Add reprlib.Repr.indent

237726b

Add an indentation feature to the reprlib.Repr class. This includes the actual implementation as well as added tests and an attribute description for the docs.

finefoot force-pushed the reprlib-indent branch from c422a79 to 237726b Compare May 12, 2022 18:23

zmievsa suggested changes May 14, 2022

View reviewed changes

Doc/library/reprlib.rst Outdated Show resolved Hide resolved

bedevere-bot added awaiting core review and removed awaiting review labels May 14, 2022

CAM-Gerlach added type-feature A feature request or enhancement docs Documentation in the Doc dir stdlib Python modules in the Lib dir labels May 14, 2022

CAM-Gerlach requested changes May 14, 2022

View reviewed changes

Doc/library/reprlib.rst Outdated Show resolved Hide resolved

Lib/test/test_reprlib.py Outdated Show resolved Hide resolved

finefoot added 2 commits May 14, 2022 20:07

Allow zero-length indentation string

f8a2db1

Add error handling for Repr.indent

721357c

Improve docs for Repr.indent

7cd769b

Co-authored-by: CAM Gerlach <[email protected]> Co-authored-by: CAM Gerlach <[email protected]>

CAM-Gerlach reviewed May 15, 2022

View reviewed changes

Lib/reprlib.py Outdated Show resolved Hide resolved

finefoot and others added 3 commits May 15, 2022 12:29

Differentiate between TypeError and ValueError

55381f6

Co-authored-by: CAM Gerlach <[email protected]>

Fix missing f-string indicator

b8e5c3b

Fix NoneType typo and max line length

2129722

CAM-Gerlach reviewed May 15, 2022

View reviewed changes

Lib/reprlib.py Outdated Show resolved Hide resolved

finefoot and others added 2 commits May 16, 2022 00:15

Add more tests and split them into valid and invalid

1d652a5

Make TypeError message idiomatic

5f39d2f

Co-authored-by: CAM Gerlach <[email protected]>

finefoot added 2 commits May 16, 2022 12:43

Add tests for bool values

7cc9fc8

Use nested calls to TestCase.subTest

61a6b5a

CAM-Gerlach approved these changes May 16, 2022

View reviewed changes

CAM-Gerlach requested a review from rhettinger May 16, 2022 19:41

finefoot mentioned this pull request Jun 27, 2022

Ease initialization of reprlib.Repr attributes #94343

Closed

ambv reviewed Jun 28, 2022

View reviewed changes

Doc/library/reprlib.rst Show resolved Hide resolved

Doc/library/reprlib.rst Outdated Show resolved Hide resolved

Misc/NEWS.d/next/Core and Builtins/2022-05-12-15-19-00.gh-issue-92734.d0wjDt.rst Outdated Show resolved Hide resolved

finefoot and others added 3 commits June 28, 2022 14:30

Update Misc/NEWS.d/next/Core and Builtins/2022-05-12-15-19-00.gh-issu…

d228f8f

…e-92734.d0wjDt.rst Co-authored-by: Łukasz Langa <[email protected]>

Rename Misc/NEWS.d/next/Core and Builtins/2022-05-12-15-19-00.gh-issu…

f7e91b6

…e-92734.d0wjDt.rst to Misc/NEWS.d/next/Library/2022-05-12-15-19-00.gh-issue-92734.d0wjDt.rst

Use fresh reprlib.Repr instance in docs examples instead of global re…

df43a67

…prlib.aRepr

CAM-Gerlach reviewed Jun 29, 2022

View reviewed changes

Doc/library/reprlib.rst Show resolved Hide resolved

finefoot and others added 2 commits June 29, 2022 12:48

Add versionadded to docs

a32032f

Co-authored-by: CAM Gerlach <[email protected]>

Add line breaks

8c26437

CAM-Gerlach requested a review from ambv June 29, 2022 20:08

finefoot added 2 commits July 8, 2022 00:21

Merge branch 'main' into reprlib-indent

871838b

Adjust to recent changes from python#94343

f8a5450

rhettinger merged commit c06c001 into python:main Sep 8, 2022

bedevere-bot removed the awaiting core review label Sep 8, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

gh-92734: Add indentation feature to reprlib.Repr #92735

gh-92734: Add indentation feature to reprlib.Repr #92735

finefoot commented May 12, 2022

zmievsa left a comment

CAM-Gerlach left a comment •

edited

Loading

finefoot commented May 14, 2022

CAM-Gerlach commented May 15, 2022

finefoot commented May 15, 2022

CAM-Gerlach commented May 16, 2022

finefoot commented May 16, 2022

CAM-Gerlach commented May 16, 2022

CAM-Gerlach left a comment

CAM-Gerlach commented May 16, 2022

finefoot commented May 16, 2022

CAM-Gerlach commented May 16, 2022

CAM-Gerlach commented Jun 28, 2022

ambv left a comment

finefoot commented Jun 28, 2022 •

edited

Loading

CAM-Gerlach commented Jun 29, 2022 •

edited

Loading

CAM-Gerlach commented Aug 5, 2022

gh-92734: Add indentation feature to reprlib.Repr #92735

gh-92734: Add indentation feature to reprlib.Repr #92735

Conversation

finefoot commented May 12, 2022

zmievsa left a comment

Choose a reason for hiding this comment

CAM-Gerlach left a comment • edited Loading

Choose a reason for hiding this comment

finefoot commented May 14, 2022

CAM-Gerlach commented May 15, 2022

finefoot commented May 15, 2022

CAM-Gerlach commented May 16, 2022

finefoot commented May 16, 2022

CAM-Gerlach commented May 16, 2022

CAM-Gerlach left a comment

Choose a reason for hiding this comment

CAM-Gerlach commented May 16, 2022

finefoot commented May 16, 2022

CAM-Gerlach commented May 16, 2022

CAM-Gerlach commented Jun 28, 2022

ambv left a comment

Choose a reason for hiding this comment

finefoot commented Jun 28, 2022 • edited Loading

CAM-Gerlach commented Jun 29, 2022 • edited Loading

CAM-Gerlach commented Aug 5, 2022

CAM-Gerlach left a comment •

edited

Loading

finefoot commented Jun 28, 2022 •

edited

Loading

CAM-Gerlach commented Jun 29, 2022 •

edited

Loading