Raises section parsing of Google-style docstrings fails with multiple exceptions #17

thebigmunch · 2020-04-07T23:08:07Z

d6561f8 stopped build errors in regards to this. But the results of a raises section with multiple exceptions (like this) end up like:

The text was updated successfully, but these errors were encountered:

pawamoy · 2020-04-08T07:54:22Z

Oops 😨 I guess some tests are missing. I'll fix this asap and add tests. Thanks for the report!

pawamoy · 2020-04-08T08:25:43Z

It's working correctly on my end. What version of mkdocstrings and pytkdocs are you using?

thebigmunch · 2020-04-08T09:31:57Z

It's working correctly on my end. What version of mkdocstrings and pytkdocs are you using?

Newest releases of each, pytkdocs 0.2.1 and mkdocstrings 0.10.2.

pawamoy · 2020-04-08T12:43:29Z

Alright, found what's causing the issue.

We get the docstrings with inspect.getdoc, which cleans the docstring with inspect.cleandoc, which in turn expands tabs to spaces with str.expandtabs() with the default value of 8 spaces:

expandtabs(self, /, tabsize=8)
    Return a copy where all tab characters are expanded using spaces.
    
    If tabsize is not given, a tab size of 8 characters is assumed.

Then when the parser processes the docstring, since lines start with 6 or more spaces, it thinks it's a continuation line from the previous item / exception.

I was hoping str.expandtabs would use an environment variable for the tab width, unfortunately there's nothing like that in the docs: https://docs.python.org/3/library/stdtypes.html#str.expandtabs.

A temporary solution to this would be to use spaces instead of tabs in your code, though I know it's a delicate subject 😅

Of course pytkdocs code must be fixed to support these expanded tabs, though I'm not sure how right now. Maybe we could simply stop using inspect.getdoc, pick the docstring through the __doc__ attribute, and simply dedent it, leaving tabs untouched, and update the parser code to handle tabs.

thebigmunch · 2020-04-08T15:54:56Z

Then when the parser processes the docstring, since lines start with 6 or more spaces, it thinks it's a continuation line from the previous item / exception.

This seems like a poor way to determine line continuation. Using relative indentation levels between lines would be the proper way to solve this. Indeed, this seems to be how other docstring parsers tend to do it. pytkdocs also seems to assume indentation levels of 4, which is not necessarily the case.

pawamoy · 2020-04-09T09:37:45Z

Could you explain a bit more how these relative indentation levels between lines would be implemented? I'm all ears for better code but I'm not sure how to tackle this. Maybe you would even like to contribute a PR?

thebigmunch · 2020-04-09T09:49:03Z

So, after running through inspect.cleandoc, you'll end up with this:

"""
Load audio metadata from a filepath or file-like object.

Parameters:
    f (str, os.PathLike, or file-like object):
        A filepath, path-like object or file-like object of an audio file.

Returns:
    Format: An audio format object of the appropriate type.

Raises:
    FormatError: If the audio file is not valid.
    UnsupportedFormat: If the audio file is not of a supported format.
    ValueError: If ``f`` is not a valid str, path-like object,
        file-like object, or is unreadable.
"""

It's easy to get the section header through pattern. It has 0 indentation. The next line tells you the indentation level for all items in that section. In this case it's 4, but could be 2 or 6 or whatever indentation increment someone is using. Any indentation less than the section item indentation (empty line, unindented) marks the end of the section. Any line that has the section item indentation level is an item in that section. Any line that has an indentation greater than than the section item indentation is a continuation line.

pawamoy · 2020-04-09T09:52:21Z

OK thanks, great explanation 🙂

Seems way better indeed. Let's do this!

pawamoy · 2020-04-09T11:00:34Z

I wonder if we should enforce the same extra indentation for continuation lines:

Allowed:

"""
Raises:
    ValueError: If ``f`` is not a valid str, path-like object,
        file-like object, or is unreadable.
"""

Not allowed, or generates an error:

"""
Raises:
    ValueError: If ``f`` is not a valid str, path-like object,
      file-like object, or is unreadable.
"""

The use-case is that if we want to support more complex markup in items description (instead of concatenating as one line), we know we can simply join lines with \n by removing the first indent * 2 spaces:

current_item_lines.append(line[indent * 2:])

Or we could do a second pass on continuation lines, getting the initial continuation indent level, and remove as much from continuation lines:

current_item_lines.append(line[indent + cont_indent:])

pawamoy · 2020-04-09T11:10:12Z

Looking at the Google style example from Sphinx's Napoleon extension's documentation, I think I'll go with enforcing the same indentation for continuation lines: 4-space indentation means 8-space continuation indentation. Between 4 and 8 spaces will generate a parsing error but still append the line with leading spaces removed.

What do you think?

thebigmunch · 2020-04-09T11:29:03Z

What do you think?

Doesn't bother me for the docstrings I write. But I don't know what others might do/think. So long as it is done relatively, not absolutely, in code and in documentation. That is, not 4 and 8 but section_item_indent and section_item_indent * 2.

pawamoy · 2020-04-09T11:46:25Z

Of course, 4 and 8 was just an example. For others (me included since I was using 2 spaces for continuation indent until now), it will work exactly the same, but warnings will appear in mkdocs output, asking users to use the same indentation level for continuation lines.

pawamoy added bug Something isn't working docstrings Docstrings parsing labels Apr 9, 2020

pawamoy closed this as completed in 2f53082 Apr 10, 2020

pawamoy added the docstrings-format:google label Apr 25, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Raises section parsing of Google-style docstrings fails with multiple exceptions #17

Raises section parsing of Google-style docstrings fails with multiple exceptions #17

thebigmunch commented Apr 7, 2020

pawamoy commented Apr 8, 2020

pawamoy commented Apr 8, 2020

thebigmunch commented Apr 8, 2020

pawamoy commented Apr 8, 2020 •

edited

Loading

thebigmunch commented Apr 8, 2020

pawamoy commented Apr 9, 2020

thebigmunch commented Apr 9, 2020

pawamoy commented Apr 9, 2020

pawamoy commented Apr 9, 2020

pawamoy commented Apr 9, 2020 •

edited

Loading

thebigmunch commented Apr 9, 2020

pawamoy commented Apr 9, 2020

Raises section parsing of Google-style docstrings fails with multiple exceptions #17

Raises section parsing of Google-style docstrings fails with multiple exceptions #17

Comments

thebigmunch commented Apr 7, 2020

pawamoy commented Apr 8, 2020

pawamoy commented Apr 8, 2020

thebigmunch commented Apr 8, 2020

pawamoy commented Apr 8, 2020 • edited Loading

thebigmunch commented Apr 8, 2020

pawamoy commented Apr 9, 2020

thebigmunch commented Apr 9, 2020

pawamoy commented Apr 9, 2020

pawamoy commented Apr 9, 2020

pawamoy commented Apr 9, 2020 • edited Loading

thebigmunch commented Apr 9, 2020

pawamoy commented Apr 9, 2020

pawamoy commented Apr 8, 2020 •

edited

Loading

pawamoy commented Apr 9, 2020 •

edited

Loading