proposal: go/doc: Support for bulleted lists #7873

Closed
robfig opened this issue Apr 26, 2014 · 51 comments

Comments

@robfig
Contributor

robfig commented Apr 26, 2014

Presently, the best practice when enumerating something in godoc is to use a
pre-formatted section.  For example:

"""
This library has the following caveats:
  * It requires the caller to invoke the method
    only on odd-numbered days
  * It may crash your computer.
"""

I think it would improve godoc (HTML) readability to recognize and format bulleted lists
as <ul>'s instead.  The recognition could be narrowly defined to avoid mistakenly
formatting blocks that were not intended to be lists.

I propose a list is recognized as:
- An indented (pre-formatted) block, 
- .. that consists of 2 or more list items
- .. and that consists only of list items

and a list item is defined as:
- Consecutive lines
- .. where the first line begins with "- " or "* "
- .. that terminate on blank lines.
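As a rough illustration of the recognition rules above (not the code in the CL), the check could be sketched as follows. The function name `isListBlock` and its signature are hypothetical, and the input is assumed to be the lines of a block that has already been identified as indented.

```go
package main

import (
	"fmt"
	"strings"
)

// isListBlock reports whether the lines of an indented (pre-formatted) block
// form a list under the rules above: two or more items, and nothing but items.
// Hypothetical helper for illustration only.
func isListBlock(lines []string) bool {
	items := 0
	inItem := false
	for _, line := range lines {
		text := strings.TrimSpace(line)
		switch {
		case text == "":
			inItem = false // a blank line terminates the current item
		case strings.HasPrefix(text, "- ") || strings.HasPrefix(text, "* "):
			items++ // the first line of a new item
			inItem = true
		case !inItem:
			return false // text that belongs to no list item
		}
		// Otherwise the line continues the current item.
	}
	return items >= 2
}

func main() {
	block := []string{
		"  * It requires the caller to invoke the method",
		"    only on odd-numbered days",
		"  * It may crash your computer.",
	}
	fmt.Println(isListBlock(block))
}
```

A block with a single item, or with any line outside an item, is left as pre-formatted text, which keeps the recognition narrow.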

This can be implemented with minimal disruption to the existing code.  Here is a CL:
https://golang.org/cl/91830044

Here are some before/after examples from popular package docs: 

http://godoc.org/code.google.com/p/gorilla/mux
http://192.241.149.161:8080/pkg/code.google.com/p/gorilla/mux

http://godoc.org/code.google.com/p/gorilla/sessions
http://192.241.149.161:8080/pkg/code.google.com/p/gorilla/sessions

http://godoc.org/code.google.com/p/gorilla/schema
http://192.241.149.161:8080/pkg/code.google.com/p/gorilla/schema

http://godoc.org/code.google.com/p/gogoprotobuf/gogoproto
http://192.241.149.161:8080/pkg/code.google.com/p/gogoprotobuf/gogoproto

http://godoc.org/github.com/davecgh/go-spew/spew
http://192.241.149.161:8080/pkg/github.com/davecgh/go-spew/spew

http://godoc.org/code.google.com/p/go.text/collate/colltab#pkg-constants
http://192.241.149.161:8080/pkg/code.google.com/p/go.text/collate/colltab#pkg-constants
@bradfitz
Contributor

Comment 1:

Previous proposals similar to this have been rejected on grounds that it's a slippery
slope from this to Markdown or worse.

@robfig
Contributor Author

robfig commented Apr 26, 2014

Comment 2:

I agree wholeheartedly with comments as plain text, free of presentation markup. 
This CL reflects how developers are already writing documentation and is a strict
improvement in the presentation of it.  I don't believe it has any cost to in-code
comment readability.

@ianlancetaylor
Member

Comment 3:

The way to discuss a proposal like this is to raise it on the public lists.  Especially
if you have a patch.  See http://golang.org/doc/contribute.html .  Thanks.

Labels changed: added repo-main, release-none.

@frankandrobot

Are you kidding me? You guys have an issue with markdown?

@bradfitz
Contributor

@frankandrobot, that is not a constructive comment. We happily use Markdown on Github and elsewhere. We just don't want it to be Go's documentation format.

@frankandrobot

@bradfitz i'm surprised that you guys decided to go with plain text. In practice in a large, complex app, you'll end up approximating a markup language for readability anyway (ex: caps for sections), and well you might as well pick a markup language for comments.

Alternatively, since you guys hate markdown, is there the ability to pick whatever markup you want (like a plugin for markdown)?

@bronger

bronger commented Sep 28, 2016

While I sympathise with not supporting Markdown in godoc, I also think that bulleted lists are important enough to be displayed properly in godoc. Is a PR still welcome? Or has this been rejected somewhere else already?

@bradfitz
Contributor

I think @adg and @robpike want to see a design before code.

@robpike
Contributor

robpike commented Sep 28, 2016

And @griesemer.

@theherk

theherk commented Mar 1, 2017

I understand and support the striving for simplicity. In this case, it seems, simplicity could be improved by going in either direction: either have it be plain text or support a preexisting parser. Doing it the current way seems unnecessarily complex. As it stands, a simple parser is implemented (for section headings and for preformatted text; documented here).

It seems that just supporting an already existing parser, or at least parser specification like Markdown, would be duck soup. Maybe it would hurt the elegance of the output, but a list is a key ingredient of documentation.

@griesemer griesemer modified the milestones: Go1.9Maybe, Unplanned Mar 1, 2017
@griesemer
Contributor

Moving to 1.9Maybe to raise visibility. No guarantee we're getting to it for 1.9, though.

@awnumar
Contributor

awnumar commented Mar 19, 2017

@bradfitz

Previous proposals similar to this have been rejected on grounds that it's a slippery slope from this to Markdown or worse.

This is an example of the slippery-slope fallacy and is not a valid argument.

@bradfitz
Contributor

@libeclipse, thanks.

@bradfitz bradfitz added the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Jun 28, 2017
@bradfitz bradfitz modified the milestones: Go1.10, Go1.9Maybe Jun 28, 2017
@KernelDeimos

In case anybody is interested, I made a modified version of Godoc that adds ordered and unordered lists with a fast and simple custom parser I wrote called slippery-slope-markdown.

Here's a link to my modified version: godoc-custom-fork

Admittedly, the way I threw it in there is kinda hacky, but it could be a starting point.

The parser follows this convention:

  • A line starting with - and a space is a list item
  • Any non-blank line following a list item belongs to that list item
  • A blank line terminates the list

A similar convention is applied to ordered lists. As an aside, it also allows bold text.
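To make the three-rule convention concrete, here is a minimal sketch of how such a parser might collect unordered list items. It is not code from the fork: `collectItems` is a hypothetical name, and it assumes the `- ` marker sits at the very start of the line with no leading indentation.

```go
package main

import (
	"fmt"
	"strings"
)

// collectItems gathers list items from the front of lines and returns them
// along with the number of lines consumed. Hypothetical helper, not taken
// from the fork mentioned above.
func collectItems(lines []string) (items []string, consumed int) {
	for _, line := range lines {
		switch {
		case strings.TrimSpace(line) == "":
			// A blank line terminates the list.
			return items, consumed
		case strings.HasPrefix(line, "- "):
			// A line starting with "- " opens a new item.
			items = append(items, strings.TrimPrefix(line, "- "))
		case len(items) > 0:
			// Any other non-blank line belongs to the preceding item.
			items[len(items)-1] += " " + strings.TrimSpace(line)
		default:
			// Non-blank text before any item: there is no list here.
			return items, consumed
		}
		consumed++
	}
	return items, consumed
}

func main() {
	items, n := collectItems([]string{"- one", "  continued", "- two", "", "- three"})
	fmt.Printf("%q %d\n", items, n)
}
```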

@gopherbot
Contributor

Change https://golang.org/cl/167403 mentions this issue: builtin: make len's godoc less ambiguous

gopherbot pushed a commit that referenced this issue Mar 13, 2019
The len godoc uses a blockquote to list the rules for its semantics.

The item that describes channels is a bit long, so it's split in two
lines. However, the first line ends with a semicolon, and the second
line can be read as a sentence of its own, so it's easy to misinterpret
that the two lines are separate.

Making that easy mistake would lead to an incorrect understanding of
len:

	if v is nil, len(v) is zero.

This could lead us to think that len(nil) is valid and should return
zero. When in fact, that statement only applies to nil channels.

To make this less ambiguous, add a bit of indentation to the follow-up
line, to align with the channel body. If lists are added to godoc in the
future via #7873, perhaps this text can be simplified.

Fixes #30349.

Change-Id: I84226edc812d429493137bcc65c332e92d4e6c87
Reviewed-on: https://go-review.googlesource.com/c/go/+/167403
Run-TryBot: Daniel Martí <[email protected]>
TryBot-Result: Gobot Gobot <[email protected]>
Reviewed-by: Bryan C. Mills <[email protected]>
Reviewed-by: Brad Fitzpatrick <[email protected]>
@gopherbot gopherbot removed the NeedsDecision Feedback is required from experts, contributors, and/or the community before a change can be made. label Aug 16, 2019
@rwxrob

rwxrob commented Apr 17, 2020

I find it extremely disturbing that this discussion has not once referenced the amazing efforts of the Pandoc Markdown project (which is the standard documentation method used by the R language as well as many textbooks). It is objectively the most widely adopted knowledge source standard.

@fosskers

Can this move forward at all? It seems like many people are looking for it, and miss this docstring feature that is present in many other languages.

@dmitshur
Contributor

dmitshur commented Aug 21, 2020

/cc @julieqiu This is a longstanding proposal (with lots of discussion above) to modify how Go packages can be documented. Perhaps it should be considered as part of other future work that's being done to improve documentation rendering on pkg.go.dev.

@tooolbox

Just stumbled across this.

Current proposal is to go with @dsnet's suggestions in #7873 (comment) and #7873 (comment). Essentially, we'll take the current "standard" workaround for bulleted lists (indent so as to create a code block) and recognize it, turning it into a nice bullet list in HTML. We could optionally do something special for text (go doc) output but it's okay to leave that alone for now.

In addition to the asterisks in @dsnet's sketch, we should probably also accept at least Unicode bullets (because they express clear semantic intent) and hyphen-minus characters (because that's what we use throughout the Go codebase) as the leading symbol to recognize a list.

Sounds great! Would be neat to address this in pkg.go.dev.

@dsnet
Member

dsnet commented Apr 15, 2021

Here's a concrete proposal for adding support for lists. It's more detailed than my original proposal #7873 (comment), has a prototype implementation, and in-depth analysis based on all public modules.

Goals

Let's put forth some goals:

  1. Lists should not require any special markup other than what users would naturally write.
  2. Lists should read naturally in both source code and in rendered documentation (e.g., such as on https://pkg.go.dev).
  3. List semantics should not disrupt existing usages of Go documentation.
  4. List semantics should be sufficiently explicit that lists aren't created when the user isn't expecting it.

Proposal

The proposal here is based on Markdown, but modified to comply with the goals mentioned above.

Grammar

A list is started by a line:

  • with zero or more whitespace characters,
  • followed by the marker for a list item (e.g. 1. or *),
  • followed by one or more whitespace characters,
  • followed by at least one non-whitespace character.

The start of the list must be preceded by either a blank line or be indented relative to the previous text span. The amount of leading whitespace in the first list item determines the baseline indentation for all subsequent items in the same list.

For example, this snippet:

The first paragraph.

* one
* two
* three

The second paragraph.

and this:

The first paragraph.
	* one
	* two
	* three
The second paragraph.

and this:

The first paragraph.

* one

* two

* three

The second paragraph.

will all render as:

The first paragraph.

  • one
  • two
  • three

The second paragraph.

However, this snippet:

The first paragraph.
* one
* two
* three
The second paragraph.

will render as:

The first paragraph. * one * two * three The second paragraph.

because the start of the list is neither preceded by a blank line nor indented relative to the previous text span. See the analysis below for why we don't handle this case.
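The start-of-list rule can be sketched in Go as follows. This is an illustration under stated assumptions, not the prototype parser: `listStart` and `startsList` are hypothetical names, and indentation is compared by counting leading space and tab bytes, which is only an approximation when tabs and spaces are mixed.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// listStart matches optional leading whitespace, a bullet or number marker,
// at least one whitespace character, and at least one non-whitespace character.
var listStart = regexp.MustCompile(`^([ \t]*)([*+\-•]|[0-9]+[.)])[ \t]+\S`)

// startsList reports whether line may begin a list, given the line before it.
// The list must follow a blank line or be indented more deeply than prev.
// Hypothetical helper for illustration only.
func startsList(prev, line string) bool {
	m := listStart.FindStringSubmatch(line)
	if m == nil {
		return false
	}
	if strings.TrimSpace(prev) == "" {
		return true // preceded by a blank line
	}
	prevIndent := len(prev) - len(strings.TrimLeft(prev, " \t"))
	return len(m[1]) > prevIndent // indented relative to the previous text
}

func main() {
	fmt.Println(startsList("", "* one"))                       // true
	fmt.Println(startsList("The first paragraph.", "\t* one")) // true
	fmt.Println(startsList("The first paragraph.", "* one"))   // false
}
```

The third call corresponds to the unindented mid-paragraph case, which the rules deliberately leave as ordinary paragraph text.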

The text for a list item may span multiple lines. Each additional line must have at least the baseline indentation of the list. Relative to that baseline, subsequent lines may have additional indentation, but it must be consistent across them (i.e., every later line has the same indentation as the second line).

For example:

* The quick brown fox
jumped over the lazy dog
and ate my breakfast.

and

	* The quick brown fox
	jumped over the lazy dog
	and ate my breakfast.

and

  * The quick brown fox
    jumped over the lazy dog
    and ate my breakfast.

will all render as:

  • The quick brown fox jumped over the lazy dog and ate my breakfast.

The span continues until the next empty line or a line with less indentation than the baseline indentation and the optional indentation of the second line. For example,

* The quick brown fox
jumped over the lazy dog
and ate my breakfast.

Next paragraph.

and

  * The quick brown fox
    jumped over the lazy dog
    and ate my breakfast.
  Next paragraph.

will render as:

  • The quick brown fox jumped over the lazy dog and ate my breakfast.

Next paragraph.
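A minimal sketch of the span-continuation rule follows, assuming the first line is already known to start a list item. `itemSpan` is a hypothetical name; the marker is left in the returned text, indentation is counted in leading space and tab bytes, and detecting the start of a sibling item is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// indent counts the leading space and tab bytes of line.
func indent(line string) int {
	return len(line) - len(strings.TrimLeft(line, " \t"))
}

// itemSpan joins the first line of a list item with its continuation lines
// and returns the joined text and the number of lines used. The span ends at
// a blank line or at a line indented less than the continuation indentation,
// which is fixed by the second line. Hypothetical helper; it does not check
// whether a later line starts a new list item.
func itemSpan(lines []string) (text string, n int) {
	if len(lines) == 0 {
		return "", 0
	}
	base := indent(lines[0])
	cont := -1 // continuation indentation, unknown until the second line
	parts := []string{strings.TrimSpace(lines[0])}
	for _, line := range lines[1:] {
		if strings.TrimSpace(line) == "" {
			break // an empty line ends the span
		}
		in := indent(line)
		if cont < 0 {
			if in < base {
				break // less than the baseline indentation
			}
			cont = in
		} else if in < cont {
			break // less than the indentation of the second line
		}
		parts = append(parts, strings.TrimSpace(line))
	}
	return strings.Join(parts, " "), len(parts)
}

func main() {
	text, n := itemSpan([]string{
		"  * The quick brown fox",
		"    jumped over the lazy dog",
		"    and ate my breakfast.",
		"  Next paragraph.",
	})
	fmt.Println(n, text)
}
```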

Strangely formatted lists such as the following:

The first paragraph.
   * one
  * two
 * three
The second paragraph.

have undefined behavior, but will probably still render as expected:

The first paragraph.

  • one
  • two
  • three

The second paragraph.

Bulleted lists

A bulleted list item is denoted by a single character marker that is one of the following:

  • a star * (U+002A),
  • a plus + (U+002B),
  • a dash - (U+002D), or
  • a bullet (U+2022).

Numbered lists

A numbered list item is denoted by a non-negative integer followed by one of the following:

  • a period . (U+002E) or
  • a parenthesis ) (U+0029).

Numbered lists do not support automatic numbering. For example:

1. foo
1. bar
1. baz

would be rendered as a list with every item number still being 1. Similarly, out-of-order numbering or sparse numbering will be preserved as is. We don't use Markdown's "lazy list numbering" feature since it violates goal 2 as the rendered documentation would look different from natural source code.
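The marker rules for bulleted and numbered items can be sketched together. This is an illustration only: `parseMarker` is a hypothetical name, and the number is returned exactly as written, in line with the decision not to renumber items.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// parseMarker splits a candidate list line into its marker and item text.
// Bullets are *, +, - and • (U+2022); numbered markers are decimal digits
// followed by '.' or ')'. Hypothetical helper for illustration only.
func parseMarker(line string) (marker, text string, ok bool) {
	s := strings.TrimLeft(line, " \t")
	r, size := utf8.DecodeRuneInString(s)
	switch {
	case strings.ContainsRune("*+-•", r):
		marker = s[:size]
	case r >= '0' && r <= '9':
		i := 0
		for i < len(s) && s[i] >= '0' && s[i] <= '9' {
			i++
		}
		if i == len(s) || (s[i] != '.' && s[i] != ')') {
			return "", "", false
		}
		marker = s[:i+1] // the number is kept exactly as written
	default:
		return "", "", false
	}
	rest := s[len(marker):]
	text = strings.TrimLeft(rest, " \t")
	if text == rest || text == "" {
		// The marker must be followed by whitespace and then some text.
		return "", "", false
	}
	return marker, text, true
}

func main() {
	for _, line := range []string{"1. foo", "1. bar", "10) baz", "• qux", "-nope"} {
		m, t, ok := parseMarker(line)
		fmt.Printf("%q -> %q %q %v\n", line, m, t, ok)
	}
}
```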

Multi-span list items

A list item may have multiple spans. A subsequent span within the same list item must be separated from the previous span by one or more blank lines. Such spans must start with the same indentation as the previous span. The list item ends when encountering a text span with an indentation less than the current indentation. For example:

*  First paragraph 
   in the list item.

   Second paragraph
   in the list item

This is
another paragraph.

and

   * First paragraph 
   in the list item.

   Second paragraph
   in the list item
This is
another paragraph.

would both render as:

  • First paragraph in the list item.

    Second paragraph in the list item

This is another paragraph.

Subsequent spans can include pre-formatted text or other lists according to the expected grammar for such constructs (but with the leading common indent stripped). For example:

1.  The quick brown fox
    jumped over the lazy dog
    and ate my breakfast.
 
    *   This is some
        inner point.
 
        	// This is a pre-formatted block.
        	func main() {
        		...
        	}
 
    *   And another inner point.
 
    My breakfast looked delicious and
    I'm really sad that
    I couldn't enjoy it.
    	// This is a pre-formatted block.
    	// Surrounding it with empty lines is optional.
    	func main() {
    		...
    	}
2. Just another numbered point.
3. And another numbered point.

would render as:

  1. The quick brown fox jumped over the lazy dog and ate my breakfast.

    • This is some inner point.

      // This is a pre-formatted block.
      func main() {
      	...
      }
      
    • And another inner point.

    My breakfast looked delicious and I'm really sad that I couldn't enjoy it.

    // This is a pre-formatted block.
    // Surrounding it with empty lines is optional.
    func main() {
    	...
    }
    
  2. Just another numbered point.

  3. And another numbered point.

Multi-span list items and nested lists are a more complex addition, but not supporting them would, interestingly, violate goal 2. In the Go source code, the text would already look like a list item with multiple spans and nested lists, but would be rendered by Go documentation tools in a mangled manner. I don't expect nested lists to be used often, and this assertion seems to hold empirically.

Analysis of existing source code

(The following analysis uses the latest version (as of 2021-03-21) of all public modules.)

Bullet marker characters

Bullet markers using:

  • * occurs ~806k times
  • - occurs ~440k times
  • + occurs ~5.8k times
  • • occurs ~5.9k times

The number of * occurrences may be significantly inflated by Go documentation that uses:

/*
 * Some text
 * Some text
 */

since each of the * characters appears at the start of a paragraph. Although these are false positives, they are not interesting, since such cases already render poorly in the Go documentation (Example Code and Doc).

The + and • markers don't occur often, and we could consider dropping support for them. However, use of • is an explicit enough signal of intent to write bulleted items that there seemed to be no false positives.

Number marker characters

Number markers ending with:

  • . occurs ~66k times
  • ) occurs ~15k times

Lists in pre-formatted text

Due to the lack of support for lists, one convention is to pre-format the entire list to ensure that Go documentation doesn't render the lists as a regular paragraph. For example:

The following is a list:
	* one
	* two
	* three

Fortunately, the rules proposed above will allow these currently pre-formatted lists to be treated as a real list.

Across all pre-formatted blocks, we detected ~241k cases where the first line starts with a list marker. Visual inspection of many of them seems to show that they genuinely are lists, with no notable false positives:

Sample of findings from top 1000 modules.

Lists at the start of a paragraph

Due to the lack of support for lists, another convention is to treat each item in the list as an entirely new paragraph. For example:

The following is a list:

* one

* two

* three

The rules defined above will allow this situation to be treated as a real list.

Our analysis detected ~123k paragraphs that start with a list marker. Visual inspection of many of them seems to show that they genuinely are lists, with no notable false positives:

  • Code and Doc
  • Code and Doc
    • This is also a good example of why supporting nested pre-formatted blocks is useful.
  • Code and Doc

Sample of findings from top 1000 modules


In some cases, users don't realize that Go documentation doesn't support native lists and still write documentation that looks like:

* This is a list and
  this is the text for the second line.

Unfortunately, this currently renders as:

* This is a list and

this is the text for the second line.

Fortunately, the rules proposed above will allow this to be treated as a real list item.

To detect this situation, we searched for all paragraphs that start with a list marker and are followed by a pre-formatted block, and found ~24k cases. Visual inspection of many of them seems to show that they genuinely are lists:

Sample of findings from top 1000 modules

Lists in the middle of a paragraph

Sometimes Go documentation contains lists in the middle of a paragraph:

Start of paragraph
* one
* two
* three
End of paragraph

Unfortunately, this currently renders as:

Start of paragraph * one * two * three End of paragraph

We detected ~77k cases of paragraphs with list markers in the middle. While most of the cases do look like legitimate lists:

There were a few notable false positives:

  • due to spurious bullet characters appearing at the start of one of the lines:
  • due to an inlined list and one item happened to be at the start of the line:

Sample of findings from top 1000 modules

The rules above do not handle lists in the middle of a paragraph for several reasons:

  • The false positives (while a minority) would go against the 3rd goal of not disrupting existing usages.
  • From a user-experience perspective, once someone has started to write a normal paragraph, they shouldn't have to worry about the paragraph accidentally being broken in the middle unless there is a clear indication to do so (e.g., by indenting the list or adding a blank line). This seems to violate the 4th goal.
  • The other ways of writing a list are more explicit and seem superior (i.e., surrounding the list with blank lines or indenting it).

Implementation

I have a working parser built. I don't think it would be much work to integrate it into the pkgsite, and I'm willing to do the work.

The existing doc.ToHTML and doc.ToText functions would be modified to understand list semantics. Note that modifying doc.ToHTML will not help the pkgsite implementation, since it uses its own implementation of HTML rendering.

I believe there is benefit to a doc.ToSpans function that exposes parsing documentation as a lightweight AST. However, that can be a separate proposal.

@rsc
Contributor

rsc commented Sep 10, 2021

See #48305.

@dsnet
Member

dsnet commented Sep 10, 2022

Closing as done.

@dsnet dsnet closed this as completed Sep 10, 2022
@dmitshur dmitshur modified the milestones: Unplanned, Go1.19 Sep 10, 2022
@golang golang locked and limited conversation to collaborators Sep 10, 2023
@rsc rsc removed this from Proposals Oct 3, 2023