Harmonize formatting and capitalization of stdin/stdout/stderr #3114

Merged
merged 1 commit into from
Jun 17, 2019


waldyrious
Member

@waldyrious waldyrious commented Jun 17, 2019

Follow-up of this discussion at #3113.

(Sorry for the lack of checklist, I was trying the instructions in our hub page. In any case, since this edits multiple pages, the checklist is not that relevant anyway.)

@waldyrious waldyrious mentioned this pull request Jun 17, 2019
6 tasks
@waldyrious waldyrious added the mass changes Changes that affect multiple pages. label Jun 17, 2019
Member

@owenvoke owenvoke left a comment

LGTM! I just started doing this after you mentioned it. 😋

@waldyrious
Member Author

Yeah, I commented on the other PR because I figured you might do that too :)

Collaborator

@schneiderl schneiderl left a comment

Nice!

@agnivade agnivade merged commit cf25745 into master Jun 17, 2019
@agnivade agnivade deleted the waldyrious/stdandard-streams branch June 17, 2019 16:40
@sbrl
Member

sbrl commented Jun 17, 2019

Hold on. Why are stdin and stdout in backticks? Since they're names of things and not commands, I'd argue that they shouldn't be in backticks.

@waldyrious
Member Author

Hmm. I'm pretty sure we discussed that topic recently, but I can't find the discussion. Can anyone locate the relevant link?

@agnivade
Member

Isn't it here #3113 (comment)?

Actually, on further thought, I am doubtful too. I thought that since this was like a reserved keyword, it made sense for it to be in backticks. But then should stuff like tcp and ssh be in backticks too? I just did a quick search and that does not seem to be the case.

@waldyrious
Member Author

> Isn't it here #3113 (comment)?

That one I recall — I'm not that forgetful! 😄

I was talking about a more general discussion regarding formatting of such specific technical words (filesystem paths, file formats, environment variables, executable names, that sort of thing). I recall that in that discussion we specifically spoke of acronyms like HTTP, SSH, etc., and IIRC we agreed that they don't need to be marked as code.

Unfortunately, the conversation probably happened in an unrelated PR, so I can't seem to find it again...

@sbrl
Member

sbrl commented Jun 18, 2019

Ah, I see. I guess there's /dev/stdin, but that's not a thing on Windows (which used to be POSIX compliant) - it's more of a POSIX concept than an actual physical command or file.

Yeah, I sort of remember that discussion, but I can't pinpoint where it was.

I'd definitely say that stdin has more in common with HTTPS etc. than echo.

@agnivade
Member

@waldyrious - Ah, I see 😆 Yeah, I remember vaguely. So, do you agree that stdin does not need to be in backticks?

@waldyrious
Member Author

Not exactly. I think it needs to be distinguished from regular prose somehow, so either backticks or all-caps (or something else, e.g. italics) would be fine by me, but using neither wouldn't be OK IMO.

@agnivade
Member

I am fine with either, but IMO it has to be consistent. What is the exact rule that determines when something is within backticks?

@waldyrious
Member Author

> What is the exact rule that determines when something is within backticks?

We can (indeed, we should) add a list to the guidelines, but I don't think it will be exhaustive. In general, it seems to me that any keyword with a specific technical meaning (again: filenames, variables, commands, etc.) should be marked as distinct from regular prose.

Using all-caps can work for this effect if they're acronyms or that is a standard way to represent them; otherwise, something like backticks or italics should be used. Are there any examples you can think of whose treatment under such a guideline would be ambiguous/unclear?

@agnivade
Member

Keyword is a good way to differentiate. I guess in that sense stdin is a keyword and tcp is not. I can't think of something ambiguous with this definition. LGTM.

Somebody needs to add it to the guidelines.

@sbrl
Member

sbrl commented Jun 19, 2019

tcp is an acronym, so I guess it would be caps: TCP?

stdin could be considered an abbreviation for standard input. I'd vote to put it in italics if it had to be distinguished somehow.

@waldyrious
Member Author

Aha! I finally found where we had discussed this recently. It was in #2941. Paging @andrik and @mebeim who participated in that conversation, for feedback regarding what we've discussed above.

@waldyrious
Member Author

> stdin could be considered an abbreviation for standard input. I'd vote to put it in italics if it had to be distinguished somehow.

@sbrl true, but it's also true that e.g. "mkdir" is an abbreviation for "make directory", and so on, yet we still agree it should be marked as code. Besides, we were talking about acronyms (typically spelled in all-caps), not any sort of abbreviation :)

@sbrl
Member

sbrl commented Jun 19, 2019

Right. But mkdir is also a command, so it makes sense that it's in backticks.

@andrik
Collaborator

andrik commented Jun 24, 2019

Sorry guys, I've started a new job and my time is pretty short. Anyway, I agree with @sbrl: capitalize acronyms and use italics for abbreviations.

@mebeim
Member

mebeim commented Jun 24, 2019

Sorry, I missed this PR somehow, maybe because it was merged so fast.

I don't really like stdin/stdout/stderr enclosed in backticks. I'd say that's more of an abbreviation rather than a command, token or piece of code. Think of "pkg" (as an abbreviation for "package") for example. If anything, I'd transform it into "standard input" where it is possible. Other than that, I'd rather have it plain or CAPITALIZED, but without any accent or emphasis, as I've seen it in basically any man page that uses these words. To me, additional styling just makes the whole line look "heavier".

See for example:

@waldyrious
Member Author

> Think of "pkg" (as an abbreviation for "package") for example.

Depends on the context. Are you thinking of a program/binary called "pkg", or just an informal way to shorten the word "package"? Or something else?

@mebeim
Member

mebeim commented Jun 24, 2019

Of course it depends on the context; I was thinking about the second one. I used that almost everywhere in #3125 to make things more concise.

@owenvoke owenvoke mentioned this pull request Jun 25, 2019
3 tasks
@sbrl
Member

sbrl commented Jun 25, 2019

Should we revert this PR then, or do something else?

@mebeim
Member

mebeim commented Jun 25, 2019

Oh well, I don't know, I was just explaining my point of view. This PR brings consistency at least which is always good. We could discuss this in a separate issue maybe. I don't know everybody's opinion.

@agnivade
Member

I am fine as long as things are consistent.

@waldyrious
Member Author

waldyrious commented Jun 26, 2019

> I was thinking about the second one

That's what I suspected. I wasn't suggesting marking informal abbreviations such as "package" → "pkg" as code. In fact, I think we should actively avoid using informal abbreviations like that if possible, since they count as jargon (e.g. "repo" instead of "repository", "dirs" instead of "directories", "PR" instead of "pull request", etc.).

What I meant was that we should highlight keywords which have a specific technical meaning: environment variables, filenames and paths, file extensions, etc. For example, PWD is an actual variable name, not just an acronym of "print working directory"; similarly, .txt isn't merely a shorter way to write "text files", it's more specific than that. In both cases, that specific spelling carries technical meaning, rather than just being a shorthand, which IMO justifies the code formatting. The same is true for stdin and stdout: they're formally recognized and used even in technical manuals, rather than just being convenient abbreviations used in informal contexts. Does that make any sense, @sbrl, @mebeim?

@agnivade I think I managed to convey to you the nuance of what I meant after some back and forth; do you think you could explain in your words the difference?

@agnivade
Member

I think you have articulated it better than I could have 😄

@mebeim
Member

mebeim commented Jun 26, 2019

@waldyrious yes, totally understand the point. I agree that abbreviation should be avoided, that's why I was suggesting the replacement with "standard input/output/error", since there seem to be a lot of places where this could be done. I still stand by my original point, but of course if you guys prefer having those key words highlighted then that's fine, consistency is good.

@sbrl
Member

sbrl commented Jun 26, 2019

Ah, yes @waldyrious. Highlighting them in some way because they carry technical meaning is certainly a good idea.

My concern is that putting them in backticks will confuse them with actual commands. There's always _italics_ as I suggested before, or perhaps **bold**.

I think @mebeim's point about the line looking heavier is worth considering too. Would the gain from highlighting them outweigh the potential decrease in readability?

I'm inclined to agree with @mebeim here.

@waldyrious
Member Author

waldyrious commented Jun 26, 2019

> I agree that abbreviation should be avoided, that's why I was suggesting the replacement with "standard input/output/error"

I'm not sure we're on the same page. What I was saying was that we should avoid informal abbreviations like "PR" and "repo", but actually encourage and highlight terms like stdin, .txt and POST. The former are jargon needed for communicating with humans in the tech industry, but we can afford more flexibility and be more explicit. The latter are specific keywords often used for communicating with machines; they don't work like regular words which we could e.g. replace with synonyms and retain the meaning.

Could it be that the root of the disagreement is about where stdin/stdout/stderr, in particular, fit in that spectrum? Or would you say that even stuff like Ctrl + C, /etc/hosts or <div> should not be formatted as code, and backticks should be reserved for referring to other commands only?

> My concern is that putting them in backticks will confuse them with actual commands.

I'm not sure I see why. Would you say that none of the examples above should be formatted as code? Please give some examples, so I can understand your vision better :)

@agnivade
Member

> Could it be that the root of the disagreement is about where stdin/stdout/stderr, in particular, fit in that spectrum?

I think that is exactly it. I remember that we were previously using backticks only for commands and double-quotes for stuff like this. But then we moved to using backticks throughout. But now there are these gray areas.

The gist of the matter is that we are expanding the definition of backticks from only commands to "keywords". Now what exactly are these keywords, is something debatable. But @waldyrious has put it very nicely IMO.

@sbrl raised another point: that they might be confused with commands. @sbrl - could you give some examples that clarify your point?

@waldyrious
Member Author

Just as additional food for thought, I should point out that we've discussed in the past the possibility of adopting a specific syntax for linking to other pages, which clients could render differently (see issue #784).

That would mean using something other than backticks for commands; for example, [[git commit]], as used in wiki markup, which is similar to how we handle {{tokens}}; or something lighter like [git commit]; or even just plain markdown links like [git commit](git-commit.md).

That would certainly limit the potential confusion between commands and other code words. That said, it would not work for all cases; and it certainly doesn't solve our problem now, since such a syntax is not yet in place nor in our short-term plans. So please don't let this aside derail the discussion! 😅
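
To make the aside above concrete, here is a minimal Python sketch of how a client might rewrite the wiki-style `[[git commit]]` form into the plain markdown link form `[git commit](git-commit.md)`. The function name `wiki_links_to_markdown` and the rule of joining the words with hyphens to build the filename are assumptions made for illustration only; no such syntax or tooling is in place in the project.

```python
import re

# Matches wiki-style links such as [[git commit]]; the captured group is the
# command text between the double brackets.
WIKI_LINK = re.compile(r"\[\[([^\[\]]+)\]\]")


def wiki_links_to_markdown(text: str) -> str:
    """Rewrite [[command]] as [command](command-with-hyphens.md).

    Hypothetical helper: the filename rule (words joined by hyphens plus
    ".md") is inferred from the single example given in the discussion.
    """

    def replace(match: re.Match) -> str:
        command = match.group(1).strip()
        filename = "-".join(command.split()) + ".md"
        return f"[{command}]({filename})"

    return WIKI_LINK.sub(replace, text)


if __name__ == "__main__":
    print(wiki_links_to_markdown("See also [[git commit]]."))
```

Text without double-bracket links passes through unchanged, so a description that only uses backticks would not be affected by such a pass.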

@mebeim
Member

mebeim commented Jun 26, 2019

> Could it be that the root of the disagreement is about where stdin/stdout/stderr, in particular, fit in that spectrum?

Yes, that's exactly it. I would actually argue that "stdin" is indeed an informal abbreviation outside of a strictly programming-related context (i.e. we are not talking about C code and libc's stdin variable), so that's basically an abbreviation for "standard input". The line is kind of blurry here, since programs and programming are very closely related concepts. We are not writing manual pages for C library functions, where stdin (in backticks) could have a precise and literal meaning.

> Could it be that the root of the disagreement is about where stdin/stdout/stderr, in particular, fit in that spectrum? Or would you say that even stuff like Ctrl + C, /etc/hosts or <div> should not be formatted as code, and backticks should be reserved for referring to other commands only?

Apart from <div> (which actually is code, so yes that makes perfect sense to have in backticks), the other two would be ok to be formatted as code because that text has a different, more literal meaning: Ctrl + C is a shortcut to be performed as is, and /etc/hosts is a path to be written literally. This need in my opinion is not present for words like "stdin", "stdout", "stderr", "filesystem", "dir", or other technical terms and their abbreviations.

@agnivade
Member

Great, so all of us are on the same page regarding what exactly the issue is.

I, too, think "filesystem" and "dir" should not be in backticks. Those are general terms. But std(in,out,err) feel like special terms to me. I do not feel strongly about it, though. As long as we can come to an agreement, I am good.

@waldyrious
Member Author

I subscribe to @agnivade's words exactly.

But the crux here seems to be that, as I suspected, we agree that backticks are appropriate for more than merely other commands -- the problem is in deciding how much more.

If from all the examples mentioned in this discussion and related ones, only std(in/out/err) raise controversy, I'd be happy to let go of those (although I'd then suggest either capitalizing them or deferring to the spelled-out versions).

But I'm afraid it won't be the only case that's not clear-cut. Let me try to compile examples listed in these discussions and others:

  1. Standard streams: stdin, stdout, stderr
  2. Tech acronyms: HTTP, IP, TCP, SSH, URL, POSIX...
  3. HTTP verbs and keywords: GET, POST, DELETE, User-Agent, 404...
  4. Commands, subcommands and flags: echo, rebase, --force...
  5. Filenames, paths and globs: filename.txt, /var/www, *.py...
  6. Key presses and keyboard shortcuts: Esc, Ctrl + C, q...
  7. Programming language keywords: SELECT, <div>, main()...
  8. Filesystem and network concepts: ext2, sdb1, ~, localhost...
  9. Environment variables: $HOME, $PATH, $PWD...
  10. Others I can't think of?

Can you guys comment on which of these should NOT be marked as code? And do you see the division as obvious and clear-cut?

I'll start: for me, number 2 clearly does not warrant backticks, and numbers 4, 5, 6, 7, and 9 definitely do. I'm flexible on the other ones, although those that we decide not to wrap in backticks should probably be capitalized at least (e.g. STDIN, POST, etc.)

@mebeim
Member

mebeim commented Jun 26, 2019

Easy choices

Should NOT be marked as code: 1 as I already explained why before; 2 since acronyms clearly don't need backticks. Should be marked as code: 4, 5, 6, 7, and 9 since those are clearly parts of actual code (okay maybe filenames are not really code, but still have to be treated literally).

> those that we decide not to wrap in backticks should probably be capitalized at least

I'm perfectly fine with that, regarding the two categories I just mentioned above.

Depends on the context

> And do you see the division as obvious and clear-cut?

Nope, and that's exactly why I'm writing this section.

I'm expanding a little on the following ones since I believe they should be marked as code depending on the context. I don't think they should be standardized as "always to be marked as code" or "never to be marked as code". Consistency is good, but only as long as it doesn't hurt readability and clarity.

Number 3

As I already said on other occasions, I've never seen HTTP methods or status codes (200 "OK", 404 "Not Found", etc.) marked as code anywhere, so having those marked as code really feels strange to me. This is definitely more of a personal feeling, but I'm writing it down nonetheless. I would argue that whether or not these have to be marked as code entirely depends on the context. Sometimes those can be part of actual code, and in that case it would make sense. As for the HTTP headers (like User-Agent in your example), those would probably be clearer marked as code most of the time, just because they are to be taken literally as is.

Number 8

Similarly for these, it depends on the context. Block devices are most of the time referred to as paths, like sdb2 implicitly meaning /dev/sdb2, ~ implicitly meaning the home directory, and so on. Those would probably be better marked as code most of the time. For others like localhost or similar, I don't have much of a preference and would again decide based on the context.

@sbrl
Member

sbrl commented Jun 27, 2019

Great list, @waldyrious!

  1. No, as they are not part of a command but technical terms.
  2. No again, because they are technical terms.
  3. No - doesn't really make sense for the reasons @mebeim has detailed above, though there are exceptions.
  4. Yes. We're putting commands in backticks already. As a general rule of thumb I try to avoid specifying flags literally (e.g. --force) in a description, but they would probably go in backticks because they are part of a command. More examples: mkdir, docker compose.
  5. Yes..... Probably. They are similar enough to number 4 to warrant backticks, yet different enough that they are unlikely to be confused for a command.
  6. Not in descriptions (for examples themselves they have to be because of the linter). It's something the user should do, rather than part of a command. For reference, I believe we're discussing a standard format for keyboard shortcuts in #2408 (question: inconsistencies in keyboard shortcut formatting), IIRC.
  7. Tough one. I think it depends on the context.
  8. I agree with @mebeim here.
  9. Yes, as they are often part of a command, and are unlikely to be confused because they are uppercase.

@waldyrious
Member Author

Thanks for the input 👍 it's very helpful. So what I'm trying to figure out is what would be a good set of guidelines (not hard rules) that we could follow and hopefully add to the documentation. I'd prefer to avoid adding language like "decide on the spot depending on context" if at all possible, so maybe we could list a set of keyword types that we'd recommend generally marking as code, and a set that we'd recommend generally not marking as code, leaving room for exceptions by discussion on specific pages. Does that sound viable? And if so, what would you suggest to place in each class?

@mebeim
Member

mebeim commented Jun 28, 2019

@waldyrious yes that sounds good 👍 In that case, I would suggest recommending against marking [1, 2, 3] as code and recommending marking [4, 5, 6, 7, 9] as code. Number 8 is still a tough decision: I think single file names that are not special keywords (i.e. not ~, sdb2 etc.) would still be OK without backticks, but full paths would be better with them... it doesn't really fall precisely into either of the two categories IMHO.
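
The split suggested above can be written down as a small lookup table. This is only a sketch in Python: the category numbers and names come from the list earlier in this thread, while `RECOMMENDATION`, `should_mark_as_code` and `format_term` are hypothetical names invented for illustration and are not part of any project tooling or agreed guideline.

```python
from typing import Optional

# Category number -> (name, verdict), following the numbered list in this
# thread. True = recommend marking as code, False = recommend not marking,
# None = no agreement yet (number 8 is still under discussion).
RECOMMENDATION = {
    1: ("standard streams", False),
    2: ("tech acronyms", False),
    3: ("HTTP verbs and keywords", False),
    4: ("commands, subcommands and flags", True),
    5: ("filenames, paths and globs", True),
    6: ("key presses and keyboard shortcuts", True),
    7: ("programming language keywords", True),
    8: ("filesystem and network concepts", None),
    9: ("environment variables", True),
}


def should_mark_as_code(category: int) -> Optional[bool]:
    """Return the suggested verdict for a category, or None if undecided."""
    return RECOMMENDATION[category][1]


def format_term(term: str, category: int) -> str:
    """Wrap the term in backticks when its category is recommended as code."""
    verdict = should_mark_as_code(category)
    if verdict is None:
        name = RECOMMENDATION[category][0]
        raise ValueError(f"no agreed recommendation for category {category} ({name})")
    return f"`{term}`" if verdict else term


if __name__ == "__main__":
    print(format_term("--force", 4))
    print(format_term("TCP", 2))
```

Keeping number 8 as an explicit "undecided" value, rather than forcing it into either group, mirrors the state of the discussion at this point.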

@waldyrious
Member Author

We can split some of the entries if that helps the discussion. Do you think you could split number 8 into two distinct groups, 8.1 and 8.2, each of which you'd be comfortable placing in one of the two categories?

@mebeim
Member

mebeim commented Jun 28, 2019

@waldyrious probably, yes: I'd say 8.1: simple file names (not recommended to be marked as code), vs 8.2: paths and special names (recommended to be marked as code).

@waldyrious
Member Author

Can you add some examples of both categories just to be sure we're all on the same page regarding what these mean? From the description alone, both "simple file names" and "paths and special names" could easily mix up with number 5 above.

@mebeim
Member

mebeim commented Jun 28, 2019

@waldyrious yes my last comment is unclear. Here it is:

8.1 (not recommended to be marked as code): Filesystem names: ext2, NTFS, tmpfs; simple networking related names: eth0, wlan0, localhost.

8.2 (recommended to be marked as code): Special tokens referring to paths: ~ (/path/to/home), sdb2 (/dev/sdb2), NUL: (Windows' /dev/null); IP networks or addresses: 192.168.1.9:1234, 10.0.0.0/16.

@waldyrious
Member Author

Thanks for the clarification. It seems to me that the networking-related names would warrant being marked as code (I see them as similar to IP addresses and filenames/paths), but other than that, it sounds reasonable.

@sbrl
Member

sbrl commented Jun 29, 2019

I'd probably argue that IPs should be categorised as 8.1 and not 8.2, though with IPv6 I'm not sure.

@mebeim
Member

mebeim commented Jun 29, 2019

@sbrl hmm 8 is a really strange category indeed.

@owenvoke owenvoke mentioned this pull request Feb 5, 2020
6 tasks
@sbrl sbrl mentioned this pull request Oct 1, 2020
6 tasks
Labels
mass changes Changes that affect multiple pages.

7 participants