Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial style guide #103

Merged
merged 11 commits into from
Sep 13, 2022
Merged

Initial style guide #103

merged 11 commits into from
Sep 13, 2022

Conversation

petervdonovan
Copy link
Contributor

@petervdonovan petervdonovan commented Aug 25, 2022

The following are required before this PR is merged:

  • Add a style guide.
  • Create a table listing abbreviations specific to our project.
  • Create templates on which any new header and source files should be based.

The following are left for future PRs:

  • Bring our code base into compliance with the selected style guide.
  • Partially automate the process of checking whether code complies with our style guide.
  • Set up a code formatter.

@petervdonovan
Copy link
Contributor Author

petervdonovan commented Aug 25, 2022

I would like to call attention to the following potential sources of controversy:

  • Placement of the constant on the LHS of ==. This is rule 8.6a in Barr. This rule gives me displeasure for no reason in particular, but I lean toward keeping it because it can prevent bugs. For consistency, I would be inclined to apply the same rule for !=.
  • Use of section headers in source files and header files. We do this now, but inconsistently.
  • Copying the BSD license into every file. I have no problem with this, but it might not be necessary: some projects just include a "LICENSE.md" file in the root of their repository and are done with it.
  • The list of recommended abbreviations. Out of inertia, I included some abbreviations from Barr that I doubt we shall use. I deliberately omitted some abbreviations such as "stat". "fd", "sock", "pos"... Notably, I also omitted "tag" for "tag advance grant", for obvious reasons. With these guidelines, we cannot use capital letters to distinguish acronyms from words because the style guide places strict rules about where upper- and lowercase letters can be used.
  • Horizontal alignment. This is rule 3.2 in Barr. The Google Java style guide provides compelling reasons not to do this, but I regard it as a matter of taste.
  • The use of Hungarian notation to mark globals, pointers, pointers to pointers, and non-pointer handles. This is rule 7.2 7.1 in Barr.

This PR currently includes guidelines that I find acceptable. I regard most of these as bikeshed issues, and I do not feel strongly about most of them. I'm happy to discuss them further, though, since more discussion and consensus now will make things go more smoothly later.

@petervdonovan petervdonovan marked this pull request as ready for review August 25, 2022 07:35
@petervdonovan
Copy link
Contributor Author

I would also appreciate opinions regarding where we put comments. It is a bad idea to duplicate comments between header and source files. My question is, "should documentation comments be placed in header files, or in source files?" This includes any file-level descriptions about what modules do.

I think they belong in source files, except in the case when multiple alternative source files implement the same API, and that API is described in a common header file. This is because

  • It is useful for header files to serve as concise reference sheets. If a header file contains documentation, then it is verbose, hence scarcely more convenient for quick reference than reading the code directly. Any questions not answered by the concise header file can be answered by reading the corresponding source file.
  • If documentation is close to the implementation, the two are less likely to fall out of sync, and it is easier to visually check that the documentation is not lying. It is difficult and important to keep the documentation in sync with the implementation, so we should do whatever we can to make this problem easier.

It should not be necessary to copy the license everywhere. It should
not be necessary for each file to contain a marker that specifies its
own name. It should not be necessary for author emails to be provided.
@petervdonovan petervdonovan requested a review from lhstrh August 31, 2022 15:20
@lhstrh lhstrh changed the title Add initial style guide. Add initial style guide Aug 31, 2022
@lhstrh lhstrh changed the title Add initial style guide Initial style guide Aug 31, 2022
@lhstrh
Copy link
Member

lhstrh commented Aug 31, 2022

I would like to call attention to the following potential sources of controversy:

* **Placement of the constant on the LHS of ==.** This is rule 8.6a in [Barr](https://barrgroup.com/sites/default/files/barr_c_coding_standard_2018.pdf). This rule gives me displeasure for no reason in particular, but I lean toward keeping it because it can prevent bugs. For consistency, I would be inclined to apply the same rule for `!=`.

I feel the same -- it looks unnatural to me, but the reasoning behind doing it as Barr prescribes makes sense. I'm OK with following Barr.

* **Use of section headers in source files and header files.** We do this now, but inconsistently.

I agree we need consistency above anything else. Removing the headers seems like the easiest way to achieve compliance. Barr gives the following guide line, which I think is sufficient and does not require headers, although it does not forbid them:

Each source file shall be comprised of some or all of the following sections, in
the order listed: comment block; include statements; data type, constant, and
macro definitions; static data declarations; private function prototypes; public
function bodies; then private function bodies.

* **Copying the BSD license into every file.** I have no problem with this, but it might not be necessary: [some projects just include a "LICENSE.md" file in the root of their repository and are done with it.](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/licensing-a-repository#determining-the-location-of-your-license)

I prefer to keep things simple and avoid redundancy. If we require authors to be listed and refer to the LICENSE.md file, then I think we should be covered.

* **The list of recommended abbreviations.** Out of inertia, I included some abbreviations from Barr that I doubt we shall use. I deliberately omitted some abbreviations such as "stat". "fd", "sock", "pos"... Notably, I also omitted "tag" for "tag advance grant", for obvious reasons. With these guidelines, **we cannot use capital letters to distinguish acronyms from words** because the style guide places strict rules about where upper- and lowercase letters can be used.

I don't mind this. The same is true in most Java style guides. The "TAG" acronym is cute because it is recursive, but there is seems no need to have the "T" as "advance grant" doesn't appear to be ambiguous.

* **Horizontal alignment.** This is rule 3.2 in [Barr](https://barrgroup.com/sites/default/files/barr_c_coding_standard_2018.pdf). The [Google Java style guide](https://google.github.io/styleguide/javaguide.html#s4.6.3-horizontal-alignment) provides compelling reasons _not_ to do this, but I regard it as a matter of taste.

I'm with Google on this one.

* **The use of Hungarian notation** to mark globals, pointers, pointers to pointers, and non-pointer handles. This is rule 7.2 in [Barr](https://barrgroup.com/sites/default/files/barr_c_coding_standard_2018.pdf).

I have never followed this convention and haven't seen it being used much. Should we? Seems a bit excessive to me.

This PR currently includes guidelines that I find acceptable. I regard most of these as bikeshed issues, and I do not feel strongly about most of them. I'm happy to discuss them further, though, since more discussion and consensus now will make things go more smoothly later.

There are two things I personally don't like about Barr:

  • new lines before opening curly braces; I prefer to keep them on the same line as the branch condition or function signature; and
  • multi-line block comments that start with /** not immediately being followed by a newline.

In other words, I (strongly) prefer:

/**
 * My foo function.
 * Bla bla bla.
 */
function foo() {
  // stuff
}

over

/** My foo function.
 *  Bla bla bla.
 */
function foo()
{
  // stuff
}

@lhstrh
Copy link
Member

lhstrh commented Aug 31, 2022

I think writing comments at the interface is preferable. Ideally, we'd leverage something like Doxygen's implements command to inherit the interface description and supplement it with implementation-specific details if needed.

@petervdonovan
Copy link
Contributor Author

I think that all of the comments @lhstrh gave above seem reasonable. As for Hungarian notation: Honestly, I don't know if it is a good idea. It's hard to determine whether things like this are good practices because formal evaluation of productivity in software doesn't seem to work.

However, I think it plausible that Barr is right about it because

  • I think I have in the past wanted to encode in a variable name whether that variable was a pointer, but been discouraged from doing so by the fact that writing pointer_ or worse, pointer_to_pointer_to in full is too verbose. This can matter. Suppose you want to check a pointer to some boolean field in a conditional, but you forget to dereference it. Then you are checking whether the pointer is null instead of checking the boolean that it points to.
  • We have enough global variables that it can be hard to keep track of which variables are static and which are global. This changes how far one must look to determine what code could modify the variable, and it also pins a sort of "mark of shame" to global variables that draws attention to their danger.

The b prefix might be unnecessary -- it strikes me as a "Systems Hungarian" sort of thing that is unnecessary if you use stdbool.h. It could save a character in some cases -- instead of writing a name like is_done, where "is" marks it as boolean, you could write b_done -- but this seems like a micro-optimization to me. I don't feel strongly about it.

I also am unsure why Barr doesn't include a prefix such as s that means "array." We occasionally work with pointers to arrays and arrays of pointers, and which kind of data structure you are working with really is not reflected in the data type. It also seems to fit the "making wrong code look wrong" mantra in this famous post because knowing whether something is an array vs. just a regular pointer determines whether it makes sense to do pointer arithmetic on it.

Given that this notation is reputed to be really annoying to those who don't like it, it might help for @edwardalee to chime in if he cares.

@edwardalee
Copy link
Contributor

I can't find a rule 7.2 in Barr nor any mention of Hungarian notation in Barr, so I'm not sure what is being proposed. Hungarian notation was developed by Charles Simonyi (who happens to be Hungarian), who was the lead developer of Microsoft Windows, not what I would consider a model piece of software. I've never been a fan of certain aspects of Hungarian notation, especially including types in the notation, and prefer to have a decent IDE, to heed compiler warnings, and to have a good type checker. In C, names can be used to compensate somewhat for limitations in scoping rules, e.g. to prefix globals with lf_ or _lf_ to designate globals that belong to the LF API and either are or are not intended to be used by programmers. But these are more ad-hoc local rules than global policies. What exactly is being proposed here?

@petervdonovan
Copy link
Contributor Author

Sorry for the confusion. I wrote 7.2, but I meant 7.1. It's on page 48, which is the 60th page appearing the the PDF. The idea is to use a g to mark globals, a p to mark pointers, a b to mark booleans, and an h to mark non-pointer handles such as file handles (in that order). These would appear as prefixes to names, so the the global pqueue pointer reaction_q would become gp_reaction_q and the global dynamically-allocated array of pointers to booleans would become gppb_lf_is_present_fields (where the double p to marks a pointer to a pointer).

Recently, I also commented that the b was unhelpful because it is Systems Hungarian (i.e., it denotes type, which as you say, is usually not helpful) and that it might be nice to distinguish arrays from pointers using a letter such as s, in which case we could end up with something like gsp_lf_is_present_fields for our global array of pointers to booleans. If we like the "lf" prefix instead of "g", then that's fine with me too.

Maybe this is too complicated, and I'm happy to just ignore this part of the style guide. There's really no good reason why I raised this issue, other than the fact that it was in this style guide which I picked arbitrarily, and that I was intrigued by the idea that marking what is a pointer or a pointer to a pointer could help in navigating certain data structures.


Incidentally, I'm not sure the _lf_ prefix is good for us to use because "reserved names include all external identifiers (global functions and variables) that begin with an underscore (‘_’)"; Barr forbids leading underscores, perhaps for this reason. I'm not sure if that applies to us.

@petervdonovan
Copy link
Contributor Author

(By the way, the same document that forbids leading underscores also forbids type names ending with _t; this is also discussed here, where the accepted answer notes that POSIX reserves the _t suffix.)

Since our code base is pretty small, the number of naming conflicts might be rare -- we don't seem to have had any yet.

@lhstrh
Copy link
Member

lhstrh commented Sep 1, 2022

(By the way, the same document that forbids leading underscores also forbids type names ending with _t; this is also discussed here, where the accepted answer notes that POSIX reserves the _t suffix.)

Since our code base is pretty small, the number of naming conflicts might be rare -- we don't seem to have had any yet.

I think we do have quite a few though: interval_t, reaction_t, etc.

@petervdonovan
Copy link
Contributor Author

I think we do have quite a few though: interval_t, reaction_t, etc.

Right. That means that if we followed that guideline, we would have to change a lot of code, and it would be a mess.
Furthermore, we have not experienced real problems (naming conflicts) due to our use of the _t suffix, and Barr actually recommends the use of the _t suffix. Therefore, we should decide whether it is worth the trouble to guard ourselves from naming conflicts. A similar concern applies to the leading underscore, which is discouraged for the same reason as the _t suffix (and also because Barr recommends against the leading underscore).

@petervdonovan
Copy link
Contributor Author

An unrelated observation is that if this discussion becomes too complicated, and if we end up creating a lot of exceptions to the Barr standard, we could just abandon this particular attempt at a style guide and start from scratch (while keeping an eye on resources like MISRA and this) so that our style guide is as close as possible to what we already do.

I really just want the process of adopting a style to be as painless as possible...

@lhstrh
Copy link
Member

lhstrh commented Sep 1, 2022

I think the most pragmatic thing is to point to a style guide for the meat of it and mention a short list of exceptions. Are there other style guides to consider that might be closer to what we want? If not, I think we're doing the right thing already.

@edwardalee
Copy link
Contributor

We have had some naming conflicts. We had defined macros named MIN and MAX, which I had to rename LF_MIN and LF_MAX to get our code to compile with the Nordic base for the nrf52.

As for the leading underscore, we could avoid it if we refactor header files so that user code never has these variables/functions in scope and so that we entirely eliminate any #include of C files (I think most of these are gone, but not all?).

@lhstrh
Copy link
Member

lhstrh commented Sep 6, 2022

It's funny that Barr uses the MIN/MAX macro as an example of what not to do (use an inline function instead).

This commit
- Forbids horizontal alignment
- Forbids all forms of Hungarian notation
- Forbids the use of comments as explicit section headers
- Defines a stance on doc comment placement
- Fixes two small mistakes in the templates: Private procedure
implementations should appear after public procedure implementations,
and globals must be declared (not defined) in header files.
@petervdonovan
Copy link
Contributor Author

I think my latest commit addresses the discussion we had above. The commit message itemizes the changes.

Copy link
Member

@lhstrh lhstrh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Outdated Show resolved Hide resolved
CONTRIBUTING.md Show resolved Hide resolved
@lhstrh lhstrh requested a review from edwardalee September 8, 2022 19:04
@lhstrh lhstrh requested a review from cmnrd September 9, 2022 04:56
Copy link
Contributor

@edwardalee edwardalee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. One downside of this is that it requires everyone to go read Barr, an 87 page document...

CONTRIBUTING.md Outdated Show resolved Hide resolved
@petervdonovan
Copy link
Contributor Author

One downside of this is that it requires everyone to go read Barr, an 87 page document...

I agree that this is a problem. My goal was not to create busywork, and I had hoped that it would be possible for people to mostly continue with the existing style and only sometimes refer to Barr to answer questions and resolve disagreements. If several people end up having to read it all cover-to-cover, that would be unfortunate, but maybe having a style guide is worth it to us.

@petervdonovan petervdonovan merged commit 79bdc4b into main Sep 13, 2022
@petervdonovan petervdonovan deleted the style-guide branch September 13, 2022 23:06
@lhstrh lhstrh added the enhancement Enhancement of existing feature label Feb 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement of existing feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants