-
Notifications
You must be signed in to change notification settings - Fork 932
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add doxygen-check pre-commit hook #11076
Add doxygen-check pre-commit hook #11076
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome @karthikeyann. This is going to work great with a few alterations. Comments below.
@@ -0,0 +1,18 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To my knowledge, there is currently no way to let pre-commit manage the doxygen
executable it uses, which means we have to use language: system
. (This is solved for clang-format by this project: https://github.com/ssciwr/clang-format-wheel)
I don't think we have any hard dependencies on packages outside of what pre-commit manages right now. I would adapt this script to fail cleanly (exit 0) if doxygen is not installed. It will still fail in CI. I believe that is how the cmake-format script works already (with some additional details that you can see by reading it).
ci/checks/style.sh
Outdated
@@ -55,13 +55,12 @@ HEADER_META_RETVAL=$? | |||
echo -e "$HEADER_META" | |||
|
|||
# Run doxygen and check for missing documentation | |||
DOXYGEN=`cd cpp/doxygen && { cat Doxyfile ; echo QUIET = YES; echo GENERATE_HTML = NO; } | doxygen - 2>&1 | tail -n +3` | |||
DOXYGEN=`ci/checks/doxygen.sh` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section can be removed. With this PR, the invocation of pre-commit above will call doxygen for us.
The clang-format check is also redundant with pre-commit now, but that hasn't been removed yet either. We should remove that in a separate PR -- we'll need to survey the team to find if anyone relies on the Python script to manually call clang-format. It is no longer needed by our CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit checks could be skipped pretty easily. so, we will need a CI check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that our CI style check already calls pre-commit
explicitly. The CI does not manually invoke black
, isort
, flake8
, etc. It just runs this pre-commit configuration.
Lines 23 to 24 in a00cca6
pre-commit run --hook-stage manual --all-files | |
PRE_COMMIT_RETVAL=$? |
If we add this as a pre-commit hook, we don't need to call it once via pre-commit
and again manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit will not run doxygen if there are no file changes in cpp/include
in that PR. It may not run always.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The CI command above is invoked with --all-files
for this reason. It will always run, as we intend.
Try pre-commit run --hook-stage manual --all-files
locally and you can see for yourself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed doxygen check in style.sh, updated pre-commit to run in manual mode too (it did not run earlier since stages 'commit' was specified).
@@ -43,6 +43,7 @@ dependencies: | |||
- black=22.3.0 | |||
- isort=5.6.4 | |||
- mypy=0.782 | |||
- doxygen=1.8.20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest on conda-forge is 1.9.2. Is that supported? If so, I would update this pinning. If not, we should try to make our code compatible with the latest release. https://anaconda.org/conda-forge/doxygen
- doxygen=1.8.20 | |
- doxygen=1.9.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I installed 1.9.2 and got these warnings.
warning: Tag 'OUTPUT_TEXT_DIRECTION' at line 102 of file '-' has become obsolete.
To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'COLS_IN_ALPHA_INDEX' at line 1089 of file '-' has become obsolete.
To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'LATEX_SOURCE_CODE' at line 1836 of file '-' has become obsolete.
To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'RTF_SOURCE_CODE' at line 1926 of file '-' has become obsolete.
To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
warning: Tag 'DOCBOOK_PROGRAMLISTING' at line 2031 of file '-' has become obsolete.
To avoid this warning please remove this line from your configuration file or upgrade it using "doxygen -u"
Since this is a language: system
package that is not managed by pre-commit, we will need to make sure it works for slightly-older and current versions. At the time of writing:
- Ubuntu 18.04 ships doxygen 1.8.13.
- Ubuntu 20.04 ships doxygen 1.8.17.
- Ubuntu 22.04 ships doxygen 1.9.1.
I'd say that supporting 1.8.13 and newer is an ideal goal. We may need to add filters for warnings that arise in 1.9.x, or simply call doxygen -u
to update the configuration if it is still correct for 1.8.x.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am using doxygen 1.8.20
because that's the version use to generate https://docs.rapids.ai/api/libcudf/nightly/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pre-commit is not managing this dependency and uses a system (or conda) installation of doxygen
, whatever is on the user's path. Users of Ubuntu 22.04 who do not use a conda environment to build would likely be fetching the system-installed apt package for doxygen 1.9.1. Those users would get warnings and the pre-commit check would fail. Regardless of whether users are in a conda environment, they can make use of all other pre-commit hooks because they are self-managed by pre-commit itself. Because this hook cannot be self-managed by pre-commit, we need to take care of two small things:
- Users without doxygen installed must be able to make pre-commit pass locally (that's my suggestion above for
exit 0
if doxygen isn't found). In this case, CI (which has doxygen installed) will catch any issues. - Users with newer versions of doxygen must be able to make pre-commit pass locally (because we cannot pin a dependency that is not managed by pre-commit). I think we can easily adapt our configuration to be compatible with both 1.8.x and 1.9.x, and that should solve 95% of the potential problems here.
If you'd like to see a concrete solution, I can push a small commit to help resolve both issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to add -- if we solve the two problems above (no doxygen / new doxygen), it's totally fine to pin this the same way as our official documentation builds (1.8.20).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got the requirement about system python. That's the reason for using wheel version of clang-format too!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly! First, we prefer pre-commit hooks that are Python packages (or pip-installable binaries like clang-format, which doesn't have any Python code but is still pip-installable). Second, we prefer language: system
as a fallback option that may require additional work. From the list of Supported Languages, we really only want python
or system
(or possibly script
?). All the others require the system to have other tools installed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from unsupported tag
warnings, there are many bogus warning in older versions <1.8.20
.
Adding regex to ignore all those warnings is a hacky approach, and not recommended. I will enforce the doxygen version check in ci/check/doxygen.sh
.
Regarding updating the Doxyfile configuration for different version, unsupported tags alone can be removed to not throw unsupported tag
warnings on version from 1.8.20 to 1.9.1.
For doxygen versions >1.9.1
there are bogus warnings again. So, we cannot support them yet. (for example, default constructor without argument is throwing warning that their parameters are not all documented)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will enforce the doxygen version check in ci/check/doxygen.sh.
That sounds good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR updated. Please re-review
Co-authored-by: Bradley Dice <[email protected]>
Codecov Report
@@ Coverage Diff @@
## branch-22.08 #11076 +/- ##
===============================================
Coverage ? 86.34%
===============================================
Files ? 144
Lines ? 22741
Branches ? 0
===============================================
Hits ? 19635
Misses ? 3106
Partials ? 0 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This still needs to exit gracefully if doxygen versions don't match (or doxygen isn't found).
ci/checks/doxygen.sh
Outdated
if [ $(version "$DOXYGEN_VERSION") -lt $(version "1.8.20") ] || [ $(version $DOXYGEN_VERSION) -gt $(version "1.9.1") ]; then | ||
echo -e "Unsupported doxygen version $DOXYGEN_VERSION" | ||
echo -e "Expecting doxygen version from 1.8.20 to 1.9.1" | ||
exit 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if doxygen is not installed?
What happens if an incompatible version is installed?
I would prefer for those cases to pass (in other words, silently skip this check if it cannot be completed correctly with the user's installed software). Any issues will be caught by CI. Otherwise users have to manually override pre-commit's failure in order to commit.
This is the same strategy we adopt for cmake-format: https://github.com/rapidsai/cudf/blob/branch-22.08/cpp/scripts/run-cmake-format.sh
We need a way to invoke CMake linting commands without causing pre-commit failures (which could block local commits or CI)
... exits gracefully if the file is not found.
cudf/cpp/scripts/run-cmake-format.sh
Lines 59 to 64 in 77ca025
echo "The rapids-cmake cmake-format configuration file was not found at any of the default search locations: " | |
echo "" | |
( IFS=$'\n'; echo "${DEFAULT_FORMAT_FILE_LOCATIONS[*]}" ) | |
echo "" | |
echo "Try setting the environment variable RAPIDS_CMAKE_FORMAT_FILE to the path to the config file." | |
exit 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if doxygen is not installed?
Fails
What happens if an incompatible version is installed?
Fails
If doxygen-check passes when doxygen is not installed or with unsupported version, I am concerned that users might be confused that local pre-commit doxygen-check passes, but pre-commit in CI failed.
Since pre-commit swallows STDOUT and STDERR when exit code is zero, users may not know the reason for local doxygen-check pass. IMO, it's showing different behavior in CI vs local machine silently, which ideally should not happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already lean towards letting things pass locally and fail in CI rather than putting requirements on users to install all system dependencies just to run basic code formatters on their local system. Python users in particular shouldn’t be required to install doxygen if they only make a small C++ tweak. I’d like to hear what @vyasr or @PointKernel think since they’ve also worked with pre-commit hooks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that it is not possible to satisfy this version requirement (1.8.20 <= version <= 1.9.1) using apt
on Ubuntu 18.04 or 20.04. Only Ubuntu 22.04 has a compatible version (doxygen 1.9.1) and that could be upgraded to an incompatible version in the future at any time. The lack of graceful exit means that users effectively must use conda to satisfy the doxygen requirement (or build doxygen from source), which we specifically sought to avoid in the past. Users should be able to make pre-commit pass with software from only a system package manager, with no conda required. Otherwise users will simply not utilize pre-commit and benefit from the majority of features that do work on all systems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already lean towards letting things pass locally and fail in CI rather than putting requirements on users to install all system dependencies
Can you please provide more details on how this decision was made? I personally find it's impractical that sometimes my local pre-commit checks passed but then the CI failed. e.g. some of the cmake format failures, if not found by the local pre-commit hook, were obvious but others took me a while to "guess" where went wrong. If users choose to contribute to the code, I think it's fair to require them to install all the proper dependencies.
Python users in particular shouldn’t be required to install doxygen if they only make a small C++ tweak.
If they commit the "tweak", enforcing doxygen check is a reasonable requirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the errors come at once. No early exit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d support exploring the build-from-source option within pre-commit.
Do you mean, build-doxygen-from-source?
It brings doxygen's compilation problems into pre-commit installation. I would support doxygen binaries through pip-installable wheels. https://sourceforge.net/projects/doxygen/files/ is actively maintained. so it's possible to implement similar to clang-format wheels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused here.
If we're going to require a specific version of a thing, we should automate getting it. We did this with clang-format.
By "we did it with clang-format", do you mean in the form of our conda environment? If so, that's still the case. Someone creating a custom environment for building cudf (or using compose) will get the correct version. If someone tries to use their own environment, though, they're on their own. That is also identical with how clang-format worked before.
I wonder if pre-commit hooks can execute code on installation (or build on first run?). That would be equally as good as pip-installable wheels.
I don't know how much control you have over the installation step, but you can execute basically arbitrary code on first run to do the installation if it doesn't already exist (in some prespecified directory). I wouldn't say that it's equally as good, though. Compiling code within a pre-commit hook immediately opens you up to any sort of build failure, which is also essentially out of your control to debug.
However, there is already a model for making wheels for non-Python packages in the clang-format wheels so I might lean towards trying that way first.
This is the better approach if we feel strongly about doxygen needing to always be available in this fashion. If we're already going to do the work to build it in a pre-commit hook, we're most of the way there to building the wheels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"We did this with clang-format"
clang-format binary is packaged as pip installable repo. ssciwr/clang-format-wheel). So it avoided using creating conda environment.
My first idea is to use conda for automate installing specific version of doxygen. but it creates additional dependency to have conda
installed in the system, which we don't like to put these additional conditions (conda
) on the developer.
@jrhemstad is suggesting to create a pip installable doxygen repo which can install doxygen binary, similar to clang-format.
@jrhemstad are you suggesting to build doxygen-from-source in pre-commit installation OR in pip installable doxygen repo releases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d echo all of what @vyasr said here: #11076 (comment)
Building from source on the first run of the pre-commit hook is not ideal — I only proposed exploring it because it was a different approach than previous ideas. Packaging doxygen binary wheels is the way I’d like to go if we decide to move forward here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once we have consensus on the exit code for local runs of pre-commit when doxygen is not installed / wrong versions. edit: resolved by verbose: true
and exit 0
. I really appreciate your hard work on this, it’s tricky to consider all the cases but it will be very nice for maintaining the quality of the docs.
Do we need to add a section in contributing guide mentioning how Doxygen check works in cudf? |
Added a line about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me! Thanks @karthikeyann !
Can we update the PR description with a little more detail on what this will actually do? |
Is the required/recommended doxygen version documented anywhere? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with one minor style suggestion.
function version { echo "$@" | awk -F. '{ printf("%d%03d%03d%03d\n", $1,$2,$3,$4); }'; } | ||
|
||
# doxygen supported version 1.8.20 to 1.9.1 | ||
DOXYGEN_VERSION=`doxygen --version` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DOXYGEN_VERSION=`doxygen --version` | |
DOXYGEN_VERSION=$(doxygen --version) |
For consistency with the rest of the script, we should switch this to use $()
instead of ``
.
For that matter, is the required clang-format version documented anywhere (I do not consider environment or pre-commit files to be documentation)? We probably want both. Also as discussed above, we should change the cmakelang hooks to also use |
@gpucibot merge |
Related to [FEA] Add Doxygen CI check #9373 (Add Doxygen CI check)