Implement GHA workflow that scans the website file tree for broken links #157
Another option we have is to remove the spell checking from this PR and merge in the link checking, and then introduce spell checking as a follow-on task (some day).
Hi @kaijli, I'd be OK merging the link checking portion of this in already. I'm not comfortable with the spell checking portion and would like to continue refining that (mainly, to get it to scan all the files I'm expecting and to not spell check HTML markup as though it is English text). Are you OK with me removing the spell checking stuff from this branch (I'd put it on a new, separate branch)?
This was something I wanted to suggest but only after I had everything figured out, so, glad we came to the same conclusion!
Sounds good to me. I haven't been looking at this much because I didn't want to mess with something and get the hairs crossed, so let me know if I can be of any help!
Thanks, @kaijli! I'll remove the spell checking stuff from this branch this afternoon (after 2pm PT) and merge the remainder of this branch in.
I created this new branch (and draft PR) containing the spellchecker-related code: #187
On this branch, @kaijli and @eecavanna added a link checker workflow to the repository's GitHub Actions. Whenever the workflow that assembles the website runs, it also invokes this new link checker workflow.
The new link checker workflow does the following things. First, it finds all hyperlinks in the HTML files in the website file tree. Next, it visits each of those linked URLs and checks whether the hyperlink is broken. Finally, it writes a report of broken links to the GitHub Actions output.
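For readers who want a feel for those three steps, here is a minimal Python sketch. It is not the link checker this PR uses (the workflow relies on an existing third-party checker); the names `extract_links`, `is_reachable`, `find_broken_links`, and `format_report` are hypothetical, and the sketch assumes that only absolute `http(s)` links are checked and that any HTTP status below 400 counts as working.

```python
from html.parser import HTMLParser
from pathlib import Path
from typing import Callable, Dict, List
import urllib.error
import urllib.request


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text: str) -> List[str]:
    """Step 1: find all hyperlinks in one HTML document."""
    parser = LinkExtractor()
    parser.feed(html_text)
    return parser.links


def is_reachable(url: str, timeout: float = 10.0) -> bool:
    """Step 2 (assumed policy): a link works if a HEAD request returns < 400."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        return False


def find_broken_links(
    site_root: Path, check: Callable[[str], bool] = is_reachable
) -> Dict[str, List[str]]:
    """Maps each HTML file (path relative to site_root) to its broken links."""
    broken: Dict[str, List[str]] = {}
    verdicts: Dict[str, bool] = {}  # cache so each URL is visited only once
    for html_file in sorted(site_root.rglob("*.html")):
        text = html_file.read_text(encoding="utf-8", errors="replace")
        for url in extract_links(text):
            if not url.startswith(("http://", "https://")):
                continue  # this sketch skips relative links, anchors, mailto:
            if url not in verdicts:
                verdicts[url] = check(url)
            if not verdicts[url]:
                page = html_file.relative_to(site_root).as_posix()
                broken.setdefault(page, []).append(url)
    return broken


def format_report(broken: Dict[str, List[str]]) -> str:
    """Step 3: render the broken links as plain text for the job output."""
    if not broken:
        return "No broken links found."
    lines: List[str] = []
    for page, urls in broken.items():
        lines.append(page)
        lines.extend(f"  {url}" for url in urls)
    return "\n".join(lines)
```

Passing the checker in as the `check` argument keeps the network call replaceable, so the file-tree scan and the report formatting can be exercised without visiting any real URLs. Printing `format_report(find_broken_links(Path("_site")))` from a workflow step would put the report in the job log; `_site` is only a placeholder for wherever the built website ends up.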
Also, if the workflow happens to be processing a commit on the `main` branch and it detects any broken links, it creates a GitHub Issue (like this one) listing the broken links.
Note: This PR was initially going to introduce both a link checker and a spell checker. The scope has since changed to include only the link checker. The original PR description is below, and the commit history on this branch includes some content that I want to salvage and put on a different branch, in pursuit of getting a spell checker working.
PR for spell check and link check GitHub Actions.
The link checker used performs what I believe is recursive checking through the compiled documents (since Eric mentions in this issue that lychee is not recursive). I found this specific checker by looking through the link checkers listed in this table generated by lychee.
The spell checker used is based off my list in the original issue. I don't remember the reasons for the order of the list, but if this does not serve us as well down the line, there are other options to look into.
Both actions are defined in one YAML file, which the deployment action calls after the build step. I'm not sure whether to keep it as is or to move the build step to its own file to be called by the deploy action. I think it would be good for this check action to run on every PR and/or deployment because it doesn't hurt.