Automated accessibility testing of U.S. Federal Government websites using a serverless infrastructure.
Disclaimer: The scans do not constitute a complete accessibility evaluation. Due to the limitations of automated testing software, one should not take these scan results to be authoritative or to convey a Section 508 conformance assessment. Only a professional evaluator can perform a complete accessibility evaluation, often using a combination of manual and automated testing. For guidance, please refer to the Harmonized Testing Process for Section 508 Compliance: Baseline Tests for Software and Web Accessibility.
Following these steps will help you get started.
If you're only interested in the list of Federal domains we scan, you can check out the spreadsheet that has them all (at least the ones we've been able to find), as well as the script used to generate that file.
Follow the instructions here to install and configure the AWS CDK. You'll need to install node.js as a part of this step if you don't already have it.
You must specify your credentials and an AWS Region to use the AWS CDK CLI. There are multiple ways to do this, but our examples (and Makefile) use the --profile option with cdk commands.
This project uses Python 3.8, although other versions >= 3.5 should be fine. You can install Python from here, or use a system package manager (e.g. Homebrew on macOS).
Next, create and activate a Python virtual environment, then install the dependencies:
python -m venv env
source env/bin/activate
pip install -r requirements.txt
These instructions prepare assets for deployment via the AWS CDK.
Before we let the AWS CDK deploy the a11y lambda function, we need to make a lambda layer for headless chrome and then tweak the internals of pa11y, the accessibility scanning tool, to use headless chrome.
To create the lambda layer with chrome-aws-lambda and replace pa11y's dependency on puppeteer with puppeteer-core, run:
make build_a11y_scan
The above command will install the node modules into lambdas/a11y_scan and create a zip archive called chrome_aws_lambda.zip within /lambdas/.
This lambda joins all of the individual scan results into one aggregate file, which will be used by the front end of the application. It also calculates historical trends and saves them for future reference.
To build this lambda, run:
make build_results_joiner
After this, you'll have a new directory in the root of the repo called lambda_releases with a file called results.joiner.zip. That is the lambda deployment package.
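To make the joiner's role more concrete, here is a minimal sketch of the aggregation and trend-tracking steps. It is not the project's actual code: it assumes each scan result is a dict with hypothetical `domain` and `issues` keys, and the real result schema and aggregate file layout may differ.

```python
from collections import Counter
from typing import Iterable, List


def join_results(results: Iterable[dict]) -> dict:
    """Combine per-domain scan results into one aggregate structure.

    Assumed (hypothetical) shape of each result:
        {"domain": "example.gov", "issues": [{"code": "..."}, ...]}
    """
    domains: List[dict] = []
    issue_codes: Counter = Counter()
    for result in results:
        issues = result.get("issues", [])
        domains.append({"domain": result["domain"], "issue_count": len(issues)})
        issue_codes.update(issue["code"] for issue in issues)
    return {
        "domains_scanned": len(domains),
        "total_issues": sum(d["issue_count"] for d in domains),
        "issues_by_code": dict(issue_codes),
        "domains": sorted(domains, key=lambda d: d["domain"]),
    }


def append_history(history: List[dict], scan_date: str, aggregate: dict) -> List[dict]:
    """Return a new history list with one summary point added for this scan."""
    point = {
        "date": scan_date,
        "domains_scanned": aggregate["domains_scanned"],
        "total_issues": aggregate["total_issues"],
    }
    return history + [point]
```

Keeping the history as a list of small summary points means the trend file grows by one entry per scan rather than by the full set of results.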
The scan pipeline:
lambda_gatherer is a Lambda Function triggered on the 1st and 15th of every month, sending one message per row in ./domains/domains.csv to the domain_queue SQS queue.
lambda_a11y_scan is a Lambda Function with domain_queue as its event source. It uses pa11y to scan each site, writing the results of each scan to an individual JSON file in the results_bucket.
lambda_joiner is a Lambda Function triggered on the 8th and 23rd of every month. It generates summary statistics from the JSON files in the results_bucket, writing those results as two larger JSON files, data.json and hist.json, to the data_bucket S3 bucket. Importantly, all objects within the results_bucket are deleted every 10 days, hence the gap of fewer than 10 days between the trigger days of the gatherer and joiner functions.
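The timing constraint can be checked with a little date arithmetic. This sketch takes the trigger days and the 10-day retention period from the description above; it treats every scan result as written on the gatherer's trigger day, which is a simplifying assumption.

```python
from datetime import date

GATHERER_DAYS = (1, 15)   # days of the month the gatherer runs
JOINER_DAYS = (8, 23)     # days of the month the joiner runs
RETENTION_DAYS = 10       # results are deleted after this many days


def gaps_between_gather_and_join(year: int, month: int) -> list:
    """Days from each gatherer run to the next joiner run in the same month."""
    gaps = []
    for gather_day in GATHERER_DAYS:
        next_join = min(d for d in JOINER_DAYS if d > gather_day)
        delta = date(year, month, next_join) - date(year, month, gather_day)
        gaps.append(delta.days)
    return gaps


gaps = gaps_between_gather_and_join(2020, 1)
print(gaps)  # [7, 8]
```

Both gaps are shorter than the retention period, so the joiner runs before the results it needs are removed.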
Could be done more elegantly with Step Functions...one day.
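As an illustration of the first stage of the pipeline, the sketch below shows one way a gatherer could turn CSV rows into SQS messages. It is not the project's implementation. The column layout of domains.csv is not described here, so the code forwards each row as JSON; `send_batch` is a hypothetical callable standing in for a call such as boto3's `send_message_batch`, which accepts at most 10 entries per request.

```python
import csv
import io
import json
from typing import Callable, Iterable, Iterator, List


def rows_to_entries(csv_text: str) -> Iterator[dict]:
    """Yield one SQS batch entry per CSV row (a header row is assumed)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader):
        yield {"Id": str(index), "MessageBody": json.dumps(row)}


def chunked(entries: Iterable[dict], size: int = 10) -> Iterator[List[dict]]:
    """Group entries into lists of at most `size` (the SQS batch limit is 10)."""
    batch: List[dict] = []
    for entry in entries:
        batch.append(entry)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def gather(csv_text: str, send_batch: Callable[[List[dict]], None]) -> int:
    """Send every row to the queue in batches; return the number of rows sent."""
    sent = 0
    for batch in chunked(rows_to_entries(csv_text)):
        send_batch(batch)
        sent += len(batch)
    return sent
```

In a real Lambda, `send_batch` might wrap `sqs_client.send_message_batch(QueueUrl=..., Entries=batch)`. Passing it in as a parameter keeps the batching logic testable without AWS credentials.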
First, we'll create a CloudFormation stack to manage our infra's state, as well as the S3 buckets for the lambda assets and for the CSV containing the domains we'll be scanning.
`cdk bootstrap --profile <your profile name>`
This shouldn't take too long and should finish with a message that looks something like this:
✅ Environment aws://<account id>/<your-region> bootstrapped.
Now we can deploy / redeploy our Stack:
`cdk deploy --profile <your profile name>`
After that command has finished, the resources specified in app.py will have been deployed to the AWS account you configured with the CDK. You can now log into your AWS Console and check out all the stuff.
You can optionally see the CloudFormation template generated by the CDK. To do so, run cdk synth, then check the output file in the "cdk.out" directory.
You can destroy the AWS resources created by this app with cdk destroy --profile <your profile name>. Note that although we've given the S3 Buckets a removalPolicy of cdk.RemovalPolicy.DESTROY so that they aren't orphaned at the end of this process (you can read more about that here), they'll fail to get destroyed if they contain objects. So you should log into the console and delete all of the objects within the buckets beforehand.
Note, however, that this step will not destroy the CloudFormation Stack or the S3 bucket created by cdk bootstrap. There doesn't seem to be a way to do this from the command line at present, so you should log into your AWS console and manually delete first the S3 bucket and then the CloudFormation Stack.
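If deleting objects by hand becomes tedious, emptying a bucket can also be scripted. The following is a sketch and not part of this project: it assumes boto3 is installed and that you use the same named profile as with the cdk commands. The helper names are hypothetical, and the bucket name is something you would look up yourself. `empty_bucket` works on any object exposing boto3's `Bucket` resource interface.

```python
def empty_bucket(bucket) -> None:
    """Delete every object (and any object versions) in a boto3 Bucket resource."""
    bucket.objects.all().delete()
    # Only matters if versioning was ever enabled on the bucket.
    bucket.object_versions.delete()


def bucket_for(profile_name: str, bucket_name: str):
    """Build a boto3 Bucket resource for the given profile (requires boto3)."""
    import boto3  # imported lazily so empty_bucket has no hard dependency

    session = boto3.Session(profile_name=profile_name)
    return session.resource("s3").Bucket(bucket_name)


# Example call (hypothetical names, not run here):
#   empty_bucket(bucket_for("<your profile name>", "<bucket name>"))
```

Deletion is irreversible, so double-check the bucket name before running anything like this.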
GNU General Public License. See it here.