This repository has been archived by the owner on Dec 20, 2022. It is now read-only.

mozilla-services/edge-validator


edge-validator

A service-endpoint for validating pings against mozilla-pipeline-schemas.


See bug 1452166 for motivating background.

Quickstart

Start the docker container to run the service locally at localhost:8000. The following command fetches the latest image from DockerHub.

docker run -p 8000:8000 -it mozilla/edge-validator:latest

Simply POST to the endpoint to check if a document is valid. The testing namespace has an example schema for validation.

$ OK_DATA="$(echo '{"payload": {"foo": true, "bar": 1, "baz": "hello world"}}')"
$ curl -X POST -H "Content-Type: application/json" -d "${OK_DATA}" localhost:8000/submit/testing/test/1
> OK

The endpoint returns 200 OK on a successful POST. It returns 400 BAD if the posted document does not pass validation. If the URI is malformed, the validator may return 404 NOT FOUND. The response also includes the exception that caused the error.

$ BAD_DATA="$(echo '{"payload": {"foo": null, "bar": "3", "baz": 55}}')"
$ curl -X POST -H "Content-Type: application/json" -d "${BAD_DATA}" localhost:8000/submit/testing/test/1
> BAD: ('type', '#/properties/payload/properties/foo', '#/payload/foo')

In this example, payload.foo should be a boolean and payload.baz should be a string. Currently, only the first validation exception is propagated to the user, which is why the response above reports payload.foo alone.
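The same requests can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_request` and `submit` are hypothetical and not part of edge-validator, and the default host assumes the service is reachable at localhost:8000 as in the Quickstart.

```python
import json
import urllib.error
import urllib.request


def build_request(document, namespace, doctype, docversion, host="localhost:8000"):
    """Build (but do not send) a POST request for /submit/<namespace>/<doctype>/<docversion>."""
    url = f"http://{host}/submit/{namespace}/{doctype}/{docversion}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request):
    """Send the request and return a (status_code, response_text) tuple."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the body still carries the
        # validator's message, so it is returned instead of re-raising.
        return error.code, error.read().decode("utf-8")
```

With the container running, `submit(build_request(doc, "testing", "test", 1))` sends `doc` to the test schema and returns the status code together with the response text.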

The exposed port can be changed through the PORT environment variable. To validate against a local set of JSON schemas, mount a folder structure mirroring mozilla-services/mozilla-pipeline-schemas to the container's /app/resources/schemas directory.

cd mozilla-pipeline-schemas
docker run -p 8000:8000 -v "$(pwd)"/schemas:/app/resources/schemas -it edge-validator

User Guide

API

Generic Ingestion

The generic ingestion specification provides enough context to map the ping to a schema.

The namespace distinguishes different data collection systems from each other. Telemetry is the largest consumer of the ingestion system to date. The document type differentiates messages in the ingestion pipeline. For example, the schemas of the main and crash pings share little overlap. The document version distinguishes revisions of a document type. Finally, the document id is used to check for duplicates. Duplicate checking happens in the running pipeline, but is not supported here.

POST /submit/<namespace>/<doctype>/<docversion>/[<docid>]
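As an illustration of how the route breaks down into its components, here is a minimal sketch that parses a generic ingestion path. The pattern and the function name `parse_generic_uri` are assumptions made for this example; the service's own routing may differ.

```python
import re

# Matches /submit/<namespace>/<doctype>/<docversion>/[<docid>],
# with the document id (and a trailing slash) optional.
GENERIC_ROUTE = re.compile(
    r"^/submit/(?P<namespace>[^/]+)/(?P<doctype>[^/]+)/(?P<docversion>[^/]+)"
    r"(?:/(?P<docid>[^/]+))?/?$"
)


def parse_generic_uri(path):
    """Return the route components as a dict, or None if the path is malformed."""
    match = GENERIC_ROUTE.match(path)
    if match is None:
        return None
    return match.groupdict()
```

A path without a document id yields `docid` set to `None`; a path with too few or too many segments yields `None`, which corresponds to the malformed-URI case.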

The schemas are mounted under the application directory /app/resources/schemas with the following convention:

/schemas/<NAMESPACE>/<DOCTYPE>/<DOCTYPE>.<DOCVERSION>.schema.json

The following tree shows a subset of the resource directory.

/app/resources
└── schemas
    ├── telemetry
    │   ├── anonymous
    │   │   └── anonymous.4.schema.json
    │   ├── core
    │   │   ├── core.1.schema.json
    │   │   ├── core.2.schema.json
    │   │   ├── core.3.schema.json
    │   │   ├── core.4.schema.json
    │   │   ├── core.5.schema.json
    │   │   ├── core.6.schema.json
    │   │   ├── core.7.schema.json
    │   │   ├── core.8.schema.json
    │   │   └── core.9.schema.json
    │   ├── crash
    │   │   └── crash.4.schema.json
    │   ├── main
    │   │   └── main.4.schema.json
    │   └── ...
    │       ├── ...
    │       ├── ...
    │       └── ...
    └── testing
        └── test
            └── test.1.schema.json
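The layout shown in the tree can be expressed as a small lookup helper. This is a sketch under the assumption that every schema lives at `<namespace>/<doctype>/<doctype>.<docversion>.schema.json` beneath the schema root, as the tree suggests; `schema_path` is a hypothetical name, not a function from the project.

```python
from pathlib import PurePosixPath

# Schema root inside the container, as described above.
SCHEMA_ROOT = PurePosixPath("/app/resources/schemas")


def schema_path(namespace, doctype, docversion, root=SCHEMA_ROOT):
    """Map the route components to the schema file they select."""
    filename = f"{doctype}.{docversion}.schema.json"
    return root / namespace / doctype / filename
```

For example, `schema_path("testing", "test", 1)` points at the example schema used in the Quickstart.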

Telemetry Ingestion

The edge-validator implements the Edge Server POST request specification for Firefox Telemetry. The validator will reroute the request as a generic ingestion request.

POST /submit/<namespace>/<docid>/<appName>/<appVersion>/<appUpdateChannel>/<appBuildId>

Installation

Building from source

# clone and set the working directory
$ git clone --recursive https://github.com/mozilla-services/edge-validator.git
$ cd edge-validator

# if the `--recursive` option was omitted, then update and initialize the submodule
$ git submodule update --init

# make sure that the system pip is up to date
$ pip install --user --upgrade pip

# install the dependencies into a virtual environment
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt

# bootstrap for test/report/serve
$ make sync

The docker environment is suitable for running a local service or for running any of the testing suites.

make shell

# Alternatively
$ docker run -p 8000:8000 -it edge-validator:latest bash

If you don't need to change the engine itself, you can pull the prebuilt mozilla/edge-validator:latest image from DockerHub.

Serving

Serving via docker host (recommended)

docker --version          # ensure that docker is installed
make build                # build the container
make serve                # start the service on localhost:8000

Serving via local host

The docker host automates the following bootstrap process.

flask run --port 8000     # run the application

Running Tests

Unit tests do not require any dependencies and can be run out of the box. The sync command will copy the test resources into the application resource folder.

make sync
make test

You may also run the tests in docker in the same way as CI. A junit.xml file is generated in a test-reports folder. The image must be rebuilt to include modified test files.

IMAGE=edge-validator:latest ./docker_env.sh test
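The generated junit.xml can be summarized with a few lines of Python. The sketch below assumes the common JUnit XML layout, in which each `testsuite` element carries `tests`, `failures`, `errors`, and `skipped` attributes; the exact file produced by this project's test runner has not been checked, and `summarize_junit` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Sum the test counters over every <testsuite> element in a JUnit report."""
    root = ET.fromstring(xml_text)
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    # iter() also visits the root, so both a <testsuites> wrapper and a
    # bare <testsuite> root are handled.
    for suite in root.iter("testsuite"):
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals
```

Reading the file from the test-reports folder and passing its text to `summarize_junit` gives a quick pass/fail overview without opening the XML by hand.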

Running Integration Tests

An integration report gives a performance report based on sampled data.

Ensure that the Google Cloud SDK is correctly configured.

bq show moz-fx-data-shared-prod:monitoring.document_sample_nonprod_v1

Then run the report.

# export a google service account
export GOOGLE_APPLICATION_CREDENTIALS=<path/to/credentials.json>

# Run using the local app context
make report

# Run using the docker host
EXTERNAL=1 PORT=8000 make report

The report can also be run in Docker when given the correct permissions.

docker run \
    -v "$GOOGLE_APPLICATION_CREDENTIALS":/tmp/credentials \
    -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/credentials \
    -it edge-validator:latest \
    make report

You may also be interested in a machine consumable integration report.

integration.py report --report-path test-reports/integration.json