Skip to content

Latest commit

 

History

History
137 lines (90 loc) · 6.52 KB

datarepo.org

File metadata and controls

137 lines (90 loc) · 6.52 KB

WCT Data Repository

The WCT test framework includes a tests data file repository.

Overview

The repository (repo) provides files for the purpose of:

  • providing known input to tests.
  • providing expected output for comparison to test output.
  • accepting and retaining files for historical comparison across software versions.

Developers of tests that will add to the repository must assure:

  • All files are placed in the repository according to the naming conventions (described next).
  • Guidelines for adding files are followed.

Naming convention

The repository is composed of a files housed in directories named according to the following pattern:

<category>/<version>/<prefix><sep><name>/<files>

The <prefix><sep><name> is the test name as described more in the section of the framework document on test source.

One exception to this pattern is the “input” category. This and other categories as well as other details are described in the remaining sub sections.

Categories

A <category> sub directory takes its name from an enumerated set of labels. The label indicates a particular intention for the subsequent use of the file. The categories include:

  • input are for files explicitly added and for use by tests as input. It has a special layout convention as described below.
  • output are for files that test developers think are important enough to be retained beyond the temporary working directory context of the test but are otherwise uncategorized including data files. This is not meant as a general dumping ground. Adhere to the guidelines (see section Guidelines for files).
  • plots similar to output but sepecifically files that are figures (PDF, PNG, SVG, etc) and which have specific need to be persisted beyond the test.
  • history are for files, usually data, specifically intended for further processing by a historical test. No historical file should be created unless a historical test will consume it.
  • reports are for files, usually figures, specifically intended for further processing and final output to HTML or PDF reports.

Versions

The <version> sub directory must match the “spelling” output by

wire-cell --version

Ultimately, that string comes from git describe --tags. See below for ways to make versions of history category be available. Generally only a small number of past <version> will be supported.

Prefix and names

The <prefix><sep><name> is should match the test program source file name with extension omitted. Developers of tests with the same source file names must assure that any output from the tests are mutually compatible.

Files

File names must NOT:

  • include any <version> information.
  • change between different runs of a test.

They may otherwise be chosen freely. As they will reside in a unique category, version and test prefix and name, these strings need not be repeated. As the intention of the repo is to share files between tests in different test groups, collusion is required by producer and consumer. Once files are produced for a release, especially for input and history categories, files should not be renamed.

Input

The input category does not include <version> but instead has sub-categories describing the general file content.

  • <depos> files holding IDepo data
  • <frames> files holding IFrame data

Test developers are strongly urged to consider use existing input files before introducing new input files.

Once introduced, an input file may neither be removed nor renamed until all tests requiring an input put are no longer in any past releases of historical interest.

Guidelines for files

Tests may add files to the test data repo following these guidelines:

  • Only include files with an intended purpose.
  • Do not include files “just in case”.
  • Include files which are as limited in scope as possible.
  • Reduce their size to just required data.
  • Prefer formats that the WCT core support, avoid ROOT.

Special care should be given to the input and history categories.

The repo is not meant as a replacement for saving out otherwise ephemeral files from a temporary directory. If such files are needed, a user may always re-run the test with the temporary directory retained.

Working directory

When a repo is in used in the context of a WCT software build it is simply a directory under the build/ directory. The input and other categories are fund matching these paths:

build/tests/input/
build/tests/<category>/<version>/<prefix><sep><name>/

Preparing a repo

The working directory is prepared as part of the normal build when tests are enabled. This requires HTTP access to the repo server.

waf --tests

Each release of WCT has a hard-wired list of past releases for which the current release can use historical files. Normally, users need not set this but if required this list may be overridden:

waf configure --tests --test-data-releases 0.23.0,0.24.1 [...]
waf

Distributing repository contents

Archive files for all history versions present in the working directory may be produced.

waf packrepo

Or, archives for specific releases may be produced with:

waf packrepo --test-data-releases 0.20.0,0.21.0,0.22.0,0.23.0,0.24.1

Normal users need not perform this and experts may perform this as part of the release. To get the correct version path, the local working repo and the packing should be run in a clean, release checkout. Use wire-cell --version to check what you will get.

Reinventing history

It is expected that new historical tests will be developed to consume historical files that were not produced for past software releases. Perhaps the test producing the history file did not even exist in past versions. To support these new historical tests we must:

  • run the new test that produces the required historical files
  • add these files to our repo under the proper history version sub directory
  • repackage that history category for future use
  • run the historical test that consumes the files across the required versions.

These steps can be performed manually by checking out the required code version, building and running tests as usual followed by explicitly running the new test in the new version but in the environment of the old version.

An example of automating this procedure can be found in test/scripts/multi-release-testing.