Update docs (#24)
* Add badges

* Use official badges

* Stage outline

* Stage intro

* Update description

* Update spellcheck

* Update docs for arguments

* Update quick start section

* Simplify if case

* Extend docs on extensions

* Update spellcheck

* Fix typo

* Update wording
ruffsl authored Jul 16, 2021
1 parent 1d6ae57 commit f5cf838
Showing 6 changed files with 175 additions and 45 deletions.
5 changes: 4 additions & 1 deletion .vscode/settings.json
@@ -5,7 +5,10 @@
     "python.linting.pycodestyleEnabled": true,
     "cSpell.words": [
         "CPUs",
+        "lockfiles",
         "noqa",
-        "sched"
+        "sched",
+        "subverb",
+        "subverbs"
     ]
 }
203 changes: 166 additions & 37 deletions README.md
@@ -1,69 +1,198 @@
# colcon-cache

An extension for [colcon-core](https://github.com/colcon/colcon-core) to cache packages for processing.
[![GitHub Workflow Status](https://github.com/ruffsl/colcon-cache/actions/workflows/test.yml/badge.svg)](https://github.com/ruffsl/colcon-cache/actions/workflows/test.yml)
[![Codecov](https://codecov.io/gh/ruffsl/colcon-cache/branch/master/graph/badge.svg)](https://codecov.io/gh/ruffsl/colcon-cache)

## Example usage
An extension for [colcon-core](https://github.com/colcon/colcon-core) to cache the processing of packages.

Setup workspace:
Enables caching of various colcon tasks, such as building or testing packages, by associating successful jobs with the respective state of package source files. In conjunction with [colcon-package-selection](https://github.com/colcon/colcon-package-selection), this extension can accelerate developer or continuous integration workflows by allowing users to finely cache valid workspace artifacts and skip processing of unaltered or unaffected packages during software development. For example, when pulling various changes into a local workspace to review pull requests, this extension can be used to track which packages need to be rebuilt or retested, maximizing the caching of the existing artifacts.

The extension works by generating lockfiles that incorporate the respective state of package source files, either directly via hashing source directories or indirectly via detected revision control. Upon successful task completion for a package job, such as when invoking colcon verbs like build or test, these lockfiles are updated for the invoked verb, thereby delineating the provenance of the job’s results. For package selection, these lockfiles are then used to assess whether a verb’s cached outcome for a package remains relevant or valid.
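
To make the "hashing source directories" idea concrete, here is a minimal sketch of reducing a source tree to a single checksum. It is an illustration only: `hash_directory` is a hypothetical helper built on the standard library, not the extension's actual implementation (which delegates to Dirhash), and the choice of `sha256` is an assumption made for portability.

```python
import hashlib
from pathlib import Path


def hash_directory(root, algorithm="sha256"):
    """Return one hex digest covering the relative paths and contents under root.

    Hypothetical helper for illustration; colcon-cache itself relies on Dirhash.
    """
    root = Path(root)
    digest = hashlib.new(algorithm)
    # Sort the files so the digest does not depend on filesystem iteration order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()
```

Any change to a file's name or contents yields a different digest, which is the property a lockfile needs in order to detect a change in package state.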


## Quick start

Set up an example colcon workspace:
```
mkdir -p ~/ws/src && cd ~/ws
wget https://raw.githubusercontent.com/colcon/colcon.readthedocs.org/main/colcon.repos
vcs import src < colcon.repos
```

Lock workspace by generating `cache` lockfiles:
First lock cache, then build and test workspace:
```
colcon cache lock
```

Build and test workspace:
```
colcon build
colcon test
```

Modify package source:
Modify workspace packages, then update cache lockfiles:
```
echo "#foo" >> src/colcon-cmake/setup.py
```

Update `cache` lockfiles:
```
echo "#bar" >> src/colcon-package-information/setup.py
colcon cache lock
```

List modified packges by comparing `cache` lockfile checksums
Rebuild and retest, skipping packages with valid cache:
```
PKGS_MODIFIED=$(colcon list --packages-select-cache-modified | xarg)
colcon build --packages-skip-cache-valid
colcon test --packages-skip-cache-valid
```

Rebuild only modified packages and above:
```
colcon build --packages-above $PKGS_MODIFIED
```

Modify package source again:
```
echo "#bar" >> src/colcon-cmake/setup.py
echo "#baz" >> src/colcon-package-information/setup.py
```
## Subverbs

Update cache lockfiles again:
```
colcon cache lock
```
### `lock` - Lock Package Cache

Rebuild by skipping packages with valid `build` lockfiles:
```
colcon build --packages-skip-cache-valid
```
The `lock` subverb generates or updates lockfiles for selected packages by capturing the current state of package source files. The subverb provides basic arguments to change the build base path where lockfiles are recorded, as well as the option to ignore dependencies when capturing package state. More advanced arguments specific to the `lock` tasks used to capture the package state include:

- `--build-base`
  - The base path for all build directories (default: build)
- `--ignore-dependencies`
  - Ignore dependencies when capturing caches (default: false)


## Package Selection

This extension provides additional package selection arguments that can filter by modified package source or by validity of workspace cache with respect to the most recent invocation of the `lock` subverb. By default, the internal cache key is selected by the colcon verb that invokes the package selection argument, but can be manually overridden:

- `--packages-select-cache-key`
  - Only process packages using the given cache key. Falls back to the invoked verb's handler if unspecified.

### Modified Package Selection

Check if the `current` checksum in a package's lockfile matches its `reference` checksum.

- `--packages-select-cache-modified`
  - Only process a subset of packages whose cache denotes package modifications (packages without lockfiles are not considered as modified)
- `--packages-select-cache-unmodified`
  - Only process a subset of packages whose cache denotes no package modifications (packages without lockfiles are not considered as unmodified)
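
The comparison described above can be sketched as follows. The `Lockfile` class and both helper functions are hypothetical names introduced for illustration; they model only the `current` and `reference` checksums mentioned in this section and are not the extension's real lockfile format or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lockfile:
    """Hypothetical stand-in for a package lockfile; not the real on-disk format."""

    current: str
    reference: str


def is_modified(lockfile: Optional[Lockfile]) -> bool:
    """True when a lockfile exists and its two checksums differ."""
    return lockfile is not None and lockfile.current != lockfile.reference


def is_unmodified(lockfile: Optional[Lockfile]) -> bool:
    """True when a lockfile exists and its two checksums match."""
    return lockfile is not None and lockfile.current == lockfile.reference
```

Note that a package without a lockfile (`None` here) is selected by neither check, mirroring the parenthetical notes on both arguments.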

### Valid Package Selection

Check if the `current` checksum in the verb's lockfile matches that in the package's lockfile.

- `--packages-select-cache-invalid`
  - Only process a subset of packages with an invalid cache (packages without a reference cache are not considered)
- `--packages-skip-cache-valid`
  - Skip a set of packages with a valid cache (packages without a reference cache are not considered)
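
As a rough sketch of the validity check, the snippet below compares a verb's recorded checksum against the package's current checksum. The function names and the plain dictionaries are assumptions made for illustration, not the extension's API. It also assumes one reading of "not considered": a package with a missing checksum is never skipped and therefore remains in the set to process.

```python
from typing import Dict, List, Optional


def cache_is_valid(
    verb_current: Optional[str], package_current: Optional[str]
) -> Optional[bool]:
    """Return True or False for a valid or invalid cache, None if a checksum is missing."""
    if verb_current is None or package_current is None:
        return None
    return verb_current == package_current


def skip_cache_valid(
    package_checksums: Dict[str, str], verb_checksums: Dict[str, str]
) -> List[str]:
    """List the packages that still need processing for a verb (hypothetical helper)."""
    return [
        name
        for name, package_current in package_checksums.items()
        # Only a confirmed match allows a package to be skipped.
        if cache_is_valid(verb_checksums.get(name), package_current) is not True
    ]
```

For example, given package checksums for `a`, `b`, and `c`, where only `a` has a matching verb checksum, the helper returns `b` and `c` as the packages left to build or test.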


## Extension points

This extension makes use of a number of colcon-core extension points for registering verbs, subverbs, and package selection arguments with the colcon CLI, an event handler to update lockfiles for successful jobs, as well as package augmentation to auto-detect revision control. This extension also provides a number of its own extension points for additional support of alternative revision control systems or package caching strategies.

### `VerbExtensionPoint`

This extension point determines how lockfiles are propagated for jobs invoked by a given verb. As tasks may or may not require prerequisite processing, this extension point provides the means to express the relational provenance of cached artifacts generated when using colcon. Default verb extensions provided include:

- `cache`
  - Do not propagate lockfile, as `lock` subverb handles this explicitly
- `list`
  - Do not propagate lockfile, using `cache` lockfile as a reference
- `build`
  - Propagate lockfile, using `cache` lockfile as a reference
- `test`
  - Propagate lockfile, using `build` lockfile as a reference
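
The default behaviors listed above can be summarized as a small lookup table. This is only a sketch of the relationships described in the list; `VERB_REFERENCES` and `reference_chain` are hypothetical names, not part of the extension.

```python
# verb -> (propagates its own lockfile?, lockfile used as a reference)
# Hypothetical table mirroring the default verb extensions listed above.
VERB_REFERENCES = {
    "cache": (False, None),
    "list": (False, "cache"),
    "build": (True, "cache"),
    "test": (True, "build"),
}


def reference_chain(verb):
    """Follow references back from a verb until none remains."""
    chain = [verb]
    while True:
        _, reference = VERB_REFERENCES[chain[-1]]
        if reference is None:
            break
        chain.append(reference)
    return chain
```

Calling `reference_chain("test")` walks from `test` to `build` to `cache`, which illustrates why a stale `build` lockfile also invalidates the cached `test` outcome.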

### `PackageAugmentationExtensionPoint`

This extension point determines whether, and which, revision control is in effect for package source files, modifying the package's metadata to use the appropriate task extension for the `lock` subverb.

- `DirhashPackageAugmentation`
  - If no revision control is detected, the default dirhash extension is configured.
- `GitPackageAugmentation`
  - If git revision control is detected via a git repo, the git extension is configured.

### `TaskExtensionPoint`

This extension point determines how lockfiles are derived, given the package's detected revision control.

#### `DirhashLockTask`

This extension derives the lockfile by computing the hash of package source files using [Dirhash](https://github.com/andhus/dirhash-python). While most Dirhash options are exposed, such as customizing match and ignore expressions to include dot file paths (ignored by default), several specific arguments provide control in updating the `reference` checksum for a package's lockfile.

- `--dirhash-ratchet`
  - Ratchet reference checksum from previous value
- `--dirhash-reset`
  - Reset reference checksum to current value

Retest by skipping packages with valid `test` lockfiles:
```
colcon test --packages-skip-cache-valid
Arguments for 'dirhash' packages:
--dirhash-ratchet Ratchet reference checksum from previous value
--dirhash-reset Reset reference checksum to current value
--dirhash-algorithm Hashing algorithm to use, by default "md5". Always
available: ['md5', 'sha1', 'sha224', 'sha256',
'sha384', 'sha512']. Additionally available on current
platform: ['blake2b', 'blake2s', 'md4', 'md5-sha1',
'ripemd160', 'sha3_224', 'sha3_256', 'sha3_384',
'sha3_512', 'sha512_224', 'sha512_256', 'shake_128',
'shake_256', 'sm3', 'whirlpool']. Note that the same
algorithm may appear multiple times in this set under
different names (thanks to OpenSSL)
[https://docs.python.org/2/library/hashlib.html]
--dirhash-match [ ...]
One or several patterns for paths to include. NOTE:
patterns with an asterisk must be in quotes ("*") or
the asterisk preceded by an escape character (\*).
--dirhash-ignore [ ...]
One or several patterns for paths to exclude. NOTE:
patterns with an asterisk must be in quotes ("*") or
the asterisk preceded by an escape character (\*).
--dirhash-empty-dirs Include empty directories (containing no files that
meet the matching criteria and no non-empty sub
directories).
--dirhash-no-linked-dirs
Do not include symbolic links to other directories.
--dirhash-no-linked-files
Do not include symbolic links to files.
--dirhash-properties [ ...]
List of file/directory properties to include in the
hash. Available properties are: ['is_link', 'data',
'name'] and at least one of name and data must be
included. Default is [data name] which means that both
the name/paths and content (actual data) of files and
directories will be included.
--dirhash-allow-cyclic-links
Allow presence of cyclic links (by hashing the
relative path to the target directory).
--dirhash-chunk-size DIRHASH_CHUNK_SIZE
The chunk size (in bytes) for reading of files.
--dirhash-jobs DIRHASH_JOBS
Number of jobs (parallel processes) to use.
```

List generated lockfiles from each `verb`:
#### `GitLockTask`

This extension derives the lockfile by computing the hash of tracked source files using [Git](https://git-scm.com). Several specific arguments provide control in specifying the reference revision and fallback used for diffing the package source file when computing the `current` hash for a package lockfile. This not only enables tracking of package source files with respect to the most recent invocation of the `lock` subverb, but also with respect to a particular git branch, tag or commit. The default match criteria for the diff filter comparison can also be overridden.

```
Arguments for 'git' packages:
--git-diff-filter GIT_DIFF_FILTER
Select only files that are Added (A), Copied (C),
Deleted (D), Modified (M), Renamed (R), have their
type (i.e. regular file, symlink, submodule, …)
changed (T), are Unmerged (U), are Unknown (X), or
have had their pairing Broken (B). Any combination of
the filter characters (including none) can be used.
When * (All-or-none) is added to the combination, all
paths are selected if there is any file that matches
other criteria in the comparison; if there is no file
that matches other criteria, nothing is selected.
Also, these upper-case letters can be downcased to
exclude. View docs for info:
https://git-scm.com/docs/git-diff#Documentation/
git-diff.txt---diff-filterACDMRTUXB82308203
--git-reference-revision GIT_REFERENCE_REVISION
Optionally specify revision used as a reference. If
unset, the reference from the previous lockfile will
be reused. If neither provides a reference, the fallback
will be used. View docs for info:
https://git-scm.com/docs/gitrevisions
--git-reference-fallback GIT_REFERENCE_FALLBACK
Override fallback revision used as a reference. If
neither the user nor the previous lockfile specifies a
revision, or if reference is unresolvable, this
fallback will be used. View docs for info:
https://git-scm.com/docs/gitrevisions
```
ls build/colcon-cmake/cache
```
5 changes: 2 additions & 3 deletions colcon_cache/event_handler/lockfile.py
@@ -40,11 +40,10 @@ def __call__(self, event): # noqa: D102
 verb_name = self.context.args.verb_name
 verb_handler_extensions = get_verb_handler_extensions()

-if verb_name in verb_handler_extensions:
-    verb_handler_extension = verb_handler_extensions[verb_name]
-else:
+if verb_name not in verb_handler_extensions:
     return

+verb_handler_extension = verb_handler_extensions[verb_name]
 lockfile = verb_handler_extension.get_job_lockfile(job)

 if job in self._test_failures:
4 changes: 2 additions & 2 deletions colcon_cache/task/lock/dirhash.py
@@ -33,10 +33,10 @@ def add_arguments(self, *, parser): # noqa: D102
 modifier_group = parser.add_mutually_exclusive_group()
 modifier_group.add_argument(
     '--dirhash-ratchet', action='store_true',
-    help='Ratchet refrence checksum from previous value')
+    help='Ratchet reference checksum from previous value')
 modifier_group.add_argument(
     '--dirhash-reset', action='store_true',
-    help='Reset refrence checksum to current value')
+    help='Reset reference checksum to current value')

 parser.add_argument(
     '--dirhash-algorithm',
2 changes: 1 addition & 1 deletion setup.cfg
@@ -20,7 +20,7 @@ classifiers =
     Programming Language :: Python
     Topic :: Software Development :: Build Tools
 license = Apache License, Version 2.0
-description = Extension for colcon to cache package source.
+description = Extension for colcon to cache the processing of packages.
 long_description = file: README.rst
 keywords = colcon

1 change: 0 additions & 1 deletion test/spell_check.words
@@ -30,7 +30,6 @@ noqa
 pathlib
 plugin
 pytest
-refrence
 rmtree
 rtype
 scspell