This repository contains minimal working examples of machine learning experiment configuration using Hydra.
These examples highlight some of the properties of Hydra that make it useful for machine learning research, including:
- Hierarchical composition, using the defaults list (see hydra-ml-examples/0_sklearn/config/main.yaml, lines 1 to 4 at commit bc89f4c).
- Variable interpolation, which ensures a single source of truth for inter-linked configuration options.
- Object instantiation, which removes the need for boilerplate code to propagate configuration to backing classes/functions (see hydra-ml-examples/0_sklearn/script.py, lines 23 to 24 at commit bc89f4c).
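The interpolation and instantiation ideas in the list above can be illustrated without Hydra at all. The following is a minimal pure-Python sketch, not Hydra's implementation: the config dict, the line_width and formatter keys, and the resolve and instantiate helpers are all hypothetical, and textwrap.TextWrapper merely stands in for a model class. The only conventions borrowed from Hydra are the ${...} interpolation syntax and the _target_ key that names the class to build.

```python
import importlib
import re

# Matches a value that is exactly one interpolation, e.g. "${line_width}".
# (This sketch does not handle interpolations embedded in longer strings.)
_INTERPOLATION = re.compile(r"^\$\{([\w.]+)\}$")


def lookup(root, dotted_key):
    """Follow a dotted key such as 'a.b' from the root of a nested dict."""
    node = root
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(node, root):
    """Return a copy of node with every '${key}' replaced by the value at key."""
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, str):
        match = _INTERPOLATION.match(node)
        if match:
            return resolve(lookup(root, match.group(1)), root)
    return node


def instantiate(node):
    """Import the class named by '_target_' and call it with the other keys."""
    kwargs = {key: value for key, value in node.items() if key != "_target_"}
    module_name, _, attribute = node["_target_"].rpartition(".")
    cls = getattr(importlib.import_module(module_name), attribute)
    return cls(**kwargs)


# Hypothetical config, written as a dict rather than YAML.
config = {
    "line_width": 40,  # single source of truth
    "formatter": {
        "_target_": "textwrap.TextWrapper",
        "width": "${line_width}",  # linked to the top-level value
        "break_long_words": False,
    },
}

resolved = resolve(config, config)
formatter = instantiate(resolved["formatter"])
print(type(formatter).__name__, formatter.width)
```

Changing line_width in one place changes every option linked to it, and swapping the _target_ string swaps the class that gets built, with no constructor call to edit by hand.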
The following commands run the basic example: a sweep over all combinations of model and dataset for a toy problem using scikit-learn. The first three commands use a conda environment and the last two use Docker.
conda env create --file 0_sklearn/env/environment.yaml
conda activate hydra-example-0
python 0_sklearn/script.py --multirun dataset=blobs,circles,moons model=randomforest,mlp,svm
docker build --build-arg EXAMPLE="0_sklearn" --tag hydra-example-0 .
docker run hydra-example-0 --multirun dataset=blobs,circles,moons model=randomforest,mlp,svm
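To make the size of this sweep concrete, the comma-separated values in the multirun command can be expanded into a Cartesian product. The snippet below is a pure-Python sketch of that expansion, not Hydra's sweeper: the expand helper is hypothetical, and the job order is simply whatever itertools.product yields.

```python
from itertools import product


def expand(overrides):
    """Expand 'key=a,b,c' strings into one dict per combination of values."""
    keys, choices = [], []
    for override in overrides:
        key, _, values = override.partition("=")
        keys.append(key)
        choices.append(values.split(","))
    return [dict(zip(keys, combination)) for combination in product(*choices)]


# The same overrides as in the multirun command above.
jobs = expand(["dataset=blobs,circles,moons", "model=randomforest,mlp,svm"])

for number, job in enumerate(jobs):
    print(number, job)
print(len(jobs), "jobs in total")
```

Three datasets times three models gives nine runs, one per combination.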
Overriding parameters of the underlying model or dataset:
python 0_sklearn/script.py model=mlp model.activation=tanh
python 0_sklearn/script.py model=randomforest model.n_estimators=400
python 0_sklearn/script.py --multirun dataset=blobs,circles,moons dataset.n_samples=100,500,1000
Any parameter supported by the backing class can be modified from the command line.
For parameters which aren't explicitly specified in the configuration file, this can be achieved using the append syntax (a + prefix on the override):
python 0_sklearn/script.py --multirun model=mlp +model.momentum=0.5,0.7,0.9
In this example the backing class is an MLP from scikit-learn (see its documentation for the available parameters).
This mechanism is even more convenient with complex neural network definitions, e.g. using PyTorch.
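The difference between a plain override and the append syntax can be sketched in a few lines of pure Python. This is an illustration of the rule rather than Hydra's own code: the apply_override helper and the starting config are hypothetical, and values are kept as strings where Hydra would parse them into numbers.

```python
def apply_override(config, override):
    """Apply one 'a.b=value' or '+a.b=value' override to a nested dict."""
    append = override.startswith("+")
    body = override[1:] if append else override
    dotted_key, _, value = body.partition("=")

    # Walk down to the dict that holds the final key.
    *parents, leaf = dotted_key.split(".")
    node = config
    for part in parents:
        node = node[part]

    if append and leaf in node:
        # Hydra also rejects this case; its '++' prefix adds or overrides.
        raise ValueError(f"{dotted_key} already exists; drop the + prefix")
    if not append and leaf not in node:
        raise ValueError(f"{dotted_key} is not in the config; prefix it with +")

    node[leaf] = value  # kept as a string in this sketch
    return config


# Hypothetical config that specifies activation but not momentum.
config = {"model": {"activation": "relu"}}

apply_override(config, "model.activation=tanh")  # key exists: plain override

try:
    apply_override(config, "model.momentum=0.5")  # key missing: rejected
except ValueError as error:
    print("rejected:", error)

apply_override(config, "+model.momentum=0.5")  # key missing: append works
print(config)
```

The appended key is then passed to the backing class along with the rest of the model configuration, which is why any parameter the class accepts can be set this way.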