Configurability with Hydra

This repository contains minimal working examples of machine learning experiment configuration using Hydra.

These examples highlight the properties of Hydra that make it well suited to machine learning research, including:

Hierarchical composition, using the defaults list.

hydra-ml-examples/0_sklearn/config/main.yaml

Lines 1 to 4 in bc89f4c

    
    defaults:
      - model: randomforest
      - dataset: blobs
      - override hydra/job_logging: colorlog
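
To make the defaults list more concrete, here is a minimal pure-Python sketch of the idea: each entry selects one option from a config group and nests it under the group's name. This is not how Hydra is implemented. The `GROUPS` dictionary and the `compose` helper are hypothetical stand-ins for the YAML files under `config/`, and the parameter values are illustrative only.

```python
from copy import deepcopy

# Hypothetical in-memory stand-ins for config/model/*.yaml and
# config/dataset/*.yaml. The values are illustrative, not the repository's.
GROUPS = {
    "model": {
        "randomforest": {"n_estimators": 100},
        "mlp": {"activation": "relu"},
    },
    "dataset": {
        "blobs": {"n_samples": 500},
        "moons": {"n_samples": 500},
    },
}


def compose(defaults, **overrides):
    """Build one config by picking a single option per config group."""
    choices = {**dict(defaults), **overrides}
    return {
        group: deepcopy(GROUPS[group][option])
        for group, option in choices.items()
    }


defaults = [("model", "randomforest"), ("dataset", "blobs")]
default_cfg = compose(defaults)
mlp_cfg = compose(defaults, model="mlp")  # analogous to `model=mlp` on the CLI
print(default_cfg)
print(mlp_cfg)
```

Swapping `model=randomforest` for `model=mlp` replaces the whole `model` subtree while leaving `dataset` untouched, which is the behaviour the defaults list provides.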

Variable interpolation¹, which ensures a single source of truth for inter-linked configuration options.

hydra-ml-examples/0_sklearn/config/model/mlp.yaml

Line 12 in bc89f4c

random_state: ${seed}

hydra-ml-examples/0_sklearn/config/dataset/blobs.yaml

Line 6 in bc89f4c

random_state: ${seed}
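
The single-source-of-truth idea can be sketched in a few lines of plain Python. The `resolve` helper below is a hypothetical, heavily simplified stand-in for OmegaConf's interpolation: it only handles whole-string references to top-level keys, whereas OmegaConf also supports nested paths, string templates and custom resolvers. The seed value is illustrative.

```python
import re

# Matches a string that is exactly one top-level reference, e.g. "${seed}".
_REF = re.compile(r"^\$\{(\w+)\}$")


def resolve(node, root):
    """Recursively replace "${key}" strings with the value of root[key]."""
    if isinstance(node, dict):
        return {k: resolve(v, root) for k, v in node.items()}
    if isinstance(node, str):
        match = _REF.match(node)
        if match:
            return root[match.group(1)]
    return node


cfg = {
    "seed": 42,  # illustrative value
    "model": {"random_state": "${seed}"},
    "dataset": {"random_state": "${seed}"},
}
resolved = resolve(cfg, cfg)
print(resolved["model"]["random_state"], resolved["dataset"]["random_state"])
```

Changing `seed` in one place changes both `random_state` values, so the model and the dataset cannot drift out of sync.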

Object instantiation, which removes the need for boilerplate code to propagate configuration to backing classes/functions.

hydra-ml-examples/0_sklearn/script.py

Lines 23 to 24 in bc89f4c

    
    # Instantiate the model. Type hints on instantiations can improve readability.
    model: SKLearnClassifier = hydra.utils.instantiate(cfg.model)
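
As a rough illustration of what instantiation saves you from writing, the sketch below imports the class named by a `_target_` key (Hydra's convention for naming the object to build) and passes the remaining keys as keyword arguments. The `instantiate` function here is a hypothetical simplification, not Hydra's implementation: the real `hydra.utils.instantiate` also handles nested configs, positional arguments and partial application. A standard-library class stands in for a scikit-learn estimator so that the example has no dependencies.

```python
import importlib


def instantiate(node):
    """Import the class named by `_target_` and call it with the other keys."""
    kwargs = {k: v for k, v in node.items() if k != "_target_"}
    module_name, _, attr = node["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr)
    return target(**kwargs)


# fractions.Fraction is only a placeholder for a model class.
node = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
obj = instantiate(node)
print(type(obj).__name__, obj)
```

Because the class path and its arguments both live in the config, swapping the model requires no change to the script itself.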

Getting started

The following commands run the basic example: a sweep over all combinations of model and dataset for a toy problem using scikit-learn.

With Conda

conda env create --file 0_sklearn/env/environment.yaml
conda activate hydra-example-0
python 0_sklearn/script.py --multirun dataset=blobs,circles,moons model=randomforest,mlp,svm

With Docker

docker build --build-arg EXAMPLE="0_sklearn" --tag hydra-example-0 .
docker run hydra-example-0 --multirun dataset=blobs,circles,moons model=randomforest,mlp,svm
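
The `--multirun` commands above sweep the Cartesian product of the comma-separated values, giving 3 x 3 = 9 jobs. The sketch below simply enumerates those combinations. The ordering shown is that of `itertools.product` and is an assumption; it is not guaranteed to match Hydra's own job numbering.

```python
from itertools import product

datasets = ["blobs", "circles", "moons"]
models = ["randomforest", "mlp", "svm"]

# One job per (dataset, model) pair, mirroring the comma-separated sweep values.
jobs = [f"dataset={d} model={m}" for d, m in product(datasets, models)]
for index, overrides in enumerate(jobs):
    print(index, overrides)
```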

Advanced usage

Overriding parameters of the underlying model or dataset:

python 0_sklearn/script.py model=mlp model.activation=tanh
python 0_sklearn/script.py model=randomforest model.n_estimators=400
python 0_sklearn/script.py --multirun dataset=blobs,circles,moons dataset.n_samples=100,500,1000

Any parameter supported by the backing class can be modified from the command line.

For parameters which aren't explicitly specified in the configuration file, this can be achieved using append syntax²:

python 0_sklearn/script.py --multirun model=mlp +model.momentum=0.5,0.7,0.9

In this example, the backing class is an MLP from scikit-learn; see the scikit-learn documentation for its full list of parameters.
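
The difference between a plain override and an appended key can be sketched as follows. The `apply_override` helper is hypothetical and only intended to mirror the rule described above: a plain `key=value` changes a key that already exists, while `+key=value` adds one that does not. It keeps values as strings and ignores the rest of Hydra's override grammar (for example `++` and `~`). The starting config is illustrative.

```python
def apply_override(cfg, override):
    """Apply one `key.path=value` override to a nested dict, in place."""
    key, _, value = override.partition("=")
    append = key.startswith("+")
    *parents, leaf = key.lstrip("+").split(".")
    node = cfg
    for name in parents:
        node = node[name]
    if append and leaf in node:
        raise KeyError(f"{key}: already present, drop the '+' to override it")
    if not append and leaf not in node:
        raise KeyError(f"{key}: not in config, prefix with '+' to append it")
    node[leaf] = value  # values are kept as strings in this sketch
    return cfg


cfg = {"model": {"activation": "relu"}}  # illustrative starting config
apply_override(cfg, "model.activation=tanh")  # key exists: plain override
apply_override(cfg, "+model.momentum=0.5")    # key missing: append with '+'
print(cfg)
```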

This mechanism is even more convenient with complex neural network definitions, e.g. using PyTorch.

¹ Using OmegaConf.
² Hydra override syntax.

joncarter1/hydra-ml-examples

Folders and files

Latest commit

History

Repository files navigation

Configurability with Hydra

Getting started

With Conda

With Docker

Advanced usage

Footnotes

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages