Official repository for the paper Can a Confident Prior Replace a Cold Posterior?
Representing aleatoric uncertainty. We introduce the DirClip prior to control the aleatoric (data) uncertainty of a Bayesian neural network. Consider the following toy classification problem: should we prefer the smooth or the complex decision boundary? Either choice is valid, depending on our beliefs about the quality of the data labels. The DirClip prior lets us represent these beliefs.
Results. Using the DirClip prior, we can force a BNN to have low aleatoric uncertainty, nearly matching the accuracy of a cold posterior without any tempering.
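As a rough illustration of how a clipped Dirichlet prior can favor confident predictions, the sketch below evaluates an unnormalized Dirichlet log-density in which each log-probability is clipped from below. It assumes that the `-10` in `dirclip-10` is the clipping value and that `distribution_param` plays the role of the Dirichlet concentration; both are guesses based on the names. The function `dirclip_log_density` is hypothetical and is not the implementation in the core directory. Refer to the paper for the exact definition.

```python
import math


def dirclip_log_density(probs, alpha, clip_value=-10.0):
    """Unnormalized log-density of a Dirichlet(alpha) prior with clipped log-probabilities.

    Assumption (not taken from the repository): every log-probability is
    clipped from below at `clip_value` before being weighted by (alpha - 1).
    """
    if any(p < 0.0 for p in probs) or abs(sum(probs) - 1.0) > 1e-6:
        raise ValueError("probs must be non-negative and sum to 1")
    total = 0.0
    for p in probs:
        # log(0) is treated as -inf so that the clip takes over.
        log_p = math.log(p) if p > 0.0 else -math.inf
        total += (alpha - 1.0) * max(log_p, clip_value)
    return total


confident = dirclip_log_density([1.0, 0.0], alpha=0.9)
uniform = dirclip_log_density([0.5, 0.5], alpha=0.9)
print(confident > uniform)
```

With a concentration below 1, the confident prediction receives a higher prior density than the uniform one, and the clipping keeps the density finite even when a class probability is exactly zero.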
Training stability. Why does the DirClip prior stop working when
The core directory contains all the code required for model training. We recommend calling it from Python, although a command-line interface is also available through Python Fire.
# Python
from run import run
run(model_name='cnn', ds_name='mnist', distribution='dirclip-10', distribution_param=0.9)
# Bash
python run.py --model_name='cnn' --ds_name='mnist' --distribution='dirclip-10' --distribution_param=0.9
The experiments directory contains three Python scripts for reproducing all of our training runs. They are meant to serve mostly as pseudocode: the scripts are easy to read, but you will likely need to add your own experiment-management code to run multiple jobs in parallel, monitor them, etc. Since reproducing all of our experiments would take ~700 TPU-core-days, we also provide download links for the model weights (32 GB) and for the data needed to reproduce the loss landscape plots and Normal prior confidence (31 MB).
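A minimal sketch of the kind of experiment-management glue mentioned above: it expands a parameter grid into one `run.py` command line per configuration, which can then be handed to whatever job scheduler you use. The helper `build_commands` and the grid values (in particular the extra `distribution_param` value of 0.8) are hypothetical and chosen only for illustration; only the flag names come from the command shown earlier.

```python
import itertools
import shlex


def build_commands(grid, script="run.py"):
    """Return one shell command per combination of values in `grid`.

    `grid` maps a flag name to the list of values to sweep over.
    """
    keys = list(grid)
    commands = []
    for values in itertools.product(*(grid[k] for k in keys)):
        args = [f"--{key}={value}" for key, value in zip(keys, values)]
        # shlex.join quotes arguments where needed so the line is safe to paste into a shell.
        commands.append(shlex.join(["python", script, *args]))
    return commands


# Illustrative grid: the values below are assumptions, not the paper's sweep.
grid = {
    "model_name": ["cnn"],
    "ds_name": ["mnist"],
    "distribution": ["dirclip-10"],
    "distribution_param": [0.8, 0.9],
}

for command in build_commands(grid):
    print(command)
```

Generating plain command lines keeps the sketch independent of any particular scheduler; each line can be submitted as a separate job or run locally in sequence.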
All figures in the report were generated using the provided Jupyter notebooks:
- distributions.ipynb provides most of the distribution visualizations (slices, gradients, training stability, etc.).
- weights_analysis_dirichlet.ipynb and weights_analysis_cold.ipynb provide visualizations of trained models.
- 2d_classification.ipynb uses HMC to create Figure 1.
@misc{dirclip,
title={Can a Confident Prior Replace a Cold Posterior?},
author={Martin Marek and Brooks Paige and Pavel Izmailov},
year={2024},
eprint={2403.01272},
archivePrefix={arXiv},
primaryClass={cs.LG}
}