Inspired by TinyZero, but designed to be 10X simpler, cleaner, and faster.
- **Clone the repository**

  ```bash
  git clone git@github.com:McGill-NLP/tiny-aha-moment.git
  ```
- **Install dependencies**

  First, load the necessary CUDA tools:

  ```bash
  module load cudatoolkit/12.5
  ```

  Next, install torch:

  ```bash
  pip install torch==2.5
  ```

  Next, follow the installation guide on the vllm website and install vllm:

  ```bash
  pip install vllm
  ```

  Next, install the remaining dependencies:

  ```bash
  pip install datasets deepspeed jupyter ipykernel ipywidgets wandb
  ```

  Next, install flash attention (note that this wheel targets CUDA 12, torch 2.5, and Python 3.10):

  ```bash
  pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.2.post1/flash_attn-2.7.2.post1+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
  ```

  You should be all set.
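  As a quick sanity check, the imports below should succeed and print the versions installed above (a minimal sketch; adjust if you installed different versions):

  ```bash
  # All three packages should import cleanly and report their versions
  python -c "import torch, vllm, flash_attn; print(torch.__version__, vllm.__version__, flash_attn.__version__)"
  ```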
- **Start an interactive job on the cluster**

  Request resources using:

  ```bash
  salloc --partition=main --gres=gpu:a100l:1 -c 6 --mem=64G -t 12:00:00
  ```

  Then, connect via VS Code or Cursor.
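  Once connected, it is worth confirming that the allocated GPU is actually visible (a minimal sketch; the device name depends on what the scheduler gave you):

  ```bash
  # Should list one A100 and print True plus the device name
  nvidia-smi
  python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
  ```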
- **Run the training script**

  Open `r1_gold.ipynb`, set `CUDA_HOME` and `HF_HOME` as needed, and start training.
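  If you prefer to set the variables in your shell before launching the notebook, something like the following works; both paths are placeholders to adapt to your cluster (a minimal sketch, not the repository's required values):

  ```bash
  export CUDA_HOME=/path/to/cuda-12.5   # placeholder: your CUDA 12.5 install
  export HF_HOME=/path/to/hf_cache      # placeholder: your Hugging Face cache dir
  ```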
- **Install VS Code extensions**

  Make sure to install the Jupyter and Python extensions in VS Code for a smoother experience.
`r1_gold.ipynb` is the ground-truth implementation. `r1_todo.ipynb` is missing some components, which you need to fill in without looking at `r1_gold.ipynb`. `r1_script.py` is the same as `r1_gold.ipynb`, but packaged as a plain script for the convenience of running with python.
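To use the script version, a plain invocation should be enough (an assumption; check the top of `r1_script.py` for any constants you may want to edit first):

```bash
# Run the script variant of the gold notebook
python r1_script.py
```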