Inspired by TinyZero, but designed to be 10X simpler, cleaner, and faster.
- **Clone the repository**

  ```bash
  git clone git@github.com:McGill-NLP/tiny-aha-moment.git
  ```
- **Install dependencies**

  First, load the necessary CUDA tools:

  ```bash
  module load cudatoolkit/12.5
  ```

  Next, install torch:

  ```bash
  pip install torch==2.5
  ```

  Next, follow the installation guide on the vllm website and install vllm:

  ```bash
  pip install vllm
  ```

  Next, install the remaining dependencies:

  ```bash
  pip install datasets deepspeed jupyter ipykernel ipywidgets wandb
  ```

  Next, install flash attention (note that this wheel targets CUDA 12, torch 2.5, and Python 3.10):

  ```bash
  pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.2.post1/flash_attn-2.7.2.post1+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl
  ```

  You should be all set.
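  As a quick sanity check, the imports below should succeed and print the versions installed above (a minimal sketch; adjust if you installed different versions):

  ```bash
  # All three packages should import cleanly and report their versions
  python -c "import torch, vllm, flash_attn; print(torch.__version__, vllm.__version__, flash_attn.__version__)"
  ```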
- **Start an interactive job on the cluster**

  Request resources using:

  ```bash
  salloc --partition=main --gres=gpu:a100l:1 -c 6 --mem=64G -t 12:00:00
  ```

  Then, connect via VS Code or Cursor.
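  Once connected, it is worth confirming that the allocated GPU is actually visible (a minimal sketch; the device name depends on what the scheduler gave you):

  ```bash
  # Should list one A100 and print True plus the device name
  nvidia-smi
  python -c "import torch; print(torch.cuda.is_available(), torch.cuda.get_device_name(0))"
  ```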
- **Run the training script**

  Open `r1_gold.ipynb`, set `CUDA_HOME` and `HF_HOME` as needed, and start training.
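  If you prefer to set the variables in your shell before launching the notebook, something like the following works; both paths are placeholders to adapt to your cluster (a minimal sketch, not the repository's required values):

  ```bash
  export CUDA_HOME=/path/to/cuda-12.5   # placeholder: your CUDA 12.5 install
  export HF_HOME=/path/to/hf_cache      # placeholder: your Hugging Face cache dir
  ```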
- **Install VS Code extensions**

  Make sure to install the Jupyter and Python extensions in VS Code for a smoother experience.
`r1_gold.ipynb` is the ground-truth implementation. `r1_todo.ipynb` is missing some components, which you need to fill in without looking at `r1_gold.ipynb`. `r1_script.py` is the same as `r1_gold.ipynb`, but packaged as a plain script for the convenience of running with python.
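To use the script version, a plain invocation should be enough (an assumption; check the top of `r1_script.py` for any constants you may want to edit first):

```bash
# Run the script variant of the gold notebook
python r1_script.py
```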