Skip to content

McGill-NLP/tiny-aha-moment

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

22 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Recreating R1's Aha Moment

Inspired by TinyZero, but designed to be 10X simpler, cleaner, and faster.

Setup Instructions

  1. Clone the repository

    git clone [email protected]:McGill-NLP/tiny-aha-moment.git
  2. Install dependencies
    First, load the necessay cuda tools:

    module load cudatoolkit/12.5

    Next, install torch:

    pip install torch==2.5

    Next, follow the installation guide on the vllm website for installing vllm.

    pip install vllm

    Next,

    pip install datasets deepspeed jupyter ipykernel ipywidgets wandb

    Next, install flash attention,

    pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.2.post1/flash_attn-2.7.2.post1+cu12torch2.5cxx11abiFALSE-cp310-cp310-linux_x86_64.whl

    You should be set.

  3. Start an interactive job on the cluster
    Request resources using:

    salloc --partition=main --gres=gpu:a100l:1 -c 6 --mem=64G -t 12:00:00

    Then, connect via VS Code or Cursor.

  4. Run the training script
    Open r1_gold.ipynb, set CUDA_HOME and HF_HOME as needed, and start training.

  5. Install VS Code extensions
    Make sure to install the Jupyter and Python extensions in VS Code for a smoother experience.

File Descriptions

  • r1_gold.ipynb is the ground truth implementation.
  • r1_todo.ipynb misses some components and you need to fill those without looking at the r1_gold.ipynb.
  • r1_script.py is also just the r1_gold but for convenience of running with python.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published