# Mamba SSM PEFT

## Setup

- Install dependencies (a quick sanity check of the install is sketched after this list):

  ```bash
  # Create env
  conda create -n mamba-ssm python=3.10
  conda activate mamba-ssm

  # Install PyTorch, e.g.,
  conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia

  # Install Mamba
  pip install "causal-conv1d>=1.2.0"
  cd src/mamba
  pip install -e .
  cd -

  # Install requirements
  pip install -r requirements.txt
  ```

- For Spider, download the Spider dataset and extract it to `data/xlangai_spider/spider`.
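As a quick sanity check that the editable Mamba install picked up its CUDA kernels, a single `Mamba` block can be run on random input. This is a minimal sketch adapted from the usage example in the official Mamba repository; it assumes a CUDA-capable GPU, and the tensor sizes are arbitrary.

```python
# Minimal install check: run one Mamba block on random input (requires CUDA).
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 256  # arbitrary sizes for the check
x = torch.randn(batch, length, dim, device="cuda")

block = Mamba(
    d_model=dim,  # model dimension
    d_state=16,   # SSM state expansion factor
    d_conv=4,     # local convolution width
    expand=2,     # block expansion factor
).to("cuda")

y = block(x)
assert y.shape == x.shape
print("mamba-ssm OK:", y.shape)
```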
## Train

**SDLoRA**

```bash
# spider
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/spider/*channels_and_states*.yaml

# samsum
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/samsum/*channels_and_states*.yaml

# dart
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/dart/*channels_and_states*.yaml

# glue
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/glue/*/*channels_and_states*.yaml
```
**LoRA (for SDLoRA comparison)**

```bash
# spider
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/spider/*lora_outproj*.yaml

# samsum
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/samsum/*lora_outproj*.yaml

# dart
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/dart/*lora_outproj*.yaml

# glue
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/glue/*/*lora_outproj*.yaml
```
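The `*lora_outproj*` configs restrict LoRA to the output projection (`out_proj`) of each Mamba block. For orientation only, a roughly comparable setup with the Hugging Face `peft` library is sketched below; this is not the repository's training code, and the checkpoint name and hyperparameters (`r`, `lora_alpha`, dropout) are placeholder assumptions rather than the values used in `cfg/exps`.

```python
# Illustrative only: LoRA on the out_proj of every Mamba block via Hugging Face peft.
# Checkpoint and hyperparameters are assumptions, not the repository's settings.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "state-spaces/mamba-130m-hf",  # assumed HF checkpoint; the repo's configs may differ
    torch_dtype=torch.float32,
)

lora_cfg = LoraConfig(
    r=8,                          # low-rank dimension (placeholder)
    lora_alpha=16,                # scaling factor (placeholder)
    lora_dropout=0.05,
    target_modules=["out_proj"],  # only the block output projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # only a small fraction of weights remain trainable
```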
**PEFT Benchmark**

```bash
# spider
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/spider/*.yaml

# spider (mamba-2.8b)
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/spider28b/*.yaml

# samsum
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/samsum/*.yaml

# dart
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/dart/*.yaml

# glue
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/glue/*/*.yaml

# cifar
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/cifar/*.yaml
```
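In each command above, `run_all.py` receives a glob over experiment configs plus the target GPU, and launches one training run per matching YAML. The loop below is only a guess at that behavior for illustration, not the script's actual contents; it also assumes `train.py` accepts the same `--device`/`--cfg` flags.

```python
# Hypothetical sketch of the run_all.py loop: expand the config glob and run
# train.py once per experiment. Not the repository's actual implementation.
import glob
import subprocess
import sys


def run_all(train_script: str, device: str, cfg_pattern: str) -> None:
    for cfg in sorted(glob.glob(cfg_pattern)):
        print(f"=== running {cfg} on device {device} ===")
        subprocess.run(
            [sys.executable, train_script, "--device", device, "--cfg", cfg],
            check=True,  # stop on the first failing experiment
        )


if __name__ == "__main__":
    # e.g. every PEFT-benchmark config for SAMSum
    run_all("train.py", "0", "cfg/exps/benchmark/samsum/*.yaml")
```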

## References

The Mamba architecture was introduced in [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) by Albert Gu and Tri Dao.

The official implementation is available at https://github.com/state-spaces/mamba/tree/main.