## Setup

- Install dependencies:
```bash
# Create env
conda create -n mamba-ssm python=3.10
conda activate mamba-ssm

# Install pytorch, e.g.,
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia

# Install mamba
pip install "causal-conv1d>=1.2.0"
cd src/mamba
pip install -e .
cd -

# Install requirements
pip install -r requirements.txt
```
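After installing, a quick import check can confirm that PyTorch sees the GPU and that the editable `mamba_ssm` install succeeded. This is a minimal sanity-check sketch, not part of the repository's scripts:

```python
# Sanity check: verify the environment after installation.
import torch
import mamba_ssm  # should import cleanly after `pip install -e .` in src/mamba

print("torch", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```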
- For Spider, download the Spider dataset and extract it to `data/xlangai_spider/spider` (see the extraction sketch below).
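If you save the download as a zip archive, something like the following sketch produces the expected layout. The `spider.zip` filename is an assumption, as is the archive containing a top-level `spider/` directory; adjust paths to match your download:

```python
# Hypothetical helper: unpack a downloaded Spider archive into data/xlangai_spider/.
# "spider.zip" is an assumed filename; point it at wherever you saved the download.
import zipfile
from pathlib import Path

archive = Path("spider.zip")
target = Path("data/xlangai_spider")
target.mkdir(parents=True, exist_ok=True)
with zipfile.ZipFile(archive) as zf:
    zf.extractall(target)  # assumes the zip unpacks to data/xlangai_spider/spider/
```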
## Train

### SDLoRA
```bash
# spider
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/spider/*channels_and_states*.yaml

# samsum
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/samsum/*channels_and_states*.yaml

# dart
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/dart/*channels_and_states*.yaml

# glue
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/glue/*/*channels_and_states*.yaml
```
### LoRA (for SDLoRA comparison)
```bash
# spider
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/spider/*lora_outproj*.yaml

# samsum
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/samsum/*lora_outproj*.yaml

# dart
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/dart/*lora_outproj*.yaml

# glue
python run_all.py train.py --device 0 --cfg cfg/exps/sdlora/glue/*/*lora_outproj*.yaml
```
### PEFT Benchmark
```bash
# spider
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/spider/*.yaml

# spider (mamba-2.8b)
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/spider28b/*.yaml

# samsum
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/samsum/*.yaml

# dart
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/dart/*.yaml

# glue
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/glue/*/*.yaml

# cifar
python run_all.py train.py --device 0 --cfg cfg/exps/benchmark/cifar/*.yaml
```
The Mamba architecture was introduced in [Mamba: Linear-Time Sequence Modeling with Selective State Spaces](https://arxiv.org/abs/2312.00752) by Albert Gu and Tri Dao.
The official implementation is available at https://github.com/state-spaces/mamba.
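For reference, the upstream package exposes the `Mamba` block directly. The snippet below mirrors the usage example from the official `mamba_ssm` README; the dimensions are arbitrary placeholders:

```python
# Minimal usage of the Mamba block from the official mamba_ssm package.
import torch
from mamba_ssm import Mamba

batch, length, dim = 2, 64, 16
x = torch.randn(batch, length, dim).to("cuda")
model = Mamba(
    d_model=dim,  # model dimension
    d_state=16,   # SSM state expansion factor
    d_conv=4,     # local convolution width
    expand=2,     # block expansion factor
).to("cuda")
y = model(x)
assert y.shape == x.shape  # the block is sequence-to-sequence, shape-preserving
```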