Parameter-Efficient Fine-Tuning of State Space Models

Kevin Galim*¹, Wonjun Kang*¹, Yuchen Zeng*², Hyung Il Koo¹, Kangwook Lee²

¹ FuriosaAI, ² UW-Madison

Abstract: Deep State Space Models (SSMs), such as Mamba (Gu & Dao, 2024), have emerged as powerful tools for language modeling, offering high performance with efficient inference and linear scaling in sequence length. However, the application of parameter-efficient fine-tuning (PEFT) methods to SSM-based models remains largely unexplored. This paper aims to systematically study two key questions: (i) How do existing PEFT methods perform on SSM-based models? (ii) Which modules are most effective for fine-tuning? We conduct an empirical benchmark of four basic PEFT methods on SSM-based models. Our findings reveal that prompt-based methods (e.g., prefix-tuning) are no longer effective, an empirical result further supported by theoretical analysis. In contrast, LoRA remains effective for SSM-based models. We further investigate the optimal application of LoRA within these models, demonstrating both theoretically and experimentally that applying LoRA to linear projection matrices without modifying SSM modules yields the best results, as LoRA is not effective at tuning SSM modules. To further improve performance, we introduce LoRA with Selective Dimension tuning (SDLoRA), which selectively updates certain channels and states on SSM modules while applying LoRA to linear projection matrices. Extensive experimental results show that this approach outperforms standard LoRA.
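To make the two ideas in the abstract concrete, here is a minimal pure-Python sketch. It is not taken from the S4 or mamba-peft folders. It assumes the standard LoRA formulation (a frozen weight W plus a scaled low-rank update B·A, with B initialized to zero). For the selective-dimension part it assumes an SSM parameter laid out as a channels × states grid. The abstract does not say how SDLoRA chooses which channels and states to update, so the index sets passed to `masked_update` are hypothetical, as are all class and function names below.

```python
# Illustrative sketch only; names and shapes are assumptions, not the repository's API.
import random


def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / rank) * B @ A."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen pretrained projection, shape (d_out, d_in)
        # A gets a small random init and B starts at zero,
        # so the low-rank update contributes nothing before training.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        # Only A (rank x d_in) and B (d_out x rank) are trained.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def masked_update(param, grad, channels, states, lr):
    """Gradient step on a (channels x states) parameter.

    Only entries whose row index is in `channels` AND whose column index
    is in `states` are changed; every other entry stays frozen.
    """
    return [
        [p - lr * g if (c in channels and s in states) else p
         for s, (p, g) in enumerate(zip(p_row, g_row))]
        for c, (p_row, g_row) in enumerate(zip(param, grad))
    ]
```

For example, wrapping an 8 × 16 projection with `LoRALinear(W, rank=2)` trains 2·16 + 8·2 = 48 values while the 128 original weights stay frozen, and `masked_update(param, grad, channels={0}, states={1}, lr=0.5)` changes only the entry in row 0, column 1.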

News 🚀

[11/1/24] Our paper was selected for an oral presentation (5 of 92 accepted papers) at the NeurIPS 2024 Workshop FITML! 🎉🎉
[10/11/24] Our paper is available on arXiv!
[10/9/24] Our paper was accepted to the NeurIPS 2024 Workshop FITML! 🎉

Usage

PEFT implementation on S4: Refer to the S4 folder.
PEFT implementation on Mamba: Refer to the mamba-peft folder.

Citation

@article{galim2024parameter,
  title={Parameter-Efficient Fine-Tuning of State Space Models},
  author={Galim, Kevin and Kang, Wonjun and Zeng, Yuchen and Koo, Hyung Il and Lee, Kangwook},
  journal={arXiv preprint arXiv:2410.09016},
  year={2024}
}

