Improve documentation for the all-linear flag (#1357)
* added docs for all-linear

* added doc in quantization section

* added doc in lora section

* minor edit

* minor edit
SumanthRH authored Jan 22, 2024
1 parent bb2471d commit 4a15595
Showing 2 changed files with 14 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/source/developer_guides/lora.md
@@ -179,4 +179,12 @@ model.unload()

# delete adapter
model.delete_adapter("dpo")
```

## QLoRA-style training

The default LoRA settings in 🤗PEFT follow the [original paper](https://hf.co/papers/2106.09685) and add trainable weights to the query and value layers of each attention block. However, in [QLoRA](https://hf.co/papers/2305.14314), it was found that adding trainable weights to all the linear layers of a transformer model is beneficial for matching full fine-tuning performance. Since the list of modules to add will vary depending on the architecture, we provide a convenient shorthand: simply specify `target_modules='all-linear'` and let 🤗PEFT handle the rest:

```py
config = LoraConfig(target_modules="all-linear", ...) # adds LoRA to all linear layers like in QLoRA
```
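
For reference, a minimal end-to-end sketch of this setting; the checkpoint name and LoRA hyperparameters below are illustrative assumptions rather than recommended values:

```py
# Minimal, illustrative sketch: the checkpoint and hyperparameters are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules="all-linear",  # target every linear layer, as in QLoRA
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, config)
model.print_trainable_parameters()  # report how many parameters LoRA made trainable
```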
6 changes: 6 additions & 0 deletions docs/source/developer_guides/quantization.md
@@ -125,6 +125,12 @@ lora_config = LoraConfig(

model = get_peft_model(model, lora_config)
```
### QLoRA-style training
QLoRA adds trainable weights to all the linear layers in the transformer architecture. Since the attribute names for these linear layers can vary across architectures, we provide a convenient flag `'all-linear'` for this setting:

```py
config = LoraConfig(target_modules="all-linear", ...) # adds LoRA to all linear layers like in QLoRA
```
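
To show how this fits into a QLoRA-style setup with a 4-bit quantized model, here is a rough sketch; the checkpoint and quantization settings are assumptions chosen for the example:

```py
# Illustrative QLoRA-style sketch; checkpoint and quantization settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m", quantization_config=bnb_config
)
model = prepare_model_for_kbit_training(model)  # prepare the quantized model for k-bit training

lora_config = LoraConfig(target_modules="all-linear", task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```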

## Next steps
