-
Notifications
You must be signed in to change notification settings - Fork 28.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP
] Multi-adapter saving support for PEFT
#26411
[WIP
] Multi-adapter saving support for PEFT
#26411
Conversation
Co-authored-by: Benjamin Bossan <[email protected]>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update. In general, LGTM. I have only some minor comments, no blockers.
Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters | ||
for inference) returns the list of all active adapters so that users can deal with them accordingly. | ||
|
||
For previous PEFT versions (that does not support multi-adapter inference), `module.active_adapter` will return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For previous PEFT versions (that does not support multi-adapter inference), `module.active_adapter` will return | |
For previous PEFT versions (that do not support multi-adapter inference), `module.active_adapter` will return |
I think this statement is a bit confusing: This method always returns a list of str, right? The statement seems to relate to what older PEFT versions do under the hood, but that's an implementation detail and should not be mentioned in the docstring (but you could add this as a comment in the code).
|
||
def active_adapter(self) -> str: | ||
def active_adapters(self) -> List[str]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, .active_adapters
is a method, in PEFT, it's a property
. Since we newly introduce this method in transformers, do we want to take the opportunity to make it property
for consistency? The downside is that active_adapter
is a method here, not a property
, so it would be inconsistent with that method. We could change it too, but that would be BC breaking.
break | ||
|
||
# For previous PEFT versions | ||
if isinstance(active_adapters, str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to have no active adapter at all or is this prevented at some point? Otherwise, active_adapters
could be undefined here.
if len(peft_multi_adapter_state_dict.keys()) == 1: | ||
current_adapter = list(peft_multi_adapter_state_dict.keys())[0] | ||
state_dict = peft_multi_adapter_state_dict[current_adapter].copy() | ||
peft_multi_adapter_state_dict = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this change, which I think makes the intent a bit more obvious and avoid changing the type of peft_multi_adapter_state_dict
:
if len(peft_multi_adapter_state_dict.keys()) == 1: | |
current_adapter = list(peft_multi_adapter_state_dict.keys())[0] | |
state_dict = peft_multi_adapter_state_dict[current_adapter].copy() | |
peft_multi_adapter_state_dict = None | |
if len(peft_multi_adapter_state_dict.keys()) == 1: | |
current_adapter = list(peft_multi_adapter_state_dict.keys())[0] | |
state_dict = peft_multi_adapter_state_dict.pop(current_adapter).copy() |
Not sure if the .copy()
is needed?
Then, below change:
- _peft_save_multi_adapter = _hf_peft_config_loaded and peft_multi_adapter_state_dict is not None
+ _peft_save_multi_adapter = _hf_peft_config_loaded and peft_multi_adapter_state_dict
for adapter_name in peft_multi_adapter_state_dict: | ||
for ignore_key in self._keys_to_ignore_on_save: | ||
if ignore_key in peft_multi_adapter_state_dict[adapter_name].keys(): | ||
del peft_multi_adapter_state_dict[adapter_name][ignore_key] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I asked ChatGPT to make this block more elegant, here is what it came up with:
if not _peft_save_multi_adapter:
state_dict = {k: v for k, v in state_dict.items() if k not in self._keys_to_ignore_on_save}
else:
peft_multi_adapter_state_dict = {
adapter_name: {k: v for k, v in adapter.items() if k not in self._keys_to_ignore_on_save}
for adapter_name, adapter in peft_multi_adapter_state_dict.items()
}
WDYT? :)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
@younesbelkada What's the status? |
@BenjaminBossan I think this might be too much of an edge case for the work it requires, I propose to keep that PR as open and if some interest arises from the community I'll work on it |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
What does this PR do?
To be potentially merged after #26407
This PR adds the multi-adapter support for
save_pretrained
to be consistent with PEFT API that saves all adapters when callingsave_pretrained
. Note the default adapter is always saved in the root directory ofsave_directory
.cc @LysandreJik @BenjaminBossan @pacman100