
[WIP] Multi-adapter saving support for PEFT #26411

Closed

Conversation

younesbelkada (Contributor)

What does this PR do?

To be potentially merged after #26407
This PR adds multi-adapter support to `save_pretrained`, for consistency with the PEFT API, which saves all adapters when `save_pretrained` is called. Note that the default adapter is always saved in the root of `save_directory`.

cc @LysandreJik @BenjaminBossan @pacman100
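
For readers skimming the thread, here is a minimal sketch of the intended usage (the hub repo names are real examples used in the transformers docs; the exact on-disk layout for non-default adapters is my assumption, since this PR was never merged):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model.load_adapter("ybelkada/opt-350m-lora", adapter_name="default")
model.load_adapter("ybelkada/opt-350m-lora", adapter_name="other")

# With this PR, save_pretrained would save *all* loaded adapters, as PEFT does.
model.save_pretrained("saved-model")

# Assumed layout: the default adapter at the root of save_directory,
# every other adapter in a subfolder named after it:
#
# saved-model/
#   adapter_config.json      <- "default" adapter
#   adapter_model.bin
#   other/
#     adapter_config.json    <- "other" adapter
#     adapter_model.bin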

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@BenjaminBossan (Member) left a comment


Thanks for the update. In general, LGTM. I have only some minor comments, no blockers.

Gets the current active adapters of the model. In case of multi-adapter inference (combining multiple adapters
for inference) returns the list of all active adapters so that users can deal with them accordingly.

For previous PEFT versions (that does not support multi-adapter inference), `module.active_adapter` will return

Suggested change:

- For previous PEFT versions (that does not support multi-adapter inference), `module.active_adapter` will return
+ For previous PEFT versions (that do not support multi-adapter inference), `module.active_adapter` will return

I think this statement is a bit confusing: This method always returns a list of str, right? The statement seems to relate to what older PEFT versions do under the hood, but that's an implementation detail and should not be mentioned in the docstring (but you could add this as a comment in the code).


- def active_adapter(self) -> str:
+ def active_adapters(self) -> List[str]:

Here, `.active_adapters` is a method, while in PEFT it's a property. Since we are newly introducing this method in transformers, do we want to take the opportunity to make it a property, for consistency? The downside is that `active_adapter` is a method here, not a property, so it would be inconsistent with that method. We could change that one too, but doing so would break backwards compatibility (BC).
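
For illustration, a self-contained toy showing the method-vs-property asymmetry under discussion (the `AdapterModel` class is hypothetical, not code from the PR):

from typing import List

class AdapterModel:
    def __init__(self) -> None:
        self._active = ["default"]

    # Existing transformers API: a *method* returning one name (kept for BC).
    def active_adapter(self) -> str:
        return self._active[0]

    # The suggestion: expose the new plural form as a *property*, as PEFT does.
    @property
    def active_adapters(self) -> List[str]:
        return list(self._active)

model = AdapterModel()
print(model.active_adapter())  # called like a method: 'default'
print(model.active_adapters)   # accessed like an attribute: ['default']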

        break

# For previous PEFT versions
if isinstance(active_adapters, str):

Is it possible to have no active adapter at all or is this prevented at some point? Otherwise, active_adapters could be undefined here.
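
For context, here is a sketch of the full method with one possible guard for that case (the explicit `ValueError` is my assumption, not code from this PR; `BaseTunerLayer` is PEFT's actual tuner-layer base class):

from typing import List

from peft.tuners.tuners_utils import BaseTunerLayer

def active_adapters(self) -> List[str]:
    # Method sketch: `self` stands for the transformers PreTrainedModel.
    active_adapters = None
    for module in self.modules():
        if isinstance(module, BaseTunerLayer):
            active_adapters = module.active_adapter
            break

    # Guard for the case raised above: if no tuner layer was found,
    # active_adapters would otherwise still be None here.
    if active_adapters is None:
        raise ValueError("No adapter loaded on this model, so there is no active adapter.")

    # Older PEFT versions expose a single adapter name as a str; normalize to a list.
    if isinstance(active_adapters, str):
        active_adapters = [active_adapters]
    return active_adapters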

Comment on lines +2025 to +2028
if len(peft_multi_adapter_state_dict.keys()) == 1:
    current_adapter = list(peft_multi_adapter_state_dict.keys())[0]
    state_dict = peft_multi_adapter_state_dict[current_adapter].copy()
    peft_multi_adapter_state_dict = None

How about this change, which I think makes the intent a bit more obvious and avoids changing the type of `peft_multi_adapter_state_dict`:

Suggested change:

  if len(peft_multi_adapter_state_dict.keys()) == 1:
      current_adapter = list(peft_multi_adapter_state_dict.keys())[0]
-     state_dict = peft_multi_adapter_state_dict[current_adapter].copy()
-     peft_multi_adapter_state_dict = None
+     state_dict = peft_multi_adapter_state_dict.pop(current_adapter).copy()

Not sure if the .copy() is needed?

Then, the change below:

- _peft_save_multi_adapter = _hf_peft_config_loaded and peft_multi_adapter_state_dict is not None
+ _peft_save_multi_adapter = _hf_peft_config_loaded and peft_multi_adapter_state_dict
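
A quick toy demonstration (made-up data) of why the `.pop()` variant makes the plain truthiness check work:

state = {"default": {"weight": 1}}  # single-adapter case
only_adapter = next(iter(state))
picked = state.pop(only_adapter)    # removes the entry and returns its value

print(picked)       # {'weight': 1}
print(bool(state))  # False: the emptied dict is falsy, so the
                    # `_hf_peft_config_loaded and peft_multi_adapter_state_dict`
                    # expression is falsy once the single adapter has been popped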

for adapter_name in peft_multi_adapter_state_dict:
    for ignore_key in self._keys_to_ignore_on_save:
        if ignore_key in peft_multi_adapter_state_dict[adapter_name].keys():
            del peft_multi_adapter_state_dict[adapter_name][ignore_key]

I asked ChatGPT to make this block more elegant; here is what it came up with:

if not _peft_save_multi_adapter:
    state_dict = {k: v for k, v in state_dict.items() if k not in self._keys_to_ignore_on_save}
else:
    peft_multi_adapter_state_dict = {
        adapter_name: {k: v for k, v in adapter.items() if k not in self._keys_to_ignore_on_save}
        for adapter_name, adapter in peft_multi_adapter_state_dict.items()
    }

WDYT? :)
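
As a quick sanity check with hypothetical keys, the comprehension filters the same entries as the nested loop with `del`:

keys_to_ignore = ["lm_head.weight"]  # stand-in for self._keys_to_ignore_on_save
adapters = {"default": {"lora_A": 1, "lm_head.weight": 2}}

filtered = {
    adapter_name: {k: v for k, v in adapter.items() if k not in keys_to_ignore}
    for adapter_name, adapter in adapters.items()
}
print(filtered)  # {'default': {'lora_A': 1}}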

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@BenjaminBossan (Member)

@younesbelkada What's the status?

@younesbelkada (Contributor, Author)

@BenjaminBossan I think this might be too much of an edge case for the work it requires. I propose we keep this PR open, and if interest arises from the community, I'll work on it.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
