
[Compression V2] Movement pruning #4308

Merged
merged 21 commits into microsoft:master on Nov 29, 2021

Conversation

@J-shang (Contributor) commented on Nov 11, 2021

Description

Implement movement pruning as described in this paper: https://arxiv.org/abs/2005.07683

Checklist

  • test case
  • doc

How to test

@liuzhe-lz mentioned this pull request on Nov 12, 2021
@J-shang marked this pull request as ready for review on November 18, 2021
---------------

Movement pruner is an implementation of movement pruning.
This is a pruning by step algorithm, the masks may change during each step.
Contributor

what is "step algorithm"?

Contributor Author

By "pruning by step algorithm" I meant that this pruner generates and applies masks during each optimizer step.

Contributor

I think every pruner prunes step by step? What is the concrete meaning of "step" here?

Contributor Author

Oh yes, this is easy to misunderstand. It means that after each optimizer.step(), a new mask is applied to the model. I will update the docstring later.

Contributor Author

Changed to:
This is a "fine-pruning" algorithm, which means the masks may change during each fine-tuning step.


class PrunerScoredModuleWrapper(Module):
"""
Wrap an module to enable data parallel, forward method customization and buffer registeration.
Contributor

an -> a

Contributor Author

Fixed.
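A rough sketch of the wrapper idea discussed above, specialized to nn.Linear for brevity; names and details are illustrative and do not mirror NNI's actual PrunerScoredModuleWrapper, which wraps any module that has a weight.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ScoredLinearSketch(nn.Module):
    """Illustrative scored wrapper around nn.Linear (assumed details)."""

    def __init__(self, linear: nn.Linear):
        super().__init__()
        self.linear = linear
        # Movement score, one entry per weight element, maintained by the pruner.
        self.weight_score = nn.Parameter(torch.zeros_like(linear.weight))
        # The mask is a buffer: saved and moved with the module, but not trained.
        self.register_buffer("weight_mask", torch.ones_like(linear.weight))

    def forward(self, x):
        # Mask the weight for this forward pass only, so pruned entries can
        # return if the mask changes at a later step.
        masked_weight = self.linear.weight * self.weight_mask
        return F.linear(x, masked_weight, self.linear.bias)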


# ignore parameters with `weight_score` in their name if you want to fine-tune with masks
optimizer_grouped_parameters = [{
    "params": [p for n, p in model.named_parameters() if "weight_score" not in n and p.requires_grad]
}]
Contributor

What is weight_score for? Can we handle this automatically so that users don't have to modify the optimizer manually? Besides, does weight_score limit our applicable scenarios to a specific implementation/version/repo of transformers?

Contributor Author


weight_score is registered in the wrapper as a parameter; it is the accumulated sum of -weight * weight_grad.
It's OK if users directly use optimizer = Adam(model.parameters(), lr=2e-5); it just wastes some computing resources. But handling this automatically is a good idea, I will try it.

weight_score will not limit our applicable scenarios; any module that has a weight can use this pruner.
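To make the score definition above concrete, a small sketch (illustrative helper names, not the PR's implementation) that accumulates -weight * weight_grad after each backward pass and builds a mask keeping the highest-scoring weights:

import torch

@torch.no_grad()
def update_weight_score(weight: torch.Tensor, weight_score: torch.Tensor):
    # Call after loss.backward(): weights that are being pushed toward zero
    # receive negative contributions and end up with low scores.
    weight_score += -weight * weight.grad

def movement_mask(weight_score: torch.Tensor, sparsity: float) -> torch.Tensor:
    # Prune the `sparsity` fraction of weights with the lowest movement scores.
    k = max(1, int(weight_score.numel() * sparsity))
    threshold = torch.kthvalue(weight_score.flatten(), k).values
    return (weight_score > threshold).float()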

Contributor Author

Can we handle this automatically so that users don't have to modify the optimizer manually?

Fixed.
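For completeness, a usage sketch of the grouped parameters from the snippet above; AdamW is just an example optimizer choice, and lr=2e-5 echoes the value mentioned earlier in this thread.

from torch.optim import AdamW

# Only the filtered parameter groups are optimized, so the wrapper's internal
# `weight_score` parameters are left out during mask-only fine-tuning.
optimizer = AdamW(optimizer_grouped_parameters, lr=2e-5)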

@liuzhe-lz closed this on Nov 26, 2021
@liuzhe-lz reopened this on Nov 26, 2021
@liuzhe-lz merged commit 1eced0a into microsoft:master on Nov 29, 2021
@J-shang deleted the movement-pruning branch on December 15, 2021