Skip to content
This repository has been archived by the owner on Aug 7, 2024. It is now read-only.

[bc breaking] change x, w, dL_dY variable names to input, weight, grad_output #323

Closed
wants to merge 4 commits into from

Conversation

vkuzo
Copy link
Contributor

@vkuzo vkuzo commented Jul 22, 2024

Stack from ghstack (oldest at bottom):

Summary:

The following naming scheme matches the rest of PyTorch better:

// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output

This PR changes all the previous references to x, w, dL_dY to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

Differential Revision: D60072596

…d_output

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Jul 22, 2024
…d_output

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 5d1c7063d98ef2adebdb14b1757cc349d41e3020
Pull Request resolved: #323
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jul 22, 2024
…weight, grad_output"

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Jul 22, 2024
…d_output

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 52683b06c816b8ab1cdef5142d5a7eaea1a9e0f2
Pull Request resolved: #323
@@ -23,4 +23,7 @@ jobs:
pip install black==23.3.0 usort==1.0.6 ufmt==2.1.0 libcst==1.0.1
- name: Analyzing the code with ufmt
run: |
ufmt format .
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

L26:28 is for easier debugging of differences between local machine and CI ufmt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the ufmt config different than CI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've noticed this on a couple of PRs in the past. A better fix would be to align ufmt versions + env, but regardless this is useful for debugging.

…weight, grad_output"

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Jul 22, 2024
…d_output

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: d1d6ffbcc37bcf1b39709e50156d6018120a2261
Pull Request resolved: #323
…weight, grad_output"

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Jul 22, 2024
…d_output

Summary:

The following naming scheme matches the rest of PyTorch better:

```
// forward
output = input @ weight_t
// backward
grad_input = grad_output @ weight
grad_weight = input_t @ grad_output
```

This PR changes all the previous references to `x`, `w`, `dL_dY` to
match the naming scheme above.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: 37a6a51987b55cd894b70fa54e6b8669f48ccb47
Pull Request resolved: #323
ctx.save_for_backward(fp8_amax_dL_dY, fp8_amax_history_dL_dY, fp8_scale_dL_dY)
ctx.save_for_backward(
fp8_amax_grad_output, fp8_amax_history_grad_output, fp8_scale_grad_output
)
ctx.scale_fn_name = scale_fn_name
ctx.is_amax_initialized = is_amax_initialized
ctx.linear_mm_config = linear_mm_config
return tensor

@staticmethod
def backward(ctx, go):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we update go? This one has confused me in the past

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this PR only contains the user-facing part since that's the most crucial. LOC is already high, so IMO better to rename the non-user facing things in subsequent PRs, to keep things smaller and more reviewable.

Copy link
Contributor

@drisspg drisspg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks!

@vkuzo
Copy link
Contributor Author

vkuzo commented Jul 22, 2024

@vkuzo has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 603efc2.

vkuzo added a commit that referenced this pull request Jul 25, 2024
Summary:

In #323 we
changed the user facing variable notation from `x/w/dL_dY` to
`input/weight/grad_output`.

This PR follows up by changing most of the internal variables to also match
the new notation, to reduce confusion.

Test Plan:

```
./test/test_everything.sh
```

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Jul 25, 2024
…rad_output notation"

Summary:

In #323 we
changed the user facing variable notation from `x/w/dL_dY` to
`input/weight/grad_output`.

This PR follows up by changing most of the internal variables to also match
the new notation, to reduce confusion.

Test Plan:

```
./test/test_everything.sh
```

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
vkuzo added a commit that referenced this pull request Jul 25, 2024
…ion"

Summary:

In #323 we
changed the user facing variable notation from `x/w/dL_dY` to
`input/weight/grad_output`.

This PR follows up by changing most of the internal variables to also match
the new notation, to reduce confusion.

Test Plan:

```
./test/test_everything.sh
```

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
facebook-github-bot pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Pull Request resolved: #335

In #323 we
changed the user facing variable notation from `x/w/dL_dY` to
`input/weight/grad_output`.

This PR follows up by changing most of the internal variables to also match
the new notation, to reduce confusion.

Reviewed By: weifengpy

Differential Revision: D60252071

fbshipit-source-id: b91ec5b975df550962418eafc93f1904d64a3dd8
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants