run sdpa with dtensor #163

Closed
wants to merge 2 commits into from

tianyu-l
Contributor

@tianyu-l tianyu-l commented Mar 26, 2024

Stack from ghstack (oldest at bottom):

There are several caveats:

  1. FSDP + TP uses bfloat16 as the default param dtype, which seems to introduce minor numerical discrepancies. I'm still investigating whether they are caused by the collectives or by DTensor.
  2. Flash attention only supports bfloat16 and float16:
    (1) if the dtype is torch.float32, efficient attention is used, which DTensor does not support yet (I will add this);
    (2) if the dtype is torch.float64, decomposed attention is used, which is incompatible with DTensor because it uses an intermediate torch.Tensor mask.
  3. 1D TP-only parallelism fails by default with this PR, for the reason given in 2.(1).
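
The dtype caveats in item 2 can be restated as a small lookup table. This is only a sketch of the description above, not PyTorch's actual dispatch logic: `SdpaPath`, `SDPA_PATHS`, and `describe_sdpa_path` are hypothetical names, dtypes are given as plain strings so the snippet runs without torch, and marking flash attention as DTensor-compatible is an assumption read from the description rather than something it states outright.

```python
# Hypothetical summary of the caveats listed above; it does not call PyTorch.
from typing import NamedTuple


class SdpaPath(NamedTuple):
    backend: str      # attention implementation named in the PR description
    dtensor_ok: bool  # whether the description implies it works with DTensor
    note: str         # short reason taken from the description


# Keys are dtype names as strings so the sketch runs without torch installed.
SDPA_PATHS = {
    # Assumption: flash attention is the path that works with DTensor here.
    "bfloat16": SdpaPath("flash", True, "supported by flash attention"),
    "float16": SdpaPath("flash", True, "supported by flash attention"),
    "float32": SdpaPath("efficient", False, "DTensor support not added yet"),
    "float64": SdpaPath("decomposed", False, "uses an intermediate torch.Tensor mask"),
}


def describe_sdpa_path(dtype_name: str) -> SdpaPath:
    """Return the attention path the PR description associates with a dtype."""
    try:
        return SDPA_PATHS[dtype_name]
    except KeyError:
        raise ValueError(
            f"dtype not covered by the PR description: {dtype_name}"
        ) from None


if __name__ == "__main__":
    for name in SDPA_PATHS:
        path = describe_sdpa_path(name)
        print(f"{name}: {path.backend} attention, DTensor ok: {path.dtensor_ok} ({path.note})")
```

Read this way, caveat 3 follows directly: a run that ends up on the float32 path lands on efficient attention, which is the case the description says DTensor does not handle yet.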

[ghstack-poisoned]
tianyu-l added a commit that referenced this pull request Mar 26, 2024
ghstack-source-id: 156c5bb2603373647b643ebb40599865cb9aa08c
Pull Request resolved: #163
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Mar 26, 2024
tianyu-l added a commit that referenced this pull request Mar 26, 2024
ghstack-source-id: 873078b338c59870c8525fbc9f6793532111ab16
Pull Request resolved: #163
@tianyu-l tianyu-l requested a review from drisspg March 26, 2024 20:18
Contributor

@drisspg drisspg left a comment

Looks good, nice cleanup!

@tianyu-l
Contributor Author

Closing this PR; use #180 instead, which works with the FSDP2 changes.

@tianyu-l tianyu-l closed this Mar 30, 2024
tianyu-l added a commit that referenced this pull request Aug 16, 2024
ghstack-source-id: 873078b338c59870c8525fbc9f6793532111ab16
Pull Request resolved: #163