【AutoParallel】Promote fuselinear pass in auto-parallel #59188

heavyrain-lzy · 2023-11-21T03:01:19Z

PR types

Performance optimization

PR changes

Others

Description

Pcard-76459
Add a pass that runs before the fused_linear pass to improve performance. This pass mainly addresses the following scenario, which arises when MP or TP parallelism is enabled:

The original linear operator is as follows:

matmul --> add

After enabling MP or TP, some linear operators may become as follows:

matmul --> comm_op --> add

The communication operator sits between the matmul and the add, which prevents the two from being fused.
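To make the scenario concrete, here is a minimal pure-Python sketch of a row-parallel linear layer on two simulated ranks. It illustrates one way the bias add could be moved in front of the communication operator without changing the result: only rank 0 adds the bias, so the bias is counted once after the sum. This is an assumption made for illustration, not necessarily the mechanism this PR implements; `all_reduce_sum` is a hypothetical stand-in for the real communication operator.

```python
def matmul(a, b):
    """Plain matrix multiply on nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_bias(m, bias):
    """Add a bias vector to every row of a matrix."""
    return [[v + b for v, b in zip(row, bias)] for row in m]


def all_reduce_sum(parts):
    """Hypothetical stand-in for the comm_op: elementwise sum of per-rank matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*parts)]


x = [[1, 2, 3, 4], [5, 6, 7, 8]]                   # activations, shape (2, 4)
w = [[1, 0, 2], [0, 1, 1], [3, 1, 0], [2, 2, 1]]   # weight, shape (4, 3)
bias = [10, 20, 30]

# Row-parallel split over 2 ranks: x is split by columns, w by rows.
x_shards = [[row[:2] for row in x], [row[2:] for row in x]]
w_shards = [w[:2], w[2:]]
partials = [matmul(xs, ws) for xs, ws in zip(x_shards, w_shards)]

# Order from the description: matmul --> comm_op --> add
original = add_bias(all_reduce_sum(partials), bias)

# Reordered so matmul and add are adjacent: matmul --> add --> comm_op.
# Only rank 0 adds the bias, otherwise it would be summed once per rank.
reordered = all_reduce_sum(
    [add_bias(p, bias) if rank == 0 else p for rank, p in enumerate(partials)]
)

# Single-device reference result.
reference = add_bias(matmul(x, w), bias)
```

With integer inputs all three results are exactly equal, which shows that the adjacent matmul and add on rank 0 could in principle be handed to a fused kernel.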

experiment

The experiment was run on GPT-3 with 6.7B parameters, using a single host with 8 V100 GPUs:

strategy	tokens/s	speed change vs. baseline
no-FusedLinear	2166	baseline
FusedLinear	2203	+1.71%
FusedLinear-Promotion	2215	+2.26%
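The speed-change column can be reproduced from the tokens/s figures. The short sketch below assumes the percentages are measured relative to the no-FusedLinear baseline; the helper name `speed_change` is made up for illustration.

```python
BASELINE_TOKENS_PER_S = 2166  # no-FusedLinear

results = {
    "FusedLinear": 2203,
    "FusedLinear-Promotion": 2215,
}


def speed_change(tokens_per_s, baseline_tokens_per_s):
    """Relative throughput change versus the baseline, in percent."""
    return (tokens_per_s / baseline_tokens_per_s - 1.0) * 100.0


for name, tokens_per_s in results.items():
    change = speed_change(tokens_per_s, BASELINE_TOKENS_PER_S)
    print(f"{name}: {change:+.2f}%")
```

By the same formula, the promotion pass adds roughly 0.5% on top of FusedLinear alone (2215 versus 2203 tokens/s).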

paddle-bot · 2023-11-21T03:01:24Z

Your PR has been submitted. Thanks for your contribution to the open-source project!
Please wait for the CI results first. See the Paddle CI Manual for details.

zhiqiu

Add a unit test

heavyrain-lzy · 2023-12-04T03:32:45Z

Add a unit test

The unit test 'test_auto_parallel_fused_linear_promotion_pass.py' has been added. However, the CI/CE machines can't run it, because 'fused_gemm_epilogue' requires at least CUDA 11.6. I have run the test myself.
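A version requirement like the one mentioned above is commonly handled by skipping the test on older toolchains. The sketch below is an illustration only, not the code in this PR: `cuda_at_least`, `DETECTED_CUDA_VERSION`, and the test class are hypothetical names, and how the CUDA version string is obtained from the framework is left as an assumption.

```python
import unittest


def cuda_at_least(version, minimum=(11, 6)):
    """Return True if a 'major.minor' CUDA version string meets the minimum.

    Any string that cannot be parsed (for example one reported by a
    CPU-only build) is treated as not meeting the requirement.
    """
    try:
        parts = tuple(int(p) for p in str(version).split(".")[:2])
    except ValueError:
        return False
    if len(parts) < 2:
        parts += (0,)  # treat "12" as "12.0"
    return parts >= minimum


# Hypothetical placeholder: in a real test this value would come from the
# framework's own version information.
DETECTED_CUDA_VERSION = "11.2"


@unittest.skipIf(
    not cuda_at_least(DETECTED_CUDA_VERSION),
    "fused_gemm_epilogue requires CUDA 11.6 or newer",
)
class TestFusedLinearPromotionSketch(unittest.TestCase):
    def test_placeholder(self):
        # The real pass checks would go here.
        self.assertTrue(True)
```

With the placeholder value "11.2" the whole class is reported as skipped rather than failed, which keeps CI green on machines that cannot run the fused kernel.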

zhiqiu

LGTM

XiaoguangHu01

LGTM

heavyrain-lzy added 3 commits November 20, 2023 17:18

add fused_linear_promotion pass

f53b3c4

merge develop

af41b2e

add promote_fusedlinear pass

6400677

heavyrain-lzy added 7 commits November 21, 2023 20:49

support sp without dp

576dde0

delete some log

ef6acc9

Merge remote-tracking branch 'upstream/develop' into promote_fuselinear

30e8100

fix bug in process_mesh

b9703a4

add sp+dp support

d7f9826

fix bug when dp_group is None

5311717

modify code according to review

0c79561

zhiqiu reviewed Nov 30, 2023


heavyrain-lzy added 3 commits December 1, 2023 16:06

add unit_test

2c98caf

add unit_test

146c0a8

fix the test

7233d7d

zhiqiu approved these changes Dec 5, 2023


XiaoguangHu01 approved these changes Dec 6, 2023


heavyrain-lzy merged commit 77f80a5 into PaddlePaddle:develop Dec 6, 2023
