[AutoParallel] Add test for PIR refined recompute #9679

waliwali777 · 2024-12-23T15:16:18Z

PR types

Others

PR changes

Others

Description

Check the refined_ops_patterns training argument.
Add a test for PIR refined recompute.

Example usage in a YAML file:

enable_auto_parallel: 1
to_static: 1
recompute: 1
refined_ops_patterns:
  - {main_ops: [flash_attn], num: -1, pre_ops: [], suf_ops: []}
  - {main_ops: [matmul], num: 2, pre_ops: [], suf_ops: [add]}

Example usage as Python command-line arguments:

python -u -m paddle.distributed.launch \
                    --gpus "0,1,2,3" \
                    --log_dir $case_log_dir \
                    run_pretrain_auto.py \
                    --to_static \
                    --enable_auto_parallel 1 \
                    --recompute 1 \
                    --refined_ops_patterns '[{"main_ops":["matmul"],"num":-1,"pre_ops":["softmax"],"suf_ops":[]}]' \
                   ...
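
For reference, a minimal sketch of how a JSON string like the one passed to --refined_ops_patterns could be parsed and sanity-checked before use. The helper name parse_refined_ops_patterns and the specific checks are assumptions for illustration, not the code added in this PR; only the four keys (main_ops, num, pre_ops, suf_ops) are taken from the examples above.

```python
import json

# Keys taken from the YAML and command-line examples in this description.
REQUIRED_KEYS = ("main_ops", "num", "pre_ops", "suf_ops")


def parse_refined_ops_patterns(raw):
    """Hypothetical helper: turn the JSON string into a list of pattern dicts.

    The validation rules below are assumptions for illustration only.
    """
    if raw is None:
        # The flag defaults to None, meaning no refined patterns were given.
        return []

    patterns = json.loads(raw)
    if not isinstance(patterns, list):
        raise ValueError("refined_ops_patterns must be a JSON list of objects")

    for index, pattern in enumerate(patterns):
        if not isinstance(pattern, dict):
            raise ValueError(f"pattern {index} must be a JSON object")

        missing = [key for key in REQUIRED_KEYS if key not in pattern]
        if missing:
            raise ValueError(f"pattern {index} is missing keys: {missing}")

        # bool is a subclass of int in Python, so reject it explicitly.
        num = pattern["num"]
        if isinstance(num, bool) or not isinstance(num, int):
            raise ValueError(f"pattern {index}: 'num' must be an integer")

        for key in ("main_ops", "pre_ops", "suf_ops"):
            ops = pattern[key]
            if not isinstance(ops, list) or not all(isinstance(op, str) for op in ops):
                raise ValueError(f"pattern {index}: '{key}' must be a list of op names")

    return patterns


if __name__ == "__main__":
    cli_value = '[{"main_ops":["matmul"],"num":-1,"pre_ops":["softmax"],"suf_ops":[]}]'
    for pattern in parse_refined_ops_patterns(cli_value):
        print(pattern["main_ops"], pattern["num"])
```

The same helper would accept the YAML example once its list is serialized with json.dumps, since both forms describe a list of objects with identical keys.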

paddle-bot · 2024-12-23T15:16:28Z

Thanks for your contribution!

codecov · 2024-12-23T15:49:34Z

Codecov Report

Attention: Patch coverage is 40.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 52.49%. Comparing base (a52035f) to head (57da8d7).
Report is 3 commits behind head on develop.

Files with missing lines	Patch %	Lines
paddlenlp/trainer/auto_training_args.py	33.33%	6 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #9679      +/-   ##
===========================================
+ Coverage    52.46%   52.49%   +0.03%     
===========================================
  Files          727      723       -4     
  Lines       115028   114327     -701     
===========================================
- Hits         60353    60020     -333     
+ Misses       54675    54307     -368

ZHUI · 2025-01-06T03:00:19Z

paddlenlp/trainer/training_args.py

@@ -762,6 +762,7 @@ class TrainingArguments:
            "If a parameter is omitted, it defaults to `xxx:0`."
        },
    )
+    refined_ops_patterns: str = field(default=None, metadata={"help": "The pattern of refined recompute."})


If this can't be merged for now, how about putting it in paddlenlp/trainer/auto_training_args.py first?

ZHUI

LGTM

waliwali777 force-pushed the add_refined_recompute_test branch 2 times, most recently from 72ea11c to 8b31f4b Compare December 27, 2024 03:30

waliwali777 mentioned this pull request Dec 27, 2024

[PIR-Auto-Parallel]refactor refined recompute pass in PIR mode PaddlePaddle/Paddle#70064

Merged

waliwali777 force-pushed the add_refined_recompute_test branch 3 times, most recently from 3226de9 to 04cf542 Compare January 3, 2025 07:14

ZHUI reviewed Jan 6, 2025

waliwali777 added 3 commits January 6, 2025 12:00

Add test for refine recompute

385d7b8

move refined_ops_patterns flag to auto training args

7601ed5

fix

57da8d7

waliwali777 force-pushed the add_refined_recompute_test branch from 441bf79 to 57da8d7 Compare January 6, 2025 04:03

ZHUI approved these changes Jan 6, 2025

ZHUI merged commit 7e06e02 into PaddlePaddle:develop Jan 6, 2025
8 of 12 checks passed
