Add ernie-3.0 mkldnn fp32 and int8 support #2468

Merged 4 commits into PaddlePaddle:develop on Jun 13, 2022

@lidanqing-intel lidanqing-intel commented Jun 9, 2022

PR types

Performance optimization

PR changes

Others

Description

Add ernie-3.0 fp32 and int8 mkldnn support. The Paddle build must include the Ernie-3.0 int8 fix #43297, "[Bug fix] Do not quantize weights Y when matmul X and Y both other ops outputs".

@lidanqing-intel lidanqing-intel changed the title from "add ernie-3.0 mkldnn fp32 and int8 support" to "Add ernie-3.0 mkldnn fp32 and int8 support" on Jun 9, 2022
@lidanqing-intel

Hi @yeliang2258, please review and merge this PR. With this PR, both fp32 and int8 work without using save_quant_model.py. Thanks.

@lidanqing-intel lidanqing-intel commented Jun 9, 2022

Paddle: 0d719718b308587efcb6b3547f925582a8009176

Ernie-3.0 FP32 mkldnn, 1 thread on ICX is 65.45 QPS

python infer.py --task_name tnews --model_path /home/guest/PaddleNLP/model_zoo/ernie-3.0/ernie-3.0/float32 --perf --device cpu --num_threads 1

Ernie-3.0 INT8 mkldnn, 1 thread on ICX is 153.77 QPS

python infer.py --task_name tnews --model_path /home/guest/PaddleNLP/model_zoo/ernie-3.0/ernie-3.0/int8 --perf --device cpu --num_threads 1 --enable_quantize
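
For a quick read on the two numbers above, the following sketch computes the INT8-over-FP32 throughput ratio and the implied average latency per query. The function names `speedup` and `latency_ms` are illustrative and not part of infer.py, and the latency figure assumes queries are processed one at a time, which this comment does not state.

```python
# QPS figures reported in this comment (1 thread on ICX).
FP32_QPS = 65.45
INT8_QPS = 153.77


def speedup(baseline_qps: float, optimized_qps: float) -> float:
    """Throughput ratio of the optimized run over the baseline run."""
    if baseline_qps <= 0:
        raise ValueError("baseline_qps must be positive")
    return optimized_qps / baseline_qps


def latency_ms(qps: float) -> float:
    """Average per-query latency in milliseconds.

    Assumption: queries are served strictly one at a time, so latency is
    simply the inverse of throughput.
    """
    if qps <= 0:
        raise ValueError("qps must be positive")
    return 1000.0 / qps


if __name__ == "__main__":
    print(f"INT8 vs FP32 speedup: {speedup(FP32_QPS, INT8_QPS):.2f}x")
    print(f"FP32 latency: {latency_ms(FP32_QPS):.2f} ms/query")
    print(f"INT8 latency: {latency_ms(INT8_QPS):.2f} ms/query")
```

With the reported figures this works out to roughly a 2.35x throughput gain for INT8 over FP32 on one thread.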

@ZeyuChen ZeyuChen requested a review from yeliang2258 June 10, 2022 00:32
@ZeyuChen ZeyuChen self-assigned this Jun 10, 2022

@yeliang2258 yeliang2258 left a comment

LGTM

@lidanqing-intel lidanqing-intel commented Jun 13, 2022

Hi @ZeyuChen, could you please merge this PR? @yeliang2258 has approved.
The docs/readthedocs CI check failed. Could you please suggest what I should do to make it pass? Thanks!

@ZeyuChen ZeyuChen left a comment

LGTM

@ZeyuChen ZeyuChen merged commit d338928 into PaddlePaddle:develop Jun 13, 2022
@ZeyuChen

@lidanqing-intel Thanks for your contributions!
