Add ernie-3.0 mkldnn fp32 and int8 support #2468
Conversation
Hi @yeliang2258, please review and merge this PR. With this PR, both fp32 and int8 work without using save_quant_model.py. Thanks.

Paddle: Ernie-3.0 FP32 mkldnn, 1 thread on ICX: 65.45 QPS
Ernie-3.0 INT8 mkldnn, 1 thread on ICX: 153.77 QPS
LGTM
Hi @ZeyuChen, could you please merge this PR? @yeliang2258 has approved.
LGTM
@lidanqing-intel Thanks for your contributions!
PR types
Performance optimization
PR changes
Others
Description
Add ernie-3.0 fp32 and int8 mkldnn support. Paddle needs to include the Ernie-3.0 int8 fix #43297 ([Bug fix] Do not quantize weights Y when matmul inputs X and Y are both outputs of other ops).
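For context, a minimal sketch of how mkldnn fp32 and int8 inference can be enabled for an exported Ernie-3.0 model through the Paddle Inference Python API. The model paths and the `create_config` helper are hypothetical and not taken from this PR; `enable_mkldnn()` and `enable_mkldnn_int8()` are existing `paddle.inference.Config` methods (the latter requires a Paddle build with the #43297 fix for quantized Ernie-3.0 models to run correctly):

```python
# Sketch: enabling mkldnn (oneDNN) fp32 / int8 CPU inference with
# Paddle Inference. Paths and helper name are placeholders.
import paddle.inference as paddle_infer

def create_config(model_file, params_file, use_int8=False):
    config = paddle_infer.Config(model_file, params_file)
    config.disable_gpu()
    # Single CPU thread, matching the 1-thread ICX benchmark above.
    config.set_cpu_math_library_num_threads(1)
    config.enable_mkldnn()  # fp32 mkldnn kernels
    if use_int8:
        # int8 mkldnn execution; with this PR the quantized model runs
        # directly, without a save_quant_model.py preprocessing step.
        config.enable_mkldnn_int8()
    return config

# Usage (model paths are hypothetical):
config = create_config("ernie3.0/model.pdmodel",
                       "ernie3.0/model.pdiparams",
                       use_int8=True)
predictor = paddle_infer.create_predictor(config)
```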