This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

Optimize Inference Performance on CPU #1035

Open
carter54 opened this issue Dec 4, 2019 · 6 comments
Labels
enhancement New feature or request

Comments


carter54 commented Dec 4, 2019

Description

The release notes at https://github.com/dmlc/gluon-nlp/releases/tag/v0.8.1 mention that BERT int8 quantization is presented in this blog post:
https://medium.com/apache-mxnet/optimization-for-bert-inference-performance-on-cpu-3bb2413d376c
But the blog post only shows some results of the BERT quantization tests:

The work on low precision deployment is still ongoing and involves un-released SW, the reproduction instructions will be available later.

When will this work be released, and can we apply this quantization method to GPT2?

Thanks a lot for the great work!
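(Editor's note: for readers unfamiliar with the technique under discussion, int8 quantization maps float32 tensors to 8-bit integers using a scale factor, which is what enables the CPU speedups described in the blog post. Below is a minimal NumPy sketch of symmetric per-tensor quantization; it is purely illustrative and not the GluonNLP/MKL-DNN implementation referenced in the thread.)

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = np.max(np.abs(x)) / 127.0  # one scale for the whole tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover approximate float32 values from the int8 codes."""
    return q.astype(np.float32) * scale

x = np.array([-1.5, -0.2, 0.0, 0.7, 3.1], dtype=np.float32)
q, scale = quantize_int8(x)
x_hat = dequantize(q, scale)
# Rounding error is bounded by half the quantization step (scale / 2).
print(np.max(np.abs(x - x_hat)))
```

In real deployments the scale is usually calibrated on representative data rather than taken from a single tensor, and matrix multiplies are carried out directly in int8 before dequantizing the result.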

carter54 added the "enhancement (New feature or request)" label on Dec 4, 2019
leezu (Contributor) commented Dec 4, 2019

@TaoLv

TaoLv (Member) commented Dec 9, 2019

Sorry for missing the message. We're working on cleaning up the code and the solution, and hope to have a PR soon. I'm not familiar with the status of GPT2 in GluonNLP. Could you please point me to the scripts and tell me whether it can be exported as a static model?

leezu (Contributor) commented Dec 9, 2019

Yes, the static GPT2 model was recently added: #1010

carter54 (Author) commented

Thanks for the replies, @leezu @TaoLv!
Looking forward to trying int8 BERT and GPT2 soon~

TaoLv (Member) commented Feb 2, 2020

@carter54 FYI, here is the PR for BERT quantization: #1080

carter54 (Author) commented Feb 13, 2020

@TaoLv Thanks for the work. Can this method be applied to the GPT2 model?

3 participants