This repository has been archived by the owner on Jan 15, 2024. It is now read-only.
Sorry for missing the message. We're working on cleaning up the code and the solution, and hope to have a PR soon. I'm not familiar with the status of GPT2 in GluonNLP. Could you please point me to the scripts and let me know whether the model can be exported as a static model?
Description
The release notes at https://github.com/dmlc/gluon-nlp/releases/tag/v0.8.1 say that BERT int8 quantization is presented in this blog post:
https://medium.com/apache-mxnet/optimization-for-bert-inference-performance-on-cpu-3bb2413d376c
However, the blog post only shows some results from the BERT quantization tests.
When will this work be released, and can we apply this quantization method to GPT2?
Thanks a lot for the great work!