From 603a6617ce31a8104775cceab6fc08e28ba9b924 Mon Sep 17 00:00:00 2001
From: "Dong, Bo"
Date: Mon, 4 Mar 2024 16:26:09 +0800
Subject: [PATCH] Update supported_models.md (#150)

* Update supported_models.md
---
 docs/supported_models.md | 64 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 59 insertions(+), 5 deletions(-)

diff --git a/docs/supported_models.md b/docs/supported_models.md
index 84c443010..05895a9f1 100644
--- a/docs/supported_models.md
+++ b/docs/supported_models.md
@@ -7,17 +7,19 @@ Neural Speed supports the following models:
 Model Name - INT8 - INT4 + INT8 + INT4 Transformer Version RTN GPTQ AWQ + AutoRound RTN GPTQ AWQ + AutoRound
@@ -31,6 +33,8 @@ Neural Speed supports the following models:
 ✅ ✅ ✅ + ✅ + ✅ Latest
@@ -42,6 +46,8 @@ Neural Speed supports the following models:
 ✅ ✅ ✅ + ✅ + ✅ Latest CodeLlama-7b
@@ -51,6 +57,8 @@ Neural Speed supports the following models:
 ✅ ✅ ✅ + ✅ + ✅ Latest
@@ -61,16 +69,20 @@ Neural Speed supports the following models:
 ✅ ✅ ✅ + ✅ + ✅ Latest GPT-J-6B ✅ - ✅ + + ✅ + Latest
@@ -78,9 +90,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -88,9 +102,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + 4.28.1 or newer
@@ -99,9 +115,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -110,9 +128,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -120,9 +140,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -132,9 +154,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -146,6 +170,8 @@ Neural Speed supports the following models:
 ✅ ✅ ✅ + ✅ + ✅ Latest
@@ -154,9 +180,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + 4.33.1
@@ -165,9 +193,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + 4.33.1
@@ -176,9 +206,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + 4.36.0 or newer
@@ -187,9 +219,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -199,9 +233,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -213,9 +249,11 @@ Neural Speed supports the following models:
 ✅ + ✅ + Latest
@@ -227,15 +265,19 @@ Neural Speed supports the following models:
 Model Name - INT8 - INT4 + INT8 + INT4 Transformer Version RTN GPTQ + AWQ + AutoRound RTN GPTQ + AWQ + AutoRound
@@ -246,6 +288,10 @@ Neural Speed supports the following models:
 ✅ ✅ ✅ + ✅ + ✅ + ✅ + ✅ Latest
@@ -254,6 +300,10 @@ Neural Speed supports the following models:
 ✅ ✅ ✅ + ✅ + ✅ + ✅ + ✅ Latest
@@ -262,8 +312,12 @@ Neural Speed supports the following models:
 StarCoder-15.5B ✅ + + ✅ + + Latest