Add a const to easily tweak the dtype used by llama #47

LaurentMazare · 2023-06-30T14:02:33Z


* Add q4k quantization with imatrix
* Sketch some imatrix generation
* Fixes
* Add quantize_imatrix_onto
* Support loading the imatrix file
* Fix load_imatrix
* Implement imatrix quantization for q2k
* Implement imatrix quantization for q3k
* Fix build on cuda
* Add imatrix q5k, q6k quants

Add a const to easily tweak the dtype used for llama internal computations.

ed4d095
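For readers unfamiliar with the pattern, here is a minimal sketch of what a single dtype const might look like. It is not the code from this PR: the `DType` enum, the `DTYPE` const name, and the `buffer_bytes` helper are hypothetical stand-ins chosen for illustration, and the `F32` default is only an assumption.

```rust
// Hypothetical stand-in for a tensor library's dtype enum (not the PR's code).
#[allow(dead_code)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum DType {
    F16,
    F32,
}

impl DType {
    // Number of bytes needed to store one element of this dtype.
    const fn size_in_bytes(self) -> usize {
        match self {
            DType::F16 => 2,
            DType::F32 => 4,
        }
    }
}

// Single switch point (assumed name): edit this one line to change the dtype
// used for all internal computations.
const DTYPE: DType = DType::F32;

// Bytes needed for a buffer of `n` elements at the configured dtype.
fn buffer_bytes(n: usize) -> usize {
    n * DTYPE.size_in_bytes()
}
```

Because every allocation and cast would go through `DTYPE`, switching precision becomes a one-line change instead of an edit at each call site.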

LaurentMazare merged commit dbd7d5b into main Jun 30, 2023

LaurentMazare deleted the llama-f32 branch August 15, 2023 20:15

