Convert f32 tensors to f16 as well.
ycros committed Jan 29, 2024
1 parent 4970f35 commit 4b93970
Showing 1 changed file with 6 additions and 0 deletions.
llama.cpp
@@ -9539,10 +9539,16 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
             quantize = true;
         }
 
+
         enum ggml_type new_type;
         void * new_data;
         size_t new_size;
 
+        if (tensor->type == GGML_TYPE_F32) {
+            quantize = true;
+            new_type = GGML_TYPE_F16;
+        }
+
         if (quantize) {
             new_type = quantized_type;
             if (!params->pure) {
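A minimal, self-contained sketch of the type-selection flow this hunk modifies, assuming the surrounding code matches the context lines shown above. The trimmed ggml_type enum, the pick_type helper, and main are hypothetical scaffolding for illustration, not llama.cpp's actual API; only the two if statements mirror the diff.

#include <cstdio>

// Trimmed, hypothetical stand-in for ggml's tensor type enum.
enum ggml_type { GGML_TYPE_F32, GGML_TYPE_F16, GGML_TYPE_Q4_K };

// Hypothetical helper mirroring the control flow in the hunk: decide the output
// type for one tensor from the quantizer's earlier quantize/no-quantize decision
// and the requested quantized type.
static ggml_type pick_type(ggml_type tensor_type, bool quantize, ggml_type quantized_type) {
    ggml_type new_type = tensor_type; // default: keep the tensor's current type

    // The change in this commit: f32 tensors are no longer left as-is; they are
    // marked for conversion and default to f16.
    if (tensor_type == GGML_TYPE_F32) {
        quantize = true;
        new_type = GGML_TYPE_F16;
    }

    // Pre-existing path from the context lines: tensors marked for quantization
    // take the requested quantized type (the real function may refine this further,
    // e.g. when !params->pure, which is not modeled here).
    if (quantize) {
        new_type = quantized_type;
    }

    return new_type;
}

int main() {
    // When converting a model to f16, an f32 tensor that was not otherwise
    // selected for quantization now comes out as f16 instead of staying f32.
    printf("f32 -> %d\n", pick_type(GGML_TYPE_F32, /*quantize=*/false, GGML_TYPE_F16));
    printf("f16 -> %d\n", pick_type(GGML_TYPE_F16, /*quantize=*/false, GGML_TYPE_F16));
    return 0;
}

In short, with this commit a float32 tensor that would previously have been copied through untouched is at least down-converted to float16, and takes the requested quantized type whenever the quantize branch applies.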
