train : fix KQ_pos allocation #3392
Conversation
I am currently letting it run a test finetune & train to see if it actually works, but from looking at it I think this should be correct. On a side note, with a function like `ggml_range`, code such as

```c
struct ggml_tensor * KQ_pos = ggml_new_tensor_1d(ctx, GGML_TYPE_I32, N);
ggml_allocr_alloc(alloc, KQ_pos);
if (!ggml_allocr_is_measure(alloc)) {
    int * data = (int *) KQ_pos->data;
    for (int i = 0; i < N; ++i) {
        data[i] = n_past + i;
    }
}
```

would then just look like this:

```c
struct ggml_tensor * KQ_pos = ggml_range(ctx, GGML_TYPE_I32, 0, N, 1);
ggml_allocr_alloc(alloc, KQ_pos);
```
Yup,

Ok, after testing the train & finetune from this PR, I will make a PR for ggml_range.
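For reference, a minimal sketch of what such a helper could look like for the I32 case, assuming it fills the data eagerly the same way the snippet above does; the name `ggml_range_i32`, the extra allocator parameter, and the whole signature are hypothetical, not part of ggml:

```c
// Hypothetical helper (ggml_range did not exist at the time of this PR):
// creates a 1-D I32 tensor holding start, start+step, ... and fills it
// only outside of the allocator's measure pass.
static struct ggml_tensor * ggml_range_i32(
        struct ggml_context * ctx, struct ggml_allocr * alloc,
        int32_t start, int32_t stop, int32_t step) {
    const int64_t n = (stop - start + step - 1) / step; // element count, positive step assumed
    struct ggml_tensor * t = ggml_new_tensor_1d(ctx, GGML_TYPE_I32, n);
    ggml_allocr_alloc(alloc, t);
    if (!ggml_allocr_is_measure(alloc)) {
        int32_t * data = (int32_t *) t->data;
        for (int64_t i = 0; i < n; ++i) {
            data[i] = start + (int32_t) i * step;
        }
    }
    return t;
}
```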
Making sure that KQ_pos is not reallocated was missing in finetune.
Performed some finetune and train tests; the results indicate that it works.
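For context, one common way to keep a tensor from being reallocated by the graph allocator is to insert a temporary node that depends on it, so its buffer stays live for the whole graph. A minimal sketch, assuming a build graph `gb`, a context `ctx`, and the tensor-valued `ggml_scale_inplace` of that era; the exact lines in the PR may differ:

```c
// Pin KQ_pos by making a graph node depend on it: the in-place scale by 1.0f
// references KQ_pos, so the allocator cannot reuse its buffer for later nodes.
struct ggml_tensor * one = ggml_new_f32(ctx, 1.0f);
ggml_build_forward_expand(gb, ggml_scale_inplace(ctx, KQ_pos, one));
```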
…example

* 'master' of github.com:ggerganov/llama.cpp:
  ggml-cuda : perform cublas mat mul of quantized types as f16 (ggerganov#3412)
  llama.cpp : add documentation about rope_freq_base and scale values (ggerganov#3401)
  train : fix KQ_pos allocation (ggerganov#3392)
  llama : quantize up to 31% faster on Linux and Windows with mmap (ggerganov#3206)
  readme : update hot topics + model links (ggerganov#3399)
  readme : add link to grammars app (ggerganov#3388)
  swift : fix build on xcode 15 (ggerganov#3387)
  build : enable more non-default compiler warnings (ggerganov#3200)
  ggml_tensor: update the structure comments. (ggerganov#3283)
  ggml : release the requested thread pool resource (ggerganov#3292)
  llama.cpp : split llama_context_params into model and context params (ggerganov#3301)
  ci : multithreaded builds (ggerganov#3311)
  train : finetune LORA (ggerganov#2632)
  gguf : basic type checking in gguf_get_* (ggerganov#3346)
  gguf : make token scores and types optional (ggerganov#3347)
  ci : disable freeBSD builds due to lack of VMs (ggerganov#3381)
  llama : custom attention mask + parallel decoding + no context swaps (ggerganov#3228)
  docs : mark code as Bash (ggerganov#3375)
  readme : add Mistral AI release 0.1 (ggerganov#3362)
  ggml-cuda : perform cublas fp16 matrix multiplication as fp16 (ggerganov#3370)
* train : fix KQ_pos allocation
* make sure KQ_pos is not reallocated in finetune

---------

Co-authored-by: xaedes <[email protected]>
fix #3389
The changes from #3228 seem to have broken the train examples. I think this should fix it.