The original paper states that in the first phase, all parameters are frozen except those of the Multi-modal Understanding Adapters. My understanding is that LLaMA 2 should therefore only be fine-tuned in the third stage. In the code, however, LLaMA 2 appears to be fine-tuned with LoRA in all three stages, because names containing `llama` and `lora` show up among the trainable parameter names returned by `get_trainable_params` in every stage.
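For comparison, this is a minimal sketch of the stage-gated selection I would have expected from the paper's description. It is not the repository's code; the substrings `adapter`, `output_proj`, and `lora_` are only assumptions about how the parameter names might look:

```python
# Sketch only, not the repo's actual get_trainable_params: freeze everything
# except the parameter groups assigned to the given training stage.
from typing import List

import torch.nn as nn

STAGE_KEYWORDS = {
    1: ("adapter",),                         # stage 1: adapter only, everything else frozen
    2: ("adapter",),                         # placeholder; stage 2 is not discussed in this issue
    3: ("adapter", "output_proj", "lora_"),  # stage 3: LoRA weights + adapter + output projection
}


def get_trainable_params_by_stage(model: nn.Module, stage: int) -> List[nn.Parameter]:
    """Return the parameters that should receive gradients in `stage`."""
    keywords = STAGE_KEYWORDS[stage]
    trainable = []
    for name, param in model.named_parameters():
        if any(k in name for k in keywords):
            param.requires_grad = True
            trainable.append(param)
        else:
            param.requires_grad = False
    return trainable
```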
Also, the paper says that in the final training stage, the LoRA strategy is used to train the LLaMA 2 model while concurrently fine-tuning the Multi-modal Understanding Adapter and the Output Projection layer. But in the given code, the parameters of the Output Projection layer and the adapter are not trained in the third stage.
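If it helps with debugging, here is a hypothetical snippet (again not code from the repo) for checking which groups actually end up trainable after `get_trainable_params` has run for a given stage; the group substrings are illustrative:

```python
# Hypothetical check: count how many trainable parameters fall into each
# (assumed) name group after the stage's parameter selection has been applied.
from collections import Counter

import torch.nn as nn


def summarize_trainable(model: nn.Module,
                        groups=("adapter", "output_proj", "lora_", "llama")) -> None:
    counts = Counter()
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        for g in groups:
            if g in name:
                counts[g] += param.numel()
    for g in groups:
        print(f"{g:>12}: {counts[g]:,} trainable parameters")
```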