The original paper states that in the first phase, all parameters are frozen except those of the Multi-modal Understanding Adapters. My understanding is that LLaMA 2 should therefore only be fine-tuned in the third stage. In the code, however, LLaMA 2 appears to be fine-tuned with LoRA in all three stages, because names containing `llama` and `lora` show up among the trainable parameter names returned by `get_trainable_params` in every stage.
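For comparison, this is a minimal sketch of the stage-gated selection I would have expected from the paper's description. It is not the repository's code; the substrings `adapter`, `output_proj`, and `lora_` are only assumptions about how the parameter names might look:

```python
# Sketch only, not the repo's actual get_trainable_params: freeze everything
# except the parameter groups assigned to the given training stage.
from typing import List

import torch.nn as nn

STAGE_KEYWORDS = {
    1: ("adapter",),                         # stage 1: adapter only, everything else frozen
    2: ("adapter",),                         # placeholder; stage 2 is not discussed in this issue
    3: ("adapter", "output_proj", "lora_"),  # stage 3: LoRA weights + adapter + output projection
}


def get_trainable_params_by_stage(model: nn.Module, stage: int) -> List[nn.Parameter]:
    """Return the parameters that should receive gradients in `stage`."""
    keywords = STAGE_KEYWORDS[stage]
    trainable = []
    for name, param in model.named_parameters():
        if any(k in name for k in keywords):
            param.requires_grad = True
            trainable.append(param)
        else:
            param.requires_grad = False
    return trainable
```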
Also, the paper says that in the final training stage, the LoRA strategy is used to train the LLaMA 2 model while concurrently fine-tuning the Multi-modal Understanding Adapter and the Output Projection layer. But in the given code, the parameters of the Output Projection layer and the adapter are not trained in the third stage.
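If it helps with debugging, here is a hypothetical snippet (again not code from the repo) for checking which groups actually end up trainable after `get_trainable_params` has run for a given stage; the group substrings are illustrative:

```python
# Hypothetical check: count how many trainable parameters fall into each
# (assumed) name group after the stage's parameter selection has been applied.
from collections import Counter

import torch.nn as nn


def summarize_trainable(model: nn.Module,
                        groups=("adapter", "output_proj", "lora_", "llama")) -> None:
    counts = Counter()
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        for g in groups:
            if g in name:
                counts[g] += param.numel()
    for g in groups:
        print(f"{g:>12}: {counts[g]:,} trainable parameters")
```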