RuntimeError When Saving Phi 3.5 Vision Due to Shared Tensors #223
Comments
Same issue here; I get an error even while saving the processor.
@ChenRocks any ideas on this?
For visibility, a contributor to a forked version of the Phi 3 Vision cookbook suggested the following solution, stating "You need to remove the wte weight. It's okay because when the model is loaded from the checkpoint, it will automatically copy the weight from the embedding weight."
This does indeed seem to work. However, it doesn't exactly fit into a use case that relies on the I opened a feature request against the
Can you folks here (@leestott @ChenRocks) comment on idea #2 above? Is there a general abstraction inherent to the architecture of Phi 3/3.5 Vision around which we could develop a heuristic (i.e., to avoid the naive conditional
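One possible architecture-agnostic heuristic, offered as a sketch under assumptions rather than anything taken from the Phi 3 code: group state-dict entries by the memory address their tensors start at, and treat any group with more than one name as shared. The helper name `find_shared_tensor_keys` is hypothetical. It relies only on each value exposing `data_ptr()`, as `torch.Tensor` does. It would miss views that start at different offsets of the same storage, and empty or meta tensors could be grouped by mistake.

```python
from collections import defaultdict


def find_shared_tensor_keys(state_dict):
    """Return groups of state-dict keys whose tensors start at the same address.

    Hypothetical helper: assumes every value has a data_ptr() method,
    as torch.Tensor does.
    """
    keys_by_ptr = defaultdict(list)
    for name, tensor in state_dict.items():
        keys_by_ptr[tensor.data_ptr()].append(name)
    # Only addresses reached through more than one key indicate sharing.
    return [sorted(names) for names in keys_by_ptr.values() if len(names) > 1]
```

With a loaded model, `find_shared_tensor_keys(model.state_dict())` should report the embedding pair if the two keys really do alias the same memory; one key per group could then be dropped before saving.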
@jjbuck @vjagannath786 @leestott You could check out the code I wrote. The model can also be saved by removing the wte weight in the Trainer class. https://github.com/2U1/Phi3-Vision-Finetune/blob/406eafbbf8c6d84d2a3cc0878376db0a86c39af2/src/training/trainer.py#L205-L210
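For readers who want the idea without opening the link, here is a minimal sketch of "removing the wte weight" before saving. It is not a copy of the linked trainer code. The helper name `drop_wte_weights` is hypothetical, and it assumes that the substring "wte" matches only the duplicated embedding inside the vision embedding module (for example a key such as `model.vision_embed_tokens.wte.weight`, which is itself an assumption about the checkpoint layout).

```python
def drop_wte_weights(state_dict, marker="wte"):
    """Return a copy of state_dict without entries whose key contains marker.

    Assumption: only the duplicated embedding weight has `marker` in its name.
    """
    return {name: value for name, value in state_dict.items() if marker not in name}


# Assumed usage with a loaded Hugging Face model (not run here):
# filtered = drop_wte_weights(model.state_dict())
# model.save_pretrained(output_dir, state_dict=filtered)
```

`save_pretrained` accepts a `state_dict` argument, so the filtered copy can be passed in without modifying the live model.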
@vjagannath786 You need to copy the You could do this before saving the processor
I’m trying to fine-tune Phi 3.5 Vision using transformers. However, I’m running into an issue trying to save the model during or after training. See below for a minimal reproducible example.
My example below seems to be essentially what's happening in the official "cookbook" example: https://github.com/microsoft/Phi-3CookBook/blob/main/code/04.Finetuning/vision_finetuning/finetune_hf_trainer_docvqa.py#L482-L485.
However, I also see from this other example (Phi-3CookBook/code/04.Finetuning/Phi-3-vision-Trainingscript.py, line 256 in 6566572) that safe_serialization=False is used. Is that strictly required? The example from finetune_hf_trainer_docvqa.py doesn't seem to use it, and it's not clear to me how that works successfully.
Does anyone have any pointers? This issue has been reported in a few other locations, but I haven't come across any solutions - see below.
The error suggests “saving using safe_serialization=False”…but I’m not sure what the implications of that are.
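For context on that flag: with safe_serialization=False, `save_pretrained` writes a pickle-based PyTorch checkpoint (`pytorch_model.bin`, possibly sharded) instead of a safetensors file. The pickle-based format can store tensors that share memory, which is presumably why the error message proposes it. The main trade-off is that pickle files can execute arbitrary code when loaded, so such checkpoints should only be loaded from trusted sources. A minimal sketch follows; the wrapper name `save_with_torch_format` is hypothetical, and `model` is assumed to be any object with a Hugging Face style `save_pretrained` method.

```python
def save_with_torch_format(model, output_dir):
    """Save `model` as a pickle-based PyTorch checkpoint instead of safetensors.

    Hypothetical wrapper: `model` is assumed to expose save_pretrained()
    with the Hugging Face signature.
    """
    # safe_serialization=False switches the on-disk format from safetensors
    # to torch.save, which tolerates tensors that share memory.
    model.save_pretrained(output_dir, safe_serialization=False)
    return output_dir
```

When checkpoints are written by `Trainer` rather than by a direct call, the corresponding switch should be `save_safetensors=False` in `TrainingArguments`; check this against the installed transformers version.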
Minimal Reproducible Example