Hi,
I am trying to follow the instructions in dobf.md to evaluate TransCoder_model_1.pth on clone detection. After I run the following command, an error related to reloading the model appears. I wonder whether I did something wrong or whether the script needs to be modified to evaluate the TransCoder model on CodeXGlue.
tee: logs/TransCoder_model_1_roberta_java_05_12_lr2.5e-5.log: No such file or directory
adding to path /home/h6ju/CodeGen
05/18/2022 17:29:14 - WARNING - __main__ - Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
/home/h6ju/CodeGen/TransCoder_model_1.pth
Traceback (most recent call last):
  File "run.py", line 642, in <module>
    main()
  File "run.py", line 596, in main
    model = model_class.from_pretrained(args.model_name_or_path,
  File "/home/h6ju/CodeGen/codegen_sources/wrappers/models.py", line 160, in from_pretrained
    model.reload_model(model_path)
  File "/home/h6ju/CodeGen/codegen_sources/wrappers/models.py", line 124, in reload_model
    self.transformer.load_state_dict(model_reloaded, strict=True)
  File "/project/6001884/h6ju/CodeGen/newCodeGen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TransformerModel:
    size mismatch for position_embeddings.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([2048, 1024]).
Even after I modified strict=True to strict=False in models.py, the same error still appears.
Thank you for your great help!
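(Editor's note: changing the strict flag cannot fix this particular error. In PyTorch, strict=False only tolerates missing or unexpected keys in the state dict; parameter shape mismatches are always reported and raise a RuntimeError. A minimal sketch, using small hypothetical embedding sizes to mirror the 1024-vs-2048 mismatch in the stack trace:)

```python
import torch.nn as nn

# strict=False ignores missing/unexpected keys, but NOT shape mismatches.
# These sizes are illustrative stand-ins for the checkpoint/config mismatch.
checkpoint_model = nn.Embedding(1024, 16)  # shape as saved in the checkpoint
current_model = nn.Embedding(2048, 16)     # shape the current config builds

try:
    current_model.load_state_dict(checkpoint_model.state_dict(), strict=False)
except RuntimeError as e:
    print("still fails:", "size mismatch" in str(e))  # → still fails: True
```

So the fix has to make the model configuration match the checkpoint, not relax the loading check.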
Hi,
TransCoder is not really made as a pre-trained model for tasks like clone detection. What you are trying to do would reload only the encoder of TransCoder and fine-tune it for clone detection.
It seems the size of the model in the checkpoint doesn't match the config. I believe this is because you are using a config for a RoBERTa-sized model (12 layers, dim 1024) while TransCoder has 6 layers of dim 2048. I would have expected both the config and the model to be reloaded from your $MODEL checkpoint, but that does not seem to be happening.
The xlm_java model type would use the right tokenizer for this model, but I suspect you will have the same parameter mismatch error.
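(Editor's note: one quick way to confirm a checkpoint/config mismatch like this is to print the parameter shapes actually stored in the checkpoint and compare them against the model the config builds. A hedged sketch, using a toy in-memory checkpoint in place of the real TransCoder_model_1.pth; the nested "model" key is an assumption, since some checkpoints wrap the state dict:)

```python
import os
import tempfile

import torch

# Toy checkpoint standing in for the real .pth file; the shape mirrors the
# position_embeddings size reported in the stack trace above.
toy = {"position_embeddings.weight": torch.zeros(1024, 1024)}
path = os.path.join(tempfile.mkdtemp(), "toy.pth")
torch.save(toy, path)

state = torch.load(path, map_location="cpu")
state = state.get("model", state)  # some checkpoints nest the state dict
for name, tensor in state.items():
    print(name, tuple(tensor.shape))  # → position_embeddings.weight (1024, 1024)
```

If the printed shapes match the "from checkpoint" sizes in the error but not the "current model" sizes, the config being instantiated is the wrong one.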
Hi,
Thanks for your reply! Based on that, may I ask whether the same error would happen if I tried to evaluate TransCoder on another CodeXGlue benchmark, such as code-to-code translation or code completion? Thanks again! If TransCoder does not fit with CodeXGlue, I will try the dobf_plus_denoising model instead!
Hi,
We definitely managed to test models with the same encoder parameters as TransCoder on CodeXGlue before. I have not tested it recently, and based on your stack trace I suspect there would still be a bug with xlm_java instead of roberta_java.
I would need to look into it further to fix this bug.