
Evaluate Transcoder_model_1 on CodeXGlue benchmark #74

Open
LANCHERBA opened this issue May 19, 2022 · 3 comments
Labels
bug Something isn't working

Comments

@LANCHERBA

Hi,
I am trying to follow the instructions in dobf.md to evaluate TransCoder_model_1.pth on Clone detection. After I run the following command, an error related to reloading the model appears. I wonder whether I did something wrong or whether the script needs to be modified to evaluate the TransCoder model on CodeXGlue.

SOURCEDIR=/home/h6ju/CodeGen
MODEL=/home/h6ju/CodeGen/TransCoder_model_1.pth
lr=2.5e-5
export PYTHONPATH=/home/h6ju/CodeGen
source $SOURCEDIR/newCodeGen/bin/activate

cd CodeXGLUE/Code-Code/Clone-detection-BigCloneBench/code; bash run_xlm_general.sh $MODEL 12 05 roberta_java TransCoder_model_1 $lr 2>&1 | tee logs/TransCoder_model_1_roberta_java_05_12_lr$lr.log

Then the following error appears:

tee: logs/TransCoder_model_1_roberta_java_05_12_lr2.5e-5.log: No such file or directory
adding to path /home/h6ju/CodeGen
05/18/2022 17:29:14 - WARNING - __main__ -   Process rank: -1, device: cuda, n_gpu: 1, distributed training: False, 16-bits training: False
/home/h6ju/CodeGen/TransCoder_model_1.pth
Traceback (most recent call last):
  File "run.py", line 642, in <module>
    main()
  File "run.py", line 596, in main
    model = model_class.from_pretrained(args.model_name_or_path,
  File "/home/h6ju/CodeGen/codegen_sources/wrappers/models.py", line 160, in from_pretrained
    model.reload_model(model_path)
  File "/home/h6ju/CodeGen/codegen_sources/wrappers/models.py", line 124, in reload_model
    self.transformer.load_state_dict(model_reloaded, strict=True)
  File "/project/6001884/h6ju/CodeGen/newCodeGen/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1406, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for TransformerModel:
        size mismatch for position_embeddings.weight: copying a param with shape torch.Size([1024, 1024]) from checkpoint, the shape in current model is torch.Size([2048, 1024]).

Even after I changed strict=True to strict=False in models.py, the same error still appears.
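(As a side note on why strict=False does not help here: strict=False only tolerates missing or unexpected keys in the state dict; a shape mismatch on a key that exists in both the checkpoint and the model still raises a RuntimeError. A minimal sketch with a toy embedding, not the repo's actual model, reproduces this:)

```python
import torch
import torch.nn as nn

# Toy reproduction: the model expects 2048 positions, the "checkpoint"
# only stores 1024 (mirroring the position_embeddings mismatch above).
model = nn.Embedding(2048, 4)
ckpt = {"weight": torch.zeros(1024, 4)}

try:
    # strict=False skips missing/unexpected keys, but NOT shape mismatches.
    model.load_state_dict(ckpt, strict=False)
except RuntimeError as e:
    print("size mismatch" in str(e))  # → True
```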
Thank you for your great help!

@baptisteroziere
Contributor

Hi,
TransCoder is not really made for pre-training a model for tasks like clone detection. What you are trying to do would reload only the encoder of TransCoder and fine-tune it for clone detection.
It seems like the size of the model in the checkpoint doesn't match the config. I believe it is because you are using a config for a RoBERTa-size model (12 layers, dim 1024) while TransCoder had 6 layers of dim 2048. I would have expected both the config and the model to be reloaded from your $MODEL checkpoint, but it seems that is not happening.
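(One way to debug this kind of mismatch is to inspect what shapes the checkpoint actually stores before reloading it. A hedged sketch, not the repo's API: real TransCoder checkpoints nest the weights under keys such as "model" or "encoder", so the toy checkpoint below just imitates that layout.)

```python
import os
import tempfile
import torch

# Build a toy checkpoint that imitates the nested layout of a real one.
path = os.path.join(tempfile.mkdtemp(), "toy_ckpt.pth")
torch.save(
    {"model": {"position_embeddings.weight": torch.zeros(1024, 1024)}},
    path,
)

# Load on CPU and unwrap the nested state dict if present.
ckpt = torch.load(path, map_location="cpu")
state = ckpt.get("model", ckpt)

# Print each stored parameter's shape to compare against the model config.
for name, tensor in state.items():
    print(name, tuple(tensor.shape))  # → position_embeddings.weight (1024, 1024)
```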
The xlm_java model type would use the right tokenizer for this model, but I suspect you will have the same parameter mismatch error.

You can also use one of the RoBERTa-size models that we trained to compare against CodeBERT and GraphCodeBERT, such as this one: https://dl.fbaipublicfiles.com/transcoder/pre_trained_models/dobf_plus_denoising.pth

@LANCHERBA
Author

LANCHERBA commented May 31, 2022

Hi,
Thanks for your reply! Based on that, could I ask whether the same error will happen if I try to evaluate TransCoder on any other CodeXGlue benchmark, such as code-to-code translation or code completion? Thanks again! If TransCoder does not fit CodeXGlue, I will try the dobf_plus_denoising model instead!

@baptisteroziere baptisteroziere added the bug Something isn't working label Jun 1, 2022
@baptisteroziere
Contributor

Hi,
We have definitely managed to test models with the same encoder parameters as TransCoder on CodeXGlue before. I have not tested it recently, and based on your stack trace I suspect there will still be a bug with xlm_java instead of roberta_java.
I would need to look into that further to solve this bug.
