Error in 'KD-based Answer Assignment' #6

hegdekartik · 2023-06-09T20:53:58Z

Hi,

Thanks for the great work. I found it interesting and wanted to try it out, but we are getting errors in the 'KD-based Answer Assignment' step.

We get the following error when we run this command:

CUDA_VISIBLE_DEVICES=0 python main.py --dataset v2 --mode q_v_debias --debias learned_mixin --topq 1 --topv -1 --qvp 5 --output lmh_css --seed 2048

Traceback (most recent call last):
  File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/main.py", line 178, in <module>
    main()
  File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/main.py", line 175, in main
    train(model, train_loader, eval_loader, args,qid2type)
  File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/train.py", line 219, in train
    word_grad = torch.autograd.grad((pred * (a > 0).float()).sum(), word_emb, create_graph=True)[0]
  File "/home

So we tried the alternative approach, which uses a pretrained teacher model (CSS) downloaded from CSS-VQA. Unfortunately, after downloading 'model.pth' and running the 'Assign new answer' command, we got the error below.

CUDA_VISIBLE_DEVICES=0 python assign_answer.py --dataset v2 --name number --split high

DATASET LEN 443757
100%|███████████████████████████████████████████████████| 443757/443757 [00:00<00:00, 946121.94it/s]
100%|███████████████████████████████████████████████████| 443757/443757 [00:02<00:00, 167279.98it/s]
Get language bias, which is an input of CSS teacher model.
loading dictionary from data/dictionary.pkl
tokenize: 100%|██████████████████████████████████████████| 443757/443757 [00:04<00:00, 97819.23it/s]
tensorize: 100%|████████████████████████████████████████| 443757/443757 [00:04<00:00, 109012.99it/s]
Load model from: ./logs/lmh_css/model.pth
Traceback (most recent call last):
  File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/assign_answer.py", line 171, in <module>
    ood_model.load_state_dict(model_state)
  File "/home/kartik/.conda/envs/BLIP_env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1667, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for BaseModel:
        size mismatch for classifier.main.3.bias: copying a param with shape torch.Size([2274]) from checkpoint, the shape in current model is torch.Size([2410]).
        size mismatch for classifier.main.3.weight_v: copying a param with shape torch.Size([2274, 2048]) from checkpoint, the shape in current model is torch.Size([2410, 2048]).

How can I get rid of this error?

Thank you


ItemZheng · 2023-06-11T04:42:23Z

For the first error, can you provide more error logs? For the second error, you are missing the --teacher_path argument. The full command, as given in README.md, is:

CUDA_VISIBLE_DEVICES=0 python assign_answer.py --dataset [cpv2/v2] --name number --split high --teacher_path []
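Editorial note: the size mismatch in the second error shows that the checkpoint's final classifier layer has 2274 outputs while the model built by the script has 2410, which suggests the checkpoint and the model were set up with different answer vocabularies. A quick way to confirm this kind of problem before calling load_state_dict is to compare parameter shapes. The sketch below is only an illustration: `shape_mismatches` is a hypothetical helper, not part of KDDAug, and the shapes are copied from the error message above.

```python
def shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for shared names whose shapes differ."""
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the RuntimeError reported in this thread.
checkpoint_shapes = {
    "classifier.main.3.bias": (2274,),
    "classifier.main.3.weight_v": (2274, 2048),
}
model_shapes = {
    "classifier.main.3.bias": (2410,),
    "classifier.main.3.weight_v": (2410, 2048),
}

for name, (ckpt, model) in shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

With PyTorch, the two dictionaries could be built from the loaded checkpoint and from model.state_dict() by mapping each tensor to tuple(tensor.shape), assuming the checkpoint file stores a plain state dict.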

hegdekartik · 2023-06-11T06:36:37Z

For the second error, --teacher_path is an optional argument, so we placed model.pth in the folder specified in assign_answer.py, which is './logs/lmh_css/model.pth'.

Could you please provide the correct link to the right model.pth for this step?

Error logs for the first error:

Building train dataset...
caching-features: 100%|████████████████████████████████████| 443757/443757 [38:56<00:00, 189.96it/s]
tokenize: 100%|█████████████████████████████████████████| 443757/443757 [00:03<00:00, 119740.31it/s]
tensorize: 100%|████████████████████████████████████████| 443757/443757 [00:04<00:00, 106497.75it/s]
Building test dataset...
caching-features: 100%|████████████████████████████████████| 214354/214354 [18:59<00:00, 188.16it/s]
tokenize: 100%|██████████████████████████████████████████| 214354/214354 [00:04<00:00, 48356.19it/s]
tensorize: 100%|████████████████████████████████████████| 214354/214354 [00:01<00:00, 109298.11it/s]
Starting training...
Epoch 1:   0%|                                                              | 0/867 [00:00<?, ?it/s]/home/kartik/.conda/envs/BLIP_env/lib/python3.10/site-packages/torch/nn/functional.py:1967: UserWarning: nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.
  warnings.warn("nn.functional.sigmoid is deprecated. Use torch.sigmoid instead.")
Epoch 1:   0%|                                                              | 0/867 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/main.py", line 178, in <module>
    main()
  File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/main.py", line 175, in main
    train(model, train_loader, eval_loader, args,qid2type)
  File "/mnt/44b643af-38ed-4d24-abcc-00e81b36025c/kartik/KDDAug/train.py", line 280, in train
    visual_grad = torch.autograd.grad((pred * (a > 0).float()).sum(), v, create_graph=True)[0]
  File "/home/kartik/.conda/envs/BLIP_env/lib/python3.10/site-packages/torch/autograd/__init__.py", line 300, in grad
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512, 2048]], which is output 0 of ReluBackward0, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
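
Editorial note: this RuntimeError means that a tensor autograd saved for the backward pass (here a [512, 2048] ReLU output) was overwritten in place before torch.autograd.grad ran. One common way to trigger it is an in-place layer placed directly after a ReLU, for example nn.Dropout(..., inplace=True). Whether KDDAug's classifier is built this way is an assumption; the sketch below uses a hypothetical stand-in model with arbitrary sizes, and only shows that the in-place variant reproduces the same class of error while the out-of-place variant does not.

```python
import torch
import torch.nn as nn


def build_classifier(inplace_dropout):
    # Hypothetical stand-in for a Linear -> ReLU -> Dropout -> Linear head;
    # the layer sizes are arbitrary and not taken from the KDDAug code.
    return nn.Sequential(
        nn.Linear(16, 32),
        nn.ReLU(),
        nn.Dropout(0.5, inplace=inplace_dropout),
        nn.Linear(32, 4),
    )


def input_grad(model):
    model.train()  # dropout is only active in training mode
    x = torch.randn(8, 16, requires_grad=True)
    pred = model(x)
    # Same pattern as the failing line in train.py: gradient of the summed
    # prediction with respect to an input tensor.
    return torch.autograd.grad(pred.sum(), x, create_graph=True)[0]


try:
    input_grad(build_classifier(inplace_dropout=True))
    failed = False
except RuntimeError as err:
    # The in-place dropout overwrote the ReLU output that autograd had saved.
    failed = True
    print("in-place dropout:", str(err)[:80])

grad = input_grad(build_classifier(inplace_dropout=False))
print("out-of-place dropout: gradient shape", tuple(grad.shape))
```

If the model in this repository does contain such a layer, switching it to inplace=False would be one thing to try. The hint in the error message, torch.autograd.set_detect_anomaly(True), can help locate the forward operation whose backward pass failed.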

hegdekartik · 2023-07-06T18:04:46Z

Hi,

I am still having this issue. Can you please check and help me resolve this issue? Thank you.
