
Do I need to download the tremendous weights again? #44

Closed
fanlyu opened this issue Apr 30, 2019 · 6 comments

Comments

@fanlyu

fanlyu commented Apr 30, 2019

I see the code has been updated, so do I need to download the huge weights again?

@apsdehal
Contributor

Mostly not, except for the new model. The features can stay the same, but they will require some directory restructuring.

@fanlyu
Author

fanlyu commented Apr 30, 2019

I tried the old-version weights by putting vqa/train2014 and vqa/val2014 into train_val_2014.
Then I ran the code with the pythia model as
python tools/run.py --tasks vqa --datasets vqa2 --model pythia --config configs/vqa/vqa2/pythia.yml

But I got the error below. How do I solve it?

2019-04-30T17:16:15 INFO: Starting training...
2019-04-30T17:16:16 ERROR: 'Traceback (most recent call last):\n  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop\n    samples = collate_fn([dataset[i] for i in batch_indices])\n  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>\n    samples = collate_fn([dataset[i] for i in batch_indices])\n  File "/home/lvfan/pythia_new/pythia/tasks/multi_task.py", line 73, in __getitem__\n    item = self.chosen_task[idx]\n  File "/home/lvfan/pythia_new/pythia/tasks/base_task.py", line 154, in __getitem__\n    item = self.chosen_dataset[idx]\n  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 81, in __getitem__\n    return self.datasets[dataset_idx][sample_idx]\n  File "/home/lvfan/pythia_new/pythia/tasks/base_dataset.py", line 49, in __getitem__\n    sample = self.get_item(idx)\n  File "/home/lvfan/pythia_new/pythia/tasks/vqa/vqa2/dataset.py", line 93, in get_item\n    return self.load_item(idx)\n  File "/home/lvfan/pythia_new/pythia/tasks/vqa/vqa2/dataset.py", line 127, in load_item\n    current_sample = self.add_answer_info(sample_info, current_sample)\n  File "/home/lvfan/pythia_new/pythia/tasks/vqa/vqa2/dataset.py", line 161, in add_answer_info\n    {"answers": answers, "tokens": sample_info["ocr_tokens"]}\nKeyError: \'ocr_tokens\'\n'
Traceback (most recent call last):
  File "tools/run.py", line 87, in <module>
    run()
  File "tools/run.py", line 76, in run
    trainer.train()
  File "/home/lvfan/pythia_new/pythia/common/trainer.py", line 240, in train
    for batch in self.train_loader:
  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
KeyError: 'Traceback (most recent call last):\n  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop\n    samples = collate_fn([dataset[i] for i in batch_indices])\n  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>\n    samples = collate_fn([dataset[i] for i in batch_indices])\n  File "/home/lvfan/pythia_new/pythia/tasks/multi_task.py", line 73, in __getitem__\n    item = self.chosen_task[idx]\n  File "/home/lvfan/pythia_new/pythia/tasks/base_task.py", line 154, in __getitem__\n    item = self.chosen_dataset[idx]\n  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/utils/data/dataset.py", line 81, in __getitem__\n    return self.datasets[dataset_idx][sample_idx]\n  File "/home/lvfan/pythia_new/pythia/tasks/base_dataset.py", line 49, in __getitem__\n    sample = self.get_item(idx)\n  File "/home/lvfan/pythia_new/pythia/tasks/vqa/vqa2/dataset.py", line 93, in get_item\n    return self.load_item(idx)\n  File "/home/lvfan/pythia_new/pythia/tasks/vqa/vqa2/dataset.py", line 127, in load_item\n    current_sample = self.add_answer_info(sample_info, current_sample)\n  File "/home/lvfan/pythia_new/pythia/tasks/vqa/vqa2/dataset.py", line 161, in add_answer_info\n    {"answers": answers, "tokens": sample_info["ocr_tokens"]}\nKeyError: \'ocr_tokens\'\n'

@apsdehal
Contributor

That's an imdb-related issue. It would be better if you download the new imdbs; they are not that big. Alternatively, I also pushed a small change that makes sure tokens are not loaded for answers when use_ocr is False.
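For readers hitting the same KeyError: the traceback shows add_answer_info unconditionally reading sample_info["ocr_tokens"], a key that old imdbs simply do not contain. A minimal sketch of the kind of guard the fix adds; the class and attribute names here (AnswerProcessor, use_ocr) only mirror the names visible in the traceback, and the real Pythia code may differ:

```python
class AnswerProcessor:
    """Sketch of guarding OCR-token access behind the use_ocr flag.

    Hypothetical stand-in for the dataset code in the traceback above;
    not the actual Pythia implementation.
    """

    def __init__(self, use_ocr=False):
        self.use_ocr = use_ocr

    def add_answer_info(self, sample_info, sample):
        answers = sample_info.get("answers", [])
        if self.use_ocr:
            # Only touch ocr_tokens when OCR is actually enabled; reading it
            # unconditionally is what raised KeyError on old imdbs.
            tokens = sample_info["ocr_tokens"]
        else:
            # With OCR disabled, old imdbs without the key still load fine.
            tokens = []
        sample["answers_info"] = {"answers": answers, "tokens": tokens}
        return sample
```

With use_ocr=False the old imdbs load without error; with use_ocr=True the new imdbs (which carry the key) are still handled.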

@fanlyu
Author

fanlyu commented May 3, 2019

> That's an imdb-related issue. It would be better if you download the new imdbs; they are not that big. Alternatively, I also pushed a small change that makes sure tokens are not loaded for answers when use_ocr is False.

Hey, that addressed the problem, but I found that when this runs:

def finalize(self):
    torch.save(self.trainer.model, self.pth_filepath)
    torch.save(self.trainer.model.state_dict(), self.params_filepath)

it fails with this error:

2019-05-03T17:18:35 ERROR: can't pickle _thread.lock objects
Traceback (most recent call last):
  File "tools/run.py", line 87, in <module>
    run()
  File "tools/run.py", line 76, in run
    trainer.train()
  File "/home/lvfan/pythia_new/pythia/common/trainer.py", line 260, in train
    self.finalize()
  File "/home/lvfan/pythia_new/pythia/common/trainer.py", line 296, in finalize
    self.checkpoint.finalize()
  File "/home/lvfan/pythia_new/pythia/utils/checkpoint.py", line 243, in finalize
    torch.save(self.trainer.model, self.pth_filepath)
  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 219, in save
    return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 144, in _with_file_like
    return body(f)
  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 219, in <lambda>
    return _with_file_like(f, "wb", lambda f: _save(obj, f, pickle_module, pickle_protocol))
  File "/home/lvfan/anaconda3/lib/python3.7/site-packages/torch/serialization.py", line 292, in _save
    pickler.dump(obj)
TypeError: can't pickle _thread.lock objects

That is weird, and it makes inference fail.
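For context, torch.save(model, path) pickles the whole module object, so any unpicklable attribute attached to it (a logger handle, a summary writer, anything holding a threading.Lock) triggers exactly this TypeError, while torch.save(model.state_dict(), path) serializes only the parameter data. A minimal stdlib illustration of the same failure mode, using plain pickle instead of torch and a hypothetical FakeModel class:

```python
import pickle
import threading

class FakeModel:
    """Hypothetical stand-in for a model carrying an unpicklable attribute,
    e.g. a logger or summary writer that holds a threading.Lock."""

    def __init__(self):
        self.weights = {"layer.weight": [0.1, 0.2]}  # plain, picklable data
        self.writer_lock = threading.Lock()          # not picklable

    def state_dict(self):
        # Only the parameter data, analogous to nn.Module.state_dict()
        return self.weights

model = FakeModel()

# Pickling the whole object fails, like torch.save(model, path) did above.
try:
    pickle.dumps(model)
    whole_model_ok = True
except TypeError:
    whole_model_ok = False

# Pickling only the state dict succeeds,
# like torch.save(model.state_dict(), path).
state_bytes = pickle.dumps(model.state_dict())

print(whole_model_ok)             # False
print(pickle.loads(state_bytes))  # {'layer.weight': [0.1, 0.2]}
```

This is why saving only the state_dict (and restoring it with load_state_dict into a freshly constructed model) is the more robust pattern.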

@apsdehal
Contributor

apsdehal commented May 8, 2019

@fanlyu Can you check now? It should be fixed in master.

@fanlyu
Author

fanlyu commented May 9, 2019

@apsdehal It works, thanks! I'll close the issue.
