
finetune problem #36

Open
nywang2019 opened this issue Oct 11, 2019 · 9 comments

nywang2019 commented Oct 11, 2019

I pretrained the model with my own data for about 10 epochs, but the result did not converge, so I wanted to try the finetune step. It fails with the error below. Can anyone help? Thanks. @shepnerd (my image size is 512x512, about 1000 pictures in the training set)
RuntimeError: size mismatch, m1: [4 x 4096], m2: [16384 x 1] at C:/w/1/s/tmp_conda_3.7_055457/conda/conda-bld/pytorch_1565416617654/work/aten/src\THC/generic/THCTensorMathBlas.cu:273

@nywang2019 (Author)

Here is the full output:
(base) D:\PycharmProjects2\inpainting_gmcnn\pytorch>python train.py --dataset Mydata --data_file ./training_data_list/data_list.txt --gpu_ids 0 --pretrain_network 0 --load_model_dir ./checkpoints/20191010-124841_GMCNN_Mydata_b4_s256x256_gc32_dc64_randmask-rect_pretrain --batch_size 4
------------ Options -------------
D_max_iters: 5
batch_size: 4
checkpoint_dir: ./checkpoints
d_cnum: 64
data_file: ./training_data_list/data_list.txt
dataset: Mydata
dataset_path: ./training_data_list/data_list.txt
date_str: 20191010-193601
epochs: 10
g_cnum: 32
gpu_ids: ['0']
img_shapes: [256, 256, 3]
lambda_adv: 0.001
lambda_ae: 1.2
lambda_gp: 10
lambda_mrf: 0.05
lambda_rec: 1.4
load_model_dir: ./checkpoints/20191010-124841_GMCNN_Mydata_b4_s256x256_gc32_dc64_randmask-rect_pretrain
lr: 1e-05
margins: [0, 0]
mask_shapes: [64, 64]
mask_type: rect
max_delta_shapes: [32, 32]
model_folder: ./checkpoints\20191010-193601_GMCNN_Mydata_b4_s256x256_gc32_dc64_randmask-rect
model_name: GMCNN
padding: SAME
phase: train
pretrain_network: False
random_crop: True
random_mask: True
random_seed: False
spectral_norm: True
train_spe: 100
vgg19_path: vgg19_weights/imagenet-vgg-verydeep-19.mat
viz_steps: 5
-------------- End ----------------
loading data..
data loaded..
configuring model..
initialize network with normal
initialize network with normal
---------- Networks initialized -------------
GMCNN(
(EB1): ModuleList(
(0): Conv2d(4, 32, kernel_size=(7, 7), stride=(1, 1))
(1): Conv2d(32, 64, kernel_size=(7, 7), stride=(2, 2))
(2): Conv2d(64, 64, kernel_size=(7, 7), stride=(1, 1))
(3): Conv2d(64, 128, kernel_size=(7, 7), stride=(2, 2))
(4): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1))
(5): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1))
(6): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1), dilation=(2, 2))
(7): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1), dilation=(4, 4))
(8): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1), dilation=(8, 8))
(9): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1), dilation=(16, 16))
(10): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1))
(11): Conv2d(128, 128, kernel_size=(7, 7), stride=(1, 1))
(12): PureUpsampling()
)
(EB2): ModuleList(
(0): Conv2d(4, 32, kernel_size=(5, 5), stride=(1, 1))
(1): Conv2d(32, 64, kernel_size=(5, 5), stride=(2, 2))
(2): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
(3): Conv2d(64, 128, kernel_size=(5, 5), stride=(2, 2))
(4): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1))
(5): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1))
(6): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1), dilation=(2, 2))
(7): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1), dilation=(4, 4))
(8): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1), dilation=(8, 8))
(9): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1), dilation=(16, 16))
(10): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1))
(11): Conv2d(128, 128, kernel_size=(5, 5), stride=(1, 1))
(12): PureUpsampling()
(13): Conv2d(128, 64, kernel_size=(5, 5), stride=(1, 1))
(14): Conv2d(64, 64, kernel_size=(5, 5), stride=(1, 1))
(15): PureUpsampling()
)
(EB3): ModuleList(
(0): Conv2d(4, 32, kernel_size=(3, 3), stride=(1, 1))
(1): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2))
(2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(3): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2))
(4): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1))
(5): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1))
(6): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), dilation=(2, 2))
(7): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), dilation=(4, 4))
(8): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), dilation=(8, 8))
(9): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), dilation=(16, 16))
(10): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1))
(11): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1))
(12): PureUpsampling()
(13): Conv2d(128, 64, kernel_size=(3, 3), stride=(1, 1))
(14): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1))
(15): PureUpsampling()
(16): Conv2d(64, 32, kernel_size=(3, 3), stride=(1, 1))
(17): Conv2d(32, 32, kernel_size=(3, 3), stride=(1, 1))
)
(decoding_layers): ModuleList(
(0): Conv2d(224, 16, kernel_size=(3, 3), stride=(1, 1))
(1): Conv2d(16, 3, kernel_size=(3, 3), stride=(1, 1))
)
(pads): ModuleList(
(0): ReflectionPad2d((0, 0, 0, 0))
(1): ReflectionPad2d((1, 1, 1, 1))
(2): ReflectionPad2d((2, 2, 2, 2))
(3): ReflectionPad2d((3, 3, 3, 3))
(4): ReflectionPad2d((4, 4, 4, 4))
(5): ReflectionPad2d((5, 5, 5, 5))
(6): ReflectionPad2d((6, 6, 6, 6))
(7): ReflectionPad2d((7, 7, 7, 7))
(8): ReflectionPad2d((8, 8, 8, 8))
(9): ReflectionPad2d((9, 9, 9, 9))
(10): ReflectionPad2d((10, 10, 10, 10))
(11): ReflectionPad2d((11, 11, 11, 11))
(12): ReflectionPad2d((12, 12, 12, 12))
(13): ReflectionPad2d((13, 13, 13, 13))
(14): ReflectionPad2d((14, 14, 14, 14))
(15): ReflectionPad2d((15, 15, 15, 15))
(16): ReflectionPad2d((16, 16, 16, 16))
(17): ReflectionPad2d((17, 17, 17, 17))
(18): ReflectionPad2d((18, 18, 18, 18))
(19): ReflectionPad2d((19, 19, 19, 19))
(20): ReflectionPad2d((20, 20, 20, 20))
(21): ReflectionPad2d((21, 21, 21, 21))
(22): ReflectionPad2d((22, 22, 22, 22))
(23): ReflectionPad2d((23, 23, 23, 23))
(24): ReflectionPad2d((24, 24, 24, 24))
(25): ReflectionPad2d((25, 25, 25, 25))
(26): ReflectionPad2d((26, 26, 26, 26))
(27): ReflectionPad2d((27, 27, 27, 27))
(28): ReflectionPad2d((28, 28, 28, 28))
(29): ReflectionPad2d((29, 29, 29, 29))
(30): ReflectionPad2d((30, 30, 30, 30))
(31): ReflectionPad2d((31, 31, 31, 31))
(32): ReflectionPad2d((32, 32, 32, 32))
(33): ReflectionPad2d((33, 33, 33, 33))
(34): ReflectionPad2d((34, 34, 34, 34))
(35): ReflectionPad2d((35, 35, 35, 35))
(36): ReflectionPad2d((36, 36, 36, 36))
(37): ReflectionPad2d((37, 37, 37, 37))
(38): ReflectionPad2d((38, 38, 38, 38))
(39): ReflectionPad2d((39, 39, 39, 39))
(40): ReflectionPad2d((40, 40, 40, 40))
(41): ReflectionPad2d((41, 41, 41, 41))
(42): ReflectionPad2d((42, 42, 42, 42))
(43): ReflectionPad2d((43, 43, 43, 43))
(44): ReflectionPad2d((44, 44, 44, 44))
(45): ReflectionPad2d((45, 45, 45, 45))
(46): ReflectionPad2d((46, 46, 46, 46))
(47): ReflectionPad2d((47, 47, 47, 47))
(48): ReflectionPad2d((48, 48, 48, 48))
)
)
[Network GM] Total number of parameters : 12.562 M
Loading pretrained model from ./checkpoints/20191010-124841_GMCNN_Mydata_b4_s256x256_gc32_dc64_randmask-rect_pretrain
loading the model from ./checkpoints/20191010-124841_GMCNN_Mydata_b4_s256x256_gc32_dc64_randmask-rect_pretrain\2_net_GM.pth
Loading done.
model setting up..
training initializing..
Traceback (most recent call last):
File "train.py", line 41, in
ourModel.optimize_parameters()
File "D:\PycharmProjects2\inpainting_gmcnn\pytorch\model\net.py", line 332, in optimize_parameters
self.forward_D()
File "D:\PycharmProjects2\inpainting_gmcnn\pytorch\model\net.py", line 304, in forward_D
self.completed_logit, self.completed_local_logit = self.netD(self.completed.detach(), self.completed_local.detach())
File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "D:\PycharmProjects2\inpainting_gmcnn\pytorch\model\net.py", line 202, in forward
x_local = self.local_discriminator(x_l)
File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "D:\PycharmProjects2\inpainting_gmcnn\pytorch\model\net.py", line 184, in forward
self.logit = self.layers-1
File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 547, in call
result = self.forward(*input, **kwargs)
File "D:\PycharmProjects2\inpainting_gmcnn\pytorch\model\layer.py", line 267, in forward
return self.module.forward(*input)
File "D:\Anaconda3\lib\site-packages\torch\nn\modules\linear.py", line 87, in forward
return F.linear(input, self.weight, self.bias)
File "D:\Anaconda3\lib\site-packages\torch\nn\functional.py", line 1369, in linear
ret = torch.addmm(bias, input, weight.t())
RuntimeError: size mismatch, m1: [4 x 4096], m2: [16384 x 1] at C:/w/1/s/tmp_conda_3.7_055457/conda/conda-bld/pytorch_1565416617654/work/aten/src\THC/generic/THCTensorMathBlas.cu:273

@shepnerd (Owner)

The input channel number of the fc layer in the PyTorch version needs to be modified if the mask shape changes. Since the input image size for the local discriminator here changes from 128x128 to 64x64, L260 in model/net.py should be updated to l_fc_channels = 4 * 4 * opt.d_cnum * 4.
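
For reference, here is a minimal sketch of where those numbers come from (an illustration based on the fix above, not code from the repository; it assumes the local discriminator reduces the patch spatially by a factor of 16, e.g. via four stride-2 convolutions, before flattening):

# Hypothetical helper illustrating the fc input size of the local discriminator.
# Assumption: the patch is downsampled spatially by a factor of 16, and the last
# conv layer has d_cnum * 4 feature maps (256 with the default d_cnum = 64).
def local_fc_channels(mask_size, d_cnum):
    spatial = mask_size // 16            # 64 -> 4, 128 -> 8
    channels = d_cnum * 4                # 64 * 4 = 256
    return spatial * spatial * channels

print(local_fc_channels(64, 64))    # 4096  -> m1 per sample in the RuntimeError above
print(local_fc_channels(128, 64))   # 16384 -> m2, the hard-coded value for the default 128x128 mask

With the 64x64 mask used in this thread, that gives the 4 * 4 * opt.d_cnum * 4 value suggested above.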

@nywang2019 (Author)

I updated net.py and it works. Thanks a lot.

@xyzdcgan

Hi, can you describe the steps for training with a custom dataset in TensorFlow? @nywang2019
Thanks

@nywang2019 (Author)

@xyzdcgan Hi, just follow what @shepnerd suggested in this post. See below:
The input channel number of the fc layer in pytorch version needs to be modified if the mask shape changes. Since the current input image size changes from 128x128 to 64x64 for the local discriminator, L260 in model/net.py should be updated to l_fc_channels=4 * 4 * opt.d_cnum * 4.

@xyzdcgan

Hi @nywang2019, can you please explain how a GAN generates an image?

What I know about GANs: there is a discriminator and a generator; the discriminator distinguishes real images from fake ones, and the generator generates new images until the discriminator can no longer tell the difference between real and fake.

Both of them use CNNs in their models.

My question is: how does a GAN perform object removal?

Let's say, for example, I have added a mask near the eye region where there is a wrinkle, and clicked the complete button.

When the image is generated, how is it generated without the wrinkle?

One GAN process I do know is AttGAN, where, to generate a smiling image from a non-smiling one, it gathers smiling facial expressions from the trained dataset/model and then applies those expressions to the input image.

In the case of GMCNN, how does the model select the mask area and then make it wrinkle-free? What actually tells the model which changes need to be applied to the mask region?

Let's assume I have a face image with some pimples; I applied a mask and it removed the pimples from the image. How was it able to generate an image without pimples?

Can you please tell me what the internal processes of a GAN are and how all this works?
@nywang2019 @shepnerd

Thanks

shepnerd (Owner) commented Jan 8, 2020

In short, GAN-based inpainting methods predict the user-marked regions based on image context.

In your given example (removing pimples from face images), given the input image and the annotated region, the model infers the semantics (e.g. skin, eye, etc.) and the low-level details (e.g. color, texture, etc.) of the marked region from the known regions (the context), and composes the final result. The prediction of the semantics (whether it contains pimples or not, in your example) comes from the given image context and the priors learned from the training set. Thus, the final predicted region is quite likely to differ from the corresponding region of the ground truth, since the model's optimization goal is to encourage the prediction to be as realistic as the training data.

The AttGAN you mentioned has extra conditions in the form of attribute vectors. With those, the GAN model can interactively control its prediction. For similar works or ideas, see pix2pix and its follow-ups.

For a quick understanding of GAN-based inpainting methods, you can treat the GAN loss as a learnable metric. It measures how real the prediction looks (compared with real data) from a high-level perspective (similar to a human observer), instead of requiring pixel-wise similarity like an L1 or L2 loss.
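
As a rough illustration of that last point, here is a minimal sketch of a GAN-based inpainting objective (the function, the dummy discriminator, and the WGAN-style adversarial term are illustrative assumptions, not the repository's actual net.py code; the weights simply mirror the lambda_rec and lambda_adv options printed in the log above):

# Sketch: the reconstruction loss pulls the prediction toward the ground truth
# pixel-wise, while the adversarial term only asks the completed image to look
# realistic to a learned discriminator. `netD` is assumed to return one realism
# score per sample.
import torch
import torch.nn.functional as F

def generator_loss(pred, target, mask, netD, lambda_rec=1.4, lambda_adv=0.001):
    # Paste the prediction into the masked hole; keep the known pixels from the input.
    completed = pred * mask + target * (1.0 - mask)
    rec = F.l1_loss(pred, target)       # pixel-wise similarity (L1)
    adv = -netD(completed).mean()       # learned "how real does it look" score
    return lambda_rec * rec + lambda_adv * adv

# Toy usage with a dummy "discriminator" that just averages over pixels.
pred = torch.rand(2, 3, 64, 64)
target = torch.rand(2, 3, 64, 64)
mask = torch.zeros(2, 1, 64, 64)
mask[:, :, 16:48, 16:48] = 1.0
dummy_netD = lambda x: x.mean(dim=(1, 2, 3))
print(generator_loss(pred, target, mask, dummy_netD))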

@qwerdbeta

I have a similar problem, but instead of making the images and masks smaller, I made them 512x512 (images) and 256x256 (masks), and I get a similar error with different numbers reported. See #61

@qwerdbeta

Above, @nywang2019 says he is using 512x512 training images, but the trace shows img_shapes of 256x256. @shepnerd, what would be needed to get the pretrain step to work with 512x512 training images?

I get similar errors.
