
Training on a single GPU #5

Open
azamshoaib opened this issue Mar 5, 2020 · 9 comments

@azamshoaib commented Mar 5, 2020

Hi,
I would like to know whether this network can be trained on a single GPU. When I train it, I get a CUDA out of memory error. Please help me in this regard.

@yaohungt commented Mar 5, 2020

Can you try this alternative codebase:
https://github.com/yaohungt/Capsules-Inverted-Attention-Routing

This uses less memory and has better inference speed.

@azamshoaib (Author)

@yaohungt Thank you so much. I have reduced the batch size and now it is training.
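For anyone hitting the same CUDA out-of-memory error: activation memory grows roughly with the batch size, so lowering the batch size handed to the DataLoader is usually the quickest fix (the exact flag or config key depends on this repo's training script). A minimal, generic PyTorch sketch of the idea:

import torch
import torchvision
import torchvision.transforms as transforms

# Generic example, not this repo's exact script: if batch_size=128 runs out of
# GPU memory, try 64 (or smaller) until the model fits on a single GPU.
trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True,
                                        transform=transforms.ToTensor())
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64,
                                          shuffle=True, num_workers=2)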

@AliS567 commented Mar 24, 2020

Hello,
I have a similar problem running on all 3 GPUs; my input size, however, is 84x84.
Thanks!

@yaohungt

Hi, can you be more specific?

If your input has a larger size, then you may need a larger network to fit the training.
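To make that concrete: with the same convolutional backbone, an 84x84 input leaves much larger feature maps than a 32x32 one, so whatever layers sit on top see a different (and bigger) shape and need to be resized, or the backbone needs more downsampling. A rough illustration with a generic conv stack (not this repo's backbone):

import torch
import torch.nn as nn

# Three stride-2 conv layers, purely for illustration.
backbone = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(128, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
)

print(backbone(torch.randn(1, 3, 32, 32)).shape)  # torch.Size([1, 128, 4, 4])
print(backbone(torch.randn(1, 3, 84, 84)).shape)  # torch.Size([1, 128, 11, 11])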

@AliS567 commented Mar 24, 2020

Yes, of course. I was attempting to input .mat files of 84 x 84 with only 1 channel. While working my way through some errors, I decided to alter the dimensions of my input image to 32 x 32 to match the CIFAR10 data used in this example; I feel this should fix the memory problems. However, I now have an error to do with batch size matching: ValueError: Expected input batch_size (128) to match target batch_size (5).

I believe this is because I am inputting 32 x 32 with no padding.

Apologies for taking up your time; I am fairly new to PyTorch!

Thanks a lot for the swift reply! 😃

@yaohungt

I haven't seen your code, but my guess is that it's because of your input size: 84x84x1, while CIFAR10 is 32x32x3.

You can modify the config file in ./configs so that the code can work on your dataset.

@AliS567 commented Mar 24, 2020

I have altered the backbone code to accept one channel.

import numpy as np
import scipy.io as sio
import torch
import torch.utils.data as tudata


def DataGenerationwt():
    # Load the wavelet-transform features and their labels from .mat files.
    data_path = '/home/icos/Desktop/Ali/compute/WT_features/'
    original_path = '/home/icos/Desktop/Imene/1d_dataset4_updated/'

    data = sio.loadmat(data_path + 'wt_real.mat', squeeze_me=False,
                       chars_as_strings=False, mat_dtype=True, struct_as_record=True)
    label = sio.loadmat(original_path + 'EMIdatav2.mat', squeeze_me=False,
                        chars_as_strings=False, mat_dtype=True, struct_as_record=True)

    data_array = np.array(data['wt_real'])
    label_array = np.array(label['Y'])

    data_final = torch.from_numpy(data_array)
    label_final = torch.from_numpy(label_array)

    return data_final, label_final


# Data
print('==> Preparing data..')
# assert args.dataset == 'CIFAR10' or args.dataset == 'CIFAR100'
# transform_train = transforms.Compose([
#         transforms.RandomCrop(32, padding=4),
#         transforms.RandomHorizontalFlip(),
#         transforms.ToTensor(),
#         transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
#     ])
# transform_test = transforms.Compose([
#         transforms.ToTensor(),
#         transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2023, 0.1994, 0.2010)),
#     ])
assert args.dataset == 'wt'
data, label = DataGenerationwt()
print(data.size())
data = data.view(-1, 1, 32, 32)  # reshape to (N, C=1, H=32, W=32)
print(label.size())
input(data.size())  # debug pause to inspect the shape

# 75/25 train/test split; the two sizes must sum to the dataset length
samples = int(data.size(0) * 0.75)
samples2 = data.size(0) - samples
print(samples)
mainset = tudata.TensorDataset(data.float(), label)

my_train, my_test = torch.utils.data.random_split(mainset, [samples, samples2])
trainloader = tudata.DataLoader(my_train, batch_size=128, shuffle=True,
                                num_workers=args.num_workers)
testloader = tudata.DataLoader(my_test, batch_size=100, shuffle=False,
                               num_workers=args.num_workers)

print('==> Building model..')
# Model parameters

I think my problem will be to do with the way I'm loading data?

ValueError: Expected input batch_size (128) to match target batch_size (5).

Thanks again for your time

@yaohungt

I'm not sure. I think you can print 1) the shape of the default CIFAR10 data and 2) the shape of your own data. They should look alike.
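A minimal way to do that comparison, assuming the standard torchvision CIFAR10 loader for the reference shapes:

import torch
import torchvision
import torchvision.transforms as transforms

# Reference: what one CIFAR10 batch looks like.
cifar = torchvision.datasets.CIFAR10(root='./data', train=True, download=True,
                                     transform=transforms.ToTensor())
images, targets = next(iter(torch.utils.data.DataLoader(cifar, batch_size=128)))
print(images.shape, targets.shape, targets.dtype)
# torch.Size([128, 3, 32, 32]) torch.Size([128]) torch.int64

# Your own loader should produce the same pattern: images of shape (B, C, H, W)
# and labels as a 1-D LongTensor of class indices of length B.  If the labels
# come out of the .mat file with an extra dimension (e.g. one-hot rows), that is
# a common cause of a batch-size mismatch like the one above; in that case
# label = label.argmax(dim=1).long() would collapse one-hot rows to indices
# (assuming the labels really are one-hot).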

@AliS567 commented Mar 24, 2020

Yeah, it's been quite mind-boggling so far; I'll keep working!

Thank you for all your good work!
Ali
