Calling .to(device) on classification loss functions, or calling .to(device) on the parameters inside forward() #139

Closed
fleawood opened this issue Jul 14, 2020 · 2 comments
fleawood commented Jul 14, 2020
Hello, I have been using this library recently, and I found a bug in ProxyAnchorLoss:

class ProxyAnchorLoss(WeightRegularizerMixin, BaseMetricLossFunction):
    def __init__(self, num_classes, embedding_size, margin=0.1, alpha=32, **kwargs):
        super().__init__(**kwargs)
        self.proxies = torch.nn.Parameter(torch.randn(num_classes, embedding_size))
        torch.nn.init.kaiming_normal_(self.proxies, mode='fan_out')
        self.num_classes = num_classes
        self.margin = margin
        self.alpha = alpha

    def compute_loss(self, embeddings, labels, indices_tuple):
        miner_weights = lmu.convert_to_weights(indices_tuple, labels).unsqueeze(1)
        prox = torch.nn.functional.normalize(self.proxies, p=2, dim=1) if self.normalize_embeddings else self.proxies
        cos = lmu.sim_mat(embeddings, prox)

self.proxies is defined on line 14, but it is never explicitly moved to the same device as embeddings. An error is raised when I use CUDA:

Traceback (most recent call last):
  File "base.py", line 130, in <module>
    main()
  File "base.py", line 126, in main
    trainer.train(num_epochs=num_epochs)
  File "/usr/local/lib/python3.7/site-packages/pytorch_metric_learning/trainers/base_trainer.py", line 85, in train
    self.forward_and_backward()
  File "/usr/local/lib/python3.7/site-packages/pytorch_metric_learning/trainers/base_trainer.py", line 112, in forward_and_backward
    self.calculate_loss(self.get_batch())
  File "/usr/local/lib/python3.7/site-packages/pytorch_metric_learning/trainers/metric_loss_only.py", line 12, in calculate_loss
    self.losses["metric_loss"] = self.maybe_get_metric_loss(embeddings, labels, indices_tuple)
  File "/usr/local/lib/python3.7/site-packages/pytorch_metric_learning/trainers/metric_loss_only.py", line 16, in maybe_get_metric_loss
    return self.loss_funcs["metric_loss"](embeddings, labels, indices_tuple)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/pytorch_metric_learning/losses/base_metric_loss_function.py", line 37, in forward
    loss_dict = self.compute_loss(embeddings, labels, indices_tuple)
  File "/usr/local/lib/python3.7/site-packages/pytorch_metric_learning/losses/proxy_anchor_loss.py", line 23, in compute_loss
    cos = lmu.sim_mat(embeddings, prox)
  File "/usr/local/lib/python3.7/site-packages/pytorch_metric_learning/utils/loss_and_miner_utils.py", line 27, in sim_mat
    return torch.matmul(x, y.t())
RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'mat2' in call to _th_mm

My solution is to add this one line of code before sim_mat is called:

prox = prox.to(embeddings.device)
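
For context, here is the same compute_loss fragment with that cast in place; apart from the added line, it matches the snippet above:

    def compute_loss(self, embeddings, labels, indices_tuple):
        miner_weights = lmu.convert_to_weights(indices_tuple, labels).unsqueeze(1)
        prox = torch.nn.functional.normalize(self.proxies, p=2, dim=1) if self.normalize_embeddings else self.proxies
        prox = prox.to(embeddings.device)  # proposed fix: match the embeddings' device
        cos = lmu.sim_mat(embeddings, prox)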
KevinMusgrave changed the title from "A bug in ProxyAnchorLoss" to "Calling .to(device) on classification loss functions, or calling .to(device) on the parameters inside forward()" on Jul 14, 2020
KevinMusgrave (Owner) commented Jul 14, 2020

Normally I move the loss function to the GPU:

loss_func = ProxyAnchorLoss(num_classes=num_classes, embedding_size=embedding_size)
loss_func = loss_func.to(torch.device('cuda'))

However, I can also do what you've suggested, to make it more convenient. If I make this change, it'll apply to all loss functions with a weight matrix (ArcFace, NormalizedSoftmaxLoss, etc.).
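
The same two-step pattern would carry over to those losses. A minimal sketch with NormalizedSoftmaxLoss, assuming it accepts num_classes and embedding_size keyword arguments like ProxyAnchorLoss above (the sizes are placeholders):

import torch
from pytorch_metric_learning.losses import NormalizedSoftmaxLoss

# Hypothetical sizes, for illustration only.
loss_func = NormalizedSoftmaxLoss(num_classes=100, embedding_size=128)
loss_func = loss_func.to(torch.device('cuda'))  # the weight matrix now lives on the GPU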

KevinMusgrave added the enhancement and question labels and removed the question label on Jul 14, 2020
KevinMusgrave (Owner) commented Jul 31, 2020

This is fixed in v0.9.90.dev0:

pip install pytorch-metric-learning==0.9.90.dev0

In my opinion, it's still better to move the loss function to the device like in my previous comment. This is because it should be on the device before you create the loss function's optimizer.
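
A minimal sketch of that ordering, with placeholder sizes and learning rate:

import torch
from pytorch_metric_learning.losses import ProxyAnchorLoss

device = torch.device('cuda')

# Move the loss function (and its proxies) to the device first...
loss_func = ProxyAnchorLoss(num_classes=100, embedding_size=128).to(device)
# ...then create its optimizer, so the optimizer tracks the on-device parameters.
loss_optimizer = torch.optim.SGD(loss_func.parameters(), lr=0.01)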
