Hi gradcheck failed #1
Comments
I also noticed this problem. There is a gap between numerical grad and analytical grad.
|
Thank you for checking.
Did you achieve similar results with Mask R-CNN using your RoI align module?
…--
Gyeongsik Moon
Ph.D. Candidate
Department of ECE, SNU, Seoul, Korea
http://cv.snu.ac.kr/
On Dec 10, 2017, at 1:02 PM, longcw ***@***.***> wrote:
I also noticed this problem. There is a gap between numerical grad and analytical grad.
But outputs and grads of pytorch version and tensorflow version are almost the same.
numerical:(
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.2012 0.0000 0.0000 0.0000
0.6258 0.1490 0.0000 0.0000
0.0000 0.6855 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0373 0.0000 0.1788 0.0000
0.1341 0.0298 0.5662 0.1192
0.0000 0.1490 0.0000 0.5960
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0596 0.0000
0.0000 0.0000 0.2086 0.0596
0.0000 0.0000 0.0000 0.2384
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
[torch.FloatTensor of size 25x4]
,)
analytical:(
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.2111 0.0000 0.0000 0.0000
0.6141 0.1408 0.0000 0.0000
0.0000 0.6844 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0447 0.0000 0.1893 0.0000
0.1300 0.0298 0.5507 0.1263
0.0000 0.1449 0.0000 0.6138
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0665 0.0000
0.0000 0.0000 0.1934 0.0444
0.0000 0.0000 0.0000 0.2156
0.0000 0.0000 0.0000 0.0000
0.0000 0.0000 0.0000 0.0000
[torch.FloatTensor of size 25x4]
,)
|
I am not working on Mask RCNN.
gradcheck(roi_align, (image_torch, boxes, box_index), eps=1e-3)
output (max_val, min_error, max_error, mean_error):
|
How many times did you run the gradcheck?
I ran it 10 times, but only 2 runs passed the gradcheck.
On Sunday, December 10, 2017 at 3:18 PM, longcw wrote:
I am not working on Mask RCNN.
BTW, I found that this layer can pass the gradcheck if I set eps=1e-3.
eps is the perturbation for finite differences.
gradcheck(roi_align, (image_torch, boxes, box_index), eps=1e-3)
output (max_val, min_error, max_error, mean_error):
('forward:', 0.87139809, 0.0, 7.0184469e-06, 5.5792748e-07)
('backward:', 1.0228419, 0.0, 1.7911196e-05, 9.7078487e-09)
test ok
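One possible reading of why a larger eps helps, offered as an assumption rather than something confirmed in this thread: the tensors are float32, so a very small finite-difference step leaves only a few significant bits in f(x + eps) - f(x - eps). The sketch below uses only the Python standard library and emulates float32 rounding on a single bilinear sample; the helper names (f32, bilinear, numerical_grad, mean_abs_error) are hypothetical and are not part of RoIAlign.pytorch or of gradcheck.

```python
import random
import struct


def f32(x):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def bilinear(p00, p01, p10, p11, dx, dy):
    """Bilinear sample of four pixels, rounding every intermediate to float32."""
    top = f32(f32(p00 * (1 - dx)) + f32(p01 * dx))
    bottom = f32(f32(p10 * (1 - dx)) + f32(p11 * dx))
    return f32(f32(top * (1 - dy)) + f32(bottom * dy))


def numerical_grad(pixels, dx, dy, eps):
    """Central difference of the sample with respect to the top-left pixel."""
    p00, p01, p10, p11 = pixels
    hi = bilinear(f32(p00 + eps), p01, p10, p11, dx, dy)
    lo = bilinear(f32(p00 - eps), p01, p10, p11, dx, dy)
    return (hi - lo) / (2 * eps)


def mean_abs_error(eps, trials=200, seed=0):
    """Average |numerical - analytical| over random pixels and offsets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        pixels = [f32(rng.uniform(-1.0, 1.0)) for _ in range(4)]
        dx, dy = rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)
        analytical = (1 - dx) * (1 - dy)  # exact weight of the top-left pixel
        total += abs(numerical_grad(pixels, dx, dy, eps) - analytical)
    return total / trials


if __name__ == "__main__":
    for eps in (1e-6, 1e-3):
        print(f"eps={eps:g}: mean abs error = {mean_abs_error(eps):.2e}")
```

The sample is linear in each pixel value, so the central difference has no truncation error here; whatever gap remains comes from rounding, and it shrinks as eps grows. Running the same check on float64 inputs is another common way to separate a round-off problem from a real bug in a backward pass.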
|
@mks0601 Try modifying the random input image:
# image_data = np.random.randn(batch_size, depth, im_height, im_width).astype(np.float32)
# =>
image_data = np.random.rand(batch_size, depth, im_height, im_width).astype(np.float32)
|
Sorry to say, but changing the input does not seem like a good fix.
It suggests that the implemented RoI align layer works only for a specific form of input (or at least that it fails for some inputs, such as those drawn with randn).
Can you tell me why this kind of error occurs?
On Sunday, December 10, 2017 at 3:59 PM, longcw wrote:
@mks0601 <https://github.com/mks0601> Try to modify the random input image:
# image_data = np.random.randn(batch_size, depth, im_height, im_width).astype(np.float32)
# =>
image_data = np.random.rand(batch_size, depth, im_height, im_width).astype(np.float32)
|
I don't think this is a problem with the implementation. It's a problem with how we are using gradcheck. I don't know the real reason. You can check the gradcheck function and the source code if you want to figure out the cause. |
Also, can you help me understand the output of your roi_align layer?
If I feed the input tensor
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
0 1 2 3 4 5 6
(1x1x7x7)
to the roi_align layer (crop_height=3, crop_width=3)
with arguments (xs = [[0,2]], ys = [[0,2]], nbox=1, nbatch=1),
then I think the output should be
0 1 2
0 1 2
0 1 2
But the output of your implementation is different.
Did I misunderstand how RoI align works?
On Sunday, December 10, 2017 at 4:55 PM, longcw wrote:
I don't think this is a problem with the implementation. It's a problem with how we are using gradcheck.
Changing randn to rand actually decreases the max value of the inputs. The check always passes if eps > max(inputs)/500, whatever the input is.
I don't know the real reason. You can check the gradcheck function and the source code if you want to figure out the cause.
|
What you want is:
import numpy as np
import torch
from torch.autograd import Variable
from roi_align.roi_align import RoIAlign

def to_varabile(arr, requires_grad=False, is_cuda=True):
    # wrap a NumPy array as an autograd Variable, optionally on the GPU
    tensor = torch.from_numpy(arr)
    if is_cuda:
        tensor = tensor.cuda()
    var = Variable(tensor, requires_grad=requires_grad)
    return var

# inputs
is_cuda = False
# 7x7 image whose rows are all 0..6, expanded to shape 1x1x7x7
image_data = np.tile(np.arange(7, dtype=np.float32), 7).reshape(7, 7)
image_data = image_data[np.newaxis, np.newaxis]
boxes_data = np.asarray([[0, 0, 2, 2]], dtype=np.float32)
box_index_data = np.asarray([0], dtype=np.int32)
image_torch = to_varabile(image_data, requires_grad=True, is_cuda=is_cuda)
boxes = to_varabile(boxes_data, requires_grad=False, is_cuda=is_cuda)
box_index = to_varabile(box_index_data, requires_grad=False, is_cuda=is_cuda)
# setting transform_fpcoor to False gives crop_and_resize
roi_align = RoIAlign(3, 3, transform_fpcoor=False)
print(roi_align(image_torch, boxes, box_index))
output:
If you use RoIAlign as in this implementation:
# input
...
boxes_data = np.asarray([[0, 0, 3, 3]], dtype=np.float32)
...
roi_align = RoIAlign(3, 3, transform_fpcoor=True)
print(roi_align(image_torch, boxes, box_index))
output:
You can read more about RoIAlign here: |
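To make the two box conventions concrete, here is a one-dimensional sketch in plain Python. It assumes that crop_and_resize spreads its samples from x0 to x1 inclusive, and that the transform_fpcoor=True path splits the box into equal bins and samples each bin centre with a 0.5-pixel shift. Both formulas are assumptions made for illustration; they are not taken from the RoIAlign.pytorch source, and the helper names are hypothetical.

```python
def lerp_row(row, x):
    """Linearly interpolate a 1-D row at fractional position x."""
    if not 0 <= x <= len(row) - 1:
        raise ValueError("sample position lies outside the row")
    i = min(int(x), len(row) - 2)  # left neighbour, kept inside the row
    t = x - i
    return row[i] * (1 - t) + row[i + 1] * t


def corner_aligned_samples(x0, x1, n):
    """crop_and_resize style (assumed): n samples from x0 to x1 inclusive."""
    if n == 1:
        return [(x0 + x1) / 2]
    step = (x1 - x0) / (n - 1)
    return [x0 + i * step for i in range(n)]


def bin_centre_samples(x0, x1, n):
    """RoIAlign style (assumed): one sample per bin centre, shifted by 0.5."""
    spacing = (x1 - x0) / n
    return [x0 + spacing / 2 - 0.5 + i * spacing for i in range(n)]


def crop_row(row, positions):
    """Sample the row at each of the given positions."""
    return [lerp_row(row, x) for x in positions]


if __name__ == "__main__":
    row = [float(v) for v in range(7)]  # one row of the 7x7 test image
    print(crop_row(row, corner_aligned_samples(0, 2, 3)))
    print(crop_row(row, bin_centre_samples(0, 3, 3)))
```

Under these assumptions the box [0, 2] in the corner-aligned convention and the box [0, 3] in the bin-centre convention land on the same three sample positions, which would be consistent with the two calls above using different boxes_data for the same crop.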
Oh, I misunderstood your code. Thanks for clarifying. Can you help me understand the link you provided? I read NOTES.md, but I cannot see how crop_and_resize differs from roi_align apart from the input form (normalized vs. unnormalized coordinates?). I also cannot follow the code. If I just set boxes_data to [xmin, ymin, xmax+1, ymax+1] and set transform_fpcoor=True, it seems to work well so far. |
Great help. Thank you. |
Sorry, but I still cannot understand the code. What is the input to your roi_align module? spacing_w is a function of x1-x0, not x1-x0+1, so I think xmax and ymax should each be incremented by 1. Also, I cannot understand why we have to subtract 0.5. What if we just use
x0, y0, x1, y1 = tf.split(boxes, 4, axis=1)
nx0 = x0 / tf.to_float(image_shape[1] - 1)
nx1 = x1 / tf.to_float(image_shape[1] - 1)
return tf.concat([ny1, nx1, ny1, nx1], axis=1)
and transform_fpcoor = False? |
Hi @mks0601, you can see how it is used for Mask R-CNN here: https://github.com/tensorboy/Pytorch_Mask_RCNN. :) |
@tensorboy I couldn't access the link. |
Hi, thanks for sharing your implementation.
I want to use an RoIAlign layer in my PyTorch code, and I found your implementation.
To verify it, I ran test.py and the gradcheck failed.
Did you check the code?