-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DFT] Add FWD/BWD_STRIDES to public API, deprecate INPUT/OUTPUT_STRIDES #528
[DFT] Add FWD/BWD_STRIDES to public API, deprecate INPUT/OUTPUT_STRIDES #528
Conversation
* Adds the FWD/BWD_STRIDES API from the interface spec * Deprecates old INPUT/OUTPUT_STRIDES using the deprecated attribute. * Modifies DFT tests to use FWD/BWD_STRIDES API * The old API is no longer thoroughly tested * Adds additional descriptor tests
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me! The strides in cuFFT are pretty difficult to get its head around but that was already the case before... We have pretty good tests so this should be fine!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed the MKLGPU/MKLCPU part of addition and deprication, looks good to me, only thing I was mulling over was the get_strides_api, if in/out_strides are set then fwd/bwd_strides (config_value) must be zero, (which is a internal property of the DftiSetValue (in cpu codebase), set_value(gpu codebase)). But here you are checking config_value.fwd_strides is zero or not! Are we explicitly setting fwd_strides to zero in this PR when we set input_strides or vice-versa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved, can have this #528 (review) discussing without blocking.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My review is coming a little late but I'd appreciate it if one could give it consideration.
for (int r = 1; r < rank; ++r) { | ||
valid = valid && (inner_nfwd <= inner_sfwd) && (inner_nbwd <= inner_sbwd); | ||
inner_nfwd *= n_copy[rank - r - 1]; | ||
inner_nbwd *= n_copy[rank - r - 1]; | ||
inner_sfwd *= strides_fwd[rank - r - 1]; | ||
inner_sbwd *= strides_bwd[rank - r - 1]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does cuFFT support negative stride values? If yes, wouldn't these checks be invalid with negative strides?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is correct.
For CUFFT, strides_fwd
and strides_bwd
correspond to inembed
and onembed
, where
n[i] ≤ inembed[i]
,n[i] ≤ onembed[i]
The strides can be negative, if you set a_stride
and b_stride
negative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So this particular logic is likely adequate w.r.t. the (i|o)nembed
values but there are other bits (determining the (i|o)nembed
) that might need revision before we reach this point then, in my opinion. For instance, a_min
and b_min
would be the "most negative" value as of now (not the minimal stride in absolute value) which is not what we'd want in this case, unless I'm wrong. This is rather orthogonal to the task targeted herein though, I just wanted to point out a possible concern.
I am so sorry @raphael-egan I forgot you were also planning to review it! I merged it as Hugh was going on holiday. I am sure he will be able to address your comments once he is back. |
No problem, I'm sure he will. |
Description
Fixes #488
Checklist
All Submissions
New interfaces
New features