-
Notifications
You must be signed in to change notification settings - Fork 449
Fix DeviceHistogram::Even for mixed float/int levels and sample types. #487
Fix DeviceHistogram::Even for mixed float/int levels and sample types. #487
Conversation
gpuCI: NVIDIA/thrust#1697 |
@Melirius This should fix your issue with |
It is nice, thanks for your work! However, it does not address another problem #489 that is rooted in the same code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for starting the work on this! It seems that no one looked at the code for a while 😄
Regarding the PR, it seems to introduce some substantial changes. I'd like us to consider all implications before merging this PR. For instance, documentation doesn't say much about SampleT
and LevelT
. Unlike ConterT
, LevelT
isn't even required to be primitive. Let's gather requirements on these types, update documentation / assert these requirements and then have another review.
b0b8adb
to
ff1b254
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for documenting actual requirements! A few minor fixes below.
ff1b254
to
c765c89
Compare
#include <cub/util_cpp_dialect.cuh> | ||
|
||
#if CUB_CPP_DIALECT >= 2017 && __cpp_if_constexpr | ||
# define CUB_IF_CONSTEXPR if constexpr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to be sure, we cannot pass -Wno-c++17-extensions
to our build flags
Asking for a friend
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No -- Thrust/CUB are header only, so that would require our users to do the same.
c765c89
to
16288ce
Compare
16288ce
to
cf20c71
Compare
cf20c71
to
24b675d
Compare
24b675d
to
3b3bf92
Compare
The
ScaleTransform
utility precomputes the reciprocal of the "sample
->bin_idx
" scaling factor for floating point types as an optimization.Trouble is, the
Init
method checked whetherLevelT
is fp whileBinSelect
checkedSampleT
. This caused the optimization to be incorrectly applied when one of these types is fp but the other is not.Fixed this bug and simplified the implementation of
ScaleTransform
usingcuda::std::common_type
to consistently apply the optimization when eitherLevelT
orScaleT
are fp.Fixes #479 and #489 and adds regression tests for both.
Breaking Changes:
DeviceHistogram::HistogramEven
algorithms when the samples and levels are different types (cub::DeviceHistogram::HistogramEven: Incorrect result whenLevelT
does not exactly matchSampleT
#479) and when both are integral with fractional bin sizes (cub::DeviceHistogram::HistogramEven: Incorrect result with integer levels and samples #489). These fixes introduced the following explicit restrictions:SampleT
andLevelT
must support common arithmetic operations.cuda::std::common_type<SampleT, LevelT>
must have a valid definition.int
.