-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Match Spark's NaN handling in collect_set #8909
Merged
Merged
Changes from all commits
Commits
Show all changes
9 commits
Select commit
Hold shift + click to select a range
0fbc47f
Match Spark's NaN handling in collect_set
thirtiseven e9e482b
Revert "Fix collect_set_on_nested_type tests failed (#8783)"
thirtiseven 04e62e3
clean up and add comments
thirtiseven 250a89a
remove xfail
thirtiseven f500a8a
update cudfmergesets too
thirtiseven db500a5
edit comment
thirtiseven cd587cf
remove related nan datagens
thirtiseven 3b4abd7
clean up
thirtiseven 7b8af24
clean up
thirtiseven File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this check performed a recursive check for floats/doubles at all nested levels, but your changes seem to say that
Array[Double]
andArray[Float]
areUNEQUAL
. Should your code handleArray[Array[Double]]
(and any level of nesting, including things likeStruct[Array[Double]]
for example)? Are those casesUNEQUAL
or are they somehow compared differently. If so we should add comments clarifying why an array/struct a 2nd+ level is ok.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yes it's a bit confusing. I added a comment.
In
GpuCollectBase
we have:My change wants to make NaNs UNEQUAL only if
child.dataType
is double or float, otherwise ALL_EQUAL.But when we call
CudfCollectSet
orCudfMergeSets
we need to passdataType
to it. So when I check ifdatatype
isArray(Double)
orArray(Float)
, what I really want to check is ifchild.dataType
is float or double.(I hope I made it clear...)