[FEA] Simplify code for NaN handling in lists/drop_list_duplicates
#9257
Labels
feature request
New feature or request
improvement
Improvement / enhancement to an existing function
libcudf
Affects libcudf (C++/CUDA) code.
non-breaking
Non-breaking change
Currently, drop_list_duplicates requires an input parameter specifying whether NaN values should be considered equal. This parameter exists to support the different behaviors expected by Pandas and by Spark. Inside drop_list_duplicates, the implementation has to pass that parameter down through multiple levels, which increases the complexity of the code and makes it burdensome to maintain.
We should simplify the code, either by reducing the number of code paths or at least by no longer passing the parameter down. Another potential approach is the one recommended in #9202 (comment), which is worth exploring.
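To make the "stop passing the parameter down" idea concrete, here is a rough host-side sketch. It is not the libcudf implementation and not necessarily what #9202 proposes. The names nan_policy, element_equal, drop_duplicates and drop_list_duplicates_sketch are hypothetical, and plain std::vector stands in for a lists column. The point is only that the NaN choice can be bound once into a comparator at the entry point, so the lower levels take a comparator and never see the flag.

```cpp
#include <cmath>
#include <vector>

// Hypothetical policy enum for illustration; not libcudf's actual type.
enum class nan_policy { EQUAL, UNEQUAL };

// The NaN policy is captured once in this functor.
struct element_equal {
  nan_policy policy;
  bool operator()(double a, double b) const {
    if (std::isnan(a) && std::isnan(b)) { return policy == nan_policy::EQUAL; }
    return a == b;
  }
};

// Lower-level helper: it only knows about the comparator, not about NaN.
// Keeps the first occurrence of each element (an assumption of this sketch).
template <typename Equal>
std::vector<double> drop_duplicates(std::vector<double> const& list, Equal eq) {
  std::vector<double> out;
  for (double v : list) {
    bool seen = false;
    for (double u : out) {
      if (eq(u, v)) { seen = true; break; }
    }
    if (!seen) { out.push_back(v); }
  }
  return out;
}

// Hypothetical entry point: the only place where the policy parameter appears.
std::vector<std::vector<double>> drop_list_duplicates_sketch(
  std::vector<std::vector<double>> const& lists, nan_policy policy) {
  element_equal const eq{policy};
  std::vector<std::vector<double>> result;
  result.reserve(lists.size());
  for (auto const& list : lists) { result.push_back(drop_duplicates(list, eq)); }
  return result;
}
```

With this shape, adding or removing a NaN behavior would touch only element_equal and the entry point. Whether the same structure maps cleanly onto the device-side code is something that would still need to be checked.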