[FEA] Apply sliced_child when calling to slice #9219

Open
ttnghia opened this issue Sep 11, 2021 · 10 comments
Labels
feature request New feature or request improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. wontfix This will not be worked on

Comments

@ttnghia
Contributor

ttnghia commented Sep 11, 2021

I have observed many bugs that arise when an API directly accesses the child columns of a sliced column instead of calling get_sliced_child. In effect, the slice API performs a shallow slice, not a deep slice. Although a shallow slice is more efficient because it avoids slicing the child columns when they are not needed, it has caused many (potential) bugs that cost a lot of developer time.

One instance of such a bug is #9218. I have dealt with many similar cases in the past, but I caught them right away through unit tests. If a developer forgets to write unit tests for sliced input, the bug can go unnoticed.

I would like to rewrite slice to perform deep slicing, i.e., to recursively call slice on all child columns of the column being sliced. This way, accessing the children through child_begin(), child_end(), or child(idx) would always give the expected results. We have discussed this before and took no action because deep slicing is expensive, but I am raising the issue again because it still causes bugs.

An alternative is to rename the existing slice API to shallow_slice and add a new slice that recursively calls shallow_slice on the child columns. A developer would then call shallow_slice only when they know that the shallow version is sufficient in that context. Otherwise, the more expensive slice would produce correct results in most situations.
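To make the shallow/deep distinction concrete, here is a minimal host-only sketch. It is a toy model, not the libcudf classes: `toy_view`, `deep_slice`, and `row` are hypothetical names, and it only models struct-like columns whose children have one row per parent row. It says nothing about strings or lists columns.

```cpp
#include <cassert>
#include <vector>

// Toy host-side model of a nested column view (NOT the libcudf classes).
struct toy_view {
  std::vector<int> const* data = nullptr;  // leaf data; null for a struct-like parent
  int offset = 0;                           // first visible row
  int size = 0;                             // number of visible rows
  std::vector<toy_view> children;           // child views, one row per parent row
};

// Shallow slice: only the parent's offset and size change; children are untouched.
inline toy_view shallow_slice(toy_view const& v, int begin, int end) {
  assert(0 <= begin && begin <= end && end <= v.size);
  toy_view out = v;
  out.offset   = v.offset + begin;
  out.size     = end - begin;
  return out;
}

// Deep slice: shallow-slice the parent, then apply the same row range
// to every child, recursively.
inline toy_view deep_slice(toy_view const& v, int begin, int end) {
  toy_view out = shallow_slice(v, begin, end);
  for (auto& child : out.children) { child = deep_slice(child, begin, end); }
  return out;
}

// Read row i of a leaf view, honoring the leaf's own offset.
inline int row(toy_view const& leaf, int i) {
  assert(leaf.data != nullptr && 0 <= i && i < leaf.size);
  return (*leaf.data)[leaf.offset + i];
}
```

In this model, reading `children[0]` of a shallow-sliced parent still sees the first row of the unsliced data, which is the class of bug described above. The deep version pays for a recursive walk on every slice, whether or not the children are ever read.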

@ttnghia ttnghia added feature request New feature or request libcudf Affects libcudf (C++/CUDA) code. tech debt improvement Improvement / enhancement to an existing function labels Sep 11, 2021
@jrhemstad
Contributor

> I would like to rewrite slice into deep slicing, i.e., recursively calling to slice on all children columns of the column being sliced. This way, when we access its children column through the APIs child_begin(), child_end, or child(idx) we will have the expected results all the time.

This would break strings columns. And maybe lists? So this would only apply to structs, which would be pretty inconsistent behavior.

@ttnghia
Contributor Author

ttnghia commented Sep 14, 2021

A related issue concerns not just child versus sliced child but also offsets. While working with lists_column_view, I discovered a bug caused by accessing lists_column_view::offsets() on a sliced lists column. The offsets API returns the original offsets, not the sliced offsets.

I'm not sure if this is also the case for strings column.

@jrhemstad
Contributor

> The offsets API returns the original offsets, not the sliced offsets. I'm not sure if this is also the case for strings column.

Yes, it is. Otherwise it would require actually modifying device memory to update all of the values of the offset column to be relative to the offset of the parent. This is obviously expensive and not something we want to do except in cases where it is explicitly required.

@ttnghia
Contributor Author

ttnghia commented Sep 14, 2021

I see. It seems that deep slicing is necessary to avoid bugs, but it is expensive, so we try to avoid it.

I have an idea: apply lazy evaluation. We do deep slicing but do not initialize the offset value until it is actually needed. When deep slicing would need a memcpy from the device, we just store the device memory address to copy from. Something like this:

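class column_view {
  .....
  // Cached on first use; INVALID_OFFSET means "not yet fetched from the device".
  // Declared mutable so that the const accessor below can fill it in.
  mutable size_type _offset = INVALID_OFFSET;

  size_type offset() const {
    if (_offset == INVALID_OFFSET) {
      cudaMemcpy(.....);  // copy the value from the stored device address into _offset
    }
    return _offset;
  }

  // Record where to read the offset from; no copy happens here.
  void set_offset(void const* device_addr) { .... }
};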

Of course, an actual implementation would be more complex than this. But the idea above can:

  • Avoid expensive memcpy and device sync when deep slicing is not needed, and
  • Eliminate potential bugs due to accessing non-sliced children, as it is deep slicing by default

Note that this is something similar to caching scalar value that we mentioned before.
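The caching part of this idea can be sketched on the host without any CUDA calls. In the sketch below, `lazy_value` is a hypothetical helper, not a libcudf type, and the `fetch` callback stands in for the device-to-host copy. The sketch is not thread-safe.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical helper: holds a value that is fetched on first access and
// cached afterwards. `fetch` stands in for an expensive device-to-host copy.
template <typename T>
class lazy_value {
 public:
  explicit lazy_value(std::function<T()> fetch) : fetch_{std::move(fetch)} {}

  // Runs the fetch at most once; later calls return the cached value.
  T const& get() const {
    if (!cached_) {
      value_  = fetch_();
      cached_ = true;
    }
    return value_;
  }

  bool is_cached() const { return cached_; }

 private:
  std::function<T()> fetch_;
  mutable bool cached_ = false;  // mutable so that get() can stay const
  mutable T value_{};
};
```

Constructing a `lazy_value<int>` costs nothing beyond storing the callback. The fetch only runs if some caller eventually asks for the value, which is the property the two bullet points above rely on.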

@davidwendt
Contributor

Deep slicing is not always necessary. We could add a specific statement to the developer guide that child columns are not sliced.

This kind of coding error would ideally be caught by an appropriate gtest or during a PR review.

@jrhemstad
Contributor

> I have an idea: Apply lazy evaluation concept: we do deep slicing but do not initialize the offset value until we actually need it. Instead, when doing deep slicing that needs memcpy from the device, we just store the device memory addresses to be copied from.

Unfortunately, it's more complicated than that. When a strings/list column is sliced, the values of the offsets column remain unchanged. That is, the values of the offsets column are still relative to the unsliced version of the parent.

For example, a strings column:

logical: {"a", "", "bc"}
offset: 0
chars: {abc}
offsets: {0, 1, 1, 3}

If you were to slice this strings column to keep only its last two elements, you'd have:

logical: {"", "bc"}
offset: 1
chars: {abc}
offsets: {0, 1, 1, 3}

The values of the offsets column are still relative to the original unsliced column.

So it's not just a matter of changing the singular offset of the chars and offsets column, but you'd also need to change the values of the offsets child.
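The example above can be modeled on the host as follows. This is a toy sketch, not libcudf's strings_column_view: `toy_strings_view`, `element`, and `rebased_offsets` are hypothetical names. It shows how an element is read through the unchanged offsets, and what buffer a deep slice would have to produce.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy host-side model of a sliced strings column (NOT libcudf's strings_column_view).
struct toy_strings_view {
  std::string const* chars;         // shared character buffer, never sliced
  std::vector<int> const* offsets;  // shared offsets; values relative to the unsliced column
  int offset;                       // first visible row of this view
  int size;                         // number of visible rows
};

// Row i of the view: shift the row index by the view's offset, then read the
// unchanged offset values, which index directly into the shared chars.
inline std::string element(toy_strings_view const& v, int i) {
  assert(0 <= i && i < v.size);
  int const first = (*v.offsets)[v.offset + i];
  int const last  = (*v.offsets)[v.offset + i + 1];
  return v.chars->substr(first, last - first);
}

// What a deep slice would have to materialize: a new offsets buffer whose
// values are rebased to start at zero. This touches every visible offset.
inline std::vector<int> rebased_offsets(toy_strings_view const& v) {
  std::vector<int> out;
  int const base = (*v.offsets)[v.offset];
  for (int i = 0; i <= v.size; ++i) { out.push_back((*v.offsets)[v.offset + i] - base); }
  return out;
}
```

For the sliced view with offset 1 and size 2, `element` returns "" and "bc" without modifying anything, while `rebased_offsets` has to write a whole new buffer ({0, 0, 2}). On the device, that rewrite is the expensive step described in this comment.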

@ttnghia
Contributor Author

ttnghia commented Sep 14, 2021

I see. This is really context-dependent 😞

@jrhemstad
Contributor

One idea would be to add chars_begin and offsets_begin functions to strings_column_view that return a transform iterator that automatically applies the offset of the parent.

I don't think that would solve many of the problems you highlighted above, but still seems like a useful thing that would alleviate similar kinds of problems.
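A rough host-only sketch of what such an iterator could look like is below. Everything in it is an assumption for illustration: `shifted_offset_iterator` and this free-function `offsets_begin` are hypothetical, not libcudf or Thrust APIs. Subtracting the first visible offset value on dereference is one possible reading of "applies the offset of the parent"; the comment does not specify it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal stand-in for a transform iterator over an offsets buffer:
// dereferencing subtracts `shift` from the stored value.
class shifted_offset_iterator {
 public:
  shifted_offset_iterator(int const* pos, int shift) : pos_{pos}, shift_{shift} {}

  int operator*() const { return *pos_ - shift_; }
  int operator[](std::ptrdiff_t n) const { return pos_[n] - shift_; }
  shifted_offset_iterator& operator++() {
    ++pos_;
    return *this;
  }
  bool operator!=(shifted_offset_iterator const& other) const { return pos_ != other.pos_; }

 private:
  int const* pos_;  // current position in the shared, unmodified offsets buffer
  int shift_;       // value subtracted on every read
};

// Hypothetical helper: start at the parent's offset and rebase values so that
// the first visible offset reads as zero. The underlying buffer is untouched.
inline shifted_offset_iterator offsets_begin(std::vector<int> const& offsets, int parent_offset) {
  assert(0 <= parent_offset && parent_offset < static_cast<int>(offsets.size()));
  int const* first = offsets.data() + parent_offset;
  return shifted_offset_iterator{first, *first};
}
```

With offsets {0, 1, 1, 3} and a parent offset of 1, reading three values through this iterator yields 0, 0, 2, with the adjustment computed on each read and no new buffer written.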

@github-actions

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

@github-actions

This issue has been labeled inactive-90d due to no recent activity in the past 90 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed.

@davidwendt davidwendt added the wontfix This will not be worked on label Dec 13, 2024
5 participants