
[C++][Compute] Consider widening the row offset of the row table to 64-bit #43495

Closed
zanmato1984 opened this issue Jul 31, 2024 · 1 comment

Comments

@zanmato1984
Contributor

zanmato1984 commented Jul 31, 2024

Describe the enhancement requested

We’ve seen several reports about the hash join not working for large inputs (e.g., #34474, #37655, and #36995). The reason turns out to be that the row table (the hash table for the hash join) uses uint32_t to represent the row offset within the row data buffer, effectively preventing the row data from exceeding 4GB.
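For a concrete picture of the limitation, here is a minimal, standalone sketch (all values are assumed for illustration and do not come from the reports above): once the accumulated row data passes 4 GiB, a `uint32_t` offset silently wraps and points back into already-used space.

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // Hypothetical values: 512 bytes of row data per row, 10 million rows,
  // i.e. ~4.8 GiB of row data in total.
  const uint64_t row_width = 512;
  const uint64_t num_rows = 10'000'000;
  uint32_t offset32 = 0;  // how the row table tracks offsets today
  uint64_t offset64 = 0;  // what a widened offset would hold
  for (uint64_t i = 0; i < num_rows; ++i) {
    offset32 += static_cast<uint32_t>(row_width);  // silently wraps past UINT32_MAX
    offset64 += row_width;
  }
  // The 32-bit offset has wrapped around and no longer addresses the last row.
  std::cout << offset32 << " vs " << offset64 << std::endl;
}
```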

What makes things worse is that, when this limit is exceeded, users can hardly work around it with the usual remedies such as splitting the input into smaller batches, which helps with many other issues. Because the row table accumulates all of the input data, smaller batches do not change the overall data size.

There are also some other aspects:

  1. The 32-bit row offset also makes the row table implementation error-prone. For example, GH-34474: [C++] Detect and raise an error if a join will need too much key data #35087, GH-41813: [C++] Fix avx2 gather offset larger than 2GB in CompareColumnsToRows #42188, and GH-43202: [C++][Compute] Detect and explicit error for offset overflow in row table #43226 are fixes that address or detect particular edge cases related to the row offset. Even the fixed-length code path, which doesn’t touch the offset buffer at all and is therefore supposed to be less problematic, has obvious offset overflow issues such as [1] and [2] below (currently unreported, but observed in my local experiments).
  2. We are going to support large-offset data types for the row table eventually, as requested in [C++] Add hash-join support for large-offset types (e.g. large_string) and dictionary encoded types to the new hash-join impl #31622.

Therefore, we should consider widening the row offset of the row table to 64-bit.

[1]

uint32_t offset_right = irow_right * fixed_length + offset_within_row;

[2]
__m256i offset_right =
_mm256_mullo_epi32(irow_right, _mm256_set1_epi32(fixed_length));
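As a hypothetical illustration of [2] (the values below are assumed, not taken from the reports above): once a row index exceeds UINT32_MAX / fixed_length, each 32-bit lane of the `_mm256_mullo_epi32` product wraps, exactly like the scalar multiplication in [1].

```cpp
#include <immintrin.h>
#include <cstdint>
#include <cstdio>

int main() {  // compile with -mavx2
  // Assumed values: a 256-byte fixed-length row; indices above
  // UINT32_MAX / 256 = 16,777,215 make the 32-bit lane product wrap.
  const uint32_t fixed_length = 256;
  alignas(32) uint32_t irows[8] = {0, 1, 2, 3,
                                   16'777'215, 16'777'216, 20'000'000, 30'000'000};
  __m256i irow_right = _mm256_load_si256(reinterpret_cast<const __m256i*>(irows));
  __m256i offset_right =
      _mm256_mullo_epi32(irow_right, _mm256_set1_epi32(fixed_length));
  alignas(32) uint32_t out[8];
  _mm256_store_si256(reinterpret_cast<__m256i*>(out), offset_right);
  for (int i = 0; i < 8; ++i) {
    std::printf("row %u -> 32-bit offset %u (true offset %llu)\n", irows[i], out[i],
                static_cast<unsigned long long>(irows[i]) * fixed_length);
  }
}
```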

Component(s)

C++

pitrou pushed a commit that referenced this issue Aug 19, 2024
…bit (#43389)

### Rationale for this change

The row table uses `uint32_t` as the row offset within the row data buffer, effectively preventing the row data from growing beyond 4GB. This is quite restrictive; the impact is described in more detail in #43495. This PR proposes widening the row offset from 32-bit to 64-bit to address the limitation.

#### Benefits
Currently, the row table has three major limitations:
1. The overall data size cannot exceed 4GB.
2. The size of a single row cannot exceed 4GB.
3. The number of rows cannot exceed 2^32.

This enhancement will eliminate the first limitation. Meanwhile, the second and third limitations are less likely to occur. Thus, this change will enable a significant range of use cases that are currently unsupported.

#### Overhead
Of course, this will introduce some overhead:
1. An extra 4 bytes of memory consumption for each row due to the offset size difference from 32-bit to 64-bit.
2. A wider offset type requires a few more SIMD instructions in each 8-row processing iteration (see the sketch below).
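A rough sketch of item 2 (helper names and values here are hypothetical, not the actual Arrow kernels): with 32-bit offsets, 8 row offsets fit in a single 256-bit register and one lane multiply; with 64-bit offsets, the indices have to be split, widened, and multiplied into two registers.

```cpp
#include <immintrin.h>
#include <cstdint>
#include <cstdio>

// 32-bit path: one instruction, one register for 8 rows.
static inline __m256i RowOffsets32(__m256i irow, uint32_t fixed_length) {
  return _mm256_mullo_epi32(irow, _mm256_set1_epi32(fixed_length));
}

// 64-bit path: widen 4+4 indices to 64-bit lanes, then multiply. The widened
// lanes hold the index in their low 32 bits, which is what _mm256_mul_epu32 uses.
static inline void RowOffsets64(__m256i irow, uint32_t fixed_length,
                                __m256i* lo, __m256i* hi) {
  const __m256i fl = _mm256_set1_epi64x(fixed_length);
  *lo = _mm256_mul_epu32(_mm256_cvtepu32_epi64(_mm256_castsi256_si128(irow)), fl);
  *hi = _mm256_mul_epu32(
      _mm256_cvtepu32_epi64(_mm256_extracti128_si256(irow, 1)), fl);
}

int main() {  // compile with -mavx2
  alignas(32) uint32_t irows[8] = {0, 1, 2, 3, 4, 5, 20'000'000, 30'000'000};
  __m256i irow = _mm256_load_si256(reinterpret_cast<const __m256i*>(irows));
  __m256i lo, hi;
  RowOffsets64(irow, 256, &lo, &hi);
  alignas(32) uint64_t out[8];
  _mm256_store_si256(reinterpret_cast<__m256i*>(out), lo);
  _mm256_store_si256(reinterpret_cast<__m256i*>(out + 4), hi);
  for (int i = 0; i < 8; ++i)
    std::printf("%llu\n", static_cast<unsigned long long>(out[i]));
}
```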

In my opinion, this overhead is justified by the benefits listed above.

### What changes are included in this PR?

The row offset of the row table is changed from 32-bit to 64-bit. Related code in row comparison/encoding and the Swiss join has been updated accordingly.

### Are these changes tested?

Test included.

### Are there any user-facing changes?

Users may see higher memory consumption when using Acero's hash join and hash aggregation. On the other hand, certain use cases that used to fail are now able to complete.

* GitHub Issue: #43495

Authored-by: Ruoxi Sun <[email protected]>
Signed-off-by: Antoine Pitrou <[email protected]>
pitrou added this to the 18.0.0 milestone Aug 19, 2024
@pitrou
Member

pitrou commented Aug 19, 2024

Issue resolved by pull request #43389
