bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539

itschrispeck · 2024-03-02T08:39:57Z

Addresses #12534

Based on my understanding, calling _decompressor.decompress(compressedInput, decompressedOutput) is problematic because it moves the position of the src and dest ByteBuffers. It is possible to move the position of the src buffer beyond the limit of the buffer (see issue for details).

The change calls the same underlying method, but only moves the position of the dest buffer. The reason why this was a V4 raw index writer/reader version issue is that this only happens with the length prefixed LZ4 library, and this code path is not used for other raw index writer versions.

Testing: Set up a unit test locally, now able to read a previously unreadable segment when loading/iterating through a problematic column using PinotSegmentRecordReader

codecov-commenter · 2024-03-02T09:15:12Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 61.81%. Comparing base (59551e4) to head (a6f65ac).
Report is 119 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #12539      +/-   ##
============================================
+ Coverage     61.75%   61.81%   +0.05%     
+ Complexity      207      198       -9     
============================================
  Files          2436     2453      +17     
  Lines        133233   133864     +631     
  Branches      20636    20764     +128     
============================================
+ Hits          82274    82743     +469     
- Misses        44911    45006      +95     
- Partials       6048     6115      +67

Flag	Coverage Δ
custom-integration1	`<0.01% <0.00%> (-0.01%)`	⬇️
integration	`<0.01% <0.00%> (-0.01%)`	⬇️
integration1	`<0.01% <0.00%> (-0.01%)`	⬇️
integration2	`0.00% <0.00%> (ø)`
java-11	`61.75% <100.00%> (+0.04%)`	⬆️
java-21	`61.69% <100.00%> (+0.07%)`	⬆️
skip-bytebuffers-false	`61.77% <100.00%> (+0.03%)`	⬆️
skip-bytebuffers-true	`61.67% <100.00%> (+33.94%)`	⬆️
temurin	`61.81% <100.00%> (+0.05%)`	⬆️
unittests	`61.80% <100.00%> (+0.05%)`	⬆️
unittests1	`46.93% <0.00%> (+0.04%)`	⬆️
unittests2	`27.69% <100.00%> (-0.04%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

richardstartin · 2024-03-02T21:07:37Z

This is a good catch. I have a couple of remarks:

It would be good to add a test because this is a subtle bug that has been hiding in plain sight for over two years, and it could regress in refactoring
Whilst I introduced this bug and am a committer, I’m not really involved in the project any more and can only rubber stamp this change.

Jackie-Jiang · 2024-03-05T23:11:06Z

Trying to understand the problem better here. Why would it cause problem when the source position is moved? It is allowed for decompress() to move the position of the compressed buffer, and it should not break.

itschrispeck · 2024-03-06T07:30:52Z

Trying to understand the problem better here. Why would it cause problem when the source position is moved? It is allowed for decompress() to move the position of the compressed buffer, and it should not break.

I don't think the last bit is always true. While it is allowed to move the position, lz4_decompress_fast can move it beyond the source buffer's limit. See this example, the returned bytes read value is greater than src.limit:

There's a note in the docs:

The function never writes past the output buffer. However, since it doesn't know its 'src' size, it may read past the intended input. Also, because match offsets are not validated during decoding, reads from 'src' may underflow.

I'm also wondering if the the fast decompressor is the best choice. It may be better to move to the safe decompressor which is already used for non v4 indexes - doing that would also address this bug. The lz4 and lz4-java docs disagree here, the former states no significant performance difference but the latter suggests safe is a bit slower (may be out of date, or slower if non JNI path is used).

LZ4_decompress_fast() : unsafe! This function used to be a bit faster than LZ4_decompress_safe(), though situation has changed in recent versions, and now LZ4_decompress_safe() can be as fast and sometimes faster than LZ4_decompress_fast(). Moreover, LZ4_decompress_fast() is not protected vs malformed input, as it doesn't perform full validation of compressed data. As a consequence, this function is no longer recommended, and may be deprecated in future versions. It's only remaining specificity is that it can decompress data without knowing its compressed size.

Jackie-Jiang · 2024-03-06T22:44:53Z

@itschrispeck Thanks for digging into it! I feel using safe decompressor is the safe go here, and we can get consistent behavior for v2 and v4.

Jackie-Jiang · 2024-03-06T22:45:21Z

+1 on adding a unit test to prevent future regression

itschrispeck · 2024-03-12T21:19:54Z

Updated to use safe decompressor. This will move src to src.limit() instead of based on bytes read (ref).

I wasn't able to figure out a good way to round trip unit test this. I've added a comment on why the safe decompressor is chosen to hopefully avoid refactoring into the fast decompressor. With more understanding of the LZ4 algorithm I could probably edit some compressed data's bytes to cause this, but given that fast decompressor is deprecated I'd prefer if we can merge this without the test.

I did a basic validation of the underlying decompress implementations. In general, perf differences are a wash and vary based on host/input:

+--------------+------------------------------+--------------+----------------+--------------+--------------+
|Source        |Function Benchmarked          | Total Seconds|  Iterations/sec|  ns/Iteration|  % of default|
+--------------+------------------------------+--------------+----------------+--------------+--------------+
|Normal Text   |LZ4_decompress_safe()         | 140.382024000|          356170|          2807|       100.00%|
|Normal Text   |LZ4_decompress_fast()         | 133.453277000|          374662|          2669|        95.06%|
|              |                              |              |                |              |              |
|Compressible  |LZ4_decompress_safe()         |  28.044312000|         1782892|           560|        19.98%|
|Compressible  |LZ4_decompress_fast()         |  32.404868999|         1542978|           648|        23.08%|
+--------------+------------------------------+--------------+----------------+--------------+--------------+

ankitsultana requested a review from richardstartin March 2, 2024 09:03

richardstartin approved these changes Mar 2, 2024

View reviewed changes

itschrispeck changed the title ~~bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress~~ [don't merge] do not move src ByteBuffer position for LZ4 length prefixed decompress Mar 4, 2024

do not move src ByteBuffer position

2865f63

itschrispeck force-pushed the lz4_length_decompress branch from 055aa92 to 2865f63 Compare March 4, 2024 07:44

itschrispeck changed the title ~~[don't merge] do not move src ByteBuffer position for LZ4 length prefixed decompress~~ bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress Mar 4, 2024

Jackie-Jiang added the bugfix label Mar 5, 2024

switch to safe decompressor

a6f65ac

Jackie-Jiang merged commit da2604b into apache:master Mar 12, 2024
19 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539

bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539

itschrispeck commented Mar 2, 2024 •

edited

Loading

codecov-commenter commented Mar 2, 2024 •

edited

Loading

richardstartin commented Mar 2, 2024

Jackie-Jiang commented Mar 5, 2024

itschrispeck commented Mar 6, 2024 •

edited

Loading

Jackie-Jiang commented Mar 6, 2024

Jackie-Jiang commented Mar 6, 2024

itschrispeck commented Mar 12, 2024

bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539

bugfix: do not move src ByteBuffer position for LZ4 length prefixed decompress #12539

Conversation

itschrispeck commented Mar 2, 2024 • edited Loading

codecov-commenter commented Mar 2, 2024 • edited Loading

Codecov Report

richardstartin commented Mar 2, 2024

Jackie-Jiang commented Mar 5, 2024

itschrispeck commented Mar 6, 2024 • edited Loading

Jackie-Jiang commented Mar 6, 2024

Jackie-Jiang commented Mar 6, 2024

itschrispeck commented Mar 12, 2024

itschrispeck commented Mar 2, 2024 •

edited

Loading

codecov-commenter commented Mar 2, 2024 •

edited

Loading

itschrispeck commented Mar 6, 2024 •

edited

Loading