Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: we may have a chance to avoid some copies when decoding with some alignment adjustment(page alignment, chunk alignment) #3115

Open
broccoliSpicy opened this issue Nov 11, 2024 · 3 comments · Fixed by #3116
Assignees

Comments

@broccoliSpicy
Copy link
Contributor

broccoliSpicy commented Nov 11, 2024

PR #3101 added alignment in page layout and chunk layout, but PR #3099 still need to do a copy of the raw data read from disk to start decoding, see the code here

for fastlanes bitpacking, there is also a copy

let chunk_in_u8: Vec<u8> = data.to_vec();

in binary + miniblock, I think this copy can be avoided if we align each chunk to 4 bytes.
in fastlanes bitpacking, because the use of SIMD instruction, the alignment requirement is stronger, reference here, we may also need to change the compression logic to allow it

@broccoliSpicy broccoliSpicy self-assigned this Nov 11, 2024
@broccoliSpicy broccoliSpicy changed the title perf: we may have a change to avoid some copies with some alignment adjustment(page alignment, chunk alignment) perf: we may have a chance to avoid some copies with some alignment adjustment(page alignment, chunk alignment) Nov 11, 2024
@broccoliSpicy broccoliSpicy changed the title perf: we may have a chance to avoid some copies with some alignment adjustment(page alignment, chunk alignment) perf: we may have a chance to avoid some copies when decoding with some alignment adjustment(page alignment, chunk alignment) Nov 11, 2024
@westonpace
Copy link
Contributor

in binary + miniblock, I think this copy can be avoided if we align each chunk to 4 bytes.

We already align each chunk to 8 bytes but there was a bug preventing this from working correctly.

in fastlanes bitpacking, because the use of SIMD instruction, the alignment requirement is stronger

How much stronger? 64 byte alignment for a 4KiB chunk seems pretty extensive (up to 6% of the block is wasted space) but I suppose it is manageable. Still, if we want to require this then I'd prefer changing the MiniBlockCompressor trait to allow compressors to state how much alignment they need. This way compressors that don't need such strict requirements don't have to pay.

@westonpace
Copy link
Contributor

(up to 6% of the block is wasted space)

I did the math wrong. 64/4096 is 1/64th so more like 1-2%.

broccoliSpicy pushed a commit that referenced this issue Nov 11, 2024
We align buffers in the file writer but we were not doing the same thing
in the test utility. This forced encodings to do extra copies. We remove
one such copy in this PR.

Closes #3115
@broccoliSpicy broccoliSpicy reopened this Nov 11, 2024
@broccoliSpicy
Copy link
Contributor Author

let chunk_in_u8: Vec<u8> = data.to_vec();

instantiate a copy for the data using .to_vec doesn't guarantee the data starts at a 64 byte aligned position either, we may be able to get rid of this copy with the currently page layout padding, if so, changes in bitpack mini-block compression logic needed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants