Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvement on the scalar / random bitpacker code. #1781

Merged
merged 2 commits into from
Jan 19, 2023

Conversation

fulmicoton
Copy link
Collaborator

Added proptesting
Added simple benchmark
Added assert and comments on the very non trivial hidden contract Remove the need for an extra padding.

The last point introduces a small performance regression (~10%).

@fulmicoton fulmicoton force-pushed the bitpacker-no-padding branch from 659bf93 to e0d93c0 Compare January 12, 2023 05:45
@fulmicoton fulmicoton requested a review from PSeitz January 12, 2023 05:46
@fulmicoton fulmicoton marked this pull request as ready for review January 12, 2023 05:46
return 0u64;
}
let addr_in_bits = idx * self.num_bits as u32;
let addr_in_bits = idx * self.num_bits;
Copy link
Contributor

@PSeitz PSeitz Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if dynamic dispatch on a compile time num_bits for all 64 variants would be a benefit for performance

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the usage sites, it looks like the compiler will have a hard time inferring a constant, hence I would suspect that it could help. But I suspect it would also need some kind of type registry to provide access to &'static dyn without boxing?

Finally, if the number of actually used instance is really small, it might be better to use "dictionary dispatch", e.g. have a match on the bit width and branch into compile-time specializations with known bit widths, c.f. https://lwn.net/Articles/815908/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if what I meant was clear, when opening you'd get a typed version

pub fn new(num_bits: u8) -> Arc<dyn BitUnpackerTrait>{
    match num_bits {
        1 => Arc::new(BitUnpacker::<1>),
        2 => Arc::new(BitUnpacker::<2>),
        ...
    }
}

That would limit inlining and we have a dynamic dispatch, but it would allow optimization in the get method and reduce the number of cycles there.

https://godbolt.org/z/ch4Wsjaf6

A match would also be an alternative worth looking into.

With specialized versions, we don't need the padding anymore, since we can read a variable number of bytes, depending on num_bits.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if what I meant was clear, when opening you'd get a typed version

This is exactly how I understood your suggestion. I assumed a registry type data structure to avoid allocating the Arc/Box/etc every time and handing out &'static dyn BitUnpackerTrait instead.

Is there are inherent limit on the possible values of num_bits? I don't think eagerly producing 255 monomorphisations and matching on num_bits to dispatch into them is reasonable, but IDK 32 might be.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect a regression but who knows.
That's not a bad idea., at least it could help removing the 0 special case if there are no regressions...
and with the bit=1 case we could also use it to build an adhoc column value for booleans.

Added proptesting
Added simple benchmark
Added assert and comments on the very non trivial hidden contract
Remove the need for an extra padding.

The last point introduces a small performance regression (~10%).
@fulmicoton fulmicoton force-pushed the bitpacker-no-padding branch from e0d93c0 to e3e46e1 Compare January 19, 2023 09:08
@fulmicoton fulmicoton merged commit 08919a2 into main Jan 19, 2023
@fulmicoton fulmicoton deleted the bitpacker-no-padding branch January 19, 2023 09:09
Hodkinson pushed a commit to Hodkinson/tantivy that referenced this pull request Jan 30, 2023
* Improvement on the scalar / random bitpacker code.

Added proptesting
Added simple benchmark
Added assert and comments on the very non trivial hidden contract
Remove the need for an extra padding.

The last point introduces a small performance regression (~10%).

* Fixing unit tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants