Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

struct CaseSet: Optimize by matching on len once per set_ctx #1283

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

kkysen
Copy link
Collaborator

@kkysen kkysen commented Jul 3, 2024

In C, case_set works by switching once on len, once per set_ctx, but my CaseSet implementation in Rust accidentally switched to matching on buf.len() each time. This goes back to the original, optimal behavior.

To do this, set_ctx is made a rank-2 polymorphic "closure". This does not exist in Rust, so it's emulated through a generic trait with a generic method. This inner generic over trait CaseSetter is what allows fn CaseSet::one to select the correct CaseSetterN at compile time.

However, this means that closures can't be used anymore, which is very annoying. To partially remedy this, I added the set_ctx! macro, which emulates the set_ctx closure as much as possible. All captures (up vars in rustc) and their types must be declared.

I didn't actually benchmark this or look at the asm yet, though, if anyone wants to do that.

@kkysen kkysen requested a review from rinon July 3, 2024 11:08
In C, `case_set` works by `switch`ing once on `len`, once per `set_ctx`,
but my `CaseSet` implementation in Rust accidentally switched to `match`ing on `buf.len()` each time.
This goes back to the original, optimal behavior.

To do this, `set_ctx` is made a rank-2 polymorphic "closure".
This does not exist in Rust, so it's emulated through a generic trait with a generic method.
This inner generic over `trait CaseSetter` is what allows
`fn CaseSet::one` to select the correct `CaseSetterN` at compile time.

However, this means that closures can't be used anymore, which is very annoying.
To partially remedy this, I added the `set_ctx!` macro,
which emulates the `set_ctx` closure as much as possible.
All captures (up vars in `rustc`) and their types must be declared.
@kkysen kkysen force-pushed the kkysen/case_set-rank2 branch from 4df8569 to 0de4037 Compare July 3, 2024 11:13
@rinon
Copy link
Collaborator

rinon commented Jul 3, 2024

I'm working on validating performance for this. It's not making as much of a difference as expected, and I think some checks are not getting elided.

@kkysen
Copy link
Collaborator Author

kkysen commented Jul 4, 2024

I'm working on validating performance for this. It's not making as much of a difference as expected, and I think some checks are not getting elided.

😭 I didn't get a chance to look at the asm yet, though. Let me know if there's anything simple we can do there. I can look a bit too to see which things aren't elided.

@rinon
Copy link
Collaborator

rinon commented Jul 4, 2024

I have a very different way of doing this that is a lot less complicated, I think, so don't spend time on it for now.

@kkysen
Copy link
Collaborator Author

kkysen commented Jul 4, 2024

I have a very different way of doing this that is a lot less complicated, I think, so don't spend time on it for now.

Oh interesting, hope it works!

@kkysen kkysen changed the title struct CaseSet: Optimize by matching on len one per set_ctx struct CaseSet: Optimize by matching on len once per set_ctx Jul 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants