`struct CaseSet`: Optimize by `match`ing on `len` once per `set_ctx` #1283

kkysen · 2024-07-03T11:08:14Z

In C, `case_set` works by `switch`ing once on `len`, once per `set_ctx`, but my `CaseSet` implementation in Rust accidentally switched to `match`ing on `buf.len()` each time. This goes back to the original, optimal behavior.

To do this, `set_ctx` is made a rank-2 polymorphic "closure". This does not exist in Rust, so it's emulated through a generic trait with a generic method. This inner generic over `trait CaseSetter` is what allows `fn CaseSet::one` to select the correct `CaseSetterN` at compile time.
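For readers unfamiliar with the pattern, here is a minimal sketch of emulating a rank-2 "closure" with a trait that has a generic method. Only the name `CaseSetter` comes from this PR; `FixedSetter`, `SetCtx`, `case_set_one`, and `FillTwo` are hypothetical stand-ins, and the real signatures in the codebase may differ.

```rust
/// Stand-in for the PR's `CaseSetter` (signature assumed, not taken from the PR).
trait CaseSetter {
    fn set(&self, buf: &mut [u8], val: u8);
}

/// Hypothetical fixed-width setter: `N` is known at compile time,
/// so the fill below has a constant length in each instantiation.
struct FixedSetter<const N: usize>;

impl<const N: usize> CaseSetter for FixedSetter<N> {
    fn set(&self, buf: &mut [u8], val: u8) {
        buf[..N].fill(val);
    }
}

/// The emulated rank-2 "closure": the method, not the trait, is generic
/// over the setter, so the callee gets to pick `S`.
trait SetCtx {
    fn call<S: CaseSetter>(&mut self, setter: &S);
}

/// Hypothetical driver: matches on `len` once, then runs the whole body
/// with the setter chosen for that length.
fn case_set_one<F: SetCtx>(len: usize, mut f: F) {
    match len {
        1 => f.call(&FixedSetter::<1>),
        2 => f.call(&FixedSetter::<2>),
        4 => f.call(&FixedSetter::<4>),
        8 => f.call(&FixedSetter::<8>),
        16 => f.call(&FixedSetter::<16>),
        _ => panic!("unsupported len: {len}"),
    }
}

/// A hand-written "closure": the captures become struct fields.
struct FillTwo<'a> {
    a: &'a mut [u8],
    b: &'a mut [u8],
    val: u8,
}

impl SetCtx for FillTwo<'_> {
    fn call<S: CaseSetter>(&mut self, setter: &S) {
        // Both calls share the same `S`, so `len` is not re-matched per call.
        setter.set(self.a, self.val);
        setter.set(self.b, self.val + 1);
    }
}
```

Calling `case_set_one(4, FillTwo { a, b, val })` monomorphizes `FillTwo::call` for `FixedSetter<4>`, which is the property an ordinary closure cannot provide because its parameter types are fixed when it is created.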

However, this means that closures can't be used anymore, which is very annoying. To partially remedy this, I added the `set_ctx!` macro, which emulates the `set_ctx` closure as much as possible. All captures (up vars in `rustc`) and their types must be declared.
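To illustrate why the captures and their types have to be spelled out, here is a rough sketch of a macro in the same spirit. It is not the PR's `set_ctx!`: the macro name `rank2_closure!`, its syntax, and the helper items `Setter`, `Fixed`, `Rank2`, and `dispatch` are all hypothetical. The macro expands to a struct whose fields are the declared captures, plus an impl of a trait with a generic method.

```rust
/// Hypothetical setter trait, one implementor per compile-time width.
trait Setter {
    fn set(&self, buf: &mut [u8], val: u8);
}

struct Fixed<const N: usize>;

impl<const N: usize> Setter for Fixed<N> {
    fn set(&self, buf: &mut [u8], val: u8) {
        buf[..N].fill(val);
    }
}

/// Hypothetical rank-2 "closure" trait: the method is generic over the setter.
trait Rank2 {
    fn call<S: Setter>(&mut self, setter: &S);
}

/// Sketch of a closure-like macro. A struct definition needs field types,
/// so every capture is listed as `name: Type`; the lifetime is passed in
/// explicitly and must be used by at least one capture type.
macro_rules! rank2_closure {
    (<$lt:lifetime> [$($cap:ident : $ty:ty),* $(,)?] |$setter:ident| $body:block) => {{
        struct Closure<$lt> { $($cap: $ty,)* }
        impl<$lt> Rank2 for Closure<$lt> {
            fn call<S: Setter>(&mut self, $setter: &S) {
                // Inside the body, each capture is a `&mut` to its field.
                let Self { $($cap,)* } = self;
                $body
            }
        }
        // Moves the same-named local variables into the struct.
        Closure { $($cap,)* }
    }};
}

/// Matches on `len` once and runs the body with the chosen setter.
fn dispatch<F: Rank2>(len: usize, mut f: F) {
    match len {
        2 => f.call(&Fixed::<2>),
        4 => f.call(&Fixed::<4>),
        _ => panic!("unsupported len: {len}"),
    }
}
```

A call site under these assumptions looks like `rank2_closure!(<'a> [a: &'a mut [u8], val: u8] |setter| { setter.set(a, *val); })`, with locals `a` and `val` already in scope. The explicit capture list is the cost of not having real closures here: the compiler infers a closure's captures, but a macro only sees tokens and cannot.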

I didn't actually benchmark this or look at the asm yet, though, if anyone wants to do that.

rinon · 2024-07-03T22:01:00Z

I'm working on validating performance for this. It's not making as much of a difference as expected, and I think some checks are not getting elided.

kkysen · 2024-07-04T01:19:53Z

> I'm working on validating performance for this. It's not making as much of a difference as expected, and I think some checks are not getting elided.

😭 I didn't get a chance to look at the asm yet, though. Let me know if there's anything simple we can do there. I can look a bit too to see which things aren't elided.

rinon · 2024-07-04T01:21:10Z

I have a very different way of doing this that is a lot less complicated, I think, so don't spend time on it for now.

kkysen · 2024-07-04T01:22:37Z

> I have a very different way of doing this that is a lot less complicated, I think, so don't spend time on it for now.

Oh interesting, hope it works!

kkysen requested a review from rinon July 3, 2024 11:08

kkysen force-pushed the kkysen/case_set-rank2 branch from 4df8569 to 0de4037 Compare July 3, 2024 11:13

kkysen added the performance label Jul 3, 2024

kkysen changed the title ~~struct CaseSet: Optimize by matching on len one per set_ctx~~ struct CaseSet: Optimize by matching on len once per set_ctx Jul 4, 2024

rinon mentioned this pull request Jul 6, 2024

[WIP] case_set implementation comparison #1292

Closed
