Granular benchmarks of palette transformations. #460
This commit introduces a new `transform.rs` module and moves row transformation functions into the new module:

* From `util.rs`:
    - `unpack_bits` (no longer needs to be `pub`)
    - `expand_trns_line`
    - `expand_trns_line16`
    - `expand_trns_and_strip_line16`
* From `mod.rs`:
    - `expand_paletted`
    - `expand_gray_u8`

This commit also renames `util.rs` into `adam7.rs`, because after the refactoring above this module contains only Adam7-related functionality:

- `struct Adam7Iterator`
- `fn expand_pass` (which operates on already-transformed, but still-interlaced rows)

This commit is intended to be just pure refactoring (i.e. no changes in behavior or performance are expected).
This ensures that all the public functions in the `transform.rs` module are infallible.
This commit tweaks `transform.rs` so that all the functions take the same parameters: `input: &[u8], output: &mut [u8], info: &Info`. This is achieved by:

1. Taking `info: &Info` instead of `trns: Option<&[u8]>, channels: usize` for `expand_trns_line`, `expand_trns_line16`, `expand_trns_and_strip_line16`.
2. Removing the `trns: Option<Option<&[u8]>>` parameter from `expand_paletted` and `expand_gray_u8` by splitting these functions into two separate flavors: ones that emit an alpha channel and ones that don't.
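To illustrate the split described in point 2, here is a minimal sketch of two palette-expansion flavors that share the `input: &[u8], output: &mut [u8], info: &Info` parameter list. The `Info` struct below is a hypothetical, pared-down stand-in with only the two fields the sketch needs, and the function names `expand_paletted_into_rgb8` and `expand_paletted_into_rgba8` are assumptions made for illustration, not necessarily the names used in the commit. Handling of out-of-range palette indices (mapped to black here) is likewise an assumption.

```rust
/// Hypothetical stand-in for the crate's `Info` struct (illustration only).
struct Info {
    /// RGB triples, as they would appear in a PLTE chunk.
    palette: Vec<u8>,
    /// Per-entry alpha values, as they would appear in a tRNS chunk.
    trns: Option<Vec<u8>>,
}

/// Flavor without alpha: each 8-bit palette index becomes 3 output bytes.
fn expand_paletted_into_rgb8(input: &[u8], output: &mut [u8], info: &Info) {
    for (&idx, rgb) in input.iter().zip(output.chunks_exact_mut(3)) {
        let start = idx as usize * 3;
        // Assumption: out-of-range indices map to black in this sketch.
        match info.palette.get(start..start + 3) {
            Some(entry) => rgb.copy_from_slice(entry),
            None => rgb.fill(0),
        }
    }
}

/// Flavor with alpha: each 8-bit palette index becomes 4 output bytes.
fn expand_paletted_into_rgba8(input: &[u8], output: &mut [u8], info: &Info) {
    let trns: &[u8] = info.trns.as_deref().unwrap_or_default();
    for (&idx, rgba) in input.iter().zip(output.chunks_exact_mut(4)) {
        let i = idx as usize;
        match info.palette.get(i * 3..i * 3 + 3) {
            Some(entry) => rgba[..3].copy_from_slice(entry),
            None => rgba[..3].fill(0),
        }
        // Palette entries without a transparency value are fully opaque.
        rgba[3] = trns.get(i).copied().unwrap_or(0xFF);
    }
}
```

Because both flavors have the same parameter list, a caller can pick one of them once and then invoke it for every row without passing any flavor-specific arguments.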
Instead of deciding which function to use for every row, memoize and reuse the first decision.

One desirable outcome of this commit is making the public API of the `transform.rs` module quite thin (just the `TransformFn` type alias and the `create_transform_fn` function) - this makes this functionality easier to test and benchmark.

Another desirable outcome is a small runtime improvement in most benchmarks (compared to the baseline just before the commit that introduces `transform.rs`):

- decode/paletted-zune.png: [-8.7989% -7.4940% -6.1466%] (p = 0.00 < 0.05)
- decode/kodim02.png: [-4.4824% -4.0883% -3.6232%] (p = 0.00 < 0.05)
- decode/Transparency.png: [-4.5886% -3.5213% -2.2121%] (p = 0.00 < 0.05)
- decode/kodim17.png: [-2.4406% -2.0663% -1.7093%] (p = 0.00 < 0.05)
- decode/kodim07.png: [-3.4461% -2.8264% -2.2676%] (p = 0.00 < 0.05)
- decode/kodim23.png: [-1.7490% -1.3101% -0.7639%] (p = 0.00 < 0.05)
- decode/Lohengrin: [-2.9387% -2.3664% -1.7545%] (p = 0.00 < 0.05)
- generated-noncompressed-4k-idat/8x8.png: [-4.0353% -3.5931% -3.1529%] (p = 0.00 < 0.05)
- generated-noncompressed-4k-idat/128x128.png: [-5.2607% -4.6452% -4.0279%] (p = 0.00 < 0.05)
- generated-noncompressed-4k-idat/2048x2048.png: [-3.0347% -1.7376% -0.4028%] (p = 0.03 < 0.05)
- generated-noncompressed-64k-idat/128x128.png: [-2.3769% -1.7924% -1.2211%] (p = 0.00 < 0.05)
- generated-noncompressed-64k-idat/2048x2048.png: [-12.113% -9.8099% -7.2633%] (p = 0.00 < 0.05)
- generated-noncompressed-64k-idat/12288x12288.png: [-5.0077% -1.4750% +1.4708%] (p = 0.43 > 0.05)
- generated-noncompressed-2g-idat/12288x12288.png: [-9.1860% -8.2857% -7.3934%] (p = 0.00 < 0.05)

Some regressions were observed in 2 benchmarks:

- generated-noncompressed-4k-idat/12288x12288.png:
    - [+2.5010% +3.1616% +3.8445%] (p = 0.00 < 0.05)
    - [+3.6046% +4.6592% +5.8580%] (p = 0.00 < 0.05)
    - [+4.6484% +5.4718% +6.4193%] (p = 0.00 < 0.05)
- generated-noncompressed-2g-idat/2048x2048.png:
    - [-0.6455% +1.9676% +3.9191%] (p = 0.13 > 0.05)
    - [+6.7491% +8.4227% +10.791%] (p = 0.00 < 0.05)
    - [+5.9926% +7.2249% +8.5428%] (p = 0.00 < 0.05)
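The memoization idea can be sketched roughly as follows. Only the names `TransformFn` and `create_transform_fn` come from the commit message. Everything else is an assumption made for illustration: the plain function-pointer representation of `TransformFn`, the pared-down `Info` with a single `has_trns` flag, the two toy row transformations, and the `Reader` struct with its `decisions` counter.

```rust
/// Hypothetical stand-in for the crate's `Info` struct (illustration only).
struct Info {
    has_trns: bool,
}

/// Assumed representation: one signature shared by all row transformations.
type TransformFn = fn(&[u8], &mut [u8], &Info);

/// Toy transformation: copies the row unchanged.
fn copy_row(input: &[u8], output: &mut [u8], _info: &Info) {
    output[..input.len()].copy_from_slice(input);
}

/// Toy transformation: expands 8-bit gray into gray + opaque alpha.
fn gray_to_gray_alpha(input: &[u8], output: &mut [u8], _info: &Info) {
    for (&gray, pair) in input.iter().zip(output.chunks_exact_mut(2)) {
        pair[0] = gray;
        pair[1] = 0xFF;
    }
}

/// Decides which transformation applies to an image described by `info`.
fn create_transform_fn(info: &Info) -> TransformFn {
    if info.has_trns {
        gray_to_gray_alpha
    } else {
        copy_row
    }
}

/// Hypothetical reader that caches the decision after the first row.
struct Reader {
    info: Info,
    transform_fn: Option<TransformFn>,
    /// Counts how often `create_transform_fn` ran (for demonstration only).
    decisions: usize,
}

impl Reader {
    fn transform_row(&mut self, input: &[u8], output: &mut [u8]) {
        let transform = match self.transform_fn {
            // Later rows reuse the memoized function.
            Some(f) => f,
            // The first row makes the decision and stores it.
            None => {
                let f = create_transform_fn(&self.info);
                self.transform_fn = Some(f);
                self.decisions += 1;
                f
            }
        };
        transform(input, output, &self.info);
    }
}
```

In this sketch the per-row cost is one `Option` check plus an indirect call, while the branching on image properties happens only once per image.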
Force-pushed from eb8b004 to 31d161a.
@fintelia, can you PTAL?
/cc @Shnatsel