Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

groot/internal/rcompress: reuse buffers when possible #1023

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

bburghgr
Copy link

Fixes #712, but the current implementation is a bit delicate (from pre-generics sync use), so I'm not entirely sure it's worth it for the marginal performance improvements.

This reuses buffers for lz4, zlib and zstd. The other compression algs did not appear to expose an API allowing for buffer reuse. All of this should be threadsafe, but hit hasn't been tested very thoroughly. Nothing is flagged if I build groot-bench with -race and generate files or run benchmarks, for whatever that's worth.

Any performance gains would come from reducing allocations, seen in e.g. go tool pprof -alloc-space on a memprofile, along with a reduction in gc frequency from the reduced allocation rate. There's very little change to overall throughput or memory use (small improvements in some cases):

$ benchstat default.txt pooled.txt 
goos: linux
goarch: amd64
pkg: github.com/go-hep/groot-bench
cpu: Intel(R) Core(TM) i5-10300H CPU @ 2.50GHz
                                       │ default.txt │             pooled.txt             │
                                       │   sec/op    │   sec/op     vs base               │
ReadScalar/GoHEP/None-8                  515.0m ± 1%   512.8m ± 1%       ~ (p=0.221 n=20)
ReadScalar/GoHEP/LZ4-8                   516.6m ± 1%   517.1m ± 0%       ~ (p=0.495 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             474.5m ± 1%   473.2m ± 1%       ~ (p=0.904 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             482.3m ± 1%   479.6m ± 1%       ~ (p=0.355 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8              1.250 ± 1%    1.213 ± 5%  -2.94% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/None-8        941.4m ± 1%   933.2m ± 1%       ~ (p=0.056 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         949.4m ± 1%   946.4m ± 1%       ~ (p=0.620 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   946.8m ± 1%   937.1m ± 1%       ~ (p=0.121 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   953.2m ± 1%   940.0m ± 1%  -1.38% (p=0.013 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8    1.982 ± 1%    1.972 ± 1%       ~ (p=0.091 n=20)
ReadScalar/ROOT-TreeReader/None-8         1.125 ± 2%    1.105 ± 2%       ~ (p=0.091 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8          1.145 ± 2%    1.137 ± 2%       ~ (p=0.165 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8    1.120 ± 1%    1.115 ± 1%       ~ (p=0.583 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8    1.138 ± 2%    1.136 ± 2%       ~ (p=0.989 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8    2.161 ± 0%    2.153 ± 1%       ~ (p=0.091 n=20)
geomean                                  945.5m        938.2m       -0.78%

                                       │ default.txt │             pooled.txt              │
                                       │   rss/op    │   rss/op     vs base                │
ReadScalar/GoHEP/None-8                  9.984M ± 1%   9.984M ± 0%        ~ (p=0.966 n=20)
ReadScalar/GoHEP/LZ4-8                   9.984M ± 1%   9.984M ± 1%        ~ (p=0.795 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             9.984M ± 1%   9.856M ± 1%   -1.28% (p=0.016 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             11.26M ± 1%   10.11M ± 1%  -10.23% (p=0.000 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8             12.88M ± 1%   12.68M ± 1%   -1.58% (p=0.004 n=20)
ReadScalar/ROOT-TreeBranch/None-8        320.3M ± 0%   319.9M ± 0%   -0.10% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         320.5M ± 0%   320.2M ± 0%   -0.08% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   320.3M ± 0%   320.1M ± 0%   -0.07% (p=0.001 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   320.4M ± 0%   320.1M ± 0%   -0.08% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8   320.2M ± 0%   320.0M ± 0%   -0.08% (p=0.000 n=20)
ReadScalar/ROOT-TreeReader/None-8        330.8M ± 0%   330.9M ± 0%        ~ (p=0.825 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8         331.2M ± 0%   331.3M ± 0%        ~ (p=0.899 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8   331.2M ± 0%   331.2M ± 0%        ~ (p=0.733 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8   331.2M ± 0%   331.2M ± 0%        ~ (p=1.000 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8   331.2M ± 0%   331.2M ± 0%        ~ (p=0.560 n=20)
geomean                                  104.5M        103.5M        -0.93%

                                       │ default.txt │             pooled.txt             │
                                       │   vmem/op   │   vmem/op    vs base               │
ReadScalar/GoHEP/None-8                  1.232G ± 0%   1.232G ± 0%       ~ (p=0.841 n=20)
ReadScalar/GoHEP/LZ4-8                   1.232G ± 0%   1.232G ± 0%  +0.01% (p=0.044 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             1.232G ± 0%   1.232G ± 0%  +0.00% (p=0.021 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             1.233G ± 0%   1.232G ± 0%  -0.05% (p=0.000 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8             1.233G ± 0%   1.233G ± 0%  +0.02% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/None-8        538.0M ± 0%   538.0M ± 0%       ~ (p=0.175 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         538.0M ± 0%   538.0M ± 0%       ~ (p=0.987 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   538.0M ± 0%   538.0M ± 0%       ~ (p=0.471 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   538.1M ± 0%   538.1M ± 0%       ~ (p=0.386 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8   538.0M ± 0%   538.0M ± 0%       ~ (p=0.997 n=20)
ReadScalar/ROOT-TreeReader/None-8        555.6M ± 0%   555.6M ± 0%       ~ (p=0.552 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8         555.6M ± 0%   555.6M ± 0%       ~ (p=0.136 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8   555.7M ± 0%   555.7M ± 0%       ~ (p=0.374 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8   555.7M ± 0%   555.7M ± 0%       ~ (p=0.393 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8   555.6M ± 0%   555.6M ± 0%       ~ (p=0.473 n=20)
geomean                                  716.9M        716.9M       -0.00%

                                       │ default.txt │             pooled.txt             │
                                       │   wall/op   │   wall/op    vs base               │
ReadScalar/GoHEP/None-8                  513.8m ± 1%   511.8m ± 1%       ~ (p=0.280 n=20)
ReadScalar/GoHEP/LZ4-8                   514.3m ± 1%   515.3m ± 1%       ~ (p=0.888 n=20)
ReadScalar/GoHEP/Zlib-Lvl1-8             475.8m ± 1%   474.1m ± 1%       ~ (p=0.602 n=20)
ReadScalar/GoHEP/Zlib-Lvl6-8             482.4m ± 1%   480.0m ± 0%       ~ (p=0.433 n=20)
ReadScalar/GoHEP/Zlib-Lvl9-8              1.250 ± 1%    1.212 ± 4%  -2.96% (p=0.000 n=20)
ReadScalar/ROOT-TreeBranch/None-8        935.6m ± 2%   935.1m ± 1%       ~ (p=0.718 n=20)
ReadScalar/ROOT-TreeBranch/LZ4-8         947.1m ± 2%   950.1m ± 2%       ~ (p=0.947 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl1-8   952.2m ± 2%   932.4m ± 1%  -2.09% (p=0.013 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl6-8   945.6m ± 2%   933.4m ± 0%  -1.28% (p=0.006 n=20)
ReadScalar/ROOT-TreeBranch/Zlib-Lvl9-8    1.982 ± 1%    1.972 ± 1%       ~ (p=0.082 n=20)
ReadScalar/ROOT-TreeReader/None-8         1.124 ± 2%    1.106 ± 2%       ~ (p=0.090 n=20)
ReadScalar/ROOT-TreeReader/LZ4-8          1.145 ± 2%    1.136 ± 2%       ~ (p=0.151 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl1-8    1.121 ± 1%    1.115 ± 1%       ~ (p=0.597 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl6-8    1.137 ± 2%    1.135 ± 2%       ~ (p=0.984 n=20)
ReadScalar/ROOT-TreeReader/Zlib-Lvl9-8    2.160 ± 0%    2.153 ± 1%       ~ (p=0.087 n=20)
geomean                                  944.5m        937.5m       -0.74%

@sbinet
Copy link
Member

sbinet commented Nov 18, 2024

yeah, the results are a bit of a let down.

perhaps we could revisit this when/if a generic sync.Map is introduced, as described in:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

groot/internal/rcompress: better buffer reuse
2 participants