Commit
flate: add TryWriteCopy to inline dictionary operations
Most LZ77 dictionary copies involve short distances and short lengths.
Making a WriteCopy function call for every one of these cases can be
expensive. Instead, we add a new method, TryWriteCopy, that handles only
this common case and is simple enough to be inlined into the caller.

benchmark                              old MB/s     new MB/s     speedup
BenchmarkDecodeDigitsSpeed1e4-4        74.49        80.41        1.08x
BenchmarkDecodeDigitsSpeed1e5-4        84.39        94.42        1.12x
BenchmarkDecodeDigitsSpeed1e6-4        87.19        98.66        1.13x
BenchmarkDecodeDigitsDefault1e4-4      75.46        80.85        1.07x
BenchmarkDecodeDigitsDefault1e5-4      90.90        95.78        1.05x
BenchmarkDecodeDigitsDefault1e6-4      95.28        98.92        1.04x
BenchmarkDecodeDigitsCompress1e4-4     75.36        81.33        1.08x
BenchmarkDecodeDigitsCompress1e5-4     91.63        95.74        1.04x
BenchmarkDecodeDigitsCompress1e6-4     95.05        98.90        1.04x
BenchmarkDecodeTwainSpeed1e4-4         73.05        79.83        1.09x
BenchmarkDecodeTwainSpeed1e5-4         91.18        97.11        1.07x
BenchmarkDecodeTwainSpeed1e6-4         96.61        102.12       1.06x
BenchmarkDecodeTwainDefault1e4-4       78.37        82.23        1.05x
BenchmarkDecodeTwainDefault1e5-4       108.44       113.87       1.05x
BenchmarkDecodeTwainDefault1e6-4       118.41       120.97       1.02x
BenchmarkDecodeTwainCompress1e4-4      78.25        81.77        1.04x
BenchmarkDecodeTwainCompress1e5-4      110.09       112.78       1.02x
BenchmarkDecodeTwainCompress1e6-4      119.64       122.13       1.02x
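
The fast-path/slow-path split described in the commit message can be sketched roughly as follows. This is an illustration only, not the code from this commit: the dictDecoder type, its hist and wrPos fields, and the lower-case method names tryWriteCopy and writeCopy are assumptions made for the example. The fast path handles only copies whose source and destination both lie inside the buffer without wrapping, and returns 0 otherwise so the caller can fall back to the general routine.

```go
package main

import "fmt"

// dictDecoder is a hypothetical sliding-window history buffer
// (names and layout are assumptions for this sketch).
type dictDecoder struct {
	hist  []byte // history window
	wrPos int    // next write position in hist
}

// tryWriteCopy is the small fast path. It copies length bytes starting
// dist bytes back, but only when nothing wraps around the buffer.
// It returns 0 when the caller must fall back to writeCopy.
func (dd *dictDecoder) tryWriteCopy(dist, length int) int {
	dstPos := dd.wrPos
	endPos := dstPos + length
	if dist <= 0 || dstPos < dist || endPos > len(dd.hist) {
		return 0
	}
	dstBase := dstPos
	srcPos := dstPos - dist
	// The source may overlap the destination (dist < length), so copy in
	// rounds; each round can double the amount of data available to copy.
	for dstPos < endPos {
		dstPos += copy(dd.hist[dstPos:endPos], dd.hist[srcPos:dstPos])
	}
	dd.wrPos = dstPos
	return dstPos - dstBase
}

// writeCopy is the general slow path for this sketch: it tolerates a
// source that wraps around the window and stops when the buffer is full.
func (dd *dictDecoder) writeCopy(dist, length int) int {
	n := 0
	for n < length && dd.wrPos < len(dd.hist) {
		src := dd.wrPos - dist
		if src < 0 {
			src += len(dd.hist) // source wrapped around the window
		}
		dd.hist[dd.wrPos] = dd.hist[src]
		dd.wrPos++
		n++
	}
	return n
}

func main() {
	dd := &dictDecoder{hist: make([]byte, 16)}
	dd.wrPos += copy(dd.hist, "ab") // two literal bytes

	// Short copy: the fast path succeeds.
	n := dd.tryWriteCopy(2, 5)
	fmt.Println(n, string(dd.hist[:dd.wrPos]))

	// Copy that would run past the buffer: fast path declines,
	// so the caller uses the general routine.
	if n = dd.tryWriteCopy(2, 20); n == 0 {
		n = dd.writeCopy(2, 20)
	}
	fmt.Println(n, dd.wrPos)
}
```

Whether the Go compiler actually inlines a function shaped like tryWriteCopy depends on the toolchain version; building with `go build -gcflags=-m` reports the inlining decisions.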
It is probably worth checking if this is still a problem with the current compiler.