Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

decoders: replace binary.Read with a version without reflection and allocations #141

Merged
merged 5 commits into from
Aug 19, 2023

Conversation

vincentbernat
Copy link
Contributor

See #140. Not really tested. Still need to get rid of bytes.Buffer.

@vincentbernat vincentbernat force-pushed the feature/no-binary-read branch from 7e5362b to e65432c Compare January 19, 2023 07:56
@vincentbernat
Copy link
Contributor Author

By using the Next() method on bytes.Buffer, we can avoid allocating arrays on the heap. This provides a 33% speedup with sFlow. This is a far bigger win than not using reflection (as I suppose the paths where reflection were used were not that important).

@vincentbernat vincentbernat changed the title decoders: replace binary.Read with a version not using reflection decoders: replace binary.Read with a version without reflection and allocations Jan 19, 2023
@vincentbernat vincentbernat marked this pull request as ready for review January 19, 2023 08:00
@vincentbernat vincentbernat marked this pull request as draft January 19, 2023 10:54
@vincentbernat
Copy link
Contributor Author

I may have introduced a regression, let me investigate!

@vincentbernat vincentbernat force-pushed the feature/no-binary-read branch from e65432c to c797534 Compare January 19, 2023 11:11
@vincentbernat
Copy link
Contributor Author

OK for now. I have fixed the bug. Previously, when we provide *[]uint8, this was done with reflection. My bug was I didn't handle this case (only []uint8) and it's easy to miss because some callers provide a pointer to the slice instead of just the slice. I have fixed them to just provide the slice, but it shows this change may be a bit dangerous.

@vincentbernat vincentbernat marked this pull request as ready for review January 19, 2023 11:13
vincentbernat added a commit to akvorado/akvorado that referenced this pull request Jan 19, 2023
```
benchmark                                            old ns/op     new ns/op     delta
BenchmarkDecodeEncodeNetflow/with_encoding-12        7664          7548          -1.51%
BenchmarkDecodeEncodeNetflow/without_encoding-12     7273          6872          -5.51%
BenchmarkDecodeEncodeSflow/with_encoding-12          14029         9905          -29.40%
BenchmarkDecodeEncodeSflow/without_encoding-12       13784         9245          -32.93%

benchmark                                            old allocs     new allocs     delta
BenchmarkDecodeEncodeNetflow/with_encoding-12        143            134            -6.29%
BenchmarkDecodeEncodeNetflow/without_encoding-12     143            134            -6.29%
BenchmarkDecodeEncodeSflow/with_encoding-12          295            137            -53.56%
BenchmarkDecodeEncodeSflow/without_encoding-12       295            137            -53.56%

benchmark                                            old bytes     new bytes     delta
BenchmarkDecodeEncodeNetflow/with_encoding-12        8272          8160          -1.35%
BenchmarkDecodeEncodeNetflow/without_encoding-12     8272          8160          -1.35%
BenchmarkDecodeEncodeSflow/with_encoding-12          9816          8216          -16.30%
BenchmarkDecodeEncodeSflow/without_encoding-12       9816          8216          -16.30%
```

"old" is still before using the fork. See
netsampler/goflow2#141 for details of the fix.
Instead of allocating small slices, we rely on the fact that most call
sites are providing a `bytes.Buffer` and use the `Next()` method. For
sFlow decoding, in my case, we get a 33% speedup.

A `bytes.Reader` would even be more efficient, but unfortunately, they
don't have a `Next()` method.

While Go should be smart enough to make the allocation of `bs` on the
stack, it does not, even when `io.ReadFull()` is inlines.

```
decoders/utils/utils.go:23:13: make([]byte, n) escapes to heap
```
@lspgn lspgn added ipfix Related to IPFIX protocol sflow Related to sFlow protocol labels Mar 24, 2023
@lspgn lspgn mentioned this pull request Apr 16, 2023
@lspgn lspgn changed the base branch from main to v1 August 15, 2023 01:43
@lspgn
Copy link
Member

lspgn commented Aug 15, 2023

Added benchmark:

BenchmarkBinaryReadIntegerBase-4         	85853754	        14.17 ns/op
BenchmarkBinaryReadIntegerComparison-4   	21691910	        48.44 ns/op
BenchmarkBinaryReadByteBase-4            	98613631	        12.20 ns/op
BenchmarkBinaryReadBytesBase-4           	77762986	        14.36 ns/op
BenchmarkBinaryReadBytesComparison-4     	23859307	        46.16 ns/op
BenchmarkBinaryReadUintsBase-4           	38840542	        29.84 ns/op
BenchmarkBinaryReadUintsComparison-4     	14845839	        76.42 ns/op

@lspgn
Copy link
Member

lspgn commented Aug 15, 2023

Will integrate it with v2 as well.

@lspgn lspgn merged commit 66b47ee into netsampler:v1 Aug 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ipfix Related to IPFIX protocol sflow Related to sFlow protocol
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants