-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/compile: named returns are faster than inline returns #40638
Comments
As part of #792, we moved to typed.ReadBuffer which is much more readable than direct binary translations, but we lost a little bit of performance on the KV iterator, Original: ``` BenchmarkKeyValIterator-12 52558426 66.7 ns/op ``` With typed.ReadBuffer: ``` BenchmarkKeyValIterator-12 34918740 101 ns/op ``` WIth these optimizations, performance is close to the original: ``` BenchmarkKeyValIterator-12 46994628 76.1 ns/op ``` This change has the following optimizations: * Make ReadBuffer smaller by 2 words (replace []byte with int). This gets rid of the Fill functionality, which was never used in any production code anyway. * Don't create a read buffer for single Uint16 read when creating KeyValueIterator * Instead of using io.EOF to check for end of all headers, add a Remaining method which returns a bool (vs the Next method which returns > 10 words). We were spending time just copying the large struct values on the stack after the last header read. * Simplify how the KeyValueIterator.Next method creates remaining by adding method directly to the ReadBuffer. * Use named returns in KeyValueIterator.Next which give an unexpected ~6% performance improvement. Filed golang/go#40638 since this was unexpected.
As part of #792, we moved to typed.ReadBuffer which is much more readable than direct binary translations, but we lost a little bit of performance on the KV iterator, Original: ``` BenchmarkKeyValIterator-12 52558426 66.7 ns/op ``` With typed.ReadBuffer: ``` BenchmarkKeyValIterator-12 34918740 101 ns/op ``` WIth these optimizations, performance is close to the original: ``` BenchmarkKeyValIterator-12 46994628 76.1 ns/op ``` This change has the following optimizations: * Make ReadBuffer smaller by 2 words (replace []byte with int). This gets rid of the Fill functionality, which was never used in any production code anyway. * Don't create a read buffer for single Uint16 read when creating KeyValueIterator * Instead of using io.EOF to check for end of all headers, add a Remaining method which returns a bool (vs the Next method which returns > 10 words). We were spending time just copying the large struct values on the stack after the last header read. * Simplify how the KeyValueIterator.Next method creates remaining by adding method directly to the ReadBuffer. * Use named returns in KeyValueIterator.Next which give an unexpected ~6% performance improvement. Filed golang/go#40638 as we'd like to avoid named returns here.
It's a pretty small effect, but it does seem that using a local variable causes the compiler to assemble the value on the stack and then copy it to the result parameter stack location. Using a named result parameter causes the compiler to assemble the value directly in the result parameter stack location. When using gccgo I see that CC @randall77 |
Duplicate of #20859 ? |
Thanks. Closing as dup. |
As part of #792, we moved to typed.ReadBuffer which is much more readable than direct binary translations, but we lost a little bit of performance on the KV iterator, Original: ``` BenchmarkKeyValIterator-12 52558426 66.7 ns/op ``` With typed.ReadBuffer: ``` BenchmarkKeyValIterator-12 34918740 101 ns/op ``` WIth these optimizations, performance is close to the original: ``` BenchmarkKeyValIterator-12 46994628 76.1 ns/op ``` This change has the following optimizations: * Make ReadBuffer smaller by 2 words (replace []byte with int). This gets rid of the Fill functionality, which was never used in any production code anyway. * Don't create a read buffer for single Uint16 read when creating KeyValueIterator * Instead of using io.EOF to check for end of all headers, add a Remaining method which returns a bool (vs the Next method which returns > 10 words). We were spending time just copying the large struct values on the stack after the last header read. * Simplify how the KeyValueIterator.Next method creates remaining by adding method directly to the ReadBuffer. * Use named returns in KeyValueIterator.Next which give an unexpected ~6% performance improvement. Filed golang/go#40638 as we'd like to avoid named returns here.
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
Return a medium sized struct (E.g., 4 byte slices) in similar, but slightly different ways (e.g., using named returns, as part of
return
), and saw unexpected slow performance for the return "immediate"E.g., this approach was slower:
as compared to using named returns:
Full repro:
What did you expect to see?
Expected the performance to be the same for the different approaches.
What did you see instead?
Named returns has a performance advantage over returning a value as part of a
return
statement.The text was updated successfully, but these errors were encountered: