-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: math: make Int.Size() faster by computation not len(MarshalledBytes) #16263
Conversation
This comment has been minimized.
This comment has been minimized.
107fe85
to
ae4bd37
Compare
ae4bd37
to
3972f74
Compare
math/int.go
Outdated
} | ||
|
||
// Use Log10(x) for values less than (1<<64)-1, given it is only defined for [1, (1<<64)-1] | ||
if i.Cmp(bigMaxUint64) <= 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn;t this equivalent to bitlen <= 64?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is and the result in changing it to bitLen <= 64
isn't significant! I used bigMaxUint64 to make for clear readability and because when writing out the algorithm I was thinking in terms of log2 where a value like: 18446744073709551616, that value is +1 greater than maxUint64. If you pass in math.Log10(float64(i.Uint64()))
would result in -Inf. Sure I'll change it to just that comparison
|
||
// At this point we should just keep reducing by 10^19 as that's the smallest multiple | ||
// of 10 that matches the digit length of (1<<64)-1 | ||
var ri *big.Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we pass this in as a scratch variable each loop, to avoid re-allocations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping @odeke-em
return size + sizeBigInt(ii, alreadyMadeCopy) | ||
} | ||
|
||
func sizeBigInt(i *big.Int, alreadyMadeCopy bool) (size int) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am a bit confused, since this feels like we could solve this with more pre-computation. E.g. for every bitlen, store "markers" in a map for values of that bitlength that corresspond to different sizes.
(And then we'd have 0 alloc's per size call)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah, it isn't as simple. I had independently tried this approach out, there are a bunch of numbers which are in the boundary of bit lengths but require you to search between the closest values: and roughly I also checked for the allocations reduction and it wasn't significant. I shall refine it in the future, but for purpose of clear wins this approach as is will work great: the approach to get accuracy is to:
- while precomputing the lookup table, for the same bit length B: store []*big.Int{startingBitLenValue, ...endBitLenValue}
- when you lookup by bit length, perform a binary search (range search) in the retrieved list and see if the value falls within that range
This was the code for my prior experiment which I'll refine after this PR lands:
type savings struct {
bi *big.Int
bl int
}
var bLenMap map[int]*savings
func computeLUT() {
// 1. Compute the tables on which we have a divergence in digits
// for log values.
// Goal to lookup the number of digits given a bit length
bLenMap = map[int]*savings{
0: {new(big.Int).SetUint64(1), 1},
}
for i := uint(0); i <= 258; i++ {
ii := new(big.Int).SetUint64(1)
ii = ii.Lsh(ii, i)
len10Digits := len(ii.String())
if false {
fmt.Printf("%-79s bitLen: %-4d %d\n", ii, ii.BitLen(), len10Digits)
}
bLenMap[ii.BitLen()] = &savings{bl: len10Digits, bi: ii}
}
}
func init() {
computeLUT()
}
func lookup(ii *big.Int) int {
ii = ii.Abs(ii) // Could be lazily computed using sync.Once only when needed
bits := ii.BitLen()
closest, ok := bLenMap[bits]
if !ok {
return 1
}
switch cmp := ii.Cmp(closest.bi); {
case cmp == 0:
return closest.bl
case cmp < 0:
for {
bits--
retr, ok := bLenMap[bits]
if !ok {
break
}
if ii.Cmp(retr.bi) >= 0 {
break
}
closest = retr
}
case cmp > 0:
for {
bits++
retr, ok := bLenMap[bits]
if !ok {
break
}
if ii.Cmp(retr.bi) <= 0 {
if retr.bi.BitLen() == ii.BitLen() {
closest = retr
}
break
}
closest = retr
}
}
return closest.bl
}
3972f74
to
481f32a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, nice job!
f7834c0
to
34980a3
Compare
…ytes) This change computes Int.Size() by checking bit lengths and translating those to base 10 values. This is instead of firstly invoking .Marshal to check the length, yet .MarshalTo requires invoking .Size() then .Marshal which is double work. The results show improvements: ```shell $ benchstat before.txt after.txt name old time/op new time/op delta IntSize-8 23.9µs ± 3% 20.3µs ± 1% -15.15% (p=0.000 n=9+9) name old alloc/op new alloc/op delta IntSize-8 6.62kB ± 0% 5.09kB ± 0% -23.19% (p=0.000 n=10+10) name old allocs/op new allocs/op delta IntSize-8 186 ± 0% 177 ± 0% -4.84% (p=0.000 n=10+10) ``` Fixes #10331
34980a3
to
a90b021
Compare
This is a simpler version of cosmos#16263 that only optimizes values that fit in 53 bits. It is possible to optimize values that fit in 64 bits by using big.Int.BitLen and math.Log2(10), but it doesn't seem woth the complexity, especially given the revert of cosmos#16263.
This is a simpler version of cosmos#16263 that only optimizes values that fit in 53 bits. It is possible to optimize values that fit in 64 bits by using big.Int.BitLen and math.Log2(10), but it doesn't seem worth the complexity, especially given the revert of cosmos#16263. I failed to beat big.Int.Marshal for values that don't fit in 64 bits.
This is a simpler version of cosmos#16263 that only optimizes values that fit in 53 bits. It is possible to optimize values that fit in 64 bits by using big.Int.BitLen and math.Log2(10), but it doesn't seem worth the complexity, especially given the revert of cosmos#16263. I failed to beat big.Int.Marshal for values that don't fit in 64 bits.
This change computes Int.Size() by checking bit lengths and translating those to base 10 values. This is instead of firstly invoking .Marshal to check the length, yet .MarshalTo requires invoking .Size() then .Marshal which is double work.
The results show improvements:
Fixes #10331