Performance improvements for PointVector, RayVector, SubObjectIterator #3369
Conversation
Thanks for working on this. This does seem to introduce a few new invalidations; can we do something about those? I am a bit unsure which of these are really needed and why. I tried this in the shell on master, without the changes here, and it seems that mostly the hint for `size` is important for this speed improvement:

```julia
julia> v = PointVector{ZZRingElem}(matrix(ZZ, [1 2 3 ; ]));

julia> @btime prod(v)
  687.593 ns (21 allocations: 336 bytes)
6

julia> Base.size(po::PointVector) = (size(po.p, 2)::Int,)

julia> @btime prod(v)
  118.364 ns (5 allocations: 80 bytes)
6
```

A bit confused why this type cannot be inferred automatically; we just use the `size` for `MatrixElem`.

PS: What Julia version / what machine is that? It seems a lot slower than on my laptop.
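The inference question can be checked directly in a session. A minimal sketch (assuming Oscar is loaded and `v` is the `PointVector` from the snippet above):

```julia
# Inspect what the compiler infers for indexing into v.
# @code_warntype highlights non-concrete results in the REPL;
# Base.return_types gives the same information programmatically.
julia> @code_warntype v[1]

julia> Base.return_types(getindex, Tuple{typeof(v), Int})
# A concrete element type (ZZRingElem) here means getindex is
# type-stable; Any or an abstract type means inference needs help.
```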
force-pushed from 0c73435 to 74fc037
Add type assertions to their `getindex` methods to improve type stability. Similarly ensure that the return value of `size` is recognized as `Tuple{Int}`.

Silly microbenchmark, before:

```julia
julia> v = PointVector{ZZRingElem}(matrix(ZZ, [1 2 3 ; ]));

julia> @btime prod(v)
  1.217 μs (21 allocations: 336 bytes)
```

After:

```julia
julia> @btime prod(v);
  288.449 ns (5 allocations: 80 bytes)
```
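The changes described in the commit message can be sketched roughly as follows. This is a simplified illustration, not the actual PR diff; the field name `p`, the supertype, and the exact method signatures are assumptions, and `MatElem` comes from AbstractAlgebra (loaded via Oscar):

```julia
# Sketch: PointVector wraps a 1×n matrix whose field `p` only has the
# abstract type MatElem, so the compiler cannot infer the return types
# of operations on it without help.
struct PointVector{U} <: AbstractVector{U}
    p::MatElem
end

# Assert the element type so indexing is type-stable:
Base.getindex(po::PointVector{U}, i::Int) where {U} = po.p[1, i]::U

# Assert that the column count is an Int so size infers as Tuple{Int}:
Base.size(po::PointVector) = (size(po.p, 2)::Int,)
```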
force-pushed from 74fc037 to bda943d
thanks!
Thank you for the feedback, Benjamin!
I think they were a fluke (?) or caused by an update of a dependency; at least now this seems to be gone? (I pushed an update to this branch earlier)
Indeed, my mistake -- I overlooked the inheritance and just added it -- should have checked whether it really has an effect! Anyway I removed those now.
Yes, that's the main part; but together they still are better than just the `size` change. The code where this came up initially used a loop instead of `prod`. Anyway, here is a new micro benchmark using a function:

```julia
function f(v)
  x = v[1]
  for i in 2:length(v)
    x *= v[i]
  end
  x
end
```

And now some measurements:

Before this PR:

With just the `size` change:

With just the `getindex` change:

With the full PR:

So the two changes complement each other.

Because the member `p` only has the abstract type `MatElem`, Julia resolves `size` via the generic method for `MatrixElem`, but it can't infer the return type of `nrows`. That suggests two things to me: the type assertions are a cheap local fix here, and the inference of `nrows` for abstract matrix types could be improved separately.

So hopefully the assertions are acceptable for now.
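The inference problem caused by an abstractly typed field can be reproduced with plain Base types. A minimal sketch (the `Wrapped` type and function names are made up for illustration):

```julia
# A struct whose field only has an abstract type: the compiler cannot
# know which concrete matrix type `p` holds, so the result of indexing
# into it widens to Any.
struct Wrapped
    p::AbstractMatrix   # abstract field type, element type unknown
end

unasserted(w::Wrapped) = w.p[1, 1]        # inferred as Any
asserted(w::Wrapped)   = w.p[1, 1]::Int   # assertion pins a concrete type

# Base.return_types(unasserted, Tuple{Wrapped})  # -> [Any]
# Base.return_types(asserted,   Tuple{Wrapped})  # -> [Int]
```

The same pattern is what the type assertions in this PR apply to the `MatElem`-typed field.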
Julia 1.10.1 on my MacBook Pro M1 Max.
Timings with Julia 1.10.1 on an AMD EPYC 9554 64-Core Processor (which, despite the high core count, is currently the fastest machine I have access to in terms of single-core performance):

Before this PR:

With this PR:
Thanks for the explanations! It makes sense to do this here, it doesn't hurt, and we can try to fix `nrows` later. The loop and `prod` are now equally fast for me:

```julia
julia> @btime f(v);
  101.805 ns (5 allocations: 80 bytes)

julia> @btime prod(v);
  102.918 ns (5 allocations: 80 bytes)
```

This microbenchmark seems surprisingly slow on the M1; maybe Julia is missing some optimization there.
Codecov Report

Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##           master    #3369      +/-   ##
==========================================
- Coverage   82.09%   82.07%   -0.02%
==========================================
  Files         556      556
  Lines       74065    74218     +153
==========================================
+ Hits        60800    60914     +114
- Misses      13265    13304      +39
```
### Backported PRs

- [x] #3367
- [x] #3379
- [x] #3369
- [x] #3291
- [x] #3325
- [x] #3350
- [x] #3351
- [x] #3365
- [x] #3366
- [x] #3382
- [x] #3373
- [x] #3341
- [x] #3346
- [x] #3381
- [x] #3385
- [x] #3387
- [x] #3398
- [x] #3399
- [x] #3374
- [x] #3406
- [x] #2823
- [x] #3298
- [x] #3386
- [x] #3412
- [x] #3392
- [x] #3415
- [x] #3394
- [x] #3391