-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Using GAP.jl makes julia slow (unusable?) #266
Comments
I can confirm the behavior with |
@thofma thanks for the report. That is of course unacceptable! @rbehrends can you look into this? We first need to establish the exact cause of this. Of course the immediate suspect is the stack scanning. Which makes me wonder if we can't implement one immediate workaround for the case were GAP is launched from Julia (the one that is relevant here): Namely, if we are not inside GAP code, we don't need to scan the code. Of course this needs to take nesting into account... Luckily we already have added some provisions for tracking whether GAP is "active" or not in GAP's Assuming stack scanning is the main problem, this should immediately help this particular case. Of course we'd still have a performance issue when GAP code is running; for that, we can also make use of the mentioned libgap-api code, to restrict the range of the stack we need to scan, though, which should also help there. Of course that won't help if GAP is the main process and it starts Julia, but that is less important for OSCAR (though of course it'd be nice to speed that up, too) |
Just to make it explicit, I am talking about |
I can kind of reproduce it, though in my case I "only" have a ~33% overhead. This may be due to architecture-specific differences, though. The culprit is Normally, scanning task stacks should not be that much of a problem. For starters, due to generational GC, unused task stacks should not be scanned every single time (or Julia would have a problem with highly recursive functions on those, too). Even so, I'm not sure why there's such an inordinate amount of time spent scanning these; we do scan the main stack each time without ill effects. I need to dig further into that to find out the actual underlying cause. (Architectural differences may be involved as macOS and Linux use slightly different task stack implementations, which could account for the differences in overhead that we are seeing.) |
The biggest issue right now seems to be that task stacks seem to be scanned every minor collection, regardless of whether they've been touched in the meantime, rather than just every major collection. This may be an issue for Julia with large task stacks, too, though I'll have to investigate that more closely. |
Which makes me wonder: did we ever test what happens if GAP is called from a task? Is this even supposed to work right now? If so: we definitely should add test cases for this |
GAP is currently being called from a task when loaded as a Julia module, as far as I can tell. And yes, it is supposed to work; we have specific logic in Which doesn't mean that some other things can't go wrong, but in principle that use case is accounted for. |
I've committed a quick fix here. I'm not yet positive that it handles the generational issues cleanly (and may require more tweaking for that), so I'm not yet going to turn it into a PR, but it does fix the performance regression for me. |
I've updated the fix (same branch as above) to be more robust. However, in its current version it is implemented in C++, and we need to think about whether that is okay for GAP (other parts of GAP do use C++ already, but I'm not sure what the constraints are) and whether we need more testing for it due to some non-trivial changes. However, for now it should work for anybody who cannot wait for a patch to arrive officially. |
@thofma @mohamed-barakat The fix has now been merged into the GAP master branch. Would be great if you were able to verify that it fixes the regression for you. Then we can get it into GAP 4.11, which I hope we'll then be able to release in August. |
With the fix in place, I am now getting this:
So there still is a slow down, but rather minor. It would still be good to know how e.g. those tests @thofma mentions (where GC takes 99% of the time) now fare. |
Hmm, this is a bigger slowdown than I experienced when I last tested it (where it was difficult to even reproduce a slowdown). I'll have to look at why there may still be residual issues. |
It is better. (I can't find the example with the 99% anymore). |
Works for me with no measurable regression. |
@mohamed-barakat so you are saying that you get identical benchmark results before and after @rbehrends can you reproduce the remaining slowdown I reported? (I had it on two machines, both on my MacBook and also on a Ubuntu server) |
I'm getting completely different results (though my version of Julia is a couple of months old). I'm wondering if there are different versions of GMP involved? Anyhow, here is what I'm getting:
This is a pretty fast machine, so I'm not sure why it would take so much longer per run. It may also mean that significant relative differences in GC performance are hidden because most of the time is spent doing calculations. I'll finally add that even with generational GC, part of the remaining performance difference may simply be that we're dealing with a significantly bigger heap. |
Yes, but maybe my tests do not test the GC heavily, at least, I was unable to measure regression. I'll try to make these test part of the Oscar CI. |
After looking at it in a profiler, I primarily see a bit more time being spent in |
@rbehrends can this issue be closed now? |
@fingolfin Yes, I believe we are good. |
Pretty cool that GAP.jl is working! Unfortunately we were seeing severe performance regressions after doing
using GAP
(although the code did not use any GAP functionality). We boiled it down to the following example. This is with a fresh install ofGAP.jl
andGAPTypes.jl
:So a slowdown by a factor of
3x
by just doingusing GAP
. I have other examples where, afterusing GAP
, a function spends 99% of its time in the GC. I don't know if we can use GAP.jl in this form.@CarloSircana @fieker
The text was updated successfully, but these errors were encountered: