Reduce import time #438
Comments
I agree it's a pain. I'll check with @jeromekelleher and @benjeffery if there is a sensible solution in this case. |
One simple thing is to turn caching on/off using an environment variable. So you turn on caching for dev, and it's off by default: https://github.com/sgkit-dev/sgkit/blob/main/sgkit/accelerate.py Would this help? |
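A minimal sketch of what such an opt-in switch might look like, loosely following the idea in the linked sgkit module. The variable name `TSDATE_ENABLE_NUMBA_CACHE` and the helper names are hypothetical, chosen only for illustration; caching stays off unless the variable is set to `1`.

```python
import os

# Hypothetical variable name, used only in this sketch.
CACHE_ENV_VAR = "TSDATE_ENABLE_NUMBA_CACHE"


def cache_enabled(environ=os.environ):
    """Return True only when the variable is explicitly set to "1"."""
    value = environ.get(CACHE_ENV_VAR, "0")
    if value not in ("0", "1"):
        raise ValueError(f"{CACHE_ENV_VAR} must be '0' or '1', got {value!r}")
    return value == "1"


def numba_jit(*args, **kwargs):
    """Wrap numba.jit, filling in nopython and cache defaults."""
    # Imported here so the flag parsing above has no hard numba dependency.
    from numba import jit

    defaults = {"nopython": True, "cache": cache_enabled()}
    defaults.update(kwargs)  # explicit keyword arguments win over defaults
    return jit(*args, **defaults)
```

Under these assumptions, setting the variable to `1` in a development shell would turn caching on for every function decorated through the wrapper, while a default install would leave it off.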
That sounds like a great suggestion. Thanks @jeromekelleher |
That's a good suggestion @jeromekelleher and should be fine for development, thanks. I'm also pondering moving some of the core routines into a rust extension, in the long term. |
eek! |
It's not that bad, I hope! What I'm trying to say is I'd like to move some of the low-level EP stuff into a compiled language extension, to save on import time, as this is just going to keep creeping up. Rust is great for numerical stuff and plugs in really nicely to Python. (One of the motivations here is that we're using tsdate in simulation-based-inference contexts where we generate a bunch of simulations in parallel to train a neural network-- waiting >60 sec for import when running tsdate from the CLI is really a huge time sink in this application). |
Do you want to implement the environment variable caching, or do you want me to, @nspope ? |
Should definitely make caching an option for people who know what they're doing, 60s is horrendous. If you paste in a profile I can comment on some possible ameliorations - that sounds much worse than it should be |
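One way to produce such a profile, as a rough sketch, is CPython's `-X importtime` option, which writes per-module import timings to stderr. The helper name `import_time_report` is an assumption of this sketch; any compilation that runs while a module is being imported is counted in that module's time.

```python
import subprocess
import sys


def import_time_report(module, top=10):
    """Return up to `top` (cumulative_us, name) pairs, slowest imports first."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module}"],
        capture_output=True,
        text=True,
        check=True,
    )
    prefix = "import time:"
    rows = []
    for line in proc.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        # Each data line reads: self [us] | cumulative [us] | module name
        fields = [f.strip() for f in line[len(prefix):].split("|")]
        if len(fields) != 3 or not fields[1].isdigit():
            continue  # skip the column header line
        rows.append((int(fields[1]), fields[2]))
    rows.sort(reverse=True)
    return rows[:top]


if __name__ == "__main__":
    # "json" is only a stand-in; substitute the package being investigated.
    for cumulative_us, name in import_time_report("json"):
        print(f"{cumulative_us:>10} us  {name}")
```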
I'll play around with it a bit then ping you both. It's annoying but not super urgent. |
Copied from SGKIT. Fixes tskit-dev#438
I tried the approach used in sgkit, in #441, and am getting lots of messages like this:
I get this for the following 27 functions:
If I only cache the |
Yeah, I ran into the same issue-- it turns out all the EP stuff can't be cached (and that's the bulk of the compilation). Although I could probably drop the reliance on Scipy's Cython library (which is the root of the problem), I think the better way to go is to pull the EP stuff into a low-level extension. |
I think numba/numba#6972 seems relevant. For instance, this comment:
|
Hm ... it's not going to be possible to refactor things so as to avoid calling njit'd functions inside other njit'd functions (nor do I want to do this). |
Would doing something like:
from numba import njit, types

fntype = types.FunctionType(types.void())

@njit(cache=True)
def bar():
    pass

@njit(types.void(fntype), cache=True)
def foo(func):
    func()

foo(bar)
be a simple refactor that would enable caching? |
Thanks @benjeffery -- I'm not quite understanding, would this work because the signature is explicit? If so, that's already the case in tsdate (we're using all explicit signatures) and the reason it won't cache is because I'm using some ctypes globals to interface with scipy's special functions library here: Lines 44 to 68 in 3c08bec
which results in
though now that I look at it carefully, it might be possible to avoid using scipy altogether here depending on what parts of |
The trick isn't the explicit signature, but passing the global as an argument to the function:
The issue here is that numba needs to know the exact type signature for all arguments to a function. Using a global makes it an implicit argument of the function (technically a "cell" object in the function's closure). But unlike normal arguments, it's complicated for numba to watch for changes to the global's type. Passing the global down from the nearest caller that is not numba jit'ed should enable caching. |
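The shape of that refactor might look roughly like the sketch below. It is not tsdate's code: `LOG_EPS` is a plain float standing in for the module-level ctypes object, all function names are hypothetical, and `cache=True` is left out so the snippet runs from any context (numba needs a source file on disk to locate its cache). The fallback `njit` exists only so the sketch also runs where numba is not installed.

```python
try:
    from numba import njit
except ImportError:
    def njit(*args, **kwargs):
        """No-op stand-in so the sketch still runs without numba."""
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func

# Stand-in for a module-level object (in the thread, a ctypes pointer).
LOG_EPS = -36.0


@njit
def clamp_global(x):
    # LOG_EPS is read from module scope: an implicit input that never
    # appears in the function's signature.
    return max(x, LOG_EPS)


@njit
def clamp_arg(x, log_eps):
    # The same logic, with the former global as an explicit, typed argument.
    return max(x, log_eps)


def run(x):
    # The nearest non-jitted caller passes the module-level object down.
    return clamp_arg(x, LOG_EPS)
```

Whether this is enough to make a particular function cacheable depends on the type of the object being passed, so it would need checking against the real ctypes globals.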
Yeah, globals are to be avoided with numba for a few reasons. |
Got it-- thanks! |
The import time (due to numba AOT compilation) is driving me nuts-- I wonder if there's a way we can build an extension optionally on install (e.g. triggered by a flag to pip). Jerome and others have had bad experiences with numba extensions, but 30 to 60 seconds for import really slows down the debug cycle, and if extension-building is turned off by default I don't think it'd be an issue.