Slow loading of the pipe scispacy_linker
#402
Comments
Do you need the full linker for the test? If not, you could always make up a "test" version of it with a much smaller index, since the index is what takes the most time to load.
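One way to get a smaller index for tests is to cut the knowledge base down before indexing it. A minimal sketch, assuming the knowledge base is stored as a JSON Lines file with one concept per line; the `truncate_kb` helper and the file layout are hypothetical and not part of scispacy:

```python
import json
from pathlib import Path


def truncate_kb(src: Path, dst: Path, max_concepts: int) -> int:
    """Copy the first `max_concepts` non-empty JSON Lines records from src to dst.

    Returns the number of records written.
    """
    written = 0
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            if written >= max_concepts:
                break
            if not line.strip():
                continue  # skip blank lines
            json.loads(line)  # fail early on a malformed record
            fout.write(line if line.endswith("\n") else line + "\n")
            written += 1
    return written
```

The truncated file would still need its own index built from it, and how to do that depends on the scispacy version in use. Note that a linker built this way only knows the retained concepts, so it suits tests of the plumbing rather than tests of linking quality.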
Hi @MichalMalyska, thank you for your reply! Ideally we want to test using the same model. Is there any computation that happens during loading that we could cache? Or is the duration simply caused by loading the weights?
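Even if nothing inside the loading step can be cached, the loaded pipeline itself can be reused so that the cost is paid once per test session instead of once per test. A minimal sketch: `get_pipeline` and `load_umls_pipeline` are hypothetical helpers, and the model name and `add_pipe` configuration are assumptions based on the scispacy README, not something confirmed in this thread:

```python
_PIPELINES = {}


def get_pipeline(key, loader):
    """Return the pipeline stored under `key`, calling `loader()` only on first use."""
    if key not in _PIPELINES:
        _PIPELINES[key] = loader()
    return _PIPELINES[key]


def load_umls_pipeline():
    """Load a spaCy model and attach the UMLS linker (assumed configuration)."""
    # Imported lazily so that collecting tests does not import spaCy.
    import spacy
    from scispacy.linking import EntityLinker  # noqa: F401  (registers "scispacy_linker")

    nlp = spacy.load("en_core_sci_sm")  # assumed model name
    nlp.add_pipe(
        "scispacy_linker",
        config={"resolve_abbreviations": True, "linker_name": "umls"},
    )
    return nlp


# In a test: nlp = get_pipeline("umls", load_umls_pipeline)
```

This does not make the first load any faster; it only avoids repeating it within one Python process.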
I think it is just loading weights. The UMLS index is quite beefy from what I remember (~2 GB or so).
It should only be really slow the first time, because it needs to download some large files, including the UMLS index. These files are then cached, and subsequent calls should be fast. Is this not what you are experiencing?
I think for me it's usually ~15-20 seconds to load it in, but this is on a laptop.
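To compare load times across machines on equal footing, the load can be wrapped in a small timing and profiling helper. A minimal sketch using only the standard library; `profile_call` is a hypothetical name, and the commented usage assumes a model called `en_core_sci_sm` is installed:

```python
import cProfile
import io
import pstats
import time


def profile_call(fn, top=10):
    """Run fn() under cProfile and return (result, elapsed_seconds, report_text)."""
    profiler = cProfile.Profile()
    start = time.perf_counter()
    result = profiler.runcall(fn)
    elapsed = time.perf_counter() - start
    buffer = io.StringIO()
    # Sort by cumulative time so the slowest call chains appear first.
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(top)
    return result, elapsed, buffer.getvalue()


# Example usage (requires spaCy and a model to be installed):
#   import spacy
#   nlp, seconds, report = profile_call(lambda: spacy.load("en_core_sci_sm"))
#   print(f"loaded in {seconds:.1f}s")
#   print(report)
```

The elapsed time includes profiler overhead, so it will read somewhat higher than an unprofiled load.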
Hi @dakinggg, the files are indeed cached, so it is simply about loading the UMLS index. The profiler shows that most of the time is spent decoding
I am wondering if there is a more efficient way to store, load and query the data. Furthermore, the current solution is very memory intensive (RAM usage spikes at 8 GB when running the above example). Two ideas for improvement are:
Those are only suggestions, as I don't know enough about the inner workings of
@vlievin are you aware of how aliasing is supported in faiss?
Hi @MichalMalyska, I am only getting started with faiss, so unfortunately I don't know about aliasing in faiss. But if I get a definite answer in the near future, I'll let you know here. I am not an expert with nmsn either. So please take these suggestions for what they are: ideas and not recommendations.
I'll add that, for me at least, it redownloads some of the files every single time. Check whether it works for you offline; if not, then that's part of it! #535
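Besides testing offline, re-downloads can be spotted by comparing the local cache before and after a load. A minimal sketch, assuming scispacy keeps its downloads under `~/.scispacy` unless the `SCISPACY_CACHE` environment variable points elsewhere (worth verifying against the installed version); `cache_listing` is a hypothetical helper:

```python
import os
from pathlib import Path


def cache_listing(root=None):
    """Return {relative_path: (size_bytes, mtime)} for every file under the cache root."""
    if root is None:
        # Assumed default location; check your scispacy version.
        root = os.environ.get("SCISPACY_CACHE", Path.home() / ".scispacy")
    root = Path(root)
    if not root.exists():
        return {}
    listing = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            stat = path.stat()
            listing[str(path.relative_to(root))] = (stat.st_size, stat.st_mtime)
    return listing
```

Calling `cache_listing()` before and after loading the linker and comparing the two dictionaries shows which files, if any, were rewritten: an entry whose modification time changed was most likely fetched again.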
Hi, loading a UMLS linker is particularly slow (~20-30s). It is a real issue when testing the code. I reported the profiler output below. Is there anything we can do to speed up the loading of the linker?
Profiler output
Code to reproduce the above results