-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: notebook serialization for large notebooks blocks cell execution #214040
Comments
I played around with an incremental serialization that would allow for more responsiveness (draft PR), basically serializing the code cells one at a time and checking for cancellation in between and the results were not very promising - serialization takes around 4x as long and still has some pretty significant hiccups in execution time. @rebornix - do you have any ideas for improvement here? Perhaps something like a dedicated worker thread on the EH? |
This comment was marked as off-topic.
This comment was marked as off-topic.
@arman258 I created another issue for that since it doesn't have anything to do with serializing large notebooks. |
Hi @amunger, we are are facing this issue as well and are interested in helping out. When I open my debug VSCode instance, I can't really mimic the exact behaviour as I'm missing some extensions in that instance: Is it possible to install extensions in the local VSCode instance? Does that require something special? In my code, I added some more logs to view some timings: Although I suspect the problem is worse when the actually graphs are rendered. Any pointers on what might be interesting to investigate next? |
@nojaf - yes, you can download the necessary extension vsix from the marketplace website and just use "install from visx" in the extensions sidebar menu. I don't think it actually needs to display though, the performance hit is all within the serialization of the data, which is there whether or not it displays |
@amunger thanks for the pointer. Thinking out loud, would |
If it makes that much of a difference, then it's worth considering, but it would cause a lot of headaches for people that are using version control for their notebooks as all the whitespace would suddenly change. We would probably need it behind a setting if we went that direction. |
@amunger, I'm considering adding a separate timeout setting for |
Do you mean a setting that is specific to notebooks? |
Yes, exactly. |
@amunger, I hate to trouble you again, but is this additional setting worth pursuing? I'm ready to proceed with a PR, but I'd appreciate some confirmation that it's a reasonable addition. |
looking at where that solution would be implemented - I don't think we can cleanly split the delay value by which editor has the file open, and it's probably not worth adding that complexity into the core saving mechanism for this issue. |
@amunger, ok that sounds fair. Any other avenues to explore here? This is really a problem at our company and we actively want to contribute to fixing this. |
@nojaf same issue experienced here. Notebook size is considerably big, and auto-save is on. When executing simple print("hello world") it takes from 20 to 60 seconds, it gets executed successfully, however the execution time displayed is 0 seconds. Is there any update here on workarounds or potential fix? |
I'm trying something behind a feature flag in #226632 |
Steps to Reproduce:
print(1)
and run it a few times🐛 sometimes cell execution takes several seconds
serialization still blocks the extension host for the entire time, which means execution cannot happen
code for large outputs:
The text was updated successfully, but these errors were encountered: