Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GetModIfaceFromDisk has bad first call complexity #2304

Closed
pepeiborra opened this issue Oct 25, 2021 · 1 comment · Fixed by #2323
Closed

GetModIfaceFromDisk has bad first call complexity #2304

pepeiborra opened this issue Oct 25, 2021 · 1 comment · Fixed by #2323
Labels
performance Issues about memory consumption, responsiveness, etc. type: enhancement New feature or request

Comments

@pepeiborra
Copy link
Collaborator

pepeiborra commented Oct 25, 2021

The current implementation of GetModIface is O(# of transitive import statements), i.e. it does an amount of work proportional to the number of imports in the project. To see why, notice that to evaluate GetModIface M, that is to load the interface for a module M, we first need to load the interfaces of all its DD direct imports, which involves DD recursive calls to GetModIface. Transitively, this makes one GetModIface M call for every import in the module graph of M. This leads to bad startup performance in projects with a dense module graph since:

O(# of import statements) = O(edges in module graph) ==> O(modules^2).

Once GetModIface M has been called at least once the ongoing cost is O(direct imports), thanks to early cutoff, which is very good.

How can we improve the bad performance of the first call while preserving the good performance of the second call?

@pepeiborra
Copy link
Collaborator Author

pepeiborra commented Oct 25, 2021

An alternative implementation of GetModIface M with better first call complexity would go like this:

  1. Find the graph G of all the transitive imports
  2. Topologically sort G into a list of MM modules (the transitive imports)
  3. Load all the MM interfaces sequentially on a single HscEnv

In step 3 we avoid making any recursive calls to GetModIface and instead load all the interfaces "in place". This leads to O(transitive dependencies) complexity which is optimal and much better than the current O(# of transitive import statements).

The only problem is that, if we wanted to load all the modules in the project, the total cost would be O(modules^2) which is worse than the current one O(imports). Or put another way, the rebuild complexity would go from O(direct imports) to O(transitive imports).

Tradeoffs!

EDIT: The recursive calls to GetModIface would need to be replaced by calls to a hypothetical RefreshInterface build rule that blocks until the interface file on disk is fresh, regenerating it if needed. The tricky part of course is that, in order to check if the interface file is fresh with checkOldIface, we need to first load all the dependencies. So we end up in the same place where we started

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Issues about memory consumption, responsiveness, etc. type: enhancement New feature or request
Projects
None yet
2 participants