Improve syft performance by memoizing filetree #1480
Comments
nice change regarding:
I have an attempt at that in #1463 (not a maintainer).
Thanks for your comment, and nice PR! I think doublestar search is very costly. In fact, I think a search like And If I tried
We are addressing this right now in #1510.
Though #1510 will address the symptom (a slower cataloging process), it won't do so by memoizing Normalize() calls. Instead, we looked at the most common calls and added indexes to the underlying data structure, which are consulted before searching the tree.
What would you like to be added:
To speed up syft packages processing, it would be effective to memoize file.Path.Normalize() in stereoscope/filetree.
Why is this needed:
I think the stereoscope filetree used by syft is wasting CPU resources.
Additional context:
Profiling syft packages dir:/ shows that path.(*lazybuf).append() and path.Clean() consume more than 20% of CPU time at runtime.
I profiled stereoscope and found that the node() function in its filetree is called far too many times:
func (t *FileTree) node(p file.Path, strategy linkResolutionStrategy) (*filenode.FileNode, error)
In my case, on a system with 55148 paths, node() was called 58,882,000 times. That is roughly 1067 calls per path on average.
In fact, file.Path.Normalize() only needs to run once for any given path. I tried a simple memoization of file.Path.Normalize(), after which path.(*lazybuf).append() was no longer at the top of the ranking.
In this case, the execution time dropped from 385 seconds to 257 seconds, a reduction of about 33%.
Fundamentally, the filetree processing itself should be improved, but since this change improves performance by 10% to 20% on average, it would be a good idea to include it as a workaround first.
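To illustrate the idea, here is a minimal sketch of memoizing path normalization. It is not the actual stereoscope code or my exact patch: Path, normalizeCache, and the misses counter are hypothetical names, and path.Clean() stands in for whatever work file.Path.Normalize() really does.

```go
package main

import (
	"fmt"
	"path"
	"sync"
)

// Path is a hypothetical stand-in for stereoscope's file.Path.
type Path string

// normalizeCache remembers the normalized form of every path it has seen,
// so the expensive cleanup runs only once per distinct input.
type normalizeCache struct {
	mu     sync.Mutex
	seen   map[Path]Path
	misses int // number of times the real normalization had to run
}

func newNormalizeCache() *normalizeCache {
	return &normalizeCache{seen: make(map[Path]Path)}
}

// Normalize returns the cached result when available. Otherwise it computes
// the result with path.Clean (an assumption standing in for the real
// normalization logic) and stores it for later calls.
func (c *normalizeCache) Normalize(p Path) Path {
	c.mu.Lock()
	defer c.mu.Unlock()

	if cleaned, ok := c.seen[p]; ok {
		return cleaned
	}
	cleaned := Path(path.Clean(string(p)))
	c.seen[p] = cleaned
	c.misses++
	return cleaned
}

func main() {
	cache := newNormalizeCache()

	// Simulate node() being invoked many times for the same path.
	var last Path
	for i := 0; i < 1000; i++ {
		last = cache.Normalize("/usr//lib/../bin/")
	}
	fmt.Println(last, cache.misses)
}
```

The trade-off is memory: the map grows with the number of distinct paths, which in my case would be on the order of tens of thousands of entries. A single mutex keeps the sketch simple; whether lock contention matters would depend on how concurrently the tree is queried.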