Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallelise artifact collection #2456

Merged
merged 2 commits into from
Oct 30, 2023
Merged

Parallelise artifact collection #2456

merged 2 commits into from
Oct 30, 2023

Conversation

DrJosh9000
Copy link
Contributor

It's possible to configure Paths with a lot of separate globs. It's not unreasonable that each of these could be of the form **/something, and each one requires walking the entire directory tree. go-zglob is not terribly slow because it uses a parallel walker under the hood (fastwalk). While I think about how to speed up zzglob, this is a reasonable first step.

This also fixes a bug where wd is replaced with "/" (or filepath.VolumeName(absolutePath) + "/" on Windows) on the first glob path in the list that is is absolute, changing wd for later glob paths that might not be absolute.

@DrJosh9000 DrJosh9000 requested a review from a team October 25, 2023 05:34
@DrJosh9000 DrJosh9000 force-pushed the faster-artifact-search branch 3 times, most recently from 3d7ab00 to 274b1b5 Compare October 25, 2023 05:42
@DrJosh9000 DrJosh9000 enabled auto-merge October 25, 2023 07:26
@inez
Copy link

inez commented Oct 27, 2023

Very interested in giving it a try!

@DrJosh9000 DrJosh9000 disabled auto-merge October 29, 2023 22:51
@DrJosh9000 DrJosh9000 force-pushed the faster-artifact-search branch from 274b1b5 to f858cd9 Compare October 29, 2023 23:10
Copy link
Contributor

@moskyb moskyb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. happy to see us getting more chan-y with it, for a go project, the agent has surprisingly few goroutines.


// Start N workers in a new context
wctx, cancel := context.WithCancelCause(ctx)
for i := 0; i < runtime.NumCPU(); i++ {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is runtime.NumCPU() being used here as "an arbitrary smallish number to parallelise by, which will be roughly proportional to the size of the box we're running on"? that's a totally valid use case, i just wanna be sure i'm reading right

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It's not a great heuristic though. We could create N goroutines and let the runtime figure it out. The code would be simpler, and unlike a server, the amount of work is fixed ahead of time, so it's not unreasonable. Actually, that idea is growing on me...

@DrJosh9000 DrJosh9000 force-pushed the faster-artifact-search branch from d191162 to cf5edf7 Compare October 30, 2023 00:29
@DrJosh9000 DrJosh9000 force-pushed the faster-artifact-search branch from cf5edf7 to 333e16d Compare October 30, 2023 00:48
@DrJosh9000 DrJosh9000 enabled auto-merge October 30, 2023 00:50
@DrJosh9000 DrJosh9000 merged commit 6881356 into main Oct 30, 2023
@DrJosh9000 DrJosh9000 deleted the faster-artifact-search branch October 30, 2023 01:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants