Change RemoteCache to use the same strategy as dynamic execution for downloading files #12454

Closed
coeuvre opened this issue Nov 11, 2020 · 2 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Remote-Exec Issues and PRs for the Execution (Remote) team type: feature request

Comments

@coeuvre
Member

coeuvre commented Nov 11, 2020

Currently, RemoteCache downloads the output files and directory trees of a remotely executed action to the local machine by writing a temp file next to each output, with a .tmp suffix.

It would be nice to change RemoteCache to use the strategy described by @jmmv here #11340 (comment) for downloading files.

@coeuvre coeuvre self-assigned this Nov 11, 2020
@coeuvre coeuvre added team-Remote-Exec Issues and PRs for the Execution (Remote) team P2 We'll consider working on this in future. (Assignee optional) labels Nov 11, 2020
@coeuvre
Member Author

coeuvre commented Nov 12, 2020

RemoteActionInputFetcher also needs to be changed to use this strategy in order to solve a race described here #11339 (comment).

bazel-io pushed a commit that referenced this issue May 9, 2022
When building with build without bytes and dynamic execution, we need to prefetch input files for local actions. Multiple local actions can share the same input files, so multiple call sites may share the same download instance. If a local action is cancelled (because the remote branch wins), the download it requested should be cancelled only if that download is not shared with another local action (or all the related local actions are cancelled).
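The cancellation rule above can be sketched with a simple reference count. This is a minimal illustration and not Bazel's implementation: `SharedDownload`, `Prefetcher`, `request`, and `cancel` are hypothetical names, and the actual transfer is left out so that only the sharing logic is shown.

```python
import threading
from concurrent.futures import Future


class SharedDownload:
    """One in-flight download that several local actions may wait on (hypothetical)."""

    def __init__(self, path):
        self.path = path
        self.future = Future()  # would be completed by whoever performs the transfer
        self.waiters = 0        # number of local actions still interested


class Prefetcher:
    """Hypothetical registry of in-flight downloads, keyed by input path."""

    def __init__(self):
        self._lock = threading.Lock()
        self._downloads = {}

    def request(self, path):
        """Join the in-flight download for `path`, or start tracking a new one."""
        with self._lock:
            download = self._downloads.get(path)
            if download is None:
                download = SharedDownload(path)
                self._downloads[path] = download
            download.waiters += 1
            return download

    def cancel(self, download):
        """Drop one action's interest; cancel the download only when nobody is left."""
        with self._lock:
            download.waiters -= 1
            if download.waiters > 0:
                return False  # still shared with another local action
            self._downloads.pop(download.path, None)
        download.future.cancel()
        return True
```

With two local actions requesting the same input, the first `cancel` call leaves the download running and only the second one actually cancels it.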

Before this change, inputs were written directly to their final destination. That is fine only if the prefetcher has no races or bugs, which is not the case: see #15010.

The root cause is that, when downloads are cancelled, the partially downloaded files on disk are sometimes not deleted.

By making the prefetcher download inputs to a temporary path first, we can:
  1. Mitigate the race: only the final move step can potentially cause the race condition.
  2. Provide a way to observe the race: if there is no race, all temporary files should be either moved or deleted. But when running with this change, many temporary files still exist.
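The two points above can be illustrated with a small sketch. It is not the code from the commit: `fetch_to` and `leftover_temp_files` are hypothetical helpers, the download is modelled as an iterable of byte chunks, and the temporary directory is assumed to be on the same filesystem as the final path so that `os.replace` is a single rename.

```python
import os
import tempfile


def fetch_to(final_path, chunks, temp_dir):
    """Write `chunks` to a temp file in `temp_dir`, then move it to `final_path`."""
    fd, temp_path = tempfile.mkstemp(dir=temp_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as out:
            for chunk in chunks:
                out.write(chunk)
        # The move is the only step that touches the final destination.
        os.replace(temp_path, final_path)
    except BaseException:
        # Failed or interrupted: delete the partial file so it does not leak.
        if os.path.exists(temp_path):
            os.remove(temp_path)
        raise


def leftover_temp_files(temp_dir):
    """Any file still listed here was neither moved nor deleted."""
    return sorted(os.listdir(temp_dir))
```

A non-empty result from `leftover_temp_files` after all downloads have finished or been cancelled indicates that some cleanup path was skipped, which is the kind of evidence the second point describes.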

Working towards #12454.

PiperOrigin-RevId: 447473693
@coeuvre
Member Author

coeuvre commented Sep 5, 2022

Fixed by bce7db0.

@coeuvre coeuvre closed this as completed Sep 5, 2022