exp run --pull: keep going even if pull fails #9470

Closed
dberenbaum opened this issue May 17, 2023 · 8 comments · Fixed by #9498
Assignees
Labels
A: pipelines Related to the pipelines feature p1-important Important, aka current backlog of things to do

Comments

@dberenbaum
Collaborator

See the conversation in #9434 (comment).

If dvc exp run --pull fails to pull anything from the cache or run-cache, the command will fail immediately. Instead, it should only show a warning and try to do as much as possible.
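The requested behavior (warn and keep going) can be illustrated with a minimal sketch. This is not DVC's code: `PullError`, `pull_best_effort`, and the two step functions are hypothetical names used only to show the control flow.

```python
import logging

logger = logging.getLogger("pull_sketch")


class PullError(Exception):
    """Hypothetical stand-in for whatever error a failed pull raises."""


def pull_best_effort(steps):
    """Run each (name, callable) pull step in order.

    A failing step is logged as a warning instead of aborting the run.
    Returns the names of the steps that failed.
    """
    failed = []
    for name, step in steps:
        try:
            step()
        except PullError as exc:
            logger.warning("Failed to pull %s: %s; continuing.", name, exc)
            failed.append(name)
    return failed


pulled = []


def pull_run_cache():
    # Simulates a remote where the run cache cannot be pulled.
    raise PullError("run cache is unsupported on this remote")


def pull_data():
    # Simulates a pull step that succeeds.
    pulled.append("data")


failed_steps = pull_best_effort(
    [("run-cache", pull_run_cache), ("cache", pull_data)]
)
```

In this sketch the failing run-cache step does not prevent the data step from running; the caller can still inspect `failed_steps` to decide what to report.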

@dberenbaum dberenbaum added A: pipelines Related to the pipelines feature p1-important Important, aka current backlog of things to do p2-medium Medium priority, should be done, but less important and removed p1-important Important, aka current backlog of things to do p2-medium Medium priority, should be done, but less important labels May 17, 2023
@dberenbaum dberenbaum added this to DVC May 23, 2023
@github-project-automation github-project-automation bot moved this to Backlog in DVC May 23, 2023
@dberenbaum
Collaborator Author

Moving this back to p1.

It makes using --pull with any demo project or any project that uses an http filesystem impossible because the run cache is unsupported for http. Even dvc repro --pull --no-run-cache fails.

@daavoo
Contributor

daavoo commented May 23, 2023

I am guessing the answer might be "there is no longer a reason" but why is run cache not supported in HTTP filesystem?

> It makes using --pull with any demo project or any project that uses an http filesystem impossible because the run cache is unsupported for http. Even dvc repro --pull --no-run-cache fails.

Do you think we should skip or instead respect --no-run-cache? The latter sounds better to me TBH

@daavoo daavoo self-assigned this May 23, 2023
@daavoo daavoo moved this from Backlog to In Progress in DVC May 23, 2023
@daavoo daavoo linked a pull request May 23, 2023 that will close this issue
daavoo added a commit that referenced this issue May 23, 2023
@dberenbaum
Collaborator Author

> I am guessing the answer might be "there is no longer a reason" but why is run cache not supported in HTTP filesystem?

Good question. We could look into supporting it. No idea why it's not supported or what level of effort it requires. Seems odd that the run cache would only work on some filesystems, since I can't think of any particular nuances that wouldn't also apply to the cache.

> Do you think we should skip or instead respect --no-run-cache? The latter sounds better to me TBH

We should respect --no-run-cache and not try to pull the run-cache if this flag is included (it's also not supported in dvc exp run; see #9370).


Good points @daavoo. Agreed that they seem like better solutions and we can keep failing otherwise.
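A minimal sketch of what respecting the flag could look like, assuming the pull is split into a run-cache step and a regular cache step. The function `plan_pull` and its parameters are hypothetical and do not come from the DVC codebase.

```python
def plan_pull(pull, run_cache=True):
    """Return the ordered list of pull steps to attempt.

    pull:      True when --pull was passed (hypothetical parameter).
    run_cache: False when --no-run-cache was passed (hypothetical parameter).
    """
    if not pull:
        return []
    steps = []
    if run_cache:
        # Only touch the run cache when the user has not opted out.
        steps.append("run-cache")
    steps.append("cache")
    return steps


with_flag = plan_pull(pull=True, run_cache=False)
without_flag = plan_pull(pull=True)
```

With `--no-run-cache` the run-cache step is never attempted, so a remote that cannot serve the run cache has nothing to fail on; without the flag, failures keep their current behavior.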

@skshetry
Member

skshetry commented May 24, 2023

The run cache is structured as follows:

/runs/A[0:2]/A/B

e.g.:

/runs/87/8700c814f9a9d515cd669620818ff8081f3799608b27bedd164ac25a2abe3d50/636eb945f35d36ce1ae3666765ee3f14811dce05383e255aa4d6b1de76cd9095

where A is the hash of the current input state of the stage and B is the resulting output state that was generated and cached before. The same input state can have multiple output states depending on the reproducibility of the stage.

Each run cache file corresponds to a single run of the stage at some point in DVC. DVC will use the first output state that it finds.

So, DVC knows the A hash but does not know which B hash to load, so it has to list the directory to find one of the B hashes to load the run cache.

There is no way to do that in http, unless you have direct url to the resource, which we don't have in this case.
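The lookup described above can be sketched against a local directory. This is an illustration only, not DVC's implementation: `run_cache_dir` and `find_output_state` are hypothetical helpers, and sorting the listing to pick "the first" entry is an assumption made here for determinism.

```python
import os
import tempfile


def run_cache_dir(root, input_hash):
    """Build <root>/runs/A[0:2]/A for a known input-state hash A."""
    return os.path.join(root, "runs", input_hash[:2], input_hash)


def find_output_state(root, input_hash):
    """Return one B hash stored under A, or None if nothing is cached.

    The directory has to be listed because B is not known in advance.
    """
    directory = run_cache_dir(root, input_hash)
    try:
        candidates = sorted(os.listdir(directory))
    except FileNotFoundError:
        return None
    return candidates[0] if candidates else None


# Recreate the example layout from the comment in a temporary directory.
A = "8700c814f9a9d515cd669620818ff8081f3799608b27bedd164ac25a2abe3d50"
B = "636eb945f35d36ce1ae3666765ee3f14811dce05383e255aa4d6b1de76cd9095"
root = tempfile.mkdtemp()
os.makedirs(run_cache_dir(root, A))
with open(os.path.join(run_cache_dir(root, A), B), "w") as fobj:
    fobj.write("")

found = find_output_state(root, A)
```

The `os.listdir` call is the step that has no equivalent over plain HTTP: a client can fetch a known URL but cannot ask the server which files exist under a prefix.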

@daavoo
Contributor

daavoo commented May 24, 2023

Why don't we pull the entire runs directory?

@skshetry
Member

How do we do that in http? That is exactly the issue here.

@dberenbaum
Collaborator Author

@skshetry So I guess we have the same problem of not being able to list everything with dvc ls/ls-url and http dirs right?

@skshetry
Member

dvc ls is about listing a DVC repository, and that works. ls-url does not work: it won't print anything beyond the URL that you provided.

fsspec does support parsing an HTML document for links, but we have not added support for that.
And I am not sure we should, since not all servers support indexes.
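For illustration, the link-parsing approach could look roughly like the sketch below. It uses only the Python standard library and is not fsspec's implementation; `LinkCollector`, `list_links`, and the sample index page are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect absolute URLs from <a href="..."> tags in an index page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the directory URL.
                self.links.append(urljoin(self.base_url, value))


def list_links(base_url, html_text):
    """Return every link found in an HTML directory index."""
    parser = LinkCollector(base_url)
    parser.feed(html_text)
    return parser.links


# A made-up index page, as a server with directory indexes might return.
sample_index = (
    '<html><body><a href="aa/">aa/</a> '
    '<a href="file.txt">file.txt</a></body></html>'
)
links = list_links("https://example.com/runs/", sample_index)
```

This only works when the server actually returns an index page for the directory URL, which is the limitation raised above.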

@github-project-automation github-project-automation bot moved this from In Progress to Done in DVC May 25, 2023
daavoo added a commit that referenced this issue May 25, 2023
mergify bot pushed a commit that referenced this issue May 25, 2023
Closes #9470

(cherry picked from commit e98bf38)

# Conflicts:
#	dvc/repo/reproduce.py
daavoo added a commit that referenced this issue May 29, 2023
Closes #9470

(cherry picked from commit e98bf38)
daavoo added a commit that referenced this issue May 29, 2023
Closes #9470

(cherry picked from commit e98bf38)