pull: misleading/unexpected warning #6105

Closed
shcheklein opened this issue Jun 3, 2021 · 5 comments
Labels
bug Did we break something? p2-medium Medium priority, should be done, but less important

Comments

@shcheklein
Member

Bug Report

Description

dvc pull complains that some cache files are missing (both locally and on the remote), but then downloads them successfully.

(.env) √ Projects/get-started-experiments % dvc pull
WARNING: Some of the cache files do not exist neither locally nor on remote. Missing cache files:
name: 8a31b1338b609cf1133d1df639de606c.dir, md5: 8a31b1338b609cf1133d1df639de606c.dir
A       data/fashion-mnist/prepared/
A       data/fashion-mnist/preprocessed/
A       models/fashion-mnist/model.h5
A       data/fashion-mnist/raw/
4 files added and 9 files fetched

(.env) √ Projects/get-started-experiments % dvc pull
Everything is up to date.

(.env) √ Projects/get-started-experiments % ls .dvc/cache/8a/31b1338b609cf1133d1df639de606c.dir
.dvc/cache/8a/31b1338b609cf1133d1df639de606c.dir
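The ls check above relies on how the hash from the warning maps to a file in the cache: the first two characters name a subdirectory and the rest is the file name. A minimal sketch of that mapping, inferred only from the path shown above (cache_path is a hypothetical helper, not a DVC API):

```python
from pathlib import PurePosixPath


def cache_path(cache_dir: str, md5: str) -> PurePosixPath:
    """Map a hash to its location in a two-character-prefix cache layout.

    Hypothetical helper for illustration; the layout is inferred from the
    path in the report, not taken from DVC's source.
    """
    # The first two characters name the subdirectory; the remainder
    # (including the ".dir" suffix of directory objects) is the file name.
    return PurePosixPath(cache_dir) / md5[:2] / md5[2:]


print(cache_path(".dvc/cache", "8a31b1338b609cf1133d1df639de606c.dir"))
```

So the object named in the warning is the same one that ls finds in the local cache after the pull.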

Reproduce

For me the sequence of commands was:

git clone git@github.com:iterative/get-started-experiments.git
cd get-started-experiments
virtualenv -p python3 .env
pip install -e "../dvc[all]"
dvc list . data/fashion-mnist/prepared
dvc pull

Expected

No warning.

Environment information

(.env) √ Projects/get-started-experiments % dvc version
DVC version: 2.3.0+eb46c0
---------------------------------
Platform: Python 3.8.9 on macOS-10.15.6-x86_64-i386-64bit
Supports: All remotes
Cache types: reflink, hardlink, symlink
Cache directory: apfs on /dev/disk1s1
Caches: local
Remotes: https
Workspace directory: apfs on /dev/disk1s1
Repo: dvc, git
@shcheklein
Member Author

shcheklein commented Jun 3, 2021

I'm getting the same with other commands (e.g. dvc list . -R). I think the reason for this is that 8a31b1338b609cf1133d1df639de606c.dir corresponds to the import stage:

md5: f8caa1a6a0770ea1ba3acaec0b8a5466
frozen: true
deps:
- path: fashion-mnist/raw
  repo:
    url: https://github.com/iterative/dataset-registry
    rev_lock: ba014f40e29670421a67cb1c47543f402348aa13
outs:
- md5: 8a31b1338b609cf1133d1df639de606c.dir
  size: 30878645
  nfiles: 4
  path: raw

Pull itself seems to work fine, but are we somewhere collecting this output in the same way as a regular output?

That would explain these sporadic warnings in several commands.
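To make the suspicion concrete, here is a rough sketch of how outputs of import stages could be told apart from regular outputs before warning about missing cache files. It assumes the stage file has already been parsed into a dict shaped like the YAML above; is_import_stage and split_missing are hypothetical names, not DVC internals, and the actual fix may look quite different.

```python
def is_import_stage(stage: dict) -> bool:
    """Return True if any dependency points at an external repo.

    Assumption: an import is recognizable by a "repo" key on a dependency,
    as in the stage file shown above.
    """
    return any("repo" in dep for dep in stage.get("deps", []))


def split_missing(stages: list, missing: list) -> tuple:
    """Partition missing hashes into (regular, imported)."""
    # Collect the hashes of all outputs that belong to import stages.
    imported = {
        out["md5"]
        for stage in stages
        if is_import_stage(stage)
        for out in stage.get("outs", [])
    }
    regular = [h for h in missing if h not in imported]
    from_imports = [h for h in missing if h in imported]
    return regular, from_imports


# The import stage from this report, as a parsed dict.
import_stage = {
    "frozen": True,
    "deps": [
        {
            "path": "fashion-mnist/raw",
            "repo": {
                "url": "https://github.com/iterative/dataset-registry",
                "rev_lock": "ba014f40e29670421a67cb1c47543f402348aa13",
            },
        }
    ],
    "outs": [{"md5": "8a31b1338b609cf1133d1df639de606c.dir", "path": "raw"}],
}

regular, from_imports = split_missing(
    [import_stage], ["8a31b1338b609cf1133d1df639de606c.dir"]
)
print(regular, from_imports)
```

With such a split, only the hashes in regular would need the "missing cache files" warning, since the imported ones can still be fetched from the source repo.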

@pared
Contributor

pared commented Jun 3, 2021

I am able to reproduce:

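import logging
import shutil


def test_imported_status(tmp_dir, scm, dvc, erepo_dir, make_tmp_dir, caplog):
    # Create a tracked directory in the external repo to import from.
    with erepo_dir.chdir():
        erepo_dir.dvc_gen({"dir": {"file": "file content"}}, commit="init data")

    dvc.imp(str(erepo_dir), "dir", out="import_dir")
    tmp_dir.add_remote(url=str(make_tmp_dir("storage")), name="str", default=True)

    # Drop the local cache and the imported data, so the import's output is
    # available neither locally nor on the (empty) default remote.
    shutil.rmtree(tmp_dir / ".dvc" / "cache")
    shutil.rmtree(tmp_dir / "import_dir")
    caplog.clear()

    with caplog.at_level(logging.WARNING, "dvc"):
        dvc.ls(str(tmp_dir), recursive=True)

    # Reproduces the bug: this assertion fails while the warning is emitted.
    assert "Missing cache files" not in caplog.text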

This is an interesting one. It seems like solving #4527 should help with it. But since imported data can be updated with dvc update, doesn't that mean status should work differently? E.g., should the warning for an import be shown only if we don't have access to the source repo?

@pared added the bug (Did we break something?) and p2-medium (Medium priority, should be done, but less important) labels Jun 3, 2021
@efiop
Contributor

efiop commented Jun 3, 2021

@shcheklein Thanks for the report! 🙏 From a quick look, the bug is only present upstream and not in stable versions (2.3.0 being the latest one). It will be fixed in the next release. Related to #6008

@pmrowla
Contributor

pmrowla commented Jun 15, 2021

This will likely end up being resolved by the import changes in #6109

@efiop
Contributor

efiop commented Jun 22, 2021

Fixed by #6109. Thanks @pmrowla 🙏

@efiop closed this as completed Jun 22, 2021