dvc experiment queue error: output does not exist #10654

Open
OS-leonardopratesi opened this issue Dec 17, 2024 · 1 comment
Labels
awaiting response: we are waiting for your reply, please respond! :)
triage: Needs to be triaged

Comments

OS-leonardopratesi commented Dec 17, 2024

Bug Report

Description

When running experiments with the queue, we encountered this error:

ERROR: failed to reproduce 'inference': output 'data/predictions.parquet' does not exist

Reproduce

This error happens when the directory has this specific structure:

workdir/
    dvc.yaml
    params.yaml
    data/
        predictions.parquet
    scripts/
        inference.py

In inference.py, we declare the path to the data folder like this:

LOCAL_FOLDER: Path = Path(__file__).parent.parent / "data"

An example of the dvc.yaml used:

stages:
  generate:
    deps:
      - data/dataset.parquet
    cmd: >-
     python scripts/inference.py
    outs:
      - data/predictions.parquet

With this configuration, dvc exp run --queue fails with an error saying that it cannot find the output of the stage.

The workaround we found is to build the local folder path from the current working directory instead: LOCAL_FOLDER: Path = Path.cwd() / "data"
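The difference between the two path expressions can be illustrated with a minimal sketch. It makes no assumption about DVC internals, and the helper names (`data_dir_from_script`, `data_dir_from_cwd`, `same_folder`) are hypothetical, not part of the original script. The two expressions point at the same folder only when the working directory of the process is the parent of the script's folder.

```python
from pathlib import Path


def data_dir_from_script(script_file: str) -> Path:
    """Data folder derived from the script location, as in the original script."""
    return Path(script_file).parent.parent / "data"


def data_dir_from_cwd() -> Path:
    """Data folder derived from the working directory, as in the workaround."""
    return Path.cwd() / "data"


def same_folder(a: Path, b: Path) -> bool:
    """Compare two paths after resolving symlinks and relative segments."""
    return a.resolve() == b.resolve()
```

Inside inference.py, calling `same_folder(data_dir_from_script(__file__), data_dir_from_cwd())` and getting `False` would suggest that the script writes its output somewhere other than the directory it was launched from.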

Environment information

DVC version: 3.56.0 (pip)

Platform: Python 3.10.12 on Linux-5.15.167.4-microsoft-standard-WSL2-x86_64-with-glibc2.35
Subprojects:
dvc_data = 3.16.7
dvc_objects = 5.1.0
dvc_render = 1.0.2
dvc_task = 0.40.2
scmrepo = 3.3.8
Supports:
http (aiohttp = 3.11.4, aiohttp-retry = 2.9.1),
https (aiohttp = 3.11.4, aiohttp-retry = 2.9.1),
s3 (s3fs = 2024.10.0, boto3 = 1.35.36)
Config:
Global: /home/leopra96/.config/dvc
System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sdc
Caches: local
Remotes: s3
Workspace directory: ext4 on /dev/sdc
Repo: dvc (subdir), git
Repo.site_cache_dir: /var/tmp/dvc/repo/3a1e0f4a3b3ed376f7986e42bf618f01

@shcheklein
Member

Hmm, I can't reproduce it. @OS-leonardopratesi, can you please make a small reproducible repo for this? Otherwise it's hard to guess what is going on here.

I tried something like:

stages:
  write_file:
    cmd: python src/inference.py
    deps:
      - src/inference.py
    outs:
      - data/predictions.parquet

where src/inference.py is:

from pathlib import Path

LOCAL_FOLDER: Path = Path(__file__).parent.parent / "data"


with open(LOCAL_FOLDER / "predictions.parquet", "w") as f:
    f.write(str(Path(__file__)))
    f.write(str(Path(__file__).parent.parent))

It produced something like:

/Users/ivan/Projects/minimum-dvc/.dvc/tmp/exps/tmpve15_1g5/src/inference.py/Users/ivan/Projects/minimum-dvc/.dvc/tmp/exps/tmpve15_1g5%

in data/predictions.parquet when I ran it in the queue.

So something more complicated is going on. Do you have more stages? I can imagine run caching breaking things when scripts are dynamic and depend on the execution path. But I think we need something reproducible.
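The output shown above suggests that the queued run executed a copy of the script under .dvc/tmp/exps/, so `Path(__file__).parent.parent` pointed at that temporary copy. The sketch below imitates this with plain `shutil` and `subprocess`. It does not use DVC and is not a description of how DVC implements the queue. The function names and the throwaway project layout are hypothetical.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Tuple

# Same idea as the test script above: write next to the script's parent folder.
SCRIPT = '''from pathlib import Path

LOCAL_FOLDER = Path(__file__).parent.parent / "data"
with open(LOCAL_FOLDER / "predictions.parquet", "w") as f:
    f.write(str(Path(__file__).parent.parent))
'''


def make_project(root: Path) -> None:
    """Create a throwaway project with src/inference.py and an empty data/."""
    (root / "src").mkdir(parents=True)
    (root / "data").mkdir()
    (root / "src" / "inference.py").write_text(SCRIPT)


def run_in_copy(project: Path, scratch: Path) -> Path:
    """Copy the project elsewhere and run the script from inside the copy."""
    copy = scratch / "copy"
    shutil.copytree(project, copy)
    subprocess.run([sys.executable, "src/inference.py"], cwd=copy, check=True)
    return copy


def demo() -> Tuple[bool, bool]:
    """Return (output exists in the copy, output exists in the original)."""
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        project = base / "project"
        make_project(project)
        copy = run_in_copy(project, base)
        return (
            (copy / "data" / "predictions.parquet").exists(),
            (project / "data" / "predictions.parquet").exists(),
        )
```

In this sketch the output lands in the copy and the original project stays untouched, which is consistent with the script-relative path working in the test above. It does not explain the reported failure, so a reproducible repo is still needed.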

shcheklein added the triage and awaiting response labels Dec 22, 2024