cmd ref: more updates on new dag command #1496

Merged
merged 6 commits into from
Jun 26, 2020
Changes from 5 commits
36 changes: 19 additions & 17 deletions content/docs/command-reference/dag.md
@@ -1,49 +1,51 @@
# dag

Show [stages](/doc/command-reference/run) in a pipeline that lead to the
specified stage. By default it lists
[DVC-files](/doc/user-guide/dvc-files-and-directories).
Visualize the pipeline(s) in
[`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvclock-file) as one or
more [DAGs](https://en.wikipedia.org/wiki/Directed_acyclic_graph) of connected
[stages](/doc/command-reference/run).

## Synopsis

```usage
usage: dvc dag [-h] [-q | -v] [--dot] [--full] [target]

positional arguments:
targets Stage or output to show pipeline for (optional)
Finds all stages in the workspace by default.
target Stage or output to show pipeline for (optional)
Uses all stages in the workspace by default.
```

## Description

A data pipeline, in general, is a series of data processing
[stages](/doc/command-reference/run) (for example console commands that take an
input and produce an <abbr>output</abbr>). A pipeline may produce intermediate
data, and has a final result. Machine learning (ML) pipelines typically start a
with large raw datasets, include intermediate featurization and training stages,
and produce a final model, as well as accuracy
[metrics](/doc/command-reference/metrics).
data, and has a final result.

Data processing or ML pipelines typically start with large raw datasets,
Member

all this text is pretty bad :( let's create a ticket to rewrite it since it's def not a focus of this PR.

Contributor Author

Extracted to #1500

include intermediate featurization and training stages, and produce a final
model, as well as accuracy [metrics](/doc/command-reference/metrics).

In DVC, pipeline stages and commands, their data I/O, interdependencies, and
results (intermediate or final) are specified with `dvc add` and `dvc run`,
among other commands. This allows DVC to restore one or more pipelines of stages
interconnected by their dependencies and outputs later. (See `dvc repro`.)
results (intermediate or final) are specified in `dvc.yaml`, which can be
written manually or built using the helper command `dvc run`. This allows DVC to
restore one or more pipelines later (see `dvc repro`).

> DVC builds a dependency graph
> ([DAG](https://en.wikipedia.org/wiki/Directed_acyclic_graph)) to do this.

`dvc dag` displays the stages of a pipeline up to the target stage. If `target`
is omitted, it will show the full project DAG.
The `dvc dag` command displays the stages of a pipeline up to the target stage. If
`target` is omitted, it will show the full project DAG.
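
As a rough illustration of what "up to the target stage" means, the sketch below walks a small dependency graph in Python. It is not DVC's implementation: the stage names and the `DEPENDS_ON` mapping are hypothetical, and reading `--full` as "every stage connected to the target, in either direction" is an assumption based on the option text in this diff.

```python
from collections import deque

# Hypothetical pipeline: each stage maps to the stages it depends on.
DEPENDS_ON = {
    "prepare": [],
    "featurize": ["prepare"],
    "train": ["featurize"],
    "evaluate": ["train", "featurize"],
    "report": [],  # unrelated stage that belongs to a separate pipeline
}


def ancestors(graph, target):
    """Return the target plus every stage it depends on, directly or not."""
    seen = {target}
    queue = deque([target])
    while queue:
        stage = queue.popleft()
        for dep in graph[stage]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


def full_component(graph, target):
    """Return every stage connected to the target, ignoring edge direction."""
    # Build an undirected view of the graph (assumed meaning of `--full`).
    neighbours = {stage: set(deps) for stage, deps in graph.items()}
    for stage, deps in graph.items():
        for dep in deps:
            neighbours[dep].add(stage)
    seen = {target}
    queue = deque([target])
    while queue:
        stage = queue.popleft()
        for other in neighbours[stage]:
            if other not in seen:
                seen.add(other)
                queue.append(other)
    return seen
```

In this toy graph, `ancestors(DEPENDS_ON, "train")` leaves out `evaluate` because it is downstream of `train`, while `full_component(DEPENDS_ON, "train")` includes it. Neither result contains `report`, which has no connection to the target.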

## Options

- `--full` - show full DAG that the `target` stage belongs too, instead of
showing only its ancestors.

- `--dot` - show DAG in
[DOT](<https://en.wikipedia.org/wiki/DOT_(graph_description_language)>)
format. It can be passed to third party visualization utilities.

- `--full` - show full DAG that the `target` belongs to, instead of showing the
part that consists only of the target's ancestors.

- `-h`, `--help` - prints the usage/help message, and exit.

- `-q`, `--quiet` - do not write anything to standard output. Exit with 0 if no
1 change: 0 additions & 1 deletion content/docs/command-reference/push.md
@@ -158,7 +158,6 @@ a [pipeline](/doc/command-reference/pipeline) has been setup with these

```dvc
$ dvc pipeline show

data/Posts.xml.zip.dvc
Posts.xml.dvc
Posts.tsv.dvc
4 changes: 2 additions & 2 deletions content/docs/user-guide/running-dvc-on-windows.md
@@ -70,8 +70,8 @@ directory, as explained in
## Enabling paging with `less`

By default, DVC tries to use [Less](<https://en.wikipedia.org/wiki/Less_(Unix)>)
as pager for the output of `dvc dag`. Windows doesn't have the less command
available however. Fortunately, there is a easy way of installing `less` via
as pager for the output of `dvc dag`. Windows doesn't have the `less` command
available however. Fortunately, there is an easy way of installing it via
[Chocolatey](https://chocolatey.org/) (please install the tool first):

```dvc
4 changes: 2 additions & 2 deletions redirects-list.json
@@ -31,11 +31,11 @@
"^/doc/understanding-dvc(/.*)?$ /doc/user-guide/what-is-dvc",
"^/doc/commands-reference(/.*)?$ /doc/command-reference$1",
"^/doc/command-reference/plot$ /doc/command-reference/plots",
"^/doc/command-reference/lock$ /doc/command-reference/freeze",
"^/doc/command-reference/unlock$ /doc/command-reference/unfreeze",
"^/doc/command-reference/pipeline$ /doc/command-reference/dag",
"^/doc/command-reference/pipeline/show$ /doc/command-reference/dag",
"^/doc/command-reference/pipeline/list$ /doc/command-reference/dag",
"^/doc/command-reference/lock$ /doc/command-reference/freeze",
"^/doc/command-reference/unlock$ /doc/command-reference/unfreeze",

"^/(.+)/$ /$1"
]
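
To check how the new `pipeline` redirects would resolve, a minimal Python sketch follows. It assumes each entry has the form `<regex> <target>`, that the first matching rule wins, and that `$1` refers to the first capture group. The site's real redirect handling is not shown in this diff, so the `resolve` helper is hypothetical.

```python
import re

# A few rules copied from the redirects-list.json hunk above.
RULES = [
    "^/doc/commands-reference(/.*)?$ /doc/command-reference$1",
    "^/doc/command-reference/pipeline$ /doc/command-reference/dag",
    "^/doc/command-reference/pipeline/show$ /doc/command-reference/dag",
    "^/doc/command-reference/pipeline/list$ /doc/command-reference/dag",
    "^/(.+)/$ /$1",
]


def resolve(path, rules=RULES):
    """Return the redirect target for path, or None if no rule matches."""
    for rule in rules:
        pattern, target = rule.split(" ", 1)
        match = re.match(pattern, path)
        if match:
            # Substitute $1, $2, ... with the captured groups.
            # A group that did not participate in the match becomes "".
            return re.sub(
                r"\$(\d+)",
                lambda m: match.group(int(m.group(1))) or "",
                target,
            )
    return None
```

Under these assumptions, `resolve("/doc/command-reference/pipeline/show")` gives `/doc/command-reference/dag`. A path with a trailing slash falls through to the last rule, which strips the slash.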