diff --git a/content/docs/dvclive/how-it-works.md b/content/docs/dvclive/how-it-works.md index 184dc6f94f..34953bb69a 100644 --- a/content/docs/dvclive/how-it-works.md +++ b/content/docs/dvclive/how-it-works.md @@ -100,8 +100,8 @@ make Git ignore it. It will generate a `model.pt.dvc` metadata file, which can be tracked in Git and becomes part of the experiment. With this metadata file, you can [retrieve](/doc/start/data-management/data-versioning#retrieving) the versioned artifact from the Git commit. You can also use -`Live.log_artifact("model.pt", type="model")` to add it to the [DVC Studio Model -Registry]. +`Live.log_artifact("model.pt", type="model")` to add it to the +[DVC Studio Model Registry](/doc/studio/model-registry). Using `Live.log_image()` to log multiple images may also grow too large to track with Git, in which case you can use diff --git a/content/docs/dvclive/live/log_metric.md b/content/docs/dvclive/live/log_metric.md index 34ff936e5f..32d4881da8 100644 --- a/content/docs/dvclive/live/log_metric.md +++ b/content/docs/dvclive/live/log_metric.md @@ -1,7 +1,12 @@ # Live.log_metric() ```py - def log_metric(name: str, val: float, plot: Optional[bool] = True): +def log_metric( + name: str, + val: Union[int, float, str], + timestamp: bool = False, + plot: Optional[bool] = True +): ``` ## Usage @@ -41,7 +46,7 @@ timestamp step loss The metrics history can be visualized with `dvc plots`: ``` -dvc plots diff dvclive/plots +$ dvc plots diff dvclive/plots ``` @@ -51,8 +56,8 @@ Each subsequent call to `live.log_metric(name, val)` will add a new row to ```python live.next_step() -live.log_metric("train/loss", 0.2) -live.log_metric("val/loss", 0.4) +live.log_metric("train/loss", 0.2, timestamp=True) +live.log_metric("val/loss", 0.4, timestamp=True) ``` ```ts @@ -81,19 +86,22 @@ when exiting the `with` block: The metrics summary can be visualized with `dvc metrics`: -``` -dvc metrics diff dvclive/metrics.json +```cli +$ dvc metrics diff dvclive/metrics.json ``` 
## Parameters -- `name` - Name of the metric being logged. +- `name` - name of the metric being logged. + +- `val` - the value to be logged. -- `val` - The value to be logged. +- `timestamp` - whether to automatically log timestamp in the _metrics history_ + file. -- `plot` - Whether to add the metric value to the _metrics history_ file for +- `plot` - whether to add the metric value to the _metrics history_ file for plotting. If `false`, the metric will only be saved to the metrics summary. ## Exceptions diff --git a/content/docs/dvclive/live/log_param.md b/content/docs/dvclive/live/log_param.md index 69bafc11c1..af3f708e3a 100644 --- a/content/docs/dvclive/live/log_param.md +++ b/content/docs/dvclive/live/log_param.md @@ -1,7 +1,7 @@ # Live.log_param() ```py - def log_param(name: str, val: ParamLike): +def log_param(name: str, val: ParamLike): ``` Also see `Live.log_params()`. @@ -41,9 +41,9 @@ automatically, and you can skip logging them with DVCLive. ## Parameters -- `name` - Name of the parameter being logged. +- `name` - name of the parameter being logged. -- `val` - The value to be logged. +- `val` - the value to be logged. ## Exceptions diff --git a/content/docs/dvclive/live/log_params.md b/content/docs/dvclive/live/log_params.md index 63b999a3c6..fe79ec71bd 100644 --- a/content/docs/dvclive/live/log_params.md +++ b/content/docs/dvclive/live/log_params.md @@ -1,7 +1,7 @@ # Live.log_params() ```py - def log_params(params: Dict[ParamLike]): +def log_params(params: Dict[ParamLike]): ``` Also see `Live.log_param()`. 
diff --git a/content/docs/sidebar.json b/content/docs/sidebar.json index 132b80f8b4..f501e80911 100644 --- a/content/docs/sidebar.json +++ b/content/docs/sidebar.json @@ -66,6 +66,21 @@ "slug": "experiment-pipelines" } ] + }, + { + "label": "Model Management", + "slug": "model-management", + "source": "model-management/index.md", + "children": [ + { + "label": "Model registry", + "slug": "model-registry" + }, + { + "label": "Using and deploying models", + "slug": "model-cicd" + } + ] } ] }, diff --git a/content/docs/start/index.md b/content/docs/start/index.md index 7d371e7c1a..4f77c303ce 100644 --- a/content/docs/start/index.md +++ b/content/docs/start/index.md @@ -62,9 +62,10 @@ Now you're ready to DVC! ## Following This Guide -To help you understand and use DVC better, consider the following two use-cases: -**data management** and **experiment tracking**. You may pick either one to -start learning about how DVC helps you "solve" that scenario! +To help you understand and use DVC better, consider the following three +use-cases: **data management**, **experiment tracking** and **model +management**. You may pick any to start learning about how DVC helps you "solve" +that scenario! Choose a trail to jump into its first chapter: @@ -76,8 +77,13 @@ Choose a trail to jump into its first chapter: by only instrumenting your code, and collaborate on ML experiments like software engineers do for code. +- **[Model Management]** - Use the DVC model registry to manage the lifecycle of + your models in an auditable way. Easily access your models and integrate your + model registry actions into CICD pipelines to follow GitOps best practices. 
+ [Data Management]: /doc/start/data-management/data-versioning [Experiment Management]: /doc/start/experiments/experiment-tracking +[Model Management]: /doc/start/model-management/model-registry diff --git a/content/docs/start/model-management/index.md b/content/docs/start/model-management/index.md new file mode 100644 index 0000000000..e9e23f70b6 --- /dev/null +++ b/content/docs/start/model-management/index.md @@ -0,0 +1,42 @@ +--- +title: 'Get Started: Model Management' +description: + 'Get started with DVC model management. Use the DVC model registry to manage + the lifecycle of your models in an auditable way. Easily access your models + and integrate your model registry actions into CICD pipelines to follow GitOps + best practices.' +--- + +# Get Started: Model Management + +## Chapters + +- **[Model registry]** - Set up a Git-based model registry with DVC to track and + manage models, their versions and lifecycle stages. + +- **[Using and deploying models]** - Easily download your models from the model + registry. Set up your CICD pipelines to be triggered by model registry actions + (such as assigning model stages) and deploy models directly from the model + registry. + +[model registry]: /doc/start/model-management/model-registry +[Using and deploying models]: /doc/start/model-management/model-cicd + + + +These are captured in our [example-get-started-experiments] repo (you can [fork +it][example-get-started-experiments-fork] to follow along). + +[example-get-started-experiments]: + https://github.com/iterative/example-get-started-experiments +[example-get-started-experiments-fork]: + https://github.com/iterative/example-get-started-experiments/fork + + + +## Where To Go Next + +Pick a page from the list above, the left-side navigation bar, or just click +`NEXT` below! + +Click [here](/doc/start/) to jump back to the Get Started landing page. 
diff --git a/content/docs/start/model-management/model-cicd.md b/content/docs/start/model-management/model-cicd.md new file mode 100644 index 0000000000..5e233cca5e --- /dev/null +++ b/content/docs/start/model-management/model-cicd.md @@ -0,0 +1,239 @@ +--- +title: 'Get Started: Using and deploying models' +description: + 'Easily download your models from the model registry. Set up your CICD + pipelines to be triggered by model registry actions (such as assigning model + stages) and deploy models directly from the model registry.' +--- + +# Get Started: Using and Deploying Models + +In the [model registry chapter](/doc/start/model-management/model-registry) we +registered the model in the model registry and assigned it to some lifecycle +stages. In this chapter, we will learn how to access and use models and how to +use the model registry to trigger automated CICD model workflows. + +If you are using the example repository, the models are already versioned on a +publicly readable DVC remote so you can access the model from there and use it. +If you are instead using your own repository, you need to set up your own DVC +remote and push the data (including models) there. Have a look at our +[Data Management guide](/doc/start/data-management/data-versioning#configuring-a-remote) +to see how this is done. + +## Downloading models + +It is useful to download model artifacts, for example for local testing or for +use in CICD workflows. With models versioned by DVC, this can be done easily by +using the DVC Studio UI. + +Go to the detailed view of your model, select the desired model version under +the "Version info" and then click on the "Access model" button. + +DVC Studio will present you with several ways of downloading models - with the +CLI, in Python code and directly from your web browser. 
You can see all the +steps to download the model from the web browser here: + + + +And here's how to do it with the CLI: + +First, configure the +[DVC Studio Access Token](https://dvc.org/doc/studio/user-guide/account-and-billing#studio-access-token) +(this only needs to be done once): + +```cli +$ dvc config --global studio.token +``` + +Now you can use the following command to download the model: + +```cli +$ dvc artifacts get https://github.com//example-get-started-experiments pool-segmentation +``` + +Here you just need to replace `` with your GitHub user. This will download +the latest version of the `pool-segmentation` model from the DVC remote +associated with the Git repository in the URL. You can also specify a different +artifact version or a model registry stage. See the `dvc artifacts get` +documentation for all options. + +If you don't have a DVC Studio account at all, you can still use +`dvc artifacts get` to download models, but you will need to provide the correct +Git and DVC remote credentials manually. You can see more details in the +[documentation](/doc/command-reference/artifacts/get#description). + +## Connecting model registry actions to your CICD + +As we noted in the +[model registry chapter](/doc/start/model-management/model-registry), all DVC +model registry actions are captured in your Git repository as Git tags with a +specific format. + +This also means that we can create CICD actions in our Git repository which will +be triggered whenever versions are registered or stages are assigned. + +In the following, we will have a look at an example CICD workflow on GitHub +which runs whenever we assign a version of our model to the "prod" stage in the +model registry. The workflow simulates model deployment without the need to +actually set up a deployment environment (so that you can test it more easily), but it +does include all the ingredients needed in an actual deployment job or any other +CICD action. 
+ + + +To see a real-world model deployment example you can check out a +[similar workflow in our example repository](https://github.com/iterative/example-get-started-experiments/blob/main/.github/workflows/deploy-model-sagemaker.yml) +which deploys a specific version of the model to an Amazon Sagemaker endpoint +for inference whenever it is assigned to a stage. + + + +Go to the `.github/workflows/deploy-model-template.yml`. This is the file that +GitHub uses to run our CICD workflow. You can see +[runs of this workflow](https://github.com/iterative/example-get-started-experiments/actions/workflows/deploy-model-template.yml) +in our example repository. + +At the beginning of the workflow file you will see this code: + +```yaml +on: + # the workflow is triggered whenever a tag is pushed to the repository + push: + tags: + - '*' +``` + +The code tells GitHub to run the workflow every time a tag is pushed to the +repository. + +This means that the workflow will run whenever we run model registry actions, +but we also want to limit it to the actions relevant to our specific workflow. That's +where our GTO GitHub action comes into play - in the "parse" job of our workflow +it parses all tags and if they are GTO tags, it gives us the name of the model, +its version, stage (if any) and the event in the model registry. + +This is captured in the "parse" job which you can simply copy and paste into +most CICD jobs of your own. 
+ +```yaml +# This job parses the git tag with the GTO GitHub Action to identify model registry actions +parse: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + - name: 'Parse GTO tag' + id: gto + uses: iterative/gto-action@v2 + outputs: + event: ${{ steps.gto.outputs.event }} + name: ${{ steps.gto.outputs.name }} + stage: ${{ steps.gto.outputs.stage }} + version: ${{ steps.gto.outputs.version }} +``` + + + +If you are not using GitHub or if you don't want to use the GTO GitHub Action +you can also use GTO directly with the +[gto check-ref](/doc/gto/command-reference/check-ref) command. + + + +The next job called "deploy-model" actually performs the action. First, it uses +the outputs of the parse job and checks whether the action should be performed. +If the tag was produced by the model registry and if the corresponding action +was assignment to the "prod" stage, it proceeds with the rest of the workflow. + +```yaml +deploy-model: + needs: parse + if: + ${{ needs.parse.outputs.event == 'assignment' && needs.parse.outputs.stage + == 'prod' }} +``` + +The next step of the workflow sets up DVC (using a GitHub Action, but this can +also be done manually, for example with pip). + +This allows us to run `dvc artifacts get` in the last step of the workflow to +download the correct version of the model which can then be deployed or +otherwise used in our CICD. + +```yaml +steps: + - uses: iterative/setup-dvc@v1 + - name: Get Model For Deployment + run: | + dvc config --global studio.token ${{ secrets.DVC_STUDIO_TOKEN }} + dvc artifacts get ${{ github.server_url }}/${{ github.repository }} ${{ needs.parse.outputs.name }} --rev ${{ needs.parse.outputs.version }} + echo "The right model is available and you can use the rest of this command to deploy it. Good job!" +``` + +Here, we are using the outputs of the `parse` job to specify the correct model +version. 
We are then setting up the DVC Studio token which we stored in our +GitHub repository as a +[secret](https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions) +to manage authentication with the +[DVC remote storage](https://dvc.org/doc/user-guide/data-management/remote-storage#remote-storage). +This way we only need to keep the DVC Studio token saved on GitHub and let +Studio manage the specific storage credentials for us. + +Finally, `github.server_url` and `github.repository` are +[default values from the `github` context](https://docs.github.com/en/actions/learn-github-actions/contexts#github-context) +which together form the URL of our repository on GitHub. We could of course also +specify the URL manually. + +If you don't use DVC Studio, you can still use `dvc artifacts get` but you will +need to keep your remote storage credentials on GitHub and use them to configure +DVC in the CICD workflow. You will also need to check out the repository in the +workflow. You can see more details in the +[documentation](/doc/command-reference/artifacts/get#description). + +You can now use the following template to create your own Model Registry CICD +actions on GitHub! 
+ +```yaml +name: Deploy Model (Template) + +on: + # the workflow is triggered whenever a tag is pushed to the repository + push: + tags: + - '*' +jobs: + # This job parses the git tag with the GTO GitHub Action to identify model registry actions + parse: + runs-on: ubuntu-latest + steps: + - uses: actions/checkout@v3 + - name: 'Parse GTO tag' + id: gto + uses: iterative/gto-action@v2 + outputs: + event: ${{ steps.gto.outputs.event }} + name: ${{ steps.gto.outputs.name }} + stage: ${{ steps.gto.outputs.stage }} + version: ${{ steps.gto.outputs.version }} + + deploy-model: + needs: parse + # using the outputs from the "parse" job, we run this job only for actions + # in the model registry and only when the model was assigned to a stage called "prod" + # You can replace this with your own conditions + if: + ${{ needs.parse.outputs.event == 'assignment' && needs.parse.outputs.stage + == 'prod' }} + runs-on: ubuntu-latest + steps: + - uses: iterative/setup-dvc@v1 + # this step uses DVC to download the model from the remote repository and deploys the model + # Model deployment is mocked here as it is specific to each deployment environment + # The DVC Studio token is used to avoid having to store specific remote storage credentials on GitHub + - name: Get Model For Deployment + run: | + dvc config --global studio.token ${{ secrets.DVC_STUDIO_TOKEN }} + dvc artifacts get ${{ github.server_url }}/${{ github.repository }} ${{ needs.parse.outputs.name }} --rev ${{ needs.parse.outputs.version }} + echo "The right model is available and you can use the rest of this command to deploy it. Good job!" 
+``` diff --git a/content/docs/start/model-management/model-registry.md b/content/docs/start/model-management/model-registry.md new file mode 100644 index 0000000000..07fe6e7200 --- /dev/null +++ b/content/docs/start/model-management/model-registry.md @@ -0,0 +1,298 @@ +--- +title: 'Get Started: Model registry' +description: + 'Set up a Git-based model registry with DVC to track and manage models, their + versions and lifecycle stages.' +--- + +# Get Started: Model Registry + +Just as we use experiment tracking to manage model development, it is a good +idea to keep a model registry to manage the lifecycle of the models +we get from our experiments. Using DVC and [DVC Studio](/doc/studio) we will set +up a model registry where we can discover, share, deploy and audit all our +models and which will serve as the single source of truth for our model +management. + + + +Behind the scenes, DVC Studio uses a command line tool called [GTO](/doc/gto) +for most model registry actions. + +With GTO you can also set up the model registry locally without DVC Studio. You +can see how this is done in the expandable "Under the hood" sections in this +chapter. + + + +
+ +#### 💡 Expand to see how to set things up to follow along with the guide + +You can +[fork our example repository](https://github.com/iterative/example-get-started-experiments/fork) +and follow the +[installation steps](https://github.com/iterative/example-get-started-experiments#installation) +to set it up locally. + +To then perform the model registry actions in this guide, follow +[these steps](/doc/studio/user-guide/experiments/create-a-project#connect-to-a-git-repository-and-add-a-project) +to sign in to DVC Studio, connect it to your Git account and add your forked +repository as a DVC Studio project. + +
+ +## Adding models + +Let's now train a model and add it to the model registry. We will be using +[DVCLive](/doc/dvclive) and add a model using Python code. This will also +automatically save the model to DVC. + +We use the [`log_artifact`](/doc/dvclive/live/log_artifact) method to +cache the model with DVC and add it to the model registry. Open +the training script `src/train.py` in our example repository and have a look at +the following code under the `with Live(...)` context: + +```python +with Live(...) as live: + +... + + live.log_artifact( + path="models/model.pkl", + type="model", + name="pool-segmentation", + desc="This is a Computer Vision (CV) model that's segmenting out swimming pools from satellite images.", + labels=["cv", "segmentation", "satellite-images"], + ) +``` + +Here the `path` parameter tells DVC that our model is to be found under +`"models/model.pkl"`. The `type` parameter is `"model"`, so the artifact will show up +in DVC Studio (other artifact types will not). The rest of the parameters are +descriptive and optional and will also show up in DVC Studio. + +
+ +#### 💡 Expand to see how adding models to the registry works under the hood + +When we call the `log_artifact()` method, DVC takes all the information we +provide in the call and edits the `dvc.yaml` file which will now contain the +following lines: + +```yaml +artifacts: + pool-segmentation: + path: models/model.pkl + type: model + desc: + This is a Computer Vision (CV) model that's segmenting out swimming pools + from satellite images. + labels: + - cv + - segmentation + - satellite-images +``` + +When you push the resulting file to your Git remote it is parsed by DVC Studio +and it then shows all model artifacts from your `dvc.yaml` files in the model +registry. + +
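The filtering rule described above (only artifacts whose `type` is `model` appear in the registry) can be sketched in a few lines of Python. This is an illustrative sketch only, not how DVC Studio is implemented: the function name `model_artifacts` is hypothetical, the `artifacts` section is represented as an already-parsed dictionary, and the `training-data` entry is an invented example of a non-model artifact.

```python
from typing import Any, Dict


def model_artifacts(dvc_yaml: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Return only the artifacts declared with type "model".

    `dvc_yaml` stands for the parsed contents of a dvc.yaml file
    (hypothetical helper, for illustration only).
    """
    artifacts = dvc_yaml.get("artifacts") or {}
    return {
        name: meta
        for name, meta in artifacts.items()
        if meta.get("type") == "model"
    }


# Parsed form of the dvc.yaml snippet shown above, plus one invented
# non-model artifact to show that it is filtered out.
example = {
    "artifacts": {
        "pool-segmentation": {
            "path": "models/model.pkl",
            "type": "model",
            "labels": ["cv", "segmentation", "satellite-images"],
        },
        "training-data": {"path": "data/train", "type": "dataset"},
    }
}

registry_view = model_artifacts(example)
print(sorted(registry_view))
```

Only `pool-segmentation` survives the filter here, which matches the statement that other artifact types do not show up in the model registry.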
+ +Now we just need to run the Python script which includes this code to cache and +register the model. If you are +[following](/doc/start/model-management/model-registry#follow-along-instructions) +our example repository then this has already been done and we can continue to +the next section. + +If you are building your own repository, you will need to run the script and +push the result to your Git remote (e.g., GitHub) yourself. + +
+ +#### 💡 Expand to see other ways to add models + +The other two options are to use the DVC Studio's graphical user interface to +add models interactively or to manually edit `dvc.yaml` files to add information +about model artifacts. To get more details on the ways to add models have a look +at the +[Model registry documentation](/doc/studio/user-guide/model-registry/add-a-model). + +
+ +## DVC Model registry overview + +In this guide, we will be using [DVC Studio](https://studio.iterative.ai) to +manage our model registry. DVC Studio enables you to see models across all +projects, manage their lifecycle, and download them with only a token. You can +find out more about it [here](/doc/studio). + +From the Models tab in DVC Studio we will have an overview of all models, the latest +model versions, as well as the stages each model version is assigned to. We can +get more details for each model by clicking on the model name. + +You can check out our +[example model](https://studio.iterative.ai/team/Iterative/models/b3P4bcYIrGYdzyjqzsf9Xw==/pool-segmentation/v1.0.0) +in DVC Studio to see what it will look like once we finish all the steps in this +guide. + +Now that we have added a model, you should see something like the following +picture in DVC Studio if you go to the Models tab and then select the +`pool-segmentation` model. + +![Newly added model in the Model Registry](/img/mr-newly-added-model.png) + +You can also see the +[state of the project at this point](https://github.com/iterative/example-get-started-experiments/releases/tag/2-dvc-pipeline) +captured in our example repository. + +
+ +#### 💡 Expand to see how the model registry works under the hood + +When you register model versions, assign or remove stages or deprecate models, +GTO assigns [particularly formatted](/doc/gto/user-guide#git-tags-format) Git +[tags](https://git-scm.com/book/en/v2/Git-Basics-Tagging) to selected commits +and these are then parsed by the model registry to keep track of the model +lifecycle history. + +This means that all the metadata used by the model registry is actually stored +in your Git repository! + +It also allows you to use GTO directly instead of the DVC Studio UI to manage +your model lifecycle. That can be useful for example if you want to trigger +certain model registry actions programmatically. You can learn more about the +details of GTO in its [documentation](/doc/gto). + +If you don't have a DVC Studio account at all, you will have to manage a model +registry separately for each Git repository however. + +
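To make the idea of "particularly formatted" Git tags more concrete, here is a minimal Python sketch of how such tags could be told apart from ordinary ones. It is a simplified illustration, not GTO's implementation: it assumes only two tag shapes (`<name>@v<semver>` for registering a version and `<name>#<stage>#<n>` for assigning a stage), and the names `RegistryEvent`, `parse_tag` and `should_deploy` are hypothetical. The linked tag format documentation remains the authoritative description.

```python
import re
from typing import NamedTuple, Optional


class RegistryEvent(NamedTuple):
    name: str
    event: str
    version: Optional[str] = None
    stage: Optional[str] = None


# Assumed, simplified tag shapes (see the GTO docs for the real rules):
#   "<name>@v<major>.<minor>.<patch>"  -> a version was registered
#   "<name>#<stage>#<counter>"         -> a stage was assigned
_VERSION_TAG = re.compile(r"^(?P<name>[^@#]+)@(?P<version>v\d+\.\d+\.\d+)$")
_STAGE_TAG = re.compile(r"^(?P<name>[^@#]+)#(?P<stage>[^#!]+)#\d+$")


def parse_tag(tag: str) -> Optional[RegistryEvent]:
    """Return the registry event encoded in a tag, or None for other tags."""
    match = _VERSION_TAG.match(tag)
    if match:
        return RegistryEvent(match["name"], "registration", version=match["version"])
    match = _STAGE_TAG.match(tag)
    if match:
        return RegistryEvent(match["name"], "assignment", stage=match["stage"])
    return None


def should_deploy(tag: str) -> bool:
    """True only for tags that assign a model version to the "prod" stage."""
    event = parse_tag(tag)
    return event is not None and event.event == "assignment" and event.stage == "prod"


for tag in ["pool-segmentation@v1.0.0", "pool-segmentation#prod#2", "release-2023"]:
    print(tag, parse_tag(tag), should_deploy(tag))
```

Note that in this sketch a stage-assignment tag carries no version; with GTO the version is resolved from the commit the tag points to, which is why a real setup should rely on GTO itself rather than on a hand-written parser like this one.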
+ +## Versioning models + +Now that we have our first model in the model registry, we can start registering +model versions for the model. We do it by choosing a specific commit in our +model development history and attaching a version to it to make it easier to +keep track of it. You can now do that directly in the DVC Studio UI as follows. + + + +Since we saved our model to DVC and added it to the model registry in the latest +commit, we can just keep the commit which was selected by DVC Studio +automatically. We will also keep the suggested version number v1.0.0. + +For more details and other ways of registering model versions you can have a +look at the corresponding +[documentation](/doc/studio/user-guide/model-registry/register-version). + +Once we register our first model version, DVC Studio will also automatically +connect it to experiment tracking and all metrics which are tracked there will +also show up in the model registry for each model version. + +
+ +#### 💡 Expand to see how registering models works under the hood + +Registering the model version as we just did using DVC Studio is equivalent to +the following GTO command + +``` +gto register pool-segmentation [ref] --version v1.0.0 +``` + +Here, `[ref]` is the Git reference/hash we selected from the menu in DVC Studio. + +For more details you can have a look at the +[gto register command reference](/doc/gto/command-reference/register). + +
+ +## Assigning lifecycle stages + +We have a first version for our model and now it is a good time to assign a +model lifecycle stage to it. You can create any number of lifecycle stages with +any names you wish but in this example we will only create two stages called +"dev" and "prod". + +Stages are created whenever a model version is assigned to them. You can now +assign the model version 1.0.0 to the "dev" stage as follows. + + + +When we assign the model to a stage, it can automatically trigger actions in our +CICD workflows, like deploying the model to a new environment (we will explore +how this is done in the +[Using and Deploying models](/doc/start/model-management/model-cicd) chapter). + +
+ +#### 💡 Expand to see how assigning model stages works under the hood + +Assigning the "dev" stage to the model as we just did using DVC Studio is +equivalent to the following GTO command. + +``` +gto assign pool-segmentation --version v1.0.0 --stage dev +``` + +For more details you can have a look at the +[gto assign command reference](/doc/gto/command-reference/assign). + +
+ +## Changing (and removing) stage assignments + +Let's say that we've decided to promote our model version 1.0.0 to production +and denote that it is no longer in the "dev" stage. First, assign the model +version to the "prod" stage just like we did with the "dev" stage in the +previous section. + +Now, to remove the "dev" stage from our model version 1.0.0 and assign it only +to "prod", follow these steps: + + + +It is also possible to de-register model versions or deprecate and remove models +from the registry entirely. To see how, have a look at the +[documentation](/doc/studio/user-guide/model-registry/remove-a-model-or-its-details). + +The detailed view of our model in the registry should now match what we see +[in our example](https://studio.iterative.ai/team/Iterative/models/b3P4bcYIrGYdzyjqzsf9Xw==/pool-segmentation/v1.0.0). + +
+ +#### 💡 Expand to see how removing model stages works under the hood + +Whenever we un-assign stages, de-register model versions and deprecate models, +DVC Studio uses the GTO library under the hood. It is also possible to use GTO +manually to perform these actions. To see how, have a look at the +[gto deprecate command reference](/doc/gto/command-reference/deprecate). + +
+ +## Auditing model history + +Every action we performed in our model registry leaves a trace so that the model +history can be audited. If you now look at the model details page of our model, +you should see something like this: + +![Model history](/img/mr-model-history.png) + +As we noted +[above](/doc/start/model-management/model-registry#under-the-hood-model-registry), +DVC uses special Git tags to keep track of model registry actions, so all of +this history is actually stored directly in your Git repository. DVC Studio can +parse these tags and show them to us in a user-friendly way. + +If you look at the +[tags in our example repository](https://github.com/iterative/example-get-started-experiments/tags), +you can see that all the model registry actions that we performed are captured +by such tags. diff --git a/content/docs/studio/experiments.md b/content/docs/studio/experiments.md index 687f16ffb6..164be7884d 100644 --- a/content/docs/studio/experiments.md +++ b/content/docs/studio/experiments.md @@ -11,7 +11,7 @@ experiments with DVC Studio: - In your model training environment, install [DVCLive]: ```cli - pip install dvclive + $ pip install dvclive ``` - Copy your @@ -20,7 +20,7 @@ experiments with DVC Studio: [configure your model training environment to use the token](/doc/studio/user-guide/experiments/live-metrics-and-plots#set-up-an-access-token): ```cli - dvc config --global studio.token *** + $ dvc config --global studio.token *** ``` - Use the DVCLive [`log_metric()`](/doc/dvclive/live/log_metric#livelog_metric) @@ -38,7 +38,7 @@ experiments with DVC Studio: - Run the training job: ```cli - python train.py + $ python train.py ``` - The metrics and plots will be [tracked live][live-metrics-and-plots] in the @@ -49,9 +49,7 @@ experiments with DVC Studio: DVC Studio offers more ways to run and track experiments - you can: - set up reproducible pipelines with DVC, -- submit new experiments from the VS Code IDE, -- submit new experiments from DVC 
Studio, and have them run in your own cloud - infrastructure. +- submit new experiments from the VS Code IDE. For details on all these, check out the [`experiment management user guide`](/doc/studio/user-guide/experiments). diff --git a/content/docs/studio/index.md b/content/docs/studio/index.md index 453ff790c1..8617e238f8 100644 --- a/content/docs/studio/index.md +++ b/content/docs/studio/index.md @@ -25,8 +25,6 @@ https://www.youtube.com/watch?v=w-UjAbdpRY4 Git. - Keep your code, data and model connected at all times. - Apply your existing software engineering stack for ML teams. -- Automate your ML process by transitioning to a no-code interface for running - experiments on cloud resources of your choice. - Build an ML model registry for transparency and discovery across all your projects. - Because your model registry is built on top of Git, you can use diff --git a/content/docs/studio/model-registry.md b/content/docs/studio/model-registry.md index 07e9f63ce4..f672026b36 100644 --- a/content/docs/studio/model-registry.md +++ b/content/docs/studio/model-registry.md @@ -14,7 +14,7 @@ To quickly start tracking your models in the DVC Studio model registry: - In your model training environment, install [DVCLive](/doc/dvclive): ```cli - pip install dvclive + $ pip install dvclive ``` - Copy your @@ -22,7 +22,7 @@ To quickly start tracking your models in the DVC Studio model registry: and configure your model training environment to use the token: ```cli - dvc config --global studio.token *** + $ dvc config --global studio.token *** ``` - Use the DVCLive [`log_artifact()`](/doc/dvclive/live/log_artifact) method in @@ -37,7 +37,7 @@ To quickly start tracking your models in the DVC Studio model registry: - Run the training job: ```cli - python train.py + $ python train.py ``` - Once the training completes, commit and push the resultant `dvc.yaml` file to diff --git a/content/docs/studio/rest-api.md b/content/docs/studio/rest-api.md index d42555766f..7a4e9ddcd5 100644 
--- a/content/docs/studio/rest-api.md +++ b/content/docs/studio/rest-api.md @@ -13,8 +13,8 @@ To use API, you need to generate ## Download model Get signed URL to download the model binaries for a model from the model -registry. Requires the model to be stored with DVC with S3 or Azure -[remote]. Note that you need to +registry. Requires the model to be stored with DVC with an S3, Azure, HTTP +or HTTPS [remote]. Note that you need to [set up remote cloud credentials](/doc/studio/user-guide/account-management#cloud-credentials) for DVC Studio to have rights to the signed URLs. The signed URLs expire after one hour. diff --git a/content/docs/studio/user-guide/account-and-billing/index.md b/content/docs/studio/user-guide/account-and-billing/index.md index 3f5f963948..2ffc08da38 100644 --- a/content/docs/studio/user-guide/account-and-billing/index.md +++ b/content/docs/studio/user-guide/account-and-billing/index.md @@ -68,133 +68,15 @@ handy if you suspect that your account security may have been compromised. ### Cloud credentials In the `Cloud Credentials` section of your [Profile] page, you can view, add and -update credentials for cloud resources. These credentials are used to: - -- [fetch project data from data remotes](/doc/studio/user-guide/experiments/configure-a-project#data-remotes--cloud-storage-credentials), - and -- [create cloud resources for running experiments](/doc/studio/user-guide/experiments/run-experiments#cloud-experiments). +update credentials for cloud resources. These credentials are used to +[fetch project data from data remotes](/doc/studio/user-guide/experiments/configure-a-project#data-remotes--cloud-storage-credentials). To add new credentials, click `Add credentials` and select the cloud provider. -Depending on the provider, you will be asked for more details. Note that -[cloud experiments](/doc/studio/user-guide/experiments/run-experiments#cloud-experiments) -currently support AWS and GCP. 
+Depending on the provider, you will be asked for more details. ![](https://static.iterative.ai/img/studio/s3_remote_settings_v2.png) The credentials must have the required permissions. For -[cloud experiments](/doc/studio/user-guide/experiments/run-experiments#cloud-experiments), -the following permissions are needed: - -
- -#### AWS - -``` -"autoscaling:CreateAutoScalingGroup", -"autoscaling:DeleteAutoScalingGroup", -"autoscaling:DescribeAutoScalingGroups", -"autoscaling:DescribeScalingActivities", -"autoscaling:UpdateAutoScalingGroup", -"ec2:AuthorizeSecurityGroupEgress", -"ec2:AuthorizeSecurityGroupIngress", -"ec2:CancelSpotInstanceRequests", -"ec2:CreateKeyPair", -"ec2:CreateLaunchTemplate", -"ec2:CreateSecurityGroup", -"ec2:CreateTags", -"ec2:DeleteKeyPair", -"ec2:DeleteLaunchTemplate", -"ec2:DeleteSecurityGroup", -"ec2:DescribeAutoScalingGroups", -"ec2:DescribeImages", -"ec2:DescribeInstanceTypeOfferings", -"ec2:DescribeInstances", -"ec2:DescribeKeyPairs", -"ec2:DescribeLaunchTemplates", -"ec2:DescribeScalingActivities", -"ec2:DescribeSecurityGroups", -"ec2:DescribeSpotInstanceRequests", -"ec2:DescribeSubnets", -"ec2:DescribeVpcs", -"ec2:GetLaunchTemplateData", -"ec2:ImportKeyPair", -"ec2:ModifyImageAttribute", -"ec2:ModifyLaunchTemplate", -"ec2:RequestSpotInstances", -"ec2:RevokeSecurityGroupEgress", -"ec2:RevokeSecurityGroupIngress", -"ec2:RunInstances", -"ec2:TerminateInstances", -"s3:CreateBucket", -"s3:DeleteBucket", -"s3:DeleteObject", -"s3:GetObject", -"s3:ListBucket", -"s3:PutObject", -``` - -See -https://github.com/iterative/terraform-provider-iterative/blob/a92499539f109821c021d1efb1fb01e51f1db47f/docs/guides/permissions/aws/main.tf - -
- -
- -#### GCP - -``` -"compute.acceleratorTypes.get", -"compute.diskTypes.get", -"compute.disks.create", -"compute.firewalls.create", -"compute.firewalls.delete", -"compute.firewalls.get", -"compute.globalOperations.get", -"compute.instanceGroupManagers.create", -"compute.instanceGroupManagers.delete", -"compute.instanceGroupManagers.get", -"compute.instanceGroupManagers.update", -"compute.instanceGroups.create", -"compute.instanceGroups.delete", -"compute.instanceGroups.get", -"compute.instanceTemplates.create", -"compute.instanceTemplates.delete", -"compute.instanceTemplates.get", -"compute.instanceTemplates.useReadOnly", -"compute.instances.create", -"compute.instances.delete", -"compute.instances.get", -"compute.instances.setMetadata", -"compute.instances.setServiceAccount", -"compute.instances.setTags", -"compute.machineTypes.get", -"compute.networks.create", -"compute.networks.get", -"compute.networks.updatePolicy", -"compute.subnetworks.use", -"compute.subnetworks.useExternalIp", -"compute.zoneOperations.get", -"iam.serviceAccounts.actAs", -"storage.buckets.create", -"storage.buckets.delete", -"storage.buckets.get", -"storage.multipartUploads.abort", -"storage.multipartUploads.create", -"storage.multipartUploads.list", -"storage.multipartUploads.listParts", -"storage.objects.create", -"storage.objects.delete", -"storage.objects.get", -"storage.objects.list", -"storage.objects.update", -``` - -See -https://github.com/iterative/terraform-provider-iterative/blob/a92499539f109821c021d1efb1fb01e51f1db47f/docs/guides/permissions/gcp/main.tf - -
- -For [data remotes](/doc/studio/user-guide/experiments/configure-a-project#data-remotes--cloud-storage-credentials), you can find more details about required permissions [here][data remote]. diff --git a/content/docs/studio/user-guide/experiments/explore-ml-experiments.md b/content/docs/studio/user-guide/experiments/explore-ml-experiments.md index 1afd601e9d..19967c8ad6 100644 --- a/content/docs/studio/user-guide/experiments/explore-ml-experiments.md +++ b/content/docs/studio/user-guide/experiments/explore-ml-experiments.md @@ -7,7 +7,7 @@ project name to open the project table, which contains: project - [Display preferences](#display-preferences) - Buttons to - [visualize, compare, and run experiments](#visualize-compare-and-run-experiments). + [visualize and compare experiments](#visualize-and-compare-experiments). - Button to [export project data](#export-project-data). ## Git history and live experiments @@ -211,16 +211,14 @@ and log back in later. ![Save or discard changes](https://static.iterative.ai/img/studio/save_discard_changes.png) -## Visualize, compare and run experiments +## Visualize and compare experiments -Use the following buttons to visualize, compare and run experiments: +Use the following buttons to visualize and compare experiments: - **Plots:** Open the `Plots` pane and [display plots] for the selected commits. - **Trends:** [Generate trend charts] to see how the metrics have changed over time. - **Compare:** [Compare experiments] side by side. -- **Run:** [Run experiments] and [track results in real - time][live-metrics-and-plots]. These buttons appear above your project table as shown below. ![example export to csv](https://static.iterative.ai/img/studio/project_action_buttons_big_screen.png) @@ -246,7 +244,6 @@ Below is an example of the downloaded CSV file. 
/doc/studio/user-guide/experiments/visualize-and-compare#display-plots-and-images [Compare experiments]: /doc/studio/user-guide/experiments/visualize-and-compare#compare-experiments -[run experiments]: /doc/studio/user-guide/experiments/run-experiments [live-metrics-and-plots]: /doc/studio/user-guide/experiments/live-metrics-and-plots [Generate trend charts]: diff --git a/content/docs/studio/user-guide/experiments/index.md b/content/docs/studio/user-guide/experiments/index.md index 573d23b1af..13425b0102 100644 --- a/content/docs/studio/user-guide/experiments/index.md +++ b/content/docs/studio/user-guide/experiments/index.md @@ -2,8 +2,7 @@ You can submit your experiments from your favorite interface - whether it is Jupyter Notebooks, a code editor or IDE like [VS Code](/doc/vs-code-extension), -the Python cli, the bash terminal, etc. You can also [submit new experiments -from DVC Studio][run-experiments]. +the Python cli, the bash terminal, etc. You can track live as well as completed experiments in DVC Studio. First, click on `Add a project` to connect DVC Studio to your ML project's Git repository. @@ -16,7 +15,7 @@ To quickly start tracking your experiments with DVC Studio: - In your model training environment, install [DVCLive]: ```cli - pip install dvclive + $ pip install dvclive ``` - Copy your @@ -24,7 +23,7 @@ To quickly start tracking your experiments with DVC Studio: and configure your model training environment to use the token: ```cli - dvc config --global studio.token *** + $ dvc config --global studio.token *** ``` - Use the DVCLive [`log_metric()`](/doc/dvclive/live/log_metric#livelog_metric) @@ -50,7 +49,7 @@ To quickly start tracking your experiments with DVC Studio: - Run the training job: ```cli - python train.py + $ python train.py ``` - The metrics and plots will be [tracked live][live-metrics-and-plots] in the @@ -121,8 +120,6 @@ Within a project, you can: have pushed to your Git repository. 
- [Visualize the experiments][visualize] using plots and trend charts. - [Compare experiments][compare]. -- [Run new experiments][run-experiments], including on your own cloud instances, - directly from DVC Studio. - [Track live metrics and plots][live-metrics-and-plots] by sending them to DVC Studio by using [DVCLive]. @@ -131,7 +128,6 @@ Within a project, you can: [visualize]: /doc/studio/user-guide/experiments/visualize-and-compare [compare]: /doc/studio/user-guide/experiments/visualize-and-compare#compare-experiments -[run-experiments]: /doc/studio/user-guide/experiments/run-experiments [live-metrics-and-plots]: /doc/studio/user-guide/experiments/live-metrics-and-plots [dvclive]: /doc/dvclive diff --git a/content/docs/studio/user-guide/experiments/run-experiments.md b/content/docs/studio/user-guide/experiments/run-experiments.md index a96b9be71b..278f757e54 100644 --- a/content/docs/studio/user-guide/experiments/run-experiments.md +++ b/content/docs/studio/user-guide/experiments/run-experiments.md @@ -1,73 +1,22 @@ # Run Experiments -DVC Studio can train your model and run experiments with different -hyperparameters or datasets. Experiments can be +The functionality to train your model and run experiments with different +hyperparameters or datasets is currently disabled in DVC Studio. However, DVC +experiments that are created from the terminal or the VS Code extension can be +monitored and managed from DVC Studio. -- [Cloud experiments](#cloud-experiments), that run on your own cloud, - independent of your CI/CD setup, or -- [CI-based experiments](#ci-based-experiments), that invoke your CI/CD setup - for model training. The training can run on your own cloud. +## Monitor a running experiment -> Due to access restrictions, you cannot run experiments on the demo projects -> that are provided to you by default (such as the [`example-get-started` demo -> project][example-get-started project]). 
Once you connect to your ML project -> repositories, you can follow the instructions given below to run experiments -> directly from DVC Studio. - -## Cloud experiments - - - -Cloud experiments are in alpha release and are subject to change. - - - -Run your experiments on your own cloud compute instances in AWS, GCP, and Azure -(coming soon). For this, you need to: - -1. Set up - [credentials for the cloud provider](/doc/studio/user-guide/account-and-billing#cloud-credentials) -2. Create a [DVC experiment pipeline] in your project. Note that while the - default [script for cloud experiments] runs a DVC pipeline, advanced users - can run experiments without a pipeline by modifying the script. -3. Create a `requirements.txt` file listing all Python packages required by the - project. The default [script for cloud experiments] installs all requirements - from `requirements.txt`, although advanced users can specify requirements - differently by modifying the script. - -[script for cloud experiments]: #submit-a-new-experiment - -### Submit a new experiment - -Once you have added credentials, navigate to the project and follow these steps: - -- Select the Git commit from which you want to iterate. Click the `Run` button - at the top. In the form that opens up, switch to the `Cloud` tab to run cloud - experiments. -- At the top of the form, configure the cloud instance by selecting the cloud - region, instance size, disk size, whether to use a spot instance, etc. -- Under the `Parameters` tab, optionally - [modify any parameters](#hyperparameters) of your [DVC experiment pipeline]. -- Under the `Script` tab, you can see the commands that will be run on the cloud - instance. You can modify this script, although this is not recommended unless - you know what you are doing, as it could cause the experiment to fail. -- At the bottom of the form, click `Run Experiment` to start the experiment. 
- -![DVC Studio run cloud experiment](/img/studio-run_cloud_exp_form.png) - -[dvc experiment pipeline]: /doc/start/experiments/experiment-pipelines - -### Monitor a running experiment - -Once you submit an experiment, DVC Studio creates the cloud instance and runs -the job script. A new row is created in the experiments table under the original -Git commit. [Live updates to metrics and plots] generated by [DVCLive] will show -up in this row, and you can click on the experiment name to view the status and -output log of the running experiment task. +Once you submit an experiment, a new row is created in the experiments table +under the original Git commit. +[Live updates to metrics and plots](/doc/studio/user-guide/experiments/live-metrics-and-plots) +generated by [DVCLive] will show up in this row, and you can click on the +experiment name to view the status and output log of the running experiment +task. ![DVC Studio View logs and live metrics of cloud experiments](/img/studio-cloud_exp_logs_and_live_metrics.gif) -### Manage a completed experiment +## Manage a completed experiment When the experiment completes, the files (including code, data, models, parameters, metrics, and plots) are pushed back to your Git and DVC remotes. @@ -80,173 +29,4 @@ easily compare your experiment with its baseline. ![DVC Studio Create a New Branch](/img/studio-branch.gif) -## CI-Based experiments - -DVC Studio can also use your regular CI/CD setup (e.g. GitHub Actions) to run -the experiments. To enable this, do the following: - -1. First, integrate your Git repository with a CI/CD setup that includes model - training process. You can - [use the wizard provided by DVC Studio](/doc/studio/user-guide/experiments/run-experiments#use-the-iterative-studio-wizard-to-set-up-your-ci-action) - to automatically generate the CI script, or you can write it on your own. - -2. 
Then, - [setup the yaml workflow environment variables as secrets](https://cml.dev/doc/self-hosted-runners#environment-variables). - This is needed so that your CI workflow can launch the runner in your desired - cloud provider. - -3. Now, [submit your experiments from DVC Studio](#submit-a-new-experiment). - Each submission will invoke your CI/CD setup, triggering the model training - process. - -### Use the DVC Studio wizard to set up your CI action - -Select a commit and click **Run**. You will see a message that invites you to -set up your CI. - -![](https://static.iterative.ai/img/studio/set_up_cml_message.png) - -The CI setup wizard opens as shown below. - -![](https://static.iterative.ai/img/studio/set_up_cml_full.png) - -This wizard has the following two sections, pre-filled with default values: - -- Left section with 2 sets of parameters: - - 1. Configuration of your self-hosted runner, which is used in the - `deploy-runner` step of your CI workflow. The parameters listed here are a - subset of the parameters for - [CML self-hosted runners](https://cml.dev/doc/self-hosted-runners). - - | Parameter | Meaning | - | ---------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | - | `Spot` | Whether you want to launch a spot cloud instance, cutting down the costs of your training. | - | `Cloud` | Your cloud provider. | - | `Region` | Cloud-vendor specific region or a CML synthetic region (an abstraction across all the cloud vendors). | - | `Type` | Cloud-vendor specific instance type or a CML synthetic type `M`/`L`/`XL` (an abstraction across all the cloud vendors). `Type` is also tied to GPU behavior. 
If you choose an instance with a selectable GPU (such as a CML instance type or any GCP instance), the `GPU` parameter will show up. | - | `HDD size` | Hard disk size in GB. We highly recommend you to enter a big enough value (eg, 100) to avoid unexpected runner termination due to hard disk exhaustion. | - | `Reuse` | Values for the CML flags `reuse` and `reuse-idle`. See all [CML options](https://cml.dev/doc/ref/runner#options) for details | - | `Labels` | Text labels to identify your CML runners from other self hosted runners that you might have. | - - 2. Job script, which is used in the `runner-job` step of your CI workflow - - | Parameter | Meaning | - | ------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | - | `Job script` | This is the script needed for your runner to execute your job, which would commonly include training your model. The default template is a very common combination of CML and DVC taking into account that DVC enables you to make the most of DVC Studio. You can update this script to reflect your exact model training process, whether you use DVC or not. | - -- Right section which displays the generated yaml to be used in your CI set up. - It reflects all your input parameters. Use the - `Copy to clipboard and paste in your CI Workflow file` link to copy the - generated yaml and create your CI script. - -That's it! At this point you should have CML in place within your CI/CD to run -your experiments. Now, you can submit your experiments. - -### Submit a new experiment - -Watch this video for an overview of how you can run CI-based experiments from -DVC Studio. 
- -https://www.youtube.com/watch?v=nXJXR-zBvHQ - -_Note that we have renamed DVC Studio mentioned in the above video to DVC Studio -and Views to Projects._ - -To run experiments from DVC Studio, determine the Git commit on which you want -to iterate. Select the commit and click the `Run` button. A form opens, with 2 -types of inputs that you can change: - -#### **Input data files**: - -You can change datasets that are used for model training. The list of files that -you can change will depend on your ML project. For instance, in the -[`example-get-started` ML project][example-get-started project], an authorized -user can change the `data.xml` file. DVC Studio identifies all the files used in -your ML project, which means that if you select the -`Show all input parameters (including hidden)` option, then you can also change -the hidden files such as the `model.pkl` model file and the `scores.json` -metrics file. You can also choose not to change any input data files if you only -wish to change the values of one or more hyperparameters. - -#### **Hyperparameters**: - -DVC Studio lists all the hyperparameters of your ML project and you can change -their values as per the new experiment that you want to run. For instance, in -the [`example-get-started` ML project][example-get-started project], an -authorized user can change `max_features` (the maximum number of features that -the model uses), `ngrams`, etc. You can also choose not to change any -hyperparameters if you only wish to change one or more input data files. - -The default values of the input data files and hyperparameters in this form are -extracted from your selected commit. - -![](https://static.iterative.ai/img/studio/cml_changes.png) - -### Enter commit details and submit the CI-Based experiment - -Once you have made all the required changes, enter your Git commit message and -description. 
- - - -If your CI job creates a new Git commit to write the experiment results to your -Git repository, you may want to hide the Git commit that you created when -submitting the experiment from your project table. In this case, add -`[skip studio]` in the commit message. For details, refer to [Display -preferences -> Hide commits]. - - - -Select the branch to commit to. You can commit to either the base branch or a -new branch. If you commit to a new branch, a Git pull request will automatically -be created from the new branch to the base branch. - -Click on `Commit changes`. - -![](https://static.iterative.ai/img/studio/cml_commit.png) - -### What happens after you submit a new CI-based experiment - -**Git commit (and pull request) are created:** DVC Studio will create a Git -commit with the changes you submitted. This commit appears in the project table. -If you had specified a new branch to commit the changes to, then a new pull -request will also be created from the new branch to the base branch. - -**Model training is invoked:** If your ML project is integrated with a CI/CD -setup (e.g. GitHub Actions), the CI/CD setup will get invoked. If this setup -includes a model training process, it will be triggered, which means that your -ML experiment will run automatically. - -The model training can happen on any cloud or Kubernetes. For more details on -how to set up -[CI/CD pipelines for your ML project](/doc/use-cases/ci-cd-for-machine-learning), -refer to [CML]. - -**Live metrics and plots are tracked:** In your model training CI action, you -can use [DVCLive] to send [live updates to metrics and plots] back to DVC -Studio, without writing them to your Git repository. The live metrics are -displayed alongside the corresponding experiment commits. 
- -[live updates to metrics and plots]: - /doc/studio/user-guide/experiments/live-metrics-and-plots - -**Metrics, plots and reports can be saved in Git:** In your model training CI -action, you can save the training results in Git. This means, once the -experiment completes, its metrics will be available in the project's experiment -table. You can then generate plots and trend charts for it, or compare it with -the other experiments. - -In your model training CI action, you can also use [CML] to create reports with -metrics, plots or other details. You can access the CML report by clicking on -the `CML report` icon next to the Git commit message in the experiment table. -The `CML Report` tooltip appears over the `CML report` icon on mouse hover. - -![](https://static.iterative.ai/img/studio/cml_report_icon.png) - -[example-get-started project]: - https://studio.iterative.ai/team/Iterative/projects/example-get-started-zde16i6c4g [dvclive]: /doc/dvclive -[cml]: https://cml.dev -[display preferences -> hide commits]: - /doc/studio/user-guide/experiments/explore-ml-experiments#hide-commits diff --git a/content/docs/studio/user-guide/model-registry/use-models.md b/content/docs/studio/user-guide/model-registry/use-models.md index 405b2cce37..9c57b1735f 100644 --- a/content/docs/studio/user-guide/model-registry/use-models.md +++ b/content/docs/studio/user-guide/model-registry/use-models.md @@ -9,67 +9,58 @@ capabilities. If your model file is DVC-tracked, you can download any of its registered versions using the DVC Studio [REST API], `dvc get`, or DVC [Python API]. -### Using DVC Studio API +Prerequisites: -Using DVC Studio API will require you to add your remote storage secrets to -respective projects but downloading models can be easier than DVC since you only -need to have the DVC Studio API token. You do not need direct access to your -remote storage, and you do not need to install DVC. 
DVC Studio APIs rely on -signing urls, which limits the number of remotes they're working with. You can -learn more on [REST API] pages. +- A model stored with DVC on an S3, Azure, HTTP, or HTTPS [remote]. +- The DVC Studio project you would like to download your model from needs access to + your [remote storage credentials]. +- Access to your [DVC Studio Access token]. -[REST API]: /doc/studio/rest-api -[Python API]: /doc/api-reference +Without these prerequisites, you can still [download a model artifact with DVC]. +However, it can be easier to use the DVC Studio API since you only need to have +the Studio access token. You do not need direct access to your remote storage or +Git repository, and you do not need to install DVC. -### Using `dvc get` +[remote]: /doc/user-guide/data-management/remote-storage +[remote storage credentials]: + /doc/studio/user-guide/experiments/configure-a-project#data-remotes--cloud-storage-credentials +[DVC Studio Access token]: + /doc/studio/user-guide/account-management#studio-access-token +[download a model artifact with DVC]: /doc/command-reference/artifacts/get -#### Looking up the right command in DVC Studio +You can download the files that make up your model directly from DVC Studio. +Head to the model details page of the model you would like to download and click +`Access Model`. Here, you will find different ways to download your model. -The `Get the model file` section of a model's details page contains a `dvc get` -command ready to copy and use. + -![](/img/mr-dvc-get.png) + -#### Constructing the command manually +Use the [dvc artifacts get] command to download an artifact by name. Learn more +on the command reference page for `dvc artifacts get`.
-You can also construct this command manually using this template: +[dvc artifacts get]: /doc/command-reference/artifacts/get -```cli -$ dvc get ${GIT_REPO} ${MODEL_PATH} --rev ${GIT_REV} -``` + -Let's download the `text-classification` model in the -[example-get-started](https://github.com/iterative/example-get-started) -repository. + -To find out the model file path (that is, `model.pkl`) you can check the model's -details page or check the `dvc.yaml` file to which the model annotation was -written. +Directly call the Studio [REST API](/doc/studio/rest-api) from your terminal +using `cURL` or in your `Python` code. -To find out the Git revision, you can check the -[list of Git tags](https://github.com/iterative/example-get-started/tags) or use -[GTO](/doc/gto/command-reference/show/): + -```cli -$ gto show text-classification@latest --ref -text-classification@v1.2.0 -``` + -Putting everything together, to download the model file, you can use the -following dvc command: +Here you can generate download links for your model files. After generation, +these download links are valid for 1 hour. You can click the link to directly +download the file. -```cli -$ dvc get https://github.com/iterative/example-get-started \ - model.pkl \ - --rev text-classification@v1.2.0 -``` +![Screenshot of access model button on the model details page](/img/mr-direct-download.png) -Note that `model.pkl` is the path that was specified in `dvc.yaml`, and not the -physical path of the model file (in the remote storage). + -DVC will figure out the right file to download from the remote storage for you. -Note that for running this command, you will need to have remote storage -credentials set up in your environment. 
+ ## Deploying and publishing models in CI/CD diff --git a/content/docs/studio/user-guide/team-collaboration/index.md b/content/docs/studio/user-guide/team-collaboration/index.md index 2784c07c0a..03ffdc8b36 100644 --- a/content/docs/studio/user-guide/team-collaboration/index.md +++ b/content/docs/studio/user-guide/team-collaboration/index.md @@ -73,7 +73,6 @@ even if the project belongs to a team where you are an `Editor` or `Admin`. | Apply filters | Yes\* | Yes | Yes | Yes | | Show / hide columns | Yes\* | Yes | Yes | Yes | | Save the project filters and columns settings | No | No | Yes | Yes | -| Run experiments | No | No | Yes | Yes | | View CML reports | No | No | Yes | Yes | | Share a project | No | No | Yes | Yes | | Add a new project | No | No | Yes | Yes | diff --git a/content/docs/studio/user-guide/troubleshooting.md b/content/docs/studio/user-guide/troubleshooting.md index d78ddeb271..1ddfa46382 100644 --- a/content/docs/studio/user-guide/troubleshooting.md +++ b/content/docs/studio/user-guide/troubleshooting.md @@ -27,7 +27,6 @@ If you need further help, you can send us a message using `Help` on the - [Error: Missing metric or plot file(s)](#error-missing-metric-or-plot-files) - [Error: Base commit not found](#error-base-commit-not-found) -- [Error: Failed to push experiment to repository](#error-failed-to-push-experiment-to-repository) - [Project does not display live metrics and plots](#project-does-not-display-live-metrics-and-plots) - [Project does not display DVC experiments](#project-does-not-display-dvc-experiments) - [Error: `dvc.lock` validation failed](#error-dvclock-validation-failed) @@ -236,46 +235,11 @@ repository again. Files that are larger than 10 MB are currently skipped by DVC Studio. -## Error: Base commit not found - -The base commit which you have selected cannot be found in your Git repository. -Check if the commit has been removed from your Git repository. 
- -## Error: Failed to push experiment to repository - -This is a non-specific error with a range of possible causes. To resolve it, -check that: - -- Your account is able to push to the Git repository. - - - - If your Git account does not have write access on the Git repository connected - to a project, you cannot push changes (e.g., new experiments) to the - repository even if the project belongs to a team - [where you are an `Editor` or `Admin`](/doc/studio/user-guide/team-collaboration#roles). - - - -- The repository is **not** marked as archived / read only. -- In case of GitHub/GitLab/BitBucket enterprise organizations: there is no IP - whitelisting policy in place which limits access to the organization's - resources. -- Whether [GitHub][gh-status], [GitLab][gl-status], or [BitBucket][bb-status] - are experiencing service disruptions. In case of an on-premises deployment, do - check with your administrator. - -[gh-status]: https://www.githubstatus.com/ -[gl-status]: https://status.gitlab.com/ -[bb-status]: https://bitbucket.status.atlassian.com/ [local-dvc-remotes]: /doc/user-guide/data-management/remote-storage#file-systems-local-remotes [cloud-credentials]: /doc/studio/user-guide/experiments/configure-a-project#data-remotes--cloud-storage-credentials -If you get this error and none of the above applies, please -[get in touch with us](#support). - ## Project does not display live metrics and plots Confirm that you are correctly following the diff --git a/content/docs/user-guide/experiment-management/running-experiments.md b/content/docs/user-guide/experiment-management/running-experiments.md index 73e8e56098..566d647481 100644 --- a/content/docs/user-guide/experiment-management/running-experiments.md +++ b/content/docs/user-guide/experiment-management/running-experiments.md @@ -31,7 +31,7 @@ See [Get Started: Experimenting Using Pipelines] for an intro to this topic.
-### Running the pipeline(s) locally +### Running the pipeline(s) You can run the experiment pipelines using `dvc exp run`. It uses `./dvc.yaml` (in the current directory) by default. @@ -79,19 +79,6 @@ Pipelines] for differences between them. [dependency graph]: /doc/user-guide/pipelines/defining-pipelines [running pipelines]: /doc/user-guide/pipelines/running-pipelines -### Running the pipeline(s) on cloud - -You can use [DVC Studio] to run the experiment pipeline on a remote compute -instance using your own cloud infrastructure. This will spin up a cloud compute -instance on your cloud provider, run a setup script, execute your pipeline using -`dvc exp run`, and share the results back to DVC Studio. See the DVC Studio [Run -Experiments] guide. - -![DVC Studio run cloud experiments](/img/studio-run-cloud-experiments.gif) - -[dvc studio]: https://studio.iterative.ai -[run experiments]: /doc/studio/user-guide/experiments/run-experiments - ## Experiment results The results of the last `dvc exp run` can be seen in the workspace. diff --git a/content/docs/user-guide/project-structure/configuration.md b/content/docs/user-guide/project-structure/configuration.md index 4ee8562757..8975658698 100644 --- a/content/docs/user-guide/project-structure/configuration.md +++ b/content/docs/user-guide/project-structure/configuration.md @@ -215,7 +215,7 @@ section):
-### hydra +## hydra Sets the defaults for experiment configuration via [Hydra Composition]. diff --git a/package.json b/package.json index fbd244c093..f7b820c349 100644 --- a/package.json +++ b/package.json @@ -43,7 +43,7 @@ "node": ">=18.x <=20.x" }, "dependencies": { - "@dvcorg/gatsby-theme-iterative": "0.3.9", + "@dvcorg/gatsby-theme-iterative": "0.3.10", "@dvcorg/websites-server": "0.1.1", "@octokit/request": "8.1.1", "@reach/router": "1.3.4", diff --git a/src/components/DownloadButton/index.tsx b/src/components/DownloadButton/index.tsx index 2905db79dd..1e44522bc4 100644 --- a/src/components/DownloadButton/index.tsx +++ b/src/components/DownloadButton/index.tsx @@ -9,7 +9,7 @@ import { logEvent } from '@dvcorg/gatsby-theme-iterative/src/utils/front/plausib import * as styles from './styles.module.css' import { OS, useUserOS } from '../../utils/front/useUserOS' -const VERSION = `3.28.0` +const VERSION = `3.30.3` const dropdownItems = [ OS.UNKNOWN, diff --git a/static/img/mr-assign-model-stage.webm b/static/img/mr-assign-model-stage.webm new file mode 100644 index 0000000000..7ebdd91dc0 Binary files /dev/null and b/static/img/mr-assign-model-stage.webm differ diff --git a/static/img/mr-direct-download.png b/static/img/mr-direct-download.png new file mode 100644 index 0000000000..712a862016 Binary files /dev/null and b/static/img/mr-direct-download.png differ diff --git a/static/img/mr-dvc-get.png b/static/img/mr-dvc-get.png index 1de0243268..afcdde820e 100644 Binary files a/static/img/mr-dvc-get.png and b/static/img/mr-dvc-get.png differ diff --git a/static/img/mr-model-history.png b/static/img/mr-model-history.png new file mode 100644 index 0000000000..af4f0a4cae Binary files /dev/null and b/static/img/mr-model-history.png differ diff --git a/static/img/mr-newly-added-model.png b/static/img/mr-newly-added-model.png new file mode 100644 index 0000000000..875b8ccb9f Binary files /dev/null and b/static/img/mr-newly-added-model.png differ diff --git 
a/static/img/mr-register-model-version.webm b/static/img/mr-register-model-version.webm new file mode 100644 index 0000000000..cb09f957a0 Binary files /dev/null and b/static/img/mr-register-model-version.webm differ diff --git a/static/img/mr-remove-model-stage.webm b/static/img/mr-remove-model-stage.webm new file mode 100644 index 0000000000..36d5d61d01 Binary files /dev/null and b/static/img/mr-remove-model-stage.webm differ diff --git a/static/img/mr-studio-download-model.webm b/static/img/mr-studio-download-model.webm new file mode 100644 index 0000000000..26c72683ef Binary files /dev/null and b/static/img/mr-studio-download-model.webm differ diff --git a/static/img/placeholder-cat.gif b/static/img/placeholder-cat.gif new file mode 100644 index 0000000000..3424dd5cea Binary files /dev/null and b/static/img/placeholder-cat.gif differ diff --git a/yarn.lock b/yarn.lock index ce64222b14..924e57f145 100644 --- a/yarn.lock +++ b/yarn.lock @@ -1763,10 +1763,10 @@ resolved "https://registry.yarnpkg.com/@discoveryjs/json-ext/-/json-ext-0.5.7.tgz#1d572bfbbe14b7704e0ba0f39b74815b84870d70" integrity sha512-dBVuXR082gk3jsFp7Rd/JI4kytwGHecnCoTtXFb7DB6CNHp4rg5k1bhg0nWdLGLnOV71lmDzGQaLMy8iPLY0pw== -"@dvcorg/gatsby-theme-iterative@0.3.9": - version "0.3.9" - resolved "https://registry.yarnpkg.com/@dvcorg/gatsby-theme-iterative/-/gatsby-theme-iterative-0.3.9.tgz#bd16ebae38ee07cb7e2d5c8c741e2f7a0de05e3a" - integrity sha512-7G4io3GgRVNdyYBvhI2uJwN9CoQBhDtX7oy+9wSp3NBGPbQEMYS1xlYk2oVDtVCybfhm5RIMZ7qQjLSkjtYMug== +"@dvcorg/gatsby-theme-iterative@0.3.10": + version "0.3.10" + resolved "https://registry.yarnpkg.com/@dvcorg/gatsby-theme-iterative/-/gatsby-theme-iterative-0.3.10.tgz#630d1266bc3875333ee5a8f1eaf8394af11a19cc" + integrity sha512-J4ba+E/UcYMww1c6IwZAs51VoEFzwdMVgw8UqpSxvcZLSeGX5T7hAhqW56mxK0cRB/TJJALxvTVktntqtDFPiA== dependencies: "@reach/portal" "^0.18.0" "@reach/skip-nav" "^0.18.0"