added Kubeflow Pipeline v2 caching doc (#2847)
* added v2 caching doc

* fixed format

* fixed v2 cache format

* fixed nits

* fixed comments

* added link

* added cache explanation
capri-xiyue authored Aug 11, 2021
1 parent 7135fac commit 8939db0
Showing 2 changed files with 37 additions and 7 deletions.
33 changes: 33 additions & 0 deletions content/en/docs/components/pipelines/caching-v2.md
@@ -0,0 +1,33 @@
+++
title = "Caching v2"
description = "Getting started with Kubeflow Pipelines caching v2"
weight = 50

+++
{{% beta-status
feedbacklink="https://github.com/kubeflow/pipelines/issues" %}}

Starting with [Kubeflow Pipelines SDK v2](https://www.kubeflow.org/docs/components/pipelines/sdk/v2/) and Kubeflow Pipelines 1.7.0, Kubeflow Pipelines supports step caching in both [standalone deployment](https://www.kubeflow.org/docs/components/pipelines/installation/standalone-deployment/) and [AI Platform Pipelines](https://cloud.google.com/ai-platform/pipelines/docs).

## Before you start
This guide introduces the basic concepts of Kubeflow Pipelines caching and shows how to use it.
It assumes that you already have Kubeflow Pipelines installed, or that you plan to deploy it using the standalone or AI Platform Pipelines options in the [Kubeflow Pipelines deployment guide](/docs/components/pipelines/installation/).

## What is step caching?

Kubeflow Pipelines caching provides step-level output caching, a process that helps to reduce costs by skipping computations that were completed in a previous pipeline run.
Caching is enabled by default for all tasks of pipelines built with [Kubeflow Pipelines SDK v2](https://www.kubeflow.org/docs/components/pipelines/sdk/v2/) using `kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE` mode.
When Kubeflow Pipelines runs a pipeline, it checks, for each pipeline task, whether an execution with the same interface already exists in Kubeflow Pipelines.
The task's interface is defined as the combination of the pipeline task specification (base image, command, args), the pipeline task's inputs (the name and ID of artifacts, the name and value of parameters),
and the pipeline task's outputs specification (artifacts and parameters).

If there is a matching execution in Kubeflow Pipelines, the outputs of that execution are used, and the task is skipped.

Note: If the producer task that generates an artifact is not cached, it generates a new artifact with a different ID, so any downstream task that consumes that artifact won't hit the cache.
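
The matching rule described above can be pictured as comparing fingerprints of the task interface. The following is a minimal sketch only: the `interface_fingerprint` helper, the dictionary layout, and the use of SHA-256 are assumptions made for illustration and are not how Kubeflow Pipelines actually computes or stores its cache key.

```python
import hashlib
import json


def interface_fingerprint(task_spec, inputs, output_spec):
    """Hypothetical helper: hash the three parts of a task's interface.

    This only illustrates the idea of a cache key; it is not the
    Kubeflow Pipelines implementation.
    """
    payload = json.dumps(
        {"spec": task_spec, "inputs": inputs, "outputs": output_spec},
        sort_keys=True,  # stable ordering so equal interfaces hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Made-up task specification and outputs specification.
spec = {"image": "python:3.9", "command": ["python", "train.py"], "args": ["--epochs", "5"]}
outputs = {"artifacts": ["model"], "parameters": ["accuracy"]}

# Same artifact ID and parameter value: the interfaces match (cache hit).
first_run = interface_fingerprint(
    spec, {"artifacts": {"dataset": 101}, "parameters": {"lr": 0.01}}, outputs
)
same_inputs = interface_fingerprint(
    spec, {"artifacts": {"dataset": 101}, "parameters": {"lr": 0.01}}, outputs
)

# The producer re-ran and emitted a new artifact ID: no match (cache miss).
new_artifact = interface_fingerprint(
    spec, {"artifacts": {"dataset": 102}, "parameters": {"lr": 0.01}}, outputs
)

print(first_run == same_inputs)
print(first_run == new_artifact)
```

The last comparison mirrors the note about producer tasks: a changed artifact ID changes the interface of every task that consumes it, even when the artifact's contents are identical.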

## Disabling/enabling caching

Caching is enabled by default with [Kubeflow Pipelines SDK v2](https://www.kubeflow.org/docs/components/pipelines/sdk/v2/) using `kfp.dsl.PipelineExecutionMode.V2_COMPATIBLE` mode.

You can turn off execution caching for pipeline runs that are created using Python. When you run a pipeline using [create_run_from_pipeline_func](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp.Client.create_run_from_pipeline_func), [create_run_from_pipeline_package](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp.Client.create_run_from_pipeline_package), or [run_pipeline](https://kubeflow-pipelines.readthedocs.io/en/latest/source/kfp.client.html#kfp.Client.run_pipeline), set the `enable_caching` argument to `False` to specify that this pipeline run does not use caching.
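
As a rough sketch of the call described above, the snippet below submits a run with caching turned off. It assumes a KFP SDK release that accepts `enable_caching` (1.7.0 or later) and a reachable Kubeflow Pipelines endpoint. The `submit_run` wrapper and its `host` parameter are hypothetical names introduced for this example, and `pipeline_func` stands for a pipeline function you have already defined.

```python
def submit_run(pipeline_func, host, enable_caching=False):
    """Hypothetical wrapper: submit a run with caching on or off for this run only."""
    # Imported lazily so this module loads even where the SDK is not installed.
    import kfp
    from kfp import dsl

    client = kfp.Client(host=host)
    return client.create_run_from_pipeline_func(
        pipeline_func,
        arguments={},
        mode=dsl.PipelineExecutionMode.V2_COMPATIBLE,
        enable_caching=enable_caching,  # False: every task in this run executes
    )
```

Calling `submit_run(my_pipeline, host="<your endpoint>")` would then create a run in which no task reuses cached outputs. Here `my_pipeline` and the endpoint are placeholders.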
11 changes: 4 additions & 7 deletions content/en/docs/components/pipelines/caching.md
@@ -4,18 +4,15 @@ description = "Getting started with Kubeflow Pipelines step caching"
weight = 50

+++
{{% alert title="Out of date" color="warning" %}}
This guide contains outdated information pertaining to Kubeflow 1.0. This guide
needs to be updated for Kubeflow 1.1.
{{% /alert %}}
{{% alpha-status
feedbacklink="https://github.com/kubeflow/pipelines/issues" %}}

Starting from Kubeflow Pipelines 0.4, Kubeflow Pipelines supports step caching capabilities in both standalone deployment and GCP hosted deployment.
Starting from Kubeflow Pipelines 0.4, Kubeflow Pipelines supports step caching capabilities in both standalone deployment and AI Platform Pipelines.

## Before you start

This guide tells you the basic concepts of Kubeflow Pipelines step caching and how to use it.
This guide assumes that you already have Kubeflow Pipelines installed or want to use standalone or GCP hosted deployment options in the [Kubeflow Pipelines deployment
guide](/docs/components/pipelines/installation/) to deploy Kubeflow Pipelines.
This guide assumes that you already have Kubeflow Pipelines installed or want to use options in the [Kubeflow Pipelines deployment guide](/docs/components/pipelines/installation/) to deploy Kubeflow Pipelines.

## What is step caching?
