Add some nicer docs #3328

Merged
merged 8 commits into from
Jan 30, 2025
6 changes: 5 additions & 1 deletion .gitbook.yaml
@@ -264,4 +264,8 @@ redirects:
how-to/advanced-topics/control-logging/disable-rich-traceback: how-to/control-logging/disable-rich-traceback.md
how-to/advanced-topics/control-logging/disable-colorful-logging: how-to/control-logging/disable-colorful-logging.md


how-to/pipeline-development/trigger-pipelines/: how-to/trigger-pipelines/README.md
how-to/pipeline-development/trigger-pipelines/use-templates-python: how-to/trigger-pipelines/use-templates-python.md
how-to/pipeline-development/trigger-pipelines/use-templates-cli: how-to/trigger-pipelines/use-templates-cli.md
how-to/pipeline-development/trigger-pipelines/use-templates-dashboard: how-to/trigger-pipelines/use-templates-dashboard.md
how-to/pipeline-development/trigger-pipelines/use-templates-rest-api: how-to/trigger-pipelines/use-templates-rest-api.md
2 changes: 1 addition & 1 deletion docs/book/getting-started/zenml-pro/README.md
@@ -14,7 +14,7 @@ that expand the functionality of the Open Source product. ZenML Pro adds a manag
- **User management with teams**: Create [organizations](./organization.md) and [teams](./teams.md) to easily manage users at scale.
- **Role-based access control and permissions**: Implement fine-grained access control using customizable [roles](./roles.md) to ensure secure and efficient resource management.
- **Enhanced model and artifact control plane**: Leverage the [Model Control Plane](../../user-guide/starter-guide/track-ml-models.md) and [Artifact Control Plane](../../user-guide/starter-guide/manage-artifacts.md) for improved tracking and management of your ML assets.
- **Triggers and run templates**: ZenML Pro enables you to [create and run templates](../../how-to/pipeline-development/trigger-pipelines/README.md#run-templates). This way, you can use the dashboard or our Client/REST API to run a pipeline with updated configuration, allowing you to iterate quickly with minimal friction.
- **Triggers and run templates**: ZenML Pro enables you to [create and run templates](../../how-to/trigger-pipelines/README.md#run-templates). This way, you can use the dashboard or our Client/REST API to run a pipeline with updated configuration, allowing you to iterate quickly with minimal friction.
- **Early-access features**: Get early access to pro-specific features such as triggers, filters, sorting, generating usage reports, and more.

Learn more about ZenML Pro on the [ZenML website](https://zenml.io/pro).
2 changes: 1 addition & 1 deletion docs/book/getting-started/zenml-pro/core-concepts.md
@@ -12,7 +12,7 @@ The image above shows the hierarchy of concepts in ZenML Pro.
- [**Teams**](./teams.md) are groups of users within an organization. They help in organizing users and managing access to resources.
- **Users** are single individual accounts on a ZenML Pro instance.
- [**Roles**](./roles.md) are used to control what actions users can perform within a tenant or inside an organization.
- [**Templates**](../../how-to/pipeline-development/trigger-pipelines/README.md) are pipeline runs that can be re-run with a different configuration.
- [**Templates**](../../how-to/trigger-pipelines/README.md) are pipeline runs that can be re-run with a different configuration.

More details about each of these concepts are available in their linked pages below:

2 changes: 1 addition & 1 deletion docs/book/getting-started/zenml-pro/self-hosted.md
@@ -14,7 +14,7 @@ ZenML Pro can be installed as a self-hosted deployment. You need to be granted a
This document will guide you through the process.

{% hint style="info" %}
Please note that the SSO (Single Sign-On) and [Run Templates](../../how-to/pipeline-development/trigger-pipelines/README.md) (i.e. running pipelines from the dashboard) features are currently not available in the on-prem version of ZenML Pro. These features are on our roadmap and will be added in future releases.
Please note that the SSO (Single Sign-On) and [Run Templates](../../how-to/trigger-pipelines/README.md) (i.e. running pipelines from the dashboard) features are currently not available in the on-prem version of ZenML Pro. These features are on our roadmap and will be added in future releases.
{% endhint %}


4 changes: 2 additions & 2 deletions docs/book/getting-started/zenml-pro/tenants.md
@@ -104,8 +104,8 @@ Some Pro-only features that you can leverage in your tenant are as follows:

- [Model Control Plane](../../../../docs/book/how-to/model-management-metrics/model-control-plane/register-a-model.md)
- [Artifact Control Plane](../../how-to/data-artifact-management/handle-data-artifacts/README.md)
- [Ability to run pipelines from the Dashboard](../../../../docs/book/how-to/pipeline-development/trigger-pipelines/use-templates-rest-api.md),
- [Create templates out of your pipeline runs](../../../../docs/book/how-to/pipeline-development/trigger-pipelines/use-templates-rest-api.md)
- [Ability to run pipelines from the Dashboard](../../../../docs/book/how-to/trigger-pipelines/use-templates-rest-api.md),
- [Create templates out of your pipeline runs](../../../../docs/book/how-to/trigger-pipelines/use-templates-rest-api.md)

and [more](https://zenml.io/pro)!

@@ -89,6 +89,6 @@ def example_pipeline():
example_pipeline()
```

You can see another example of using an `UnmaterializedArtifact` when triggering a [pipeline from another](../../pipeline-development/trigger-pipelines/use-templates-python.md#advanced-usage-run-a-template-from-another-pipeline).
You can see another example of using an `UnmaterializedArtifact` when triggering a [pipeline from another](../../trigger-pipelines/use-templates-python.md#advanced-usage-run-a-template-from-another-pipeline).

<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
@@ -49,6 +49,6 @@ locally or remotely. See our documentation on this [here](../../../getting-start

Check below for more advanced ways to build and interact with your pipeline.

<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td>Configure pipeline/step parameters</td><td></td><td></td><td><a href="use-pipeline-step-parameters.md">use-pipeline-step-parameters.md</a></td></tr><tr><td>Name and annotate step outputs</td><td></td><td></td><td><a href="step-output-typing-and-annotation.md">step-output-typing-and-annotation.md</a></td></tr><tr><td>Control caching behavior</td><td></td><td></td><td><a href="control-caching-behavior.md">control-caching-behavior.md</a></td></tr><tr><td>Run pipeline from a pipeline</td><td></td><td></td><td><a href="../trigger-pipelines/README.md">README.md</a></td></tr><tr><td>Control the execution order of steps</td><td></td><td></td><td><a href="control-execution-order-of-steps.md">control-execution-order-of-steps.md</a></td></tr><tr><td>Customize the step invocation ids</td><td></td><td></td><td><a href="using-a-custom-step-invocation-id.md">using-a-custom-step-invocation-id.md</a></td></tr><tr><td>Name your pipeline runs</td><td></td><td></td><td><a href="name-your-pipeline-runs.md">name-your-pipeline-runs.md</a></td></tr><tr><td>Use failure/success hooks</td><td></td><td></td><td><a href="use-failure-success-hooks.md">use-failure-success-hooks.md</a></td></tr><tr><td>Hyperparameter tuning</td><td></td><td></td><td><a href="hyper-parameter-tuning.md">hyper-parameter-tuning.md</a></td></tr><tr><td>Attach metadata to a step</td><td></td><td></td><td><a href="../../model-management-metrics/track-metrics-metadata/attach-metadata-to-a-step.md">attach-metadata-to-a-step.md</a></td></tr><tr><td>Fetch metadata within steps</td><td></td><td></td><td><a href="../../model-management-metrics/track-metrics-metadata/fetch-metadata-within-steps.md">fetch-metadata-within-steps.md</a></td></tr><tr><td>Fetch metadata during pipeline composition</td><td></td><td></td><td><a 
href="../../model-management-metrics/track-metrics-metadata/fetch-metadata-within-pipeline.md">fetch-metadata-within-pipeline.md</a></td></tr><tr><td>Enable or disable logs storing</td><td></td><td></td><td><a href="../../control-logging/enable-or-disable-logs-storing.md">enable-or-disable-logs-storing.md</a></td></tr><tr><td>Special Metadata Types</td><td></td><td></td><td><a href="../../model-management-metrics/track-metrics-metadata/logging-metadata.md">logging-metadata.md</a></td></tr><tr><td>Access secrets in a step</td><td></td><td></td><td><a href="access-secrets-in-a-step.md">access-secrets-in-a-step.md</a></td></tr></tbody></table>
<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td>Configure pipeline/step parameters</td><td></td><td></td><td><a href="use-pipeline-step-parameters.md">use-pipeline-step-parameters.md</a></td></tr><tr><td>Name and annotate step outputs</td><td></td><td></td><td><a href="step-output-typing-and-annotation.md">step-output-typing-and-annotation.md</a></td></tr><tr><td>Control caching behavior</td><td></td><td></td><td><a href="control-caching-behavior.md">control-caching-behavior.md</a></td></tr><tr><td>Customize the step invocation ids</td><td></td><td></td><td><a href="using-a-custom-step-invocation-id.md">using-a-custom-step-invocation-id.md</a></td></tr><tr><td>Name your pipeline runs</td><td></td><td></td><td><a href="name-your-pipeline-runs.md">name-your-pipeline-runs.md</a></td></tr><tr><td>Use failure/success hooks</td><td></td><td></td><td><a href="use-failure-success-hooks.md">use-failure-success-hooks.md</a></td></tr><tr><td>Hyperparameter tuning</td><td></td><td></td><td><a href="hyper-parameter-tuning.md">hyper-parameter-tuning.md</a></td></tr><tr><td>Attach metadata to a step</td><td></td><td></td><td><a href="../../model-management-metrics/track-metrics-metadata/attach-metadata-to-a-step.md">attach-metadata-to-a-step.md</a></td></tr><tr><td>Fetch metadata within steps</td><td></td><td></td><td><a href="../../model-management-metrics/track-metrics-metadata/fetch-metadata-within-steps.md">fetch-metadata-within-steps.md</a></td></tr><tr><td>Fetch metadata during pipeline composition</td><td></td><td></td><td><a href="../../model-management-metrics/track-metrics-metadata/fetch-metadata-within-pipeline.md">fetch-metadata-within-pipeline.md</a></td></tr><tr><td>Enable or disable logs storing</td><td></td><td></td><td><a href="../../control-logging/enable-or-disable-logs-storing.md">enable-or-disable-logs-storing.md</a></td></tr><tr><td>Special Metadata 
Types</td><td></td><td></td><td><a href="../../model-management-metrics/track-metrics-metadata/logging-metadata.md">logging-metadata.md</a></td></tr><tr><td>Access secrets in a step</td><td></td><td></td><td><a href="access-secrets-in-a-step.md">access-secrets-in-a-step.md</a></td></tr></tbody></table>

<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>
@@ -30,7 +30,7 @@ def training_pipeline():
```

{% hint style="info" %}
Here we are calling one pipeline from within another pipeline, so functionally the `data_loading_pipeline` is functioning as a step within the `training_pipeline`, i.e. the steps of the former are added to the latter. Only the parent pipeline will be visible in the dashboard. In order to actually trigger a pipeline from another, see [here](../../pipeline-development/trigger-pipelines/use-templates-python.md#advanced-usage-run-a-template-from-another-pipeline)
Here we are calling one pipeline from within another pipeline, so functionally the `data_loading_pipeline` is functioning as a step within the `training_pipeline`, i.e. the steps of the former are added to the latter. Only the parent pipeline will be visible in the dashboard. In order to actually trigger a pipeline from another, see [here](../../trigger-pipelines/use-templates-python.md#advanced-usage-run-a-template-from-another-pipeline)
{% endhint %}

<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td>Learn more about orchestrators here</td><td></td><td></td><td><a href="../../../component-guide/orchestrators/orchestrators.md">orchestrators.md</a></td></tr></tbody></table>
@@ -14,9 +14,9 @@ You can learn more about these options [here](../../pipeline-development/use-con

However, there is one exception: if you would like to trigger a pipeline from the client
or another pipeline, you would need to pass the `PipelineRunConfiguration` object.
Learn more about this [here](../../pipeline-development/trigger-pipelines/use-templates-python.md#advanced-usage-run-a-template-from-another-pipeline).
Learn more about this [here](../../trigger-pipelines/use-templates-python.md#advanced-usage-run-a-template-from-another-pipeline).

<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td>Using config files</td><td></td><td></td><td><a href="../../pipeline-development/use-configuration-files/README.md">../../pipeline-development/use-configuration-files/README.md</a></td></tr></tbody></table>
<table data-view="cards"><thead><tr><th></th><th></th><th></th><th data-hidden data-card-target data-type="content-ref"></th></tr></thead><tbody><tr><td>Using config files</td><td></td><td></td><td><a href="../../use-configuration-files/README.md">../../pipeline-development/use-configuration-files/README.md</a></td></tr></tbody></table>

<!-- For scarf -->
<figure><img alt="ZenML Scarf" referrerpolicy="no-referrer-when-downgrade" src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" /></figure>
@@ -0,0 +1,77 @@
---
description: Running steps in parallel.
---

# Fan-in and Fan-out Patterns

The fan-out/fan-in pattern is a common pipeline architecture where a single step splits into multiple parallel operations (fan-out) and then consolidates the results back into a single step (fan-in). This pattern is particularly useful for parallel processing, distributed workloads, or when you need to process data through different transformations and then aggregate the results. For example, you might want to process different chunks of data in parallel and then aggregate the results:

```python
from zenml import step, get_step_context, pipeline
from zenml.client import Client


@step
def load_step() -> str:
    return "Hello from ZenML!"


@step
def process_step(input_data: str) -> str:
    return input_data


@step
def combine_step(step_prefix: str, output_name: str) -> None:
    run_name = get_step_context().pipeline_run.name
    run = Client().get_pipeline_run(run_name)

    # Fetch all results from parallel processing steps
    processed_results = {}
    for step_name, step_info in run.steps.items():
        if step_name.startswith(step_prefix):
            output = step_info.outputs[output_name][0]
            processed_results[step_info.name] = output.load()

    # Combine all results
    print(",".join([f"{k}: {v}" for k, v in processed_results.items()]))


@pipeline(enable_cache=False)
def fan_out_fan_in_pipeline(parallel_count: int) -> None:
    # Initial step (source)
    input_data = load_step()

    # Fan out: Process data in parallel branches
    after = []
    for i in range(parallel_count):
        _ = process_step(input_data, id=f"process_{i}")
        after.append(f"process_{i}")

    # Fan in: Combine results from all parallel branches
    combine_step(step_prefix="process_", output_name="output", after=after)


fan_out_fan_in_pipeline(parallel_count=8)
```

The fan-out pattern allows for parallel processing and better resource utilization, while the fan-in pattern enables aggregation and consolidation of results. This is particularly useful for:

- Parallel data processing
- Distributed model training
- Ensemble methods
- Batch processing
- Data validation across multiple sources
- [Hyperparameter tuning](./hyper-parameter-tuning.md)

Note that the fan-in step cannot receive the outputs of the parallel steps as direct inputs. Instead, it must use the ZenML Client to query the results of the previous parallel steps, as shown in the example above.
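The collection logic inside the fan-in step can be tried out without a ZenML server. The following is a minimal sketch under stated assumptions: `FakeArtifact`, `FakeStepRun`, and `collect_outputs` are hypothetical stand-ins written for this illustration (they are not ZenML classes), and they only mimic the attributes that `combine_step` reads above (`name`, `outputs`, and `load()`).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FakeArtifact:
    """Hypothetical stand-in for a stored artifact; only `load()` is modeled."""

    value: Any

    def load(self) -> Any:
        return self.value


@dataclass
class FakeStepRun:
    """Hypothetical stand-in for a step run with named outputs."""

    name: str
    outputs: Dict[str, List[FakeArtifact]] = field(default_factory=dict)


def collect_outputs(
    steps: Dict[str, FakeStepRun], step_prefix: str, output_name: str
) -> Dict[str, Any]:
    """Load one named output from every step whose ID starts with the prefix."""
    results: Dict[str, Any] = {}
    for step_name, step_info in steps.items():
        if step_name.startswith(step_prefix):
            # Take the first artifact registered under the output name
            results[step_info.name] = step_info.outputs[output_name][0].load()
    return results


# Simulated run: one source step and two parallel branches
steps = {
    "load_step": FakeStepRun("load_step", {"output": [FakeArtifact("Hello")]}),
    "process_0": FakeStepRun("process_0", {"output": [FakeArtifact("a")]}),
    "process_1": FakeStepRun("process_1", {"output": [FakeArtifact("b")]}),
}
combined = collect_outputs(steps, step_prefix="process_", output_name="output")
print(combined)
```

Only the steps whose IDs match the prefix are collected, which is why the example pipeline assigns explicit `process_{i}` IDs to the parallel branches.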

{% hint style="warning" %}
The fan-out/fan-in method has the following limitations:

1. Steps run sequentially rather than in parallel if the underlying orchestrator does not support parallel step runs (e.g. the local orchestrator).
2. The number of steps needs to be known ahead of time; ZenML does not yet support creating steps dynamically at runtime.
{% endhint %}
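As a rough analogy outside ZenML, the sketch below expresses the same pattern with Python's standard `concurrent.futures` module. It is not how ZenML schedules steps. The names `process` and `fan_out_fan_in` are made up for this illustration. It only shows that the result of a fan-out/fan-in is the same whether the executor runs branches one at a time or concurrently, and that the number of branches is fixed by the input before execution starts.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def process(chunk: str) -> str:
    """Illustrative per-branch work: upper-case one chunk."""
    return chunk.upper()


def fan_out_fan_in(chunks: List[str], max_workers: int) -> str:
    # Fan out: one task per chunk; the branch count is fixed by len(chunks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map returns results in input order
        results = list(pool.map(process, chunks))
    # Fan in: consolidate all branch results into a single value
    return ",".join(results)


chunks = ["a", "b", "c"]
sequential = fan_out_fan_in(chunks, max_workers=1)  # branches run one by one
parallel = fan_out_fan_in(chunks, max_workers=3)  # branches may run concurrently
print(sequential, parallel)
```

With `max_workers=1` the branches execute one after another, loosely mirroring an orchestrator without parallel step support; only the wall-clock time differs, not the combined result.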


<figure><img src="https://static.scarf.sh/a.png?x-pxid=f0b4f458-0a54-4fcd-aa95-d5ee424815bc" alt="ZenML Scarf"><figcaption></figcaption></figure>