From a7cf6492c7852515d1f6b2454912a6c27e03faa1 Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Tue, 16 Jan 2024 11:51:41 +0100 Subject: [PATCH 01/11] docs: add examples of replacing providers --- docs/user-guide/replacing-providers.ipynb | 277 ++++++++++++++++++++++ 1 file changed, 277 insertions(+) create mode 100644 docs/user-guide/replacing-providers.ipynb diff --git a/docs/user-guide/replacing-providers.ipynb b/docs/user-guide/replacing-providers.ipynb new file mode 100644 index 00000000..e2787314 --- /dev/null +++ b/docs/user-guide/replacing-providers.ipynb @@ -0,0 +1,277 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "86de99b0-3170-45d6-84eb-adbd622af936", + "metadata": {}, + "source": [ + "# Replacing providers\n", + "\n", + "## Overview\n", + "\n", + "It is a common need to be able to replace a provider, either with another provider or with a specific value.\n", + "\n", + "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", + " * loading the raw data\n", + " * cleaning the raw data\n", + " * computing a sum of the cleaned data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import NewType\n", + "\n", + "Filename = NewType('Filename', str)\n", + "RawData = NewType('RawData', list)\n", + "CleanData = NewType('CleanData', list)\n", + "Result = NewType('Result', list)\n", + "\n", + "filesystem = {'raw.txt': list(map(str, range(10)))}\n", + "\n", + "def load(filename: Filename) -> RawData:\n", + " \"\"\"Load the data from the filename.\"\"\"\n", + " data = filesystem[filename]\n", + " return RawData(data)\n", + "\n", + "def clean(raw_data: RawData) -> CleanData:\n", + " \"\"\"Clean the data, convert from str.\"\"\"\n", + " return CleanData(list(map(float, raw_data)))\n", + "\n", + "def process(clean_data: CleanData) -> Result:\n", + " \"\"\"Compute the sum of the clean data.\"\"\"\n", + " return Result(sum(clean_data))\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f", + "metadata": {}, + "outputs": [], + "source": [ + "import sciline\n", + "\n", + "pipeline = sciline.Pipeline(\n", + " [load, clean, process,],\n", + " params={ Filename: 'raw.txt', })\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "8fa7a168-be19-419a-b7b0-dfe6e150134b", + "metadata": {}, + "source": [ + "## Replacing a provider with a value" + ] + }, + { + "cell_type": "markdown", + "id": "e68f8022-0369-4abd-a516-ac99432812f3", + "metadata": {}, + "source": [ + "Select `Result` the task graph will use the `Filename` input, because it needs to read the data from the file system:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "837135eb-c858-484e-91b5-b47917aefe57", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "markdown", + "id": "3f504103-daa3-427d-9e8c-a4aa332b1f72", + "metadata": {}, + "source": [ + "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", + "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0b54414b-19d0-43aa-b44a-05ae94bd8086", + "metadata": {}, + "outputs": [], + "source": [ + "data = pipeline.compute(CleanData)\n", + 
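"# Setting this item replaces the CleanData provider with the precomputed value\n", +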
"pipeline[CleanData] = data\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "0596304b-d38a-4b35-b85d-41a7a4dc2605", + "metadata": {}, + "source": [ + "Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the `CleanData` as input:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "71a0c735-b095-4fb0-bf92-1e424e7ea744", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2594ab08-03cd-474a-8690-9b54978d8cf0", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.compute(Result)" + ] + }, + { + "cell_type": "markdown", + "id": "e9b190a1-cca3-4c12-aef0-2169a9a90f55", + "metadata": {}, + "source": [ + "## Replacing a provider with another provider" + ] + }, + { + "cell_type": "markdown", + "id": "9ecd237c-c70c-41e1-aebb-5bae714f5031", + "metadata": {}, + "source": [ + "If the current provider doesn't do what we want it to do we can replace it with another provider." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a12634a5-072a-4587-8fb0-fc531b54bfc7", + "metadata": {}, + "outputs": [], + "source": [ + "import sciline\n", + "\n", + "pipeline = sciline.Pipeline(\n", + " [load, clean, process,],\n", + " params={ Filename: 'raw.txt', })\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "c9ff03a4-726a-4686-a31c-99fa420f57b2", + "metadata": {}, + "source": [ + "Let's say the `clean` provider doesn't do all the preprocessing that we want it to do, we also want to remove either the odd or even numbers before processing:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d56d330b-955c-4778-96e0-9201212b341f", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import Literal, Union\n", + "\n", + "Target = NewType('Target', str)\n", + "\n", + "def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:\n", + " if target == 'odd':\n", + " return [n for n in map(float, raw_data) if n % 2 == 1]\n", + " if target == 'even':\n", + " return [n for n in map(float, raw_data) if n % 2 == 0]\n", + " raise ValueError" + ] + }, + { + "cell_type": "markdown", + "id": "fb76644b-d81a-4866-aba8-5e644050439c", + "metadata": {}, + "source": [ + "To replace the old `CleanData` provider we need to use `Pipeline.insert`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5fcc69e2-9617-4025-99dc-3ce9badfa16a", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.insert(clean_and_remove_some)\n", + "pipeline[Target] = 'odd'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39e740e6-89f8-4693-85e7-8443071da426", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "e9c9006a-76e2-42d2-b35c-8a8e3abf8323", + "metadata": {}, + "source": [ + "Now if we select the `Result` we see that the new provider will be used in the computation:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "deba51d9-a348-4f6c-9e06-df30c8c8ca38", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "30bb4ff5-5fc0-4095-b4a2-a2e5e94824e8", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.compute(Result)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" 
+ }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From f1a198c2dee38c998f7d1f0b32218975c0cbbae5 Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Tue, 16 Jan 2024 11:58:40 +0100 Subject: [PATCH 02/11] fix --- docs/user-guide/replacing-providers.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/user-guide/replacing-providers.ipynb b/docs/user-guide/replacing-providers.ipynb index e2787314..0d150412 100644 --- a/docs/user-guide/replacing-providers.ipynb +++ b/docs/user-guide/replacing-providers.ipynb @@ -75,7 +75,7 @@ "id": "e68f8022-0369-4abd-a516-ac99432812f3", "metadata": {}, "source": [ - "Select `Result` the task graph will use the `Filename` input, because it needs to read the data from the file system:" + "Select `Result`, the task graph will use the `Filename` input because it needs to read the data from the file system:" ] }, { From 7c501a342e9a8d5c45328c19b9572d21c4e82ac1 Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Tue, 16 Jan 2024 12:40:09 +0100 Subject: [PATCH 03/11] update index --- docs/user-guide/index.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/user-guide/index.md b/docs/user-guide/index.md index 40df22dd..89aeb10a 100644 --- a/docs/user-guide/index.md +++ b/docs/user-guide/index.md @@ -8,4 +8,5 @@ maxdepth: 2 getting-started parameter-tables generic-providers +replacing-providers ``` From 97c76820e24517714883c91022032f59b5b576cf Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Tue, 16 Jan 2024 14:56:16 +0100 Subject: [PATCH 04/11] put in recipes and split into two --- .../continue-from-intermediate-results.ipynb | 173 ++++++++++++++++++ .../replacing-providers.ipynb | 110 +---------- 2 files changed, 183 insertions(+), 100 deletions(-) create mode 100644 docs/recipes/continue-from-intermediate-results.ipynb rename docs/{user-guide => recipes}/replacing-providers.ipynb (63%) diff --git a/docs/recipes/continue-from-intermediate-results.ipynb b/docs/recipes/continue-from-intermediate-results.ipynb new file mode 100644 index 00000000..91f3a09e --- /dev/null +++ b/docs/recipes/continue-from-intermediate-results.ipynb @@ -0,0 +1,173 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "86de99b0-3170-45d6-84eb-adbd622af936", + "metadata": {}, + "source": [ + "# Continue from intermediate results\n", + "\n", + "## Overview\n", + "\n", + "It is a common need to be able to continue the pipeline from some intermediate result computed earlier.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "35b9c4ac-f1a3-4b96-8ad0-8bc6b2588e58", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ + "## Setup\n", + "\n", + "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", + " * loading the raw data\n", + " * cleaning the raw data\n", + " * computing a sum of the cleaned data." 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import NewType\n", + "\n", + "Filename = NewType('Filename', str)\n", + "RawData = NewType('RawData', list)\n", + "CleanData = NewType('CleanData', list)\n", + "Result = NewType('Result', list)\n", + "\n", + "filesystem = {'raw.txt': list(map(str, range(10)))}\n", + "\n", + "def load(filename: Filename) -> RawData:\n", + " \"\"\"Load the data from the filename.\"\"\"\n", + " data = filesystem[filename]\n", + " return RawData(data)\n", + "\n", + "def clean(raw_data: RawData) -> CleanData:\n", + " \"\"\"Clean the data, convert from str.\"\"\"\n", + " return CleanData(list(map(float, raw_data)))\n", + "\n", + "def process(clean_data: CleanData) -> Result:\n", + " \"\"\"Compute the sum of the clean data.\"\"\"\n", + " return Result(sum(clean_data))\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f", + "metadata": {}, + "outputs": [], + "source": [ + "import sciline\n", + "\n", + "pipeline = sciline.Pipeline(\n", + " [load, clean, process,],\n", + " params={ Filename: 'raw.txt', })\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "8fa7a168-be19-419a-b7b0-dfe6e150134b", + "metadata": {}, + "source": [ + "## Setting intermediate results" + ] + }, + { + "cell_type": "markdown", + "id": "e68f8022-0369-4abd-a516-ac99432812f3", + "metadata": {}, + "source": [ + "If we select `Result` the task graph will use the `Filename` input because it needs to read the raw data from the file system:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "837135eb-c858-484e-91b5-b47917aefe57", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "markdown", + "id": "3f504103-daa3-427d-9e8c-a4aa332b1f72", + "metadata": {}, + "source": [ + "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", + "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0b54414b-19d0-43aa-b44a-05ae94bd8086", + "metadata": {}, + "outputs": [], + "source": [ + "data = pipeline.compute(CleanData)\n", + "pipeline[CleanData] = data\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "0596304b-d38a-4b35-b85d-41a7a4dc2605", + "metadata": {}, + "source": [ + "Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the `CleanData` as input:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "71a0c735-b095-4fb0-bf92-1e424e7ea744", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2594ab08-03cd-474a-8690-9b54978d8cf0", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.compute(Result)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + 
"nbformat_minor": 5 +} diff --git a/docs/user-guide/replacing-providers.ipynb b/docs/recipes/replacing-providers.ipynb similarity index 63% rename from docs/user-guide/replacing-providers.ipynb rename to docs/recipes/replacing-providers.ipynb index 0d150412..e0a50dca 100644 --- a/docs/user-guide/replacing-providers.ipynb +++ b/docs/recipes/replacing-providers.ipynb @@ -9,7 +9,15 @@ "\n", "## Overview\n", "\n", - "It is a common need to be able to replace a provider, either with another provider or with a specific value.\n", + "This example shows how to replace a provider in the pipeline using the `Pipeline.insert` method." + ] + }, + { + "cell_type": "markdown", + "id": "f05644be-ac0e-46c2-81a0-a4afe057e411", + "metadata": {}, + "source": [ + "## Setup\n", "\n", "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", " * loading the raw data\n", @@ -62,110 +70,12 @@ "pipeline" ] }, - { - "cell_type": "markdown", - "id": "8fa7a168-be19-419a-b7b0-dfe6e150134b", - "metadata": {}, - "source": [ - "## Replacing a provider with a value" - ] - }, - { - "cell_type": "markdown", - "id": "e68f8022-0369-4abd-a516-ac99432812f3", - "metadata": {}, - "source": [ - "Select `Result`, the task graph will use the `Filename` input because it needs to read the data from the file system:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "837135eb-c858-484e-91b5-b47917aefe57", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "markdown", - "id": "3f504103-daa3-427d-9e8c-a4aa332b1f72", - "metadata": {}, - "source": [ - "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", - "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0b54414b-19d0-43aa-b44a-05ae94bd8086", - "metadata": {}, - "outputs": [], - "source": [ - "data = pipeline.compute(CleanData)\n", - "pipeline[CleanData] = data\n", - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "0596304b-d38a-4b35-b85d-41a7a4dc2605", - "metadata": {}, - "source": [ - "Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the `CleanData` as input:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "71a0c735-b095-4fb0-bf92-1e424e7ea744", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2594ab08-03cd-474a-8690-9b54978d8cf0", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.compute(Result)" - ] - }, { "cell_type": "markdown", "id": "e9b190a1-cca3-4c12-aef0-2169a9a90f55", "metadata": {}, "source": [ - "## Replacing a provider with another provider" - ] - }, - { - "cell_type": "markdown", - "id": "9ecd237c-c70c-41e1-aebb-5bae714f5031", - "metadata": {}, - "source": [ - "If the current provider doesn't do what we want it to do we can replace it with another provider." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "a12634a5-072a-4587-8fb0-fc531b54bfc7", - "metadata": {}, - "outputs": [], - "source": [ - "import sciline\n", - "\n", - "pipeline = sciline.Pipeline(\n", - " [load, clean, process,],\n", - " params={ Filename: 'raw.txt', })\n", - "pipeline" + "## Replacing a provider using `Pipeline.insert`" ] }, { From d58101843a6b5a6c65317f6ab27e30d84732182a Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Tue, 16 Jan 2024 15:06:47 +0100 Subject: [PATCH 05/11] merge recipes --- .../continue-from-intermediate-results.ipynb | 173 ----------- docs/recipes/recipes.ipynb | 272 +++++++++++++++++- docs/recipes/replacing-providers.ipynb | 187 ------------ 3 files changed, 271 insertions(+), 361 deletions(-) delete mode 100644 docs/recipes/continue-from-intermediate-results.ipynb delete mode 100644 docs/recipes/replacing-providers.ipynb diff --git a/docs/recipes/continue-from-intermediate-results.ipynb b/docs/recipes/continue-from-intermediate-results.ipynb deleted file mode 100644 index 91f3a09e..00000000 --- a/docs/recipes/continue-from-intermediate-results.ipynb +++ /dev/null @@ -1,173 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "86de99b0-3170-45d6-84eb-adbd622af936", - "metadata": {}, - "source": [ - "# Continue from intermediate results\n", - "\n", - "## Overview\n", - "\n", - "It is a common need to be able to continue the pipeline from some intermediate result computed earlier.\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "id": "35b9c4ac-f1a3-4b96-8ad0-8bc6b2588e58", - "metadata": { - "jp-MarkdownHeadingCollapsed": true - }, - "source": [ - "## Setup\n", - "\n", - "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", - " * loading the raw data\n", - " * cleaning the raw data\n", - " * computing a sum of the cleaned data." 
- ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import NewType\n", - "\n", - "Filename = NewType('Filename', str)\n", - "RawData = NewType('RawData', list)\n", - "CleanData = NewType('CleanData', list)\n", - "Result = NewType('Result', list)\n", - "\n", - "filesystem = {'raw.txt': list(map(str, range(10)))}\n", - "\n", - "def load(filename: Filename) -> RawData:\n", - " \"\"\"Load the data from the filename.\"\"\"\n", - " data = filesystem[filename]\n", - " return RawData(data)\n", - "\n", - "def clean(raw_data: RawData) -> CleanData:\n", - " \"\"\"Clean the data, convert from str.\"\"\"\n", - " return CleanData(list(map(float, raw_data)))\n", - "\n", - "def process(clean_data: CleanData) -> Result:\n", - " \"\"\"Compute the sum of the clean data.\"\"\"\n", - " return Result(sum(clean_data))\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f", - "metadata": {}, - "outputs": [], - "source": [ - "import sciline\n", - "\n", - "pipeline = sciline.Pipeline(\n", - " [load, clean, process,],\n", - " params={ Filename: 'raw.txt', })\n", - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "8fa7a168-be19-419a-b7b0-dfe6e150134b", - "metadata": {}, - "source": [ - "## Setting intermediate results" - ] - }, - { - "cell_type": "markdown", - "id": "e68f8022-0369-4abd-a516-ac99432812f3", - "metadata": {}, - "source": [ - "If we select `Result` the task graph will use the `Filename` input because it needs to read the raw data from the file system:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "837135eb-c858-484e-91b5-b47917aefe57", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "markdown", - "id": "3f504103-daa3-427d-9e8c-a4aa332b1f72", - "metadata": {}, - "source": [ - "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", - "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0b54414b-19d0-43aa-b44a-05ae94bd8086", - "metadata": {}, - "outputs": [], - "source": [ - "data = pipeline.compute(CleanData)\n", - "pipeline[CleanData] = data\n", - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "0596304b-d38a-4b35-b85d-41a7a4dc2605", - "metadata": {}, - "source": [ - "Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the `CleanData` as input:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "71a0c735-b095-4fb0-bf92-1e424e7ea744", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2594ab08-03cd-474a-8690-9b54978d8cf0", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.compute(Result)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.12" - } - }, - "nbformat": 4, - 
"nbformat_minor": 5 -} diff --git a/docs/recipes/recipes.ipynb b/docs/recipes/recipes.ipynb index 024753fc..f3340bc7 100644 --- a/docs/recipes/recipes.ipynb +++ b/docs/recipes/recipes.ipynb @@ -103,6 +103,276 @@ "Using `bind_and_call` guarantees that the file gets written and that it gets written after the pipeline.\n", "The latter prevents providers from accidentally relying on the file." ] + }, + { + "cell_type": "markdown", + "id": "60f9b2a8-0557-43da-a8d2-41df2794bd49", + "metadata": {}, + "source": [ + "## Continue from intermediate results\n", + "\n", + "It is a common need to be able to continue the pipeline from some intermediate result computed earlier.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "aa1df625-9f6a-44d3-9169-1c79ff8e647f", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ + "### Setup\n", + "\n", + "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", + " * loading the raw data\n", + " * cleaning the raw data\n", + " * computing a sum of the cleaned data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import NewType\n", + "\n", + "Filename = NewType('Filename', str)\n", + "RawData = NewType('RawData', list)\n", + "CleanData = NewType('CleanData', list)\n", + "Result = NewType('Result', list)\n", + "\n", + "filesystem = {'raw.txt': list(map(str, range(10)))}\n", + "\n", + "def load(filename: Filename) -> RawData:\n", + " \"\"\"Load the data from the filename.\"\"\"\n", + " data = filesystem[filename]\n", + " return RawData(data)\n", + "\n", + "def clean(raw_data: RawData) -> CleanData:\n", + " \"\"\"Clean the data, convert from str.\"\"\"\n", + " return CleanData(list(map(float, raw_data)))\n", + "\n", + "def process(clean_data: CleanData) -> Result:\n", + " \"\"\"Compute the sum of the clean data.\"\"\"\n", + " return Result(sum(clean_data))\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f", + "metadata": {}, + "outputs": [], + "source": [ + "import sciline\n", + "\n", + "pipeline = sciline.Pipeline(\n", + " [load, clean, process,],\n", + " params={ Filename: 'raw.txt', })\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "a7870758-aa28-497d-adc9-863efe23e463", + "metadata": {}, + "source": [ + "### Setting intermediate results" + ] + }, + { + "cell_type": "markdown", + "id": "74ace6e8-2e32-420b-b96d-dc67c8ce4ab7", + "metadata": {}, + "source": [ + "If we select `Result` the task graph will use the `Filename` input because it needs to read the raw data from the file system:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "837135eb-c858-484e-91b5-b47917aefe57", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "markdown", + "id": "99fe0c69-597b-4384-96de-92b25b6e95cd", + "metadata": {}, + "source": [ + "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", + "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "0b54414b-19d0-43aa-b44a-05ae94bd8086", + "metadata": {}, + "outputs": [], + "source": [ + "data = pipeline.compute(CleanData)\n", + "pipeline[CleanData] = data\n", + 
"pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "bda0ae77-9911-4cc1-9727-45a64b712875", + "metadata": {}, + "source": [ + "Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the `CleanData` as input:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "71a0c735-b095-4fb0-bf92-1e424e7ea744", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "2594ab08-03cd-474a-8690-9b54978d8cf0", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.compute(Result)" + ] + }, + { + "cell_type": "markdown", + "id": "f3d48a4a", + "metadata": {}, + "source": [ + "## Replacing providers\n", + "\n", + "This example shows how to replace a provider in the pipeline using the `Pipeline.insert` method." + ] + }, + { + "cell_type": "markdown", + "id": "f05644be-ac0e-46c2-81a0-a4afe057e411", + "metadata": {}, + "source": [ + "### Setup\n", + "Same setup as in [Continue from intermediate results](#continue-from-intermediate-results)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "73fc851b", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline = sciline.Pipeline(\n", + " [load, clean, process,],\n", + " params={ Filename: 'raw.txt', })\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "e9b190a1-cca3-4c12-aef0-2169a9a90f55", + "metadata": {}, + "source": [ + "### Replacing a provider using `Pipeline.insert`" + ] + }, + { + "cell_type": "markdown", + "id": "c9ff03a4-726a-4686-a31c-99fa420f57b2", + "metadata": {}, + "source": [ + "Let's say the `clean` provider doesn't do all the preprocessing that we want it to do, we also want to remove either the odd or even numbers before processing:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d56d330b-955c-4778-96e0-9201212b341f", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import Literal, Union\n", + "\n", + "Target = NewType('Target', str)\n", + "\n", + "def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:\n", + " if target == 'odd':\n", + " return [n for n in map(float, raw_data) if n % 2 == 1]\n", + " if target == 'even':\n", + " return [n for n in map(float, raw_data) if n % 2 == 0]\n", + " raise ValueError" + ] + }, + { + "cell_type": "markdown", + "id": "fb76644b-d81a-4866-aba8-5e644050439c", + "metadata": {}, + "source": [ + "To replace the old `CleanData` provider we need to use `Pipeline.insert`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5fcc69e2-9617-4025-99dc-3ce9badfa16a", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.insert(clean_and_remove_some)\n", + "pipeline[Target] = 'odd'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "39e740e6-89f8-4693-85e7-8443071da426", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "e9c9006a-76e2-42d2-b35c-8a8e3abf8323", + "metadata": {}, + "source": [ + "Now if we select the `Result` we see that the new provider will be used in the computation:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "deba51d9-a348-4f6c-9e06-df30c8c8ca38", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "30bb4ff5-5fc0-4095-b4a2-a2e5e94824e8", + "metadata": {}, + "outputs": [], + "source": [ + 
"pipeline.compute(Result)" + ] } ], "metadata": { @@ -121,7 +391,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.11.5" + "version": "3.10.12" } }, "nbformat": 4, diff --git a/docs/recipes/replacing-providers.ipynb b/docs/recipes/replacing-providers.ipynb deleted file mode 100644 index e0a50dca..00000000 --- a/docs/recipes/replacing-providers.ipynb +++ /dev/null @@ -1,187 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "86de99b0-3170-45d6-84eb-adbd622af936", - "metadata": {}, - "source": [ - "# Replacing providers\n", - "\n", - "## Overview\n", - "\n", - "This example shows how to replace a provider in the pipeline using the `Pipeline.insert` method." - ] - }, - { - "cell_type": "markdown", - "id": "f05644be-ac0e-46c2-81a0-a4afe057e411", - "metadata": {}, - "source": [ - "## Setup\n", - "\n", - "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", - " * loading the raw data\n", - " * cleaning the raw data\n", - " * computing a sum of the cleaned data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import NewType\n", - "\n", - "Filename = NewType('Filename', str)\n", - "RawData = NewType('RawData', list)\n", - "CleanData = NewType('CleanData', list)\n", - "Result = NewType('Result', list)\n", - "\n", - "filesystem = {'raw.txt': list(map(str, range(10)))}\n", - "\n", - "def load(filename: Filename) -> RawData:\n", - " \"\"\"Load the data from the filename.\"\"\"\n", - " data = filesystem[filename]\n", - " return RawData(data)\n", - "\n", - "def clean(raw_data: RawData) -> CleanData:\n", - " \"\"\"Clean the data, convert from str.\"\"\"\n", - " return CleanData(list(map(float, raw_data)))\n", - "\n", - "def process(clean_data: CleanData) -> Result:\n", - " \"\"\"Compute the sum of the clean data.\"\"\"\n", - " return Result(sum(clean_data))\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f", - "metadata": {}, - "outputs": [], - "source": [ - "import sciline\n", - "\n", - "pipeline = sciline.Pipeline(\n", - " [load, clean, process,],\n", - " params={ Filename: 'raw.txt', })\n", - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "e9b190a1-cca3-4c12-aef0-2169a9a90f55", - "metadata": {}, - "source": [ - "## Replacing a provider using `Pipeline.insert`" - ] - }, - { - "cell_type": "markdown", - "id": "c9ff03a4-726a-4686-a31c-99fa420f57b2", - "metadata": {}, - "source": [ - "Let's say the `clean` provider doesn't do all the preprocessing that we want it to do, we also want to remove either the odd or even numbers before processing:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d56d330b-955c-4778-96e0-9201212b341f", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import Literal, Union\n", - "\n", - "Target = NewType('Target', str)\n", - "\n", - "def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:\n", - " if target == 'odd':\n", - " return [n for n in map(float, raw_data) if n % 2 == 1]\n", - " if target == 'even':\n", - " return [n for n in map(float, raw_data) if n % 2 == 0]\n", - " raise ValueError" - ] - }, - { - "cell_type": "markdown", - "id": "fb76644b-d81a-4866-aba8-5e644050439c", - "metadata": {}, - "source": [ - "To replace the old `CleanData` provider we need to use `Pipeline.insert`:" - ] - }, - { - 
"cell_type": "code", - "execution_count": null, - "id": "5fcc69e2-9617-4025-99dc-3ce9badfa16a", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.insert(clean_and_remove_some)\n", - "pipeline[Target] = 'odd'" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "39e740e6-89f8-4693-85e7-8443071da426", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "e9c9006a-76e2-42d2-b35c-8a8e3abf8323", - "metadata": {}, - "source": [ - "Now if we select the `Result` we see that the new provider will be used in the computation:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "deba51d9-a348-4f6c-9e06-df30c8c8ca38", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "30bb4ff5-5fc0-4095-b4a2-a2e5e94824e8", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.compute(Result)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} From ca75ca4b24eb6c419eeec7f2617a9f4fb52a9281 Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Tue, 16 Jan 2024 15:07:43 +0100 Subject: [PATCH 06/11] remove from user-guide --- docs/user-guide/index.md | 1 - 1 file changed, 1 deletion(-) diff --git a/docs/user-guide/index.md b/docs/user-guide/index.md index 89aeb10a..40df22dd 100644 --- a/docs/user-guide/index.md +++ b/docs/user-guide/index.md @@ -8,5 +8,4 @@ maxdepth: 2 getting-started parameter-tables generic-providers -replacing-providers ``` From 32d18d35d70992512f3fcb27d7916e6d9e360334 Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Wed, 17 Jan 2024 14:03:34 +0100 Subject: [PATCH 07/11] split recipes to separate files --- .../continue-from-intermediate-results.ipynb | 171 ++++++++ docs/recipes/index.md | 11 + docs/recipes/recipes.ipynb | 399 ------------------ docs/recipes/replacing-providers.ipynb | 149 +++++++ .../side-effects-and-file-writing.ipynb | 121 ++++++ 5 files changed, 452 insertions(+), 399 deletions(-) create mode 100644 docs/recipes/continue-from-intermediate-results.ipynb create mode 100644 docs/recipes/index.md delete mode 100644 docs/recipes/recipes.ipynb create mode 100644 docs/recipes/replacing-providers.ipynb create mode 100644 docs/recipes/side-effects-and-file-writing.ipynb diff --git a/docs/recipes/continue-from-intermediate-results.ipynb b/docs/recipes/continue-from-intermediate-results.ipynb new file mode 100644 index 00000000..17cc389c --- /dev/null +++ b/docs/recipes/continue-from-intermediate-results.ipynb @@ -0,0 +1,171 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "1b29e65b-73cb-4fc0-b9ad-1d7384a16578", + "metadata": {}, + "source": [ + "## Continue from intermediate results\n", + "\n", + "It is a common need to be able to continue the pipeline from some intermediate result computed earlier.\n", + "\n" + ] + }, + { + "cell_type": "markdown", + "id": "f239f707-bc9d-4f6f-997c-fb1c73e68223", + "metadata": { + "jp-MarkdownHeadingCollapsed": true + }, + "source": [ + "### Setup\n", + "\n", + "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three 
steps\n", + " * loading the raw data\n", + " * cleaning the raw data\n", + " * computing a sum of the cleaned data." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d2c46df9-43ad-4422-816a-a402df169587", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import NewType\n", + "\n", + "Filename = NewType('Filename', str)\n", + "RawData = NewType('RawData', list)\n", + "CleanData = NewType('CleanData', list)\n", + "Result = NewType('Result', list)\n", + "\n", + "filesystem = {'raw.txt': list(map(str, range(10)))}\n", + "\n", + "def load(filename: Filename) -> RawData:\n", + " \"\"\"Load the data from the filename.\"\"\"\n", + " data = filesystem[filename]\n", + " return RawData(data)\n", + "\n", + "def clean(raw_data: RawData) -> CleanData:\n", + " \"\"\"Clean the data, convert from str.\"\"\"\n", + " return CleanData(list(map(float, raw_data)))\n", + "\n", + "def process(clean_data: CleanData) -> Result:\n", + " \"\"\"Compute the sum of the clean data.\"\"\"\n", + " return Result(sum(clean_data))\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c7ae3a94-3259-4be1-bffc-720da04df9ed", + "metadata": {}, + "outputs": [], + "source": [ + "import sciline\n", + "\n", + "pipeline = sciline.Pipeline(\n", + " [load, clean, process,],\n", + " params={ Filename: 'raw.txt', })\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "f5c4d999-6a85-4d7b-9b2f-751e261690e7", + "metadata": {}, + "source": [ + "### Setting intermediate results" + ] + }, + { + "cell_type": "markdown", + "id": "fdf05422-2a96-4127-a265-75c55122e582", + "metadata": {}, + "source": [ + "If we select `Result` the task graph will use the `Filename` input because it needs to read the raw data from the file system:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d6140e38-494a-4a91-aa85-57832aab64ad", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "markdown", + "id": "91f0cfac-1440-4fbb-98f6-c6c9451f3275", + "metadata": {}, + "source": [ + "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", + "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "23fe22b8-b59e-4d63-9255-f510bbd8bec7", + "metadata": {}, + "outputs": [], + "source": [ + "data = pipeline.compute(CleanData)\n", + "pipeline[CleanData] = data\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "be8abca9-50d1-4be3-8959-6c213cd35b7b", + "metadata": {}, + "source": [ + "Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the `CleanData` as input:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b1bd87c0-7633-4761-a34a-be1a2ac6a071", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "c94a40c9-d965-42ec-a739-b45d73ad3260", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.compute(Result)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + 
"nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/recipes/index.md b/docs/recipes/index.md new file mode 100644 index 00000000..2acc5784 --- /dev/null +++ b/docs/recipes/index.md @@ -0,0 +1,11 @@ +# Recipes + +```{toctree} +--- +maxdepth: 2 +--- + +side-effects-and-file-writing +continue-from-intermediate-results +replacing-providers +``` diff --git a/docs/recipes/recipes.ipynb b/docs/recipes/recipes.ipynb deleted file mode 100644 index f3340bc7..00000000 --- a/docs/recipes/recipes.ipynb +++ /dev/null @@ -1,399 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "id": "d7608bb4-d7ce-4a36-8021-f1a52ad27ec5", - "metadata": {}, - "source": [ - "# Recipes" - ] - }, - { - "cell_type": "markdown", - "id": "0835a63d-f22a-477e-b887-9cdba171d8f0", - "metadata": {}, - "source": [ - "## Avoiding side effects\n", - "\n", - "It is strongly discouraged to use [side effects](https://en.wikipedia.org/wiki/Side_effect_%28computer_science%29) in code that runs as part of a pipeline.\n", - "This applies to, among others, file output, setting global variables, or communicating over a network.\n", - "The reason is that side effects rely on code running in a specific order.\n", - "But pipelines in Sciline have a relaxed notion of time in that the scheduler determines when and if a provider runs." - ] - }, - { - "cell_type": "markdown", - "id": "8abc6b15-7749-4247-9af0-84132ef81da1", - "metadata": {}, - "source": [ - "### File output\n", - "\n", - "Files typically only need to be written at the end of a pipeline.\n", - "We can use [Pipeline.bind_and_call](../generated/classes/sciline.Pipeline.rst#sciline.Pipeline.bind_and_call) to call a function which writes the file:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "8d39f143-c9da-4a7c-9ec8-ea7697b68b1b", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import NewType\n", - "\n", - "import sciline\n", - "\n", - "_fake_filesystem = {}\n", - "\n", - "Param = NewType('Param', float)\n", - "Data = NewType('Data', float)\n", - "Filename = NewType('Filename', str)\n", - "\n", - "\n", - "def foo(p: Param) -> Data:\n", - " return Data(2 * p)\n", - "\n", - "\n", - "def write_file(d: Data, filename: Filename) -> None:\n", - " _fake_filesystem[filename] = d\n", - "\n", - "\n", - "pipeline = sciline.Pipeline([foo], params={Param: 3.1, Filename: 'output.dat'})\n", - "\n", - "pipeline.bind_and_call(write_file)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "b7d08320-6e51-4c70-9e94-99533a6d12bb", - "metadata": {}, - "outputs": [], - "source": [ - "_fake_filesystem" - ] - }, - { - "cell_type": "markdown", - "id": "48134a65-6b21-4086-baf2-16ec31285331", - "metadata": {}, - "source": [ - "We could also write the file using" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5e313561-6b54-441a-b496-d31864bf423c", - "metadata": {}, - "outputs": [], - "source": [ - "write_file(pipeline.compute(Data), 'output.dat')" - ] - }, - { - "cell_type": "markdown", - "id": "7c6f0501-cdee-4028-8e48-05588cd4a306", - "metadata": {}, - "source": [ - "But `bind_and_call` allows us to request additional parameters like the file name from the pipeline.\n", - "This is especially useful in combination with [generic providers](../user-guide/generic-providers.ipynb) or [parameter tables](../user-guide/parameter-tables.ipynb).\n", - "\n", - "**Why is this better than writing a file in a provider?**\n", - "Using 
`bind_and_call` guarantees that the file gets written and that it gets written after the pipeline.\n", - "The latter prevents providers from accidentally relying on the file." - ] - }, - { - "cell_type": "markdown", - "id": "60f9b2a8-0557-43da-a8d2-41df2794bd49", - "metadata": {}, - "source": [ - "## Continue from intermediate results\n", - "\n", - "It is a common need to be able to continue the pipeline from some intermediate result computed earlier.\n", - "\n" - ] - }, - { - "cell_type": "markdown", - "id": "aa1df625-9f6a-44d3-9169-1c79ff8e647f", - "metadata": { - "jp-MarkdownHeadingCollapsed": true - }, - "source": [ - "### Setup\n", - "\n", - "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", - " * loading the raw data\n", - " * cleaning the raw data\n", - " * computing a sum of the cleaned data." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import NewType\n", - "\n", - "Filename = NewType('Filename', str)\n", - "RawData = NewType('RawData', list)\n", - "CleanData = NewType('CleanData', list)\n", - "Result = NewType('Result', list)\n", - "\n", - "filesystem = {'raw.txt': list(map(str, range(10)))}\n", - "\n", - "def load(filename: Filename) -> RawData:\n", - " \"\"\"Load the data from the filename.\"\"\"\n", - " data = filesystem[filename]\n", - " return RawData(data)\n", - "\n", - "def clean(raw_data: RawData) -> CleanData:\n", - " \"\"\"Clean the data, convert from str.\"\"\"\n", - " return CleanData(list(map(float, raw_data)))\n", - "\n", - "def process(clean_data: CleanData) -> Result:\n", - " \"\"\"Compute the sum of the clean data.\"\"\"\n", - " return Result(sum(clean_data))\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f", - "metadata": {}, - "outputs": [], - "source": [ - "import sciline\n", - "\n", - "pipeline = sciline.Pipeline(\n", - " [load, clean, process,],\n", - " params={ Filename: 'raw.txt', })\n", - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "a7870758-aa28-497d-adc9-863efe23e463", - "metadata": {}, - "source": [ - "### Setting intermediate results" - ] - }, - { - "cell_type": "markdown", - "id": "74ace6e8-2e32-420b-b96d-dc67c8ce4ab7", - "metadata": {}, - "source": [ - "If we select `Result` the task graph will use the `Filename` input because it needs to read the raw data from the file system:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "837135eb-c858-484e-91b5-b47917aefe57", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "markdown", - "id": "99fe0c69-597b-4384-96de-92b25b6e95cd", - "metadata": {}, - "source": [ - "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", - "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "0b54414b-19d0-43aa-b44a-05ae94bd8086", - "metadata": {}, - "outputs": [], - "source": [ - "data = pipeline.compute(CleanData)\n", - "pipeline[CleanData] = data\n", - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "bda0ae77-9911-4cc1-9727-45a64b712875", - "metadata": {}, - "source": [ - "Then if we select `Result` the task graph will no longer use the `Filename` input 
and instead it will proceed directly from the `CleanData` as input:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "71a0c735-b095-4fb0-bf92-1e424e7ea744", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "2594ab08-03cd-474a-8690-9b54978d8cf0", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.compute(Result)" - ] - }, - { - "cell_type": "markdown", - "id": "f3d48a4a", - "metadata": {}, - "source": [ - "## Replacing providers\n", - "\n", - "This example shows how to replace a provider in the pipeline using the `Pipeline.insert` method." - ] - }, - { - "cell_type": "markdown", - "id": "f05644be-ac0e-46c2-81a0-a4afe057e411", - "metadata": {}, - "source": [ - "### Setup\n", - "Same setup as in [Continue from intermediate results](#continue-from-intermediate-results)." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "73fc851b", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline = sciline.Pipeline(\n", - " [load, clean, process,],\n", - " params={ Filename: 'raw.txt', })\n", - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "e9b190a1-cca3-4c12-aef0-2169a9a90f55", - "metadata": {}, - "source": [ - "### Replacing a provider using `Pipeline.insert`" - ] - }, - { - "cell_type": "markdown", - "id": "c9ff03a4-726a-4686-a31c-99fa420f57b2", - "metadata": {}, - "source": [ - "Let's say the `clean` provider doesn't do all the preprocessing that we want it to do, we also want to remove either the odd or even numbers before processing:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "d56d330b-955c-4778-96e0-9201212b341f", - "metadata": {}, - "outputs": [], - "source": [ - "from typing import Literal, Union\n", - "\n", - "Target = NewType('Target', str)\n", - "\n", - "def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:\n", - " if target == 'odd':\n", - " return [n for n in map(float, raw_data) if n % 2 == 1]\n", - " if target == 'even':\n", - " return [n for n in map(float, raw_data) if n % 2 == 0]\n", - " raise ValueError" - ] - }, - { - "cell_type": "markdown", - "id": "fb76644b-d81a-4866-aba8-5e644050439c", - "metadata": {}, - "source": [ - "To replace the old `CleanData` provider we need to use `Pipeline.insert`:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "5fcc69e2-9617-4025-99dc-3ce9badfa16a", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.insert(clean_and_remove_some)\n", - "pipeline[Target] = 'odd'" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "39e740e6-89f8-4693-85e7-8443071da426", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline" - ] - }, - { - "cell_type": "markdown", - "id": "e9c9006a-76e2-42d2-b35c-8a8e3abf8323", - "metadata": {}, - "source": [ - "Now if we select the `Result` we see that the new provider will be used in the computation:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "deba51d9-a348-4f6c-9e06-df30c8c8ca38", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "30bb4ff5-5fc0-4095-b4a2-a2e5e94824e8", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.compute(Result)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": 
"ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.10.12" - } - }, - "nbformat": 4, - "nbformat_minor": 5 -} diff --git a/docs/recipes/replacing-providers.ipynb b/docs/recipes/replacing-providers.ipynb new file mode 100644 index 00000000..36d508d4 --- /dev/null +++ b/docs/recipes/replacing-providers.ipynb @@ -0,0 +1,149 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "f839e644-4535-4a3d-8066-01a65e6b7f84", + "metadata": {}, + "source": [ + "## Replacing providers\n", + "\n", + "This example shows how to replace a provider in the pipeline using the `Pipeline.insert` method." + ] + }, + { + "cell_type": "markdown", + "id": "0aad9941-2294-4293-843f-b3b3037690c0", + "metadata": {}, + "source": [ + "### Setup\n", + "Same setup as in [Continue from intermediate results](#continue-from-intermediate-results)." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5f43e5d7-94c9-4e96-9587-03bd063f1f24", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline = sciline.Pipeline(\n", + " [load, clean, process,],\n", + " params={ Filename: 'raw.txt', })\n", + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "589842f1-24f4-4cab-87a1-0018949facaf", + "metadata": {}, + "source": [ + "### Replacing a provider using `Pipeline.insert`" + ] + }, + { + "cell_type": "markdown", + "id": "a5454985-9b45-4416-95ca-0ba9d2603e79", + "metadata": {}, + "source": [ + "Let's say the `clean` provider doesn't do all the preprocessing that we want it to do, we also want to remove either the odd or even numbers before processing:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "872863d7-4919-4e7a-855a-e7df5f86d488", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import Literal, Union\n", + "\n", + "Target = NewType('Target', str)\n", + "\n", + "def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:\n", + " if target == 'odd':\n", + " return [n for n in map(float, raw_data) if n % 2 == 1]\n", + " if target == 'even':\n", + " return [n for n in map(float, raw_data) if n % 2 == 0]\n", + " raise ValueError" + ] + }, + { + "cell_type": "markdown", + "id": "751f7f91-0b9e-4da9-92b8-4376c9317e93", + "metadata": {}, + "source": [ + "To replace the old `CleanData` provider we need to use `Pipeline.insert`:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f7f38468-7194-4735-bc4b-7e4de5866a3a", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.insert(clean_and_remove_some)\n", + "pipeline[Target] = 'odd'" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "56033cd9-6a99-40e3-b2ec-920de65b11bc", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline" + ] + }, + { + "cell_type": "markdown", + "id": "85cc2c84-cda2-49ee-83ea-0155e6f54f26", + "metadata": {}, + "source": [ + "Now if we select the `Result` we see that the new provider will be used in the computation:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "41ed69dd-dd2a-41bf-9e1a-0d3f83d55a35", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.get(Result)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3ff92573-3944-4911-b32e-76774bef0c4d", + "metadata": {}, + "outputs": [], + "source": [ + "pipeline.compute(Result)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": 
"python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/docs/recipes/side-effects-and-file-writing.ipynb b/docs/recipes/side-effects-and-file-writing.ipynb new file mode 100644 index 00000000..6509157b --- /dev/null +++ b/docs/recipes/side-effects-and-file-writing.ipynb @@ -0,0 +1,121 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "0835a63d-f22a-477e-b887-9cdba171d8f0", + "metadata": {}, + "source": [ + "## Avoiding side effects\n", + "\n", + "It is strongly discouraged to use [side effects](https://en.wikipedia.org/wiki/Side_effect_%28computer_science%29) in code that runs as part of a pipeline.\n", + "This applies to, among others, file output, setting global variables, or communicating over a network.\n", + "The reason is that side effects rely on code running in a specific order.\n", + "But pipelines in Sciline have a relaxed notion of time in that the scheduler determines when and if a provider runs." + ] + }, + { + "cell_type": "markdown", + "id": "8abc6b15-7749-4247-9af0-84132ef81da1", + "metadata": {}, + "source": [ + "### File output\n", + "\n", + "Files typically only need to be written at the end of a pipeline.\n", + "We can use [Pipeline.bind_and_call](../generated/classes/sciline.Pipeline.rst#sciline.Pipeline.bind_and_call) to call a function which writes the file:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "8d39f143-c9da-4a7c-9ec8-ea7697b68b1b", + "metadata": {}, + "outputs": [], + "source": [ + "from typing import NewType\n", + "\n", + "import sciline\n", + "\n", + "_fake_filesystem = {}\n", + "\n", + "Param = NewType('Param', float)\n", + "Data = NewType('Data', float)\n", + "Filename = NewType('Filename', str)\n", + "\n", + "\n", + "def foo(p: Param) -> Data:\n", + " return Data(2 * p)\n", + "\n", + "\n", + "def write_file(d: Data, filename: Filename) -> None:\n", + " _fake_filesystem[filename] = d\n", + "\n", + "\n", + "pipeline = sciline.Pipeline([foo], params={Param: 3.1, Filename: 'output.dat'})\n", + "\n", + "pipeline.bind_and_call(write_file)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b7d08320-6e51-4c70-9e94-99533a6d12bb", + "metadata": {}, + "outputs": [], + "source": [ + "_fake_filesystem" + ] + }, + { + "cell_type": "markdown", + "id": "48134a65-6b21-4086-baf2-16ec31285331", + "metadata": {}, + "source": [ + "We could also write the file using" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "5e313561-6b54-441a-b496-d31864bf423c", + "metadata": {}, + "outputs": [], + "source": [ + "write_file(pipeline.compute(Data), 'output.dat')" + ] + }, + { + "cell_type": "markdown", + "id": "7c6f0501-cdee-4028-8e48-05588cd4a306", + "metadata": {}, + "source": [ + "But `bind_and_call` allows us to request additional parameters like the file name from the pipeline.\n", + "This is especially useful in combination with [generic providers](../user-guide/generic-providers.ipynb) or [parameter tables](../user-guide/parameter-tables.ipynb).\n", + "\n", + "**Why is this better than writing a file in a provider?**\n", + "Using `bind_and_call` guarantees that the file gets written and that it gets written after the pipeline.\n", + "The latter prevents providers from accidentally relying on the file." 
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} From 1fd7efd5d541592d519077b10e6d3db6d6adcb6c Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Wed, 17 Jan 2024 14:12:38 +0100 Subject: [PATCH 08/11] fix --- .../continue-from-intermediate-results.ipynb | 49 ++++++++----------- docs/recipes/replacing-providers.ipynb | 33 +++++++++++-- 2 files changed, 51 insertions(+), 31 deletions(-) diff --git a/docs/recipes/continue-from-intermediate-results.ipynb b/docs/recipes/continue-from-intermediate-results.ipynb index 17cc389c..8c9d7147 100644 --- a/docs/recipes/continue-from-intermediate-results.ipynb +++ b/docs/recipes/continue-from-intermediate-results.ipynb @@ -8,15 +8,20 @@ "## Continue from intermediate results\n", "\n", "It is a common need to be able to continue the pipeline from some intermediate result computed earlier.\n", - "\n" + "\n", + "TLDR\n", + "```python\n", + "# Pipeline: Input -> CleanData -> Result\n", + "data = pipeline.compute(CleanData)\n", + "pipeline[CleanData] = data\n", + "result = pipeline.compute(Result)\n", + "```\n" ] }, { "cell_type": "markdown", "id": "f239f707-bc9d-4f6f-997c-fb1c73e68223", - "metadata": { - "jp-MarkdownHeadingCollapsed": true - }, + "metadata": {}, "source": [ "### Setup\n", "\n", @@ -81,68 +86,56 @@ }, { "cell_type": "markdown", - "id": "fdf05422-2a96-4127-a265-75c55122e582", + "id": "91f0cfac-1440-4fbb-98f6-c6c9451f3275", "metadata": {}, "source": [ - "If we select `Result` the task graph will use the `Filename` input because it needs to read the raw data from the file system:" + "Given a pipeline, we may want to compute an intermediate result for inspection:" ] }, { "cell_type": "code", "execution_count": null, - "id": "d6140e38-494a-4a91-aa85-57832aab64ad", + "id": "affba12d-dcc5-45b1-83c4-bcc61a6bbc92", "metadata": {}, "outputs": [], "source": [ - "pipeline.get(Result)" + "data = pipeline.compute(CleanData)" ] }, { "cell_type": "markdown", - "id": "91f0cfac-1440-4fbb-98f6-c6c9451f3275", + "id": "ca41d0e1-19c5-4e08-bd56-db7ec7e437e2", "metadata": {}, "source": [ - "But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", - "To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" + "If later on we wish to compute a result further down the pipeline (derived from `CleanData`), this would cause potentially costly re-computation of `CleanData`, since Sciline does not perform any caching:" ] }, { "cell_type": "code", "execution_count": null, - "id": "23fe22b8-b59e-4d63-9255-f510bbd8bec7", + "id": "a54bf2e6-e00a-4dc3-a442-b16dd55c0031", "metadata": {}, "outputs": [], "source": [ - "data = pipeline.compute(CleanData)\n", - "pipeline[CleanData] = data\n", - "pipeline" + "result = pipeline.compute(Result) # re-computes CleanData" ] }, { "cell_type": "markdown", - "id": "be8abca9-50d1-4be3-8959-6c213cd35b7b", + "id": "b62ab5df-3010-4f0e-87a9-4dc416680929", "metadata": {}, "source": [ - "Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the 
`CleanData` as input:" + "To avoid this, we can use `Pipeline.__setitem__` to replace the provider of `CleanData` by the previously computed data:" ] }, { "cell_type": "code", "execution_count": null, - "id": "b1bd87c0-7633-4761-a34a-be1a2ac6a071", - "metadata": {}, - "outputs": [], - "source": [ - "pipeline.get(Result)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "id": "c94a40c9-d965-42ec-a739-b45d73ad3260", + "id": "23fe22b8-b59e-4d63-9255-f510bbd8bec7", "metadata": {}, "outputs": [], "source": [ + "pipeline[CleanData] = data\n", "pipeline.compute(Result)" ] } diff --git a/docs/recipes/replacing-providers.ipynb b/docs/recipes/replacing-providers.ipynb index 36d508d4..3b505853 100644 --- a/docs/recipes/replacing-providers.ipynb +++ b/docs/recipes/replacing-providers.ipynb @@ -12,11 +12,15 @@ }, { "cell_type": "markdown", - "id": "0aad9941-2294-4293-843f-b3b3037690c0", + "id": "a7117710-24cb-4c0c-be0c-3980898dc508", "metadata": {}, "source": [ "### Setup\n", - "Same setup as in [Continue from intermediate results](#continue-from-intermediate-results)." + "\n", + "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", + " * loading the raw data\n", + " * cleaning the raw data\n", + " * computing a sum of the cleaned data." ] }, { @@ -26,6 +30,29 @@ "metadata": {}, "outputs": [], "source": [ + "from typing import NewType\n", + "import sciline\n", + "\n", + "Filename = NewType('Filename', str)\n", + "RawData = NewType('RawData', list)\n", + "CleanData = NewType('CleanData', list)\n", + "Result = NewType('Result', list)\n", + "\n", + "filesystem = {'raw.txt': list(map(str, range(10)))}\n", + "\n", + "def load(filename: Filename) -> RawData:\n", + " \"\"\"Load the data from the filename.\"\"\"\n", + " data = filesystem[filename]\n", + " return RawData(data)\n", + "\n", + "def clean(raw_data: RawData) -> CleanData:\n", + " \"\"\"Clean the data, convert from str.\"\"\"\n", + " return CleanData(list(map(float, raw_data)))\n", + "\n", + "def process(clean_data: CleanData) -> Result:\n", + " \"\"\"Compute the sum of the clean data.\"\"\"\n", + " return Result(sum(clean_data))\n", + "\n", "pipeline = sciline.Pipeline(\n", " [load, clean, process,],\n", " params={ Filename: 'raw.txt', })\n", @@ -55,7 +82,7 @@ "metadata": {}, "outputs": [], "source": [ - "from typing import Literal, Union\n", + "from typing import Literal, Union, NewType\n", "\n", "Target = NewType('Target', str)\n", "\n", From 55f7f80a5fb301813df492d0549dbc98c42d3575 Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Wed, 17 Jan 2024 15:06:41 +0100 Subject: [PATCH 09/11] fix headings --- docs/recipes/continue-from-intermediate-results.ipynb | 6 +++--- docs/recipes/replacing-providers.ipynb | 6 +++--- docs/recipes/side-effects-and-file-writing.ipynb | 4 ++-- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/docs/recipes/continue-from-intermediate-results.ipynb b/docs/recipes/continue-from-intermediate-results.ipynb index 8c9d7147..b39aef92 100644 --- a/docs/recipes/continue-from-intermediate-results.ipynb +++ b/docs/recipes/continue-from-intermediate-results.ipynb @@ -5,7 +5,7 @@ "id": "1b29e65b-73cb-4fc0-b9ad-1d7384a16578", "metadata": {}, "source": [ - "## Continue from intermediate results\n", + "# Continue from intermediate results\n", "\n", "It is a common need to be able to continue the pipeline from some intermediate result computed earlier.\n", "\n", @@ -23,7 +23,7 @@ "id": "f239f707-bc9d-4f6f-997c-fb1c73e68223", "metadata": {}, "source": [ 
- "### Setup\n", + "## Setup\n", "\n", "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", " * loading the raw data\n", @@ -81,7 +81,7 @@ "id": "f5c4d999-6a85-4d7b-9b2f-751e261690e7", "metadata": {}, "source": [ - "### Setting intermediate results" + "## Setting intermediate results" ] }, { diff --git a/docs/recipes/replacing-providers.ipynb b/docs/recipes/replacing-providers.ipynb index 3b505853..ea0f7d44 100644 --- a/docs/recipes/replacing-providers.ipynb +++ b/docs/recipes/replacing-providers.ipynb @@ -5,7 +5,7 @@ "id": "f839e644-4535-4a3d-8066-01a65e6b7f84", "metadata": {}, "source": [ - "## Replacing providers\n", + "# Replacing providers\n", "\n", "This example shows how to replace a provider in the pipeline using the `Pipeline.insert` method." ] @@ -15,7 +15,7 @@ "id": "a7117710-24cb-4c0c-be0c-3980898dc508", "metadata": {}, "source": [ - "### Setup\n", + "## Setup\n", "\n", "Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", " * loading the raw data\n", @@ -64,7 +64,7 @@ "id": "589842f1-24f4-4cab-87a1-0018949facaf", "metadata": {}, "source": [ - "### Replacing a provider using `Pipeline.insert`" + "## Replacing a provider using `Pipeline.insert`" ] }, { diff --git a/docs/recipes/side-effects-and-file-writing.ipynb b/docs/recipes/side-effects-and-file-writing.ipynb index 6509157b..30816d03 100644 --- a/docs/recipes/side-effects-and-file-writing.ipynb +++ b/docs/recipes/side-effects-and-file-writing.ipynb @@ -5,7 +5,7 @@ "id": "0835a63d-f22a-477e-b887-9cdba171d8f0", "metadata": {}, "source": [ - "## Avoiding side effects\n", + "# Avoiding side effects\n", "\n", "It is strongly discouraged to use [side effects](https://en.wikipedia.org/wiki/Side_effect_%28computer_science%29) in code that runs as part of a pipeline.\n", "This applies to, among others, file output, setting global variables, or communicating over a network.\n", @@ -18,7 +18,7 @@ "id": "8abc6b15-7749-4247-9af0-84132ef81da1", "metadata": {}, "source": [ - "### File output\n", + "## File output\n", "\n", "Files typically only need to be written at the end of a pipeline.\n", "We can use [Pipeline.bind_and_call](../generated/classes/sciline.Pipeline.rst#sciline.Pipeline.bind_and_call) to call a function which writes the file:" From 65ae199c07343d39345a1bc7af17b4d84b472919 Mon Sep 17 00:00:00 2001 From: jokasimr Date: Wed, 17 Jan 2024 15:07:17 +0100 Subject: [PATCH 10/11] Update docs/recipes/continue-from-intermediate-results.ipynb Co-authored-by: Simon Heybrock <12912489+SimonHeybrock@users.noreply.github.com> --- docs/recipes/continue-from-intermediate-results.ipynb | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/recipes/continue-from-intermediate-results.ipynb b/docs/recipes/continue-from-intermediate-results.ipynb index b39aef92..d2d05930 100644 --- a/docs/recipes/continue-from-intermediate-results.ipynb +++ b/docs/recipes/continue-from-intermediate-results.ipynb @@ -136,7 +136,7 @@ "outputs": [], "source": [ "pipeline[CleanData] = data\n", - "pipeline.compute(Result)" + "result = pipeline.compute(Result)" ] } ], From f6d40e94be8065c77fec0eae739cea5368f7f338 Mon Sep 17 00:00:00 2001 From: Johannes Kasimir Date: Wed, 17 Jan 2024 15:29:47 +0100 Subject: [PATCH 11/11] fix --- docs/index.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/index.md b/docs/index.md index 44b6841a..f8d3d87d 100644 --- a/docs/index.md +++ b/docs/index.md @@ -108,7 +108,7 @@ hidden: --- 
user-guide/index -recipes/recipes +recipes/index api-reference/index developer/index about/index
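
For quick reference, the sketch below strings the three recipes from these notebooks together as one standalone script. It is a minimal illustration, not part of the patch: it relies only on the `Pipeline` calls exercised above (`insert`, `__setitem__`, `compute`, `bind_and_call`), while the concrete provider names, the float-valued `Result` type, and the `write_file` signature are adaptations made for this example.

```python
# Condensed, hypothetical sketch of the recipes above; assumes the sciline
# API used in these notebooks (Pipeline, insert, __setitem__, compute,
# bind_and_call). Names and types are adapted for illustration.
from typing import NewType

import sciline

Filename = NewType('Filename', str)
RawData = NewType('RawData', list)
CleanData = NewType('CleanData', list)
Result = NewType('Result', float)
Target = NewType('Target', str)

filesystem = {'raw.txt': list(map(str, range(10)))}
_fake_filesystem = {}


def load(filename: Filename) -> RawData:
    """Load the data for the given filename."""
    return RawData(filesystem[filename])


def clean(raw_data: RawData) -> CleanData:
    """Convert the raw strings to floats."""
    return CleanData(list(map(float, raw_data)))


def process(clean_data: CleanData) -> Result:
    """Sum the cleaned data."""
    return Result(sum(clean_data))


def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:
    """Clean the data and keep only the odd or the even values."""
    numbers = list(map(float, raw_data))
    if target == 'odd':
        return CleanData([n for n in numbers if n % 2 == 1])
    if target == 'even':
        return CleanData([n for n in numbers if n % 2 == 0])
    raise ValueError(f"unknown target: {target!r}")


def write_file(result: Result, filename: Filename) -> None:
    """Side effect kept out of the providers: write the result at the end."""
    _fake_filesystem[filename] = result


pipeline = sciline.Pipeline([load, clean, process], params={Filename: 'raw.txt'})

# Replacing providers: swap the CleanData provider for one that also filters,
# and supply the extra Target parameter it needs.
pipeline.insert(clean_and_remove_some)
pipeline[Target] = 'odd'

# Continue from intermediate results: pin CleanData so a later compute(Result)
# does not re-run the cleaning step.
clean_data = pipeline.compute(CleanData)
pipeline[CleanData] = clean_data

result = pipeline.compute(Result)

# Avoiding side effects: write the output via bind_and_call, so the file is
# written after the pipeline and no provider depends on it.
pipeline.bind_and_call(write_file)
print(result, _fake_filesystem)
```

In practice each recipe stands on its own and would typically be used separately; they are combined here only to keep the sketch short.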