scipp · jokasimr · Jan 17, 2024 · Jan 16, 2024 · Jan 16, 2024 · Jan 16, 2024
diff --git a/docs/user-guide/index.md b/docs/user-guide/index.md
@@ -8,4 +8,5 @@ maxdepth: 2
 getting-started
 parameter-tables
 generic-providers
+replacing-providers
 ```
diff --git a/docs/user-guide/replacing-providers.ipynb b/docs/user-guide/replacing-providers.ipynb
@@ -0,0 +1,277 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "86de99b0-3170-45d6-84eb-adbd622af936",
+   "metadata": {},
+   "source": [
+    "# Replacing providers\n",
+    "\n",
+    "## Overview\n",
+    "\n",
+    "It is often necessary to replace a provider, either with another provider or with a specific value.\n",
+    "\n",
+    "Let's look at a situation where we have some \"raw\" data files and the workflow consists of three steps:\n",
+    "  * loading the raw data\n",
+    "  * cleaning the raw data\n",
+    "  * computing a sum of the cleaned data."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import NewType\n",
+    "\n",
+    "Filename = NewType('Filename', str)\n",
+    "RawData = NewType('RawData', list)\n",
+    "CleanData = NewType('CleanData', list)\n",
+    "Result = NewType('Result', float)\n",
+    "\n",
+    "filesystem = {'raw.txt': list(map(str, range(10)))}\n",
+    "\n",
+    "def load(filename: Filename) -> RawData:\n",
+    "    \"\"\"Load the data from the filename.\"\"\"\n",
+    "    data = filesystem[filename]\n",
+    "    return RawData(data)\n",
+    "\n",
+    "def clean(raw_data: RawData) -> CleanData:\n",
+    "    \"\"\"Clean the data, convert from str.\"\"\"\n",
+    "    return CleanData(list(map(float, raw_data)))\n",
+    "\n",
+    "def process(clean_data: CleanData) -> Result:\n",
+    "    \"\"\"Compute the sum of the clean data.\"\"\"\n",
+    "    return Result(sum(clean_data))\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import sciline\n",
+    "\n",
+    "pipeline = sciline.Pipeline(\n",
+    "    [load, clean, process],\n",
+    "    params={Filename: 'raw.txt'})\n",
+    "pipeline"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "8fa7a168-be19-419a-b7b0-dfe6e150134b",
+   "metadata": {},
+   "source": [
+    "## Replacing a provider with a value"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e68f8022-0369-4abd-a516-ac99432812f3",
+   "metadata": {},
+   "source": [
+    "When we select `Result`, the task graph uses the `Filename` input because it needs to read the data from the file system:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "837135eb-c858-484e-91b5-b47917aefe57",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pipeline.get(Result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3f504103-daa3-427d-9e8c-a4aa332b1f72",
+   "metadata": {},
+   "source": [
+    "But if the cleaned data has already been produced, it is unnecessary to \"re-clean\" it. In that case we can proceed directly from the clean data to the sum step.\n",
+    "To do this, we replace the `CleanData` provider with the data that was loaded and cleaned:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "0b54414b-19d0-43aa-b44a-05ae94bd8086",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "data = pipeline.compute(CleanData)\n",
+    "pipeline[CleanData] = data\n",
+    "pipeline"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "0596304b-d38a-4b35-b85d-41a7a4dc2605",
+   "metadata": {},
+   "source": [
+    "If we now select `Result`, the task graph no longer uses the `Filename` input. Instead, it proceeds directly from `CleanData` as input:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "71a0c735-b095-4fb0-bf92-1e424e7ea744",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pipeline.get(Result)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "2594ab08-03cd-474a-8690-9b54978d8cf0",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pipeline.compute(Result)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e9b190a1-cca3-4c12-aef0-2169a9a90f55",
+   "metadata": {},
+   "source": [
+    "## Replacing a provider with another provider"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9ecd237c-c70c-41e1-aebb-5bae714f5031",
+   "metadata": {},
+   "source": [
+    "If the current provider doesn't do what we want, we can replace it with another provider."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a12634a5-072a-4587-8fb0-fc531b54bfc7",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import sciline\n",
+    "\n",
+    "pipeline = sciline.Pipeline(\n",
+    "    [load, clean, process],\n",
+    "    params={Filename: 'raw.txt'})\n",
+    "pipeline"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c9ff03a4-726a-4686-a31c-99fa420f57b2",
+   "metadata": {},
+   "source": [
+    "Suppose the `clean` provider doesn't do all the preprocessing that we want. We also want to remove either the odd or the even numbers before processing:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d56d330b-955c-4778-96e0-9201212b341f",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "# Which numbers to keep after cleaning: 'odd' or 'even'.\n",
+    "Target = NewType('Target', str)\n",
+    "\n",
+    "def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:\n",
+    "    \"\"\"Clean the data, keeping only the odd or the even numbers.\"\"\"\n",
+    "    if target == 'odd':\n",
+    "        return CleanData([n for n in map(float, raw_data) if n % 2 == 1])\n",
+    "    if target == 'even':\n",
+    "        return CleanData([n for n in map(float, raw_data) if n % 2 == 0])\n",
+    "    raise ValueError(f\"target must be 'odd' or 'even', got {target!r}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fb76644b-d81a-4866-aba8-5e644050439c",
+   "metadata": {},
+   "source": [
+    "To replace the old `CleanData` provider, we use `Pipeline.insert`:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5fcc69e2-9617-4025-99dc-3ce9badfa16a",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pipeline.insert(clean_and_remove_some)\n",
+    "pipeline[Target] = 'odd'"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "39e740e6-89f8-4693-85e7-8443071da426",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pipeline"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e9c9006a-76e2-42d2-b35c-8a8e3abf8323",
+   "metadata": {},
+   "source": [
+    "Now, if we select `Result`, we see that the new provider is used in the computation:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "deba51d9-a348-4f6c-9e06-df30c8c8ca38",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pipeline.get(Result)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "30bb4ff5-5fc0-4095-b4a2-a2e5e94824e8",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "pipeline.compute(Result)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}