docs: add examples of replacing providers #101
Changes from 3 commits
a7cf649
f1a198c
7c501a3
97c7682
d581018
ca75ca4
32d18d3
1fd7efd
55f7f80
65ae199
f6d40e9
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,4 +8,5 @@ maxdepth: 2 | |
getting-started | ||
parameter-tables | ||
generic-providers | ||
replacing-providers | ||
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,277 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "86de99b0-3170-45d6-84eb-adbd622af936", | ||
"metadata": {}, | ||
"source": [ | ||
"# Replacing providers\n", | ||
Review comment: This is not exactly what the issue was about. I think If you want to keep the "replacing provider" example, I'd suggest making this into another recipe? Reply: Sounds good! |
||
"\n", | ||
"## Overview\n", | ||
"\n", | ||
"It is a common need to be able to replace a provider, either with another provider or with a specific value.\n", | ||
"\n", | ||
"Lets look at a situation where we have some \"raw\" data files and the workflow consists of three steps\n", | ||
" * loading the raw data\n", | ||
" * cleaning the raw data\n", | ||
" * computing a sum of the cleaned data." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "bb0ecea3-5b0d-44da-a363-2a0e861b0235", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from typing import NewType\n", | ||
"\n", | ||
"Filename = NewType('Filename', str)\n", | ||
"RawData = NewType('RawData', list)\n", | ||
"CleanData = NewType('CleanData', list)\n", | ||
"Result = NewType('Result', list)\n", | ||
"\n", | ||
"filesystem = {'raw.txt': list(map(str, range(10)))}\n", | ||
"\n", | ||
"def load(filename: Filename) -> RawData:\n", | ||
" \"\"\"Load the data from the filename.\"\"\"\n", | ||
" data = filesystem[filename]\n", | ||
" return RawData(data)\n", | ||
"\n", | ||
"def clean(raw_data: RawData) -> CleanData:\n", | ||
" \"\"\"Clean the data, convert from str.\"\"\"\n", | ||
" return CleanData(list(map(float, raw_data)))\n", | ||
"\n", | ||
"def process(clean_data: CleanData) -> Result:\n", | ||
" \"\"\"Compute the sum of the clean data.\"\"\"\n", | ||
" return Result(sum(clean_data))\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "1a4d59ab-4022-4eef-8566-d1b37f9cea7f", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import sciline\n", | ||
"\n", | ||
"pipeline = sciline.Pipeline(\n", | ||
" [load, clean, process,],\n", | ||
" params={ Filename: 'raw.txt', })\n", | ||
"pipeline" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "8fa7a168-be19-419a-b7b0-dfe6e150134b", | ||
"metadata": {}, | ||
"source": [ | ||
"## Replacing a provider with a value" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e68f8022-0369-4abd-a516-ac99432812f3", | ||
"metadata": {}, | ||
"source": [ | ||
"Select `Result`, the task graph will use the `Filename` input because it needs to read the data from the file system:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "837135eb-c858-484e-91b5-b47917aefe57", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pipeline.get(Result)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "3f504103-daa3-427d-9e8c-a4aa332b1f72", | ||
"metadata": {}, | ||
"source": [ | ||
"But if the cleaned data has already been produced it is unnecessary to \"re-clean\" it, in that case we can proceed directly from the clean data to the compute sum step.\n", | ||
"To do this we replace the `CleanData` provider with the data that was loaded and cleaned:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "0b54414b-19d0-43aa-b44a-05ae94bd8086", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"data = pipeline.compute(CleanData)\n", | ||
"pipeline[CleanData] = data\n", | ||
"pipeline" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "0596304b-d38a-4b35-b85d-41a7a4dc2605", | ||
"metadata": {}, | ||
"source": [ | ||
"Then if we select `Result` the task graph will no longer use the `Filename` input and instead it will proceed directly from the `CleanData` as input:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "71a0c735-b095-4fb0-bf92-1e424e7ea744", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pipeline.get(Result)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "2594ab08-03cd-474a-8690-9b54978d8cf0", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pipeline.compute(Result)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e9b190a1-cca3-4c12-aef0-2169a9a90f55", | ||
"metadata": {}, | ||
"source": [ | ||
"## Replacing a provider with another provider" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "9ecd237c-c70c-41e1-aebb-5bae714f5031", | ||
"metadata": {}, | ||
"source": [ | ||
"If the current provider doesn't do what we want it to do we can replace it with another provider." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "a12634a5-072a-4587-8fb0-fc531b54bfc7", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import sciline\n", | ||
"\n", | ||
"pipeline = sciline.Pipeline(\n", | ||
" [load, clean, process,],\n", | ||
" params={ Filename: 'raw.txt', })\n", | ||
"pipeline" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "c9ff03a4-726a-4686-a31c-99fa420f57b2", | ||
"metadata": {}, | ||
"source": [ | ||
"Let's say the `clean` provider doesn't do all the preprocessing that we want it to do, we also want to remove either the odd or even numbers before processing:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "d56d330b-955c-4778-96e0-9201212b341f", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from typing import Literal, Union\n", | ||
"\n", | ||
"Target = NewType('Target', str)\n", | ||
"\n", | ||
"def clean_and_remove_some(raw_data: RawData, target: Target) -> CleanData:\n", | ||
" if target == 'odd':\n", | ||
" return [n for n in map(float, raw_data) if n % 2 == 1]\n", | ||
" if target == 'even':\n", | ||
" return [n for n in map(float, raw_data) if n % 2 == 0]\n", | ||
" raise ValueError" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "fb76644b-d81a-4866-aba8-5e644050439c", | ||
"metadata": {}, | ||
"source": [ | ||
"To replace the old `CleanData` provider we need to use `Pipeline.insert`:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "5fcc69e2-9617-4025-99dc-3ce9badfa16a", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pipeline.insert(clean_and_remove_some)\n", | ||
"pipeline[Target] = 'odd'" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "39e740e6-89f8-4693-85e7-8443071da426", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pipeline" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e9c9006a-76e2-42d2-b35c-8a8e3abf8323", | ||
"metadata": {}, | ||
"source": [ | ||
"Now if we select the `Result` we see that the new provider will be used in the computation:" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "deba51d9-a348-4f6c-9e06-df30c8c8ca38", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pipeline.get(Result)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "30bb4ff5-5fc0-4095-b4a2-a2e5e94824e8", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"pipeline.compute(Result)" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
I suggest we add this in Recipes, instead of User Guide.
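The two replacement mechanisms the notebook walks through (setting a value for a type, and inserting a different provider for a type) can be illustrated without sciline installed. The sketch below is a toy stand-in, not sciline's implementation: `ToyPipeline` and `clean_keep_odd` are hypothetical names made up for this illustration. It rests on the assumption that providers are looked up by their return type, which is what would make a later insert for the same type act as a replacement.

```python
from typing import NewType, get_type_hints

Filename = NewType('Filename', str)
RawData = NewType('RawData', list)
CleanData = NewType('CleanData', list)
Result = NewType('Result', float)

filesystem = {'raw.txt': list(map(str, range(10)))}


def load(filename: Filename) -> RawData:
    return RawData(filesystem[filename])


def clean(raw_data: RawData) -> CleanData:
    return CleanData(list(map(float, raw_data)))


def process(clean_data: CleanData) -> Result:
    return Result(sum(clean_data))


def clean_keep_odd(raw_data: RawData) -> CleanData:
    """Hypothetical alternative CleanData provider: keep only odd numbers."""
    return CleanData([n for n in map(float, raw_data) if n % 2 == 1])


class ToyPipeline:
    """Hypothetical stand-in for illustration; NOT sciline's implementation."""

    def __init__(self, providers, params=None):
        self._providers = {}  # return type -> provider function
        self._values = {}     # type -> fixed value
        for provider in providers:
            self.insert(provider)
        for key, value in (params or {}).items():
            self[key] = value

    def insert(self, provider):
        # Providers are keyed by return type, so inserting a second
        # provider for the same type replaces the first one.
        key = get_type_hints(provider)['return']
        self._values.pop(key, None)
        self._providers[key] = provider

    def __setitem__(self, key, value):
        # A fixed value takes the place of any provider for that type.
        self._providers.pop(key, None)
        self._values[key] = value

    def compute(self, key):
        # Fixed values short-circuit everything upstream of them.
        if key in self._values:
            return self._values[key]
        provider = self._providers[key]
        hints = get_type_hints(provider)
        args = {
            name: self.compute(tp)
            for name, tp in hints.items()
            if name != 'return'
        }
        return provider(**args)


pipeline = ToyPipeline([load, clean, process], params={Filename: 'raw.txt'})
full = pipeline.compute(Result)  # load -> clean -> process

# Replace the CleanData provider with an already computed value.
pipeline[CleanData] = pipeline.compute(CleanData)
from_value = pipeline.compute(Result)  # no longer touches Filename

# Start over and replace the CleanData provider with another provider.
pipeline = ToyPipeline([load, clean, process], params={Filename: 'raw.txt'})
pipeline.insert(clean_keep_odd)
odd_only = pipeline.compute(Result)

print(full, from_value, odd_only)
```

In this sketch, setting a value drops the provider for that type, so everything upstream of `CleanData` is skipped. Inserting `clean_keep_odd` overwrites `clean` because both declare `CleanData` as their return type.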