[Doc] Add flex flow doc (microsoft#2981)
# Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

# All Promptflow Contribution checklist:

- [ ] **The pull request does not introduce [breaking changes].**
- [ ] **CHANGELOG is updated for new features, bug fixes or other significant changes.**
- [ ] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated review from promptflow team. Learn more: [suggested workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices

- [ ] Title of the pull request is clear and informative.
- [ ] There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, [see this page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines

- [ ] Pull request includes test coverage for the included changes.

---------

Co-authored-by: Zhengfei Wang <[email protected]>
Co-authored-by: Clement Wang <[email protected]>
Co-authored-by: Clement Wang <[email protected]>
1 parent 0a7cde3 · commit 9cb7869 · Showing 25 changed files with 572 additions and 36 deletions.
# Develop a dag flow

LLM apps can be defined as Directed Acyclic Graphs (DAGs) of function calls. These DAGs are flows in prompt flow.

A `DAG flow` in prompt flow is a DAG of functions (we call them [tools](../../concepts//concept-tools.md)). These functions/tools are connected via input/output dependencies and executed based on the topology by the prompt flow executor.

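To make the idea concrete, here is a minimal sketch of a single tool node; the function body is illustrative, and the `@tool` decorator is the standard one shipped with the `promptflow` package:

```python
from promptflow import tool  # newer releases also expose this as `from promptflow.core import tool`


@tool
def answer_question(question: str) -> str:
    """One node in the DAG: a plain Python function exposed as a tool.

    Its `question` input can be wired to the flow's inputs or to another
    node's output in `flow.dag.yaml`.
    """
    return f"You asked: {question}"
```
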
A flow is represented as a YAML file and can be visualized with our [Prompt flow for VS Code extension](https://marketplace.visualstudio.com/items?itemName=prompt-flow.prompt-flow). Here is an example `flow.dag.yaml`:



Please refer to our [examples](https://github.com/microsoft/promptflow/tree/main/examples/flows) and the guides in this section to learn how to write a `DAG flow`.

Note:
- Prompt flow also supports developing a flow using code. Learn more in the comparison of these two [flow concepts](../../concepts/concept-flows.md).

```{toctree}
:maxdepth: 1
init-and-test-a-flow
develop-standard-flow
develop-chat-flow
develop-evaluation-flow
add-conditional-control-to-a-flow
process-image-in-flow
referencing-external-files-or-folders-in-a-flow
```
docs/how-to-guides/develop-a-flex-flow/class-based-flow.md (296 additions, 0 deletions)
# Class based flow

:::{admonition} Experimental feature
This is an experimental feature, and may change at any time. Learn [more](../faq.md#stable-vs-experimental).
:::

When users need to persist objects (like connections) in memory across multiple rounds of flow runs, they can write a callable class as the flow entry and put the persisted parameters in the `__init__` method.

If users need to log metrics on batch run outputs, they can add an `__aggregate__` method, which will be scheduled after the batch run finishes.
The `__aggregate__` method should take exactly one parameter, which is the list of batch run results.

See [connection support](#connection-support) & [aggregation support](#aggregation-support) for more details.

## Class as a flow

Assume we have a file `flow_entry.py`:

```python
from typing import List, TypedDict

# Import path may vary with the promptflow version you use.
from promptflow.core import AzureOpenAIModelConfiguration


class Reply(TypedDict):
    output: str


class MyFlow:
    def __init__(self, model_config: AzureOpenAIModelConfiguration, flow_config: dict):
        """Flow initialization logic goes here."""
        self.model_config = model_config
        self.flow_config = flow_config

    def __call__(self, question: str) -> Reply:
        """Flow execution logic goes here."""
        output = f"Answer to: {question}"  # replace with real logic
        return Reply(output=output)

    def __aggregate__(self, line_results: List[str]) -> dict:
        """Aggregation logic goes here. Return key-value pairs as metrics."""
        return {"line_count": len(line_results)}
```

## Flow test

Since the flow's definition is a function/callable class, we recommend running it directly like any other Python script:

```python
# Appended to flow_entry.py, where MyFlow and AzureOpenAIModelConfiguration are already defined/imported.
if __name__ == "__main__":
    model_config = AzureOpenAIModelConfiguration(
        azure_endpoint="my_endpoint",
        azure_deployment="my_deployment",
        api_key="actual_api_key",
    )
    flow = MyFlow(model_config=model_config, flow_config={})
    output = flow(question="What's the capital of France?")
    metrics = flow.__aggregate__([output])
    # check metrics here
    print(metrics)
```

You can also test the flow using the CLI:

```bash
# flow entry syntax: path.to.module:ClassName
pf flow test --flow flow_entry:MyFlow --inputs question="What's the capital of France?" --init init.json
```

Check out a full example here: [basic-chat](https://github.com/microsoft/promptflow/tree/main/examples/flex-flows/basic-chat)

### Chat with a flow

Chatting with a flow in the CLI is supported:

```bash
pf flow test --flow flow_entry:MyFlow --inputs inputs.json --init init.json --ui
```

Check [here](../chat-with-a-flow/index.md) for more information.

## Batch run

Users can also batch run a flow.

::::{tab-set}
:::{tab-item} CLI
:sync: CLI

```bash
pf run create --flow "path.to.module:ClassName" --data "./data.jsonl"
```

:::

:::{tab-item} SDK
:sync: SDK

```python
# users can also directly use the entry in the `flow` param for a batch run
from promptflow.client import PFClient  # import path may vary with promptflow version

pf = PFClient()
pf.run(flow="path.to.module:ClassName", init="./init.jsonl", data="./data.jsonl")
```

:::
::::

Or directly run the imported flow class or flow instance.

```python
# MyFlow and `config` refer to the class and model configuration defined above.
pf.run(flow=MyFlow, init={"model_config": config, "flow_config": {}}, data="./data.jsonl")
# or
flow_obj = MyFlow(model_config=config, flow_config={})
pf.run(flow=flow_obj, data="./data.jsonl")
```

Learn more about this topic in [Run and evaluate a flow](../run-and-evaluate-a-flow/index.md).

## Define a flow yaml

Users can write a YAML file named `flow.flex.yaml` manually or save a function/callable entry to a YAML file.
This is required for advanced scenarios like deployment or running in cloud.
A flow YAML may look like this:

```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
entry: path.to.module:ClassName
```

## Batch run with YAML

Users can batch run a flow with the YAML file. The flow's `__init__` parameters are supplied through the `init` parameter.

::::{tab-set}
:::{tab-item} CLI
:sync: CLI

Users need to write a JSON file as the `init` value, since it's hard to write a model config on the command line.

```json
{
    "model_config": {
        "azure_endpoint": "my_endpoint",
        "azure_deployment": "my_deployment",
        "api_key": "actual_api_key"
    },
    "flow_config": {}
}
```

```bash
pf run create --flow "./flow.flex.yaml" --data "./data.jsonl" --init init.json
```

:::

:::{tab-item} SDK
:sync: SDK

```python
from promptflow.client import PFClient  # import paths may vary with promptflow version
from promptflow.core import AzureOpenAIModelConfiguration

pf = PFClient()
config = AzureOpenAIModelConfiguration(
    azure_endpoint="my_endpoint",
    azure_deployment="my_deployment",
    api_key="actual_key"
)
# if the init value is not JSON serializable, a user error is raised
pf.run(flow="./flow.flex.yaml", init={"model_config": config, "flow_config": {}}, data="./data.jsonl")

# when submitting to cloud, users can only use a connection;
# at runtime the executor resolves the connection referenced in AzureOpenAIModelConfiguration
# and sets the connection's fields on the model config (equivalent to the original ModelConfiguration.from_connection())
config = AzureOpenAIModelConfiguration(
    azure_deployment="my_embedding_deployment",
    connection="my-aoai-connection",
)
# pfazure is a PFClient from promptflow.azure pointed at your workspace
pfazure.run(flow="./flow.flex.yaml", init={"model_config": config, "flow_config": {}}, data="./data.jsonl")
```

:::
::::

## Deploy a flow

Users can serve a flow. The flow's `__init__` parameters are supplied through the `init` parameter.
The flow should have a complete init/inputs/outputs specification in its YAML so that the serving swagger can be generated.

Users need to write a JSON file as the `init` value, since it's hard to write a model config on the command line.

```json
{
    "model_config": {
        "azure_endpoint": "my_endpoint",
        "azure_deployment": "my_deployment",
        "api_key": "actual_api_key"
    },
    "flow_config": {}
}
```

```bash
# the model config can only be passed via a file
pf flow serve --source "./" --port 8088 --host localhost --init path/to/init.json
```

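Once the flow is being served, you can call it over HTTP. The sketch below is illustrative: it assumes the default `/score` endpoint exposed by `pf flow serve`, the `question` input from the earlier examples, and the host/port from the command above.

```python
import requests  # third-party HTTP client, used here only for illustration

# Assumes the flow was started with `pf flow serve ... --port 8088 --host localhost`
# and that it declares a `question` input.
response = requests.post(
    "http://localhost:8088/score",
    json={"question": "What's the capital of France?"},
)
print(response.json())
```
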
Learn more: [Deploy a flow](../deploy-a-flow/index.md).

## Connection support

### Model config in `__init__`

Just like the example in [batch run](#batch-run-with-yaml), referencing a connection in a model config is supported.
The connection will be resolved, and its fields will be flattened into the model config.

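As a rough sketch of what this looks like from the flow's side (field names follow the `init.json` example above; this is illustrative, not an exact description of the executor's behavior):

```python
from promptflow.core import AzureOpenAIModelConfiguration  # import path may vary with version


class MyFlow:
    def __init__(self, model_config: AzureOpenAIModelConfiguration):
        # If the run was submitted with connection="my-aoai-connection" in the model
        # config, the connection has already been resolved by the time __init__ runs,
        # so fields such as azure_endpoint and api_key are populated.
        self.endpoint = model_config.azure_endpoint
        self.deployment = model_config.azure_deployment
```
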
### Connection in `__init__`

It's also supported to directly pass a connection by **name** in `__init__`.

```python
from promptflow.connections import AzureOpenAIConnection  # import path may vary with version


class MyFlow:
    def __init__(self, my_connection: AzureOpenAIConnection):
        # the connection name passed in `init` is resolved to a connection object
        self.my_connection = my_connection
```

Note:

- Union of connection types (`Union[OpenAIConnection, AzureOpenAIConnection]`) is not supported.

#### Batch run with connection

Users can pass a connection name to the connection field in `init`.

Locally, the connection name will be replaced with the local connection object at execution time.
In cloud, the connection name will be replaced with the workspace's connection object at execution time.

```python
# the local connection "my_connection"'s instance will be passed to `__init__`
pf.run(flow="./flow.flex.yaml", init={"connection": "my_connection"}, data="./data.jsonl")
# the cloud connection "my_cloud_connection"'s instance will be passed to `__init__`
pfazure.run(flow="./flow.flex.yaml", init={"connection": "my_cloud_connection"}, data="./data.jsonl")
```

### Environment variable connections (EVC)

If the flow YAML has `environment_variables` and its value is a connection reference like this:

```yaml
environment_variables:
  AZURE_OPENAI_API_KEY: ${open_ai_connection.api_key}
  AZURE_OPENAI_ENDPOINT: ${open_ai_connection.api_base}
```

The environment variable's value will be resolved to the actual value at runtime.
If the connection does not exist (locally or in cloud), a connection-not-found error will be raised.

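Inside the flow, these values are consumed like any other environment variable. A minimal sketch, assuming the variable names from the YAML above:

```python
import os


class MyFlow:
    def __call__(self, question: str) -> str:
        # At runtime the references above have been resolved, so these variables
        # hold the actual values from `open_ai_connection`.
        api_key = os.environ["AZURE_OPENAI_API_KEY"]
        endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
        return f"Would call {endpoint} to answer: {question}"
```
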
**Note**: Users can override `environment_variables` whose keys already exist in `flow.flex.yaml`:

```bash
pf run create --flow . --data ./data.jsonl --environment-variables AZURE_OPENAI_API_KEY='${new_connection.api_key}' AZURE_OPENAI_ENDPOINT='my_endpoint'
```

Overriding with environment variable names that don't exist in `flow.flex.yaml` is not supported.
This means that if users add environment variables at runtime that don't exist in `flow.flex.yaml`, their values won't be resolved.

For example,

```bash
pf run create --flow . --data ./data.jsonl --environment-variables NEW_API_KEY='${my_new_connection.api_key}'
```

The `NEW_API_KEY` value won't be resolved to the connection's API key.

## Aggregation support

Aggregation support is introduced to help users calculate metrics.

```python
from typing import List


class MyFlow:
    def __call__(self, text: str) -> str:
        """Flow execution logic goes here."""
        return text

    # Executes only once, after the batch run has finished.
    # processed_results is the list of __call__'s outputs, and the return value
    # is logged as metrics automatically.
    def __aggregate__(self, processed_results: List[str]) -> dict:
        total = len(processed_results)
        non_empty = 0
        for element in processed_results:
            # If __call__'s output is a primitive type, element is a primitive type.
            # If __call__'s output is a dataclass, element is a dictionary, but its
            # attributes can be accessed with `element.attribute_name`.
            # For other cases, it's recommended to access values by key: `element["attribute_name"]`.
            if element:
                non_empty += 1
        return {"total": total, "non_empty": non_empty}
```

**Note**:

There are several limitations on aggregation support:

- The aggregation function only executes in a batch run.
- Only one hard-coded `__aggregate__` function is supported.
- The `__aggregate__` function is only passed **1** positional argument when executing.
- The aggregation function's input is the flow run's list of outputs.
- Each element inside `processed_results` passed into the `__aggregate__` function is not the same object as what each line's `__call__` returns (see the sketch below).
- The reconstructed element is a dictionary that supports one layer of attribute access, but it's recommended to access values by key. See the example above for usage.
- If the aggregation function accepts more than one argument, an error is raised in the submission phase.

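To make the last two points concrete, here is a small hypothetical sketch: the flow returns a dataclass, and `__aggregate__` receives reconstructed dictionary elements that it reads by key.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Answer:
    text: str
    score: float


class MyFlow:
    def __call__(self, question: str) -> Answer:
        # placeholder logic; a real flow would call a model here
        return Answer(text=f"Answer to {question}", score=1.0)

    def __aggregate__(self, processed_results: List[dict]) -> dict:
        # Each element is a reconstructed dictionary, not the original Answer instance;
        # accessing fields by key is the recommended approach.
        scores = [element["score"] for element in processed_results]
        return {"average_score": sum(scores) / len(scores) if scores else 0.0}
```
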
## Next steps

- [Input output format](./input-output-format.md)
- [Class based flow sample](https://github.com/microsoft/promptflow/blob/main/examples/flex-flows/chat-basic/README.md)
- [Class based flow evaluation sample](https://github.com/microsoft/promptflow/blob/main/examples/flex-flows/eval-code-quality/README.md)