[Doc] Add flex flow doc (microsoft#2981)
# Description

Please add an informative description that covers the changes made by
the pull request and link all relevant issues.

# All Promptflow Contribution checklist:
- [ ] **The pull request does not introduce [breaking changes].**
- [ ] **CHANGELOG is updated for new features, bug fixes or other
significant changes.**
- [ ] **I have read the [contribution guidelines](../CONTRIBUTING.md).**
- [ ] **Create an issue and link to the pull request to get dedicated
review from promptflow team. Learn more: [suggested
workflow](../CONTRIBUTING.md#suggested-workflow).**

## General Guidelines and Best Practices
- [ ] Title of the pull request is clear and informative.
- [ ] There are a small number of commits, each of which have an
informative message. This means that previously merged commits do not
appear in the history of the PR. For more information on cleaning up the
commits in your PR, [see this
page](https://github.com/Azure/azure-powershell/blob/master/documentation/development-docs/cleaning-up-commits.md).

### Testing Guidelines
- [ ] Pull request includes test coverage for the included changes.

---------

Co-authored-by: Zhengfei Wang <[email protected]>
Co-authored-by: Clement Wang <[email protected]>
Co-authored-by: Clement Wang <[email protected]>
4 people authored Apr 25, 2024
1 parent 0a7cde3 commit 9cb7869
Showing 25 changed files with 572 additions and 36 deletions.
2 changes: 1 addition & 1 deletion docs/README.md
@@ -10,7 +10,7 @@ Below is a table of important doc pages.
|----------------|----------------|
|Quick start|[Getting started with prompt flow](./how-to-guides/quick-start.md)|
|Concepts|[Flows](./concepts/concept-flows.md)<br> [Tools](./concepts/concept-tools.md)<br> [Connections](./concepts/concept-connections.md)<br> [Variants](./concepts/concept-variants.md)<br> |
|How-to guides|[How to initialize and test a flow](./how-to-guides/develop-a-flow/init-and-test-a-flow.md) <br>[How to run and evaluate a flow](./how-to-guides/run-and-evaluate-a-flow/index.md)<br> [How to tune prompts using variants](./how-to-guides/tune-prompts-with-variants.md)<br>[How to deploy a flow](./how-to-guides/deploy-a-flow/index.md)<br>[How to create and use your own tool package](./how-to-guides/develop-a-tool/create-and-use-tool-package.md)|
|How-to guides|[How to initialize and test a flow](./how-to-guides/develop-a-dag-flow/init-and-test-a-flow.md) <br>[How to run and evaluate a flow](./how-to-guides/run-and-evaluate-a-flow/index.md)<br> [How to tune prompts using variants](./how-to-guides/tune-prompts-with-variants.md)<br>[How to deploy a flow](./how-to-guides/deploy-a-flow/index.md)<br>[How to create and use your own tool package](./how-to-guides/develop-a-tool/create-and-use-tool-package.md)|
|Tools reference|[LLM tool](./reference/tools-reference/llm-tool.md)<br> [Prompt tool](./reference/tools-reference/prompt-tool.md)<br> [Python tool](./reference/tools-reference/python-tool.md)<br> [Embedding tool](./reference/tools-reference/embedding_tool.md)<br>[SERP API tool](./reference/tools-reference/serp-api-tool.md) ||


2 changes: 1 addition & 1 deletion docs/cloud/azureai/run-promptflow-in-azure-ai.md
@@ -155,7 +155,7 @@ At the end of stream logs, you can find the `portal_url` of the submitted run, c

### Run snapshot of the flow with additional includes

Flows that enabled [additional include](../../how-to-guides/develop-a-flow/referencing-external-files-or-folders-in-a-flow.md) files can also be submitted for execution in the workspace. Please note that the specific additional include files or folders will be uploaded and organized within the **Files** folder of the run snapshot in the cloud.
Flows that enabled [additional include](../../how-to-guides/develop-a-dag-flow/referencing-external-files-or-folders-in-a-flow.md) files can also be submitted for execution in the workspace. Please note that the specific additional include files or folders will be uploaded and organized within the **Files** folder of the run snapshot in the cloud.

![img](../../media/cloud/azureml/run-with-additional-includes.png)

4 changes: 2 additions & 2 deletions docs/cloud/azureai/use-flow-in-azure-ml-pipeline.md
@@ -1,7 +1,7 @@
# Use flow in Azure ML pipeline job
In practical scenarios, flows fulfill various functions. For example, consider an offline flow specifically designed to assess the relevance score for communication sessions between humans and agents. This flow is triggered nightly and processes a substantial amount of session data. In such a context, Parallel component and AzureML pipeline emerge as the optimal choices for handling large-scale, highly resilient, and efficient offline batch requirements.

Once you’ve developed and thoroughly tested your flow using the guidelines in the [init and test a flow](../../how-to-guides/develop-a-flow/init-and-test-a-flow.md) section, this guide will walk you through utilizing your flow as a parallel component within an AzureML pipeline job.
Once you’ve developed and thoroughly tested your flow, this guide will walk you through utilizing your flow as a parallel component within an AzureML pipeline job.

:::{admonition} Pre-requirements
To enable this feature, customers need to:
@@ -329,7 +329,7 @@ Given above, if your flow has logic relying on identity or environment variable,
| key | source | type | description |
| ----------- | ------ | ---------------------- | ------------------------------------------------------------ |
| data | fixed | uri_folder or uri_file | required; to pass in input data. Supported format includes [`mltable`](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-mltable?view=azureml-api-2&tabs=cli#authoring-mltable-files) and list of jsonl files. |
| run_outputs | fixed | uri_folder | optional; to pass in output of a standard flow for [an evaluation flow](../../how-to-guides/develop-a-flow/develop-evaluation-flow.md). Should be linked to a `flow_outputs` of a previous flow node in the pipeline. |
| run_outputs | fixed | uri_folder | optional; to pass in output of a standard flow for [an evaluation flow](../../how-to-guides/develop-a-dag-flow/develop-evaluation-flow.md). Should be linked to a `flow_outputs` of a previous flow node in the pipeline. |

### Output ports

15 changes: 13 additions & 2 deletions docs/concepts/concept-flows.md
@@ -12,14 +12,25 @@ Our [examples](https://github.com/microsoft/promptflow/tree/main/examples/flex-f

Thus LLM apps can be defined as Directed Acyclic Graphs (DAGs) of function calls. These DAGs are flows in prompt flow.

A flow in prompt flow is a DAG of functions (we call them [tools](./concept-tools.md)). These functions/tools connected via input/output dependencies and executed based on the topology by prompt flow executor.
A `DAG flow` in prompt flow is a DAG of functions (we call them [tools](./concept-tools.md)). These functions/tools are connected via input/output dependencies and executed based on the topology by the prompt flow executor.

A flow is represented as a YAML file and can be visualized with our [Prompt flow for VS Code extension](https://marketplace.visualstudio.com/items?itemName=prompt-flow.prompt-flow). Here is an example `flow.dag.yaml`:

![flow_dag](../media/how-to-guides/quick-start/flow_dag.png)

Please refer to our [examples](https://github.com/microsoft/promptflow/tree/main/examples/flows) to learn how to write a `DAG flow`.

## When to use Flex or DAG flow

`DAG flow` provides a UI-friendly way to develop your LLM app, with the following benefits:
- **Low code**: users can drag and drop in the UI to create an LLM app.
- **DAG visualization**: users can easily understand the logic structure of the app with the DAG view.

`Flex flow` provides a code-friendly way to develop your LLM app, with the following benefits (see the minimal sketch after this list):
- **Quick start**: users can quickly test with a simple prompt, then customize it with Python code and the tracing visualization UI.
- **More advanced orchestration**: users can write complex flows with Python's built-in control flow (if-else, for loops) or other third-party / open-source libraries.
- **Easy onboarding from other platforms**: users who already have existing code on platforms like `langchain` and `semantic kernel` can onboard to promptflow with a few code changes.
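As an illustration, a flex flow entry can be as small as a plain Python function; the function name and logic below are hypothetical placeholders, not a required signature:

```python
# a minimal flex flow entry sketch; name and logic are illustrative placeholders
def my_flow(question: str) -> str:
    """A plain Python function (or callable class) can serve as the flow entry."""
    return f"You asked: {question}"
```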

## Flow types

Prompt flow examples organize flows by three categories:
@@ -42,6 +53,6 @@ DAG flow [examples](https://github.com/microsoft/promptflow/tree/main/examples/f
## Next steps

- [Quick start](../how-to-guides/quick-start.md)
- [Initialize and test a flow](../how-to-guides/develop-a-flow/init-and-test-a-flow.md)
- [Initialize and test a flow](../how-to-guides/develop-a-dag-flow/init-and-test-a-flow.md)
- [Run and evaluate a flow](../how-to-guides/run-and-evaluate-a-flow/index.md)
- [Tune prompts using variants](../how-to-guides/tune-prompts-with-variants.md)
26 changes: 26 additions & 0 deletions docs/how-to-guides/develop-a-dag-flow/index.md
@@ -0,0 +1,26 @@
# Develop a DAG flow

LLM apps can be defined as Directed Acyclic Graphs (DAGs) of function calls. These DAGs are flows in prompt flow.

A `DAG flow` in prompt flow is a DAG of functions (we call them [tools](../../concepts/concept-tools.md)). These functions/tools are connected via input/output dependencies and executed based on the topology by the prompt flow executor.

A flow is represented as a YAML file and can be visualized with our [Prompt flow for VS Code extension](https://marketplace.visualstudio.com/items?itemName=prompt-flow.prompt-flow). Here is an example `flow.dag.yaml`:

![flow_dag](../../media/how-to-guides/quick-start/flow_dag.png)

Please refer to our [examples](https://github.com/microsoft/promptflow/tree/main/examples/flows) and guides in this section to learn how to write a `DAG flow`.

Note:
- Promptflow also supports developing a flow using code. Learn more about the comparison of the two approaches in [flow concepts](../../concepts/concept-flows.md).

```{toctree}
:maxdepth: 1
init-and-test-a-flow
develop-standard-flow
develop-chat-flow
develop-evaluation-flow
add-conditional-control-to-a-flow
process-image-in-flow
referencing-external-files-or-folders-in-a-flow
```
296 changes: 296 additions & 0 deletions docs/how-to-guides/develop-a-flex-flow/class-based-flow.md
@@ -0,0 +1,296 @@
# Class based flow

:::{admonition} Experimental feature
This is an experimental feature, and may change at any time. Learn [more](../faq.md#stable-vs-experimental).
:::

When users need to persist objects (like connections) in memory across multiple rounds of flow runs, they can write a callable class as the flow entry and put the persisted params in the `__init__` method.

If users need to log metrics on batch run outputs, they can add an `__aggregate__` method, which will be scheduled after the batch run finishes.
The `__aggregate__` method should accept only one parameter, which is the list of batch run results.

See [connection support](#connection-support) & [aggregation support](#aggregation-support) for more details.

## Class as a flow

Assume we have a file `flow_entry.py`:

```python
from typing import List, TypedDict

from promptflow.core import AzureOpenAIModelConfiguration


class Reply(TypedDict):
    output: str


class MyFlow:
    def __init__(self, model_config: AzureOpenAIModelConfiguration, flow_config: dict):
        """Flow initialization logic goes here."""
        self.model_config = model_config
        self.flow_config = flow_config

    def __call__(self, question: str) -> Reply:
        """Flow execution logic goes here."""
        output = f"Echo: {question}"  # placeholder for the real LLM call
        return Reply(output=output)

    def __aggregate__(self, line_results: List[str]) -> dict:
        """Aggregation logic goes here. Return key-value pairs as metrics."""
        return {"line_count": len(line_results)}
```


## Flow test

Since a flow's definition is just a function or callable class, we recommend users run it directly, like any other script:

```python
# MyFlow is the callable class defined above in flow_entry.py
if __name__ == "__main__":
    model_config = AzureOpenAIModelConfiguration(azure_deployment="my_deployment")  # hypothetical values
    flow = MyFlow(model_config=model_config, flow_config={})
    output = flow(question="What's the capital of France?")
    metrics = flow.__aggregate__([output])
    # check metrics here
```

You can also test the flow using the CLI:
```bash
# flow entry syntax: path.to.module:ClassName
pf flow test --flow flow_entry:MyFlow --inputs question="What's the capital of France?" --init init.json
```

Check out a full example here: [basic-chat](https://github.com/microsoft/promptflow/tree/main/examples/flex-flows/basic-chat)

### Chat with a flow

Chatting with a flow in the CLI is supported:

```bash
pf flow test --flow flow_entry:MyFlow --inputs inputs.json --init init.json --ui
```

Check [here](../chat-with-a-flow/index.md) for more information.

## Batch run

Users can also batch run a flow.

::::{tab-set}
:::{tab-item} CLI
:sync: CLI

```bash
pf run create --flow "path.to.module:ClassName" --data "./data.jsonl"
```

:::

:::{tab-item} SDK
:sync: SDK
```python
from promptflow.client import PFClient

pf = PFClient()
# users can also directly use the entry string in the `flow` param for a batch run
pf.run(flow="path.to.module:ClassName", init="./init.jsonl", data="./data.jsonl")
```

:::
::::

Or directly run the imported flow class or flow instance.

```python
# MyFlow is the callable class defined in flow_entry.py; `config` is a model configuration as above
pf.run(flow=MyFlow, init={"model_config": config, "flow_config": {}}, data="./data.jsonl")
# or
flow_obj = MyFlow(model_config=config, flow_config={})
pf.run(flow=flow_obj, data="./data.jsonl")
```

Learn more about this topic in [Run and evaluate a flow](../run-and-evaluate-a-flow/index.md).

## Define a flow YAML

Users can write a YAML file named `flow.flex.yaml` manually or save a function/callable entry to a YAML file.
This is required for advanced scenarios like deployment or running in the cloud.
A flow YAML may look like this:

```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
entry: path.to.module:ClassName
```
## Batch run with YAML

Users can batch run a flow. The flow's `__init__` params are passed via the `init` parameter.

::::{tab-set}
:::{tab-item} CLI
:sync: CLI

Users need to write a JSON file as the init value, since it's hard to write a model config on the command line.

```json
{
"model_config": {
"azure_endpoint": "my_endpoint",
"azure_deployment": "my_deployment",
"api_key": "actual_api_key"
},
"flow_config": {}
}
```

```bash
pf run create --flow "./flow.flex.yaml" --data "./data.jsonl" --init init.json
```

:::

:::{tab-item} SDK
:sync: SDK

```python
from promptflow.client import PFClient
from promptflow.core import AzureOpenAIModelConfiguration

pf = PFClient()
config = AzureOpenAIModelConfiguration(
    azure_deployment="my_deployment",
    api_key="actual_key"
)
# if init's value is not JSON serializable, a user error is raised
pf.run(flow="./flow.flex.yaml", init={"model_config": config, "flow_config": {}}, data="./data.jsonl")

# when submitting to cloud, users can only use a connection;
# at runtime the executor resolves the connection in AzureOpenAIModelConfiguration and sets the
# connection's fields on the model config, equivalent to the original ModelConfiguration.from_connection()
config = AzureOpenAIModelConfiguration(
    azure_deployment="my_embedding_deployment",
    connection="my-aoai-connection",
)
# pfazure is assumed to be an Azure PFClient instance (promptflow.azure.PFClient)
pfazure.run(flow="./flow.flex.yaml", init={"model_config": config, "flow_config": {}}, data="./data.jsonl")
```

:::
::::

## Deploy a flow

Users can serve a flow. The flow's `__init__` params are passed via the `init` parameter.
The flow should have a complete init/inputs/outputs specification in YAML to make sure the serving swagger can be generated.

Users need to write a JSON file as the init value, since it's hard to write a model config on the command line.

```json
{
"model_config": {
"azure_endpoint": "my_endpoint",
"azure_deployment": "my_deployment",
"api_key": "actual_api_key"
},
"flow_config": {}
}
```

```bash
# users can only pass the model config via a file
pf flow serve --source "./" --port 8088 --host localhost --init path/to/init.json
```
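
Once the flow is served, it can be called over HTTP. The sketch below is illustrative and assumes the default `/score` endpoint and a `question` input; adjust the port and payload to match your flow's inputs:

```python
import json
from urllib import request

# a minimal sketch of calling the locally served flow; endpoint path and input names are assumptions
req = request.Request(
    "http://localhost:8088/score",
    data=json.dumps({"question": "What's the capital of France?"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with request.urlopen(req) as resp:
    print(json.loads(resp.read()))
```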

Learn more: [Deploy a flow](../deploy-a-flow/index.md).

## Connection support

### Model config in `__init__`

Just like the example in [batch run](#batch-run-with-yaml), referencing a connection in the model config is supported.
The connection will be resolved, and its fields will be flattened into the model config.
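
As a sketch, a model config that references a connection by name might look like this (the connection and deployment names are hypothetical):

```python
from promptflow.core import AzureOpenAIModelConfiguration

# the named connection is resolved at execution time and its fields (endpoint, api key, ...)
# are flattened into this model config
config = AzureOpenAIModelConfiguration(
    connection="my-aoai-connection",   # hypothetical connection name
    azure_deployment="my_deployment",  # hypothetical deployment name
)
```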

### Connection in `__init__`

It's also supported to directly pass a connection by **name** in `__init__`.

```python
from promptflow.connections import AzureOpenAIConnection


class MyFlow:
    def __init__(self, my_connection: AzureOpenAIConnection):
        self.my_connection = my_connection
```

Note:

- A union of connection types (`Union[OpenAIConnection, AzureOpenAIConnection]`) is not supported.

#### Batch run with connection

Users can pass a connection name to the connection field in `init`.

Locally, the connection name will be replaced with the local connection object at execution time.
In the cloud, the connection name will be replaced with the workspace's connection object at execution time.

```python
# local connection "my_connection"'s instance will be passed to `__init__`
pf.run(flow="./flow.flex.yaml", init={"connection": "my_connection"}, data="./data.jsonl")
# cloud connection "my_cloud_connection"'s instance will be passed to `__init__`
pfazure.run(flow="./flow.flex.yaml", init={"connection": "my_cloud_connection"}, data="./data.jsonl")
```
### Environment variable connections (EVC)

If the flow YAML has `environment_variables` whose values are connection references like this:

```yaml
environment_variables:
AZURE_OPENAI_API_KEY: ${open_ai_connection.api_key}
AZURE_OPENAI_ENDPOINT: ${open_ai_connection.api_base}
```

The environment variable's value will be resolved to the actual value at runtime.
If the connection does not exist (locally or in the cloud), a connection-not-found error will be raised.
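
Inside the flow code, the resolved values can then be read as regular environment variables, for example:

```python
import os

# the resolved connection values are available as plain environment variables at runtime
api_key = os.environ.get("AZURE_OPENAI_API_KEY")
endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
```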

**Note**: Users can override `environment_variables` values for keys that already exist in `flow.flex.yaml`:

```bash
pf run create --flow . --data ./data.jsonl --environment-variables AZURE_OPENAI_API_KEY='${new_connection.api_key}' AZURE_OPENAI_ENDPOINT='my_endpoint'
```

Overriding environment variable names that do not exist in `flow.flex.yaml` is not supported.
This means that if users add environment variables at runtime that do not exist in `flow.flex.yaml`, their values won't be resolved.

For example,

```bash
pf run create --flow . --data ./data.jsonl --environment-variables NEW_API_KEY='${my_new_connection.api_key}'
```

The `NEW_API_KEY` value won't be resolved to the connection's API key.

## Aggregation support

Aggregation support is introduced to help users calculate metrics.

```python
from typing import List


class MyFlow:
    def __call__(self, text: str) -> str:
        """Flow execution logic goes here."""
        return text

    # executes only once, after the batch run finishes.
    # processed_results is the list of __call__'s outputs; the return value is logged as metrics automatically.
    def __aggregate__(self, processed_results: List[str]) -> dict:
        for element in processed_results:
            # If __call__'s output is a primitive type, element will be that primitive type.
            # If __call__'s output is a dataclass, element will be a dictionary, but its attributes
            # can still be accessed with `element.attribute_name`.
            # For other cases, it's recommended to access values by key: `element["attribute_name"]`.
            ...
        return {"line_count": len(processed_results)}
```

**Note**:

There are several limitations on aggregation support:

- The aggregation function will only execute in a batch run.
- Only one hard-coded `__aggregate__` function is supported.
- `__aggregate__` will only be passed **one** positional argument when executing.
- The aggregation function's input will be the flow run's list of outputs.
- Each element inside `processed_results` passed to the `__aggregate__` function is not the same object that each line's `__call__` returned.
- The reconstructed element is a dictionary that supports one layer of attribute access, but it's recommended to access values by key. See the example above for usage.
- If the aggregation function accepts more than one argument, an error is raised in the submission phase.
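
As an illustrative sketch, the metrics returned by `__aggregate__` can be inspected after a batch run; the entry string and data path below are hypothetical:

```python
from promptflow.client import PFClient

pf = PFClient()
# `flow_entry:MyFlow` refers to the class sketched above; the data file is a placeholder
run = pf.run(flow="flow_entry:MyFlow", data="./data.jsonl")
print(pf.get_metrics(run))  # metrics returned by __aggregate__, e.g. {"line_count": 3}
```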

## Next steps

- [Input output format](./input-output-format.md)
- [Class based flow sample](https://github.com/microsoft/promptflow/blob/main/examples/flex-flows/chat-basic/README.md)
- [Class based flow evaluation sample](https://github.com/microsoft/promptflow/blob/main/examples/flex-flows/eval-code-quality/README.md)