diff --git a/docs/docs/tutorials/entity_extraction/index.ipynb b/docs/docs/tutorials/entity_extraction/index.ipynb new file mode 100644 index 000000000..8ec298196 --- /dev/null +++ b/docs/docs/tutorials/entity_extraction/index.ipynb @@ -0,0 +1,1232 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Tutorial: Entity Extraction\n", + "\n", + "This tutorial demonstrates how to perform **entity extraction** using the CoNLL-2003 dataset with DSPy. The focus is on extracting entities referring to people. We will:\n", + "\n", + "- Extract and label entities referring to people from the CoNLL-2003 dataset\n", + "- Define a DSPy program that extracts those entities\n", + "- Optimize and evaluate the program on a subset of the CoNLL-2003 dataset\n", + "\n", + "By the end of this tutorial, you'll understand how to structure tasks in DSPy using signatures and modules, evaluate your system's performance, and improve its quality with optimizers.\n", + "\n", + "Install the latest version of DSPy and follow along. If you're looking instead for a conceptual overview of DSPy, this [recent lecture](https://www.youtube.com/live/JEMYuzrKLUw) is a good place to start." + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [], + "source": [ + "# Install the latest version of DSPy\n", + "%pip install -U dspy-ai\n", + "# Install the Hugging Face datasets library to load the CoNLL-2003 dataset\n", + "%pip install datasets" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Load and Prepare the Dataset\n", + "\n", + "In this section, we prepare the CoNLL-2003 dataset, which is commonly used for entity extraction tasks. The dataset includes tokens annotated with entity labels such as persons, organizations, and locations.\n", + "\n", + "We will:\n", + "1. Load the dataset using the Hugging Face `datasets` library.\n", + "2. Define a function to extract tokens referring to people.\n", + "3. Slice the dataset to create smaller subsets for training and testing.\n", + "\n", + "DSPy expects examples in a structured format, so we'll also transform the dataset into DSPy `Examples` for easy integration."
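The extraction helper defined below treats NER codes 1 and 2 as person tokens. To verify where those codes come from, you can inspect the label scheme attached to the loaded dataset. The quick check below uses the Hugging Face `datasets` feature metadata; run it after the loading cell, and note that the tag names in the comment assume the standard CoNLL-2003 scheme:

```python
# Run after the loading cell below. For CoNLL-2003 this prints:
# ['O', 'B-PER', 'I-PER', 'B-ORG', 'I-ORG', 'B-LOC', 'I-LOC', 'B-MISC', 'I-MISC']
# Codes 1 (B-PER) and 2 (I-PER) are the person tags used by `extract_people_entities`.
print(dataset["train"].features["ner_tags"].feature.names)
```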
+ ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import tempfile\n", + "from datasets import load_dataset\n", + "from typing import Dict, Any, List\n", + "import dspy\n", + "\n", + "def load_conll_dataset() -> dict:\n", + " \"\"\"\n", + " Loads the CoNLL-2003 dataset into train, validation, and test splits.\n", + " \n", + " Returns:\n", + " dict: Dataset splits with keys 'train', 'validation', and 'test'.\n", + " \"\"\"\n", + " with tempfile.TemporaryDirectory() as temp_dir:\n", + " # Use a temporary Hugging Face cache directory for compatibility with certain hosted notebook\n", + " # environments that don't support the default Hugging Face cache directory\n", + " os.environ[\"HF_DATASETS_CACHE\"] = temp_dir\n", + " return load_dataset(\"conll2003\", trust_remote_code=True)\n", + "\n", + "def extract_people_entities(data_row: Dict[str, Any]) -> List[str]:\n", + " \"\"\"\n", + " Extracts entities referring to people from a row of the CoNLL-2003 dataset.\n", + " \n", + " Args:\n", + " data_row (Dict[str, Any]): A row from the dataset containing tokens and NER tags.\n", + " \n", + " Returns:\n", + " List[str]: List of tokens tagged as people.\n", + " \"\"\"\n", + " return [\n", + " token\n", + " for token, ner_tag in zip(data_row[\"tokens\"], data_row[\"ner_tags\"])\n", + " if ner_tag in (1, 2) # CoNLL entity codes 1 and 2 refer to people\n", + " ]\n", + "\n", + "def prepare_dataset(data_split, start: int, end: int) -> List[dspy.Example]:\n", + " \"\"\"\n", + " Prepares a sliced dataset split for use with DSPy.\n", + " \n", + " Args:\n", + " data_split: The dataset split (e.g., train or test).\n", + " start (int): Starting index of the slice.\n", + " end (int): Ending index of the slice.\n", + " \n", + " Returns:\n", + " List[dspy.Example]: List of DSPy Examples with tokens and expected labels.\n", + " \"\"\"\n", + " return [\n", + " dspy.Example(\n", + " tokens=row[\"tokens\"],\n", + " expected_extracted_people=extract_people_entities(row)\n", + " ).with_inputs(\"tokens\")\n", + " for row in data_split.select(range(start, end))\n", + " ]\n", + "\n", + "# Load the dataset\n", + "dataset = load_conll_dataset()\n", + "\n", + "# Prepare the training and test sets\n", + "train_set = prepare_dataset(dataset[\"train\"], 0, 50)\n", + "test_set = prepare_dataset(dataset[\"test\"], 0, 200)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Configure DSPy and create an Entity Extraction Program\n", + "\n", + "Here, we define a DSPy program for extracting entities referring to people from tokenized text.\n", + "\n", + "Then, we configure DSPy to use a particular language model (`gpt-4o-mini`) for all invocations of the program.\n", + "\n", + "**Key DSPy Concepts Introduced:**\n", + "- **Signatures:** Define structured input/output schemas for your program.\n", + "- **Modules:** Encapsulate program logic in reusable, composable units.\n", + "\n", + "Specifically, we'll:\n", + "- Create a `PeopleExtraction` DSPy Signature to specify the input (`tokens`) and output (`extracted_people`) fields.\n", + "- Define a `people_extractor` program that uses DSPy's built-in `dspy.ChainOfThought` module to implement the `PeopleExtraction` signature. The program extracts entities referring to people from a list of input tokens using language model (LM) prompting.\n", + "- Use the `dspy.LM` class and `dspy.settings.configure()` method to configure the language model that DSPy will use when invoking the program." 
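A brief note on module choice: `dspy.ChainOfThought` asks the LM to produce a `rationale` field before the output (it appears in the evaluation tables later), which tends to help on borderline extractions. As a hypothetical point of comparison, once `PeopleExtraction` is defined in the next cell, the simplest alternative module would prompt for the answer directly:

```python
# Hypothetical variant, not used in this tutorial: dspy.Predict prompts for
# `extracted_people` directly, with no intermediate rationale field.
direct_extractor = dspy.Predict(PeopleExtraction)
```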
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "from typing import List\n", + "\n", + "class PeopleExtraction(dspy.Signature):\n", + " \"\"\"\n", + " Extract contiguous tokens referring to specific people, if any, from a list of string tokens.\n", + " Output a list of tokens. In other words, do not combine multiple tokens into a single value.\n", + " \"\"\"\n", + " tokens: list[str] = dspy.InputField(desc=\"tokenized text\")\n", + " extracted_people: list[str] = dspy.OutputField(desc=\"all tokens referring to specific people extracted from the tokenized text\")\n", + "\n", + "people_extractor = dspy.ChainOfThought(PeopleExtraction)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Here, we tell DSPy to use OpenAI's `gpt-4o-mini` model in our program. To authenticate, DSPy reads your `OPENAI_API_KEY`. You can easily swap this out for [other providers or local models](https://github.com/stanfordnlp/dspy/blob/main/examples/migration.ipynb)." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [], + "source": [ + "lm = dspy.LM(model=\"openai/gpt-4o-mini\")\n", + "dspy.settings.configure(lm=lm)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Define Metric and Evaluation Functions\n", + "\n", + "In DSPy, evaluating a program's performance is critical for iterative development. A good evaluation framework allows us to:\n", + "- Measure the quality of our program's outputs.\n", + "- Compare outputs against ground-truth labels.\n", + "- Identify areas for improvement.\n", + "\n", + "**What We'll Do:**\n", + "- Define a custom metric (`extraction_correctness_metric`) to evaluate whether the extracted entities match the ground truth.\n", + "- Create an evaluation function (`evaluate_correctness`) to apply this metric to a training or test dataset and compute the overall accuracy.\n", + "\n", + "The evaluation function uses DSPy's `Evaluate` utility to handle parallelism and visualization of results." + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [], + "source": [ + "def extraction_correctness_metric(example: dspy.Example, prediction: dspy.Prediction, trace=None) -> bool:\n", + " \"\"\"\n", + " Computes correctness of entity extraction predictions.\n", + " \n", + " Args:\n", + " example (dspy.Example): The dataset example containing expected people entities.\n", + " prediction (dspy.Prediction): The prediction from the DSPy people extraction program.\n", + " trace: Optional trace object for debugging.\n", + " \n", + " Returns:\n", + " bool: True if predictions match expectations, False otherwise.\n", + " \"\"\"\n", + " return prediction.extracted_people == example.expected_extracted_people\n", + "\n", + "evaluate_correctness = dspy.Evaluate(\n", + " devset=test_set,\n", + " metric=extraction_correctness_metric,\n", + " num_threads=24,\n", + " display_progress=True,\n", + " display_table=True\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Evaluate Initial Extractor\n", + "\n", + "Before optimizing our program, we need a baseline evaluation to understand its current performance. 
This helps us:\n", + "- Establish a reference point for comparison after optimization.\n", + "- Identify potential weaknesses in the initial implementation.\n", + "\n", + "In this step, we'll run our `people_extractor` program on the test set and measure its accuracy using the evaluation framework defined earlier." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Average Metric: 172.00 / 200 (86.0%): 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 200/200 [00:16<00:00, 11.94it/s]" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:08:04 INFO dspy.evaluate.evaluate: Average Metric: 172 / 200 (86.0%)\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
tokensexpected_extracted_peoplerationaleextracted_peopleextraction_correctness_metric
0[SOCCER, -, JAPAN, GET, LUCKY, WIN, ,, CHINA, IN, SURPRISE, DEFEAT...[CHINA]We extracted \"JAPAN\" and \"CHINA\" as they refer to specific countri...[JAPAN, CHINA]
1[Nadim, Ladki][Nadim, Ladki]We extracted the tokens \"Nadim\" and \"Ladki\" as they refer to speci...[Nadim, Ladki]\u2714\ufe0f [True]
2[AL-AIN, ,, United, Arab, Emirates, 1996-12-06][]There are no tokens referring to specific people in the provided l...[]\u2714\ufe0f [True]
3[Japan, began, the, defence, of, their, Asian, Cup, title, with, a...[]We did not find any tokens referring to specific people in the pro...[]\u2714\ufe0f [True]
4[But, China, saw, their, luck, desert, them, in, the, second, matc...[]The extracted tokens referring to specific people are \"China\" and ...[China, Uzbekistan]
..................
195['The', 'Wallabies', 'have', 'their', 'sights', 'set', 'on', 'a', ...[David, Campese]The extracted_people includes \"David Campese\" as it refers to a sp...[David, Campese]\u2714\ufe0f [True]
196['The', 'Wallabies', 'currently', 'have', 'no', 'plans', 'to', 'ma...[]The extracted_people includes \"Wallabies\" as it refers to a specif...[]\u2714\ufe0f [True]
197['Campese', 'will', 'be', 'up', 'against', 'a', 'familiar', 'foe',...[Campese, Rob, Andrew]The extracted tokens refer to specific people mentioned in the tex...[Campese, Rob, Andrew]\u2714\ufe0f [True]
198['\"', 'Campo', 'has', 'a', 'massive', 'following', 'in', 'this', '...[Campo, Andrew]The extracted tokens referring to specific people include \"Campo\" ...[Campo, Andrew]\u2714\ufe0f [True]
199['On', 'tour', ',', 'Australia', 'have', 'won', 'all', 'four', 'te...[]We extracted the names of specific people from the tokenized text....[]\u2714\ufe0f [True]
\n", + "

200 rows \u00d7 5 columns

\n", + "
" + ], + "text/plain": [ + " tokens \\\n", + "0 [SOCCER, -, JAPAN, GET, LUCKY, WIN, ,, CHINA, IN, SURPRISE, DEFEAT... \n", + "1 [Nadim, Ladki] \n", + "2 [AL-AIN, ,, United, Arab, Emirates, 1996-12-06] \n", + "3 [Japan, began, the, defence, of, their, Asian, Cup, title, with, a... \n", + "4 [But, China, saw, their, luck, desert, them, in, the, second, matc... \n", + ".. ... \n", + "195 ['The', 'Wallabies', 'have', 'their', 'sights', 'set', 'on', 'a', ... \n", + "196 ['The', 'Wallabies', 'currently', 'have', 'no', 'plans', 'to', 'ma... \n", + "197 ['Campese', 'will', 'be', 'up', 'against', 'a', 'familiar', 'foe',... \n", + "198 ['\"', 'Campo', 'has', 'a', 'massive', 'following', 'in', 'this', '... \n", + "199 ['On', 'tour', ',', 'Australia', 'have', 'won', 'all', 'four', 'te... \n", + "\n", + " expected_extracted_people \\\n", + "0 [CHINA] \n", + "1 [Nadim, Ladki] \n", + "2 [] \n", + "3 [] \n", + "4 [] \n", + ".. ... \n", + "195 [David, Campese] \n", + "196 [] \n", + "197 [Campese, Rob, Andrew] \n", + "198 [Campo, Andrew] \n", + "199 [] \n", + "\n", + " rationale \\\n", + "0 We extracted \"JAPAN\" and \"CHINA\" as they refer to specific countri... \n", + "1 We extracted the tokens \"Nadim\" and \"Ladki\" as they refer to speci... \n", + "2 There are no tokens referring to specific people in the provided l... \n", + "3 We did not find any tokens referring to specific people in the pro... \n", + "4 The extracted tokens referring to specific people are \"China\" and ... \n", + ".. ... \n", + "195 The extracted_people includes \"David Campese\" as it refers to a sp... \n", + "196 The extracted_people includes \"Wallabies\" as it refers to a specif... \n", + "197 The extracted tokens refer to specific people mentioned in the tex... \n", + "198 The extracted tokens referring to specific people include \"Campo\" ... \n", + "199 We extracted the names of specific people from the tokenized text.... \n", + "\n", + " extracted_people extraction_correctness_metric \n", + "0 [JAPAN, CHINA] \n", + "1 [Nadim, Ladki] \u2714\ufe0f [True] \n", + "2 [] \u2714\ufe0f [True] \n", + "3 [] \u2714\ufe0f [True] \n", + "4 [China, Uzbekistan] \n", + ".. ... ... \n", + "195 [David, Campese] \u2714\ufe0f [True] \n", + "196 [] \u2714\ufe0f [True] \n", + "197 [Campese, Rob, Andrew] \u2714\ufe0f [True] \n", + "198 [Campo, Andrew] \u2714\ufe0f [True] \n", + "199 [] \u2714\ufe0f [True] \n", + "\n", + "[200 rows x 5 columns]" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "86.0" + ] + }, + "execution_count": 6, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "evaluate_correctness(people_extractor, devset=test_set)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Optimize the Model\n", + "\n", + "DSPy includes powerful optimizers that can improve the quality of your system.\n", + "\n", + "Here, we use DSPy's `MIPROv2` optimizer to:\n", + "- Automatically tune the program's language model (LM) prompt by 1. using the LM to adjust the prompt's instructions and 2. building few-shot examples from the training dataset that are augmented with reasoning generated from `dspy.ChainOfThought`.\n", + "- Maximize correctness on the training set.\n", + "\n", + "This optimization process is automated, saving time and effort while improving accuracy." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:08:04 INFO dspy.teleprompt.mipro_optimizer_v2: \n", + "RUNNING WITH THE FOLLOWING MEDIUM AUTO RUN SETTINGS:\n", + "num_trials: 25\n", + "minibatch: False\n", + "num_candidates: 19\n", + "valset size: 40\n", + "\n", + "2024/11/18 21:08:04 INFO dspy.teleprompt.mipro_optimizer_v2: \n", + "==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==\n", + "2024/11/18 21:08:04 INFO dspy.teleprompt.mipro_optimizer_v2: These will be used as few-shot example candidates for our program and for creating instructions.\n", + "\n", + "2024/11/18 21:08:04 INFO dspy.teleprompt.mipro_optimizer_v2: Bootstrapping N=19 sets of demonstrations...\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Bootstrapping set 1/19\n", + "Bootstrapping set 2/19\n", + "Bootstrapping set 3/19\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n...\n", + "...\n", + "...\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Bootstrapped 2 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.\n", + "Bootstrapping set 19/19\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + " 40%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u258a | 4/10 [00:00<00:00, 995.21it/s]\n", + "2024/11/18 21:08:17 INFO dspy.teleprompt.mipro_optimizer_v2: \n", + "==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==\n", + "2024/11/18 21:08:17 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Bootstrapped 4 full traces after 4 examples for up to 1 rounds, amounting to 4 attempts.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:08:21 INFO dspy.teleprompt.mipro_optimizer_v2: \n", + "Proposing instructions...\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: Proposed Instructions for Predictor 0:\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 0: Extract contiguous tokens referring to specific people, if any, from a list of string tokens.\n", + "Output a list of tokens. In other words, do not combine multiple tokens into a single value.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 1: Given a list of tokenized text, identify and extract all contiguous tokens that refer to specific individuals. Ensure that the output is a list of these tokens without combining them into single values. Provide a clear rationale explaining the reasoning behind each extraction.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 2: In a high-stakes scenario where accurate identification of EU officials is crucial for compliance with new health regulations affecting livestock, extract contiguous tokens from the provided list that refer to specific individuals. 
Ensure that your output is a comprehensive list of these tokens, as any oversight could lead to significant regulatory implications. Remember, do not combine multiple tokens into a single value; each name must be clearly delineated.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 3: Given a list of tokenized text strings, identify and extract any contiguous tokens that refer to specific individuals. Provide a rationale for your extraction process, explaining the reasoning step by step. Output the extracted names as a list of tokens, ensuring that multiple tokens are not combined into a single value.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 4: You are a Named Entity Recognition expert. Your task is to extract contiguous tokens that refer to specific people from the provided list of string tokens. If there are no specific individuals mentioned, return an empty list. Ensure that you do not combine multiple tokens into a single value; output them as a list.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 5: Given the tokenized text, extract contiguous tokens that refer to specific individuals. If there are no references to identifiable people, indicate that no people have been extracted. Provide a rationale for your reasoning process along with the list of extracted names.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 6: In a critical situation where accurate identification of EU officials is essential for compliance with new regulations, extract contiguous tokens from the provided list of string tokens that specifically refer to individuals. Ensure that your output is a list of distinct tokens without combining them into single values. This task is vital for ensuring clear communication in health communications regarding livestock, particularly in the context of sheep and mad cow disease.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 7: In a high-stakes situation where accurate identification of individuals is critical for regulatory compliance and public health communication, extract contiguous tokens referring to specific people from the provided list of string tokens. Ensure that you output each identified individual as separate tokens without combining multiple tokens into a single value. This task is essential for ensuring clarity and accountability in communications pertaining to EU regulations and health matters.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 8: Given a list of tokenized text, identify and extract any contiguous sequences of tokens that refer specifically to individuals. Ensure that the output is a list of tokens representing those names, and do not merge multiple tokens into a single value. Provide reasoning for your extraction process, clearly stating if specific individuals were found or if the tokens did not contain any references to people.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 9: In a high-stakes scenario where accurate identification of EU officials is critical for public health communications regarding livestock diseases, extract contiguous tokens that refer to specific people from the provided list of string tokens. Ensure that the output is a list of tokens, without combining multiple tokens into a single value. 
Provide a clear rationale explaining the reasoning behind the identification of these tokens as referring to specific individuals.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 10: Identify and extract contiguous tokens from the provided list that specifically refer to individuals. Ensure that the output consists of distinct tokens representing the names, without merging them into single values.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 11: In a critical situation where accurate identification of key individuals is essential for effective communication regarding EU regulations and health communications, extract contiguous tokens referring to specific people from the provided list of string tokens. Ensure that the output is a list of tokens without combining them into a single value. This task is crucial for clarity in reporting and decision-making processes.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 12: In a critical situation where accurate identification of key individuals is essential for public health communications regarding EU regulations on livestock, extract contiguous tokens referring to specific people from the provided list of string tokens. Ensure that the output is a list of individual tokens, maintaining their separation to facilitate precise recognition of each person mentioned in the context.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 13: In a high-stakes situation where accurate identification of individuals is critical for regulatory compliance and public health communication, extract contiguous tokens referring to specific people from the provided list of string tokens. Ensure that you output each identified individual as separate tokens without combining multiple tokens into a single value. This task is essential for ensuring clarity and accountability in communications pertaining to EU regulations and health matters.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 14: You are a Named Entity Recognition expert. Your task is to extract contiguous tokens referring to specific people from a list of string tokens. Please ensure that you output a list of tokens without combining them into a single value. Provide a rationale for your extraction, explaining why the identified tokens refer to a specific person.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 15: Given a list of tokenized words, identify and extract contiguous tokens that refer to specific individuals. Provide a rationale explaining the reasoning behind the extraction process, and output a list of the identified tokens without combining them into single values.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 16: You are an AI text analyzer. Your task is to extract contiguous tokens that refer to specific individuals from a list of string tokens. Carefully examine the tokens and output a list of those that represent people. If no tokens refer to individuals, return an empty list. Remember to provide a rationale explaining your extraction process.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 17: In a critical situation where EU regulations regarding livestock health are being discussed, it is essential to accurately identify and extract the names of officials involved in these discussions. 
Given a list of tokenized text, extract contiguous tokens that refer to specific individuals. Ensure that each name is output as separate tokens, as combining them could lead to confusion. This information is vital for understanding the key players in the regulatory landscape and their statements on issues like mad cow disease and sheep health.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: 18: In a high-stakes situation where accurate identification of key individuals is crucial for regulatory compliance and public health communication, extract contiguous tokens referring to specific people from the provided list of string tokens. Ensure that your output is a list of tokens representing individuals, without combining multiple tokens into a single value. This extraction is vital for understanding the roles and actions of officials in EU regulations related to livestock health.\n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: \n", + "\n", + "2024/11/18 21:10:06 INFO dspy.teleprompt.mipro_optimizer_v2: Evaluating the default program...\n", + "\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Average Metric: 34.00 / 40 (85.0%): 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 40/40 [00:10<00:00, 3.69it/s]" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:10:16 INFO dspy.evaluate.evaluate: Average Metric: 34 / 40 (85.0%)\n", + "2024/11/18 21:10:16 INFO dspy.teleprompt.mipro_optimizer_v2: Default program score: 85.0\n", + "\n", + "2024/11/18 21:10:16 INFO dspy.teleprompt.mipro_optimizer_v2: ==> STEP 3: FINDING OPTIMAL PROMPT PARAMETERS <==\n", + "2024/11/18 21:10:16 INFO dspy.teleprompt.mipro_optimizer_v2: We will evaluate the program over a series of trials with different combinations of instructions and few-shot examples to find the optimal combination using Bayesian Optimization.\n", + "\n", + "/Users/corey.zumar/miniconda3/envs/default/lib/python3.10/site-packages/optuna/samplers/_tpe/sampler.py:319: ExperimentalWarning: ``multivariate`` option is an experimental feature. 
The interface can change in the future.\n", + " warnings.warn(\n", + "2024/11/18 21:10:16 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Trial 1 / 25 =====\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Average Metric: 34.00 / 40 (85.0%): 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 40/40 [00:17<00:00, 2.31it/s]" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:10:34 INFO dspy.evaluate.evaluate: Average Metric: 34 / 40 (85.0%)\n", + "2024/11/18 21:10:34 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 85.0 with parameters ['Predictor 0: Instruction 12', 'Predictor 0: Few-Shot Set 7'].\n", + "2024/11/18 21:10:34 INFO dspy.teleprompt.mipro_optimizer_v2: Scores so far: [85.0, 85.0]\n", + "2024/11/18 21:10:34 INFO dspy.teleprompt.mipro_optimizer_v2: Best score so far: 85.0\n", + "2024/11/18 21:10:34 INFO dspy.teleprompt.mipro_optimizer_v2: ========================\n", + "\n", + "\n", + "2024/11/18 21:10:34 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Trial 2 / 25 =====\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Average Metric: 36.00 / 40 (90.0%): 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 40/40 [00:09<00:00, 4.16it/s]" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:10:43 INFO dspy.evaluate.evaluate: Average Metric: 36 / 40 (90.0%)\n", + "2024/11/18 21:10:43 INFO dspy.teleprompt.mipro_optimizer_v2: \u001b[92mBest full score so far!\u001b[0m Score: 90.0\n", + "2024/11/18 21:10:43 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 90.0 with parameters ['Predictor 0: Instruction 10', 'Predictor 0: Few-Shot Set 7'].\n", + "2024/11/18 21:10:43 INFO dspy.teleprompt.mipro_optimizer_v2: Scores so far: [85.0, 85.0, 90.0]\n", + "2024/11/18 21:10:43 INFO dspy.teleprompt.mipro_optimizer_v2: Best score so far: 90.0\n", + "2024/11/18 21:10:43 INFO dspy.teleprompt.mipro_optimizer_v2: ========================\n", + "\n", + "\n", + "2024/11/18 21:10:43 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Trial 3 / 25 =====\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Average Metric: 39.00 / 40 (97.5%): 
100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 40/40 [00:10<00:00, 3.68it/s]" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n...\n", + "...\n", + "...\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:14:37 INFO dspy.evaluate.evaluate: Average Metric: 34 / 40 (85.0%)\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 85.0 with parameters ['Predictor 0: Instruction 2', 'Predictor 0: Few-Shot Set 0'].\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: Scores so far: [85.0, 85.0, 90.0, 97.5, 95.0, 97.5, 82.5, 92.5, 85.0, 77.5, 85.0, 97.5, 97.5, 97.5, 95.0, 95.0, 97.5, 85.0, 90.0, 97.5, 92.5, 95.0, 95.0, 95.0, 85.0]\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: Best score so far: 97.5\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: =========================\n", + "\n", + "\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: ===== Trial 25 / 25 =====\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Average Metric: 39.00 / 40 (97.5%): 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 40/40 [00:00<00:00, 2609.25it/s]" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:14:37 INFO dspy.evaluate.evaluate: Average Metric: 39 / 40 (97.5%)\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: Score: 97.5 with parameters ['Predictor 0: Instruction 7', 'Predictor 0: Few-Shot Set 18'].\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: Scores so far: [85.0, 85.0, 90.0, 97.5, 95.0, 97.5, 82.5, 92.5, 85.0, 77.5, 85.0, 97.5, 97.5, 97.5, 95.0, 95.0, 97.5, 85.0, 90.0, 97.5, 92.5, 95.0, 95.0, 95.0, 85.0, 97.5]\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: Best score so far: 97.5\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: =========================\n", + "\n", + "\n", + "2024/11/18 21:14:37 INFO dspy.teleprompt.mipro_optimizer_v2: Returning best identified program with score 97.5!\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "mipro_optimizer = dspy.MIPROv2(\n", + " metric=extraction_correctness_metric,\n", + " 
auto=\"medium\",\n", + ")\n", + "optimized_people_extractor = mipro_optimizer.compile(\n", + " people_extractor,\n", + " trainset=train_set,\n", + " max_bootstrapped_demos=4,\n", + " requires_permission_to_run=False,\n", + " minibatch=False\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Evaluate Optimized Program\n", + "\n", + "After optimization, we re-evaluate the program on the test set to measure improvements. Comparing the optimized and initial results allows us to:\n", + "- Quantify the benefits of optimization.\n", + "- Validate that the program generalizes well to unseen data.\n", + "\n", + "In this case, we see that accuracy of the program on the test dataset has improved significantly." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Average Metric: 186.00 / 200 (93.0%): 100%|\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588\u2588| 200/200 [00:23<00:00, 8.58it/s]" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "2024/11/18 21:15:00 INFO dspy.evaluate.evaluate: Average Metric: 186 / 200 (93.0%)\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + }, + { + "data": { + "text/html": [ + "
\n", + "\n", + "\n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + " \n", + "
tokensexpected_extracted_peoplerationaleextracted_peopleextraction_correctness_metric
0[SOCCER, -, JAPAN, GET, LUCKY, WIN, ,, CHINA, IN, SURPRISE, DEFEAT...[CHINA]There are no specific people mentioned in the provided tokens. The...[]
1[Nadim, Ladki][Nadim, Ladki]The tokens \"Nadim Ladki\" refer to a specific individual. Both toke...[Nadim, Ladki]\u2714\ufe0f [True]
2[AL-AIN, ,, United, Arab, Emirates, 1996-12-06][]There are no tokens referring to specific people in the provided l...[]\u2714\ufe0f [True]
3[Japan, began, the, defence, of, their, Asian, Cup, title, with, a...[]There are no specific people mentioned in the provided tokens. The...[]\u2714\ufe0f [True]
4[But, China, saw, their, luck, desert, them, in, the, second, matc...[]There are no tokens referring to specific people in the provided l...[]\u2714\ufe0f [True]
..................
195['The', 'Wallabies', 'have', 'their', 'sights', 'set', 'on', 'a', ...[David, Campese]The extracted tokens refer to a specific person mentioned in the t...[David, Campese]\u2714\ufe0f [True]
196['The', 'Wallabies', 'currently', 'have', 'no', 'plans', 'to', 'ma...[]There are no specific individuals mentioned in the provided tokens...[]\u2714\ufe0f [True]
197['Campese', 'will', 'be', 'up', 'against', 'a', 'familiar', 'foe',...[Campese, Rob, Andrew]The tokens include the names \"Campese\" and \"Rob Andrew,\" both of w...[Campese, Rob, Andrew]\u2714\ufe0f [True]
198['\"', 'Campo', 'has', 'a', 'massive', 'following', 'in', 'this', '...[Campo, Andrew]The extracted tokens refer to specific people mentioned in the tex...[Campo, Andrew]\u2714\ufe0f [True]
199['On', 'tour', ',', 'Australia', 'have', 'won', 'all', 'four', 'te...[]There are no specific people mentioned in the provided tokens. The...[]\u2714\ufe0f [True]
\n", + "

200 rows \u00d7 5 columns

\n", + "
" + ], + "text/plain": [ + " tokens \\\n", + "0 [SOCCER, -, JAPAN, GET, LUCKY, WIN, ,, CHINA, IN, SURPRISE, DEFEAT... \n", + "1 [Nadim, Ladki] \n", + "2 [AL-AIN, ,, United, Arab, Emirates, 1996-12-06] \n", + "3 [Japan, began, the, defence, of, their, Asian, Cup, title, with, a... \n", + "4 [But, China, saw, their, luck, desert, them, in, the, second, matc... \n", + ".. ... \n", + "195 ['The', 'Wallabies', 'have', 'their', 'sights', 'set', 'on', 'a', ... \n", + "196 ['The', 'Wallabies', 'currently', 'have', 'no', 'plans', 'to', 'ma... \n", + "197 ['Campese', 'will', 'be', 'up', 'against', 'a', 'familiar', 'foe',... \n", + "198 ['\"', 'Campo', 'has', 'a', 'massive', 'following', 'in', 'this', '... \n", + "199 ['On', 'tour', ',', 'Australia', 'have', 'won', 'all', 'four', 'te... \n", + "\n", + " expected_extracted_people \\\n", + "0 [CHINA] \n", + "1 [Nadim, Ladki] \n", + "2 [] \n", + "3 [] \n", + "4 [] \n", + ".. ... \n", + "195 [David, Campese] \n", + "196 [] \n", + "197 [Campese, Rob, Andrew] \n", + "198 [Campo, Andrew] \n", + "199 [] \n", + "\n", + " rationale \\\n", + "0 There are no specific people mentioned in the provided tokens. The... \n", + "1 The tokens \"Nadim Ladki\" refer to a specific individual. Both toke... \n", + "2 There are no tokens referring to specific people in the provided l... \n", + "3 There are no specific people mentioned in the provided tokens. The... \n", + "4 There are no tokens referring to specific people in the provided l... \n", + ".. ... \n", + "195 The extracted tokens refer to a specific person mentioned in the t... \n", + "196 There are no specific individuals mentioned in the provided tokens... \n", + "197 The tokens include the names \"Campese\" and \"Rob Andrew,\" both of w... \n", + "198 The extracted tokens refer to specific people mentioned in the tex... \n", + "199 There are no specific people mentioned in the provided tokens. The... \n", + "\n", + " extracted_people extraction_correctness_metric \n", + "0 [] \n", + "1 [Nadim, Ladki] \u2714\ufe0f [True] \n", + "2 [] \u2714\ufe0f [True] \n", + "3 [] \u2714\ufe0f [True] \n", + "4 [] \u2714\ufe0f [True] \n", + ".. ... ... \n", + "195 [David, Campese] \u2714\ufe0f [True] \n", + "196 [] \u2714\ufe0f [True] \n", + "197 [Campese, Rob, Andrew] \u2714\ufe0f [True] \n", + "198 [Campo, Andrew] \u2714\ufe0f [True] \n", + "199 [] \u2714\ufe0f [True] \n", + "\n", + "[200 rows x 5 columns]" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "data": { + "text/plain": [ + "93.0" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "evaluate_correctness(optimized_people_extractor, devset=test_set)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Inspect Optimized Program's Prompt\n", + "\n", + "After optimizing the program, we can inspect the history of interactions to see how DSPy has augmented the program's prompt with few-shot examples. This step demonstrates:\n", + "- The structure of the prompt used by the program.\n", + "- How few-shot examples are added to guide the model's behavior.\n", + "\n", + "Use `inspect_history(n=1)` to view the last interaction and analyze the generated prompt." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "\n", + "\n", + "\n", + "\u001b[34m[2024-11-18T21:15:00.584497]\u001b[0m\n", + "\n", + "\u001b[31mSystem message:\u001b[0m\n", + "\n", + "Your input fields are:\n", + "1. `tokens` (list[str]): tokenized text\n", + "\n", + "Your output fields are:\n", + "1. `rationale` (str): ${produce the extracted_people}. We ...\n", + "2. `extracted_people` (list[str]): all tokens referring to specific people extracted from the tokenized text\n", + "\n", + "All interactions will be structured in the following way, with the appropriate values filled in.\n", + "\n", + "[[ ## tokens ## ]]\n", + "{tokens}\n", + "\n", + "[[ ## rationale ## ]]\n", + "{rationale}\n", + "\n", + "[[ ## extracted_people ## ]]\n", + "{extracted_people} # note: the value you produce must be pareseable according to the following JSON schema: {\"type\": \"array\", \"items\": {\"type\": \"string\"}}\n", + "\n", + "[[ ## completed ## ]]\n", + "\n", + "In adhering to this structure, your objective is: \n", + " In a high-stakes situation where accurate identification of individuals is critical for regulatory compliance and public health communication, extract contiguous tokens referring to specific people from the provided list of string tokens. Ensure that you output each identified individual as separate tokens without combining multiple tokens into a single value. This task is essential for ensuring clarity and accountability in communications pertaining to EU regulations and health matters.\n", + "\n", + "\n", + "\u001b[31mUser message:\u001b[0m\n", + "\n", + "[[ ## tokens ## ]]\n", + "[\"He\", \"said\", \"further\", \"scientific\", \"study\", \"was\", \"required\", \"and\", \"if\", \"it\", \"was\", \"found\", \"that\", \"action\", \"was\", \"needed\", \"it\", \"should\", \"be\", \"taken\", \"by\", \"the\", \"European\", \"Union\", \".\"]\n", + "\n", + "Respond with the corresponding output fields, starting with the field `[[ ## rationale ## ]]`, then `[[ ## extracted_people ## ]]` (must be formatted as a valid Python list[str]), and then ending with the marker for `[[ ## completed ## ]]`.\n", + "\n", + "\n", + "\u001b[31mAssistant message:\u001b[0m\n", + "\n", + "[[ ## rationale ## ]]\n", + "There are no specific people mentioned in the provided tokens. The only reference is to the \"European Union,\" which is an organization rather than a specific person. We therefore conclude that there are no extracted people.\n", + "\n", + "[[ ## extracted_people ## ]]\n", + "[]\n", + "\n", + "[[ ## completed ## ]]\n", + "\n", + "\n", + "\u001b[31mUser message:\u001b[0m\n", + "\n", + "[[ ## tokens ## ]]\n", + "[\"BRUSSELS\", \"1996-08-22\"]\n", + "\n", + "Respond with the corresponding output fields, starting with the field `[[ ## rationale ## ]]`, then `[[ ## extracted_people ## ]]` (must be formatted as a valid Python list[str]), and then ending with the marker for `[[ ## completed ## ]]`.\n", + "\n", + "\n", + "\u001b[31mAssistant message:\u001b[0m\n", + "\n", + "[[ ## rationale ## ]]\n", + "There are no tokens referring to specific people in the provided list. 
We only have a location and a date.\n", + "\n", + "[[ ## extracted_people ## ]]\n", + "[]\n", + "\n", + "[[ ## completed ## ]]\n", + "\n", + "\n", + "\u001b[31mUser message:\u001b[0m\n", + "\n", + "[[ ## tokens ## ]]\n", + "[\"Fischler\", \"proposed\", \"EU-wide\", \"measures\", \"after\", \"reports\", \"from\", \"Britain\", \"and\", \"France\", \"that\", \"under\", \"laboratory\", \"conditions\", \"sheep\", \"could\", \"contract\", \"Bovine\", \"Spongiform\", \"Encephalopathy\", \"(\", \"BSE\", \")\", \"--\", \"mad\", \"cow\", \"disease\", \".\"]\n", + "\n", + "Respond with the corresponding output fields, starting with the field `[[ ## rationale ## ]]`, then `[[ ## extracted_people ## ]]` (must be formatted as a valid Python list[str]), and then ending with the marker for `[[ ## completed ## ]]`.\n", + "\n", + "\n", + "\u001b[31mAssistant message:\u001b[0m\n", + "\n", + "[[ ## rationale ## ]]\n", + "The extracted tokens refer to specific people mentioned in the text. In this case, \"Fischler\" is the only token that refers to a specific person. We extract this token as it is contiguous and represents an individual involved in the proposed measures.\n", + "\n", + "[[ ## extracted_people ## ]]\n", + "[\"Fischler\"]\n", + "\n", + "[[ ## completed ## ]]\n", + "\n", + "\n", + "\u001b[31mUser message:\u001b[0m\n", + "\n", + "[[ ## tokens ## ]]\n", + "[\"Germany\", \"'s\", \"representative\", \"to\", \"the\", \"European\", \"Union\", \"'s\", \"veterinary\", \"committee\", \"Werner\", \"Zwingmann\", \"said\", \"on\", \"Wednesday\", \"consumers\", \"should\", \"buy\", \"sheepmeat\", \"from\", \"countries\", \"other\", \"than\", \"Britain\", \"until\", \"the\", \"scientific\", \"advice\", \"was\", \"clearer\", \".\"]\n", + "\n", + "Respond with the corresponding output fields, starting with the field `[[ ## rationale ## ]]`, then `[[ ## extracted_people ## ]]` (must be formatted as a valid Python list[str]), and then ending with the marker for `[[ ## completed ## ]]`.\n", + "\n", + "\n", + "\u001b[31mAssistant message:\u001b[0m\n", + "\n", + "[[ ## rationale ## ]]\n", + "We extracted the name \"Werner Zwingmann\" from the tokenized text as it refers to a specific person mentioned in the context of a statement regarding sheepmeat consumption.\n", + "\n", + "[[ ## extracted_people ## ]]\n", + "[\"Werner\", \"Zwingmann\"]\n", + "\n", + "[[ ## completed ## ]]\n", + "\n", + "\n", + "\u001b[31mUser message:\u001b[0m\n", + "\n", + "[[ ## tokens ## ]]\n", + "[\"LONDON\", \"1996-12-06\"]\n", + "\n", + "Respond with the corresponding output fields, starting with the field `[[ ## rationale ## ]]`, then `[[ ## extracted_people ## ]]` (must be formatted as a valid Python list[str]), and then ending with the marker for `[[ ## completed ## ]]`.\n", + "\n", + "\n", + "\u001b[31mResponse:\u001b[0m\n", + "\n", + "\u001b[32m[[ ## rationale ## ]]\n", + "There are no tokens referring to specific people in the provided list. The tokens only include a location and a date.\n", + "\n", + "[[ ## extracted_people ## ]]\n", + "[]\n", + "\n", + "[[ ## completed ## ]]\u001b[0m\n", + "\n", + "\n", + "\n", + "\n", + "\n" + ] + } + ], + "source": [ + "dspy.inspect_history(n=1)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Keeping an eye on cost\n", + "\n", + "DSPy allows you to track the cost of your programs. The following code demonstrates how to obtain the cost of all LM calls made by the DSPy extractor program so far." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "0.26362742999999983" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "cost = sum([x['cost'] for x in lm.history if x['cost'] is not None]) # cost in USD, as calculated by LiteLLM for certain providers\n", + "cost" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Saving and Loading Optimized Programs\n", + "\n", + "DSPy supports saving and loading programs, enabling you to reuse optimized systems without the need to re-optimize from scratch. This feature is especially useful for deploying your programs in production environments or sharing them with collaborators.\n", + "\n", + "In this step, we'll save the optimized program to a file and demonstrate how to load it back for future use." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "['Marcello', 'Cuttitta']" + ] + }, + "execution_count": 11, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "optimized_people_extractor.save(\"optimized_extractor.json\")\n", + "\n", + "loaded_people_extractor = dspy.ChainOfThought(PeopleExtraction)\n", + "loaded_people_extractor.load(\"optimized_extractor.json\")\n", + "\n", + "loaded_people_extractor(tokens=[\"Italy\", \"recalled\", \"Marcello\", \"Cuttitta\"]).extracted_people" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Conclusion\n", + "\n", + "In this tutorial, we demonstrated how to:\n", + "- Use DSPy to build a modular, interpretable system for entity extraction.\n", + "- Evaluate and optimize the system using DSPy's built-in tools.\n", + "\n", + "By leveraging structured inputs and outputs, we ensured that the system was easy to understand and improve. The optimization process allowed us to quickly improve performance without manually crafting prompts or tweaking parameters.\n", + "\n", + "**Next Steps:**\n", + "- Experiment with extraction of other entity types (e.g., locations or organizations); see the sketch below.\n", + "- Explore DSPy's other built-in modules like `ReAct` for more complex reasoning tasks.\n", + "- Use the system in larger workflows, such as large-scale document processing or summarization."
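For the first of these next steps, the program structure carries over almost unchanged. Here is a sketch for locations, assuming the same CoNLL-2003 tag scheme (codes 5 and 6 mark `B-LOC`/`I-LOC`):

```python
class LocationExtraction(dspy.Signature):
    """
    Extract contiguous tokens referring to specific locations, if any, from a list of string tokens.
    Output a list of tokens. In other words, do not combine multiple tokens into a single value.
    """
    tokens: list[str] = dspy.InputField(desc="tokenized text")
    extracted_locations: list[str] = dspy.OutputField(desc="all tokens referring to locations extracted from the tokenized text")

location_extractor = dspy.ChainOfThought(LocationExtraction)

# Ground-truth labels would come from NER codes 5 (B-LOC) and 6 (I-LOC),
# mirroring how `extract_people_entities` uses codes 1 and 2.
```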
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.12" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index 18b7e9ea1..3f842630b 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -6,55 +6,56 @@ repo_url: https://github.com/stanfordnlp/dspy repo_name: stanfordnlp/dspy edit_uri: blob/main/docs/docs/ -docs_dir: 'docs/' +docs_dir: "docs/" nav: - Home: index.md - Learn DSPy: - - Learning DSPy: learn/index.md - - DSPy Programming: - - Programming Overview: learn/programming/overview.md - - Language Models: learn/programming/language_models.md - - Signatures: learn/programming/signatures.md - - Modules: learn/programming/modules.md - - DSPy Evaluation: - - Evaluation Overview: learn/evaluation/overview.md - - Data Handling: learn/evaluation/data.md - - Metrics: learn/evaluation/metrics.md - - DSPy Optimization: - - Optimization Overview: learn/optimization/overview.md - - Optimizers: learn/optimization/optimizers.md - - Other References: - - Retrieval Clients: - - Azure: deep-dive/retrieval_models_clients/Azure.md - - ChromadbRM: deep-dive/retrieval_models_clients/ChromadbRM.md - - ClarifaiRM: deep-dive/retrieval_models_clients/ClarifaiRM.md - - ColBERTv2: deep-dive/retrieval_models_clients/ColBERTv2.md - - Custom RM Client: deep-dive/retrieval_models_clients/custom-rm-client.md - - DatabricksRM: deep-dive/retrieval_models_clients/DatabricksRM.md - - FaissRM: deep-dive/retrieval_models_clients/FaissRM.md - - LancedbRM: deep-dive/retrieval_models_clients/LancedbRM.md - - MilvusRM: deep-dive/retrieval_models_clients/MilvusRM.md - - MyScaleRM: deep-dive/retrieval_models_clients/MyScaleRM.md - - Neo4jRM: deep-dive/retrieval_models_clients/Neo4jRM.md - - QdrantRM: deep-dive/retrieval_models_clients/QdrantRM.md - - RAGatouilleRM: deep-dive/retrieval_models_clients/RAGatouilleRM.md - - SnowflakeRM: deep-dive/retrieval_models_clients/SnowflakeRM.md - - WatsonDiscovery: deep-dive/retrieval_models_clients/WatsonDiscovery.md - - WeaviateRM: deep-dive/retrieval_models_clients/WeaviateRM.md - - YouRM: deep-dive/retrieval_models_clients/YouRM.md + - Learning DSPy: learn/index.md + - DSPy Programming: + - Programming Overview: learn/programming/overview.md + - Language Models: learn/programming/language_models.md + - Signatures: learn/programming/signatures.md + - Modules: learn/programming/modules.md + - DSPy Evaluation: + - Evaluation Overview: learn/evaluation/overview.md + - Data Handling: learn/evaluation/data.md + - Metrics: learn/evaluation/metrics.md + - DSPy Optimization: + - Optimization Overview: learn/optimization/overview.md + - Optimizers: learn/optimization/optimizers.md + - Other References: + - Retrieval Clients: + - Azure: deep-dive/retrieval_models_clients/Azure.md + - ChromadbRM: deep-dive/retrieval_models_clients/ChromadbRM.md + - ClarifaiRM: deep-dive/retrieval_models_clients/ClarifaiRM.md + - ColBERTv2: deep-dive/retrieval_models_clients/ColBERTv2.md + - Custom RM Client: deep-dive/retrieval_models_clients/custom-rm-client.md + - DatabricksRM: deep-dive/retrieval_models_clients/DatabricksRM.md + - FaissRM: deep-dive/retrieval_models_clients/FaissRM.md + - LancedbRM: 
deep-dive/retrieval_models_clients/LancedbRM.md + - MilvusRM: deep-dive/retrieval_models_clients/MilvusRM.md + - MyScaleRM: deep-dive/retrieval_models_clients/MyScaleRM.md + - Neo4jRM: deep-dive/retrieval_models_clients/Neo4jRM.md + - QdrantRM: deep-dive/retrieval_models_clients/QdrantRM.md + - RAGatouilleRM: deep-dive/retrieval_models_clients/RAGatouilleRM.md + - SnowflakeRM: deep-dive/retrieval_models_clients/SnowflakeRM.md + - WatsonDiscovery: deep-dive/retrieval_models_clients/WatsonDiscovery.md + - WeaviateRM: deep-dive/retrieval_models_clients/WeaviateRM.md + - YouRM: deep-dive/retrieval_models_clients/YouRM.md - Tutorials: - - Tutorials Overview: tutorials/index.md - - Retrieval-Augmented Generation: tutorials/rag/index.ipynb - - Deployment: tutorials/deployment/index.md + - Tutorials Overview: tutorials/index.md + - Retrieval-Augmented Generation: tutorials/rag/index.ipynb + - Entity Extraction: tutorials/entity_extraction/index.ipynb + - Deployment: tutorials/deployment/index.md - Community: - - Community Resources: community/community-resources.md - - Use Cases: community/use-cases.md - - Roadmap: roadmap.md - - Contributing: community/how-to-contribute.md + - Community Resources: community/community-resources.md + - Use Cases: community/use-cases.md + - Roadmap: roadmap.md + - Contributing: community/how-to-contribute.md - FAQ: - - FAQ: faqs.md - - Cheatsheet: cheatsheet.md - + - FAQ: faqs.md + - Cheatsheet: cheatsheet.md + theme: name: material custom_dir: overrides @@ -78,26 +79,23 @@ theme: icon: material/weather-night name: Switch to dark mode primary: white - accent: black - - scheme: slate + accent: black + - scheme: slate toggle: icon: material/weather-sunny - name: Switch to light mode + name: Switch to light mode primary: black accent: lime icon: repo: fontawesome/brands/git-alt - edit: material/pencil + edit: material/pencil view: material/eye logo: static/img/dspy_logo.png favicon: static/img/logo.png - extra_css: - stylesheets/extra.css - - plugins: - social - search @@ -108,14 +106,13 @@ plugins: - redirects: redirect_maps: # Redirect /intro/ to the main page - 'intro/index.md': 'index.md' - 'intro.md': 'index.md' - - 'docs/quick-start/getting-started-01.md': 'tutorials/rag/index.ipynb' - 'docs/quick-start/getting-started-02.md': 'tutorials/rag/index.ipynb' - 'quick-start/getting-started-01.md': 'tutorials/rag/index.ipynb' - 'quick-start/getting-started-02.md': 'tutorials/rag/index.ipynb' + "intro/index.md": "index.md" + "intro.md": "index.md" + "docs/quick-start/getting-started-01.md": "tutorials/rag/index.ipynb" + "docs/quick-start/getting-started-02.md": "tutorials/rag/index.ipynb" + "quick-start/getting-started-01.md": "tutorials/rag/index.ipynb" + "quick-start/getting-started-02.md": "tutorials/rag/index.ipynb" extra: social: