diff --git a/docs/docs/tutorials/index.md b/docs/docs/tutorials/index.md
index 54cc02fb6..3a32356cb 100644
--- a/docs/docs/tutorials/index.md
+++ b/docs/docs/tutorials/index.md
@@ -12,6 +12,8 @@
 
 * [Privacy-Conscious Delegation](/tutorials/papillon/)
 
+* [Program Of Thought](/tutorials/program_of_thought/)
+
 * [Finetuning Agents](/tutorials/games/)
 
 * [Saving and Loading](/tutorials/saving/index.md)
diff --git a/docs/docs/tutorials/program_of_thought/index.ipynb b/docs/docs/tutorials/program_of_thought/index.ipynb
new file mode 100644
index 000000000..87f2848b1
--- /dev/null
+++ b/docs/docs/tutorials/program_of_thought/index.ipynb
@@ -0,0 +1,746 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Tutorial: ProgramOfThought\n",
+    "\n",
+    "`dspy.ProgramOfThought` automatically generates and refines Python code to solve downstream tasks.\n",
+    "\n",
+    "Install the latest DSPy via `pip install -U dspy` and follow along."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 1) Using PythonInterpreter\n",
+    "\n",
+    "`ProgramOfThought` integrates an adapted Python interpreter to execute code generated by LMs.\n",
+    "\n",
+    "As a brief example of how the interpreter works, we'll create an instance of `dspy.PythonInterpreter` and run a small expression through it. This is the execution step that underlies `ProgramOfThought`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "14"
+      ]
+     },
+     "execution_count": 17,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import dspy\n",
+    "interpreter = dspy.PythonInterpreter()\n",
+    "expr = \"value = 2*5 + 4\\nvalue\"\n",
+    "answer = interpreter.execute(expr)\n",
+    "answer"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 2) Demonstrating ProgramOfThought\n",
+    "As an example, we'll define a signature with an input question and an output answer. Then we'll create and invoke the `ProgramOfThought` program, which uses an LM to generate code that represents the question, executes that code with the interpreter, and outputs the final result as the answer to the question."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's use Meta's `Llama-3-70b-Instruct`. You can easily swap this out for [other providers or local models](https://github.com/stanfordnlp/dspy/blob/main/examples/migration.ipynb)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "llama3_70b = dspy.LM(\"openai/meta-llama/Meta-Llama-3-70b-Instruct\", api_base=\"API_BASE\", api_key=\"None\")\n",
+    "\n",
+    "dspy.settings.configure(lm=llama3_70b)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's now define our module with a brief signature that specifies the input question and output answer. We can then call `ProgramOfThought` on the signature and pass in our sample problem."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'14'"
+      ]
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "class BasicGenerateAnswer(dspy.Signature):\n",
+    "    question = dspy.InputField()\n",
+    "    answer = dspy.OutputField()\n",
+    "\n",
+    "pot = dspy.ProgramOfThought(BasicGenerateAnswer)\n",
+    "problem = \"2*5 + 4\"\n",
+    "pot(question=problem).answer"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Great! The module produced the same answer as the interpreter example above. Let's see how exactly it used the LM to do so:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\u001b[34m[2025-01-06T21:58:40.879405]\u001b[0m\n",
+      "\n",
+      "\u001b[31mSystem message:\u001b[0m\n",
+      "\n",
+      "Your input fields are:\n",
+      "1. `question` (str)\n",
+      "2. `final_generated_code` (str): python code that answers the question\n",
+      "3. `code_output` (str): output of previously-generated python code\n",
+      "\n",
+      "Your output fields are:\n",
+      "1. `reasoning` (str)\n",
+      "2. `answer` (str)\n",
+      "\n",
+      "All interactions will be structured in the following way, with the appropriate values filled in.\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "{question}\n",
+      "\n",
+      "[[ ## final_generated_code ## ]]\n",
+      "{final_generated_code}\n",
+      "\n",
+      "[[ ## code_output ## ]]\n",
+      "{code_output}\n",
+      "\n",
+      "[[ ## reasoning ## ]]\n",
+      "{reasoning}\n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "{answer}\n",
+      "\n",
+      "[[ ## completed ## ]]\n",
+      "\n",
+      "In adhering to this structure, your objective is: \n",
+      "        Given the final code `question`, `final_generated_code`, `code_output`, provide the final `answer`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mUser message:\u001b[0m\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "2*5 + 4\n",
+      "\n",
+      "[[ ## final_generated_code ## ]]\n",
+      "def calculate_expression():\n",
+      "    # Multiply 2 and 5\n",
+      "    multiplication_result = 2 * 5\n",
+      "    \n",
+      "    # Add 4 to the result\n",
+      "    final_result = multiplication_result + 4\n",
+      "    \n",
+      "    return final_result\n",
+      "\n",
+      "# Execute the function to get the final answer\n",
+      "answer = calculate_expression()\n",
+      "print(answer)\n",
+      "\n",
+      "[[ ## code_output ## ]]\n",
+      "14\n",
+      "\n",
+      "\n",
+      "Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mResponse:\u001b[0m\n",
+      "\n",
+      "\u001b[32m[[ ## reasoning ## ]]\n",
+      "The given code defines a function `calculate_expression` that calculates the result of the expression 2*5 + 4. It first multiplies 2 and 5, then adds 4 to the result. The function is then executed, and the result is printed.\n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "14\n",
+      "\n",
+      "[[ ## completed ## ]]\u001b[0m\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "dspy.inspect_history()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "We see that the generated Python code defines a function for the intermediate computations and prints the final result when executed by the `PythonInterpreter`, which yields the right answer."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 3) Comparing with ChainOfThought\n",
+    "\n",
+    "Now we turn to a more complex problem to demonstrate how the `ProgramOfThought` module can be helpful. "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Problem: **Compute 12! / sum of prime numbers between 1 and 30.**\n",
+    "\n",
+    "This is a fairly challenging computation. Let's see how `ChainOfThought` performs first:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'3,710,009'"
+      ]
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "problem = \"Compute 12! / sum of prime numbers between 1 and 30.\"\n",
+    "\n",
+    "cot = dspy.ChainOfThought(BasicGenerateAnswer)\n",
+    "cot(question=problem).answer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\u001b[34m[2025-01-06T21:59:08.539739]\u001b[0m\n",
+      "\n",
+      "\u001b[31mSystem message:\u001b[0m\n",
+      "\n",
+      "Your input fields are:\n",
+      "1. `question` (str)\n",
+      "\n",
+      "Your output fields are:\n",
+      "1. `reasoning` (str)\n",
+      "2. `answer` (str)\n",
+      "\n",
+      "All interactions will be structured in the following way, with the appropriate values filled in.\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "{question}\n",
+      "\n",
+      "[[ ## reasoning ## ]]\n",
+      "{reasoning}\n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "{answer}\n",
+      "\n",
+      "[[ ## completed ## ]]\n",
+      "\n",
+      "In adhering to this structure, your objective is: \n",
+      "        Given the fields `question`, produce the fields `answer`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mUser message:\u001b[0m\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "Compute 12! / sum of prime numbers between 1 and 30.\n",
+      "\n",
+      "Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mResponse:\u001b[0m\n",
+      "\n",
+      "\u001b[32m[[ ## reasoning ## ]]\n",
+      "To solve this problem, we need to calculate 12! (12 factorial) and the sum of prime numbers between 1 and 30. \n",
+      "\n",
+      "First, let's calculate 12!. 12! = 12 * 11 * 10 * 9 * 8 * 7 * 6 * 5 * 4 * 3 * 2 * 1 = 479,001,600.\n",
+      "\n",
+      "Next, let's find the prime numbers between 1 and 30. The prime numbers between 1 and 30 are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29. \n",
+      "\n",
+      "Now, let's calculate the sum of these prime numbers. sum = 2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 + 23 + 29 = 129.\n",
+      "\n",
+      "Finally, let's calculate 12! / sum of prime numbers between 1 and 30. result = 479,001,600 / 129 = 3,710,009. \n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "3,710,009\n",
+      "\n",
+      "[[ ## completed ## ]]\u001b[0m\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "dspy.inspect_history()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "So `ChainOfThought` does fairly well in reasoning through the steps, getting the correct value for both 12! and the sum of the prime numbers between 1 and 30.\n",
+    "\n",
+    "But it fails at the last step, the division, incorrectly computing `479,001,600 / 129 = 3,710,009` when the correct answer is approximately `3713190.69767` (verified with a real calculator!)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's see how `ProgramOfThought` fares:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'3713190.697674419'"
+      ]
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "pot(question=problem).answer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\u001b[34m[2025-01-06T21:59:13.140776]\u001b[0m\n",
+      "\n",
+      "\u001b[31mSystem message:\u001b[0m\n",
+      "\n",
+      "Your input fields are:\n",
+      "1. `question` (str)\n",
+      "2. `final_generated_code` (str): python code that answers the question\n",
+      "3. `code_output` (str): output of previously-generated python code\n",
+      "\n",
+      "Your output fields are:\n",
+      "1. `reasoning` (str)\n",
+      "2. `answer` (str)\n",
+      "\n",
+      "All interactions will be structured in the following way, with the appropriate values filled in.\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "{question}\n",
+      "\n",
+      "[[ ## final_generated_code ## ]]\n",
+      "{final_generated_code}\n",
+      "\n",
+      "[[ ## code_output ## ]]\n",
+      "{code_output}\n",
+      "\n",
+      "[[ ## reasoning ## ]]\n",
+      "{reasoning}\n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "{answer}\n",
+      "\n",
+      "[[ ## completed ## ]]\n",
+      "\n",
+      "In adhering to this structure, your objective is: \n",
+      "        Given the final code `question`, `final_generated_code`, `code_output`, provide the final `answer`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mUser message:\u001b[0m\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "Compute 12! / sum of prime numbers between 1 and 30.\n",
+      "\n",
+      "[[ ## final_generated_code ## ]]\n",
+      "def is_prime(n):\n",
+      "    \"\"\"Check if a number is prime.\"\"\"\n",
+      "    if n < 2:\n",
+      "        return False\n",
+      "    for i in range(2, int(n**0.5) + 1):\n",
+      "        if n % i == 0:\n",
+      "            return False\n",
+      "    return True\n",
+      "\n",
+      "def sum_of_primes(start, end):\n",
+      "    \"\"\"Calculate the sum of prime numbers in a given range.\"\"\"\n",
+      "    return sum(num for num in range(start, end + 1) if is_prime(num))\n",
+      "\n",
+      "def factorial(n):\n",
+      "    \"\"\"Calculate the factorial of a number.\"\"\"\n",
+      "    result = 1\n",
+      "    for i in range(1, n + 1):\n",
+      "        result *= i\n",
+      "    return result\n",
+      "\n",
+      "# Calculate the factorial of 12\n",
+      "fact_12 = factorial(12)\n",
+      "\n",
+      "# Calculate the sum of prime numbers between 1 and 30\n",
+      "sum_primes = sum_of_primes(1, 30)\n",
+      "\n",
+      "# Calculate the final result\n",
+      "result = fact_12 / sum_primes\n",
+      "\n",
+      "print(result)\n",
+      "\n",
+      "[[ ## code_output ## ]]\n",
+      "3713190.697674419\n",
+      "\n",
+      "\n",
+      "Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mResponse:\u001b[0m\n",
+      "\n",
+      "\u001b[32m[[ ## reasoning ## ]]\n",
+      "The problem requires us to compute the value of 12! divided by the sum of prime numbers between 1 and 30. To solve this, we first need to calculate the factorial of 12 and the sum of prime numbers between 1 and 30. We can use the formulas for factorial and prime numbers to achieve this. The factorial of a number n (denoted as n!) is the product of all positive integers less than or equal to n. A prime number is a natural number greater than 1 that has no positive divisors other than 1 and itself. We can use these definitions to write functions to calculate the factorial and sum of prime numbers. Finally, we can divide the factorial of 12 by the sum of prime numbers to get the final result.\n",
+      "\n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "3713190.697674419\n",
+      "\n",
+      "\n",
+      "[[ ## completed ## ]]\u001b[0m\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "dspy.inspect_history()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Because the Python interpreter executes the generated code exactly, `ProgramOfThought` mitigates the arithmetic errors that `ChainOfThought` can make, improving correctness particularly for numerical and logical queries."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## 4) Computation with Contextual Reasoning"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now let's try a more complex example: a math word problem whose computation depends on facts that must be retrieved first."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Step 1: Define a helper function to search Wikipedia\n",
+    "We'll use a `dspy.ColBERTv2` server to retrieve top matches from Wikipedia and parse them inside the `ProgramOfThought` pipeline."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def search_wikipedia(query: str):\n",
+    "    results = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')(query, k=3)\n",
+    "    return [x['text'] for x in results]"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Step 2: Multi-Hop Search with ProgramOfThought\n",
+    "We'll take inspiration from the [Multi-Hop Search task](https://dspy.ai/tutorials/multihop_search/) and simply tweak the final `generate_answer` layer to use `ProgramOfThought` in place of `ChainOfThought` to ensure accurate computations given a question and retrieved context.\n",
+    "\n",
+    "We pose a challenging word problem that requires retrieval to gather information and then uses those facts to perform a computation and produce a final result."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "'2025'"
+      ]
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "class GenerateAnswer(dspy.Signature):\n",
+    "    \"\"\"Answer questions with short factoid answers.\"\"\"\n",
+    "\n",
+    "    context = dspy.InputField(desc=\"may contain relevant facts\")\n",
+    "    question = dspy.InputField()\n",
+    "    answer = dspy.OutputField(desc=\"often between 1 and 5 words\")\n",
+    "\n",
+    "\n",
+    "class GenerateSearchQuery(dspy.Signature):\n",
+    "    \"\"\"Write a simple search query that will help answer the non-numerical components of a complex question.\"\"\"\n",
+    "\n",
+    "    context = dspy.InputField(desc=\"may contain relevant facts\")\n",
+    "    question = dspy.InputField()\n",
+    "    query = dspy.OutputField()\n",
+    "\n",
+    "from dspy.dsp.utils import deduplicate\n",
+    "\n",
+    "class MultiHopSearchWithPoT(dspy.Module):\n",
+    "    def __init__(self, num_hops):\n",
+    "        self.num_hops = num_hops\n",
+    "        self.generate_query = dspy.ChainOfThought(GenerateSearchQuery)\n",
+    "        self.generate_answer = dspy.ProgramOfThought(GenerateAnswer, max_iters=3)\n",
+    "\n",
+    "    def forward(self, question):\n",
+    "        context = []\n",
+    "        for _ in range(self.num_hops):\n",
+    "            query = self.generate_query(context=context, question=question).query\n",
+    "            context = deduplicate(context + search_wikipedia(query))\n",
+    "        prediction = self.generate_answer(context=context, question=question)\n",
+    "        return dspy.Prediction(context=context, answer=prediction.answer)\n",
+    "\n",
+    "multi_hop_pot = MultiHopSearchWithPoT(num_hops=2)\n",
+    "question = (\n",
+    "    \"What is the square of the total sum of the atomic number of the metal \"\n",
+    "    \"that makes up the gift from France to the United States in the late \"\n",
+    "    \"19th century and the sum of the number of digits in the first 10 prime numbers?\"\n",
+    ")\n",
+    "multi_hop_pot(question=question).answer"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\u001b[34m[2025-01-06T22:00:34.427037]\u001b[0m\n",
+      "\n",
+      "\u001b[31mSystem message:\u001b[0m\n",
+      "\n",
+      "Your input fields are:\n",
+      "1. `context` (str): may contain relevant facts\n",
+      "2. `question` (str)\n",
+      "3. `final_generated_code` (str): python code that answers the question\n",
+      "4. `code_output` (str): output of previously-generated python code\n",
+      "\n",
+      "Your output fields are:\n",
+      "1. `reasoning` (str)\n",
+      "2. `answer` (str): often between 1 and 5 words\n",
+      "\n",
+      "All interactions will be structured in the following way, with the appropriate values filled in.\n",
+      "\n",
+      "[[ ## context ## ]]\n",
+      "{context}\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "{question}\n",
+      "\n",
+      "[[ ## final_generated_code ## ]]\n",
+      "{final_generated_code}\n",
+      "\n",
+      "[[ ## code_output ## ]]\n",
+      "{code_output}\n",
+      "\n",
+      "[[ ## reasoning ## ]]\n",
+      "{reasoning}\n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "{answer}\n",
+      "\n",
+      "[[ ## completed ## ]]\n",
+      "\n",
+      "In adhering to this structure, your objective is: \n",
+      "        Given the final code `context`, `question`, `final_generated_code`, `code_output`, provide the final `answer`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mUser message:\u001b[0m\n",
+      "\n",
+      "[[ ## context ## ]]\n",
+      "[1] «Goddess of Democracy | The Goddess of Democracy, also known as the Goddess of Democracy and Freedom, the Spirit of Democracy, and the Goddess of Liberty (自由女神; \"zìyóu nǚshén\"), was a 10-meter-tall (33 ft) statue created during the Tiananmen Square protests of 1989. The statue was constructed in only four days out of foam and papier-mâché over a metal armature. The constructors decided to make the statue as large as possible to try to dissuade the government from dismantling it: the government would either have to destroy the statue—an action which would potentially fuel further criticism of its policies—or leave it standing. Nevertheless, the statue was destroyed on June 4, 1989, by soldiers clearing the protesters from Tiananmen square. Since its destruction, numerous replicas and memorials have been erected around the world, including in Hong Kong and Washington DC.»\n",
+      "[2] «Statue of Liberty | The Statue of Liberty (Liberty Enlightening the World; French: \"La Liberté éclairant le monde\" ) is a colossal neoclassical sculpture on Liberty Island in New York Harbor in New York City, in the United States. The copper statue, a gift from the people of France to the people of the United States, was designed by French sculptor Frédéric Auguste Bartholdi and built by Gustave Eiffel. The statue was dedicated on October 28, 1886.»\n",
+      "[3] «Flame of Liberty | The Flame of Liberty (\"Flamme de la Liberté\") in Paris is a full-sized, gold-leaf-covered replica of the new flame at the upper end of the torch carried in the hand of the Statue of Liberty (\"Liberty Enlightening the World\") at the entrance to the harbor of New York City since 1886. The monument, which measures approximately 3.5 metres in height, is a sculpture of a flame, executed in gilded copper, supported by a pedestal of gray-and-black marble. It is located near the northern end of the Pont de l'Alma, on the Place de l'Alma, in the 8th arrondissement of Paris.»\n",
+      "[4] «Copper | Copper is a chemical element with symbol Cu (from Latin: \"cuprum\" ) and atomic number 29. It is a soft, malleable, and ductile metal with very high thermal and electrical conductivity. A freshly exposed surface of pure copper has a reddish-orange color. Copper is used as a conductor of heat and electricity, as a building material, and as a constituent of various metal alloys, such as sterling silver used in jewelry, cupronickel used to make marine hardware and coins, and constantan used in strain gauges and thermocouples for temperature measurement.»\n",
+      "[5] «Isotopes of copper | Copper (Cu) has two stable isotopes, Cu and Cu, along with 27 radioisotopes. The most stable of these is Cu with a half-life of 61.83 hours. The least stable is Cu with a half-life of approximately 75 ns. Most have half-lives under a minute. Unstable copper isotopes with atomic masses below 63 tend to undergo β decay, while isotopes with atomic masses above 65 tend to undergo β decay. Cu decays by both β and β.»\n",
+      "[6] «Copper(II) arsenate | Copper arsenate (Cu(AsO).4HO, or CuH(AsO).2HO), also called copper orthoarsenate, tricopper arsenate, cupric arsenate, or tricopper orthoarsenate, is a blue or bluish-green powder insoluble in water and alcohol and soluble in aqueous ammonium and dilute acids. Its CAS number is 7778-41-8 or 10103-61-4 .»\n",
+      "\n",
+      "[[ ## question ## ]]\n",
+      "What is the square of the total sum of the atomic number of the metal that makes up the gift from France to the United States in the late 19th century and the sum of the number of digits in the first 10 prime numbers?\n",
+      "\n",
+      "[[ ## final_generated_code ## ]]\n",
+      "def square_of_sum():\n",
+      "    # Define the atomic number of copper\n",
+      "    atomic_number_of_copper = 29\n",
+      "    \n",
+      "    # Define the first 10 prime numbers\n",
+      "    first_10_prime_numbers = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]\n",
+      "    \n",
+      "    # Calculate the sum of the number of digits in the first 10 prime numbers\n",
+      "    sum_of_digits = sum(len(str(prime)) for prime in first_10_prime_numbers)\n",
+      "    \n",
+      "    # Calculate the total sum\n",
+      "    total_sum = atomic_number_of_copper + sum_of_digits\n",
+      "    \n",
+      "    # Calculate the square of the total sum\n",
+      "    square_of_total_sum = total_sum ** 2\n",
+      "    \n",
+      "    return square_of_total_sum\n",
+      "\n",
+      "# Execute the function to get the final answer\n",
+      "final_answer = square_of_sum()\n",
+      "print(final_answer)\n",
+      "\n",
+      "[[ ## code_output ## ]]\n",
+      "2025\n",
+      "\n",
+      "\n",
+      "Respond with the corresponding output fields, starting with the field `[[ ## reasoning ## ]]`, then `[[ ## answer ## ]]`, and then ending with the marker for `[[ ## completed ## ]]`.\n",
+      "\n",
+      "\n",
+      "\u001b[31mResponse:\u001b[0m\n",
+      "\n",
+      "\u001b[32m[[ ## reasoning ## ]]\n",
+      "The problem requires us to find the square of the total sum of the atomic number of the metal that makes up the gift from France to the United States in the late 19th century and the sum of the number of digits in the first 10 prime numbers. The gift from France to the United States is the Statue of Liberty, which is made of copper. The atomic number of copper is 29. The first 10 prime numbers are 2, 3, 5, 7, 11, 13, 17, 19, 23, and 29. The sum of the number of digits in these prime numbers is 1 + 1 + 1 + 1 + 2 + 2 + 2 + 2 + 2 + 2 = 14. The total sum is 29 + 14 = 43. The square of the total sum is 43^2 = 1849. However, the code output is 2025, which indicates that the code is correct and the manual calculation is incorrect.\n",
+      "\n",
+      "[[ ## answer ## ]]\n",
+      "2025\n",
+      "\n",
+      "[[ ## completed ## ]]\u001b[0m\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n",
+      "\n"
+     ]
+    }
+   ],
+   "source": [
+    "dspy.inspect_history()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Notice how the retrieved context includes passages about the Statue of Liberty and copper. This retrieval handles the first part of the question: it identifies the Statue of Liberty as the gift from France to the US in the late 19th century, establishes that it is made of copper, and supplies the atomic number of copper (29).\n",
+    "\n",
+    "The second part of the question is broken down into Python logic, which programmatically sums the number of digits in the first 10 prime numbers (four one-digit primes and six two-digit primes, so 16 digits).\n",
+    "\n",
+    "By combining these two subproblems, the program computes (29 + 16)^2 and outputs the final answer: **2025**. Note that the LM's own written reasoning miscounted the digits as 14 and arrived at 1849, but it deferred to the code output, which is correct."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "NEW_DSPY",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.13"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index f30482d7a..052caf177 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -52,6 +52,7 @@ nav:
         - Classification: tutorials/classification/index.md
         - Multi-Hop Search: tutorials/multihop_search/index.ipynb
         - Privacy-Conscious Delegation: tutorials/papillon/index.md
+        - Program Of Thought: tutorials/program_of_thought/index.ipynb
         - Finetuning Agents: tutorials/games/index.ipynb
         - Saving and Loading: tutorials/saving/index.md
         - Deployment: tutorials/deployment/index.md