From bb6ba03fd31bfff73aaec32e68308e1d10fb1a73 Mon Sep 17 00:00:00 2001 From: AnnePicus Date: Thu, 9 Jan 2025 19:51:59 +0200 Subject: [PATCH] DQI max-XORSAT English suggestions --- algorithms/dqi/dqi_max_xorsat.ipynb | 123 ++++++++++++++-------------- 1 file changed, 61 insertions(+), 62 deletions(-) diff --git a/algorithms/dqi/dqi_max_xorsat.ipynb b/algorithms/dqi/dqi_max_xorsat.ipynb index 4bfd9808c..0ffdd4797 100644 --- a/algorithms/dqi/dqi_max_xorsat.ipynb +++ b/algorithms/dqi/dqi_max_xorsat.ipynb @@ -13,70 +13,70 @@ "id": "3d22cc71-740c-459f-a9ee-3a020661a6e1", "metadata": {}, "source": [ - "## Introduction\n", + "This notebook relates to the paper \"Optimization by Decoded Quantum Interferometry\" (DQI) [[1](#DQI)], which introduces a quantum algorithm for combinatorial optimization problems.\n", "\n", - "The following demonstration will follow the paper \"Optimization by Decoded Quantum Interferometry\" (DQI) [[1](#DQI)], which introduces a quantum algorithm for combinatorial optimization problems.\n", + "The algorithm focuses on finding approximate solutions to the *max-LINSAT* problem, and takes advantage of the sparse Fourier spectrum of certain optimization functions.\n", "\n", - "The algorithm is focused on finding approximate solutions to the *max-LINSAT* problem, and takes advantage of the sparse Fourier spectrum of certain optimization functions.\n", - "\n", - "### max-LINSAT problem\n", + "## max-LINSAT Problem\n", "* **Input:** A matrix $B \\in \\mathbb{F}^{m \\times n}$ and $m$ functions $f_i : \\mathbb{F} \\rightarrow \\{+1, -1\\}$ for $i = 1, \\cdots, m $, where $\\mathbb{F}$ is a finite field.\n", "\n", " Define the objective function $f : \\mathbb{F}^n \\rightarrow \\mathbb{Z}$ to be $f(x) = \\sum_{i=1}^m f_i \\left( \\sum_{j=1}^n B_{ij} x_j \\right)$. 
\n", "\n", "* **Output:** a vector $x \\in \\mathbb{F}^n$ that best maximizes $f$.\n", "\n", - "The paper shows that for the problem of *Optimal Polynomial Intersection (OPI)*, a special case of the the *max-LINSAT*, the algorithm can reach a better approximation ratio than any known polynomial time classical algoritm.\n", + "The paper shows that for the problem of *Optimal Polynomial Intersection (OPI)*—a special case of the the *max-LINSAT*—the algorithm can reach a better approximation ratio than any known polynomial time classical algorithm.\n", "\n", - "We will demonstrate the algorithm in the setting of *max-XORSAT*, which is another special case of *max-LINSAT*, but is different from the *OPI* problem. Although in the setting of *max-XORSAT* a quantum advantage haven't been showed in the paper, it will be simpler for demonstration.\n", + "We demonstrate the algorithm in the setting of *max-XORSAT*, which is another special case of *max-LINSAT*, and is different from the *OPI* problem. Even though the paper does not show a quantum advantage with *max-XORSAT*, it is simpler to demonstrate.\n", "\n", - "### max-XORSAT problem\n", + "## max-XORSAT Problem\n", "\n", "* **Input:** A matrix $B \\in \\mathbb{F}_2^{m \\times n}$ and a vector $v \\in \\mathbb{F}_2^m$ with $m > n$.\n", "\n", - " Define the objective function $f : \\mathbb{F}_2^n \\rightarrow \\mathbb{Z}$ to be $f(x) = \\sum_{i=1}^m (-1)^{v_i + b_i \\cdot x} = \\sum_{i=1}^m f_i(x)$ (with $b_i$ the columns of $B$), which represents the number of staisfied constraints minus the number of unsatisfied constraints for the equation $Bx=v$. \n", + " Define the objective function $f : \\mathbb{F}_2^n \\rightarrow \\mathbb{Z}$ as $f(x) = \\sum_{i=1}^m (-1)^{v_i + b_i \\cdot x} = \\sum_{i=1}^m f_i(x)$ (with $b_i$ the columns of $B$), which represents the number of satisfied constraints minus the number of unsatisfied constraints for the equation $Bx=v$. 
\n", "\n", "* **Output:** a vector $x \\in \\mathbb{F}_2^n$ that best maximizes $f$.\n", "\n", "\n", - "The *max-XORSAT* problem is NP-hard. As an example, the *Max-Cut* problem is a special case of *max-XORSAT* where the number of 1s in each row is exactly 2. The DQI algorithm is focused on finding approximate solutions to the problem. " + "The *max-XORSAT* problem is NP-hard. As an example, the *Max-Cut* problem is a special case of *max-XORSAT* where the number of ones in each row is exactly two. The DQI algorithm focuses on finding approximate solutions to the problem. " ] }, { "cell_type": "markdown", - "id": "444ee983-d082-4283-bc76-067c3d4f1ae3", + "id": "de7db3db-7dcd-4cdc-827f-a2440a7dab15", "metadata": {}, "source": [ - "## Algorithm description\n", - "The strategy is to prepare the following state:\n", + "## Algorithm Description\n", + "The strategy is to prepare this state:\n", "$$\n", "|P(f)\\rangle = \\sum_{x\\in\\mathbb{F}_2^n}P(f(x))|x\\rangle\n", "$$\n", "\n", - "Where $P$ is a normalized polynomial. Choosing a good polynomial can bias the sampling of this state towards high $f$ values. The higher the degree $l$ of the polynomial, the better approximation ratio of the optimum we can get. The Hadamard spectrum of $|P(f)\\rangle$ is:\n", + "where $P$ is a normalized polynomial. Choosing a good polynomial can bias the sampling of the state towards high $f$ values. The higher the degree $l$ of the polynomial, the better approximation ratio of the optimum we can get. The Hadamard spectrum of $|P(f)\\rangle$ is\n", "$$\n", "\\sum_{k = 0}^{l} \\frac{w_k}{\\sqrt{\\binom{m}{k}}}\n", "\\sum_{\\substack{y \\in \\mathbb{F}_2^m \\\\ |y| = k}} (-1)^{v \\cdot y} |B^T y\\rangle\n", "$$\n", - "where $w_k$ are normalized weights that can be calculated from the coefficients of $P$. So in order to prepare $|P(f)\\rangle$ we will prepare prepare its hadamrd transform, then apply a Hadamard transform over it. 
It will take the following stages:\n", + "where $w_k$ are normalized weights that can be calculated from the coefficients of $P$. So, to prepare $|P(f)\\rangle$, we prepare its Hadamard transform, then apply the Hadamard transform over it. \n", + "\n", + "Stages:\n", "\n", - "1. Prepare $\\sum_{k=0}^l w_k|k\\rangle$\n", + "1. Prepare $\\sum_{k=0}^l w_k|k\\rangle$.\n", "\n", - "2. Translate the binary encoded $|k\\rangle$ to a unary encoded state $|k\\rangle_{unary} = |\\underbrace{1 \\cdots 1}_{k} \\underbrace{0 \\cdots 0}_{n - k} \\rangle$, resulting with the state $\\sum_{k=0}^l w_k|k\\rangle_{unary}$\n", + "2. Translate the binary encoded $|k\\rangle$ to a unary encoded state $|k\\rangle_{unary} = |\\underbrace{1 \\cdots 1}_{k} \\underbrace{0 \\cdots 0}_{n - k} \\rangle$, resulting in the state $\\sum_{k=0}^l w_k|k\\rangle_{unary}$.\n", "\n", - "3. Translate each $|k\\rangle_{unary}$ to a Dicke-State [[2](#Dicke)], resulting with the state $\\sum_{k = 0}^{l} \\frac{w_k}{\\sqrt{\\binom{m}{k}}}\n", - "\\sum_{\\substack{y \\in \\mathbb{F}_2^m \\\\ |y| = k}} |y\\rangle_m$\n", + "3. Translate each $|k\\rangle_{unary}$ to a Dicke state [[2](#Dicke)], resulting in the state $\\sum_{k = 0}^{l} \\frac{w_k}{\\sqrt{\\binom{m}{k}}}\n", + "\\sum_{\\substack{y \\in \\mathbb{F}_2^m \\\\ |y| = k}} |y\\rangle_m$.\n", "\n", - "4. For each $|y\\rangle_m$ calculate $(-1)^{v \\cdot y} |y\\rangle_m |B^T y\\rangle_n$, getting $\\sum_{k = 0}^{l} \\frac{w_k}{\\sqrt{\\binom{m}{k}}}\n", - "\\sum_{\\substack{y \\in \\mathbb{F}_2^m \\\\ |y| = k}} (-1)^{v \\cdot y} |y\\rangle_m |B^T y\\rangle_n$\n", + "4. For each $|y\\rangle_m$, calculate $(-1)^{v \\cdot y} |y\\rangle_m |B^T y\\rangle_n$, getting $\\sum_{k = 0}^{l} \\frac{w_k}{\\sqrt{\\binom{m}{k}}}\n", + "\\sum_{\\substack{y \\in \\mathbb{F}_2^m \\\\ |y| = k}} (-1)^{v \\cdot y} |y\\rangle_m |B^T y\\rangle_n$.\n", "\n", "5. Uncompute $|y\\rangle_m$ by decoding $|B^T y\\rangle_n$.\n", "\n", - "6. 
Apply Hadamard transform to get the desired $|P(f)\rangle$\n", + "6. Apply the Hadamard transform to get the desired $|P(f)\rangle$.\n", "\n", "\n", "\n", - "Step 5 is the heart of the algorithm. The decoding of $|B^T y\rangle_n$ is in general an ill-defined problem, but when the hamming weight of $y$ is known to be limited by some integer l (the degree of $P$) , it might be feasible and even efficient, depending on the structure of the matrix $B$. The problem is equivalent to decoding error from syndrome [[3](#SYND)], when $B^T$ is the parity-check matrix.\n", + "Step 5 is the heart of the algorithm. The decoding of $|B^T y\rangle_n$ is, in general, an ill-defined problem, but when the Hamming weight of $y$ is known to be limited by some integer $l$ (the degree of $P$), it might be feasible and even efficient, depending on the structure of the matrix $B$. The problem is equivalent to decoding an error from its syndrome [[3](#SYND)], where $B^T$ is the parity-check matrix.\n", "\n", "Figure 1 shows a layout of the resulting quantum program. Executing the quantum program guarantees that we sample `x` with high $f$ values with high probability (see the last plot in this notebook)." ] }, @@ -92,7 +92,7 @@ "metadata": {}, "source": [ "![image.png](attachment:9ee2175d-b027-4cc6-85f0-cf67e7cf0e33.png)\n", - "
Figure 1. The Full DQI circuit for a *MaxCut* problem. The `x` solutions are sampled from the `target` variable after the last Hadamard-Transform.
\n", + "
Figure 1. The full DQI circuit for a *Max-Cut* problem. The `x` solutions are sampled from the `target` variable after the last Hadamard transform.
\n", "" ] }, @@ -101,7 +101,7 @@ "id": "39b24e85-9429-4f25-95e8-56920d9679b4", "metadata": {}, "source": [ - "## Defining The algorithm building-blocks" + "## Defining the Algorithm Building Blocks" ] }, { @@ -109,7 +109,7 @@ "id": "cd4fa7a6-54a3-4329-ac2b-a537441f2a91", "metadata": {}, "source": [ - "Next we define the needed building-blocks for all algorithm stages. Step 1 is omitted as we use the built-in `prepare_amplitudes` function." + "Next, we define the needed building-blocks for all algorithm stages. Step 1 is omitted as we use the built-in `prepare_amplitudes` function." ] }, { @@ -125,14 +125,14 @@ "id": "86e0c7d6-c7c9-483a-99c7-22b71e4ae8c9", "metadata": {}, "source": [ - "We use 3 different encodings here:\n", - "- **Binary Encoding**: Represents a number using binary bits, where each qubit corresponds to a binary place value. For example, the number 3 on 4 qubits is: $|1100\\rangle$.\n", - "- **One-hot Encoding**: Represents a number by activating a single qubit, with its position indicating the value. For example, the number 3 on 4 qubits is: $|0001\\rangle$.\n", - "- **Unary Encoding**: Represents a number by setting the first $k$ qubits to 1 $k$ is the number, and the rest to 0. For example, the number 3 on 4 qubits is $|1110\\rangle$.\n", + "We use three different encodings:\n", + "- **Binary encoding**: Represents a number using binary bits, where each qubit corresponds to a binary place value. For example, the number 3 on 4 qubits is $|1100\\rangle$.\n", + "- **One-hot encoding**: Represents a number by activating a single qubit, with its position indicating the value. For example, the number 3 on 4 qubits is $|0001\\rangle$.\n", + "- **Unary encoding**: Represents a number by setting the first $k$ qubits to 1 $k$ is the number, and the rest to 0. 
For example, the number 3 on 4 qubits is $|1110\\rangle$.\n", "\n", - "Specifically we will translate a binary (unsigned `QNum`) to one-hot encoding, and show how to convert the one-hot encoding to a unary encoding.\n", + "Specifically, we translate a binary (unsigned `QNum`) to one-hot encoding, and show how to convert the one-hot encoding to a unary encoding.\n", "\n", - "The conversions will be done inplace, meaning that the same binary encoded quantum variable will be extended to represent the target encoding.\n", + "The conversions are done in place, meaning that the same binary encoded quantum variable is extended to represent the target encoding.\n", "The logic is based on [this post](https://quantumcomputing.stackexchange.com/questions/5526/garbage-free-reversible-binary-to-unary-decoder-construction)." ] }, @@ -214,7 +214,7 @@ "id": "aa77eaf3-2dbb-4a18-8e79-7e247a8f3b27", "metadata": {}, "source": [ - "Now test the function on the conversion of the number 8 from binary to unary:" + "Now, we test the function on the conversion of the number 8 from binary to unary:" ] }, { @@ -263,12 +263,12 @@ "id": "f85c90aa-001c-4869-989c-26f281cb2b32", "metadata": {}, "source": [ - "Transform a unary input quantum variable to a Dicke state, such that:\n", + "We transform a unary input quantum variable to a Dicke state, such that\n", "$$\n", "U|\\underbrace{1 \\cdots 1}_{k} \\underbrace{0 \\cdots 0}_{n - k} \\rangle = \\sum_{k = 0}^{l} \\frac{1}{\\sqrt{\\binom{n}{k}}}\n", "\\sum_{\\substack{|y| = k}} |y\\rangle_n\n", "$$\n", - "This recursive implementation is based on [[2](#Dicke)]. The recursion is working bit by bit." + "This recursive implementation is based on [[2](#Dicke)]. The recursion works bit by bit." 
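As a classical sanity check on the unary-to-Dicke step, a Dicke state assigns the uniform amplitude $1/\sqrt{\binom{n}{k}}$ to every $n$-bit string of Hamming weight $k$. A short sketch (plain Python, not the quantum implementation) enumerating these amplitudes:

```python
from itertools import combinations
from math import comb, sqrt

def dicke_amplitudes(n, k):
    # Uniform superposition over all n-bit strings of Hamming weight k:
    # each basis state gets amplitude 1/sqrt(C(n, k)).
    amp = 1 / sqrt(comb(n, k))
    return {
        "".join("1" if i in ones else "0" for i in range(n)): amp
        for ones in combinations(range(n), k)
    }

states = dicke_amplitudes(6, 4)
print(len(states))  # C(6, 4) = 15 weight-4 basis states
```

Comparing a simulator's measured distribution against this dictionary is one way to check the quantum construction on small cases such as the six-qubit, four-ones example tested in the notebook.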
] }, { @@ -333,7 +333,7 @@ "id": "f7c199ac-94d5-48dd-850a-cf3bd3b67cb0", "metadata": {}, "source": [ - "Test the function for Dicke state of 6 qubits with 4 1's:" + "We test the function for the Dicke state of six qubits with four ones:" ] }, { @@ -398,7 +398,7 @@ "id": "30728a21-0863-497f-88e8-e3761e01fcc0", "metadata": {}, "source": [ - "### Step 4: Vector and matrix products" + "### Step 4: Vector and Matrix Products" ] }, { @@ -430,7 +430,7 @@ "id": "8ed1df9a-26f0-4327-8f4c-6ebc6470697c", "metadata": {}, "source": [ - "## Assembling the full MAX-XOR-SAT algorithm" + "## Assembling the Full max-XORSAT Algorithm" ] }, { @@ -438,16 +438,15 @@ "id": "2dc5da56-467b-4c57-ab06-af0a99adbfd8", "metadata": {}, "source": [ - "Here we combine all the building-blocks to the full algorithm. To save qubits, the decoding will be done inplace directly onto the \n", - "$|y\rangle$ register. The only remaining part is the decoding part, that will be treated after choosing the problem to optimize, as it depends on the input structure.\n", + "Here, we combine all the building blocks into the full algorithm. To save qubits, the decoding is done in place, directly onto the $|y\rangle$ register. The only remaining part is the decoding, which we treat after choosing the problem to optimize, as it depends on the input structure.\n", "\n", "`dqi_max_xor_sat` is the main quantum function of the algorithm. It expects the following arguments:\n", "- `B`: the (classical) constraints matrix of the optimization problem\n", "- `v`: the (classical) constraints vector of the optimization problem\n", - "- `w_k`: a (classical) vector of coefficients $w_k$, corresponds to the polynomial transformation of the target function. The index of the last nonzero element will set the maximal number of errors that the decoder should decode\n", - "- `y`: the (quantum) array of the errors to be decoded by the decoder. 
If the decoder is perfect, should hold only 0's at the output\n", - "- `solution`: the (quantum) output array of the solution. Holds $|B^Ty\\rangle$ before the Hadamard-transform. \n", - "- `syndrome_decode`: a quantum callable that accept a syndrome quantum array and outputs the decoded error on its second quantum argument" + "- `w_k`: a (classical) vector of coefficients $w_k$ corresponding to the polynomial transformation of the target function. The index of the last non-zero element sets the maximum number of errors that the decoder should decode\n", + "- `y`: the (quantum) array of the errors for the decoder to decode. If the decoder is perfect, it should hold only zeros at the output\n", + "- `solution`: the (quantum) output array of the solution. It holds $|B^Ty\\rangle$ before the Hadamard transform. \n", + "- `syndrome_decode`: a quantum callable that accepts a syndrome quantum array and outputs the decoded error on its second quantum argument" ] }, { @@ -509,7 +508,7 @@ "id": "0aedcd1b-0c52-44a9-b3da-c27ee84d91f4", "metadata": {}, "source": [ - "## Example problem: Max Cut for Regular Graphs" + "## Example Problem: Max-Cut for Regular Graphs" ] }, { @@ -517,9 +516,9 @@ "id": "87d11341-0a03-4588-971c-4a363b821380", "metadata": {}, "source": [ - "Now let's be more specific. We choose to optimize a Max-Cut problem. We also choose specific parameters so that with the resulting $B$ matrix we will be able to decode up to 2 errors on the vector $|y\\rangle$.\n", + "Now, let's be more specific and optimize a Max-Cut problem. We choose specific parameters so that with the resulting $B$ matrix we can decode up to two errors on the vector $|y\\rangle$.\n", "\n", - "The tranlation between Max-Cut and max-XORSAT is quite straightforward. Every edge is a row, with the nodes as columns. The $v$ vector is all ones, so that if $(v_i, v_j) \\in E$, we get a constraint $x_i \\oplus x_j = 1$, that will be satisfied if $x_i$, $x_j$ are on different sides of the cut." 
+ "The translation between Max-Cut and max-XORSAT is quite straightforward. Every edge is a row, with the nodes as columns. The $v$ vector is all ones, so that if $(v_i, v_j) \\in E$, we get a constraint $x_i \\oplus x_j = 1$, which is satisfied if $x_i$, $x_j$ are on different sides of the cut." ] }, { @@ -580,7 +579,7 @@ "id": "685ae8b2-148f-4022-8985-33096368216b", "metadata": {}, "source": [ - "### Original sampling statistics" + "### Original Sampling Statistics" ] }, { @@ -588,9 +587,9 @@ "id": "be2d4d84-b9d4-4492-bd6c-c39d46f414f2", "metadata": {}, "source": [ - "Let's plot the statistics of $f$ for uniformly sampling $x$, as an histogram. \n", + "Let's plot the statistics of $f$ for uniformly sampling $x$, as a histogram. \n", "\n", - "We will Later show how we get a better histogram after sampling from the state of the DQI algorithm." + "Later, we show how to get a better histogram after sampling from the state of the DQI algorithm." ] }, { @@ -628,7 +627,7 @@ "id": "604eadc0-6d83-45f1-97a7-d79149387dc2", "metadata": {}, "source": [ - "### Decodability of the resulting matrix" + "### Decodability of the Resulting Matrix" ] }, { @@ -636,7 +635,7 @@ "id": "07a47df3-b1cb-459c-a508-1cf778f9b4fc", "metadata": {}, "source": [ - "The transposed matrix of the specific matrix we have chosen can be decoded with up to 2 errors, which corresponds to a polynomial transformation of $f$ of degree 2 in the amplitude, and degree 4 in the sampling probability:" + "The transposed matrix of the specific matrix we have chosen can be decoded with up to two errors, which corresponds to a polynomial transformation of $f$ of degree 2 in the amplitude, and degree 4 in the sampling probability:" ] }, { @@ -680,7 +679,7 @@ "id": "e809f82d-b7e8-4a83-b572-339c368a2bd8", "metadata": {}, "source": [ - "### Step 5: Defining the decoder" + "### Step 5: Defining the Decoder" ] }, { @@ -688,7 +687,7 @@ "id": "f9b54c3d-e0bf-49c7-971f-a751fdc3fd41", "metadata": {}, "source": [ - "For this 
basic demonstration, we just use a brute-force decoder, that will use a lookup-table for decoding each syndrome in superposition:" + "For this basic demonstration, we use a brute force decoder that uses a lookup table for decoding each syndrome in superposition:" ] }, { @@ -716,8 +715,8 @@ "id": "195a755f-ad43-4cb5-bcc4-edd95af9be50", "metadata": {}, "source": [ - "It is also possible to define a decoder that use a local rule of syndrome majority.\n", - "This decoder can correct just 1 error." + "It is also possible to define a decoder that uses a local rule of syndrome majority.\n", + "This decoder can correct just one error." ] }, { @@ -741,8 +740,8 @@ "id": "1d1d19fb-beb3-4325-bdbd-fc73039c3243", "metadata": {}, "source": [ - "### Choosing optimal $w_k$ coefficients\n", - "This is done according to the paper [[1](#DQI)] by finding the principal value of a tridiagonal matrix $A$ defined by the follwing code. The optimality is with regards to the expected ratio of satisfied constraints." + "### Choosing Optimal $w_k$ Coefficients\n", + "According to the paper [[1](#DQI)], this is done by finding the principal eigenvector of a tridiagonal matrix $A$ defined by the following code. The optimality is with regard to the expected ratio of satisfied constraints." ] }, { @@ -922,7 +921,7 @@ "id": "67ea4578-4675-4973-a0cb-a9769de53666", "metadata": {}, "source": [ - "Verify the `y` variable was uncomputed correctly by the decoder:" + "We verify that the decoder uncomputed the `y` variable correctly:" ] }, { @@ -948,7 +947,7 @@ "id": "446d1611-3e63-47bd-aac9-c2092e3da245", "metadata": {}, "source": [ - "### Post Processing" + "### Postprocessing" ] }, { @@ -956,7 +955,7 @@ "id": "d658802b-3e8e-4f95-865a-17bac6696e1f", "metadata": {}, "source": [ - "Finally, we plot the histogram of the sampled $f$ values from the algorithm, and compare it to a uniform sampling of $x$ values, and also to sampling weighted by $|f|$ and $|f|^2$ values. 
We can see the the DQI histogram is biased to higher $f$ values compared to the other sampling methods." + "Finally, we plot the histogram of the sampled $f$ values from the algorithm, and compare it to a uniform sampling of $x$ values, and also to sampling weighted by $|f|$ and $|f|^2$ values. We can see that the DQI histogram is biased to higher $f$ values compared to the other sampling methods." ] }, { @@ -1070,7 +1069,7 @@ "\n", "[2]: [Bärtschi, Andreas, and Stephan Eidenbenz. \"Deterministic Preparation of Dicke States.\" In *Fundamentals of Computation Theory*, pp. 126–139. Springer International Publishing, 2019.](http://dx.doi.org/10.1007/978-3-030-25027-0_9)\n", "\n", - "[3]: [\"Linear Block Codes: Encoding and Syndrome Decoding\" from MIT's OpenCourseWare](https://ocw.mit.edu/courses/6-02-introduction-to-eecs-ii-digital-communication-systems-fall-2012/resources/mit6_02f12_chap06/)" + "[3]: [\"Linear Block Codes: Encoding and Syndrome Decoding\" from MIT's OpenCourseWare](https://ocw.mit.edu/courses/6-02-introduction-to-eecs-ii-digital-communication-systems-fall-2012/resources/mit6_02f12_chap06/)." ] } ],
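As a classical companion to the brute-force decoder of Step 5, the sketch below (plain numpy; the toy parity-check matrix is hypothetical, not the notebook's $B^T$) builds a syndrome-to-error lookup table over all errors of weight at most $l$, mirroring the lookup-table idea that the quantum decoder applies in superposition:

```python
import numpy as np
from itertools import combinations

def build_syndrome_table(H, max_weight):
    # Map each syndrome H @ e (mod 2) to a minimum-weight error e,
    # over all errors with Hamming weight <= max_weight.
    m = H.shape[1]  # number of error bits
    table = {}
    for weight in range(max_weight + 1):
        for positions in combinations(range(m), weight):
            e = np.zeros(m, dtype=int)
            e[list(positions)] = 1
            syndrome = tuple(int(b) for b in H @ e % 2)
            table.setdefault(syndrome, e)  # lower weights are seen first
    return table

# Hypothetical toy parity-check matrix (2 checks on 3 bits), decoding 1 error.
H = np.array([[1, 0, 1], [0, 1, 1]])
table = build_syndrome_table(H, 1)
```

The table grows as $\sum_{k \le l} \binom{m}{k}$ entries, which is why this brute-force approach only suits small demonstrations; structured codes admit far more efficient decoders.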