| title | filename | chapternum |
|---|---|---|
| Universality and uncomputability | lec_08_uncomputability | 9 |
- The universal machine/program - "one program to rule them all"
- A fundamental result in computer science and mathematics: the existence of uncomputable functions.
- The halting problem: the canonical example of an uncomputable function.
- Introduction to the technique of reductions.
- Rice's Theorem: A "meta tool" for uncomputability results, and a starting point for much of the research on compilers, programming languages, and software verification.
"A function of a variable quantity is an analytic expression composed in any way whatsoever of the variable quantity and numbers or constant quantities.", Leonhard Euler, 1748.
"The importance of the universal machine is clear. We do not need to have an infinity of different machines doing different jobs. ... The engineering problem of producing various machines for various jobs is replaced by the office work of 'programming' the universal machine", Alan Turing, 1948
One of the most significant results we showed for Boolean circuits (or equivalently, straight-line programs) is the notion of universality: there is a single circuit that can evaluate all other circuits.
However, this result came with a significant caveat.
To evaluate a circuit of
It is no exaggeration to say that the existence of such a universal program/machine underlies the information technology revolution that began in the latter half of the 20th century (and is still ongoing).
Up to that point in history, people had produced various special-purpose calculating devices such as the abacus, the slide rule, and machines that compute various trigonometric series.
But as Turing (who was perhaps the one to see most clearly the ramifications of universality) observed, a general purpose computer is much more powerful.
Once we build a device that can compute the single universal function, we have the ability, via software, to extend it to do arbitrary computations.
For example, if we want to simulate a new Turing machine
Beyond the practical applications, the existence of a universal algorithm also has surprising theoretical ramifications, and in particular can be used to show the existence of uncomputable functions, upending the intuitions of mathematicians over the centuries from Euler to Hilbert. In this chapter we will prove the existence of the universal program, and also show its implications for uncomputability; see universalchapoverviewfig{.ref}.
::: {.nonmath} In this chapter we will see two of the most important results in Computer Science:
- The existence of a universal Turing machine: a single algorithm that can evaluate all other algorithms.

- The existence of uncomputable functions: functions (including the famous "Halting problem") that cannot be computed by any algorithm.
Along the way, we develop the technique of reductions as a way to show hardness of computing a function. A reduction gives a way to compute a certain function using "wishful thinking", assuming that another function can be computed. Reductions are of course widely used in programming - we often obtain an algorithm for one task by using another task as a "black box" subroutine. However, we will use them in the contrapositive: rather than using a reduction to show that the former task is "easy", we use it to show that the latter task is "hard". Don't worry if you find this confusing - reductions are initially confusing - but they can be mastered with time and practice. :::
We start by proving the existence of a universal Turing machine.
This is a single Turing machine
::: {.theorem title="Universal Turing Machine" #universaltmthm}
There exists a Turing machine
That is, if the machine
::: { .bigidea #universaltmidea} There is a "universal" algorithm that can evaluate arbitrary algorithms on arbitrary inputs. :::
::: {.proofidea data-ref="universaltmthm"}
Once you understand what the theorem says, it is not that hard to prove. The desired program
Think of how you would code
Once you do that, translating this interpreter from your favorite programming language to a Turing machine can be done just as we have seen in chapequivalentmodels{.ref}. The end result is what's known as a "meta-circular evaluator": an interpreter for a programming language in the same one. This is a concept that has a long history in computer science starting from the original universal Turing machine. See also lispinterpreterfig{.ref}. :::
To prove (and even properly state) universaltmthm{.ref}, we need to fix some representation for Turing machines as strings.
One potential choice for such a representation is to use the equivalence between Turing machines and NAND-TM programs and hence represent a Turing machine
::: {.definition title="String representation of Turing Machine" #representTM}
Let
where each value
::: {.remark title="Take away points of representation" #TMrepremark} The details of the representation scheme of Turing machines as strings are immaterial for almost all applications. What you need to remember are the following points:
- We can represent every Turing machine as a string.

- Given the string representation of a Turing machine $M$ and an input $x$, we can simulate $M$'s execution on the input $x$. (This is the content of universaltmthm{.ref}.)
An additional minor issue is that for convenience we make the assumption that every string represents some Turing machine. This is very easy to ensure by just mapping strings that would otherwise not represent a Turing machine into some fixed trivial machine. This assumption is not very important, but does make a few results (such as Rice's Theorem: rice-thm{.ref}) a little less cumbersome to state. :::
Using this representation, we can formally prove universaltmthm{.ref}.
::: {.proof data-ref="universaltmthm"}
We will only sketch the proof, giving the major ideas.
First, we observe that we can easily write a Python program that, on input a representation
```python
# constants
def EVAL(δ,x):
    '''Evaluate TM given by transition table δ
    on input x'''
    Tape = ["▷"] + [a for a in x]
    i = 0; s = 0 # i = head pos, s = state
    while True:
        s, Tape[i], d = δ[(s,Tape[i])]
        if d == "H": break
        if d == "L": i = max(i-1,0)
        if d == "R": i += 1
        if i >= len(Tape): Tape.append('Φ')

    j = 1; Y = [] # produce output
    while Tape[j] != 'Φ':
        Y.append(Tape[j])
        j += 1
    return Y
```
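To see the simulation in action, here is a toy run of the evaluator: a self-contained copy of `EVAL` above, with the transition table given as a Python dict. The one-state machine `NOT` below is made up for illustration; it flips every bit of its input.

```python
def EVAL(δ, x):
    Tape = ["▷"] + [a for a in x]
    i = 0; s = 0                        # head position and current state
    while True:
        s, Tape[i], d = δ[(s, Tape[i])]
        if d == "H": break
        if d == "L": i = max(i - 1, 0)
        if d == "R": i += 1
        if i >= len(Tape): Tape.append('Φ')
    j = 1; Y = []                       # read output up to the first blank
    while Tape[j] != 'Φ':
        Y.append(Tape[j])
        j += 1
    return Y

NOT = {
    (0, "▷"): (0, "▷", "R"),   # skip the start symbol
    (0, "0"): (0, "1", "R"),   # flip 0 to 1 and move right
    (0, "1"): (0, "0", "R"),   # flip 1 to 0 and move right
    (0, "Φ"): (0, "Φ", "H"),   # halt at the first blank
}
assert EVAL(NOT, "0110") == ["1", "0", "0", "1"]
```

The same evaluator works for any machine given as a transition table: the code of `EVAL` does not change, only the data `δ`.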
On input a transition table δ, this program simulates the corresponding machine step by step, maintaining the invariant that `Tape` contains the contents of the machine's tape, `i` contains the head position, and `s` contains the machine's current state.
The above does not prove the theorem as stated, since we need to show a Turing machine that computes
Translating the above Python code to NAND-RAM is truly straightforward.
The only issue is that NAND-RAM doesn't have the dictionary data structure built in, which we have used above to store the transition function δ.
However, we can represent a dictionary
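The idea can be sketched in Python (an illustrative sketch, not the book's NAND-RAM code): a dictionary can be emulated by a plain list of key/value pairs, which is all that NAND-RAM's arrays directly support, at the cost of a linear scan per operation.

```python
def dict_get(pairs, key, default=None):
    for k, v in pairs:         # linear scan over all stored pairs
        if k == key:
            return v
    return default

def dict_set(pairs, key, value):
    for idx, (k, _) in enumerate(pairs):
        if k == key:           # overwrite an existing key
            pairs[idx] = (key, value)
            return
    pairs.append((key, value))

# store two transition-table entries and read one back
table = []
dict_set(table, (0, "▷"), (0, "▷", "R"))
dict_set(table, (0, "0"), (0, "1", "H"))
assert dict_get(table, (0, "0")) == (0, "1", "H")
```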
::: {.remark title="Efficiency of the simulation"}
The argument in the proof of universaltmthm{.ref} is a very inefficient way to implement the dictionary data structure in practice, but it suffices for the purpose of proving the theorem.
Reading and writing to a dictionary of
The construction above yields a universal Turing machine with a very large number of states. However, since universal Turing machines have such a philosophical and technical importance, researchers have attempted to find the smallest possible universal Turing machines, see uncomputablebibnotes{.ref}.
There is more than one Turing machine
The idea of a "universal program" is of course not limited to theory.
For example compilers for programming languages are often used to compile themselves, as well as programs more complicated than the compiler.
(An extreme example of this is Fabrice Bellard's Obfuscated Tiny C Compiler which is a C program of 2048 bytes that can compile a large subset of the C programming language, and in particular can compile itself.)
This is also related to the fact that it is possible to write a program that can print its own source code, see lispinterpreterfig{.ref}.
There are universal Turing machines known that require a very small number of states or alphabet symbols, and in particular there is a universal Turing machine (with respect to a particular choice of representing Turing machines as strings) whose tape alphabet is
In NAND-univ-thm{.ref}, we saw that NAND-CIRC programs can compute every finite function
The existence of uncomputable functions is quite surprising.
Our intuitive notion of a "function" (and the notion most mathematicians had until the 20th century)
is that a function
There exists a function $F^*:\{0,1\}^* \rightarrow \{0,1\}$ that is not computable by any Turing machine.
The idea behind the proof follows quite closely Cantor's proof that the reals are uncountable (cantorthm{.ref}), and in fact the theorem can also be obtained fairly directly from that result (see uncountablefuncex{.ref}).
However, it is instructive to see the direct proof.
The idea is to construct $F^*$ in a way that will ensure that every possible machine $M$ will in fact fail to compute $F^*$. We do so by defining $F^*(x)$ to equal $0$ if $x$ describes a Turing machine $M$ which satisfies $M(x)=1$ and defining $F^*(x)=1$ otherwise. By construction, if
::: {.proof data-ref="uncomputable-func"}
The proof is illustrated in diagonal-fig{.ref}.
We start by defining the following function
For every string $x\in \{0,1\}^*$, if $x$ satisfies (1) $x$ is a valid representation of some Turing machine $M$ (per the representation scheme above) and (2) when the program $M$ is executed on the input $x$ it halts and produces an output, then we define $G(x)$ as the first bit of this output. Otherwise (i.e., if $x$ is not a valid representation of a Turing machine, or the machine $M_x$ never halts on $x$) we define $G(x)=0$. We define $F^*(x) = 1 - G(x)$.
We claim that there is no Turing machine that computes $F^*$.
Indeed, suppose, towards the sake of contradiction, there exists a machine $M$ that computes $F^*$, and let
::: { .bigidea #uncomputablefunctions} There are some functions that can not be computed by any algorithm. :::
The proof of uncomputable-func{.ref} is short but subtle. I suggest that you pause here and go back to read it again and think about it - this is a proof that is worth reading at least twice if not three or four times. It is not often the case that a few lines of mathematical reasoning establish a deeply profound fact - that there are problems we simply cannot solve.
The type of argument used to prove uncomputable-func{.ref} is known as diagonalization since it can be described as defining a function based on the diagonal entries of a table as in diagonal-fig{.ref}.
The proof can be thought of as an infinite version of the counting argument we used for showing lower bound for NAND-CIRC programs in counting-lb{.ref}.
Namely, we show that it's not possible to compute all functions from
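The flavor of the diagonal argument can be illustrated with a finite table (an illustrative sketch, not part of the proof): the rows are candidate functions on inputs $0,\ldots,n-1$, and flipping the diagonal yields a function that disagrees with every row.

```python
def diagonal_counterexample(table):
    # flip the diagonal: the result differs from row i at position i
    return [1 - table[i][i] for i in range(len(table))]

table = [
    [0, 1, 1, 0],   # "machine 0" evaluated on inputs 0,1,2,3
    [1, 1, 0, 0],   # "machine 1"
    [0, 0, 0, 0],   # "machine 2"
    [1, 0, 1, 1],   # "machine 3"
]
F = diagonal_counterexample(table)
for i in range(len(table)):
    assert F[i] != table[i][i]   # F differs from machine i on input i
```

The proof of uncomputable-func{.ref} does the same thing with an infinite "table" whose rows are all Turing machines.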
As mentioned in decidablelanguagesrem{.ref}, many texts use the "language" terminology and so will call a set $L \subseteq \{0,1\}^*$ an undecidable or non-recursive language if the function $F:\{0,1\}^* \rightarrow \{0,1\}$ such that
uncomputable-func{.ref} shows that there is some function that cannot be computed. But is this function the equivalent of the "tree that falls in the forest with no one hearing it"? That is, perhaps it is a function that no one actually wants to compute. It turns out that there are natural uncomputable functions:
Let
Before turning to prove halt-thm{.ref}, we note that
::: {.proofidea data-ref="halt-thm"}
One way to think about this proof is as follows:
$$
\text{Uncomputability of $F^*$} \;+\; \text{Universality} \;=\; \text{Uncomputability of $HALT$}
$$
That is, we will use the universal Turing machine that computes $EVAL$ to derive the uncomputability of $HALT$ from the uncomputability of $F^*$ shown in uncomputable-func{.ref}.
Specifically, the proof will be by contradiction.
That is, we will assume towards a contradiction that
::: { .bigidea #reductionuncomputeidea}
If a function
::: {.proof data-ref="halt-thm"} The proof will use the previously established result uncomputable-func{.ref}. Recall that uncomputable-func{.ref} shows that the following function $F^*: \{0,1\}^* \rightarrow \{0,1\}$ is uncomputable:
$$
F^*(x) = \begin{cases}1 & x(x)=0 \text{ or } x(x)=\bot \\ 0 & \text{otherwise} \end{cases}
$$
where
We will show that the uncomputability of $F^*$ implies the uncomputability of $HALT$.
Specifically, we will assume, towards a contradiction, that there exists a Turing machine $M$ that can compute the $HALT$ function, and use that to obtain a Turing machine $M'$ that computes the function $F^*$.
(This is known as a proof by reduction, since we reduce the task of computing $F^*$ to the task of computing $HALT$. By the contrapositive, this means the uncomputability of $F^*$ implies the uncomputability of
Indeed, suppose that
INPUT: $x\in \{0,1\}^*$
OUTPUT: $F^*(x)$

# Assume T.M. $M_{HALT}$ computes $HALT$
Let $z \leftarrow M_{HALT}(x,x)$ # Assume $z=HALT(x,x)$.
If{$z=0$}
   return $1$
endif
Let $y \leftarrow U(x,x)$ # $U$ universal TM, i.e., $y=x(x)$
If{$y=1$}
   return $0$
endif
return $1$
We claim that halttof{.ref} computes the function $F^*$.
Indeed, suppose that $x(x)=1$ (and hence $F^*(x)=0$).
In this case,
Suppose otherwise that
- Case 1: The machine described by $x$ does not halt on the input $x$ (and hence $F^*(x)=1$). In this case, $HALT(x,x)=0$. Since we assume that $M$ computes $HALT$, it means that on input $x,x$, the machine $M$ must halt and output the value $0$. This means that halttof{.ref} will set $z=0$ and output $1$.

- Case 2: The machine described by $x$ halts on the input $x$ and outputs some $y' \neq 0$ (and hence $F^*(x)=0$). In this case, since $HALT(x,x)=1$, under our assumptions, halttof{.ref} will set $y=y' \neq 0$ and so output $0$.
We see that in all cases, $M'(x)=F^*(x)$, which contradicts the fact that $F^*$ is uncomputable.
Hence we reach a contradiction to our original assumption that
Once again, this is a proof that's worth reading more than once. The uncomputability of the halting problem is one of the fundamental theorems of computer science, and is the starting point for much of the investigations we will see later. An excellent way to get a better understanding of halt-thm{.ref} is to go over haltalternativesec{.ref}, which presents an alternative proof of the same result.
Many people's first instinct when they see the proof of halt-thm{.ref} is to not believe it.
That is, most people do believe the mathematical statement, but intuitively it doesn't seem that the Halting problem is really that hard.
After all, being uncomputable only means that
But programmers seem to solve
While every programmer encounters at some point an infinite loop, is there really no way to solve the halting problem? Some people argue that they personally can, if they think hard enough, determine whether any concrete program that they are given will halt or not. Some have even argued that humans in general have the ability to do that, and hence humans have inherently superior intelligence to computers or anything else modeled by Turing machines.^[This argument has also been connected to the issues of consciousness and free will. I am personally skeptical of its relevance to these issues. Perhaps the reasoning is that humans have the ability to solve the halting problem but they exercise their free will and consciousness by choosing not to do so.]
The best answer we have so far is that there truly is no way to solve
```python
def isprime(p):
    return all(p % i for i in range(2,p-1))

def Goldbach(n):
    return any( (isprime(p) and isprime(n-p))
                for p in range(2,n-1))

n = 4
while True:
    if not Goldbach(n): break
    n += 2
```
Given that Goldbach's Conjecture has been open since 1742, it is unclear that humans have any magical ability to say whether this (or other similar programs) will halt or not.
It turns out that we can combine the ideas of the proofs of uncomputable-func{.ref} and halt-thm{.ref} to obtain a short proof of the latter theorem, that does not appeal to the uncomputability of
To the Editor, The Computer Journal.
An Impossible Program
Sir,
A well-known piece of folk-lore among programmers holds that it is impossible to write a program which can examine any other program and tell, in every case, if it will terminate or get into a closed loop when it is run. I have never actually seen a proof of this in print, and though Alan Turing once gave me a verbal proof (in a railway carriage on the way to a Conference at the NPL in 1953), I unfortunately and promptly forgot the details. This left me with an uneasy feeling that the proof must be long or complicated, but in fact it is so short and simple that it may be of interest to casual readers. The version below uses CPL, but not in any essential way.
Suppose `T[R]` is a Boolean function taking a routine (or program) `R` with no formal or free variables as its argument and that for all `R`, `T[R] = True` if `R` terminates if run and that `T[R] = False` if `R` does not terminate.

Consider the routine P defined as follows

```
rec routine P
 §L: if T[P] go to L
Return §
```

If `T[P] = True` the routine `P` will loop, and it will only terminate if `T[P] = False`. In each case `T[P]` has exactly the wrong value, and this contradiction shows that the function T cannot exist.

Yours faithfully,

C. Strachey

Churchill College, Cambridge
::: { .pause } Try to stop and extract the argument for proving halt-thm{.ref} from the letter above. :::
Since CPL is not as common today, let us reproduce this proof.
The idea is the following: suppose, for the sake of contradiction, that there exists a program `T` such that `T(f,x)` equals `True` iff `f` halts on input `x`. (Strachey's letter considers the no-input variant of the problem, but the distinction is immaterial.) We will construct a program `P` and an input `x` such that `T(P,x)` gives the wrong answer.
The idea is that on input `x`, the program `P` will do the following: run `T(x,x)`, and if the answer is `True` then go into an infinite loop, and otherwise halt.
Now you can see that `T(P,P)` will give the wrong answer: if `P` halts when it gets its own code as input, then `T(P,P)` is supposed to be `True`, but then `P(P)` will go into an infinite loop. And if `P` does not halt, then `T(P,P)` is supposed to be `False`, but then `P(P)` will halt.
We can also code this up in Python:
```python
def CantSolveMe(T):
    """
    Gets function T that claims to solve HALT.
    Returns a pair (P,x) of code and input on which
    T(P,x) ≠ HALT(x)
    """
    def fool(x):
        if T(x,x):
            while True: pass
        return "I halted"

    return (fool,fool)
```
For example, consider the following naive Python program `T` that guesses that a given function does not halt if its source code contains `while` or `for`:
```python
def T(f,x):
    """Crude halting tester - decides f doesn't halt if its source contains a loop."""
    import inspect
    source = inspect.getsource(f)
    if "while" in source: return False
    if "for" in source: return False
    return True
```
If we now set `(f,x) = CantSolveMe(T)`, then `T(f,x)=False` but `f(x)` does in fact halt. This is of course not specific to this particular `T`: for every program `T`, if we run `(f,x) = CantSolveMe(T)` then we'll get an input on which `T` gives the wrong answer to $HALT$.
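To make this concrete, here is a small self-contained demonstration. The trivial tester `T_pessimist` is made up for this sketch: it simply guesses that no program ever halts, and `CantSolveMe` fools it.

```python
def CantSolveMe(T):
    def fool(x):
        if T(x, x):
            while True: pass   # loop forever exactly when T predicts halting
        return "I halted"
    return (fool, fool)

def T_pessimist(f, x):
    return False               # always predicts "does not halt"

f, x = CantSolveMe(T_pessimist)
assert T_pessimist(f, x) == False   # T claims f does not halt on x ...
assert f(x) == "I halted"           # ... but it does: T is wrong on (f, x)
```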
The Halting problem turns out to be a linchpin of uncomputability, in the sense that halt-thm{.ref} has been used to show the uncomputability of a great many interesting functions. We will see several examples of such results in this chapter and the exercises, but there are many more such results (see haltreductions{.ref}).
The idea behind such uncomputability results is conceptually simple but can at first be quite confusing.
If we know that
For example, to prove that
A reduction-based proof has two components.
For starters, since we need
Reduction-based proofs are just like other proofs by contradiction, but the fact that they involve hypothetical algorithms that don't really exist tends to make reductions quite confusing. The one silver lining is that at the end of the day the notion of reductions is mathematically quite simple, and so it's not that bad even if you have to go back to first principles every time you need to remember what is the direction that a reduction should go in.
::: {.remark title="Reductions are algorithms" #reductionsaralg} A reduction is an algorithm, which means that, as discussed in implspecanarem{.ref}, a reduction has three components:
- Specification (what): In the case of a reduction from $HALT$ to $BLAH$, the specification is that the function $R:\{0,1\}^* \rightarrow \{0,1\}^*$ should satisfy $HALT(M,x)=BLAH(R(M,x))$ for every Turing machine $M$ and input $x$. In general, to reduce a function $F$ to $G$, the reduction should satisfy $F(w)=G(R(w))$ for every input $w$ to $F$.

- Implementation (how): The algorithm's description: the precise instructions for how to transform an input $w$ to the output $y=R(w)$.

- Analysis (why): A proof that the algorithm meets the specification. In particular, in a reduction from $F$ to $G$ this is a proof that for every input $w$, the output $y=R(w)$ of the algorithm satisfies $F(w)=G(y)$. :::
Here is a concrete example for a proof by reduction.
We define the function
The proof of haltonzero-thm{.ref} is below, but before reading it you might want to pause for a couple of minutes and think how you would prove it yourself.
In particular, try to think of what a reduction from
::: {.proof #proofofhaltonzero data-ref="haltonzero-thm"}
The proof is by reduction from
Since this is our first proof by reduction from the Halting problem, we will spell it out in more details than usual. Such a proof by reduction consists of two steps:
- Description of the reduction: We will describe the operation of our algorithm $B$, and how it makes "function calls" to the hypothetical algorithm $A$.

- Analysis of the reduction: We will then prove that under the hypothesis that Algorithm $A$ computes $HALTONZERO$, Algorithm $B$ will compute $HALT$.
INPUT: Turing machine $M$ and string $x$.
OUTPUT: Turing machine $M'$ such that $M$ halts on $x$ iff $M'$ halts on zero
Procedure{$N_{M,x}$}{$w$} # Description of the T.M. $N_{M,x}$
Return $EVAL(M,x)$ # Ignore the input $w$, evaluate $M$ on $x$.
Endprocedure
Return $N_{M,x}$ # We do not execute $N_{M,x}$: only return its description
Our Algorithm
In pseudocode, the program
```python
def N(z):
    M = r'.......' # a string constant containing desc. of M
    x = r'.......' # a string constant containing x
    return eval(M,x) # note that we ignore the input z
```
That is, if we think of
The above completes the description of the reduction. The analysis is obtained by proving the following claim:
Claim: For every strings
Proof of Claim: Since
In particular if we instantiate this claim with the input
In the proof of haltonzero-thm{.ref} we used the technique of "hardwiring" an input
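As an illustration of hardwiring, here is a hypothetical Python sketch. The helper `hardwire` and its use of `exec` are my own illustration (not the book's construction), under the assumption that programs are represented as Python source strings: from the code of $M$ and an input $x$ it produces the source of a program that ignores its own input and always runs $M$ on $x$.

```python
def hardwire(M_src, x):
    """Return source for a program N that ignores its input z
    and always runs the hardwired program M on the hardwired input x."""
    return (
        f"def N(z):\n"
        f"    M = {M_src!r}  # a string constant containing desc. of M\n"
        f"    x = {x!r}  # a string constant containing x\n"
        f"    env = {{}}\n"
        f"    exec(M, env)        # define the function M from its source\n"
        f"    return env['M'](x)  # run M on x, ignoring N's own input z\n"
    )

M_src = "def M(s):\n    return s + s\n"   # a toy machine: doubles its input
src = hardwire(M_src, "ab")
env = {}
exec(src, env)                            # define N from its source
assert env["N"]("ignored") == "abab"      # N's output never depends on z
```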
The uncomputability of the Halting problem turns out to be a special case of a much more general phenomenon. Namely, that we cannot certify semantic properties of general purpose programs. "Semantic properties" mean properties of the function that the program computes, as opposed to properties that depend on the particular syntax used by the program.
An example for a semantic property of a program
Checking semantic properties of programs is of great interest, as it corresponds to checking whether a program conforms to a specification.
Alas it turns out that such properties are in general uncomputable.
We have already seen some examples of uncomputable semantic functions, namely
Let
::: { .pause }
Despite the similarity in their names,
::: {.proof data-ref="allzero-thm"}
The proof is by reduction from
Given a Turing machine
- Construct a Turing machine $M$ which on input $x\in \{0,1\}^*$, first runs $N(0)$ and then outputs $0$.

- Return $A(M)$.
Now if
Another result along similar lines is the following:
The following function is uncomputable $$ COMPUTES\text{-}PARITY(P) = \begin{cases} 1 & P \text{ computes the parity function } \\ 0 & \text{otherwise} \end{cases} $$
::: { .pause } We leave the proof of paritythm{.ref} as an exercise (paritythmex{.ref}). I strongly encourage you to stop here and try to solve this exercise. :::
paritythm{.ref} can be generalized far beyond the parity function. In fact, this generalization rules out verifying any type of semantic specification on programs. We define a semantic specification on programs to be some property that does not depend on the code of the program but just on the function that the program computes.
For example, consider the following two C programs
```c
int First(int n) {
    if (n<0) return 0;
    return 2*n;
}

int Second(int n) {
    int i = 0;
    int j = 0;
    if (n<0) return 0;
    while (j<n) {
        i = i + 2;
        j = j + 1;
    }
    return i;
}
```
`First` and `Second` are two distinct C programs, but they compute the same function.
A semantic property would be either true for both programs or false for both programs, since it depends on the function the programs compute and not on their code.
An example of a semantic property that both `First` and `Second` satisfy is the following: "The program $P$ computes a function $f$ mapping integers to integers satisfying that $f(n) \geq n$ for every input $n$".
A property is not semantic if it depends on the source code rather than the input/output behavior.
For example, properties such as "the program contains the variable `k`" or "the program uses the `while` operation" are not semantic.
Such properties can be true for one of the programs and false for others.
Formally, we define semantic properties as follows:
::: {.definition title="Semantic properties" #semanticpropdef}
A pair of Turing machines
A function
There are two trivial examples of semantic functions: the constant one function and the constant zero function. For example, if
::: {.solvedexercise title="$ZEROFUNC$ is semantic" #zerofuncsem}
Prove that the function
::: {.solution data-ref="zerofuncsem"}
Recall that
Often the properties of programs that we are most interested in computing are the semantic ones, since we want to understand the programs' functionality. Unfortunately, Rice's Theorem tells us that these properties are all uncomputable:
::: {.theorem title="Rice's Theorem" #rice-thm}
Let
::: {.proofidea #proofidea-rice-thm data-ref="rice-thm"}
The idea behind the proof is to show that every semantic non-trivial function
Because
::: {.proof data-ref="rice-thm"}
We will not give the proof in full formality, but rather illustrate the proof idea by restricting our attention to a particular semantic function
We start by noting that
- The machine $INF$ that simply goes into an infinite loop on every input satisfies $MONOTONE(INF)=1$, since $INF$ is not defined anywhere and so in particular there are no two inputs $x,x'$ where $x_i \leq x'_i$ for every $i$ but $INF(x)=0$ and $INF(x')=1$.

- The machine $PAR$ that computes the XOR or parity of its input is not monotone (e.g., $PAR(1,0,0,\ldots,0)=1$ but $PAR(1,1,0,\ldots,0)=0$) and hence $MONOTONE(PAR)=0$.
(Note that
We will now give a reduction from
::: {.quote}
Algorithm
Input: String
Assumption: Access to Algorithm
Operation:
- Construct the following machine $M$: "On input $z\in \{0,1\}^*$ do: (a) Run $N(0)$, (b) Return $PAR(z)$".

- Return $1-A(M)$. :::
To complete the proof we need to show that
If
An examination of this proof shows that we did not use anything about
::: {.remark title="Semantic is not the same as uncomputable" #syntacticcomputablefunctions}
Rice's Theorem is so powerful and such a popular way of proving uncomputability that people sometimes get confused and think that it is the only way to prove uncomputability.
In particular, a common misconception is that if a function
For example, consider the function that, on input a program, outputs $1$ if and only if the program both halts on the input $0$ and does not contain a variable named `Yale`. This function is not semantic, since it outputs different values on the following two functionally equivalent programs:
```
Yale[0] = NAND(X[0],X[0])
Y[0] = NAND(X[0],Yale[0])
```
and
```
Harvard[0] = NAND(X[0],X[0])
Y[0] = NAND(X[0],Harvard[0])
```
However, this function is uncomputable: every program can be transformed into an equivalent (and in fact improved :) ) program that does not contain the variable `Yale`. Hence if we could compute this function, we could also determine whether a given program halts on the input $0$, a task we have seen is uncomputable (haltonzero-thm{.ref}).
Moreover, as we will see in godelchap{.ref}, there are uncomputable functions whose inputs are not programs, and hence for which the adjective "semantic" is not applicable.
Properties such as "the program contains the variable `Yale`" are sometimes known as syntactic properties.
The terms "semantic" and "syntactic" are used beyond the realm of programming languages: a famous example of a syntactically correct but semantically meaningless sentence in English is Chomsky's "Colorless green ideas sleep furiously." However, formally defining "syntactic properties" is rather subtle and we will not use this terminology in this book, sticking to the terms "semantic" and "non-semantic" only.
:::
As we saw before, many natural computational models turn out to be equivalent to one another, in the sense that we can transform a "program" of one model (such as a
Let
Once again, this is a good point for you to stop and try to prove the result yourself before reading the proof below.
::: {.proof }
We have seen in TM-equiv-thm{.ref} that for every Turing machine
The transformation
The same proof carries over to other computational models such as the $\lambda$ calculus, two dimensional (or even one-dimensional) automata etc.
Hence for example, there is no algorithm to decide if a
Indeed, we can generalize Rice's Theorem to all these models.
For example, if
Programs are increasingly being used for mission critical purposes, whether it's running our banking system, flying planes, or monitoring nuclear reactors.
If we can't even give a certification algorithm that a program correctly computes the parity function, how can we ever be assured that a program does what it is supposed to do?
The key insight is that while it is impossible to certify that a general program conforms with a specification, it is possible to write a program in the first place in a way that will make it easier to certify.
As a trivial example, if you write a program without loops, then you can certify that it halts.
Also, while it might not be possible to certify that an arbitrary program computes the parity function, it is quite possible to write a particular program
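For instance, here is a minimal sketch of such a syntactic certification (my own illustration, not a real verifier): accept a program only if its source contains no loop statements at all. An actual checker would also have to rule out recursion, non-terminating library calls, and so on.

```python
import ast

def certifies_halting(source):
    """Certify halting by a purely syntactic check: no loop statements.
    (A sketch only: recursion etc. would also need to be ruled out.)"""
    tree = ast.parse(source)
    return not any(isinstance(node, (ast.While, ast.For, ast.AsyncFor))
                   for node in ast.walk(tree))

assert certifies_halting("def f(n):\n    return 2*n + 1\n")              # loop-free
assert not certifies_halting("def g(n):\n    while True:\n        pass\n")
```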
The field of software verification is concerned with verifying that given programs satisfy certain conditions. These conditions can be that the program computes a certain function, that it never writes into a dangerous memory location, that it respects certain invariants, and others. While the general task of verifying such properties may be uncomputable, researchers have managed to do so for many interesting cases, especially if the program is written in the first place in a formalism or programming language that makes verification easier. That said, verification, especially of large and complex programs, remains a highly challenging task in practice as well, and the number of programs that have been formally proven correct is still quite small. Moreover, even phrasing the right theorem to prove (i.e., the specification) is often a highly non-trivial endeavor.
{#inclusionuncomputablefig .class }
::: { .recap }
- There is a universal Turing machine (or NAND-TM program) $U$ such that on input a description of a Turing machine $M$ and some input $x$, $U(M,x)$ halts and outputs $M(x)$ if (and only if) $M$ halts on input $x$. Unlike in the case of finite computation (i.e., NAND-CIRC programs / circuits), the input to the program $U$ can be a machine $M$ that has more states than $U$ itself.

- Unlike the finite case, there are actually functions that are inherently uncomputable in the sense that they cannot be computed by any Turing machine.

- These include not only some "degenerate" or "esoteric" functions but also functions that people deeply care about and conjectured could be computed.

- If the Church-Turing thesis holds then a function $F$ that is uncomputable according to our definition cannot be computed by any means in our physical world. :::
::: {.exercise title="NAND-RAM Halt" #NANDRAMHalt}
Let
::: {.exercise title="Timed halting" #timedhalt}
Let
Prove that
::: {.exercise title="Space halting (challenging)" #spacehalting}
Let
Prove that
::: {.exercise title="Computable compositions" #necessarilyuncomputableex}
Suppose that
- $H(x)=1$ iff $F(x)=1$ OR $G(x)=1$.

- $H(x)=1$ iff there exist two non-empty strings $u,v \in \{0,1\}^*$ such that $x=uv$ (i.e., $x$ is the concatenation of $u$ and $v$), $F(u)=1$ and $G(v)=1$.

- $H(x)=1$ iff there exists a list $u_0,\ldots,u_{t-1}$ of non-empty strings such that $F(u_i)=1$ for every $i\in [t]$ and $x=u_0u_1\cdots u_{t-1}$.

- $H(x)=1$ iff $x$ is a valid string representation of a NAND++ program $P$ such that for every $z\in \{0,1\}^*$, on input $z$ the program $P$ outputs $F(z)$.

- $H(x)=1$ iff $x$ is a valid string representation of a NAND++ program $P$ such that on input $x$ the program $P$ outputs $F(x)$.

- $H(x)=1$ iff $x$ is a valid string representation of a NAND++ program $P$ such that on input $x$, $P$ outputs $F(x)$ after executing at most $100\cdot |x|^2$ lines. :::
::: {.exercise #finiteuncompex }
Prove that the following function
:::
::: {.exercise title="Computing parity" #paritythmex} Prove paritythm{.ref} without using Rice's Theorem. :::
::: {.exercise title="TM Equivalence" #TMequivex}
Let $EQ:\{0,1\}^* \rightarrow \{0,1\}$ be the function defined as follows: given a string representing a pair $(M,M')$ of Turing machines, $EQ(M,M')=1$ if and only if $M$ and $M'$ are functionally equivalent (that is, for every $x\in \{0,1\}^*$, either both halt on $x$ with the same output, or neither halts on $x$). Prove that $EQ$ is uncomputable.

Note that you cannot use Rice's Theorem directly, as this theorem only deals with functions that take a single Turing machine as input, and $EQ$ takes two machines as input.
:::
::: {.exercise #salil-ex}
For each of the following two functions, say whether it is computable or not:

- Given a NAND-TM program $P$, an input $x$, and a number $k$, when we run $P$ on $x$, does the index variable `i` ever reach $k$?

- Given a NAND-TM program $P$, an input $x$, and a number $k$, when we run $P$ on $x$, does $P$ ever write to an array at index $k$?
:::
::: {.exercise #ricetmnandram}
Let
:::
::: {.exercise title="Recursively enumerable" #recursiveenumerableex}
Define a function $F:\{0,1\}^* \rightarrow \{0,1\}$ to be recursively enumerable if there exists a Turing machine $M$ such that for every $x\in \{0,1\}^*$, $F(x)=1$ if and only if $M$ halts on the input $x$.

- Prove that every computable $F$ is also recursively enumerable.

- Prove that there exists $F$ that is not computable but is recursively enumerable. See footnote for hint.^[$HALT$ has this property.]

- Prove that there exists a function $F:\{0,1\}^* \rightarrow \{0,1\}$ such that $F$ is not recursively enumerable. See footnote for hint.^[You can either use the diagonalization method to prove this directly or show that the set of all recursively enumerable functions is countable.]

- Prove that there exists a function $F:\{0,1\}^* \rightarrow \{0,1\}$ such that $F$ is recursively enumerable but the function $\overline{F}$ defined as $\overline{F}(x)=1-F(x)$ is not recursively enumerable. See footnote for hint.^[$HALT$ has this property: show that if both $HALT$ and $\overline{HALT}$ were recursively enumerable then $HALT$ would in fact be computable.]
:::
::: {.exercise title="Rice's Theorem: standard form" #ricestandardex }
In this exercise we will prove Rice's Theorem in the form that it is typically stated in the literature.

For a Turing machine $M$, define $L(M) \subseteq \{0,1\}^*$ to be the set of all strings $x$ such that $M$ halts on the input $x$ and outputs $1$.

- Prove that for every Turing machine $M$, if we define $F_M:\{0,1\}^* \rightarrow \{0,1\}$ to be the function such that $F_M(x)=1$ iff $x\in L(M)$, then $F_M$ is recursively enumerable as defined in recursiveenumerableex{.ref}.

- Use rice-thm{.ref} to prove that for every $G:\{0,1\}^* \rightarrow \{0,1\}$, if (a) $G$ is neither the constant zero nor the constant one function, and (b) for every $M,M'$ such that $L(M)=L(M')$, $G(M)=G(M')$, then $G$ is uncomputable. See footnote for hint.^[Show that any $G$ satisfying (b) must be semantic.]
:::
::: {.exercise title="Rice's Theorem for general Turing-equivalent models (optional)" #ricegeneralex}
Let
Prove that for every
:::
::: {.exercise title="Busy Beaver" #beaverex} In this question we define the NAND-TM variant of the busy beaver function (see Aaronson's 1999 essay, 2017 blog post and 2020 survey [@aaronson20beaver]; see also Tao's highly recommended presentation on how civilization's scientific progress can be measured by the quantities we can grasp).
- Let $T_{BB}:\{0,1\}^* \rightarrow \mathbb{N}$ be defined as follows. For every string $P\in \{0,1\}^*$, if $P$ represents a NAND-TM program such that when $P$ is executed on the input $0$ it halts within $M$ steps, then $T_{BB}(P)=M$. Otherwise (if $P$ does not represent a NAND-TM program, or it is a program that does not halt on $0$), $T_{BB}(P)=0$. Prove that $T_{BB}$ is uncomputable.

- Let $TOWER(n)$ denote the number $\underbrace{2^{\cdot^{\cdot^{\cdot^2}}}}_{n\text{ times}}$ (that is, a "tower of powers of two" of height $n$). To get a sense of how fast this function grows, $TOWER(1)=2$, $TOWER(2)=2^2=4$, $TOWER(3)=2^{2^2}=16$, $TOWER(4) = 2^{16} = 65536$ and $TOWER(5) = 2^{65536}$, which is about $10^{20000}$. $TOWER(6)$ is already a number that is too big to write even in scientific notation. Define $NBB:\mathbb{N} \rightarrow \mathbb{N}$ (for "NAND-TM Busy Beaver") to be the function $NBB(n) = \max_{P\in \{0,1\}^n} T_{BB}(P)$, where $T_{BB}$ is as defined in the previous item. Prove that $NBB$ grows faster than $TOWER$, in the sense that $TOWER(n) = o(NBB(n))$. See footnote for hint.^[You will not need to use very specific properties of the $TOWER$ function in this exercise. For example, $NBB(n)$ also grows faster than the Ackermann function.]
:::
The cartoon of the Halting problem in universalchapoverviewfig{.ref} is taken from Charles Cooper's website, Copyright 2019 Charles F. Cooper.
Section 7.2 in [@MooreMertens11] gives a highly recommended overview of uncomputability. Gödel, Escher, Bach [@hofstadter1999] is a classic popular science book that touches on uncomputability and unprovability, and specifically on Gödel's Theorem, which we will see in godelchap{.ref}. See also the recent book by Holt [@Holt2018].
The history of the definition of a function is intertwined with the development of mathematics as a field. For many years, a function was identified (as per Euler's quote above) with the means to calculate the output from the input. In the 1800s, with the invention of the Fourier series and with the systematic study of continuity and differentiability, people started looking at more general kinds of functions, but the modern definition of a function as an arbitrary mapping was not yet universally accepted. For example, in 1899 Poincaré wrote "we have seen a mass of bizarre functions which appear to be forced to resemble as little as possible honest functions which serve some purpose. ... they are invented on purpose to show that our ancestors' reasoning was at fault, and we shall never get anything more than that out of them". Some of this fascinating history is discussed in [@grabiner1983gave, @Kleiner91, @Lutzen2002, @grabiner2005the].
The existence of a universal Turing machine, and the uncomputability of $HALT$, both go back to Turing's seminal 1936/7 paper on computable numbers.
Some universal Turing machines with a small alphabet and number of states are given in [@rogozhin1996small], including a single-tape universal Turing machine with the binary alphabet and with less than
The diagonalization argument used to prove uncomputability in this chapter is derived from Cantor's diagonal argument for the uncountability of the real numbers.
Christopher Strachey was an English computer scientist and the inventor of the CPL programming language. He was also an early artificial intelligence visionary, programming a computer to play Checkers and even write love letters in the early 1950's, see this New Yorker article and this website.
Rice's Theorem was proven in [@rice1953classes]. It is typically stated in a form somewhat different than what we used, see ricestandardex{.ref}.
We do not discuss in the chapter the concept of recursively enumerable languages, but it is covered briefly in recursiveenumerableex{.ref}. As usual, we use function, as opposed to language, notation.