title | filename | chapternum |
---|---|---|
Equivalent models of computation | lec_07_other_models | 8 |
- Learn about RAM machines and the λ calculus.
- Equivalence between these and other models and Turing machines.
- Cellular automata and configurations of Turing machines.
- Understand the Church-Turing thesis.
"All problems in computer science can be solved by another level of indirection", attributed to David Wheeler.
"Because we shall later compute with expressions for functions, we need a distinction between functions and forms and a notation for expressing this distinction. This distinction and a notation for describing it, from which we deviate trivially, is given by Church.", John McCarthy, 1960 (in paper describing the LISP programming language)
So far we have defined the notion of computing a function using Turing machines, which are not a close match to the way computation is done in practice. In this chapter we justify this choice by showing that the definition of computable functions will remain the same under a wide variety of computational models. This notion is known as Turing completeness or Turing equivalence and is one of the most fundamental facts of computer science. In fact, a widely believed claim known as the Church-Turing Thesis holds that every "reasonable" definition of computable function is equivalent to being computable by a Turing machine. We discuss the Church-Turing Thesis and the potential definitions of "reasonable" in churchturingdiscussionsec{.ref}.
Some of the main computational models we discuss in this chapter include:
- RAM Machines: Turing machines do not correspond to standard computing architectures that have Random Access Memory (RAM). The mathematical model of RAM machines is much closer to actual computers, but we will see that it is equivalent in power to Turing machines. We also discuss a programming language variant of RAM machines, which we call NAND-RAM. The equivalence of Turing machines and RAM machines enables demonstrating the Turing Equivalence of many popular programming languages, including all general-purpose languages used in practice such as C, Python, JavaScript, etc.
- Cellular Automata: Many natural and artificial systems can be modeled as collections of simple components, each evolving according to simple rules based on its state and the state of its immediate neighbors. One well-known such example is Conway's Game of Life. To prove that cellular automata are equivalent to Turing machines we introduce the tool of configurations of Turing machines. These have other applications, and in particular are used in godelchap{.ref} to prove Gödel's Incompleteness Theorem: a central result in mathematics.
- $\lambda$ calculus: The $\lambda$ calculus is a model for expressing computation that originates from the 1930's, though it is closely connected to functional programming languages widely used today. Showing the equivalence of $\lambda$ calculus to Turing machines involves a beautiful technique to eliminate recursion known as the "Y Combinator".
::: {.nonmath} In this chapter we study equivalence between models. Two computational models are equivalent (also known as Turing equivalent) if they can compute the same set of functions. For example, we have seen that Turing machines and NAND-TM programs are equivalent since we can transform every Turing machine into a NAND-TM program that computes the same function, and similarly can transform every NAND-TM program into a Turing machine that computes the same function.
In this chapter we show this extends far beyond Turing machines. The techniques we develop allow us to show that all general-purpose programming languages (e.g., Python, C, Java) are Turing Complete, in the sense that they can simulate Turing machines and hence compute all functions that can be computed by a TM. We will also show the other direction: Turing machines can be used to simulate a program in any of these languages and hence compute any function computable by them. This means that all these programming languages are Turing equivalent: they are equivalent in power to Turing machines and to each other. This is a powerful principle, which underlies the vast reach of Computer Science. Moreover, it enables us to "have our cake and eat it too": since all these models are equivalent, we can choose the model of our convenience for the task at hand. To achieve this equivalence, we define a new computational model known as RAM machines. RAM machines capture the architecture of modern computers more closely than Turing machines, but are still computationally equivalent to Turing machines.
Finally, we will show that Turing equivalence extends far beyond traditional programming languages. We will see that cellular automata, which are a mathematical model of extremely simple natural systems, are also Turing equivalent, and also see the Turing equivalence of the λ calculus.
See turingcompletefig{.ref} for an overview of the results of this chapter. :::
One of the limitations of Turing machines (and NAND-TM programs) is that we can only access one location of our arrays/tape at a time.
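If the head is at position $i$ of the tape and we want to access position $j$, it will take us at least $|i-j|$ steps to get there.
In contrast, actual computers provide Random Access Memory (RAM), which can be thought of as a large array `Memory`, such that given an index $p$ (i.e., a memory address, or a pointer), we can read from and write to the $p$-th location of `Memory`.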
("Random access memory" is quite a misnomer since it has nothing to do with probability, but since it is a standard term in both the theory and practice of computing, we will use it as well.)
The computational model that models access to such a memory is the RAM machine (sometimes also known as the Word RAM model), as depicted in rammachinefig{.ref}.
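The memory of a RAM machine is an array of unbounded size where each cell can store a single word, which we think of as a string in $\{0,1\}^w$ (equivalently, a number in $\{0,\ldots,2^w-1\}$), where $w$ is the word size. In addition to the memory array, a RAM machine contains a constant number of registers $r_0,\ldots,r_{k-1}$, each of which can also hold a single word.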
The operations a RAM machine can carry out include:
- Data movement: Load data from a certain cell in memory into a register or store the contents of a register into a certain cell of memory. A RAM machine can directly access any cell of memory without having to move the "head" (as Turing machines do) to that location. That is, in one step a RAM machine can load into register $r_i$ the contents of the memory cell indexed by register $r_j$, or store into the memory cell indexed by register $r_j$ the contents of register $r_i$.
- Computation: RAM machines can carry out computation on registers such as arithmetic operations, logical operations, and comparisons.
- Control flow: As in the case of Turing machines, the choice of what instruction to perform next can depend on the state of the RAM machine, which is captured by the contents of its registers.
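To make the three kinds of operations above concrete, here is a minimal Python sketch of a toy RAM machine. The instruction names (`LOADI`, `LOAD`, `STORE`, `ADD`, `JLT`, `HALT`) and the tuple encoding of programs are assumptions made purely for illustration; they are not a formal definition, and the sketch ignores the bound on word size.

```python
from collections import defaultdict

def run_ram(program, num_registers=4, max_steps=10_000):
    """Run a toy RAM program and return (registers, memory)."""
    r = [0] * num_registers          # a constant number of registers
    mem = defaultdict(int)           # unbounded memory; cells default to 0
    pc = 0                           # index of the next instruction
    for _ in range(max_steps):
        op, *args = program[pc]
        pc += 1
        if op == "LOADI":            # r[a] <- constant c
            a, c = args
            r[a] = c
        elif op == "LOAD":           # data movement: r[a] <- mem[r[b]]
            a, b = args
            r[a] = mem[r[b]]
        elif op == "STORE":          # data movement: mem[r[b]] <- r[a]
            a, b = args
            mem[r[b]] = r[a]
        elif op == "ADD":            # computation: r[a] <- r[b] + r[c]
            a, b, c = args
            r[a] = r[b] + r[c]
        elif op == "JLT":            # control flow: jump to target if r[a] < r[b]
            a, b, target = args
            if r[a] < r[b]:
                pc = target
        elif op == "HALT":
            return r, mem
    raise RuntimeError("step budget exceeded")

# Example program: write the value i into memory cell i for i = 0, ..., 4.
program = [
    ("LOADI", 0, 0),   # r0 = 0   (loop counter, also used as the address)
    ("LOADI", 1, 5),   # r1 = 5   (loop bound)
    ("LOADI", 2, 1),   # r2 = 1   (increment)
    ("STORE", 0, 0),   # mem[r0] = r0
    ("ADD", 0, 0, 2),  # r0 = r0 + 1
    ("JLT", 0, 1, 3),  # if r0 < r1, go back to the STORE instruction
    ("HALT",),
]
registers, memory = run_ram(program)
```

Note that the `STORE` instruction reaches cell `r0` in a single step, no matter how far it is from the previously accessed cell; this is exactly what a Turing machine head cannot do.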
We will not give a formal definition of RAM Machines, though the bibliographical notes section (othermodelsbibnotes{.ref}) contains sources for such definitions. Just as the NAND-TM programming language models Turing machines, we can also define a NAND-RAM programming language that models RAM machines. The NAND-RAM programming language extends NAND-TM by adding the following features:
- The variables of NAND-RAM are allowed to be (non-negative) integer valued rather than only Boolean as is the case in NAND-TM. That is, a scalar variable `foo` holds a non-negative integer in $\N$ (rather than only a bit in $\{0,1\}$), and an array variable `Bar` holds an array of integers. As in the case of RAM machines, we will not allow integers of unbounded size. Concretely, each variable holds a number between $0$ and $T-1$, where $T$ is the number of steps that have been executed by the program so far. (You can ignore this restriction for now: if we want to hold larger numbers, we can simply execute dummy instructions; it will be useful in later chapters.)
- We allow indexed access to arrays. If `foo` is a scalar and `Bar` is an array, then `Bar[foo]` refers to the location of `Bar` indexed by the value of `foo`. (Note that this means we don't need to have a special index variable `i` anymore.)
- As is often the case in programming languages, we will assume that for Boolean operations such as `NAND`, a zero valued integer is considered as false, and a non-zero valued integer is considered as true.
- In addition to `NAND`, NAND-RAM also includes all the basic arithmetic operations of addition, subtraction, multiplication, (integer) division, as well as comparisons (equal, greater than, less than, etc.).
- NAND-RAM includes conditional statements `if`/`then` as part of the language.
- NAND-RAM contains looping constructs such as `while` and `do` as part of the language.
A full description of the NAND-RAM programming language is in the appendix. However, the most important fact you need to know about NAND-RAM is that you actually don't need to know much about NAND-RAM at all, since it is equivalent in power to Turing machines:
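For every function $F:\{0,1\}^* \rightarrow \{0,1\}^*$, $F$ is computable by a NAND-TM program if and only if $F$ is computable by a NAND-RAM program.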
Since NAND-TM programs are equivalent to Turing machines, and NAND-RAM programs are equivalent to RAM machines, RAMTMequivalencethm{.ref} shows that all four of these models are equivalent to one another.
::: {.proofidea data-ref="RAMTMequivalencethm"}
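Clearly NAND-RAM is only more powerful than NAND-TM, and so if a function $F$ is computable by a NAND-TM program then it can be computed by a NAND-RAM program. The challenging direction is to transform a NAND-RAM program into an equivalent NAND-TM program.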
This can be done but going over all the operations in detail is rather tedious. Hence we will focus on describing the main ideas behind this transformation. (See also nandramoverviewfig{.ref}.)
NAND-RAM generalizes NAND-TM in two main ways: (a) adding indexed access to the arrays (i.e., `Foo[bar]` syntax) and (b) moving from Boolean valued variables to integer valued ones.
The transformation has three steps:
- Indexed access of bit arrays: We start by showing how to handle (a). Namely, we show how we can implement in NAND-TM the operation `Setindex(Bar)` such that if `Bar` is an array that encodes some integer $j$, then after executing `Setindex(Bar)` the value of `i` will be equal to $j$. This will allow us to simulate syntax of the form `Foo[Bar]` by `Setindex(Bar)` followed by `Foo[i]`.
- Two dimensional bit arrays: We then show how we can use "syntactic sugar" to augment NAND-TM with two dimensional arrays. That is, have two indices `i` and `j` and two dimensional arrays, such that we can use the syntax `Foo[i][j]` to access the (`i`,`j`)-th location of `Foo`.
- Arrays of integers: Finally we will encode a one dimensional array `Arr` of integers by a two dimensional array `Arrbin` of bits. The idea is simple: if $a_{i,0},\ldots,a_{i,\ell}$ is a binary (prefix-free) representation of `Arr[`$i$`]`, then `Arrbin[`$i$`][`$j$`]` will be equal to $a_{i,j}$.
Once we have arrays of integers, we can use our usual syntactic sugar for functions, `GOTO` etc. to implement the arithmetic and control flow operations of NAND-RAM.
:::
The above approach is not the only way to obtain a proof of RAMTMequivalencethm{.ref}; see, for example, RAMTMalternativeex{.ref}.
::: {.remark title="RAM machines / NAND-RAM and assembly language (optional)" #NANDRAMassembly} RAM machines correspond quite closely to actual microprocessors such as those in the Intel x86 series, which also contain a large primary memory and a constant number of small registers. This is of course no accident: RAM machines aim at modeling more closely than Turing machines the architecture of actual computing systems, which largely follows the so called von Neumann architecture as described in the report [@vonNeumann45]. As a result, NAND-RAM is similar in its general outline to assembly languages such as x86 or MIPS. These assembly languages all have instructions to (1) move data between registers and memory, (2) perform arithmetic or logical computations on registers, and (3) perform conditional execution and loops ("if" and "goto", commonly known as "branches" and "jumps" in the context of assembly languages).
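The main difference between RAM machines and actual microprocessors (and correspondingly between NAND-RAM and assembly languages) is that actual microprocessors have a fixed word size $w$, so that all registers and memory cells hold numbers in $\{0,\ldots,2^w-1\}$ (equivalently, strings in $\{0,1\}^w$).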
Of course actual microprocessors have many features not shared with RAM machines as well, including parallelism, memory hierarchies, and many others.
However, RAM machines do capture actual computers to a first approximation and so (as we will see), the running time of an algorithm on a RAM machine (e.g.,
We do not show the full formal proof of RAMTMequivalencethm{.ref} but focus on the most important parts: implementing indexed access, and simulating two dimensional arrays with one dimensional ones. Even these are already quite tedious to describe, as will not be surprising to anyone who has ever written a compiler. Hence you can feel free to merely skim this section. The important point is not for you to know all details by heart but to be convinced that in principle it is possible to transform a NAND-RAM program to an equivalent NAND-TM program, and even be convinced that, with sufficient time and effort, you could do it if you wanted to.
In NAND-TM we can only access our arrays in the position of the index variable `i`, while NAND-RAM has integer-valued variables and can use them for indexed access to arrays, of the form `Foo[bar]`.
To implement indexed access in NAND-TM, we will encode integers in our arrays using some prefix-free representation (see prefixfreesec{.ref}), and then have a procedure `Setindex(Bar)` that sets `i` to the value encoded by `Bar`.
We can simulate the effect of `Foo[Bar]` using `Setindex(Bar)` followed by `Foo[i]`.
Implementing `Setindex(Bar)` can be achieved as follows:
- We initialize an array `Atzero` such that `Atzero[`$0$`]` $=1$ and `Atzero[`$j$`]` $=0$ for all $j>0$. (This can be easily done in NAND-TM as all uninitialized variables default to zero.)
- Set `i` to zero, by decrementing it until we reach the point where `Atzero[i]` $=1$.
- Let `Temp` be an array encoding the number $0$.
- We use `GOTO` to simulate an inner loop of the form: while `Temp` $\neq$ `Bar`, increment `Temp` (and advance `i` by one alongside it).
- At the end of the loop, `i` is equal to the value encoded by `Bar`.
In NAND-TM code (using some syntactic sugar), we can implement the above operations as follows:
# assume Atzero is an array such that Atzero[0]=1
# and Atzero[j]=0 for all j>0
# set i to 0.
LABEL("zero_idx")
dir0 = zero
dir1 = one
# corresponds to i <- i-1
GOTO("zero_idx",NOT(Atzero[i]))
...
# zero out temp
#(code below assumes a specific prefix-free encoding in which 10 is the "end marker")
Temp[0] = 1
Temp[1] = 0
# set i to Bar, assume we know how to increment, compare
LABEL("increment_temp")
cond = EQUAL(Temp,Bar)
dir0 = one
dir1 = one
# corresponds to i <- i+1
INC(Temp)
GOTO("increment_temp",NOT(cond))
# if we reach this point, i is number encoded by Bar
...
# final instruction of program
MODANDJUMP(dir0,dir1)
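The following Python sketch mirrors the `Setindex` procedure at a higher level, to show why it works even though the index can only move one step at a time. It is a simulation under simplifying assumptions, not NAND-TM code: plain Python integers stand in for the arrays `Bar` and `Temp`, the function `atzero` stands in for the array `Atzero`, and the step counter is added only to expose the cost.

```python
def atzero(j):
    """Stands in for the array Atzero: 1 at position 0 and 0 elsewhere."""
    return 1 if j == 0 else 0

def setindex(bar_value, i):
    """Move the index i to bar_value using only unit steps.

    Returns the final index together with the number of unit moves made.
    """
    steps = 0
    # Phase 1: rewind i to 0, decrementing until Atzero[i] == 1.
    while atzero(i) != 1:
        i -= 1
        steps += 1
    # Phase 2: count up from 0; temp plays the role of the array Temp.
    temp = 0
    while temp != bar_value:
        temp += 1
        i += 1
        steps += 1
    return i, steps

final_index, moves = setindex(5, 12)
```

Starting from index 12 and targeting the value 5, the sketch makes 12 moves to rewind and 5 moves to count up. This linear cost is the price NAND-TM pays for what NAND-RAM does in a single indexed access.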
To implement two dimensional arrays, we want to embed them in a one dimensional array.
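The idea is that we come up with a one to one function $embed:\N \times \N \rightarrow \N$, and so embed the location $(i,j)$ of the two dimensional array `Two` in the location $embed(i,j)$ of the one dimensional array `One`.
Since the set $\N \times \N$ seems "much bigger" than the set $\N$, a priori it might not be clear that such a one to one mapping exists, but one can be constructed explicitly; for example, the map $embed(x,y) = \tfrac{1}{2}(x+y)(x+y+1) + x$ would do.
pair-ex{.ref} asks you to prove that $embed$ is indeed one to one, as well as computable by a NAND-TM program. This means that we can replace code of the form `Two[Foo][Bar] = something` (i.e., access the two dimensional array `Two` at the integers encoded by the one dimensional arrays `Foo` and `Bar`) by code of the form: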
Blah = embed(Foo,Bar)
Setindex(Blah)
Two[i] = something
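As an illustration of what a function like `embed` could look like, the sketch below uses the Cantor pairing function, a standard one to one map from pairs of natural numbers to natural numbers. Treating this particular map as the one behind `embed` is an assumption of the sketch, and `unembed` is a hypothetical helper added only to check that the map can be inverted.

```python
from math import isqrt

def embed(x, y):
    """Cantor pairing: a one to one map from pairs of naturals to naturals."""
    return (x + y) * (x + y + 1) // 2 + x

def unembed(z):
    """Inverse of embed; recovers the pair (x, y) from z."""
    s = (isqrt(8 * z + 1) - 1) // 2   # s = x + y, the diagonal that contains z
    x = z - s * (s + 1) // 2          # offset of z within that diagonal
    return x, s - x

# A 3x3 corner of a two dimensional array stored in a one dimensional dict.
one = {}
for i in range(3):
    for j in range(3):
        one[embed(i, j)] = f"Two[{i}][{j}]"
```

The map enumerates pairs diagonal by diagonal, so distinct pairs never collide. It uses only addition, multiplication, and division by two, which is why computing it fits within the arithmetic that the text says can be implemented in NAND-TM.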
Once we have two dimensional arrays and indexed access, simulating NAND-RAM with NAND-TM is just a matter of implementing the standard algorithms for arithmetic operations and comparisons in NAND-TM.
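While this is cumbersome, it is not difficult, and the end result is to show that every NAND-RAM program can be simulated by an equivalent NAND-TM program, thus completing the proof of RAMTMequivalencethm{.ref}.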
::: {.remark title="Recursion in NAND-RAM (advanced)" #recursion} One concept that appears in many programming languages but that we did not include in NAND-RAM programs is recursion. However, recursion (and function calls in general) can be implemented in NAND-RAM using the stack data structure. A stack is a data structure containing a sequence of elements, where we can "push" elements into it and "pop" them from it in "first in last out" order.
We can implement a stack using an array of integers `Stack` and a scalar variable `stackpointer` that will be the number of items in the stack.
We implement `push(foo)` by
Stack[stackpointer]=foo
stackpointer += one
and implement `bar = pop()` by
stackpointer -= one
bar = Stack[stackpointer]
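We implement a function call to $F$ by pushing the arguments for $F$ onto the stack. The code of $F$ pops the arguments from the stack, performs the computation (which might involve further calls), and then pushes its return value onto the stack.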
The fact that we can implement recursion using a non-recursive language is not surprising.
Indeed, machine languages typically do not have recursion (or function calls in general), and hence a compiler implements function calls using a stack and `GOTO`.
You can find online tutorials on how recursion is implemented via stack in your favorite programming language, whether it's Python, JavaScript, or Lisp/Scheme.
:::
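As a small illustration of replacing recursion by an explicit stack, the following Python sketch computes the factorial function without any recursive call. The array `stack` and the counter `stackpointer` follow the description in the remark; the choice of factorial and the helper names `push` and `pop` are assumptions made for the example.

```python
def factorial_with_stack(n):
    """Compute n! without recursive calls, using an explicit stack."""
    stack = [0] * (n + 1)    # array of integers backing the stack
    stackpointer = 0         # number of items currently on the stack

    def push(value):
        nonlocal stackpointer
        stack[stackpointer] = value
        stackpointer += 1

    def pop():
        nonlocal stackpointer
        stackpointer -= 1            # the top item lives at stackpointer - 1
        return stack[stackpointer]

    # "Call" phase: each pending call fact(k) records its argument k.
    k = n
    while k > 0:
        push(k)
        k -= 1
    # "Return" phase: unwind the pending calls, most recent first.
    result = 1
    while stackpointer > 0:
        result *= pop()
    return result
```

The two `while` loops could themselves be written with `GOTO`, so nothing in this sketch relies on the host language supporting recursion.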
Any of the standard programming languages such as `C`, `Java`, `Python`, `Pascal`, `Fortran` have very similar operations to NAND-RAM.
(Indeed, ultimately they can all be executed by machines which have a fixed number of registers and a large memory array.)
Hence using RAMTMequivalencethm{.ref}, we can simulate any program in such a programming language by a NAND-TM program.
In the other direction, it is a fairly easy programming exercise to write an interpreter for NAND-TM in any of the above programming languages.
Hence we can also simulate NAND-TM programs (and so by TM-equiv-thm{.ref}, Turing machines) using these programming languages.
This property of being equivalent in power to Turing machines / NAND-TM is called Turing Equivalent (or sometimes Turing Complete).
Thus all programming languages we are familiar with are Turing equivalent.^[Some programming languages have fixed (even if extremely large) bounds on the amount of memory they can access, which formally prevent them from being applicable to computing infinite functions and hence simulating Turing machines. We ignore such issues in this discussion and assume access to some storage device without a fixed upper bound on its capacity.]
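To give a sense of how short such an interpreter can be, here is a minimal Python sketch of a simulator for one-tape Turing machines (rather than for NAND-TM itself). The transition-table format, the state names `q0` and `halt`, the blank symbol `_`, and the step budget are all assumptions of this sketch.

```python
def run_turing_machine(delta, tape_input, start="q0", halt="halt",
                       blank="_", max_steps=10_000):
    """Simulate a one-tape Turing machine and return the final tape contents.

    delta maps (state, symbol) to (new_state, new_symbol, move),
    with move one of "L", "R", "S".
    """
    tape = dict(enumerate(tape_input))   # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state == halt:
            last = max(tape, default=-1)
            cells = [tape.get(p, blank) for p in range(last + 1)]
            return "".join(cells).rstrip(blank)
        symbol = tape.get(head, blank)
        state, tape[head], move = delta[(state, symbol)]
        if move == "R":
            head += 1
        elif move == "L":
            head = max(head - 1, 0)      # the tape extends only to the right
    raise RuntimeError("step budget exceeded")

# Example machine: flip every bit of the input, then halt at the first blank.
flip = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "_"): ("halt", "_", "S"),
}
output = run_turing_machine(flip, "0110")
```

The step budget is there only so that the sketch always terminates; a faithful simulator would run for as long as the simulated machine does.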
The equivalence between Turing machines and RAM machines allows us to choose the most convenient language for the task at hand:
- When we want to prove a theorem about all programs/algorithms, we can use Turing machines (or NAND-TM) since they are simpler and easier to analyze. In particular, if we want to show that a certain function cannot be computed, then we will use Turing machines.
- When we want to show that a function can be computed we can use RAM machines or NAND-RAM, because they are easier to program in and correspond more closely to high level programming languages we are used to. In fact, we will often describe NAND-RAM programs in an informal manner, trusting that the reader can fill in the details and translate the high level description to the precise program. (This is just like the way people typically use informal or "pseudocode" descriptions of algorithms, trusting that their audience will know to translate these descriptions to code if needed.)
Our usage of Turing machines / NAND-TM and RAM Machines / NAND-RAM is very similar to the way people use high and low level programming languages in practice. When one wants to produce a device that executes programs, it is convenient to do so for a very simple and "low level" programming language. When one wants to describe an algorithm, it is convenient to use as high level a formalism as possible.
::: { .bigidea #eatandhavecake } Using equivalence results such as those between Turing and RAM machines, we can "have our cake and eat it too".
We can use a simpler model such as Turing machines when we want to prove something can't be done, and use a feature-rich model such as RAM machines when we want to prove something can be done. :::
"The programmer is in the unique position that ... he has to be able to think in terms of conceptual hierarchies that are much deeper than a single mind ever needed to face before.", Edsger Dijkstra, "On the cruelty of really teaching computing science", 1988.
At some point in any theory of computation course, the instructor and students need to have the talk. That is, we need to discuss the level of abstraction in describing algorithms. In algorithms courses, one typically describes algorithms in English, assuming readers can "fill in the details" and would be able to convert such an algorithm into an implementation if needed. For example, bfsalghighlevel{.ref} is a high level description of the breadth first search algorithm.
Input: Graph $G$, vertices $u,v$
Output: "connected" if $u$ is connected to $v$ in $G$, "disconnected" otherwise
Initialize empty queue $Q$.
Put $u$ in $Q$
While{$Q$ is not empty}
Remove top vertex $w$ from $Q$
If{$w=v$} return "connected" endif
Mark $w$
Add all unmarked neighbors of $w$ to $Q$.
Endwhile
Return "disconnected"
If we wanted to give more details on how to implement breadth first search in a programming language such as Python or C (or NAND-RAM / NAND-TM for that matter), we would describe how we implement the queue data structure using an array, and similarly how we would use arrays to mark vertices. We call such an "intermediate level" description an implementation level or pseudocode description. Finally, if we want to describe the implementation precisely, we would give the full code of the program (or another fully precise representation, such as in the form of a list of tuples). We call this a formal or low level description.
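For comparison, here is a sketch of what a low level description of breadth first search might look like in Python, with the queue and the marks both stored in plain arrays. The adjacency-list input format (a mapping from each vertex in `range(n)` to its list of neighbors) is an assumption of the sketch. Unlike the high level description, it marks a vertex when it enters the queue, so that no vertex is queued twice and an array of length `n` suffices.

```python
def connected(graph, u, v):
    """Breadth first search from u; reports whether v is reachable."""
    n = len(graph)
    marked = [False] * n      # array used to mark vertices
    queue = [0] * n           # array-backed queue; each vertex enters at most once
    head = tail = 0           # the queue contents are queue[head:tail]
    queue[tail] = u
    tail += 1
    marked[u] = True
    while head < tail:        # while the queue is not empty
        w = queue[head]       # remove the top vertex
        head += 1
        if w == v:
            return "connected"
        for x in graph[w]:
            if not marked[x]:
                marked[x] = True
                queue[tail] = x
                tail += 1
    return "disconnected"

graph = {0: [1], 1: [0, 2], 2: [1], 3: []}
```

Here `connected(graph, 0, 2)` follows the path through vertex 1, while vertex 3 has no edges and so is reachable only from itself.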
While we started off by describing NAND-CIRC, NAND-TM, and NAND-RAM programs at the full formal level, as we progress in this book we will move to implementation and high level description. After all, our goal is not to use these models for actual computation, but rather to analyze the general phenomenon of computation. That said, if you don't understand how the high level description translates to an actual implementation, going "down to the metal" is often an excellent exercise. One of the most important skills for a computer scientist is the ability to move up and down hierarchies of abstractions.
A similar distinction applies to the notion of representation of objects as strings.
Sometimes, to be precise, we give a low level specification of exactly how an object maps into a binary string.
For example, we might describe an encoding of
Finally, because translating between the various representations of graphs (and objects in general) can be done via a NAND-RAM (and hence a NAND-TM) program, when talking at a high level we also suppress discussion of representation altogether.
For example, the fact that graph connectivity is a computable function is true regardless of whether we represent graphs as adjacency lists, adjacency matrices, list of edge-pairs, and so on and so forth.
Hence, in cases where the precise representation doesn't make a difference, we would often talk about our algorithms as taking as input an object
Defining "Algorithms". Up until now we have used the term "algorithm" informally. However, Turing machines and the range of equivalent models yield a way to precisely and formally define algorithms. Hence whenever we refer to an algorithm in this book, we will mean that it is an instance of one of the Turing equivalent models, such as Turing machines, NAND-TM, RAM machines, etc. Because of the equivalence of all these models, in many contexts, it will not matter which of these we use.
A computational model is some way to define what it means for a program (which is represented by a string) to compute a (partial) function.
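A computational model $\mathcal{M}$ is Turing complete if we can map every Turing machine (or equivalently NAND-TM program) into a program for $\mathcal{M}$ that computes the same function. It is Turing equivalent if the other direction holds as well, that is, if we can map every program for $\mathcal{M}$ to a Turing machine that computes the same function.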
::: {.definition title="Turing completeness and equivalence (optional)" #turingcompletedef}
Let
We say that a program
A computational model
A computational model
Some examples of Turing equivalent models (some of which we have already seen, and some are discussed below) include:
- Turing machines
- NAND-TM programs
- NAND-RAM programs
- λ calculus
- Game of life (mapping programs and inputs/outputs to starting and ending configurations)
- Programming languages such as Python/C/JavaScript/OCaml... (allowing for unbounded storage)
Many physical systems can be described as consisting of a large number of elementary components that interact with one another. One way to model such systems is using cellular automata. This is a system that consists of a large (or even infinite) number of cells. Each cell only has a constant number of possible states. At each time step, a cell updates to a new state by applying some simple rule to the state of itself and its neighbors.
A canonical example of a cellular automaton is Conway's Game of Life.
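In this automaton the cells are arranged in an infinite two dimensional grid.
Each cell has only two states: "dead" (which we can encode as $0$) or "alive" (which we can encode as $1$). The state of a cell in the next step depends on its previous state and the states of its eight vertical, horizontal, and diagonal neighbors: a dead cell becomes alive only if exactly three of its neighbors are alive, and a live cell stays alive only if it has two or three live neighbors.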
{#onetwodimcellularautomatafig}
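The standard rules of Conway's Game of Life (a dead cell with exactly three live neighbors becomes alive, and a live cell survives if it has two or three live neighbors) fit in a few lines of Python. The sketch below represents a configuration by the set of coordinates of its live cells, which is one convenient choice among many.

```python
from collections import Counter

def life_step(alive):
    """One step of the Game of Life; alive is a set of (row, col) live cells."""
    # For every cell, count how many of its eight neighbors are alive.
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in alive
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 live neighbors; survival on 2 or 3.
    return {
        cell
        for cell, k in neighbor_counts.items()
        if k == 3 or (k == 2 and cell in alive)
    }

blinker = {(0, -1), (0, 0), (0, 1)}    # a horizontal bar of three live cells
```

Applying `life_step` to the blinker turns the horizontal bar into a vertical one, and a second application turns it back.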
Since the cells in the game of life are arranged in an infinite two-dimensional grid, it is an example of a two dimensional cellular automaton. We can also consider the even simpler setting of a one dimensional cellular automaton, where the cells are arranged in an infinite line, see onetwodimcellularautomatafig{.ref}. It turns out that even this simple model is enough to achieve Turing-completeness. We will now formally define one-dimensional cellular automata and then prove their Turing completeness.
::: {.definition title="One dimensional cellular automata" #cellautomatadef}
Let
A configuration of the automaton
Finite configuration. We say that a configuration of an automaton
We can write a program (for example using NAND-RAM) that simulates the evolution of any cellular automaton from an initial finite configuration by simply storing the values of the cells with state not equal to $\varnothing$ and repeatedly applying the transition rule to update them.
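A minimal Python sketch of such a simulation for the one dimensional case follows. It assumes that the transition rule is given as a function of a cell's left neighbor, the cell itself, and its right neighbor, and that three blank cells always produce a blank cell; the name `BLANK`, the dictionary representation, and the example rule are illustrative choices.

```python
BLANK = "_"   # stands in for the blank state

def ca_step(rule, config):
    """One step of a one dimensional cellular automaton.

    config maps positions to non-blank states; every other cell is blank.
    rule maps a (left, middle, right) triple to the new middle state.
    """
    def get(i):
        return config.get(i, BLANK)

    # Only cells within distance 1 of a non-blank cell can become non-blank.
    candidates = {i + d for i in config for d in (-1, 0, 1)}
    new_config = {}
    for i in candidates:
        state = rule(get(i - 1), get(i), get(i + 1))
        if state != BLANK:
            new_config[i] = state
    return new_config

def rule90(left, mid, right):
    """Example rule: a cell becomes "1" iff exactly one of its neighbors is "1"."""
    return "1" if (left == "1") != (right == "1") else BLANK

first = ca_step(rule90, {0: "1"})
second = ca_step(rule90, first)
```

Because only finitely many cells are stored, each step takes time proportional to the number of non-blank cells, even though the automaton itself is infinite.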
In fact, even one dimensional cellular automata can be Turing complete:
::: {.theorem title="One dimensional automata are Turing complete" #onedimcathm}
For every Turing machine
To make the notion of "simulating a Turing machine" more precise we will need to define configurations of Turing machines. We will do so in turingmachinesconfigsec{.ref} below, but at a high level a configuration of a Turing machine is a string that encodes its full state at a given step in its computation. That is, the contents of all (non-empty) cells of its tape, its current state, as well as the head position.
The key idea in the proof of onedimcathm{.ref} is that at every point in the computation of a Turing machine
To turn the above ideas into a rigorous proof (and even statement!) of onedimcathm{.ref} we will need to precisely define the notion of configurations of Turing machines. This notion will be useful for us in later chapters as well.
::: {.definition title="Configuration of Turing Machines." #configtmdef}
Let
A configuration
- $M$'s tape contains $\alpha_{j,0}$ for all $j<|\alpha|$ and contains $\varnothing$ for all positions that are at least $|\alpha|$, where we let $\alpha_{j,0}$ be the value $\sigma$ such that $\alpha_j = (\sigma,t)$ with $\sigma \in \Sigma$ and $t \in \{\cdot\} \cup [k]$. (In other words, since $\alpha_j$ is a pair of an alphabet symbol $\sigma$ and either a state in $[k]$ or the symbol $\cdot$, $\alpha_{j,0}$ is the first component $\sigma$ of this pair.)
- $M$'s head is in the unique position $i$ for which $\alpha_i$ has the form $(\sigma,s)$ for $s\in [k]$, and $M$'s state is equal to $s$. :::
::: { .pause } configtmdef{.ref} has some technical details, but is not actually that deep or complicated. Try to take a moment to stop and think how you would encode as a string the state of a Turing machine at a given point in an execution.
Think what are all the components that you need to know in order to be able to continue the execution from this point onwards, and what is a simple way to encode them using a list of finite symbols.
In particular, with an eye towards our future applications, try to think of an encoding which will make it as simple as possible to map a configuration at step $t$ to the configuration at step $t+1$. :::
configtmdef{.ref} is a little cumbersome, but ultimately a configuration is simply a string that encodes a snapshot of the Turing machine at a given point in the execution. (In operating-systems lingo, it is a "core dump".) Such a snapshot needs to encode the following components:
- The current head position.
- The full contents of the large scale memory, that is the tape.
- The contents of the "local registers", that is the state of the machine.
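The three components above can be packed into a single list, in the spirit of configtmdef{.ref}: one pair per tape cell, holding the cell's symbol together with either the machine's state (at the head position) or a placeholder (everywhere else). The Python sketch below uses `None` as the placeholder; the transition-table format and the example machine are assumptions made for illustration.

```python
def next_config(delta, config, blank="_"):
    """Map a Turing machine configuration to the configuration one step later.

    config is a list of (symbol, tag) pairs, one per tape cell, where tag is
    the machine's state at the head position and None everywhere else.
    delta maps (state, symbol) to (new_state, new_symbol, move),
    with move one of "L", "R", "S".
    """
    head = next(j for j, (_, tag) in enumerate(config) if tag is not None)
    symbol, state = config[head]
    new_state, new_symbol, move = delta[(state, symbol)]
    new_head = max(head + {"L": -1, "R": 1, "S": 0}[move], 0)
    cells = [sym for sym, _ in config]
    cells[head] = new_symbol
    if new_head == len(cells):        # the head moved onto a fresh blank cell
        cells.append(blank)
    return [(sym, new_state if j == new_head else None)
            for j, sym in enumerate(cells)]

# Example machine: flip each bit while moving right.
flip = {
    ("q0", "0"): ("q0", "1", "R"),
    ("q0", "1"): ("q0", "0", "R"),
    ("q0", "_"): ("halt", "_", "S"),
}
c0 = [("0", "q0"), ("1", None)]
c1 = next_config(flip, c0)
c2 = next_config(flip, c1)
```

In this sketch, the only entries that differ between a configuration and its successor are at the old head position and the new head position, which are adjacent or identical.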
The precise details of how we encode a configuration are not important, but we do want to record the following simple fact:
Let
(For simplicity of notation, above we use the convention that if
Completing the proof of onedimcathm{.ref}. We can now restate onedimcathm{.ref} more formally, and complete its proof:
::: {.theorem title="One dimensional automata are Turing complete (formal statement)" #onedimcathmformal}
For every Turing machine
::: {.proof data-ref="onedimcathmformal"}
We consider the element
The automaton arising from the proof of onedimcathmformal{.ref} has a large alphabet, and furthermore one whose size that depends on the machine
::: {.remark title="Configurations of NAND-TM programs" #nandtmprogconfig} We can use the same approach as configtmdef{.ref} to define configurations of a NAND-TM program. Such a configuration will need to encode:
- The current value of the variable `i`.
- For every scalar variable `foo`, the value of `foo`.
- For every array variable `Bar`, the value `Bar[`$j$`]` for every $j \in \{0,\ldots, t-1\}$ where $t-1$ is the largest value that the index variable `i` ever achieved in the computation. :::
The λ calculus is another way to define computable functions. It was proposed by Alonzo Church in the 1930's around the same time as Alan Turing's proposal of the Turing machine. Interestingly, while Turing machines are not used for practical computation, the λ calculus has inspired functional programming languages such as LISP, ML and Haskell, and indirectly the development of many other programming languages as well. In this section we will present the λ calculus and show that its power is equivalent to NAND-TM programs (and hence also to Turing machines). Our Github repository contains a Jupyter notebook with a Python implementation of the λ calculus that you can experiment with to get a better feel for this topic.
The λ operator.
At the core of the λ calculus is a way to define "anonymous" functions.
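For example, instead of giving a name $f$ to a function and defining it as
$$f(x) = x\times x$$
we can write it as
$$\lambda x. x\times x$$
and so $(\lambda x.x\times x)(7)=49$. In Python we can define the squaring function using `lambda x: x*x` while in JavaScript we can use `x => x*x` or `(x) => x*x`. In Scheme we would define it as `(lambda (x) (* x x))`.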
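Python's `lambda` syntax makes it easy to experiment with anonymous functions directly. The short sketch below applies the squaring function without naming it, and then builds a function that takes another function as its argument; the names `square`, `twice`, `nine`, and `eighty_one` are introduced only for this example.

```python
# The anonymous squaring function, applied directly to an argument.
nine = (lambda x: x * x)(3)

# Binding it to a name is optional and does not change what it computes.
square = lambda x: x * x

# Functions are values: twice takes a function f and returns x -> f(f(x)).
twice = lambda f: (lambda x: f(f(x)))
eighty_one = twice(square)(3)      # square(square(3))
```

The function `twice` never inspects its argument beyond calling it, so it works equally well with any one-argument function.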
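Clearly, the name of the argument to a function doesn't matter, and so $\lambda y.y\times y$ is the same as $\lambda x.x \times x$, as both correspond to the squaring function.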
Dropping parentheses. To reduce notational clutter, when writing
A key feature of the λ calculus is that functions are "first-class objects" in the sense that we can use functions as arguments to other functions. For example, can you guess what number is the following expression equal to?
::: { .pause } The expression lambdaexampleeq{.eqref} might seem daunting, but before you look at the solution below, try to break it apart to its components, and evaluate each component at a time. Working out this example would go a long way toward understanding the λ calculus. :::
Let's evaluate lambdaexampleeq{.eqref} one step at a time.
As nice as it is for the λ calculus to allow anonymous functions, adding names can be very helpful for understanding complicated expressions.
So, let us write
Therefore lambdaexampleeq{.eqref} becomes $$ ((F \; g)\; 3) \;. $$
On input a function
::: {.solvedexercise #lambdaexptwoex} What number does the following expression evaluate to?
::: {.solution data-ref="lambdaexptwoex"}
In a λ expression of the form
maps
In particular, if we invoke the function eqlambdaexampleone{.eqref} on
We now provide a formal description of the λ calculus.
We start with "basic expressions" that contain a single variable such as
::: {.definition title="λ expression." #lambdaexpdef}
A λ expression is either a single variable identifier or an expression $e$ of one of the following forms:
- Application: $e = (e' \; e'')$, where $e'$ and $e''$ are λ expressions.
- Abstraction: $e = \lambda x.(e')$ where $e'$ is a λ expression. :::
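The recursive structure of this definition translates directly into a recursive data type. The Python sketch below is one possible representation; the class names `Var`, `App`, and `Abs` and the helpers `show` and `free_vars` are assumptions of the sketch rather than notation used elsewhere in this chapter.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:            # a single variable identifier
    name: str

@dataclass(frozen=True)
class App:            # application (e' e'')
    func: object
    arg: object

@dataclass(frozen=True)
class Abs:            # abstraction λx.(e')
    param: str
    body: object

def show(e):
    """Render an expression with full parentheses."""
    if isinstance(e, Var):
        return e.name
    if isinstance(e, App):
        return f"({show(e.func)} {show(e.arg)})"
    return f"λ{e.param}.({show(e.body)})"

def free_vars(e):
    """Variables of e that are not bound by an enclosing abstraction."""
    if isinstance(e, Var):
        return {e.name}
    if isinstance(e, App):
        return free_vars(e.func) | free_vars(e.arg)
    return free_vars(e.body) - {e.param}

# The expression (λx.((x y)) z): x is bound, while y and z are free.
example = App(Abs("x", App(Var("x"), Var("y"))), Var("z"))
```

Both helpers recurse on exactly the cases of the definition (a variable, an application, or an abstraction), which is the usual pattern for functions over recursively defined objects.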
lambdaexpdef{.ref} is a recursive definition since we defined the concept of λ expressions in terms of itself.
This might seem confusing at first, but in fact you have known recursive definitions since you were an elementary school student.
Consider how we define an arithmetic expression: it is an expression that is either just a number, or has one of the forms
Free and bound variables. Variables in a λ expression can either be free or bound to a
Precedence and parentheses. We will use the following rules to allow us to drop some parentheses.
Function application associates from left to right, and so
Equivalence of λ expressions. As we have seen in lambdaexptwoex{.ref}, the rule that
::: {.definition title="Equivalence of λ expressions" #lambdaequivalence} Two λ expressions are equivalent if they can be made into the same expression by repeated applications of the following rules:

- Evaluation (aka $\beta$ reduction): The expression $(\lambda x.exp)\; exp'$ is equivalent to $exp[x \rightarrow exp']$.

- Variable renaming (aka $\alpha$ conversion): The expression $\lambda x.exp$ is equivalent to $\lambda y.exp[x \rightarrow y]$.
:::
If
$$ (\lambda x.f)\;(\lambda y.g\; z) \;. \label{lambdaexpeq} $$
There are two natural conventions for this:

- Call by name (aka "lazy evaluation"): We evaluate lambdaexpeq{.eqref} by first plugging in the right-hand expression $(\lambda y.g\; z)$ as input to the left-hand side function, obtaining $f[x \rightarrow (\lambda y.g\; z)]$ and then continue from there.

- Call by value (aka "eager evaluation"): We evaluate lambdaexpeq{.eqref} by first evaluating the right-hand side and obtaining $h=g[y \rightarrow z]$, and then plugging this into the left-hand side to obtain $f[x \rightarrow h]$.
Because the λ calculus has only pure functions, which do not have "side effects", in many cases the order does not matter. In fact, it can be shown that if we obtain a definite irreducible expression (for example, a number) in both strategies, then it will be the same one. However, for concreteness we will always use the "call by name" (i.e., lazy evaluation) order. (The same choice is made in the programming language Haskell, though many other programming languages use eager evaluation.) Formally, the evaluation of a λ expression using "call by name" is captured by the following process:
::: {.definition title="Simplification of λ expressions" #simplifylambdadef }
Let

- If $e$ is a single variable $x$ then the simplification of $e$ is $e$.

- If $e$ has the form $e= \lambda x.e'$ then the simplification of $e$ is $\lambda x.f'$ where $f'$ is the simplification of $e'$.

- (Evaluation / $\beta$ reduction.) If $e$ has the form $e=(\lambda x.e' \; e'')$ then the simplification of $e$ is the simplification of $e'[x \rightarrow e'']$, which denotes replacing all copies of $x$ in $e'$ bound to the $\lambda$ operator with $e''$.

- (Renaming / $\alpha$ conversion.) The canonical simplification of $e$ is obtained by taking the simplification of $e$ and renaming the variables so that the first bound variable in the expression is $v_0$, the second one is $v_1$, and so on and so forth.

We say that two λ expressions
::: {.solvedexercise title="Equivalence of λ expressions" #lambdaeuivexer}
Prove that the following two expressions
:::
::: {.solution data-ref="lambdaeuivexer"}
The canonical simplification of
Like Turing machines and NAND-TM programs, the simplification process in the λ calculus can also enter into an infinite loop. For example, consider the λ expression
If we try to simplify lambdainfloopeq{.eqref} by invoking the left-hand function on the right-hand one, then we get another copy of lambdainfloopeq{.eqref} and hence this never ends. There are examples where the order of evaluation can matter for whether or not an expression can be simplified, see evalorderlambdaex{.ref}.
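The effect of evaluation order on an expression such as $(\lambda x.7)\;((\lambda x.xx)\;(\lambda x.xx))$ can be mimicked in Python, which evaluates arguments eagerly. The following is only a rough analogy, assuming we model an unevaluated argument as a zero-argument function (a "thunk"); the names `omega_thunk` and `const7` are ours and not part of the λ calculus.

```python
def omega_thunk():
    # Python analogue of (λx.xx)(λx.xx): evaluating it never finishes,
    # which Python reports as a RecursionError.
    return (lambda x: x(x))(lambda x: x(x))

def const7(arg):
    # Analogue of λx.7: the argument is ignored.
    return 7

# "Call by name": pass the argument unevaluated; it is never needed.
by_name = const7(omega_thunk)

# "Call by value": evaluate the argument first; this step never completes.
try:
    by_value = const7(omega_thunk())
except RecursionError:
    by_value = None  # marks that the eager evaluation did not terminate

print(by_name, by_value)
```

In this analogy the lazy call returns `7`, while the eager call fails before `const7` is ever entered.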
We now discuss the λ calculus as a computational model. We will start by describing an "enhanced" version of the λ calculus that contains some "superfluous features" but is easier to wrap your head around. We will first show how the enhanced λ calculus is equivalent to Turing machines in computational power. Then we will show how all the features of "enhanced λ calculus" can be implemented as "syntactic sugar" on top of the "pure" (i.e., non-enhanced) λ calculus. Hence the pure λ calculus is equivalent in power to Turing machines (and hence also to RAM machines and all other Turing-equivalent models).
The enhanced λ calculus includes the following set of objects and operations:

- Boolean constants and IF function: There are λ expressions $0$, $1$ and $IF$ that satisfy the following conditions: for all λ expressions $e$ and $f$, $IF\; 1\;e\;f = e$ and $IF\;0\;e\;f = f$. That is, $IF$ is the function that given three arguments $a,e,f$ outputs $e$ if $a=1$ and $f$ if $a=0$.

- Pairs: There is a λ expression $PAIR$ which we will think of as the pairing function. For all λ expressions $e,f$, $PAIR\; e\; f$ is the pair $\langle e,f \rangle$ that contains $e$ as its first member and $f$ as its second member. We also have λ expressions $HEAD$ and $TAIL$ that extract the first and second member of a pair respectively. Hence, for all λ expressions $e,f$, $HEAD\; (PAIR \; e\; f) = e$ and $TAIL \; (PAIR \; e\; f) = f$.^[In Lisp, the $PAIR$, $HEAD$ and $TAIL$ functions are traditionally called `cons`, `car` and `cdr`.]

- Lists and strings: There is a λ expression $NIL$ that corresponds to the empty list, which we also denote by $\langle \bot \rangle$. Using $PAIR$ and $NIL$ we construct lists. The idea is that if $L$ is a $k$ element list of the form $\langle e_1, e_2, \ldots, e_k, \bot \rangle$ then for every λ expression $e_0$ we can obtain the $k+1$ element list $\langle e_0,e_1, e_2, \ldots, e_k, \bot \rangle$ using the expression $PAIR\; e_0 \; L$. For example, for every three λ expressions $e,f,g$, the following corresponds to the three element list $\langle e,f,g,\bot \rangle$:
$$ PAIR \; e \; \left(PAIR\; f \; \left( PAIR\; g \; NIL \right) \right) \;. $$
The λ expression

- List operations: The enhanced λ calculus also contains the list-processing functions $MAP$, $REDUCE$, and $FILTER$. Given a list $L= \langle x_0,\ldots,x_{n-1}, \bot \rangle$ and a function $f$, $MAP\; L \; f$ applies $f$ on every member of the list to obtain the new list $L'= \langle f(x_0),\ldots,f(x_{n-1}), \bot \rangle$. Given a list $L$ as above and an expression $f$ whose output is either $0$ or $1$, $FILTER\; L\; f$ returns the list $\langle x_i \rangle_{f x_i = 1}$ containing all the elements of $L$ for which $f$ outputs $1$. The function $REDUCE$ applies a "combining" operation to a list. For example, $REDUCE\; L \; + \; 0$ will return the sum of all the elements in the list $L$. More generally, $REDUCE$ takes a list $L$, an operation $f$ (which we think of as taking two arguments) and a λ expression $z$ (which we think of as the "neutral element" for the operation $f$, such as $0$ for addition and $1$ for multiplication). The output is defined via
$$REDUCE\;L\;f\;z = \begin{cases}z & L=NIL \\ f\;(HEAD\; L) \; (REDUCE\;(TAIL\; L)\;f\;z) & \text{otherwise}\end{cases}\;.$$
See reduceetalfig{.ref} for an illustration of the three list-processing operations.

- Recursion: Finally, we want to be able to execute recursive functions. Since in λ calculus functions are anonymous, we can't write a definition of the form $f(x) = blah$ where $blah$ includes calls to $f$. Instead we use functions $f$ that take an additional input $me$ as a parameter. The operator $RECURSE$ will take such a function $f$ as input and return a "recursive version" of $f$ where all the calls to $me$ are replaced by recursive calls to this function. That is, if we have a function $F$ taking two parameters $me$ and $x$, then $RECURSE\; F$ will be the function $f$ taking one parameter $x$ such that $f(x) = F(f,x)$ for every $x$.
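To get a feel for the list operations above, here is a minimal Python sketch of $PAIR$, $HEAD$, $TAIL$, $MAP$, $FILTER$ and $REDUCE$. It assumes a representation that is ours rather than the λ calculus's: a pair is a Python 2-tuple and $NIL$ is `None`. The definition of `REDUCE` follows the case analysis given above.

```python
NIL = None  # stand-in for the empty list (an assumption of this sketch)

def PAIR(e, f): return (e, f)
def HEAD(p): return p[0]
def TAIL(p): return p[1]
def ISEMPTY(L): return 1 if L is NIL else 0

def MAP(L, f):
    # Apply f to every member of the list.
    return NIL if ISEMPTY(L) else PAIR(f(HEAD(L)), MAP(TAIL(L), f))

def FILTER(L, f):
    # Keep the members on which f outputs 1.
    if ISEMPTY(L):
        return NIL
    rest = FILTER(TAIL(L), f)
    return PAIR(HEAD(L), rest) if f(HEAD(L)) == 1 else rest

def REDUCE(L, f, z):
    # z on the empty list, otherwise f(HEAD L, REDUCE(TAIL L, f, z)).
    return z if ISEMPTY(L) else f(HEAD(L), REDUCE(TAIL(L), f, z))

L = PAIR(1, PAIR(2, PAIR(3, NIL)))            # the list <1,2,3,⊥>
total = REDUCE(L, lambda a, b: a + b, 0)      # sum of the elements
doubled = MAP(L, lambda a: 2 * a)             # <2,4,6,⊥>
odds = FILTER(L, lambda a: a % 2)             # <1,3,⊥>
```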
::: {.solvedexercise title="Compute NAND using λ calculus" #NANDlambdaex}
Give a λ expression
::: {.solution data-ref="NANDlambdaex"}
The
$$ N = \lambda x,y.IF(x,IF(y,0,1),1) $$ :::
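As a sanity check, the expression $N$ can be transcribed into Python and evaluated on all four inputs. This is only a sketch: it assumes the bits $0$ and $1$ are ordinary Python integers and models $IF$ as a three-argument Python function.

```python
def IF(a, e, f):
    # Outputs e if a = 1 and f if a = 0.
    return e if a == 1 else f

# N = λx,y. IF(x, IF(y,0,1), 1)
N = lambda x, y: IF(x, IF(y, 0, 1), 1)

table = {(x, y): N(x, y) for x in (0, 1) for y in (0, 1)}
print(table)
```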
::: {.solvedexercise title="Compute XOR using λ calculus" #XORlambdaex}
Give a λ expression
::: {.solution data-ref="XORlambdaex"} First, we note that we can compute XOR of two bits as follows: $$ NOT = \lambda a. IF(a,0,1) \label{lambdanot} $$ and $$ XOR_2 = \lambda a,b. IF(b,NOT(a),a) \label{lambdaxor} $$
(We are using here a bit of syntactic sugar to describe the functions. To obtain the λ expression for XOR we will simply replace the expression lambdanot{.eqref} in lambdaxor{.eqref}.) Now recursively we can define the XOR of a list as follows:
This means that
That is,
We could have also computed
An enhanced λ expression is obtained by composing the objects above with the application and abstraction rules. The result of simplifying a λ expression is an equivalent expression, and hence if two expressions have the same simplification then they are equivalent.
::: {.definition title="Computing a function via λ calculus" #lambdacomputedef }
Let
We say that $exp$ computes $F$ if for every
where
The basic operations of the enhanced λ calculus more or less amount to the Lisp or Scheme programming languages. Given that, it is perhaps not surprising that the enhanced λ-calculus is equivalent to Turing machines:
For every function
::: {.proofidea data-ref="lambdaturing-thm"}
To prove the theorem, we need to show that (1) if
Showing (1) is fairly straightforward. Applying the simplification rules to a λ expression basically amounts to "search and replace" which we can implement easily in, say, NAND-RAM, or for that matter Python (both of which are equivalent to Turing machines in power). Showing (2) essentially amounts to simulating a Turing machine (or writing a NAND-TM interpreter) in a functional programming language such as LISP or Scheme. We give the details below, but working out how this can be done is a good exercise in mastering some functional programming techniques that are useful in their own right. :::
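To make the "search and replace" view of step (1) concrete, here is a small Python sketch of a simplifier for pure λ expressions. It is one possible implementation and not the one from the online appendix. Its assumptions: expressions are nested tuples built by the hypothetical helpers `Var`, `Lam` and `App`; reduction always picks the leftmost, outermost application (in the spirit of "call by name"); and fresh variable names of the form `_v0`, `_v1`, ... do not already occur in the input.

```python
import itertools

def Var(x): return ("var", x)
def Lam(x, body): return ("lam", x, body)
def App(f, a): return ("app", f, a)

def free_vars(e):
    if e[0] == "var":
        return {e[1]}
    if e[0] == "lam":
        return free_vars(e[2]) - {e[1]}
    return free_vars(e[1]) | free_vars(e[2])

_fresh = itertools.count()

def subst(e, x, s):
    """Return e[x -> s], renaming bound variables to avoid capture."""
    if e[0] == "var":
        return s if e[1] == x else e
    if e[0] == "app":
        return App(subst(e[1], x, s), subst(e[2], x, s))
    y, body = e[1], e[2]
    if y == x:                      # x is re-bound here, nothing to replace
        return e
    if y in free_vars(s):           # α conversion: rename y before substituting
        z = f"_v{next(_fresh)}"
        body, y = subst(body, y, Var(z)), z
    return Lam(y, subst(body, x, s))

def step(e):
    """Perform one leftmost-outermost β reduction, or return None."""
    if e[0] == "app":
        f, a = e[1], e[2]
        if f[0] == "lam":
            return subst(f[2], f[1], a)
        r = step(f)
        if r is not None:
            return App(r, a)
        r = step(a)
        return None if r is None else App(f, r)
    if e[0] == "lam":
        r = step(e[2])
        return None if r is None else Lam(e[1], r)
    return None

def simplify(e, max_steps=1000):
    for _ in range(max_steps):
        nxt = step(e)
        if nxt is None:
            return e
        e = nxt
    raise RuntimeError("no irreducible form found within max_steps")

identity_applied = simplify(App(Lam("x", Var("x")), Var("y")))
renamed = simplify(App(Lam("x", Lam("y", Var("x"))), Var("y")))
```

Here `identity_applied` is the variable `y`, and `renamed` is a function that ignores its (renamed) parameter and returns the free variable `y`. The `max_steps` cutoff is needed because, as with Turing machines, simplification may never terminate.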
::: {.proof data-ref="lambdaturing-thm"} We only sketch the proof. The "if" direction is simple. As mentioned above, evaluating λ expressions basically amounts to "search and replace". It is also a fairly straightforward programming exercise to implement all the above basic operations in an imperative language such as Python or C, and using the same ideas we can do so in NAND-RAM as well, which we can then transform to a NAND-TM program.
For the "only if" direction we need to simulate a Turing machine using a λ expression.
We will do so by first showing for every Turing machine
A configuration of
By nextstepfunctionlem{.ref}, for every
INPUT: List $L = \langle \alpha_0,\alpha_1,\ldots, \alpha_{m-1}, \bot \rangle$ encoding a configuration $\alpha$.
OUTPUT: List $L'$ encoding $NEXT_M(\alpha)$
PROCEDURE{ComputeNext}{$L_{prev},L,L_{next}$}
If $ISEMPTY \; L_{prev}$
return $NIL$
Endif
$a \leftarrow HEAD \; L_{prev}$
If $ISEMPTY\; L$
$b \leftarrow \varnothing$ # Encoding of $\varnothing$ in $\{0,1\}^\ell$
Else
$b \leftarrow HEAD\;L$
Endif
If $ISEMPTY\; L_{next}$
$c \leftarrow \varnothing$
Else
$c \leftarrow HEAD \; L_{next}$
Endif
Return $PAIR \; r(a,b,c) \; ComputeNext(TAIL\; L_{prev}\;,\;TAIL\; L\;,\;TAIL\; L_{next})$
Endprocedure
$L_{prev} \leftarrow PAIR \; \varnothing \; L$ # $L_{prev} = \langle \varnothing , \alpha_0,\ldots, \alpha_{m-1},\bot \rangle$
$L_{next} \leftarrow TAIL\; L$ # $L_{next} = \langle \alpha_1,\ldots,\alpha_{m-1}, \bot \rangle$
Return $ComputeNext(L_{prev},L,L_{next})$
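The procedure above translates almost line by line into Python. The sketch below makes several assumptions: configurations are ordinary Python lists, the string `"_"` stands in for the encoding of $\varnothing$, and the local rule $r$ is passed in as a function argument. The rule `shift_right` is a toy rule chosen only to exercise the code; it is not the rule of any particular Turing machine.

```python
BLANK = "_"  # stand-in for the encoding of ∅ (an assumption of this sketch)

def compute_next(L_prev, L, L_next, r):
    # Walk the three lists in lockstep; position i sees the cells i-1, i, i+1.
    if not L_prev:
        return []
    a = L_prev[0]
    b = L[0] if L else BLANK
    c = L_next[0] if L_next else BLANK
    return [r(a, b, c)] + compute_next(L_prev[1:], L[1:], L_next[1:], r)

def next_configuration(L, r):
    L_prev = [BLANK] + L   # <∅, α_0, ..., α_{m-1}>
    L_next = L[1:]         # <α_1, ..., α_{m-1}>
    return compute_next(L_prev, L, L_next, r)

# Toy local rule: every cell copies its left neighbour.
shift_right = lambda a, b, c: a
result = next_configuration(["x", "y", "z"], shift_right)
print(result)
```

Note that the output has one more entry than the input, since the recursion runs over $L_{prev}$, which is one element longer than $L$.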
Once we can compute
Checking whether a configuration is halting (i.e., whether it is one in which the transition function would output $\mathsf{H}$alt) can be easily implemented in the
:::
While the collection of "basic" functions we allowed for the enhanced λ calculus is smaller than what's provided by most Lisp dialects, coming from NAND-TM it still seems a little "bloated". Can we make do with less? In other words, can we find a subset of these basic operations that can implement the rest?
It turns out that there is in fact a proper subset of the operations of the enhanced λ calculus that can be used to implement the rest.
That subset is the empty set.
That is, we can implement all the operations above using the λ formalism only, even without using
::: { .pause }
This is a good point to pause and think how you would implement these operations yourself. For example, start by thinking how you could implement
There are λ expressions that implement the functions
The idea behind enhancedvanillalambdathm{.ref} is that we encode

- We define $0$ to be the function that on two inputs $x,y$ outputs $y$, and $1$ to be the function that on two inputs $x,y$ outputs $x$. We use Currying to achieve the effect of two-input functions and hence $0 = \lambda x. \lambda y.y$ and $1 = \lambda x.\lambda y.x$. (This representation scheme is the common convention for representing `false` and `true`, but there are many alternative representations for $0$ and $1$ that would have worked just as well.)

- The above implementation makes the $IF$ function trivial: $IF(cond,a,b)$ is simply $cond \; a\; b$ since $0ab = b$ and $1ab = a$. We can write $IF = \lambda x.x$ to achieve $IF(cond,a,b) = (((IF\; cond)\; a)\; b) = cond \; a \; b$.

- To encode a pair $(x,y)$ we will produce a function $f_{x,y}$ that has $x$ and $y$ "in its belly" and satisfies $f_{x,y}g = g x y$ for every function $g$. That is, $PAIR = \lambda x,y. \left(\lambda g. gxy\right)$. We can extract the first element of a pair $p$ by writing $p1$ and the second element by writing $p0$, and so $HEAD = \lambda p. p1$ and $TAIL = \lambda p. p0$.

- We define $NIL$ to be the function that ignores its input and always outputs $1$. That is, $NIL = \lambda x.1$. The $ISEMPTY$ function checks, given an input $p$, whether we get $1$ if we apply $p$ to the function $zero = \lambda x,y.0$ that ignores both its inputs and always outputs $0$. For every valid pair of the form $p = PAIR\; x\; y$, $p\; zero = zero\; x\; y = 0$ while $NIL\; zero=1$. Formally, $ISEMPTY = \lambda p. p (\lambda x,y.0)$.
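These encodings can be tried out directly using Python's own `lambda`. The following sketch transcribes the definitions above in curried form; the helper `to_bit`, which converts an encoded Boolean back into a Python integer, is our addition for testing and is not part of the λ calculus. Python evaluates arguments eagerly, which makes no difference for these particular expressions.

```python
ZERO = lambda x: lambda y: y                    # 0 = λx.λy.y
ONE = lambda x: lambda y: x                     # 1 = λx.λy.x
IF = lambda c: c                                # IF = λx.x
PAIR = lambda x: lambda y: lambda g: g(x)(y)    # PAIR = λx,y.(λg. g x y)
HEAD = lambda p: p(ONE)                         # HEAD = λp. p 1
TAIL = lambda p: p(ZERO)                        # TAIL = λp. p 0
NIL = lambda x: ONE                             # NIL = λx.1
ISEMPTY = lambda p: p(lambda x: lambda y: ZERO) # ISEMPTY = λp. p (λx,y.0)

def to_bit(b):
    # Decode an encoded Boolean: 1 selects its first input, 0 its second.
    return b(1)(0)

p = PAIR(ONE)(ZERO)
print(to_bit(HEAD(p)), to_bit(TAIL(p)))         # first and second member
print(to_bit(ISEMPTY(NIL)), to_bit(ISEMPTY(p)))
print(IF(ONE)("then")("else"))
```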
::: {.remark title="Church numerals (optional)" #Churchnumrem}
There is nothing special about Boolean values. You can use similar tricks to implement natural numbers using λ terms.
The standard way to do so is to represent the number
In this representation, we can compute
Now we come to a bigger hurdle, which is how to implement
We can define
The only fly in this ointment is that the λ calculus does not have the notion of recursion, and so this is an invalid definition.
But of course we can use our
myreducereceq{.ref} means that implementing
How can we implement recursion without recursion?
We will illustrate this using a simple example - the
where
def xor2(a,b): return 1-b if a else b
def head(L): return L[0]
def tail(L): return L[1:]
def xor(L): return xor2(head(L),xor(tail(L))) if L else 0
print(xor([0,1,1,0,0,1]))
# 1
Now, how could we eliminate this recursive call?
The main idea is that since functions can take other functions as input, it is perfectly legal in Python (and the λ calculus of course) to give a function itself as input.
So, our idea is to try to come up with a non-recursive function `tempxor` that takes two inputs: a function and a list, and such that `tempxor(tempxor,L)` will output the XOR of `L`!
::: { .pause } At this point you might want to stop and try to implement this on your own in Python or any other programming language of your choice (as long as it allows functions as inputs). :::
Our first attempt might be to simply use the idea of replacing the recursive call by `me`.
Let's define this function as `myxor`:
def myxor(me,L): return xor2(head(L),me(tail(L))) if L else 0
Let's test this out:
myxor(myxor,[1,0,1])
If you do this, you will get the following complaint from the interpreter:
TypeError: myxor() missing 1 required positional argument: 'L'
The problem is that `myxor` expects two inputs (a function and a list), while in the call to `me` we only provided a list.
To correct this, we modify the call to also provide the function itself:
def tempxor(me,L): return xor2(head(L),me(me,tail(L))) if L else 0
Note the call `me(me,..)` in the definition of `tempxor`: given a function `me` as input, `tempxor` will actually call the function `me` with itself as the first input.
If we test this out now, we see that we actually get the right result!
tempxor(tempxor,[1,0,1])
# 0
tempxor(tempxor,[1,0,1,1])
# 1
and so we can define `xor(L)` as simply `return tempxor(tempxor,L)`.
The approach above is not specific to XOR.
Given a recursive function `f` that takes an input `x`, we can obtain a non-recursive version as follows:

- Create the function `myf` that takes a pair of inputs `me` and `x`, and replaces recursive calls to `f` with calls to `me`.

- Create the function `tempf` that converts calls in `myf` of the form `me(x)` to calls of the form `me(me,x)`.

- The function `f(x)` will be defined as `tempf(tempf,x)`.

Here is the way we implement the `RECURSE` operator in Python. It will take a function `myf` as above, and replace it with a function `g` such that `g(x)=myf(g,x)` for every `x`.
def RECURSE(myf):
def tempf(me,x): return myf(lambda y: me(me,y),x)
return lambda x: tempf(tempf,x)
xor = RECURSE(myxor)
print(xor([0,1,1,0,0,1]))
# 1
print(xor([1,1,0,0,1,1,1,1]))
# 0
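Since the approach is not specific to XOR, the same operator works for other recursive functions. As a small illustration, the sketch below repeats the definition of `RECURSE` so that it runs on its own and applies it to a factorial function; the name `myfact` is ours.

```python
def RECURSE(myf):
    def tempf(me, x): return myf(lambda y: me(me, y), x)
    return lambda x: tempf(tempf, x)

# Factorial with the recursive call replaced by the parameter `me`
def myfact(me, n): return 1 if n == 0 else n * me(n - 1)

fact = RECURSE(myfact)
print(fact(5))
# 120
```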
From Python to the tempf
defined above can be written as λ me. myf(me me)
.
This means that if we denote the input of RECURSE
by
The online appendix contains an implementation of the λ calculus using Python.
Here is an implementation of the recursive XOR function from that appendix:^[Because of specific issues of Python syntax, in this implementation we use `f * g` for applying `f` to `g` rather than `fg`, and use `λx(exp)` rather than `λx.exp` for abstraction. We also use `_0` and `_1` for the λ terms for $0$ and $1$.]
# XOR of two bits
XOR2 = λ(a,b)(IF(a,IF(b,_0,_1),b))
# Recursive XOR with recursive calls replaced by m parameter
myXOR = λ(m,l)(IF(ISEMPTY(l),_0,XOR2(HEAD(l),m(TAIL(l)))))
# Recurse operator (aka Y combinator)
RECURSE = λf((λm(f(m*m)))(λm(f(m*m))))
# XOR function
XOR = RECURSE(myXOR)
#TESTING:
XOR(PAIR(_1,NIL)) # List [1]
# equals 1
XOR(PAIR(_1,PAIR(_0,PAIR(_1,NIL)))) # List [1,0,1]
# equals 0
::: {.remark title="The Y combinator" #Ycombinator}
The
It is one of a family of fixed point operators that given a lambda expression
"[In 1934], Church had been speculating, and finally definitely proposed, that the λ-definable functions are all the effectively calculable functions .... When Church proposed this thesis, I sat down to disprove it ... but, quickly realizing that [my approach failed], I became overnight a supporter of the thesis.", Stephen Kleene, 1979.
"[The thesis is] not so much a definition or to an axiom but ... a natural law.", Emil Post, 1936.
We have defined functions to be computable if they can be computed by a NAND-TM program, and we've seen that the definition would remain the same if we replaced NAND-TM programs by Python programs, Turing machines, λ calculus, cellular automata, and many other computational models. The Church-Turing thesis is that this is the only sensible definition of "computable" functions. Unlike the "Physical Extended Church-Turing Thesis" (PECTT) which we saw before, the Church-Turing thesis does not make a concrete physical prediction that can be experimentally tested, but it certainly motivates predictions such as the PECTT. One can think of the Church-Turing Thesis as either advocating a definitional choice, making some prediction about all potential computing devices, or suggesting some laws of nature that constrain the natural world. In Scott Aaronson's words, "whatever it is, the Church-Turing thesis can only be regarded as extremely successful". No candidate computing device (including quantum computers, and also much less reasonable models such as the hypothetical "closed time curve" computers we mentioned before) has so far mounted a serious challenge to the Church-Turing thesis. These devices might potentially make some computations more efficient, but they do not change the difference between what is finitely computable and what is not. (The extended Church-Turing thesis, which we discuss in ECTTsec{.ref}, stipulates that Turing machines also capture the limit of what can be efficiently computed. As with its physical version, quantum computing presents the main challenge to this thesis.)
We can summarize the models we have seen in the following table:
Computational problems | Type of model | Examples |
---|---|---|
Finite functions | Non-uniform computation (algorithm depends on input length) | Boolean circuits, NAND circuits, straight-line programs (e.g., NAND-CIRC) |
Functions with unbounded inputs | Sequential access to memory | Turing machines, NAND-TM programs |
-- | Indexed access / RAM | RAM machines, NAND-RAM, modern programming languages |
-- | Other | Lambda calculus, cellular automata |

Table: Different models for computing finite functions and functions with arbitrary input length.
Later on in spacechap{.ref} we will study memory-bounded computation. It turns out that NAND-TM programs with a constant amount of memory are equivalent to the model of finite automata (the adjectives "deterministic" or "non-deterministic" are sometimes added as well; this model is also known as finite state machines), which in turn captures the notion of regular languages (those that can be described by regular expressions), a concept we will see in restrictedchap{.ref}.
- While we defined computable functions using Turing machines, we could just as well have done so using many other models, including not just NAND-TM programs but also RAM machines, NAND-RAM, the λ-calculus, cellular automata and many other models.
- Very simple models turn out to be "Turing complete" in the sense that they can simulate arbitrarily complex computation.
::: {.exercise title="Alternative proof for TM/RAM equivalence" #RAMTMalternativeex}
Let $SEARCH:\{0,1\}^* \rightarrow \{0,1\}^*$ be the following function.
The input is a pair $(L,k)$ where $k\in \{0,1\}^*$,

- Prove that $SEARCH$ is computable by a Turing machine.

- Let $UPDATE(L,k,v)$ be the function whose input is a list $L$ of pairs, and whose output is the list $L'$ obtained by prepending the pair $(k,v)$ to the beginning of $L$. Prove that $UPDATE$ is computable by a Turing machine.

- Suppose we encode the configuration of a NAND-RAM program by a list $L$ of key/value pairs where the key is either the name of a scalar variable `foo` or of the form `Bar[<num>]` for some number `<num>` and it contains all the non-zero values of variables. Let $NEXT(L)$ be the function that maps a configuration of a NAND-RAM program at one step to the configuration in the next step. Prove that $NEXT$ is computable by a Turing machine (you don't have to implement each one of the arithmetic operations: it is enough to implement addition and multiplication).

- Prove that for every $F:\{0,1\}^* \rightarrow \{0,1\}^*$ that is computable by a NAND-RAM program, $F$ is computable by a Turing machine.
:::
::: {.exercise title="NAND-TM lookup" #lookup}
This exercise shows part of the proof that NAND-TM can simulate NAND-RAM. Produce the code of a NAND-TM program that computes the function
::: {.exercise title="Pairing" #pair-ex}
Let

- Prove that for every $x^0,x^1 \in \N$, $embed(x^0,x^1)$ is indeed a natural number.

- Prove that $embed$ is one-to-one.

- Construct a NAND-TM program $P$ such that for every $x^0,x^1 \in \N$, $P(pf(x^0)pf(x^1))=pf(embed(x^0,x^1))$, where $pf$ is the prefix-free encoding map defined above. You can use the syntactic sugar for inner loops, conditionals, and incrementing/decrementing the counter.

- Construct NAND-TM programs $P_0,P_1$ such that for every $x^0,x^1 \in \N$ and $i \in \{0,1\}$, $P_i(pf(embed(x^0,x^1)))=pf(x^i)$. You can use the syntactic sugar for inner loops, conditionals, and incrementing/decrementing the counter.
:::
::: {.exercise title="Shortest Path" #shortestpathcomputableex}
Let
::: {.exercise title="Longest Path" #longestpathcomputableex}
Let
::: {.exercise title="Shortest path λ expression" #shortestpathlambda}
Let
::: {.exercise title="Next-step function is local" #nextstepfunctionlemex} Prove nextstepfunctionlem{.ref} and use it to complete the proof of onedimcathm{.ref}. :::
Prove that for every λ-expression
::: {.exercise title="Evaluation order example in λ calculus" #evalorderlambdaex}

- Let $e = \lambda x.7 \left( (\lambda x.xx) (\lambda x.xx) \right)$. Prove that the simplification process of $e$ ends in a definite number if we use the "call by name" evaluation order while it never ends if we use the "call by value" order.

- (bonus, challenging) Let $e$ be any λ expression. Prove that if the simplification process ends in a definite number if we use the "call by value" order then it also ends in such a number if we use the "call by name" order. See footnote for hint.^[Use structural induction on the expression $e$.]
:::
::: {.exercise title="Zip function" #zipfunctionex}
Give an enhanced λ calculus expression to compute the function zip
compression file format.]
:::
::: {.exercise title="Next-step function without
::: {.exercise title="λ calculus to NAND-TM compiler (challenging)" #lambdacompiler }
Give a program in the programming language of your choice that takes as input a λ expression GOTO
and all NAND-CIRC syntactic sugar in your output program. You can use any encoding of λ expressions as binary string that is convenient for you. See footnote for hint.^[Try to set up a procedure such that if array Left
contains an encoding of a λ expression Right
contains an encoding of another λ expression Result
will contain
::: {.exercise title="At least two in
Prove that
::: {.exercise title="Locality of next-step function" #stringsprogramex}
This question will help you get a better sense of the notion of locality of the next step function of Turing machines. This locality plays an important role in results such as the Turing completeness of STRINGS
to be a programming language that has the following semantics:

- A `STRINGS` program $Q$ has a single string variable `str` that is both the input and the output of $Q$. The program has no loops and no other variables, but rather consists of a sequence of conditional search and replace operations that modify `str`.

- The operations of a `STRINGS` program are:

    - `REPLACE(pattern1,pattern2)` where `pattern1` and `pattern2` are fixed strings. This replaces the first occurrence of `pattern1` in `str` with `pattern2`.

    - `if search(pattern) { code }` executes `code` if `pattern` is a substring of `str`. The code `code` can itself include nested `if`'s. (One can also add an `else { ... }` to execute if `pattern` is not a substring of `str`.)

    - The returned value is `str`.

- A `STRINGS` program $Q$ computes a function $F:\{0,1\}^* \rightarrow \{0,1\}^*$ if for every $x\in \{0,1\}^*$, if we initialize `str` to $x$ and then execute the sequence of instructions in $Q$, then at the end of the execution `str` equals $F(x)$.

For example, the following is a `STRINGS` program that computes the function $F:\{0,1\}^* \rightarrow \{0,1\}^*$ such that for every $x\in \{0,1\}^*$, if
if search('110011') {
replace('110011','00')
} else if search('110111') {
replace('110111','00')
} else if search('111011') {
replace('111011','00')
} else if search('111111') {
replace('111111','00')
}
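The two operations are easy to mimic in Python, which can help in experimenting with small `STRINGS` programs. The following is only a sketch of the semantics described above, not a solution to the exercise: the helper names are ours, the variable is called `s` to avoid shadowing Python's built-in `str`, and we assume that each branch of the example replaces the same six-character pattern it searched for.

```python
def search(s, pattern):
    # True if pattern is a substring of s.
    return pattern in s

def replace(s, pattern1, pattern2):
    # Replace only the first occurrence of pattern1 by pattern2.
    return s.replace(pattern1, pattern2, 1)

def example_program(s):
    # Python transcription of the example STRINGS program above.
    if search(s, "110011"):
        s = replace(s, "110011", "00")
    elif search(s, "110111"):
        s = replace(s, "110111", "00")
    elif search(s, "111011"):
        s = replace(s, "111011", "00")
    elif search(s, "111111"):
        s = replace(s, "111111", "00")
    return s

print(example_program("0110011"))
print(example_program("0101"))
```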
Prove that for every Turing machine program STRINGS
program STRINGS
program fully, but you do need to give a convincing argument that such a program exists.
:::
Chapter 7 in the wonderful book of Moore and Mertens [@MooreMertens11] contains a great exposition of much of this material.
The RAM model can be very useful in studying the concrete complexity of practical algorithms. Its theoretical study was initiated in [@cook1973time]. However, the exact set of operations that are allowed in the RAM model and their costs vary between texts and contexts. One needs to be careful in making such definitions, especially if the word size grows, as was already shown by Shamir [@shamir1979]. Chapter 3 in Savage's book [@Savage1998models] contains a more formal description of RAM machines; see also the paper [@hagerup1998]. A study of RAM algorithms that are independent of the input size (known as the "transdichotomous RAM model") was initiated by [@fredman1993].
The models of computation we considered so far are inherently sequential, but these days much computation happens in parallel, whether using multi-core processors or in massively parallel distributed computation in data centers or over the Internet.
Parallel computing is important in practice, but it does not really make much difference for the question of what can and can't be computed.
After all, if a computation can be performed using
The λ-calculus was described by Church in [@church1941]. Pierce's book [@pierce2002types] is a canonical textbook, see also [@barendregt1984]. The "Currying technique" is named after the logician Haskell Curry (the Haskell programming language is named after Haskell Curry as well). Curry himself attributed this concept to Moses Schönfinkel, though for some reason the term "Schönfinkeling" never caught on.
Unlike most programming languages, the pure λ-calculus doesn't have the notion of types. Every object in the λ calculus can also be thought of as a λ expression and hence as a function that takes one input and returns one output. All functions take one input and return one output, and if you feed a function an input of a form it didn't expect, it still evaluates the λ expression via "search and replace", replacing all instances of its parameter with copies of the input expression you fed it. Typed variants of the λ calculus are objects of intense research, and are strongly related to type systems for programming languages and computer-verifiable proof systems, see [@pierce2002types]. Some of the typed variants of the λ calculus do not have infinite loops, which makes them very useful as ways of enabling static analysis of programs as well as computer-verifiable proofs. We will come back to this point in restrictedchap{.ref} and chapproofs{.ref}.
Tao has proposed showing the Turing completeness of fluid dynamics (a "water computer") as a way of settling the question of the behavior of the Navier-Stokes equations, see this popular article.