
CS 226 Computability Theory

Course Notes, Spring 2003

Anton Setzer∗

January 4, 2004

1 Introduction

1.1 The Topic of Computability Theory

A computable function is a function f : A → B

such that there is a mechanical procedure for computing for every a ∈ A the result f(a) ∈ B.

Computability theory is the study of computable functions. In computability theory we study the limits of this notion.

1.2 Examples

(a) exp : N → N, exp(n) := 2^n. (N = {0, 1, 2, . . .}.) It is trivial to write a program computing exp in almost any programming language – so exp is computable. However: can we really compute, say, exp(10000000000000000000000000000000000)?

(b) Let String be the set of strings of ASCII symbols. Define a function check : String → {0, 1} by

check(p) := 1 if p is a syntactically correct Pascal program, 0 otherwise.

Clearly, the function check is computable (it is essentially the syntax checker, which is part of the Pascal compiler), provided one refers to a precisely defined notion of what a Pascal program is.

(c) Define a function terminate : String → {0, 1},

terminate(p) := 1 if p is a syntactically correct Pascal program without interaction with the user, which terminates, 0 otherwise.

Is terminate computable? We will later see that terminate is an instance of the Turing Halting Problem, and therefore non-computable.

∗email [email protected], www http://www.cs.swan.ac.uk/∼csetzer/index.html, tel. 01792 51-3368,

room 226. These notes are based on lecture notes by U. Berger.


(d) Define a function issortingfun : String → {0, 1},

issortingfun(p) := 1 if p is a syntactically correct Pascal program which has as input a list and returns a sorted list, 0 otherwise.

So issortingfun checks whether its input is a sorting function. Of course it has to be specified precisely what it means for a program to take a list as input and return it sorted (e.g. input from the console and output on the console).

If we could decide this problem, we could decide the previous one: Take a Pascal program for which we have to check whether it terminates. Create a new Pascal program which takes as input a list, then runs as the original one, until that one terminates. If it terminates, the new program takes the original list, sorts it and returns it. Otherwise the new program never terminates. The new program is a sorting function if and only if the original program terminates. Since the problem whether a program terminates is undecidable, the problem of verifying whether we have a sorting function is undecidable as well.

1.3 Problems in Computability

In order to properly understand and answer the questions raised in the previous examples we have to:

• First give a precise definition of what “computable” means.

– That will be a mathematical definition.

∗ For showing that a function is computable, it suffices to show how it can be com-puted by a computer. For this an intuitive understanding of “computable” suffices.

∗ In order to show that a function is non-computable, however, we need to showthat it cannot be computed in principle. For this we need a very precise notion ofcomputable.

• Then provide evidence that our definition of “computable” is the correct one.

– That will be a philosophical argument.

• Develop methods for proving that certain functions are computable and certain functions arenon-computable.

Questions related to the above are the following:

• Given that a function f : A → B can be computed, can it be computed efficiently? (Complexity theory.)

• Can the task of deciding a given problem P1 be reduced to deciding another problem P2? (Reducibility theory.)

Other interesting questions, which are beyond the scope of this lecture are:

• Can the notion of computability be extended to computations on infinite objects (like e.g. streams of data, real numbers, higher type operations)? (Higher and abstract computability theory.)

• What is the relationship between computing (producing actions, data etc.) and proving?

Computability theory involves three areas:


• Mathematics.

– To give a precise definition of computability and analyse this concept.

• Philosophy.

– to verify that notions found, like “computable”, are the correct ones.

• Computer science.

– To investigate the relationship between these theoretical concepts and computing in the real world.

Remark: In computability theory, one usually abstracts from limitations on time and space. A problem will be computable, if it can be solved on an idealised computer, even if it would take longer than the lifetime of the universe to actually carry out the computation.

1.4 Remarks on the History of Computability Theory

• Leibniz (1670) built one of the first mechanical calculators. He was thinking about building a machine that could manipulate symbols in order to determine the truth values of mathematical statements. He noticed that a first step would be to introduce a precise formal language, and he was working on defining such a language.

Gottfried Wilhelm von Leibniz (1646 – 1716)

• Hilbert (1900) poses, in his famous list “Mathematical Problems”, as the 10th problem the problem of deciding the solvability of Diophantine equations.

David Hilbert (1862 – 1943)


• Hilbert (1928) poses the “Entscheidungsproblem” (decision problem).

– He has (already in 1900) developed a theory for formalising mathematical proofs, and believes that it is complete and sound, i.e. that it proves exactly the true mathematical formulae.

– Hilbert asks whether there is an algorithm which decides whether a mathematical formula is a consequence of his theory. Assuming that his theory is complete and sound, such an algorithm would decide the truth of all mathematical formulae expressible in the language of his theory.

– The question whether there is an algorithm for deciding the truth of mathematical formulae is later called the “Entscheidungsproblem”.

• Gödel, Kleene, Post, Turing (1930s) introduce different models of computation and prove that they all define the same class of computable functions.

Kurt Gödel (1906 – 1978)

Stephen Cole Kleene (1909 – 1994)

Emil Post (1897 – 1954)


Alan Mathison Turing (1912 – 1954)

• Gödel (1931) proves in his incompleteness theorem that any reasonable recursive theory is incomplete, i.e. there is a formula such that neither the formula nor its negation is provable. By the Church-Turing thesis to be established later (see below), it will follow that the recursive functions are the computable ones. Therefore no such theory proves all true formulae (true in the sense of being true in a certain model, which could be the standard structure assumed in mathematics). Therefore, the “Entscheidungsproblem” is unsolvable – an algorithm for deciding the truth of mathematical formulae would give rise to a complete and sound theory fulfilling Gödel’s conditions.

• Church, Turing (1936) postulate that the models of computation established above define exactly the set of all computable functions (Church-Turing thesis).

• Both establish undecidable problems and conclude that the Entscheidungsproblem is unsolvable, even for a class of very simple formulae.

– Church shows the undecidability of equality in the λ-calculus.

– Turing shows the unsolvability of the halting problem. That problem turns out to be the most important undecidable problem.

Alonzo Church (1903 - 1995)

• Post (1944) studies degrees of unsolvability. This is the birth of degree theory.

• Matiyasevich (1970) solves Hilbert’s 10th problem negatively: The solvability of Diophantine equations is undecidable.


Yuri Vladimirovich Matiyasevich (∗ 1947)

• Cook (1971) introduces the complexity classes P and NP and formulates the problem whether P ≠ NP.

Stephen Cook (Toronto)

• Today:

– The problem whether P ≠ NP is still open. Complexity theory has become a big research area.

– Intensive study of computability on infinite objects (e.g. real numbers, higher typefunctionals) is carried out (e.g. Dr. Berger in Swansea).

– Computability on inductive and co-inductive data types is studied.

– Research on program synthesis from formal proofs (e.g. Dr. Berger in Swansea).

– Concurrent and game-theoretic models of computation are developed (e.g. Prof. Mollerin Swansea).

– Automata theory further developed.

– Alternative models of computation are studied (quantum computing, genetic algo-rithms).

– · · ·

• Remarks on the Name “Computability Theory”:

– The original name was recursion theory, since the mathematical concept claimed tocover exactly the computable functions is called “recursive function”.

– This name was changed to computability theory during the last 10 years.

– Many books still have the title “recursion theory”.


1.5 Administrative Issues

Lecturer: Dr. A. Setzer
Dept. of Computer Science
University of Wales Swansea
Singleton Park
SA2 8PP
UK
Room: Room 211, Faraday Building
Tel.: (01792) 513368

Fax. (01792) 295651

Email [email protected]

Home page: http://www.cs.swan.ac.uk/∼csetzer/

Assessment:

• 80% Exam.

• 20% Coursework.

The course home page is located at http://www.cs.swan.ac.uk/∼csetzer/lectures/computability/03/index.html. There is an open version of the slides and, for copyright reasons, a password-protected version. The password is .

1.6 Plan for this Module.

(Might be changed, since the lecturer is teaching this module for the first time).

1. Introduction.

• This section.

2. Encoding of data types into N.

• We show how to encode elements of some data types as elements of N, the set of naturalnumbers.

• This shows that by studying computability of natural numbers we treat computabilityon many other data types as well.

• We discuss as well the notions of countable vs. uncountable sets.

3. The Unlimited Register Machine (URM) and the halting problem.

• We study one model of computation, the URM. It is one of the easiest to work with.

• We show that the halting problem is undecidable.

4. Turing machines.

• Turing machines as a more intuitive model of computation.

5. Algebraic view of computability.

• We study the notion of primitive recursive functions.

• We introduce the concept of a partial recursive function, a third model of computation.


6. Lambda-definable functions.

• We study a fourth model of computation.

• We show that lambda-definable functions and partial recursive functions are the sameclass.

7. Equivalence theorems and the Church-Turing thesis.

• We show that the four notions of computation above coincide.

• We discuss the Church-Turing thesis, namely that the functions computable in an intuitive sense are exactly those which can be computed by one of the equivalent models of computation.

8. Enumeration of the computable functions.

• We show how to enumerate in a computable way all computable functions.

9. Recursively enumerable predicates and the arithmetic hierarchy.

• We study a notion which covers relations that are undecidable, but for which we can still, in a computable way, enumerate the elements fulfilling them.

• We study the arithmetic hierarchy.

10. Reducibility.

• We study the concept of reducibility.

• We show Rice’s theorem, which allows us to show the undecidability of many sets.

11. Computational complexity.

• We study basic complexity theory.

1.7 Aims of this Module

• To become familiar with fundamental models of computation and the relationship betweenthem.

• To develop an appreciation for the limits of computation and to learn techniques for recog-nising unsolvable or unfeasible computational problems.

• To understand the historic and philosophical background of computability theory.

• To be aware of the impact of the fundamental results of computability theory to areas ofcomputer science such as software engineering and artificial intelligence.

• To understand the close connection between computability theory and logic.

• To be aware of recent concepts and advances in computability theory.

• To learn fundamental proving techniques like induction and diagonalisation.


1.8 Literature:

• N J Cutland: Computability: An Introduction to Recursive Function Theory. CambridgeUniversity Press, 1984.

– Main course book

• H. R. Lewis and C. H. Papadimitriou: Elements of the Theory of Computation. 2nd Edition,Prentice Hall, 1998.

• M. Sipser: Introduction to the Theory of Computation. PWS, 1997.

• J. Martin: Introduction to Languages and the Theory of Computation. McGraw-Hill, 2003.

• J. E. Hopcroft, R. Motwani, J. D. Ullman: Introduction to Automata Theory, Languages,and Computation. Addison-Wesley, 2000.

– Book on automata theory and context free grammars.

• J. R. Hindley: Basic Simple Type Theory. Cambridge University Press, Cambridge Tractsin Theoretical Computer Science 42, 1997.

– Best book on the λ-calculus.

• D. J. Velleman: How To Prove It. Cambridge University Press, 1994.

– Basic mathematics. Recommended to fill any gaps in your mathematical background.

• E Griffor (Ed.):Handbook of Computability Theory. North-Holland, 1999.

– Very expensive. State of the art of computability theory, postgraduate level.


2 Encoding of Data Types into N

In this section, we show how to encode elements of some data types as elements of N, the set of natural numbers. This shows that by studying computability of natural numbers we treat computability on many other data types as well. We discuss as well the notions of countable vs. uncountable sets.

2.1 Countable and Uncountable Sets

Notation 2.1 If A is a finite set, let |A| be the number of elements in A.

Remark 2.2 One sometimes writes #A for |A|.

If A and B are finite sets, then |A| = |B|, if and only if there is a bijection between A and B:

[Diagrams: “Bijection exists” – an arrow diagram between two finite sets of equal size; “No Bijection?” – two arrow diagrams between finite sets of different sizes.]

For arbitrary (possibly infinite) sets, the above generalises as follows:

Definition 2.3 Two sets A and B have the same cardinality, written as A ' B, if there exists a bijection between A and B, i.e. if there exist functions f : A → B and g : B → A which are inverse to each other: ∀x ∈ A.g(f(x)) = x and ∀x ∈ B.f(g(x)) = x.

It follows immediately:

Remark 2.4 If A and B are finite sets, then A ' B if and only if A and B have the same number of elements.

Lemma 2.5 ' is an equivalence relation, i.e. for all sets A, B, C we have:

(a) Reflexivity. A ' A.

(b) Symmetry. If A ' B, then B ' A.

(c) Transitivity. If A ' B and B ' C, then A ' C.

Proof:
(a): The function id : A → A, id(a) = a is a bijection.
(b): If f : A → B is a bijection, so is its inverse f−1.
(c): If f : A → B and g : B → C are bijections, so is the composition g ◦ f : A → C.


Theorem 2.6 A set A and its power set P(A) := {B | B ⊆ A} never have the same cardinality:

A ≄ P(A)

Proof:
This is a typical diagonalisation argument.
We first consider the case A = N and show that there is no bijection between N and P(N). Assume we have such a bijection f : N → P(N). We define a set C ⊆ N s.t. C ≠ f(n) for every n ∈ N. C = f(n) will be violated at element n:

• If n ∈ f(n), we do not add n to C, therefore n ∈ f(n) ∧ n ∉ C.

• If n ∉ f(n), we add n to C, therefore n ∉ f(n) ∧ n ∈ C.

Example: We take an arbitrary function f , and show how C is defined in this case:

f(0) = { 0, 1, 2, 3, 4, . . . }
f(1) = { 0, 2, 4, . . . }
f(2) = { 1, 3, . . . }
f(3) = { 0, 1, 3, 4, . . . }
· · ·
C    = { 1, 2, . . . }

We were considering above the nth element of f(n), so we were going through the diagonal in the above matrix. Therefore this argument is called a diagonalisation argument. So we obtain the following definition:

C := {n ∈ N | n ∉ f(n)} .

Now C = f(n) is violated at element n: If n ∈ C, then n ∉ f(n). If n ∉ C, then n ∈ f(n). So C is not in the image of f, a contradiction.
In short, the above argument reads as follows:
Assume f : N → P(N) is a bijection. Define C := {n ∈ N | n ∉ f(n)}. Since f is surjective, C is in the image of f. Assume C = f(n). Then

n ∈ C  ⇔  n ∉ f(n)   (definition of C)
       ⇔  n ∉ C      (since C = f(n)),

a contradiction.

General Situation: The proof for the general situation is almost identical:
Assume f : A → P(A) is a bijection. We define a set C s.t. C = f(a) is violated for a:

C := {a ∈ A | a ∉ f(a)}

Since f is surjective, C is in the image of f. Assume C = f(a). Then we have

a ∈ C  ⇔  a ∉ f(a)   (definition of C)
       ⇔  a ∉ C      (since C = f(a)),

a contradiction.
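The following small Haskell sketch (an added illustration, not part of the original notes; all names are ad hoc) runs the diagonalisation argument on a finite fragment: for any assignment of subsets of {0, . . . , n − 1} to the numbers 0, . . . , n − 1, the diagonal set differs from every assigned set at the corresponding element.

```haskell
-- Finite illustration of diagonalisation: C = { i | i `notElem` f i }
-- differs from every f i at the element i.
diagonal :: Int -> (Int -> [Int]) -> [Int]
diagonal n f = [ i | i <- [0 .. n - 1], i `notElem` f i ]

-- The example table from the text, cut off at the element 4:
fEx :: Int -> [Int]
fEx 0 = [0, 1, 2, 3, 4]
fEx 1 = [0, 2, 4]
fEx 2 = [1, 3]
fEx 3 = [0, 1, 3, 4]
fEx _ = []

-- diagonal 5 fEx == [1,2,4]; for every i, membership of i in fEx i and in
-- diagonal 5 fEx disagree, so the diagonal set equals none of the fEx i.
```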

Definition 2.7 • A set A is countable, if it is finite or A ' N.

• A set, which is not countable, is called uncountable.


Examples:

• N is countable.

– Since N ' N.

• Z := {. . . ,−2,−1, 0, 1, 2, . . .} is countable.

– We can enumerate the elements of Z in the following way: 0, −1, +1, −2, +2, −3, +3, −4, +4, . . .
So we have the following map: 0 ↦ 0, 1 ↦ −1, 2 ↦ 1, 3 ↦ −2, 4 ↦ 2, etc. This map can be described as follows: g : N → Z,

g(n) := n/2 if n is even, −(n + 1)/2 if n is odd.

(A short transcription of g into code is given after this list.)

Exercise: Show that g is bijective.

• P(N) is uncountable.

– P(N) is not finite.

– N ≄ P(N).

• P({1, . . . , 10}) is finite, therefore countable.
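As announced above, here is a direct Haskell transcription of the map g : N → Z from the Z example (an added sketch; Integer is used for both N and Z, and negative inputs are not intended):

```haskell
-- The enumeration 0, -1, +1, -2, +2, ... of Z as a function on N.
g :: Integer -> Integer
g n | even n    = n `div` 2
    | otherwise = - ((n + 1) `div` 2)

-- map g [0..8] == [0,-1,1,-2,2,-3,3,-4,4].
```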

Lemma 2.8 A set A is countable, if and only if there is an injective map f : A → N.

Remark 2.9 Intuitively, Lemma 2.8 expresses: A is countable, if we can assign to every elementa ∈ A a unique code f(a) ∈ N. It is not required that each element of N occurs as a code.

Proof of Lemma 2.8:
“⇒”: Assume A is countable. Show there is an injective f : A → N.

• Case A is finite. Assume A = {a0, . . . , an} with ai ≠ aj for i ≠ j. Define f : A → N, ai ↦ i. f is injective.

• Case A is infinite. A is countable, so there exists a bijection from A into N, which is therefore injective.

“⇐”: Assume f : A → N is injective. Show A is countable.
If A is finite, we are done.
Assume A is infinite. Then f is for instance something like the following:

[Picture: an injective f : A → N, A = {a, b, c, d, . . .}, whose image leaves gaps in N.]

In order to obtain a bijection g : A → N, we have to jump over the gaps in the image of f :


[Picture: the injective f : A → N together with the bijection g : A → N obtained by closing the gaps in the image of f: a ↦ 0, b ↦ 1, c ↦ 2, d ↦ 3, . . .]

So we have

• f(a) = 1, which is the element number 0 in the image of f . g should instead map a to 0.

• f(b) = 4, which is the element number 1 in the image of f . g should instead map b to 1.

• etc.

1 is element number 0 in the image of f, because the number of elements f(a′) below f(a) is 0. 4 is element number 1 in the image of f, because the number of elements f(a′) below f(b) is 1. So in general we define g : A → N,

g(a) := |{a′ ∈ A | f(a′) < f(a)}|

g is well defined: since f is injective, the number of a′ ∈ A s.t. f(a′) < f(a) is finite.
We show that g is a bijection:

• g is injective:
Assume a, b ∈ A, a ≠ b. Show g(a) ≠ g(b).
By the injectivity of f we have f(a) ≠ f(b). Let for instance f(a) < f(b).

[Picture: a, b ∈ A with f(a) < f(b) in N and the corresponding values g(a) < g(b).]

Then

{a′ ∈ A | f(a′) < f(a)} ⊊ {a′ ∈ A | f(a′) < f(b)} ,

therefore

g(a) = |{a′ ∈ A | f(a′) < f(a)}| < |{a′ ∈ A | f(a′) < f(b)}| = g(b) ,

so g(a) ≠ g(b).

• g is surjective:
We define by induction on k, for k ∈ N, an element ak ∈ A s.t. g(ak) = k. Then the assertion follows.
Assume we have already defined a0, . . . , ak−1.


[Picture: the elements a0, . . . , ak−1 with f(a0) < · · · < f(ak−1) and their values g(a0) = 0, . . . , g(ak−1) = k − 1; the next element a is the one with minimal f(a) = n above f(ak−1).]

There exist infinitely many a′ ∈ A and f is injective, so there must be at least one a′ ∈ A s.t. f(a′) > f(ak−1). Let n be minimal s.t. n = f(a) for some a ∈ A and n > f(ak−1). Let a be the unique element of A s.t. f(a) = n. Then we have

{a′′ ∈ A | f(a′′) < f(a)} = {a′′ ∈ A | f(a′′) < f(ak−1)} ∪ {ak−1} .

Therefore

g(a) = |{a′′ ∈ A | f(a′′) < f(a)}|
     = |{a′′ ∈ A | f(a′′) < f(ak−1)}| + 1
     = g(ak−1) + 1
     = k − 1 + 1
     = k .

Let ak := a. Then g(ak) = k.
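A small Haskell sketch of the gap-closing construction in the proof of Lemma 2.8 (added illustration; the concrete values of the injection are an assumption made for the example, in the spirit of the picture above):

```haskell
-- g a = |{ a' | f a' < f a }| on a finite fragment of A.
gapFree :: [a] -> (a -> Integer) -> a -> Int
gapFree dom f a = length [ a' | a' <- dom, f a' < f a ]

-- Hypothetical injective values, leaving gaps in N:
fIm :: Char -> Integer
fIm 'a' = 1
fIm 'b' = 4
fIm 'c' = 6
fIm 'd' = 9
fIm _   = 0

-- map (gapFree "abcd" fIm) "abcd" == [0,1,2,3]: the gaps in the image of fIm
-- have been jumped over, as in the picture.
```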

Corollary 2.10 (a) If B is countable and g : A → B injective, then A is countable.

(b) If A is uncountable and g : A → B injective, then B is uncountable.

(c) If B is countable and A ⊆ B, then A is countable.

(d) If A is uncountable and A ⊆ B, then B is uncountable.

Proof:
(a): If f : B → N is an injection, so is f ◦ g : A → N.
(b): By (a). Why? (Exercise.)
(c): By (a). (What is g?; exercise.)
(d): By (c). Why? (Exercise.)

Lemma 2.11 Let A be a non-empty set. A is countable, if and only if there is a surjection h : N → A.

Proof:
“⇒”: Assume A is non-empty and countable. Show there exists a surjection f : N → A.

• Case A is finite. Assume A = {a0, . . . , an}. Define f : N → A,

f(k) := ak if k ≤ n, a0 otherwise.

f is clearly surjective.

• Case A is infinite. A is countable, so there exists a bijection from N to A, which is therefore surjective.


“⇐”: Assume h : N → A is surjective. Show A is countable.
Define g : A → N, g(a) := min{n | h(n) = a}. g(a) is well-defined, since h is surjective: there exists some n s.t. h(n) = a, therefore the minimal such n is well-defined.
It follows that for all a ∈ A, h(g(a)) = a. Therefore g is injective: If a, a′ ∈ A, g(a) = g(a′), then a = h(g(a)) = h(g(a′)) = a′. By Lemma 2.8, A is countable.
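A Haskell sketch of the “⇐” direction just proved (added illustration): from a surjection h : N → A one obtains the injection g(a) := min{n | h(n) = a} by unbounded search, which terminates exactly because a preimage exists.

```haskell
-- Least preimage of a under a surjection h.
leastPreimage :: Eq a => (Integer -> a) -> a -> Integer
leastPreimage h a = head [ n | n <- [0 ..], h n == a ]

-- Example: h n = n `mod` 3 is a surjection onto {0,1,2}, and
-- map (leastPreimage (`mod` 3)) [0,1,2] == [0,1,2].
```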

Lemma 2.12 The following sets are uncountable:

(a) P(N).

(b) F := {f | f : N → {0, 1}}.

(c) G := {f | f : N → N}.

(d) The set of real numbers R.

Proof:
(a): P(N) is not finite and P(N) ≄ N by Theorem 2.6.
(b): We introduce a bijection between F and P(N). Then, since P(N) is uncountable, it follows that F is uncountable.
Define for A ∈ P(N) the function χA : N → {0, 1},

χA(n) := 1 if n ∈ A, 0 otherwise.

χA is called the characteristic function of A. For instance, if A = {0, 2, 4, 6, 8, . . .} (the set of even numbers), then χA is as follows:

[Plot: χA(n) has the value 1 at n = 0, 2, 4, 6, 8, . . . and the value 0 elsewhere.]

χ is a function from P(N) to N → {0, 1}, where we write the application of χ to an element A as χA instead of χ(A). χ has an inverse, namely the function χ−1 from N → {0, 1} into P(N), where for f : N → {0, 1}, χ−1(f) := {n ∈ N | f(n) = 1}.
For instance, if f : N → {0, 1}, f(n) := 1 if n is even, 0 otherwise (i.e. f = χA for the set A of even numbers as above), then χ−1(f) is as follows:

[Plot: f has the value 1 exactly at the even numbers, and χ−1(f) = {0, 2, 4, 6, 8, . . .}.]

We show that χ and χ−1 are in fact inverse to each other:
χ−1 ◦ χ is the identity: If A ⊆ N, then

χ−1(χA) = {n ∈ N | χA(n) = 1} = {n ∈ N | n ∈ A} = A .


χ ◦ χ−1 is the identity: If f : N → {0, 1}, then

χχ−1(f)(n) = 1 ⇔ n ∈ χ−1(f) ⇔ f(n) = 1

and

χχ−1(f)(n) = 0 ⇔ n ∉ χ−1(f) ⇔ f(n) ≠ 1 ⇔ f(n) = 0 .

Therefore χχ−1(f) = f. It follows that χ is bijective and F is uncountable.
(c): By (b), since F ⊆ G (Corollary 2.10 (d)).
(d): A first idea is to define a function f0 : F → R, f0(g) = (0.g(0)g(1)g(2) · · · )2, where the right hand side is a number in binary format. If f0 were injective, then, since F is uncountable, we could conclude that R is uncountable. The problem is that (0.a0a1 · · · ak01111 · · · )2 and (0.a0a1 · · · ak10000 · · · )2 denote the same real number, so f0 is not injective.
In order to avoid this problem, we modify f0 so that we don’t obtain any binary numbers of the form (0.a0a1 · · · ak1111 · · · )2. We define instead f : F → R, f(g) = (0.g(0) 0 g(1) 0 g(2) 0 · · · )2, i.e. f(g) = (0.a0a1a2 · · · )2, where

ak := 0 if k is odd, g(k/2) otherwise.

If two sequences (b0b1 · · · ) and (c0c1 · · · ) don’t end in 1111 · · · , i.e. are not of the form (d0d1 · · · dl1111 · · · ), then

(0.b0b1 · · · )2 = (0.c0c1 · · · )2 ⇔ (b0b1b2 · · · ) = (c0c1c2 · · · ) .

Therefore

f(g) = f(g′) ⇔ (0.g(0) 0 g(1) 0 g(2) 0 · · · )2 = (0.g′(0) 0 g′(1) 0 g′(2) 0 · · · )2
            ⇔ (g(0) 0 g(1) 0 g(2) 0 · · · ) = (g′(0) 0 g′(1) 0 g′(2) 0 · · · )
            ⇔ (g(0) g(1) g(2) · · · ) = (g′(0) g′(1) g′(2) · · · )
            ⇔ g = g′ ,

f is injective.
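A Haskell sketch of the characteristic-function bijection used in part (b) (added illustration; subsets of N are represented here by membership predicates, a choice made only for this example):

```haskell
-- chi and its inverse, on membership predicates.
chi :: (Integer -> Bool) -> (Integer -> Integer)
chi inA n = if inA n then 1 else 0

chiInv :: (Integer -> Integer) -> (Integer -> Bool)
chiInv f n = f n == 1

-- For the set of even numbers: map (chi even) [0..8] == [1,0,1,0,1,0,1,0,1],
-- and chiInv (chi even) agrees with even everywhere, mirroring chi^-1 . chi = id.
```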

Remark on the Continuum Hypothesis:
One can show that the four sets above all have the same cardinality. So there are two infinite sets, N and R, which have different cardinality. The question arises whether there is a set B with cardinality in between, i.e. s.t. there are injections from N into B and from B into P(N), but s.t. B ≄ N and B ≄ P(N). This question was Hilbert’s first problem.
That there is no such set is called the Continuum Hypothesis. (R is called the continuum.) Paul Cohen has shown that this question is independent of ordinary set theory, i.e. we can neither show that such a B exists, nor that such a B doesn’t exist.


Paul Cohen (∗ 1934)

2.2 Encoding of Sequences into N

Definition 2.13 Let A, B be sets.

• A^k := {(a1, . . . , ak) | a1, . . . , ak ∈ A}. A^k is the set of k-tuples of elements of A, and is called the k-fold Cartesian product.

In particular A^0 = {()}.

• A∗ := ⋃_{k∈N} A^k.

A∗ is the set of tuples of elements of A of arbitrary length, and is called the Kleene star.

• A → B := {f | f : A → B}. A → B is the set of functions from A to B and is called the function space.

Remark: A∗ can be considered as the set of strings having letters in the alphabet A.

We want to define a bijective function π : N^2 → N. π will be called the pairing function. To define such a bijection means that we have to enumerate all elements of N^2. Pairs of natural numbers can be enumerated in the following way:

 x\y |  0   1   2   3   4
 ----+--------------------
  0  |  0   2   5   9  14
  1  |  1   4   8  13  19
  2  |  3   7  12  18  25
  3  |  6  11  17  24  32
  4  | 10  16  23  31  40

• We start in the top left corner and define π(0, 0) = 0.

• Then we go to the next diagonal. π(1, 0) = 1 and π(0, 1) = 2.

• Then we go to the next diagonal. π(2, 0) = 3, π(1, 1) = 4, π(0, 2) = 5.

• etc.


Note that the following naïve attempt to enumerate the pairs fails:

 x\y |   0         1         2         3
 ----+-----------------------------------------
  0  | π(0,0) → π(0,1) → π(0,2) → π(0,3) → · · ·
  1  |   →         →         →         →    · · ·
  2  |   →         →         →         →    · · ·
  3  |   →         →         →         →    · · ·

In this enumeration, the first pair is (0, 0), which gets number 0. The next one is (0, 1), and it gets number 1. (0, 2) gets number 2, etc. But this way we never reach the pair (1, 0), since it takes infinitely long to enumerate the elements of the first row – this enumeration does not work.

We will investigate the first solution, which worked. If we look at the diagonals we see the following:

• For the pairs in the diagonal we have the property that x + y is constant.

– The first diagonal, consisting of (0, 0) only, is given by x + y = 0.

– The second diagonal, consisting of (1, 0), (0, 1), is given by x + y = 1.

– The third diagonal, consisting of (2, 0), (1, 1), (0, 2), is given by x + y = 2.

– Etc.

• The diagonal given by x + y = n, consists of n + 1 pairs:

– The first diagonal, given by x + y = 0, consists of (0, 0) only, i.e. of 1 pair.

– The second diagonal, given by x + y = 1, consists of (1, 0), (0, 1), i.e. of 2 pairs.

– The third diagonal, given by x + y = 2, consists of (2, 0), (1, 1), (0, 2), i.e. of 3 pairs.

– etc.

 x\y |  0   1   2
 ----+------------
  0  |  0   2   5
  1  |  1   4
  2  |  3   7
  3  |  6

If we look at how many elements come before the pair (x, y) in the above order, we have the following:

• We have to count all elements of the previous diagonals. These are the diagonals given by x′ + y′ = n for n < x + y.

  – In the above example, for the pair (2, 1) these are the diagonals given by x′ + y′ = 0, x′ + y′ = 1 and x′ + y′ = 2.

  – The diagonal given by x′ + y′ = n has n + 1 elements, so in total we have ∑_{i=0}^{x+y−1} (i + 1) = 1 + 2 + · · · + (x + y) = ∑_{i=1}^{x+y} i.

  – Gauß showed, already as a student at school, that ∑_{i=1}^{n} i = n(n+1)/2. Therefore the above sum is (x + y)(x + y + 1)/2.

• Further, we have to count all pairs in the current diagonal which occur in this ordering before the current one. These are y pairs.

  – Before (2, 1) there is only one pair in its diagonal, namely (3, 0).

  – Before (3, 0) there are 0 pairs.

  – Before (0, 2) there are 2 pairs, namely (2, 0), (1, 1).

• Therefore we get that there are in total (x + y)(x + y + 1)/2 + y pairs before (x, y), so the pair (x, y) is pair number (x + y)(x + y + 1)/2 + y in this order. We arrive at the following definition:

Definition 2.14

π(x, y) := (x + y)(x + y + 1)/2 + y   ( = (∑_{i=1}^{x+y} i) + y )

Exercise: Prove that ∑_{i=1}^{n} i = n(n+1)/2.

Lemma 2.15 π is bijective.

Proof:
We show π is injective: We prove first that, if x + y < x′ + y′, then π(x, y) < π(x′, y′):

π(x, y) = (∑_{i=1}^{x+y} i) + y < (∑_{i=1}^{x+y} i) + x + y + 1 = ∑_{i=1}^{x+y+1} i ≤ (∑_{i=1}^{x′+y′} i) + y′ = π(x′, y′)

Assume now π(x, y) = π(x′, y′) and show x = x′ and y = y′. We have by the above x + y = x′ + y′. Therefore

y = π(x, y) − (∑_{i=1}^{x+y} i) = π(x′, y′) − (∑_{i=1}^{x′+y′} i) = y′

and x = (x + y) − y = (x′ + y′) − y′ = x′.
We show π is surjective: Assume n ∈ N. Show π(x, y) = n for some x, y ∈ N. The sequence (∑_{i=1}^{k′} i)_{k′∈N} is strictly increasing. Therefore there exists a k s.t.

a := ∑_{i=1}^{k} i ≤ n < ∑_{i=1}^{k+1} i   (∗)

So, in order to obtain π(x, y) = n, we need x + y = k. By y = π(x, y) − ∑_{i=1}^{x+y} i, we need to define y := n − a and, by k = x + y, therefore x := k − y. By (∗) it follows 0 ≤ y < k + 1, and therefore x, y ≥ 0. Further, π(x, y) = (∑_{i=1}^{x+y} i) + y = (∑_{i=1}^{k} i) + (n − ∑_{i=1}^{k} i) = n.

Since π is bijective, we can define π0, π1 as follows:

Definition 2.16 Let π0 : N → N and π1 : N → N be s.t. π0(π(x, y)) = x, π1(π(x, y)) = y.


Remark: π, π0, π1 are computable in an intuitive sense.

“Proof:” π is obviously computable. π0(n), π1(n) can be computed by first searching for a k s.t. k(k+1)/2 ≤ n < (k+1)(k+2)/2 and then defining, as in the proof of the surjectivity of π (Lemma 2.15), π1(n) := n − k(k+1)/2 and π0(n) := k − π1(n).
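A Haskell sketch of this computation (added illustration; the search for k is the naive one described in the “Proof” above):

```haskell
-- The pairing function of Definition 2.14 and its projections.
pair :: Integer -> Integer -> Integer
pair x y = (x + y) * (x + y + 1) `div` 2 + y

unpair :: Integer -> (Integer, Integer)
unpair n = (k - y, y)
  where
    -- largest k with k(k+1)/2 <= n
    k = last (takeWhile (\k' -> k' * (k' + 1) `div` 2 <= n) [0 ..])
    y = n - k * (k + 1) `div` 2

pi0, pi1 :: Integer -> Integer
pi0 = fst . unpair
pi1 = snd . unpair

-- pair 2 1 == 7 and unpair 7 == (2,1), in agreement with the table above.
```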

Remark 2.17 For all z ∈ N, π(π0(z), π1(z)) = z.

Proof: Assume z ∈ N and show z = π(π0(z), π1(z)). π is surjective, so there exist x, y s.t. π(x, y) = z. Then π(π0(z), π1(z)) = π(π0(π(x, y)), π1(π(x, y))) = π(x, y) = z.

We now want to encode elements of N^k as natural numbers. In order to encode an element (l, m, n) ∈ N^3 = ((N × N) × N), we encode first l, m as π(l, m) ∈ N, and then the complete triple as π(π(l, m), n). Similarly we encode (l, m, n, p) ∈ N^4 as π(π(π(l, m), n), p). So π^3(l, m, n) = π(π(l, m), n), π^4(l, m, n, p) = π(π(π(l, m), n), p), etc.
The corresponding decoding functions π^k_i : N → N should return the ith component. For k = 3 we see that, if x = π^3(l, m, n) = π(π(l, m), n), then π0(π0(x)) = l, π1(π0(x)) = m, π1(x) = n, so we define π^3_0(x) = π0(π0(x)), π^3_1(x) = π1(π0(x)), π^3_2(x) = π1(x). Similarly, for k = 4 we have to define π^4_0(x) = π0(π0(π0(x))), π^4_1(x) = π1(π0(π0(x))), π^4_2(x) = π1(π0(x)), π^4_3(x) = π1(x).
In general one defines for k ≥ 1, π^k : N^k → N, π^k(x0, . . . , xk−1) = π(· · · π(π(x0, x1), x2) · · · , xk−1), and for i < k, π^k_i : N → N, where π^k_0(x) = π0(· · · π0(x) · · · ) (k − 1 applications of π0) and, for 0 < i < k, π^k_i(x) = π1(π0(· · · π0(x) · · · )) (k − i − 1 applications of π0).
An inductive definition of π^k, π^k_i is as follows:

Definition 2.18 (a) A bijective function π^k : N^k → N.
We define by induction on k, for k ∈ N, k ≥ 1:

π^k : N^k → N
π^1(x) := x
π^{k+1}(x0, . . . , xk) := π(π^k(x0, . . . , xk−1), xk)

(b) The inverse of π^k.
We define by induction on k, for i, k ∈ N s.t. 1 ≤ k, 0 ≤ i < k:

π^k_i : N → N
π^1_0(x) := x
π^{k+1}_i(x) := π^k_i(π0(x))   for i < k
π^{k+1}_k(x) := π1(x)
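The iterated pairing can be coded on lists, as in the following Haskell sketch (added illustration; pair, pi0 and pi1 are taken from the sketch after the pairing remark above):

```haskell
-- pi^k and its projections pi^k_i (Definition 2.18), on non-empty lists.
pairK :: [Integer] -> Integer
pairK = foldl1 pair        -- pi^(k+1)(x0..xk) = pi(pi^k(x0..x_{k-1}), xk)

unpairK :: Int -> Integer -> [Integer]   -- the k components of a code
unpairK 1 x = [x]
unpairK k x = unpairK (k - 1) (pi0 x) ++ [pi1 x]

-- unpairK 3 (pairK [5,7,9]) == [5,7,9], as stated in Lemma 2.19.
```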

Lemma 2.19 (a) For (x0, . . . , xk−1) ∈ N^k, i < k: xi = π^k_i(π^k(x0, . . . , xk−1)).

(b) For x ∈ N: x = π^k(π^k_0(x), . . . , π^k_{k−1}(x)).

Proof: Induction on k.
Base case k = 1:
Proof of (a): Let (x0) ∈ N^1. Then π^1_0(π^1(x0)) = x0.
Proof of (b): Let x ∈ N. Then π^1(π^1_0(x)) = x.
Induction step k → k + 1: Assume the assertion has been shown for k.
Proof of (a): Let (x0, . . . , xk) ∈ N^{k+1}. Then, for i < k,

π^{k+1}_i(π^{k+1}(x0, . . . , xk)) = π^k_i(π0(π(π^k(x0, . . . , xk−1), xk)))
                                  = π^k_i(π^k(x0, . . . , xk−1))
                                  = xi          (by IH)

and

π^{k+1}_k(π^{k+1}(x0, . . . , xk)) = π1(π(π^k(x0, . . . , xk−1), xk)) = xk .

Proof of (b): Let x ∈ N.

π^{k+1}(π^{k+1}_0(x), . . . , π^{k+1}_k(x)) = π(π^k(π^{k+1}_0(x), . . . , π^{k+1}_{k−1}(x)), π^{k+1}_k(x))
                                           = π(π^k(π^k_0(π0(x)), . . . , π^k_{k−1}(π0(x))), π1(x))
                                           = π(π0(x), π1(x))          (by IH)
                                           = x          (Remark 2.17)

We now want to encode N∗ into N. N∗ = N^0 ∪ ⋃_{n≥1} N^n. N^0 = {〈〉}, and we can encode its only element 〈〉 as 0. For n ≥ 1, we have already defined an encoding π^n : N^n → N. Note that a natural number can (and in fact will) be both of the form π^n(a0, . . . , an−1) and of the form π^k(b0, . . . , bk−1) for n ≠ k. In order to distinguish between those elements we have to add the length to the code and encode (a0, . . . , an−1), as an element of ⋃_{n≥1} N^n, as π(n − 1, π^n(a0, . . . , an−1)). Note that we can use n − 1 instead of n as first component, since we are referring here to n ≥ 1. In order to obtain a code for elements of ⋃_{n≥1} N^n which is different from the code for 〈〉, we add 1 to the above, so (a0, . . . , an−1) is encoded as π(n − 1, π^n(a0, . . . , an−1)) + 1. One can now see that we have in fact obtained a bijection from N∗ to N. The complete definition is as follows:

Definition 2.20 (a) A bijective function λx.〈x〉 : N∗ → N.
Define for x ∈ N∗, 〈x〉 ∈ N as follows:

〈〉 := 0 ,
〈x0, . . . , xk〉 := 1 + π(k, π^{k+1}(x0, . . . , xk))

(b) The length of a code of an element of N∗: We define

lh : N → N ,
lh(0) := 0 ,
lh(x) := π0(x − 1) + 1   if x > 0 .

lh(x) is called the length of x.

(c) We define for x ∈ N and i < lh(x), (x)_i ∈ N as follows:

(x)_i := π^{lh(x)}_i(π1(x − 1))

For lh(x) ≤ i, let (x)_i := 0.
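The following Haskell sketch (added illustration; pair, pi0, pi1, pairK and unpairK are the functions from the sketches above) transcribes this sequence coding on lists:

```haskell
-- <x0,...,xk>, lh and component extraction of Definition 2.20.
codeSeq :: [Integer] -> Integer
codeSeq [] = 0
codeSeq xs = 1 + pair (fromIntegral (length xs) - 1) (pairK xs)

lh :: Integer -> Integer
lh 0 = 0
lh x = pi0 (x - 1) + 1

decodeSeq :: Integer -> [Integer]        -- ((x)_0, ..., (x)_{lh(x)-1})
decodeSeq 0 = []
decodeSeq x = unpairK (fromIntegral (lh x)) (pi1 (x - 1))

-- decodeSeq (codeSeq [3,1,4]) == [3,1,4] and lh (codeSeq [3,1,4]) == 3,
-- in line with Lemma 2.21.
```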

Lemma 2.21 (a) lh(〈〉) = 0, lh(〈x0, . . . , xk〉) = k + 1.

(b) For i ≤ k, (〈x0, . . . , xk〉)_i = xi.

(c) For x ∈ N, x = 〈(x)_0, . . . , (x)_{lh(x)−1}〉.

Proof: (a) lh(〈〉) = lh(0) = 0.

lh(〈x0, . . . , xk〉) = π0(〈x0, . . . , xk〉 − 1) + 1
                     = π0(π(k, π^{k+1}(x0, . . . , xk)) + 1 − 1) + 1
                     = k + 1


(b) lh(〈x0, . . . , xk〉) = k + 1. Therefore

(〈x0, . . . , xk〉)_i = π^{k+1}_i(π1(〈x0, . . . , xk〉 − 1))
                     = π^{k+1}_i(π1(1 + π(k, π^{k+1}(x0, . . . , xk)) − 1))
                     = π^{k+1}_i(π^{k+1}(x0, . . . , xk))
                     = xi          (Lemma 2.19 (a))

(c) If x = 0, then lh(x) = 0, and therefore 〈(x)_0, . . . , (x)_{lh(x)−1}〉 = 〈〉 = 0 = x.

If x > 0, let x − 1 = π(l, y). Then lh(x) = l + 1, (x)_i = π^{l+1}_i(y), and therefore

〈(x)_0, . . . , (x)_{lh(x)−1}〉 = 〈π^{l+1}_0(y), . . . , π^{l+1}_l(y)〉
                               = π(l, π^{l+1}(π^{l+1}_0(y), . . . , π^{l+1}_l(y))) + 1
                               = π(l, y) + 1          (Lemma 2.19 (b))
                               = x

Theorem 2.22 (a) If A is countable, so are A^k, A∗.

(b) If A, B are countable, so are A × B, A ∪ B.

(c) If A_n is a countable set for every n ∈ N, so is ⋃_{n∈N} A_n.

(d) Q, the set of rational numbers, is countable.

Proof:
(a) Assume A is countable. We show first that A∗ is countable: there exists an injective f : A → N. Define f∗ : A∗ → N∗, f∗(a0, . . . , an) = (f(a0), . . . , f(an)). Then f∗ is injective. λx.〈x〉 : N∗ → N is bijective. Therefore (λx.〈x〉) ◦ f∗ : A∗ → N is injective, and A∗ is countable. A^k ⊆ A∗, therefore A^k is countable as well.
(b) Assume A, B are countable. Show A × B, A ∪ B are countable. If A = ∅ or B = ∅, then A × B = ∅, therefore countable, and A ∪ B = A or A ∪ B = B, therefore countable. Assume A, B ≠ ∅. Then there exist surjections f : N → A and g : N → B. Then f × g : N^2 → A × B, (f × g)(n, m) := (f(n), g(m)) is surjective as well. Further, λx.(π0(x), π1(x)) : N → N^2 is surjective, therefore so is (f × g) ◦ (λx.(π0(x), π1(x))) : N → A × B. Therefore A × B is countable. Further, h : N^2 → A ∪ B, h(0, n) := f(n), h(k, n) := g(n) for k > 0, is surjective, therefore so is h ◦ (λx.(π0(x), π1(x))) : N → A ∪ B.
(c) Assume A_n is countable for every n ∈ N. Show A := ⋃_{n∈N} A_n is countable as well. If all A_n are empty, so is ⋃_{n∈N} A_n. Assume A_k is non-empty for some k. By replacing all A_l which are empty by A_k we get a sequence of sets s.t. their union is the same as A. So without loss of generality we can assume that A_n ≠ ∅ for all n. The A_n are countable, so there exist surjective f_n : N → A_n. Then f : N^2 → A, f(n, m) := f_n(m) is surjective as well. Therefore f ◦ (λx.(π0(x), π1(x))) : N → A is surjective as well, and A is countable.
(d) Z × N is countable, since Z and N are countable. Let A := {(z, n) ∈ Z × N | n ≠ 0}. A ⊆ Z × N, therefore A is countable as well. Since A ≠ ∅ and countable, there exists a surjection f : N → A. Define g : A → Q, g(z, n) := z/n. g is surjective, therefore so is g ◦ f : N → Q. So we have defined a surjective function from N onto Q, and Q is countable.

2.3 Reduction of Computability on some Data Types to Computability on N

We want to reduce computability on some data types A to computability on N. This avoids having to introduce the notion of computability for each of those types individually. In order to obtain


this, we need to encode elements of A as elements of N, i.e. we need a function code : A → N, and we need to decode them again, i.e. we need a decoding function decode : N → A. If these functions are computable, then from a computable function f : A → A we can obtain a computable function f̂ : N → N by taking an element n ∈ N, decoding it into A, then applying f, and then encoding the result back into a natural number:

          f
     A ------> A
     ^         |
     | decode  | code
     |         v
     N ------> N
          f̂

We want as well to recover from f̂ the original function f. For this we need that, if we first encode an element a ∈ A as a natural number and then decode it again, we obtain the original value a, i.e. we need decode(code(a)) = a.
If we have these properties, we say that A has a computable encoding into N. Since we are referring to the notion of “computable” in an intuitive sense, and since this is not a precise mathematical definition, we call the following an “informal definition”, and lemmata referring to it “informal lemmata”.
Informal Definition:
A data type A has a computable encoding into N, if there exist in an intuitive sense computable functions code : A → N and decode : N → A such that decode(code(a)) = a for all a ∈ A.

Remark: Note that we do not require that code(decode(n)) = n for every n ∈ N.
So every a ∈ A has a unique code in N, i.e. code is injective, as expressed by decode(code(a)) = a. However, not every n ∈ N needs to be a code for an element of A (which would follow from code(decode(n)) = n); code need not be surjective.

The most common data type used is the data type of finite strings over a finite alphabet A, and the set of such strings is A∗. Now A∗ has a computable encoding into N:

Informal Lemma:
If A is a finite set, then A∗ has a computable encoding into N.

“Proof”:
If A is empty, then A∗ = {〈〉}, and we can define code : A∗ → N, code(a) = 0, and decode : N → A∗, decode(n) = 〈〉. Both functions are clearly computable, and if a ∈ A∗, then a = 〈〉 and decode(code(a)) = decode(0) = 〈〉 = a.
Assume now A ≠ ∅, A = {a0, . . . , an}, where ai ≠ aj for i ≠ j. Define f : A → N, f(ai) = i, and g : N → A, g(i) = ai for i ≤ n, g(i) = a0 for i > n. f and g are computable in an intuitive sense, and therefore so are code : A∗ → N, code(b1, . . . , bk) = 〈f(b1), . . . , f(bk)〉 and the function decode : N → A∗, decode(m) := (g((m)_0), . . . , g((m)_{lh(m)−1})).
We show decode(code(x)) = x for x ∈ A∗: Let x = (b1, . . . , bk). Then

decode(code(x)) = decode(code(b1, . . . , bk))
                = decode(〈f(b1), . . . , f(bk)〉)
                = (g(f(b1)), . . . , g(f(bk)))
                = (b1, . . . , bk)
                = x .

Remark: Assume A, B have computable encodings codeA : A → N, decodeA : N → A and codeB : B → N, decodeB : N → B. If f : A → B is computable in an intuitive sense, then we can define an in an intuitive sense computable function f̂ : N → N, f̂ := codeB ◦ f ◦ decodeA:


[Diagram: A --f--> B on top and N --f̂--> N below; on the left, codeA : A → N downwards and decodeA : N → A upwards; on the right, codeB : B → N downwards and decodeB : N → B upwards.]

From f̂ we can recover f, since f = decodeB ◦ f̂ ◦ codeA (easy exercise). So by considering computability on N we cover computability on data types with a computable encoding into N as well.

2.4 Appendix: Some Mathematical Background

2.4.1 Injective – Surjective – Bijective

Definition Let f : A → B, A′ ⊆ A.

(a) f[A′] := {f(a) | a ∈ A′} is called the image of A′ under f.

(b) The image of A under f is called the image of f .

[Diagrams: the image f[A′] of a subset A′ ⊆ A under f, and the image f[A] of f.]

Definition 2.23 Let A, B be sets, f : A → B.

(a) f is injective or an injection or one-to-one, if f applied to different elements of A has different results: ∀a, b ∈ A. a ≠ b → f(a) ≠ f(b).

(b) f is surjective or a surjection or onto, if every element of B is in the image of f: ∀b ∈ B. ∃a ∈ A. f(a) = b.

(c) f is bijective or a bijection or a one-to-one correspondence iff it is both surjective and injective.

If we visualise a function by having arrows from elements a ∈ A to f(a) ∈ B, then we have the following:

• A function is injective, if for every element of B there is at most one arrow pointing to it:


[Diagrams: an injective and a non-injective function.]

• A function is surjective, if for every element of B there is at least one arrow pointing to it:

[Diagrams: a surjective and a non-surjective function.]

• A function is bijective, if for every element of B there is exactly one arrow pointing to it:

[Diagram: a bijective function.]

• Note that, since we have a function, for every element of A there is exactly one arrow originating from there.

2.4.2 Sequences vs. Functions N → A

A function f : N → A is nothing but an infinite sequence of elements of A numbered by elements of N, namely f(0), f(1), f(2), . . ., usually written as (f(n))_{n∈N}. We identify functions f : N → A with infinite sequences of elements of A. So the following all denote the same mathematical object:

• The function f : N → N, f(n) := 0 if n is odd, 1 if n is even.

• The sequence (1, 0, 1, 0, 1, 0, . . .).

• The sequence (a_n)_{n∈N} where a_n := 0 if n is odd, 1 if n is even.


2.4.3 Partial Functions

A partial function f : A ∼→ B is the same as a function f : A → B, but it need not be defined for all values, so we do not require that for every a ∈ A there is a value f(a).
The key example of a partial function is the function computed by a computer program, which has an input a ∈ A and possibly returns an output f(a) ∈ B. (We assume that the program always returns the same result, which can be guaranteed by forbidding references to variables which are defined outside the scope of this function.) If the program applied to some a ∈ A returns a value b ∈ B, then f(a) is defined and equal to b. If the program applied to a does not terminate, then f(a) is undefined.
Other examples of partial functions are the function f : R ∼→ R, f(x) = 1/x, which is not defined for x = 0; or g : R ∼→ R, g(x) = √x, which is only defined for x ≥ 0.

Definition 2.24

• Let A, B be sets. A partial function f from A to B, written f : A∼→ B, is a function

f : A′ → B for some A′ ⊆ A. A′ is called the domain of f , written as A′ = dom(f).

• Let f : A∼→ B.

– f(a) is defined, written as f(a) ↓, if a ∈ dom(f).

– f(a) ' b (f(a) is partially equal to b) :⇔ f(a) ↓ ∧f(a) = b.

We want to work with terms like f(g(2), h(3)), where f, g, h are partial functions. The question here is what happens if g(2) or h(3) is undefined. There is a theory of partial functions in which f(g(2), h(3)) might be defined even if g(2) or h(3) is undefined. This makes sense for instance for the function f : N^2 ∼→ N, f(x, y) = 0. Such functions, which are defined even if some of their arguments are undefined, are called non-strict, whereas functions which are defined only if all of their arguments are defined are called strict. In this lecture, functions will always be strict. Therefore a term like f(g(2), h(3)) is defined only if g(2) and h(3) are defined, and if f applied to the results of evaluating g(2) and h(3) is defined. f(g(2), h(3)) is then evaluated as for ordinary functions: We first compute g(2) and h(3), and then evaluate f applied to the results of those computations.

Definition 2.25

• For expressions t formed from constants, variables and partial functions we define t ↓ and t ' b as follows:

– If t = a is a constant, then t ↓ is always true and t ' b :⇔ a = b.

– If t = x is a variable, then t ↓ is always true and t ' b :⇔ b = x.

– f(t1, . . . , tn) ' b :⇔ ∃a1, . . . , an. t1 ' a1 ∧ · · · ∧ tn ' an ∧ f(a1, . . . , an) ' b .

– f(t1, . . . , tn) ↓ :⇔ ∃b. f(t1, . . . , tn) ' b .

• s ↑:⇔ ¬(s ↓).

• We define for expressions s, t formed from constants and partial functions s ' t :⇔ (s ↓ ↔ t ↓) ∧ (s ↓ → ∃a, b. s ' a ∧ t ' b ∧ a = b).

• t is total means t ↓.

• A function f : A∼→ B is total, iff ∀a ∈ A.f(a) ↓.


Remark: Total partial functions are ordinary (non-partial) functions.

Remark: Quantifiers always range over defined elements. So by ∃m.f(n) ' m we mean: there exists a defined m s.t. f(n) ' m. So from f(n) ' g(k) we cannot conclude ∃m.f(n) ' m unless g(k) ↓.

Remark 2.26

(a) If s ' a, s ' b, then a = b.

(b) For all terms we have t ↓⇔ ∃a.t ' a.

(c) f(t1, . . . , tn) ↓ ⇔ ∃a1, . . . , an. t1 ' a1 ∧ · · · ∧ tn ' an ∧ f(a1, . . . , an) ↓.

Examples:
Assume f : N ∼→ N, dom(f) = {n ∈ N | n > 0}, f(n) := n + 1 for n ∈ dom(f). Let g : N ∼→ N, dom(g) = {0, 1, 2}, g(n) := n. Then we have

• f(1) ↓, f(0) ↑, f(1) ' 2, f(0) ≄ n for all n.

• g(f(0)) ↑, since f(0) ↑.

• g(f(1)) ↓, since f(1) ↓, f(1) ' 2, g(2) ↓.

• g(f(2)) ↑, since f(2) ↓, f(2) ' 3, but g(3) ↑.

• g(f(2)) ' f(0), since both expressions are undefined.

• g(f(1)) ' f(g(1)), since both sides are defined and equal to 2.

• g(f(2)) ≄ f(g(1)), since the left hand side is undefined and the right hand side is defined.

• f(f(1)) ≄ f(1), since both sides evaluate to different (defined) values (3 and 2).

• +, · etc. can be treated as partial functions. So for instance

  – f(1) + f(2) ↓, since f(1) ↓, f(2) ↓, and + is total.

  – f(1) + f(2) ' 5.

  – f(0) + f(1) ↑, since f(0) ↑.

Remark: Strict evaluation of functions corresponds to the so-called “call-by-value” evaluation of functions, as it is done in most imperative and some functional programming languages: when evaluating f(t1, . . . , tn), one first evaluates t1, . . . , tn. Then one evaluates f applied to the results of this evaluation. Undefined values correspond to an infinite computation, and if one of the ti is undefined, the computation of f(t1, . . . , tn) doesn’t terminate.
In Haskell we have “call-by-name” evaluation, which means the ti are evaluated only if they are needed in the computation of f. For instance, if we have f : N^2 → N, f(x, y) = x, and t is an undefined term, then with call-by-need the term f(2, t) can be evaluated to 2, since we never need to evaluate t. So functions in Haskell are non-strict. In our setting, functions are strict, so f(2, t) is undefined.
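A short Haskell sketch of the distinction (added illustration; `loop` plays the role of the undefined term t):

```haskell
loop :: Integer
loop = loop                 -- a non-terminating, i.e. undefined, term

first :: Integer -> Integer -> Integer
first x _ = x               -- the function f(x, y) = x from the text

strictFirst :: Integer -> Integer -> Integer
strictFirst x y = y `seq` x -- forcing y first models the strict reading

-- Under Haskell's non-strict evaluation, first 2 loop evaluates to 2, since
-- the second argument is never needed; strictFirst 2 loop does not terminate,
-- matching the strict (call-by-value) reading used in these notes, under
-- which f(2, t) is undefined when t is undefined.
```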


2.4.4 λ-Notation

When used in an informal context (i.e. outside the λ-calculus to be introduced later), by λx.t we mean the function mapping x to t. E.g. λx.x + 3 is the function f s.t. f(x) = x + 3, and λx.√x is the function mapping x to √x. This notation will be used if we want to introduce such a function without giving it a name. Note that we do not specify the domain and codomain – when this notation is used, they will either not matter or be clear from the context.

2.4.5 Some Standard Sets

• N is the set of natural numbers, i.e.

N := {0, 1, 2, . . .} .

– Note that 0 is a natural number.

– When counting elements of a sequence, e.g. when counting the arguments of a function, we will usually start with 0:

∗ The 0th element of a sequence is what is usually called the first element. (So, when considering variables x0, . . . , xn−1, x0 is the 0th variable.)

∗ The first element of a sequence is what is usually called the second element. (So, when considering variables x0, . . . , xn−1, x1 is the first variable.)

∗ etc.

• Z is the set of integers:
Z := N ∪ {−x | x ∈ N} .

• Q is the set of rationals, i.e.

Q := { x/y | x, y ∈ Z, y ≠ 0 } .

• R is the set of real numbers.

• If A, B are sets, then A × B is the product of A and B, i.e.

A × B := {(x, y) | x ∈ A ∧ y ∈ B}

• If A is a set, k ∈ N, k > 0, then A^k is the set of k-tuples of elements of A, i.e.

A^k := {(x0, . . . , xk−1) | x0, . . . , xk−1 ∈ A} .

We identify A^1 (which is, formally correctly, {(x) | x ∈ A}) with A. (Note that, if one is pedantic, (x) is different from x: (x) is the tuple consisting of the one element x, whereas x is one element from which we haven’t formed a tuple. Ignoring this difference usually doesn’t cause problems.)

2.4.6 Relations, Predicates and Sets

• A predicate on a set A is a property P of elements of A. In this lecture, A will usually be Nk

for some k ∈ N, k > 0.

• We write P (a) for “predicate P is true for the element a of A”.

• We often write “P (x) holds” for “P (x) is true”.

• We can use P (a) in formulas. Therefore:


– ¬P (a) (“not P (a)”) means that “P (a) is not true”.

– P(a) ∧ Q(a) means that “both P(a) and Q(a) are true”.

– P(a) ∨ Q(a) means that “P(a) or Q(a) is true” (in particular, if both P(a) and Q(a) are true, then P(a) ∨ Q(a) is true as well).

– ∀x ∈ B.P (x) means that “for all elements x of the set B P (x) is true”.

– ∃x ∈ B.P (x) means that “there exists an element x of the set B s.t. P (x) is true”.

• In this lecture, “relation” is another word for “predicate”.

• We identify a predicate P on a set A with {x ∈ A | P(x)}. Therefore predicates and sets will be identified. E.g., if P is a predicate,

– x ∈ P stands for x ∈ {x ∈ A | P (x)}, which is equivalent to P (x),

– ∀x ∈ P.ϕ(x) for a formula ϕ stands for ∀x.P (x) → ϕ(x).

– etc.

• An n-ary relation or predicate on N is a relation P ⊆ Nn. A unary, binary, ternary relationon N is a 1-ary, 2-ary, 3-ary relation on N, respectively.

• An n-ary function on N is a function f : Nn → N. A unary, binary, ternary function on N isa 1-ary, 2-ary, 3-ary function on N, respectively.

2.4.7 Other Notations

The “dot”-notation. When writing expressions such as ∀x.A(x) ∧ B(x), the convention is that if we have a dot after the quantifier together with a variable (in the example the dot after “∀x”), then the scope of the quantifier includes as much as possible to the right. So in the example ∀x. refers to A(x) ∧ B(x), and the formula expresses that for all x both A(x) and B(x) hold, or, using brackets, ∀x.(A(x) ∧ B(x)). However, in (A → ∀x.B(x) ∧ C(x)) ∨ D(x), ∀x refers only to B(x) ∧ C(x), since this is the maximum scope possible – it doesn't make sense to include ") ∨ D(x)" into the scope of ∀x as well.
Similarly, in ∃x.A(x) ∧ B(x), ∃x refers to A(x) ∧ B(x), whereas in (A ∧ ∃x.B(x) ∨ C(x)) ∧ D(x), ∃x refers to B(x) ∨ C(x).
This applies as well to λ-expressions, so λx.x + x is the function taking an x and returning x + x.

~x, ~y etc. We will often refer to arguments of functions and relations, to which we don't refer explicitly. An example are the variables x0, . . . , xn−1 in the definition of a function f : Nn+1 → N,

f(x0, . . . , xn−1, y) = g(x0, . . . , xn−1)  if y = 0,
f(x0, . . . , xn−1, y) = h(x0, . . . , xn−1)  if y > 0.

In order to avoid having to write x0, . . . , xn−1 repeatedly in such examples, we write ~x for it. In general, ~x stands for x0, . . . , xn−1, where n, i.e. how many variables are meant by ~x, is usually clear from the context.
Examples:

• f : Nn+1 → N, then in f(~x, y), ~x needs to stand for n arguments, therefore ~x = x0, . . . , xn−1.

• If f : Nn+2 → N, then in f(~x, y), ~x needs to stand for n + 1 arguments, so ~x = x0, . . . , xn.

• If P is an n + 4-ary relation, then in P (~x, y, z), ~x stands for x0, . . . , xn+1.

Similarly, we write ~y for y0, . . . , yn−1, where n is clear from the context, similar for ~z, ~n, ~m, etc.


3 The Unlimited Register Machine (URM) and the HaltingProblem

A model of computation consists of a set of partial functions together with methods which describe how to compute those functions. We will usually consider models of computation which contain all computable functions. One usually aims at describing as simple a model of computation as possible, i.e. to minimise the constructs used, while still being able to represent all computable functions. This makes it easier to show for another model of computation that the first model can be interpreted in it. Furthermore, models of computation are used more for showing that something is not computable than for showing that something is computable, and this is easier if the model of computation is simple.

The URM (the unlimited register machine) is one model of computation which is particularly easy to understand. It is a virtual machine, i.e. it is a description of how a computer would execute its program, but it is not intended to be actually implemented. It is an abstract virtual machine, so we do not intend to simulate this machine on another computer (although there are implementations) – the machine serves as a mathematical model, which is then investigated mathematically. For instance, one usually doesn't write programs in it – instead one shows that in principle there is a way of writing a certain program in this language. In fact it turns out that it is difficult to write actual programs for the URM. We will have a very low level programming language – there will be no functions, objects, etc., not even while loops. The only loop construct will be the goto. So the URM does not support structured programming in any sense.

A URM is an idealised machine, as expressed by the word "unlimited". So we don't assume any bounds on the amount of memory or execution time – however all values will be finite.

There exist various variants of URMs; we will here introduce a URM which is particularly simple.

3.1 Syntax and Semantics of the URM

• The URM consists of

– infinitely many registers Ri, each of which can store an arbitrarily big natural number;

– a finite sequence of instructions I0, I1, I2, . . . In;

– and a program counter PC, which can store a natural number. If it contains a number 0 ≤ i ≤ n, this means that it points to instruction Ii. If the PC is set to another number, the program will stop.

[Diagram: the registers R0, R1, R2, . . . , the instruction sequence I0, I1, . . . , In, and the program counter PC. Solid line: execute instruction; dotted line: program has terminated.]

• There are three kinds of URM instructions:


– The successor instruction succ(n) where n ∈ N.

If the PC points to this instruction, the URM performs the following operation: It adds 1 to register Rn, and then increments the PC by one. So the PC then points either to the next instruction, if there is one, or the program stops, if there is none.

– The predecessor instruction pred(n), where n ∈ N.

If the PC points to this instruction, the URM performs the following operation: If Rn

contains a value > 0, then it subtracts 1 from it, otherwise it leaves it as it is. Then the PC is incremented by 1.

– The conditional jump instruction ifzero(n, q), where n, q ∈ N. If the PC points to this instruction, the URM performs the following operation:

∗ If Rn contains the value 0, then the PC is set to the value q – if there is an instruction Iq, the URM continues executing that instruction; if there is no such instruction the program will stop.

∗ If Rn contains a value > 0, then the PC is incremented by 1, so the program continues executing the next instruction, or it stops if the current instruction is the last one.

• Note that a URM program I0, . . . , In will refer only to finitely many registers, namely thoseexplicitly occurring in I0, . . . , In.

• If P = I0, . . . , In is a URM program, it computes for every k ∈ N, k > 0, a function P (k) : Nk ∼→ N. P (k)(a0, . . . , ak−1) is computed as follows:

– Initialisation: Set the PC to 0, store a0, . . . , ak−1 in registers R0, . . . , Rk−1, respectively, and set all other registers to 0 (it suffices to do this for those registers referenced in the program).

– Iteration: As long as the PC holds a value j ∈ {0, . . . , n}, execute instruction Ij and continue with the next instruction as given by the PC.

– Output: If the PC value is bigger than n, the program stops, and the function returns the value b contained in register R0: P (k)(a0, . . . , ak−1) := b. If the program never stops, then P (k)(a0, . . . , ak−1) is undefined.

• A partial function f : Nk ∼→ N is URM-computable, if f = P (k) for some k ∈ N and some URM program P .

Remark:

1. A URM program P defines many URM-computable functions:

– As a unary function P(1) : N∼→ N, one stores its argument in R0, sets all other registers

to 0 and then executes P.

– As a binary function P(2) : N2 ∼→ N, one stores its two arguments in R0, R1, sets allother registers to 0 and then executes P.

– As a ternary function P(3) : N3 ∼→ N, one stores its three arguments in R0, R1, R2, setsall other registers to 0 and then executes P.

– etc.

2. For a partial function f to be computable we do not need to be able to determine whether f(n) is defined or not. We only need to determine, in case f(n) ↓, after a finite amount of time that f(n) is actually defined, and to compute the value of f(n). In case f(n) ↑, we will wait infinitely long for an answer.

The Turing halting problem, which will be discussed later, is the problem of deciding whether f(n) ↓ or f(n) ↑. We will see later that this problem is undecidable.


If one wants to make sure that one always gets an answer, one has to consider total computable functions f, i.e. computable functions which are always defined.

In order to introduce the total computable functions, we first have to introduce the partial computable functions and then take the subset of those functions which are total. There is no programming language (in which it is decidable whether a program is actually a program) in which all functions are total and which allows one to define all computable functions.

Example: The function f : N ∼→ N, f(x) ' 0 is URM-computable. We derive a URM program for computing it in several steps.

Step 1: The program should, if initially R0 contains x and all other registers contain 0, terminate with value 0 in register R0. A higher level program (not in the language of the URM) would be as follows:

R0:= 0

Step 2: Since we have only a successor and a predecessor operation available, we replace the program by the following:

while R0 ≠ 0 do {R0 := R0 −· 1}

Here x −· y := max{x − y, 0}, i.e. x −· y = x − y if y ≤ x, and x −· y = 0 otherwise.

Step 3: We replace the while-loop by a goto:

LabelBegin : if R0 = 0 then goto LabelEnd;
             R0 := R0 −· 1;
             goto LabelBegin;

LabelEnd :

Step 4: The last goto will be replaced by a conditional goto, depending on the condition R1 = 0. Since R1 is initially 0, and is never changed during the program, this jump will always be carried out.

LabelBegin : if R0 = 0 then goto LabelEnd;
             R0 := R0 −· 1;
             if R1 = 0 then goto LabelBegin;

LabelEnd :

Step 5: We translate the program into a URM program I0, I1, I2:

I0 = ifzero(0, 3)
I1 = pred(0)
I2 = ifzero(1, 0)
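The behaviour of the URM just described is easy to simulate on a computer. The following Python sketch is only our own illustration (the tuple encoding of instructions is an assumption, not part of these notes); applied to the program I0, I1, I2 above it returns 0, and, like the URM itself, it does not terminate if the simulated program does not:

def run_urm(program, args):
    # program: list of instructions, each of the form
    #   ("succ", n), ("pred", n) or ("ifzero", n, q)
    # args: initial contents of R0, R1, ...; all other registers are 0
    regs = {i: a for i, a in enumerate(args)}
    pc = 0
    while 0 <= pc < len(program):
        ins = program[pc]
        if ins[0] == "succ":
            regs[ins[1]] = regs.get(ins[1], 0) + 1
            pc += 1
        elif ins[0] == "pred":
            regs[ins[1]] = max(regs.get(ins[1], 0) - 1, 0)
            pc += 1
        else:                      # ("ifzero", n, q)
            pc = ins[2] if regs.get(ins[1], 0) == 0 else pc + 1
    return regs.get(0, 0)          # the value of R0 when the program stops

P = [("ifzero", 0, 3), ("pred", 0), ("ifzero", 1, 0)]
print(run_urm(P, [7]))             # prints 0, as P(1)(7) ' 0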

3.2 Translating Higher-Level Constructs into the Language of URM

Remark on jump addresses. In the following, we will often insert URM programs P as part of another URM program P ′, for instance as follows:

succ(0)
P
pred(0)

By this we mean that all jump addresses in P are adapted accordingly, so that they fit the new numbers of the instructions. In the example above, for instance, one has to add 1 to each jump address, since the new number of each instruction is its old number increased by 1.
Furthermore, when inserting a piece of program, we assume that whenever P terminates, it terminates with the PC equal to the number of the first instruction following it. So in the example above, if the execution of P as part of P ′ terminates, then the next instruction is the instruction pred(0). This can be achieved by renaming jump addresses which jump outside of P accordingly.

In order to introduce more complex URM programs, we introduce some constructions for formingURM programs:


Labelled URM programs: We replace numbers as jump addresses by labels. A label is a symbol we assign to certain program lines. If mylabel refers to Ik, then ifzero(n, mylabel) stands for ifzero(n, k). It is trivial to translate a program with labels into an ordinary URM program. LabelEnd will be a special label, denoting the first instruction following the program.
So the above program reads with labels as follows:

LabelBegin : I0 = ifzero(0, LabelEnd)
             I1 = pred(0)
             I2 = ifzero(1, LabelBegin)

We can now omit the notations Ik = and write the program as follows:

LabelBegin : ifzero(0, LabelEnd)
             pred(0)
             ifzero(1, LabelBegin)
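Resolving labels into numeric jump addresses is a purely mechanical step; the following Python sketch is our own illustration (the pair representation of labelled instructions is an assumption, not part of these notes). LabelEnd is mapped to the first position after the program:

def resolve_labels(labelled_program):
    # labelled_program: list of (label_or_None, instruction) pairs, where an
    # instruction is e.g. ("ifzero", 0, "LabelEnd") or ("pred", 0)
    addr = {"LabelEnd": len(labelled_program)}
    for i, (label, _) in enumerate(labelled_program):
        if label is not None:
            addr[label] = i
    resolved = []
    for _, ins in labelled_program:
        if ins[0] == "ifzero":
            resolved.append(("ifzero", ins[1], addr[ins[2]]))
        else:
            resolved.append(ins)
    return resolved

prog = [("LabelBegin", ("ifzero", 0, "LabelEnd")),
        (None,         ("pred", 0)),
        (None,         ("ifzero", 1, "LabelBegin"))]
print(resolve_labels(prog))
# prints [('ifzero', 0, 3), ('pred', 0), ('ifzero', 1, 0)]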

Replacement of registers by variable names. Similarly we write variable names instead of registers. So, if we write x for R0 and y for R1, the above program reads as follows:

LabelBegin : ifzero(x, LabelEnd)
             pred(x)
             ifzero(y, LabelBegin)

Writing of statements in a more readable way:

• x := x + 1; stands for succ(x).

• x := x−· 1; stands for pred(x).

• if x = 0 then goto mylabel; stands for ifzero(x, mylabel).

The above program reads now as follows:

LabelBegin : if x = 0 then goto LabelEnd;
             x := x −· 1;
             if y = 0 then goto LabelBegin;

Introduction of more complex statements. We now introduce some more complex statements, which translate back into URM programs, using any of the new statements introduced before. Most new statements will require some extra registers, which are not used elsewhere in the program and are not used for storing the input and the output of the function. This is not a problem, since we have arbitrarily many registers available. The auxiliary registers will always be set back to 0 at the end of the statement. Since all registers, except for the argument registers, are initially 0, and since other statements don't change those registers, they will always be zero when such a statement is started. By "let aux be a new variable" we mean that aux is a variable denoting a new register. LabelEnd will in the following denote the next instruction, following the instructions forming the complex statement.

(a) The statement

goto mylabel;

executes an unconditional jump to statement with label mylabel. It stands for the (labelled)URM statement:

if aux = 0 then goto mylabel;

where aux is a new variable.


(b) If 〈Instructions〉 is a sequence of instructions, the following statement

while x ≠ 0 do {〈Instructions〉};

stands for the following URM program:

LabelLoop : if x = 0 then goto LabelEnd;
            〈Instructions〉
            goto LabelLoop;

(c) The statement

x := 0

sets the register associated with x to 0. It stands for the following program:

while x ≠ 0 do {x := x −· 1; };

(d) The statement

y := x;

sets the register associated with y to the value of x. All other registers (except for auxiliary registers), including x, will remain unchanged. Let aux be a new variable.

In case x and y denote the same variable, it stands for the empty URM program (no instructions). Otherwise the program is executed in two steps: First we decrement x and simultaneously increment aux, until x is 0. At the end x contains 0, and aux contains the original value of x. Then we set y to 0 and then, in a loop, we decrement aux and simultaneously increment x and y, until aux = 0. At the end, x and y contain the previous value of aux, which is the original value of x, and aux = 0. The complete program is as follows:

while x ≠ 0 do {
    x := x −· 1;
    aux := aux + 1; };
y := 0;
while aux ≠ 0 do {
    aux := aux −· 1;
    x := x + 1;
    y := y + 1; };

(e) Assume x, y, z denote different registers. The statement

x := y + z;

computes in x the sum of y and z. All other registers, including y, z, except for auxiliary ones, remain as they are. It is computed as follows (aux is an additional variable):

x := y;
aux := z;
while aux ≠ 0 do {
    aux := aux −· 1;
    x := x + 1; };

(f) Assume x, y, z denote different registers. The statement

x := y−· z;


computes in x the difference y −· z. If z is bigger than y, then x becomes 0. All other registers, including y, z, except for auxiliary ones, remain as they are. It is computed as follows (aux is an additional variable):

x := y;
aux := z;
while aux ≠ 0 do {
    aux := aux −· 1;
    x := x −· 1; };

(g) Assume x, y denote different registers, and let 〈Statements〉 be a sequence of statements. The statement

while x ≠ y do {〈Statements〉};

executes 〈Statements〉 as long as x ≠ y. It can be defined as follows (aux, aux0, aux1 are new variables):

aux0 := x −· y;
aux1 := y −· x;
aux := aux0 + aux1;
while aux ≠ 0 do {
    〈Statements〉
    aux0 := x −· y;
    aux1 := y −· x;
    aux := aux0 + aux1; };

We could now continue and introduce more and more complex statements into the language of the URM, and it should be clear by now that in principle one could translate programs of high-level programming languages into URM programs. We will stop here, since we now have enough material in order to prove the next lemmata.

3.3 Constructions for Introducing URM-Computable Functions

We introduce notations for some partial functions

Definition 3.1 (a) Define the zero function zero : N∼→ N, zero(x) = 0.

(b) Define the successor function succ : N∼→ N, succ(x) = x + 1.

(c) Define for 0 ≤ i < n the projection function projni : Nn ∼→ N, projni (x0, . . . , xn−1) = xi.

(d) Assume g : (B0 × · · · × Bk−1) ∼→ C, and hi : A ∼→ Bi (i = 0, . . . , k − 1). Then define the composition of g with h0, . . . , hk−1 as g ◦ (h0, . . . , hk−1) : A ∼→ C,

(g ◦ (h0, . . . , hk−1))(a) :' g(h0(a), . . . , hk−1(a))

(e) Assume g : Nk ∼→ N, h : Nk+2 ∼→ N. Then we define the function defined by primitive recursion from g and h, namely primrec(g, h) : Nk+1 ∼→ N, as follows: Let f := primrec(g, h).

f(n0, . . . , nk−1, 0) :' g(n0, . . . , nk−1)
f(n0, . . . , nk−1, m + 1) :' h(n0, . . . , nk−1, m, f(n0, . . . , nk−1, m))

In the special case k = 0, it doesn't make sense to use g() – in this case let g be a natural number instead. So the case k = 0 reads as follows:


We define for n ∈ N, h : N2 ∼→ N, f := primrec(n, h) : N∼→ N:

f(0) :' n

f(m + 1) :' h(m, f(m))

(f) Let g : Nn+1 ∼→ N. We define µ(g) : Nn ∼→ N,

µ(g)(x0, . . . , xn−1) ' (µy.g(x0, . . . , xn−1, y) ' 0)

where we define the partial expression µy.g(x0, . . . , xn−1, y) ' 0 as follows:

(µy.g(x0, . . . , xn−1, y) ' 0) :' the least y ∈ N s.t. g(x0, . . . , xn−1, y) ' 0 and g(x0, . . . , xn−1, y′) ↓ for 0 ≤ y′ < y, if such y exists; undefined otherwise.
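The schemes of Definition 3.1 can be read as higher-order functions. The following Python sketch is only our own illustration (total Python functions play the role of the partial functions; a diverging µ-search corresponds to an undefined value):

def zero(x):
    return 0

def succ(x):
    return x + 1

def proj(n, i):
    # the projection of the i-th out of n arguments; n only documents the arity
    return lambda *xs: xs[i]

def compose(g, *hs):
    # (g o (h0, ..., h_{k-1}))(a) = g(h0(a), ..., h_{k-1}(a))
    return lambda *xs: g(*[h(*xs) for h in hs])

def primrec(g, h):
    # f(xs, 0)     = g(xs)
    # f(xs, m + 1) = h(xs, m, f(xs, m))
    def f(*args):
        xs, y = args[:-1], args[-1]
        value = g(*xs)
        for m in range(y):
            value = h(*(xs + (m, value)))
        return value
    return f

def mu(g):
    # mu(g)(xs) = least y with g(xs, y) = 0; loops forever if there is none
    def f(*xs):
        y = 0
        while g(*(xs + (y,))) != 0:
            y += 1
        return y
    return f

add = primrec(lambda x: x, lambda x, y, z: z + 1)   # cf. the first example below
print(add(3, 4))                                    # prints 7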

Examples for definition by primitive recursion.

• Addition can be defined using primitive recursion: Let f(x, y) := x + y. We have

f(x, 0) = x + 0 = x ,

f(x, y + 1) = x + (y + 1) = (x + y) + 1 = f(x, y) + 1 .

Therefore

f(x, 0) = g(x) ,

f(x, y + 1) = h(x, y, f(x, y)) ,

where g : N → N, g(x) := x, and h : N3 → N, h(x, y, z) := z + 1. So f = primrec(g, h).

• Multiplication can be defined using primitive recursion: Let f(x, y) := x · y. We have

f(x, 0) = x · 0 = 0 ,

f(x, y + 1) = x · (y + 1) = x · y + x = f(x, y) + x .

Therefore, we have

f(x, 0) = g(x) ,

f(x, y + 1) = h(x, y, f(x, y)) ,

where g : N → N, g(x) := 0, and h : N3 → N, h(x, y, z) := z + x. So f := primrec(g, h).

• Define pred : N → N, pred(n) := n −· 1, i.e. pred(n) = n − 1 if n > 0, and pred(n) = 0 otherwise.

pred can be defined using primitive recursion:

pred(0) = 0 ,

pred(x + 1) = x .

Therefore, we have

pred(0) = 0 ,

pred(x + 1) = h(x, pred(x)) ,

where h : N2 → N, h(x, y) := y. Therefore, pred = primrec(0, h).


• x−· y can be defined using primitive recursion: Let f(x, y) := x−· y. We have

f(x, 0) = x−· 0 = x ,

f(x, y + 1) = x−· (y + 1) = (x−· y)−· 1 = pred(f(x, y)) .

Therefore,

f(x, 0) = g(x) ,

f(x, y + 1) = h(x, y, f(x, y)) ,

where g : N → N, g(x) := x, and h : N3 → N, h(x, y, z) := pred(z). Therefore, f =primrec(g, h).

Examples for definition by µ.

• Let f : N2 → N, f(x, y) := x−· y. Then µ(f)(x) ' (µy.f(x, y) ' 0) ' x.

• Let f : N∼→ N, f(0) ↑, f(n) = 0 for n > 0. Then (µy.f(y) ' 0) ↑.

• Let f : N → N,

f(n) := 1 if there exist primes p, q < 2n + 4 s.t. 2n + 4 = p + q, and f(n) := 0 otherwise.

Then µy.f(y) ' 0 is the first n s.t. there do not exist primes p, q with 2n + 4 = p + q. Goldbach's conjecture says that every even number ≥ 4 is the sum of two primes. Goldbach's conjecture is therefore equivalent to (µy.f(y) ' 0) ↑. It is one of the most important open problems in mathematics to show (or refute) Goldbach's conjecture. If we could decide whether a partial computable function is defined at a given argument (which we can't), we could decide Goldbach's conjecture.
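To make the last example concrete, here is f as a Python sketch (our own illustration; is_prime is a helper we introduce). Applying the mu combinator from the sketch above to it searches for a counterexample to Goldbach's conjecture and terminates exactly if one exists:

def is_prime(p):
    return p >= 2 and all(p % d != 0 for d in range(2, p))

def f(n):
    # 1 if 2n + 4 is the sum of two primes p, q < 2n + 4, and 0 otherwise
    m = 2 * n + 4
    if any(is_prime(p) and is_prime(m - p) for p in range(2, m)):
        return 1
    return 0

# mu(f)() would return the least n with f(n) = 0, i.e. the first
# counterexample to Goldbach's conjecture, and diverges if there is none.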

We want to show that the set of URM-computable functions is closed under the operations just introduced. The notion of a URM-computable function is not directly suitable for this, since it refers to specific registers, and since the values of the arguments are destroyed by running the program.

Remark on µ. We need in the definition of µ the condition “g(x0, . . . , xn−1, y′) ↓ for 0 ≤ y′ < y”.

If we defined instead

(µ′y.g(x0, . . . , xn−1, y) ' 0) :' the least y ∈ N s.t. g(x0, . . . , xn−1, y) ' 0, if such y exists; undefined otherwise,

then in general t := (µ′y.g(x0, . . . , xn−1, y) ' 0) would be non-computable: Assume for instance g(x0, . . . , xn−1, 1) ' 0. Before we have finished computing g(x0, . . . , xn−1, 0), we don't know yet whether its value is 0, non-zero or undefined. But we have t ' 0 ⇔ g(x0, . . . , xn−1, 0) ' 0 and t ' 1 ⇔ g(x0, . . . , xn−1, 0) 6' 0, where the latter includes the case g(x0, . . . , xn−1, 0) ↑.
The above is only a heuristic, and does not yet show that we cannot compute t by using some trick. However, one can reduce the Turing halting problem to the problem of determining the value of t, and therefore show that t is non-computable for some functions g.

Lemma and Definition 3.2 Assume f : Nk ∼→ N is URM-computable. Assume x0, . . . , xk−1, y, z0, . . . , zl are variable names for different registers. Then one can define a URM program which computes f(x0, . . . , xk−1) and stores the result in y, in the following sense: If f(x0, . . . , xk−1) ↓, then the program ends at the first instruction following this program, and stores the result in y. If f(x0, . . . , xk−1) ↑, the program never terminates. Further, the program can be defined so that it doesn't change the arguments x0, . . . , xk−1 and the variables z0, . . . , zl.
For such a program P we say that it is a URM program which computes y ' f(x0, . . . , xk−1) and avoids z0, . . . , zl.


Proof: Let P be a URM program s.t. P (k) = f. Let u0, . . . , uk−1 be registers different from the above. By renumbering registers and jump addresses, we obtain a program P ′ which computes the result of f(u0, . . . , uk−1) in u0 (if it is defined – if not, it does not terminate), leaves the registers mentioned in the lemma unchanged, and which, if it terminates, terminates at the first instruction following P ′. The following is a program as intended:

u0 := x0;
· · ·
uk−1 := xk−1;
P ′
y := u0;

Lemma 3.3 (a) zero, succ and projni are URM-computable.

(b) If f : Nn ∼→ N, gi : Nk ∼→ N are URM-computable, so is f ◦ (g0, . . . , gn−1).

(c) If g : Nn ∼→ N, and h : Nn+2 ∼→ N are URM-computable, so is primrec(g, h).

(d) If g : Nn+1 ∼→ N is URM-computable, so is µ(g).

Proof: Let xi denote register Ri.

Proof of (a)

• zero is computed by the following program:

x0 := 0.

• succ is computed by the following program:

x0 := x0 + 1.

• projnk is computed by the following program:

x0 := xk.

Proof of (b): Assume f : Nn ∼→ N, gi : Nk ∼→ N are URM-computable. Show that f ◦ (g0, . . . , gn−1) is computable. A plan for the program is as follows:

• The input is stored in registers x0, . . . , xk−1. Let ~x := x0, . . . , xk−1.

• First we compute gi(~x) for i = 0, . . . , n − 1 and store the result in registers yi.

• Then we compute f(y0, . . . , yn−1) (which is ' f(g0(~x), . . . , gn−1(~x))), and store the result inthe output register x0.

A problem could be that, when computing yi ' gi(~x),

• we might change one of the registers ~x, which are still to be used by later computations ofyj ' gj(~x) for j > i

• we might change yj for j < i which contains the result from previous computations of gj(~x).

We have already seen in Lemma and Definition 3.2 that we can define programs, which avoidchanging their arguments and certain other registers. Therefore programs Pi as follows do exist.

• Let Pi be a URM program (i = 0, . . . , n − 1) which computes yi ' gi(~x) and avoids yj for j ≠ i. (Note that our definition includes that the arguments are not changed either.)


• Let Q be a URM program, which computes x0 ' f(y0, . . . , yn−1).

A URM program R for computing f ◦ (g0, . . . , gn−1) is defined as follows:

P0
· · ·
Pn−1
Q

We show R(k)(~x) ' (f ◦ (g0, . . . , gn−1))(~x).

• Case 1: For some i, gi(~x) ↑. The program will loop in program Pi for the first such i, therefore R(k)(~x) ↑. Further, (f ◦ (g0, . . . , gn−1))(~x) ↑, so the program computes correctly.

• Case 2: gi(~x) ↓ for all i. The program will execute each Pi, establishing yi ' gi(x0, . . . , xk−1), and reaches the beginning of program Q.

– Case 2.1: f(g0(~x), . . . , gn−1(~x)) ↑. The program Q will loop, and we have R(k)(~x) ↑ and (f ◦ (g0, . . . , gn−1))(~x) ↑.

– Case 2.2: Otherwise, the program will reach the end of program Q and result in x0 ' f(g0(~x), . . . , gn−1(~x)), so R(k)(~x) ' (f ◦ (g0, . . . , gn−1))(~x).

In all cases, we have R(k)(~x) ' (f ◦ (g0, . . . , gn−1))(~x).

Proof of (c): Assume g : Nn ∼→ N and h : Nn+2 ∼→ N are URM-computable. Let f := primrec(g, h). Show f is URM-computable.
Note that the defining equations for f are as follows (let ~n := n0, . . . , nn−1):

• f(~n, 0) ' g(~n),

• f(~n, k + 1) ' h(~n, k, f(~n, k)).

So in order to compute f(~n, l) for l > 0 we have to make the following computations:

• Compute f(~n, 0) as g(~n).

• Compute f(~n, 1) as h(~n, 0, f(~n, 0)), using the previous result.

• Compute f(~n, 2) as h(~n, 1, f(~n, 1)), using the previous result.

• · · ·

• Compute f(~n, l) as h(~n, l − 1, f(~n, l − 1)), using the previous result.

A plan for the program is as follows:

• Let ~x := x0, . . . , xn−1. Let y, z, u be new registers.

• We will compute f(~x, y) for y = 0, 1, 2, . . . , xn, and store the result in z.

– Initially we start with y = 0 (which is the case since all registers except the xi for i = 0, . . . , n initially contain 0), and need to compute z ' f(~x, 0). This is achieved by computing z ' g(~x). Then y = 0 and z ' f(~x, y).

– In the step from y to y + 1 we assume that when entering one iteration of the loop we initially have z ' f(~x, y). We want to achieve that after increasing y by 1 we still have z ' f(~x, y). Then the loop-invariant z ' f(~x, y) is preserved. This is obtained as follows:

∗ We first compute u ' h(~x, y, z) (' h(~x, y, f(~x, y)) ' f(~x, y + 1)).


∗ Then execute z := u (' f(~x, y + 1)).

∗ Finally execute y := y + 1.

∗ Then we have z ' f(~x, y) for the new value of y.

– This loop is iterated as long as y hasn’t reached xn.

• Once y has reached xn, z contains f(~x, y) ' f(~x, xn).

• Execute x0 := z.

Let therefore

• P be a URM program, which computes z ' g(~x), and avoids y.

• Q be a program, which computes u ' h(~x, y, z).

The following program R computes f (everything following % is a comment):

P                  % Compute z ' g(~x)
while xn ≠ y do {
    Q              % Compute u ' h(~x, y, z)
                   % which will be ' h(~x, y, f(~x, y)) ' f(~x, y + 1)
    z := u;
    y := y + 1; };
x0 := z;

Correctness of this program: When P has terminated, we have y = 0 and z ' g(~x) ' f(~x, y). After each iteration of the while loop, we have y = y′ + 1 and z ' h(~x, y′, z′), where y′, z′ are the previous values of y, z, respectively. This amounts to having z ' f(~x, y). The loop terminates when y has reached xn, and then z contains f(~x, y) ' f(~x, xn), which is then stored in x0.
If P loops for ever, or in one of the iterations Q loops for ever, then R loops as well, so R(n+1)(~x, xn) ↑. But then we also have f(~x, k) ↑ for some k ≤ xn, and therefore f(~x, l) is undefined for all l > k (f(~x, k + 1) ' h(~x, k, f(~x, k)) ↑, f(~x, k + 2) ' h(~x, k + 1, f(~x, k + 1)) ↑, etc.). In particular, f(~x, xn) ↑. Therefore f(~x, xn) ' R(n+1)(~x, xn) (since both sides are undefined).

Proof of (d): Assume g : Nn+1 ∼→ N is URM-computable. Show µ(g) is URM-computable as well.
Note that µ(g)(x0, . . . , xn−1) is the least z s.t. g(x0, . . . , xn−1, z) ' 0 (and g(x0, . . . , xn−1, z′) ↓ for all z′ < z). Let ~x := x0, . . . , xn−1 and let y, z be registers different from ~x.
Plan for the program:

• Compute g(~x, 0), g(~x, 1), . . . until we find a k s.t. g(~x, k) ' 0. Then return k.

• This is carried out by executing z ' g(~x, y) and successively increasing y by 1 until we have z = 0.

• Since we haven't introduced a repeat-until-loop, we replace this by a while loop as follows:

– Initially we set z to something not 0. As long as z ≠ 0, compute z ' g(~x, y) and then increase y by 1. If the while-loop terminates, we have y = m + 1 for the minimal m s.t. g(~x, m) ' 0. Decrease y once, and store the result in x0.

Let P be a program which computes z ' g(x0, . . . , xn−1, y). Then the following program R computes µ(g) (note that initially y = 0, z = 0).


z := z + 1;
while z ≠ 0 do {
    P
    y := y + 1; };
y := y −· 1;
x0 := y;

By setting z initially to 1, we guarantee that the while-loop is executed at least once. Further, initially y = 0. After each iteration of the while loop, we have y = y′ + 1 and z ' g(x0, . . . , xn−1, y′), where y′ is the value of y before starting this iteration. If the loop terminates, then we therefore have z ' 0 and y = y′ + 1 for the first value y′ such that g(x0, . . . , xn−1, y′) ' 0. At the end x0 is set to that value. If P ever runs into a loop, then for some k we have g(~x, k) ↑ and g(~x, l) 6' 0 for all l < k, and therefore µ(g)(~x) ↑, and R(n)(~x) ↑. If P always terminates, but the while-loop doesn't terminate, then there is no k s.t. g(~x, k) ' 0, so again µ(g)(~x) ↑ and R(n)(~x) ↑.

3.4 The Undecidability of the Halting Problem

The undecidability of the Halting Problem was first proved in 1936 by Alan Turing in his paper "On computable numbers, with an application to the Entscheidungsproblem". We recast his proof using URMs instead of Turing machines.

In the following, "computable" means "URM-computable". This will be justified later, when we will argue that URM-computability coincides with the intuitive notion of computability (the Church-Turing thesis).

Definition 3.4 (a) A problem is an n-ary predicate M(x0, . . . , xn−1) of natural numbers, i.e. aproperty of n-tuples of natural numbers.

(b) A problem M is decidable, if the characteristic function of M defined by

χM (x0, . . . , xn−1) := 1 if M(x0, . . . , xn−1) holds, and 0 otherwise

is computable.

Example: the binary predicate

Multiple(x, y) :⇔ x is a multiple of y

is a problem. χM (x0, . . . , xn−1) decides whether M(x0, . . . , xn−1) holds (then it returns 1 for yes) or not (then it returns 0 for no). For instance,

χMultiple(x, y) = 1 if x is a multiple of y, and χMultiple(x, y) = 0 if x is not a multiple of y.

This function is intuitively computable (and one can in fact show that it is URM-computable), therefore Multiple is decidable.
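As a small illustration (our own sketch, not part of the notes; the treatment of y = 0 is a convention we choose), the characteristic function of Multiple can be programmed directly:

def chi_multiple(x, y):
    # 1 if x is a multiple of y, 0 otherwise
    if y == 0:
        return 1 if x == 0 else 0    # only 0 is a multiple of 0
    return 1 if x % y == 0 else 0

print(chi_multiple(12, 4), chi_multiple(12, 5))   # prints 1 0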

URM programs can be written as a string of ASCII symbols, i.e. as an element of A∗, where A is the set of ASCII symbols, and such a string can be encoded as a natural number. For a URM program P, let code(P) be the number encoding P. It is intuitively decidable whether a string of ASCII symbols is a URM program, and therefore it is also intuitively decidable whether n = code(P) for some URM program P. With some effort one can show that this property can be decided by a URM program. However, the following problem is undecidable:


Definition 3.5 The Halting Problem is the following binary predicate:

Halt(x, y) :⇔ x = code(P ) for a URM program P and P (1)(y) ↓

Example: Let P be the URM program

ifzero(0, 0)

If the input is > 0, the program terminates immediately, and R0 remains unchanged, so P (1)(k) ' k for k > 0. If the input is = 0, the program loops for ever. Therefore, P (1)(0) ↑.
Let e = code(P ). Then Halt(e, n) holds for n > 0 and does not hold for n = 0.

Remark: We will see below that Halt is undecidable. However, the following partial function is computable:

WeakHalt(x, y) :' 1 if x = code(P ) for a URM program P and P (1)(y) ↓, and WeakHalt(x, y) is undefined otherwise.

A program for computing WeakHalt(x, y) can be defined as follows: One first checks whether x encodes a valid URM program. If this is not the case, the program enters an infinite loop. Otherwise, one simulates the corresponding URM program P with input y. This can be done – it is a programming exercise to write a program which simulates a URM, and therefore this simulation is intuitively computable. It is not too complicated to show that there exists in fact a URM program for carrying out the simulation. If the simulation stops, we output 1; otherwise the program loops for ever.
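Using a simulator such as run_urm from the sketch in Section 3.1, this informal description can be turned into a sketch of a program for WeakHalt. Here decode is a hypothetical helper (not defined in these notes) which recovers the URM program coded by x, or returns None if x codes no program:

def weak_halt(x, y):
    prog = decode(x)          # hypothetical decoding function
    if prog is None:
        while True:           # deliberately diverge: WeakHalt(x, y) is undefined
            pass
    run_urm(prog, [y])        # diverges if and only if the coded program diverges on y
    return 1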

Theorem 3.6 The halting problem is undecidable.

Proof: Assume there exists a URM P s.t. P (2) decides the Halting Problem. Therefore we have

P (2)(x, y) ' 1 if x codes a URM program Q s.t. Q(1)(y) ↓, and P (2)(x, y) ' 0 otherwise.

We now argue similarly to the proof that N 6' P(N): We define a computable function f : N ∼→ N which, for every e, differs from the unary function computed by the URM with code e – the two functions will differ on input e:

• If P (2)(e, e) ' 1, i.e. e encodes a URM Q and Q(1)(e) ↓, then we let f(e) ↑. Therefore, Q(1) ≠ f.

• If P (2)(e, e) ' 0, i.e. e doesn't encode a URM, or it encodes a URM Q and Q(1)(e) ↑, then we let f(e) ↓ and define f(e) :' 0. (We could have defined f(e) ' n for any other natural number n; it only matters that f(e) is defined.) Therefore, if e encodes a URM Q, we have Q(1) ≠ f.

The complete definition is therefore as follows:

f(e) ' 0 if P (2)(e, e) ' 0, and f(e) is undefined otherwise.

f is not URM-computable: for every URM R, f = R(1) is violated, since f(code(R)) 6' R(1)(code(R)). Indeed, assume f were computed by a URM R, i.e. f = R(1). Then:

R(1)(code(R)) ↓  ⇔  P (2)(code(R), code(R)) ' 1   (property of P)
                 ⇔  f(code(R)) ↑                  (definition of f)
                 ⇔  R(1)(code(R)) ↑               (f = R(1)),


a contradiction. However, f can intuitively be computed (the program makes use of the program P), and it can easily be shown that it is URM-computable. So we get a contradiction, and the assumption that P (2) decides the Halting Problem is wrong. Hence, the Halting Problem is undecidable.

Remark: The above proof can easily be adapted to any reasonable programming language in which one can define all computable functions. Such programming languages are called Turing-complete languages. For instance, Babbage's machine was, if one removes the restriction to finite memory, Turing-complete, since it had a conditional jump.
Applied to standard programming languages, which are Turing-complete, the unsolvability of the halting problem means: it is not possible to write a program which checks whether a given program terminates on a given input.
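The diagonal argument can be replayed in any Turing-complete language. The following Python sketch assumes a hypothetical function halts(source, arg) which decides whether the program text source, run on input arg, terminates; by the theorem no such function can exist, and the sketch shows where the contradiction would arise:

def diagonal(source):
    # halts is the hypothetical decider of the halting problem
    if halts(source, source):
        while True:          # ... then deliberately diverge
            pass
    return 0                 # ... otherwise terminate

# Applied to its own source code, diagonal would terminate if and only if
# it does not terminate; hence halts cannot exist.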


4 Turing Machines

4.1 Motivation

The URM is a model of computation which is very easy to understand. The model of URMs has, however, two drawbacks:

(1) The execution of a single URM instruction, e.g. succ(n), might take arbitrarily long:

For instance, if Rn contains the binary number 11···1 consisting of k ones, we have to replace it by the binary number 10···0 with k zeros, i.e. we have to replace k ones by zeros. Since k is arbitrary, this single step might take an arbitrarily long time.

Therefore URMs are unsuitable as a basis for defining the complexity of algorithms. Of course, there exist meta-theorems which relate the complexity of URM programs to that of actual programs. But for defining complexity, a different notion is needed, and the Turing machine is currently the most widely accepted model of computation used for this purpose.

(2) We are aiming at a notion of computability which covers all possible ways of computing something, independently of any concrete machine. URMs are a model of computation which covers computers as they are currently used. However, there might be completely different notions of computability, based on symbolic manipulation of sequences of characters, where it might be more complicated to see directly that all such computations can be simulated by a URM. It is easier to see that such notions are covered by the Turing machine model of computation.

We will therefore introduce a second model of computation, the Turing machine (TM). We will later show that the sets of Turing-computable and of URM-computable functions coincide.
The definition of computability proposed by Turing in 1936 is based on an analysis of how a human being (called the agent) carries out a computation on a piece of paper.

[Illustration: the multiplication 15 · 16 = 240 carried out on paper, with the intermediate rows 15 and 90.]

In order to formulate this, we will make the following steps:

• An algorithm should be deterministic, and therefore we assume that the agent uses only finitely many symbols, which he puts at discrete positions on the paper.

[Illustration: the same multiplication written symbol by symbol in a grid of cells: 1 5 . 1 6 =, then 1 5, then 9 0, a separating line, and 2 4 0.]

Sideremark: If one has doubts whether this is always possible (we might squeeze a small symbol between two existing ones, which might violate having a grid of symbols), one can


go to the level of pixels: We assume that the agent's eye can only distinguish pixels of a certain size. If we represent what the agent has written on a piece of paper as pixels, we obtain a grid. Each cell in this grid is a pixel, which contains one colour. Since the agent can only distinguish finitely many colours, each of which can be represented by one symbol, we obtain that the original piece of paper can be represented as a grid, each cell of which contains one of finitely many symbols. For instance, if one writes A1 on the paper, where A is black (written as B) and 1 is red (written as R), we obtain a grid of cells marked B or R, tracing the shapes of the two characters.

• We don't have to work with a two-dimensional piece of paper: one can always assume that the whole paper is written on one long tape (a two-dimensional paper can be simulated by having a symbol which separates the lines of the paper). Each entry on this tape is called a cell.

· · · 1 5 . 1 6 = CR 1 5 CR · · ·

• In the real situation, an agent can look at several cells at the same time. But the number of cells he can look at simultaneously is physically bounded. Looking at finitely many cells simultaneously can be simulated by looking at one symbol at a time and moving around, in order to observe the cells in the neighbourhood. Therefore, we assume that the agent looks only at one symbol at a time. This is modelled by having a head of the Turing machine, which indicates the position on the tape we are currently looking at.

· · · 1 5 . 1 6 = CR 1 5 CR · · ·↑

Head

• In reality the agent can make larger jumps between positions on the tape, but the distance can only be finite, bounded by a fixed length corresponding to the physical ability of the agent to make one step. Such steps can be simulated by finitely many one-step movements. Therefore, we assume that in one step the agent can move only one symbol to the left or to the right.

• The agent works purely mechanically: he reads the current symbol and, depending on it, makes a movement or changes the symbol on the tape. The agent himself has only a finite amount of memory – data exceeding this has to be stored on the tape. Therefore there are only finitely many states of the agent, and depending on this state and the symbol at the head of the tape, the agent will move to a next state, change the symbol, and make a movement of the head.

A final diagram of a Turing machine looks as follows:

· · · 1 5 . 1 6 = CR 1 5 CR · · ·↑s0


4.2 Definition of a Turing Machine

Using these considerations we arrive at the following definition of a Turing machine:

A Turing machine is a quintuple (Σ, S, I, xy, s0), where

• Σ is a finite set of symbols, called the alphabet of the Turing machine. The symbols in Σ will be written on the tape.

• S is a finite set of states.

• I is a finite set of quintuples (s, a, s′, a′, D), where s, s′ ∈ S, a, a′ ∈ Σ, D ∈ {L, R}, s.t. for every s ∈ S, a ∈ Σ there is at most one (s′, a′, D) s.t. (s, a, s′, a′, D) ∈ I. The elements of I are called instructions.

• xy ∈ Σ (a symbol for blank).

• s0 ∈ S (the initial state).

If (s, a, s′, a′, D) ∈ I, this instruction means the following:

• If the Turing machine is in state s, and the symbol at position of the head is a, then

– the state is changed to s′,

– the symbol at this position is changed to a′,

– if D = L the head moves left,

– if D = R the head moves right.

For instance, assume we have the following instructions:

(s0, 1, s1, 0, R)
(s1, 6, s2, 7, L)

Then we have the following sequence of configurations:

• Initially:

· · · 1 5 . 1 6 = CR 1 5 CR · · ·↑s0

• The symbol is 1, the state is s0, so the instruction (s0, 1, s1, 0, R) expresses: replace this symbol by 0, change to state s1, move once to the right:

· · · 1 5 . 0 6 = CR 1 5 CR · · ·↑s1

• The symbol is 6, the state is s1, so the instruction (s1, 6, s2, 7, L) expresses: replace this symbol by 7, change to state s2, move once to the left:

· · · 1 5 . 0 7 = CR 1 5 CR · · ·↑s2

Example: We develop a Turing machine over Σ = {0, 1, xy} (where xy stands for a blank entry) which does the following: Assume the tape initially contains a binary number, to the left and right of it there are only blanks, the head is pointing to some digit of this number, and the Turing machine is in state s0, e.g.

· · · 1 0 1 0 0 1 0 0 1 1 1 · · ·↑s0


Then the Turing machine will stop with the tape containing the original binary number incremented by 1, the head at its most significant bit, and all other cells blank:

· · · 1 0 1 0 0 1 0 1 0 0 0 · · ·
↑s3

So the Turing machine is ({0, 1, xy}, S, I, xy, s0), where we develop in the following the set of states S and the set of instructions I:

• First, we will move the head to the least significant bit of the binary number. So, as long as the head shows a digit 0 or 1, we move the head right. If we encounter the symbol xy, then we move once to the left and switch to state s1.

So we have the states s0, s1 and the following instructions:

– (s0, 0, s0, 0, R)

– (s0, 1, s0, 1, R)

– (s0, xy, s1, xy, L).

• In the above example, at the end of this step the state of the TM is as follows:

· · · 1 0 1 0 0 1 0 0 1 1 1 · · ·↑s1

• Increasing a binary number b is done as follows: We have two cases.

– If the number consists of ones only, i.e. b = (11···1)2 with k ones, then b + 1 = (10···0)2 with k zeros. So b + 1 is obtained by replacing all ones by zeros, and then replacing the first blank symbol by 1.

– Otherwise the binary representation of the number contains a 0, followed by a finite sequence of ones reaching to the least significant bit. This includes the case that the least significant bit is 0, in which case the finite sequence of ones has length 0:

∗ Example 1: b = (0100010111)2, one 0 followed by 3 ones.

∗ Example 2: b = (0100010010)2, least significant digit is 0.

Let b = (b0b1 · · · bk 0 1···1)2 with l ones at the end. (In the special case l = 0 we have b = (b0b1 · · · bk0)2.) Then b + 1 is obtained by replacing the final block of ones by zeros and the 0 by 1: b + 1 = (b0b1 · · · bk 1 0···0)2 with l zeros at the end. (In the special case l = 0 we have b + 1 = (b0b1 · · · bk1)2.)

In general, starting from the right we replace ones by zeros as long as we find ones, moving left, until we encounter a 0 or a xy, which is then replaced by a 1.

So we have the new state s2 and the following instructions:

– (s1, 1, s1, 0, L).

– (s1, 0, s2, 1, L).

– (s1, xy, s2, 1, L).


At the end the head will be one field to the left of the 1 written, and the state will be s2. In the example above, this situation is as follows:

· · · 1 0 1 0 0 1 0 1 0 0 0 · · ·↑s2

• Finally, we have to move the head back to the most significant bit, which is done as follows:

– (s2, 0, s2, 0, L).

– (s2, 1, s2, 1, L).

– (s2, xy, s3, xy, R).

The program will terminate in state s3, in the example above this is as follows:

· · · 1 0 1 0 0 1 0 1 0 0 0 · · ·↑s3

So the complete Turing machine is as follows:

({0, 1, xy},
 {s0, s1, s2, s3},
 {(s0, 0, s0, 0, R),
  (s0, 1, s0, 1, R),
  (s0, xy, s1, xy, L),
  (s1, 1, s1, 0, L),
  (s1, 0, s2, 1, L),
  (s1, xy, s2, 1, L),
  (s2, 0, s2, 0, L),
  (s2, 1, s2, 1, L),
  (s2, xy, s3, xy, R)},
 xy,
 s0)
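A Turing machine of this kind is straightforward to simulate. The following Python sketch is our own illustration (the dictionary representation of the tape and the use of "#" for the blank symbol xy are assumptions, not part of these notes); it runs the machine just constructed on the binary number 10011, i.e. 19, and produces 10100, i.e. 20:

def run_tm(instructions, tape, head, state, blank="#"):
    # instructions: dict mapping (state, symbol) to (new state, new symbol, direction)
    # tape: dict mapping cell positions to symbols; missing cells are blank
    while (state, tape.get(head, blank)) in instructions:
        new_state, new_symbol, direction = instructions[(state, tape.get(head, blank))]
        tape[head] = new_symbol
        head += 1 if direction == "R" else -1
        state = new_state
    return tape, head, state

succ_tm = {
    ("s0", "0"): ("s0", "0", "R"), ("s0", "1"): ("s0", "1", "R"),
    ("s0", "#"): ("s1", "#", "L"), ("s1", "1"): ("s1", "0", "L"),
    ("s1", "0"): ("s2", "1", "L"), ("s1", "#"): ("s2", "1", "L"),
    ("s2", "0"): ("s2", "0", "L"), ("s2", "1"): ("s2", "1", "L"),
    ("s2", "#"): ("s3", "#", "R"),
}

tape = {i: b for i, b in enumerate("10011")}      # the number 19 in binary
tape, head, state = run_tm(succ_tm, tape, head=0, state="s0")
print("".join(tape.get(i, "#") for i in range(-1, 6)), head, state)
# prints: #10100# 0 s3   (the tape holds 10100, i.e. 20, head on its leading 1)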

4.3 Turing-Computable Functions

Definition 4.1 Let M = (Σ, S, I, xy, s0) be a Turing machine with {0, 1} ⊆ Σ. Then we define for every k ∈ N a function M (k) : Nk ∼→ N, where M (k)(a0, . . . , ak−1) is computed as follows:

• Initialisation:

– The head is at an arbitrary position.

– Starting from this position, we write bin(a0)xybin(a1)xy · · · xybin(ak−1) on the tape, where bin(a) is the binary string corresponding to a.

∗ E.g. if k = 3, a0 = 0, a1 = 3, a2 = 2 then we write 0xy11xy10.

– All other cells contain xy.

– The state is set to s0.

• Iteration: As long as there is an instruction, corresponding to the state of the TM and thesymbol at the head, the TM performs the operation according to this instruction.

• Output:


– Case 1: The TM stops. At any time, only finitely many cells contain a non-blank symbol: initially this is the case, and in finitely many steps, only finitely many cell contents are changed.
Therefore there cannot be an infinite sequence of symbols in {0, 1} starting from the head position to the right; let the mth cell be the first one not containing 0 or 1. So the tape contains, starting from the head position, symbols b0b1 · · · bm−1c, where bi ∈ {0, 1} and c ∉ {0, 1}. Here m might be 0 (if the symbol at the position of the head itself is neither 0 nor 1).
Let a = (b0 · · · bm−1)2 (in case m = 0, a = 0). Then M (k)(a0, . . . , ak−1) ' a.

– Case 2: Otherwise. Then M (k)(a0, . . . , ak−1) ↑.

Example: Let Σ = {0, 1, a, b, xy}.

• If the tape at the end contains, starting with the head position, 01010xy0101xy or 01010axy, the output is (01010)2 = 10.

• If the tape at the end contains, starting with the head position, abxy, a, or xy, the output is 0.
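The input and output conventions of Definition 4.1 can be put on top of the run_tm sketch above (again our own illustration, using "#" for xy and s0 as the initial state): write the arguments in binary separated by single blanks, run the machine, and read back the maximal {0, 1}-string starting at the head position:

def tm_compute(instructions, args, blank="#"):
    tape, pos = {}, 0
    for j, a in enumerate(args):
        if j > 0:
            pos += 1                     # one blank cell separates the arguments
        for b in bin(a)[2:]:             # bin(a) gives the binary digits of a
            tape[pos] = b
            pos += 1
    tape, head, _ = run_tm(instructions, tape, head=0, state="s0", blank=blank)
    out = ""
    while tape.get(head, blank) in ("0", "1"):
        out += tape[head]
        head += 1
    return int(out, 2) if out else 0     # the empty string is read as 0

print(tm_compute(succ_tm, [19]))         # prints 20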

Definition 4.2 f : Nk ∼→ N is Turing-computable, in short TM-computable, if f = M (k) for someTM M , the alphabet of which contains {0, 1}.

Example: The above example of a TM treated in detail shows that succ : N ∼→ N is Turing-computable.

Theorem 4.3 If f : Nn ∼→ N is URM-computable, then it is as well Turing-computable by a TMwith alphabet {0, 1, xy}.

Proof: We first introduce the following notation: By saying that the tape of a TM contains a0, . . . , al we mean that the tape contains, starting with the head position, a0, . . . , al, and all other cells of the tape contain xy.
Furthermore, in this proof, bin(n) will stand for any binary representation of n; for instance bin(2) could be any of 10, 010, 0010, etc. This is needed since, when performing operations, we don't want to bother with normalising the representation of numbers on the tape, i.e. we don't want to deal with having to delete possible leading zeros.
Assume f is URM-computable by a URM P, i.e. f = P (n). Assume P refers only to registers R0, . . . , Rl−1 and that the input registers (i.e. R0, . . . , Rn−1) of P are among R0, . . . , Rl−1, i.e. that l ≥ n. We will define a Turing machine M which simulates P. This will be done as follows:

• If the registers R0, . . . , Rl−1 of the URM P contain a0, . . . , al−1, this is modelled in the TMas the tape containing bin(a0)xy · · · xybin(al−1).

• An instruction Ij will be simulated by finitely many states qj,0, . . . , qj,i of the TM withinstructions for those states.

The instructions and states will be defined in such a way that the following holds:

• Assume for the URM P

– R0, . . . , Rl−1 contain a0, . . . , al−1,

– the URM is about to execute Ij .

• Assume that after executing Ij the URM P arrives at a state where


– R0, . . . , Rl−1 contain b0, . . . , bl−1,

– the PC contains j′.

• Then, if the configuration of the TM M is s.t.

– the tape contains bin(a0)xybin(a1)xy · · · xybin(al−1),

– and the state is qj,0,

• then the TM will, after executing the corresponding instructions, arrive at a configuration in which

– the tape contains bin(b0)xybin(b1)xy · · ·xybin(bl−1),

– the state is qj′ ,0.

For instance, if we simulate the instruction I4 = pred(2) then we have the following:

• Assume the URM is about to execute instruction I4 (so the PC initially contains 4) with register contents R0 = 2, R1 = 1, R2 = 3.

• Then the URM will end with the PC containing 5 and register contents R0 = 2, R1 = 1, R2 = 2.

So, when simulating this in the TM we want the following:

• If the TM is in state q4,0, with the tape containing bin(2)xybin(1)xybin(3)

• it should reach state q5,0 with the tape containing bin(2)xybin(1)xybin(2)

Furthermore, we need an initialisation for the TM consisting of states qinit,0, . . . , qinit,j and corresponding instructions, s.t.

• if the TM initially contains the configuration corresponding to arguments b0, . . . , bn−1 of the function to be defined, namely bin(b0)xybin(b1)xy · · · xybin(bn−1),

• it will reach state q0,0 with the tape containing bin(b0)xybin(b1)xy · · · xybin(bn−1)xy0xy0xy · · · xy0xy, where 0 occurs l − n times.

Remark: We assume here that 0 is represented on the tape by the digit 0, enclosed by the blanks separating the binary numbers of the registers. We could as well represent it by the empty string, which means that we have the two enclosing blanks with no symbol in between. Then the initial configuration of the TM would already represent bin(b0)xybin(b1)xy · · · xybin(bn−1)xybin(0)xy · · · xybin(0)xy (with l − n occurrences of bin(0)), since bin(0)xybin(0)xy · · · xybin(0)xy would then just be a sequence of blanks. No initialisation would be needed.

If we have achieved the above simulation steps, then a computation of P (n)(a0, . . . , an−1) is simulated as follows: Assume the run of the URM, starting with Ri containing a0,i = ai for i = 0, . . . , n−1,


and a0,i = 0 for i = n, . . . , l − 1 is as follows:

Instruction    R0      R1      · · ·   Rn−1      Rn      · · ·   Rl−1
I0             a0      a1      · · ·   an−1      0       · · ·   0
Ik0 (= I0)     a0,0    a0,1    · · ·   a0,n−1    a0,n    · · ·   a0,l−1
Ik1            a1,0    a1,1    · · ·   a1,n−1    a1,n    · · ·   a1,l−1
Ik2            a2,0    a2,1    · · ·   a2,n−1    a2,n    · · ·   a2,l−1
· · ·

(Here k0 = 0, so the first two rows coincide: a0,i = ai for i < n and a0,i = 0 for i ≥ n.)

Then the corresponding TM will successively reach the following configurations:

State       Tape contains
qinit,0     bin(a0)xybin(a1)xy · · · xybin(an−1)xy
qk0,0       bin(a0)xybin(a1)xy · · · xybin(an−1)xybin(0)xy · · · xybin(0)xy
(= q0,0)    (= bin(a0,0)xybin(a0,1)xy · · · xybin(a0,l−1)xy)
qk1,0       bin(a1,0)xybin(a1,1)xy · · · xybin(a1,l−1)xy
qk2,0       bin(a2,0)xybin(a2,1)xy · · · xybin(a2,l−1)xy
· · ·

Example: We take the URM program P

I0 = ifzero(0, 3)
I1 = pred(0)
I2 = ifzero(1, 0)

The corresponding unary function P (1) operates as follows: R1 is initially zero and never changed, therefore R1 is always 0 and the jump in I2 is always executed. So a run of the program is as follows: It checks whether R0 contains 0. If yes, it will stop. Otherwise, it will reduce R0 by 1, and then jump back to the beginning. So the program always terminates with R0 containing 0, and we have P (1)(n) ' 0.
A run of this program corresponding to a computation of P (1)(2) is as follows:

Instruction   R0   R1
I0            2    0
I1            2    0
I2            1    0
I0            1    0
I1            1    0
I2            0    0
I0            0    0
I3            0    0
URM stops

We start with instruction I0, R0 containing 2 and R1 containing 0. The jump in I0 is not executed, so we arrive at I1 with register values unchanged. Then R0 is reduced by 1 and we move to I2. Then we jump back to I0 with the register values unchanged. Etc.
The corresponding Turing machine M should simulate this program. A run of M (1)(2) will start in state qinit,0 with initial configuration bin(2)xy. Then in the initialisation part, the TM expands this to bin(2)xybin(0)xy and arrives at state q0,0. Then each of the steps of the URM is simulated in the TM. So, when simulating I0, the TM checks whether the first binary number on the tape is 0 or not. If it is 0, it switches to q3,0, otherwise it switches to q1,0. When simulating I1, it decreases the first binary number on the tape by one. When simulating I2, it checks whether the second binary number on the tape is 0 or not. If it is zero, it switches to q0,0, otherwise to q3,0.


Assuming that we have introduced such a TM, we can write the configurations of the URM and of the TM in one table and obtain the following:

Instruction   R0   R1   State of TM   Content of tape
                        qinit,0       bin(2)xy
I0            2    0    q0,0          bin(2)xybin(0)xy
I1            2    0    q1,0          bin(2)xybin(0)xy
I2            1    0    q2,0          bin(1)xybin(0)xy
I0            1    0    q0,0          bin(1)xybin(0)xy
I1            1    0    q1,0          bin(1)xybin(0)xy
I2            0    0    q2,0          bin(0)xybin(0)xy
I0            0    0    q0,0          bin(0)xybin(0)xy
I3            0    0    q3,0          bin(0)xybin(0)xy
URM stops               TM stops

Once we have introduced such a simulation we can see that P (n) = M (n):

• If P (n)(a0, . . . , an−1) ↓, say P (n)(a0, . . . , an−1) ' j, then P will eventually stop with Ri containing some values bi, where b0 = j. Then the TM M, starting with bin(a0)xy · · · xybin(an−1), will eventually terminate in a configuration bin(b0)xy · · · xybin(bl−1), and therefore M (n)(a0, . . . , an−1) ' b0 = j.

• If P (n)(a0, . . . , an−1) ↑, the URM P will loop, and the TM M will carry out the same steps as the URM and loop as well; therefore M (n)(a0, . . . , an−1) ↑, so again P (n)(a0, . . . , an−1) ' M (n)(a0, . . . , an−1).

Therefore, we have P (n) = M (n) and we are done, provided we have defined the simulation. We describe informally how the instructions of the URM are simulated and how the initialisation is obtained.

• Initialisation. We start with the tape containing bin(a0)xy · · · xybin(an−1), and have to extend this to bin(a0)xy · · · xybin(an−1)xybin(0)xy · · · xybin(0) with l − n occurrences of bin(0).
This is achieved by moving the head to the end of the initial tape contents (by moving to the nth blank to the right), then inserting, starting from the next blank, l − n times 0xy, and then moving back to the beginning (by moving to the lth blank to the left).

• Simulation of URM instructions.

– Simulation of instruction Ik = succ(j). We have to increase the (j + 1)st binary number on the tape by 1 (note that register number 0 is the first number on the tape, register number 1 is the second number on the tape, etc.).

  Initially, the configuration is:

  bin(c0) xy bin(c1) xy · · · xy bin(cj) xy · · · xy bin(cl) xy
  (head at the leftmost symbol, state qk,0)

  ∗ We first move to the (j + 1)st blank to the right. Then we are at the end of the (j + 1)st binary number:

  bin(c0) xy bin(c1) xy · · · xy bin(cj) xy · · · xy bin(cl) xy
  (head on the blank directly after bin(cj))


∗ Now perform the operation for increasing by 1 as above. At the end we obtain:

  bin(c0) xy bin(c1) xy · · · xy bin(cj + 1) xy · · · xy bin(cl) xy
  (head at the leftmost digit of bin(cj + 1))

  However, it might be that we needed to write a 1 over the separating blank, in which case we have:

  bin(c0) xy bin(c1) xy · · · bin(cj−1) bin(cj + 1) xy · · · xy bin(cl) xy
  (head at the leftmost digit of bin(cj + 1), no blank between bin(cj−1) and bin(cj + 1))

∗ If we are in the latter case, then we have to shift all symbols to the left once to the left, in order to obtain a separating xy between the lth and (l − 1)st entry. This can be achieved easily (until one has reached the (l − 1)st blank), and we obtain

  bin(c0) xy bin(c1) xy · · · bin(cj−1) xy bin(cj + 1) xy · · · xy bin(cl) xy

∗ Otherwise, we move the head to the left until we reach the (j + 1)st blank to the left, and then move it once to the right. We obtain

  bin(c0) xy bin(c1) xy · · · xy bin(cj + 1) xy · · · xy bin(cl) xy
  (head at the leftmost digit of bin(cj + 1))

– Simulation of instruction Ik = pred(j). We have to decrease the (j + 1)st binary number on the tape by 1.

∗ Assume the configuration at the beginning is:

  bin(c0) xy bin(c1) xy · · · bin(cj) xy · · · xy bin(cl) xy
  (head at the leftmost symbol, state qk,0)

  We want to decrease the (j + 1)st number by 1 (or leave it as it is, if it is zero). So we want to achieve the following configuration:

  bin(c0) xy bin(c1) xy · · · bin(cj −· 1) xy · · · xy bin(cl) xy

  This is done as follows:

∗ We move as before to the end of the (j + 1)st number.
∗ Then we check whether the number consists only of zeros or not.
  · If it consists only of zeros, i.e. it represents 0, then pred(j) doesn't change anything.
  · Otherwise, the number is of the form b0 · · · bk 1 0 · · · 0 with l′ trailing zeros. We have

    (b0 · · · bk 1 0 · · · 0)2 − 1 = (b0 · · · bk 0 1 · · · 1)2    (with l′ trailing digits on both sides).

    So we have to replace the binary string by b0 · · · bk 0 1 · · · 1 (with l′ ones at the end). This can be done similarly as for the successor operation.
∗ Finally we move back to the beginning.

– Simulation of instruction Ik = ifzero(j, k′).

  The URM instruction does the following: if Rj contains zero, then the next instruction is Ik′ (or the program stops if there is no such instruction). Otherwise the next instruction is Ik+1 (or the program stops if there is no such instruction).

  This is simulated as follows on the TM: it moves to the (j + 1)st binary number on the tape and checks whether it consists only of zeros. If yes, we switch to state qk′,0, otherwise we switch to state qk+1,0.


This completes the simulation of the URM P .

Remark: We will later show the other direction, namely that every TM-computable function is URM-computable. Therefore both models of computation define the same functions. This will be done by showing first that every TM-computable function is partial recursive (where the partial recursive functions are a third model of computation), and then that every partial recursive function is URM-computable. One could also show directly that every TM-computable function is URM-computable, but the route taken in this lecture is probably easier.

Extension to arbitrary alphabets. Let A be a finite alphabet s.t. xy ∉ A, and B := A∗. To a Turing machine T = (Σ, S, I, xy, s0) with A ⊆ Σ corresponds a partial function T^(A,n) : B^n ∼→ B, where T^(A,n)(a0, . . . , an−1) is computed as follows:

• Initially write a0xy · · · xyan−1 on the tape, otherwise blanks. Start in state s0 on the leftmost position of a0.

• Iterate the TM as before.

• In case of termination, the output of the function is c0 · · · cl−1, if the tape contains, starting at the head position, c0 · · · cl−1 d with ci ∈ A, d ∉ A.

• Otherwise, the function value is undefined.

This notion is, modulo encoding of A∗ into N, equivalent to the notion of Turing-computability on N. However, when considering complexity bounds, it might be more appropriate, since the encoding and decoding of natural numbers might exceed the intended complexity bounds.

Remark on Turing-computable predicates: A predicate A is Turing-decidable iff χA is Turing-computable. However, instead of simulating χA, which amounts to finally writing the output of χA (a binary number 0 or 1) on the tape, it is more convenient to take TMs with two additional special states strue and sfalse corresponding to truth and falsity of the predicate. Then we can say that a predicate is Turing-decidable if, when we write the inputs on the tape as before and start executing the TM, it always terminates in strue or sfalse, and it terminates in strue iff the predicate holds for the inputs, and in sfalse otherwise. This notion, which is equivalent to the first notion, is usually taken as the basis for complexity considerations.

5 Algebraic View of Computability

The previous models of computation (URMs and TMs) were based on programming languages, which allow one to introduce all computable functions. In this section we discuss a model of computation in which the set of computable functions is generated from basic functions by using certain operations. Thus we describe the set of computable functions as the least algebra of functions containing the basic functions and closed under these operations. This algebraic approach was proposed by Gödel and Kleene in 1936, and is an elegant and mathematically rigorous definition of the set of computable functions. In order to show that a function is computable, it is usually easier to show that it is computable in this model.
We will first introduce the primitive-recursive functions. This is a set of total functions, which includes all functions which can be realistically computed, and many more, but does not cover all computable functions. Then we will introduce the partial recursive functions, which form a complete model of computation. Finally we will show that the sets of URM-computable functions, of TM-computable functions and of partial recursive functions coincide.

5.1 The Primitive-Recursive Functions

Definition 5.1 We define inductively the set of primitive-recursive functions f together with their arity, i.e. together with the k s.t. f : N^k → N.


We write "f : N^k → N is primitive-recursive" for "f is primitive-recursive with arity k", and N for N^1.

• The following basic functions are primitive-recursive:

  – zero : N → N,
  – succ : N → N,
  – proj^k_i : N^k → N (0 ≤ i < k).

  (Remember that these functions have defining equations

  – zero(n) = 0,
  – succ(n) = n + 1,
  – proj^k_i(a0, . . . , ak−1) = ai.)

• If g : N^k → N is primitive-recursive, and for i = 0, . . . , k − 1 we have that hi : N^n → N is primitive-recursive, then g ◦ (h0, . . . , hk−1) : N^n → N is primitive-recursive as well.

  (Remember that f := g ◦ (h0, . . . , hk−1) has defining equation

  – f(~x) = g(h0(~x), . . . , hk−1(~x)).)

  Some notation: in the special case k = 1 we write g ◦ h instead of g ◦ (h), and we write g0 ◦ g1 ◦ g2 ◦ · · · ◦ gn instead of g0 ◦ (g1 ◦ (g2 ◦ · · · ◦ gn)).

• If g : N^n → N and h : N^(n+2) → N are primitive-recursive, then primrec(g, h) : N^(n+1) → N is primitive-recursive as well.

  (Remember that f := primrec(g, h) has defining equations

  – f(~x, 0) = g(~x),
  – f(~x, n + 1) = h(~x, n, f(~x, n)).)

• If k ∈ N and h : N^2 → N is primitive-recursive, then primrec(k, h) : N → N is primitive-recursive as well.

  (Remember that f := primrec(k, h) has defining equations

  – f(0) = k,
  – f(n + 1) = h(n, f(n)).)

Remark: That we defined this set inductively means that the set of primitive-recursive functions is the least set closed under the above mentioned operations, or that it is the set generated by the above operations: the primitive-recursive functions are exactly those for which we can introduce a term formed from zero, succ, proj^n_i, ◦ ( , . . . , ) (i.e. if f, gi are terms, so is f ◦ (g0, . . . , gn−1)) and primrec, provided we respect the arities of the functions as above.
Examples (a small code sketch of these two terms follows below):

• primrec(proj^1_0, succ ◦ proj^3_2) : N^2 → N is primitive-recursive. Here proj^1_0 : N → N and succ ◦ proj^3_2 : N^3 → N (since proj^3_2 : N^3 → N), so the arities fit.

  (We will see below that this is the function add : N^2 → N, add(x, y) := x + y.)

• primrec(0, proj^2_0) : N → N is primitive-recursive, where 0 ∈ N and proj^2_0 : N^2 → N.

  (We will see below that this is the function pred : N → N, pred(x) := x −· 1.)
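The following is a minimal sketch (Python; not part of the notes, all names ad hoc) of the basic functions and the two operators as higher-order functions, so that the two example terms above can be written down and evaluated literally:

    # Basic functions and operators of Definition 5.1 as Python higher-order functions.

    def zero(n): return 0
    def succ(n): return n + 1
    def proj(k, i): return lambda *xs: xs[i]      # proj^k_i; assumes len(xs) == k

    def compose(g, *hs):                          # g o (h_0, ..., h_{k-1})
        return lambda *xs: g(*(h(*xs) for h in hs))

    def primrec(g, h):                            # f(xs,0) = g(xs), f(xs,n+1) = h(xs,n,f(xs,n))
        def f(*args):
            *xs, n = args
            res = g(*xs) if callable(g) else g    # allows the constant variant primrec(k, h)
            for m in range(n):
                res = h(*xs, m, res)
            return res
        return f

    add = primrec(proj(1, 0), compose(succ, proj(3, 2)))   # first example term above
    pred = primrec(0, proj(2, 0))                          # second example term above
    print(add(3, 4), pred(7), pred(0))                     # prints: 7 6 0

Here primrec(k, h) and primrec(g, h) are merged into one helper by allowing a number as first argument; that is only a convenience of the sketch, not part of Definition 5.1.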


Definition 5.2 A relation R ⊆ N^n is a primitive-recursive relation if the characteristic function χ_R : N^n → N is primitive-recursive.

Examples:

• The identity function id : N → N, id(n) = n, is primitive-recursive, since id = proj^1_0: we have proj^1_0 : N → N and proj^1_0(n) = n = id(n).

• The constant function constn : N → N, constn(k) = n, is primitive-recursive, since constn = succ ◦ · · · ◦ succ ◦ zero with n occurrences of succ:

  (succ ◦ · · · ◦ succ ◦ zero)(k) = succ(succ(· · · succ(zero(k)) · · ·))
                                 = succ(succ(· · · succ(0) · · ·))
                                 = 0 + 1 + 1 + · · · + 1    (n ones)
                                 = n
                                 = constn(k) .

• Addition is primitive-recursive. We have seen previously that add : N^2 → N, add(x, y) = x + y, follows the rules

  add(x, 0)     = x + 0 = x = g(x) ,
  add(x, y + 1) = x + (y + 1) = (x + y) + 1 = add(x, y) + 1 = h(x, y, add(x, y)) ,

  where

  g : N → N, g(x) = x, therefore g = id = proj^1_0 is primitive-recursive,

  and

  h : N^3 → N, h(x, y, z) := z + 1 .

  We have h = succ ◦ proj^3_2, and therefore h is primitive-recursive:

  (succ ◦ proj^3_2)(x, y, z) = succ(proj^3_2(x, y, z)) = succ(z) = z + 1 = h(x, y, z) .

  Therefore add = primrec(proj^1_0, succ ◦ proj^3_2).


• Multiplication is primitive-recursive. We have seen previously that mult : N^2 → N, mult(x, y) = x · y, follows the rules

  mult(x, 0)     = x · 0 = 0 = g(x) ,
  mult(x, y + 1) = x · (y + 1) = x · y + x = mult(x, y) + x = add(mult(x, y), x) = h(x, y, mult(x, y)) ,

  where

  g : N → N, g(x) = 0, and therefore g = zero is primitive-recursive,

  and

  h : N^3 → N, h(x, y, z) = add(z, x) .

  We have h = add ◦ (proj^3_2, proj^3_0), and therefore h is primitive-recursive:

  (add ◦ (proj^3_2, proj^3_0))(x, y, z) = add(proj^3_2(x, y, z), proj^3_0(x, y, z)) = add(z, x) = h(x, y, z) .

  Therefore mult = primrec(zero, add ◦ (proj^3_2, proj^3_0)) is primitive-recursive (see the sketch below).
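Assuming the hypothetical primrec, compose, proj and zero combinators from the sketch in Section 5.1, and add as defined there, this derivation of mult can be transcribed and checked directly:

    # mult = primrec(zero, add o (proj^3_2, proj^3_0)), exactly the term derived above.
    mult = primrec(zero, compose(add, proj(3, 2), proj(3, 0)))

    assert mult(6, 7) == 42 and mult(5, 0) == 0 and mult(0, 9) == 0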

Remark:

• Unless a direct reduction to the principle of primitive recursion is demanded, in order to show that a function f : N^(k+1) → N defined by using primrec is primitive-recursive, it suffices to show informally that

  – f(~x, 0) can be defined by an expression built from previously defined primitive-recursive functions, parameters ~x, and constants.

    Example: f(x0, x1, 0) = (x0 + x1) · 3 .

  – f(~x, y + 1) can be defined by an expression built from previously defined primitive-recursive functions, parameters ~x, the recursion argument y, the recursion hypothesis f(~x, y), and constants.

    Example: f(x0, x1, y + 1) = (x0 + x1 + y + f(x0, x1, y)) · 3 .

• Similarly, if one wants to verify that a function is primitive-recursive by giving a direct definition of it in terms of other primitive-recursive functions (i.e. by using previously defined primitive-recursive functions and composition), it suffices to show that f(~x) can be defined by an expression built from previously defined primitive-recursive functions, parameters ~x and constants.

  Example: f(x, y, z) = (x + y) · 3 + z .

All the following proofs will be given in this style and are therefore examples of this principle.

We continue with the introduction of primitive-recursive functions:


• The predecessor function pred is primitive-recursive, because of the equations

  pred(0) = 0 ,
  pred(x + 1) = x ,

  where the induction step doesn't refer to the recursion hypothesis, but only to the recursion argument.

• The function f(x, y) = x −· y is primitive-recursive, because of the equations

  f(x, 0) = x ,
  f(x, y + 1) = pred(f(x, y)) .

• The signum function sig : N → N,

  sig(x) := 1, if x > 0;    sig(x) := 0, if x = 0;

  is primitive-recursive, since sig(x) = x −· (x −· 1):

  – For x = 0 we have x −· (x −· 1) = 0 −· (0 −· 1) = 0 −· 0 = 0 = sig(x) .
  – For x > 0 we have x −· (x −· 1) = x − (x − 1) = x − x + 1 = 1 = sig(x) .

  Alternatively, one can show that sig is primitive-recursive by using the principle of primitive recursion and the equations

  sig(0) = 0 ,
  sig(x + 1) = 1 .

• The relation "x < y", i.e. A(x, y) :⇔ x < y, is primitive-recursive, since χ_A(x, y) = sig(y −· x):

  – If x < y, then y −· x = y − x > 0, therefore sig(y −· x) = 1 = χ_A(x, y) .
  – If ¬(x < y), i.e. x ≥ y, then y −· x = 0, therefore sig(y −· x) = 0 = χ_A(x, y) .

• Consider the sequence of definitions of addition, multiplication, exponentiation:

  – Addition:

    n + 0 = n ,
    n + (m + 1) = (n + m) + 1 .

    Therefore, if we write (+1) for the function N → N, (+1)(n) = n + 1, then n + m = (+1)^m(n) .

  – Multiplication:

    n · 0 = 0 ,
    n · (m + 1) = (n · m) + n .

    Therefore, if we write (+n) for the function N → N, (+n)(k) = k + n, then n · m = (+n)^m(0) .

  – Exponentiation:

    n^0 = 1 ,
    n^(m+1) = (n^m) · n .

    Therefore, if we write (·n) for the function N → N, (·n)(m) = n · m, then n^m = (·n)^m(1) .

  We can extend this sequence further, by defining

  – Superexponentiation:

    superexp(n, 0) = 1 ,
    superexp(n, m + 1) = n^superexp(n, m) .

    Therefore, if we write (n ↑) for the function N → N, (n ↑)(k) = n^k, then superexp(n, m) = (n ↑)^m(1) .

  – Supersuperexponentiation:

    supersuperexp(n, 0) = 1 ,
    supersuperexp(n, m + 1) = superexp(n, supersuperexp(n, m)) .

  – Etc.

  This way we obtain a sequence of extremely fast growing functions.

Traditionally, instead of considering such binary functions, one considers a sequence of unary functions which have a similar growth rate. These functions are called the Ackermann functions, and they will exhaust all primitive-recursive functions (in the sense of Lemma 5.9):

• For n ∈ N, the n-th branch of the Ackermann function, Ack_n : N → N, is defined by

  Ack_0(y) = y + 1 ,
  Ack_{n+1}(y) = (Ack_n)^(y+1)(1) = Ack_n(Ack_n(· · · Ack_n(1) · · ·))    (y + 1 applications of Ack_n).

  Ack_n is primitive-recursive for every fixed n. We show this by induction on n:

  – Base case: Ack_0 = succ is primitive-recursive.
  – Induction step: Assume Ack_n is primitive-recursive. We show that Ack_{n+1} is primitive-recursive as well. We have:

    Ack_{n+1}(0) = Ack_n(1) ,
    Ack_{n+1}(y + 1) = Ack_n(Ack_{n+1}(y)) .

    Therefore Ack_{n+1} is primitive-recursive.

Remark:

Ack_0(n) = n + 1 .

Ack_1(n) = Ack_0^(n+1)(1) = 1 + 1 + · · · + 1 (n + 1 ones added to 1) = 1 + n + 1 = n + 2 .

Ack_2(n) = Ack_1^(n+1)(1) = 1 + 2 + · · · + 2 (n + 1 twos added to 1) = 1 + 2(n + 1) = 2n + 3 > 2n .

Ack_3(n) = Ack_2^(n+1)(1) > 2 · 2 · · · · · 2 · 1 (n + 1 factors 2) = 2^(n+1) > 2^n .

Ack_4(n) = Ack_3^(n+1)(1) > 2^(2^(···^(2^1))) (a tower of n + 1 twos).

Ack_5(n) will iterate Ack_4 n + 1 times, etc.
So already for small n, Ack_5(n) will exceed the number of particles in the universe, and therefore Ack_5 is not realistically computable.

Whereas Ack_n for fixed n is primitive-recursive, we will show below that the uniform version of the Ackermann function, i.e. Ack : N^2 → N, Ack(x, y) := Ack_x(y), is not primitive-recursive.

5.2 Closure Properties of the Primitive-Recursive Functions

• The primitive-recursive relations are closed under union, intersection, and complement: if R, S ⊆ N^n are primitive-recursive, so are R ∪ S, R ∩ S and N^n \ R. Note that

• (R ∪ S)(~x) ⇔ R(~x) ∨ S(~x), since (R ∪ S)(~x) ⇔ ~x ∈ R ∪ S ⇔ ~x ∈ R ∨ ~x ∈ S ⇔ R(~x) ∨ S(~x),

• (R ∩ S)(~x) ⇔ R(~x) ∧ S(~x), since (R ∩ S)(~x) ⇔ ~x ∈ R ∩ S ⇔ ~x ∈ R ∧ ~x ∈ S ⇔ R(~x) ∧ S(~x),

• (N^n \ R)(~x) ⇔ ¬R(~x), since (N^n \ R)(~x) ⇔ ~x ∈ N^n \ R ⇔ ~x ∉ R ⇔ ¬R(~x).

Therefore, the primitive-recursive predicates are essentially closed under ∨, ∧, ¬.

Proof that primitive-recursive predicates are closed under ∪, ∩ and complement:

– χ_{R∪S}(~x) = sig(χ_R(~x) + χ_S(~x)) (and therefore R ∪ S is primitive-recursive):

  ∗ If R(~x) holds, then χ_R(~x) = 1 and χ_S(~x) ≥ 0, so χ_R(~x) + χ_S(~x) ≥ 1, therefore
    sig(χ_R(~x) + χ_S(~x)) = 1 = χ_{R∪S}(~x) .
  ∗ Similarly, if S(~x) holds, then χ_R(~x) + χ_S(~x) ≥ 1, therefore
    sig(χ_R(~x) + χ_S(~x)) = 1 = χ_{R∪S}(~x) .
  ∗ If neither R(~x) nor S(~x) holds, then χ_R(~x) = χ_S(~x) = 0, therefore
    sig(χ_R(~x) + χ_S(~x)) = sig(0) = 0 = χ_{R∪S}(~x) .

– χ_{R∩S}(~x) = χ_R(~x) · χ_S(~x) (and therefore R ∩ S is primitive-recursive):

  ∗ If R(~x) and S(~x) hold, then χ_R(~x) = χ_S(~x) = 1, therefore
    χ_R(~x) · χ_S(~x) = 1 = χ_{R∩S}(~x) .
  ∗ If ¬R(~x) holds, then χ_R(~x) = 0, therefore
    χ_R(~x) · χ_S(~x) = 0 = χ_{R∩S}(~x) .
  ∗ Similarly, if ¬S(~x) holds, then χ_S(~x) = 0, therefore
    χ_R(~x) · χ_S(~x) = 0 = χ_{R∩S}(~x) .

– χ_{N^n\R}(~x) = 1 −· χ_R(~x) (and therefore N^n \ R is primitive-recursive):

  ∗ If R(~x) holds, then χ_R(~x) = 1, therefore
    1 −· χ_R(~x) = 0 = χ_{N^n\R}(~x) .
  ∗ If R(~x) does not hold, then χ_R(~x) = 0, therefore
    1 −· χ_R(~x) = 1 = χ_{N^n\R}(~x) .

• The predicates "x ≤ y" and "x = y" are primitive-recursive:

  – x ≤ y ⇔ ¬(y < x). The right-hand side of this equivalence is primitive-recursive, since "y < x" is primitive-recursive, and primitive-recursive predicates are closed under ¬.
  – x = y ⇔ x ≤ y ∧ y ≤ x. The right-hand side of this equivalence is primitive-recursive, since "x ≤ y" and "y ≤ x" are primitive-recursive, and primitive-recursive predicates are closed under ∧.

• The primitive-recursive functions are closed under definition by cases: assume g1, g2 : N^n → N are primitive-recursive and R ⊆ N^n is primitive-recursive. Then the function f : N^n → N,

  f(~x) := g1(~x), if R(~x);    f(~x) := g2(~x), if ¬R(~x);

  is primitive-recursive, since f(~x) = g1(~x) · χ_R(~x) + g2(~x) · χ_{N^n\R}(~x):

  – If R(~x) holds, then χ_R(~x) = 1 and χ_{N^n\R}(~x) = 0, so
    g1(~x) · χ_R(~x) + g2(~x) · χ_{N^n\R}(~x) = g1(~x) + 0 = g1(~x) = f(~x) .
  – If ¬R(~x) holds, then χ_R(~x) = 0 and χ_{N^n\R}(~x) = 1, so
    g1(~x) · χ_R(~x) + g2(~x) · χ_{N^n\R}(~x) = 0 + g2(~x) = g2(~x) = f(~x) .


• The primitive-recursive functions are closed under bounded sums:

  If g : N^(n+1) → N is primitive-recursive, so is

  f : N^(n+1) → N ,  f(~x, y) := ∑_{z<y} g(~x, z) ,

  where ∑_{z<0} g(~x, z) := 0, and for y > 0

  ∑_{z<y} g(~x, z) := g(~x, 0) + g(~x, 1) + · · · + g(~x, y − 1) .

  Proof that f is primitive-recursive: this follows from the equations

  f(~x, 0) = 0 ,
  f(~x, y + 1) = f(~x, y) + g(~x, y) ,

  where the last equation follows from

  f(~x, y + 1) = ∑_{z<y+1} g(~x, z) = (∑_{z<y} g(~x, z)) + g(~x, y) = f(~x, y) + g(~x, y) .

• The primitive-recursive functions are closed under bounded products:

  If g : N^(n+1) → N is primitive-recursive, so is

  f : N^(n+1) → N ,  f(~x, y) := ∏_{z<y} g(~x, z) ,

  where ∏_{z<0} g(~x, z) := 1, and for y > 0

  ∏_{z<y} g(~x, z) := g(~x, 0) · g(~x, 1) · · · · · g(~x, y − 1) .

  Proof that f is primitive-recursive: this follows from the equations

  f(~x, 0) = 1 ,
  f(~x, y + 1) = f(~x, y) · g(~x, y) ,

  where the last equation follows from

  f(~x, y + 1) = ∏_{z<y+1} g(~x, z) = (∏_{z<y} g(~x, z)) · g(~x, y) = f(~x, y) · g(~x, y) .


Example for closure under bounded products:

f : N → N, f(n) := n! = 1 · 2 · · · · · n

is primitive-recursive, since

f(n) = ∏_{i<n} (i + 1) = ∏_{i<n} g(i) ,

where g(i) := i + 1 is primitive-recursive.

(Note that in the special case n = 0 we have f(0) = 0! = 1 = ∏_{i<0} (i + 1), the empty product.)

• The primitive-recursive relations are closed under bounded quantification: if R ⊆ N^(n+1) is primitive-recursive, so are the relations

  R1(~x, y) :⇔ ∀z < y.R(~x, z) ,
  R2(~x, y) :⇔ ∃z < y.R(~x, z) .

  Proof for R1: χ_{R1}(~x, y) = ∏_{z<y} χ_R(~x, z) :

  – If ∀z < y.R(~x, z) holds, then ∀z < y.χ_R(~x, z) = 1, therefore
    ∏_{z<y} χ_R(~x, z) = ∏_{z<y} 1 = 1 = χ_{R1}(~x, y) .
  – If for one z < y we have ¬R(~x, z), then for this z we have χ_R(~x, z) = 0, therefore
    ∏_{z<y} χ_R(~x, z) = 0 = χ_{R1}(~x, y) .

  Proof for R2: χ_{R2}(~x, y) = sig(∑_{z<y} χ_R(~x, z)) :

  – If ∀z < y.¬R(~x, z), then
    sig(∑_{z<y} χ_R(~x, z)) = sig(∑_{z<y} 0) = sig(0) = 0 = χ_{R2}(~x, y) .
  – If for one z < y we have R(~x, z), then for this z we have χ_R(~x, z) = 1, therefore
    ∑_{z<y} χ_R(~x, z) ≥ χ_R(~x, z) = 1 , and therefore
    sig(∑_{z<y} χ_R(~x, z)) = 1 = χ_{R2}(~x, y) .


• The primitive-recursive functions are closed under bounded search, i.e. if R ⊆ N^(n+1) is a primitive-recursive predicate, then so is f(~x, y) := µz < y.R(~x, z), where

  µz < y.R(~x, z) := the least z < y s.t. R(~x, z) holds, if such a z exists, and y otherwise.

  Proof: Define

  Q(~x, y) :⇔ R(~x, y) ∧ ∀z < y.¬R(~x, z) ,
  Q′(~x, y) :⇔ ∀z < y.¬R(~x, z) .

  Q and Q′ are primitive-recursive. Q(~x, y) holds exactly if y is the minimal z s.t. R(~x, z) holds (i.e. if R(~x, y) holds, but R(~x, z) is false for all z < y).

  We show f(~x, y) = (∑_{z<y} χ_Q(~x, z) · z) + χ_{Q′}(~x, y) · y (a small code sketch of this construction follows below):

  – If there exists z < y s.t. R(~x, z) holds, then for the minimal such z we have Q(~x, z), therefore χ_Q(~x, z) · z = z. For all other z′ < y we have ¬Q(~x, z′), therefore χ_Q(~x, z′) · z′ = 0 for z′ ≠ z. Furthermore ¬Q′(~x, y), therefore χ_{Q′}(~x, y) · y = 0. It follows that

    (∑_{z<y} χ_Q(~x, z) · z) + χ_{Q′}(~x, y) · y = z = f(~x, y) .

  – If there exists no z < y s.t. R(~x, z) holds, then we have ¬Q(~x, z) for all z < y, therefore χ_Q(~x, z) · z = 0 for all z < y. Furthermore Q′(~x, y) holds, therefore χ_{Q′}(~x, y) · y = y. It follows that

    (∑_{z<y} χ_Q(~x, z) · z) + χ_{Q′}(~x, y) · y = y = f(~x, y) .
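The following small Python sketch (not part of the notes, names ad hoc) implements the bounded-search formula from the proof literally and compares it with a direct search:

    # Bounded search mu z < y. R(xs, z), built from the formula in the proof:
    # (sum over z < y of chi_Q(xs,z) * z) + chi_Q'(xs,y) * y.

    def bounded_mu(chi_R, y, *xs):
        chi_Q = lambda z: chi_R(*xs, z) * (1 if all(chi_R(*xs, w) == 0 for w in range(z)) else 0)
        chi_Qp = 1 if all(chi_R(*xs, w) == 0 for w in range(y)) else 0
        return sum(chi_Q(z) * z for z in range(y)) + chi_Qp * y

    # Direct definition, for comparison.
    def bounded_mu_direct(chi_R, y, *xs):
        return next((z for z in range(y) if chi_R(*xs, z) == 1), y)

    chi = lambda x, z: 1 if z * z >= x else 0     # R(x, z) :<=> z^2 >= x
    assert all(bounded_mu(chi, 10, x) == bounded_mu_direct(chi, 10, x) for x in range(30))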

We now show that the functions which encode sequences of natural numbers (of fixed or arbitrary length) as natural numbers, or decode them, are primitive-recursive:

Lemma 5.3 The following functions are primitive-recursive:

(a) The pairing function π : N^2 → N.
    (Remember that π(n, m) encodes two natural numbers as one.)

(b) The projections π0, π1 : N → N.
    (Remember that π0(π(n, m)) = n, π1(π(n, m)) = m, so π0, π1 invert π.)

(c) π^k : N^k → N (k ≥ 1).
    (Remember that π^k(n0, . . . , nk−1) encodes the sequence (n0, . . . , nk−1).)

(d) f : N^3 → N,

    f(x, k, i) = π^k_i(x), if i < k, and f(x, k, i) = x otherwise.

    (Remember that π^k_i(π^k(n0, . . . , nk−1)) = ni for i < k.)
    We write π^k_i(x) for f(x, k, i), even if i ≥ k.

(e) The function fk : N^k → N, fk(x0, . . . , xk−1) = 〈x0, . . . , xk−1〉.
    (Remember that 〈x0, . . . , xk−1〉 encodes the sequence x0, . . . , xk−1 as one natural number. Note that it doesn't make sense to say that λx.〈x〉 : N∗ → N is primitive-recursive, since each primitive-recursive function has to have domain N^k for some fixed k.)

(f) lh : N → N.
    (Remember that lh(〈x0, . . . , xk−1〉) = k.)

(g) g : N^2 → N, g(x, i) = (x)i.
    (Remember that (〈x0, . . . , xk−1〉)i = xi for i < k.)

Proof:
(a) π(n, m) = (∑_{i≤n+m} i) + m = (∑_{i<n+m+1} i) + m is primitive-recursive.

(b) One can easily show that n, m ≤ π(n, m). Therefore we can define

    π0(n) := µk < n + 1.∃l < n + 1.n = π(k, l) ,
    π1(n) := µl < n + 1.∃k < n + 1.n = π(k, l) .

    Therefore π0, π1 are primitive-recursive. (A code sketch of (a) and (b) follows after this proof.)

(c) Proof by induction on k.
    k = 1: π^1(x) = x, so π^1 is primitive-recursive.
    k → k + 1: Assume that π^k is primitive-recursive. We show that π^(k+1) is primitive-recursive as well:

    π^(k+1)(x0, . . . , xk) = π(π^k(x0, . . . , xk−1), xk) .

    Therefore π^(k+1) is primitive-recursive (using that π, π^k are primitive-recursive).

(d) We have

    π^1_0(x) = x ,
    π^(k+1)_i(x) = π^k_i(π0(x)), if i < k,
    π^(k+1)_i(x) = π1(x), if i = k.

    Unfolding this recursion gives, for 0 ≤ i < k,

    π^k_i(x) = π1((π0)^(k−1−i)(x)), if i > 0,    and    π^k_0(x) = (π0)^(k−1)(x) ,

    and therefore

    f(x, k, i) = x, if i ≥ k;  f(x, k, i) = π1((π0)^(k−1−i)(x)), if 0 < i < k;  f(x, k, i) = (π0)^(k−1)(x), if i = 0 < k.

    Define g : N^2 → N,

    g(x, 0) := x ,
    g(x, k + 1) := π0(g(x, k)) ,

    which is primitive-recursive. Then g(x, k) = (π0)^k(x), therefore

    f(x, k, i) = x, if i ≥ k;  f(x, k, i) = π1(g(x, (k −· i) −· 1)), if 0 < i < k;  f(x, k, i) = g(x, k −· 1), if i = 0 < k.

    So f is primitive-recursive.

(e) fk(x0, . . . , xk−1) = 1 + π(k −· 1, π^k(x0, . . . , xk−1)) is primitive-recursive.

(f) lh(x) = 0, if x = 0, and lh(x) = π0(x −· 1) + 1, if x ≠ 0.

(g) (x)i = π^(lh(x))_i(π1(x −· 1)) = f(π1(x −· 1), lh(x), i) is primitive-recursive.
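Parts (a) and (b) can be transcribed directly (Python sketch, names ad hoc): the pairing function is the explicit sum, and the projections are obtained by bounded search, exactly as in the proof:

    # Cantor pairing pi(n, m) = (sum of i for i <= n+m) + m and its inverses via bounded search.

    def pi(n, m):
        return sum(range(n + m + 1)) + m

    def pi0(x):   # mu k < x+1 . exists l < x+1 . x = pi(k, l)
        return next(k for k in range(x + 1) if any(pi(k, l) == x for l in range(x + 1)))

    def pi1(x):   # mu l < x+1 . exists k < x+1 . x = pi(k, l)
        return next(l for l in range(x + 1) if any(pi(k, l) == x for k in range(x + 1)))

    assert all(pi0(pi(n, m)) == n and pi1(pi(n, m)) == m for n in range(12) for m in range(12))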

Lemma and Definition 5.4 There exist primitive-recursive functions with the following properties:

(a) A function snoc : N^2 → N s.t. snoc(〈x0, . . . , xn−1〉, x) = 〈x0, . . . , xn−1, x〉.
    (The name snoc is derived from reversing the letters of cons. Our definition of the encoding of sequences corresponds to appending new elements at the end rather than at the beginning.)

(b) Functions last : N → N and beginning : N → N s.t.

    last(snoc(x, y)) = y ,
    beginning(snoc(x, y)) = x .

Proof:
(a) Define

    snoc(x, y) = 〈y〉, if x = 0, and snoc(x, y) = 1 + π(lh(x), π(π1(x −· 1), y)) otherwise,

    so snoc is primitive-recursive. We have

    snoc(〈〉, y) = snoc(0, y) = 〈y〉 ,

    and, using lh(〈x0, . . . , xk〉) = k + 1,

    snoc(〈x0, . . . , xk〉, y) = snoc(1 + π(k, π^(k+1)(x0, . . . , xk)), y)
                              = 1 + π(k + 1, π(π1((1 + π(k, π^(k+1)(x0, . . . , xk))) −· 1), y))
                              = 1 + π(k + 1, π(π1(π(k, π^(k+1)(x0, . . . , xk))), y))
                              = 1 + π(k + 1, π(π^(k+1)(x0, . . . , xk), y))
                              = 1 + π(k + 1, π^(k+2)(x0, . . . , xk, y))
                              = 〈x0, . . . , xk, y〉 .


(b) Proof for beginning: Define

    beginning(x) := 〈〉, if lh(x) ≤ 1;
    beginning(x) := 〈(x)0〉, if lh(x) = 2;
    beginning(x) := 1 + π((lh(x) −· 1) −· 1, π0(π1(x −· 1))), otherwise.

    Let x = snoc(y, z). We show beginning(x) = y.

    Case lh(y) = 0: Then x = snoc(y, z) = 〈z〉, therefore lh(x) = 1, and beginning(x) = 〈〉 = y.

    Case lh(y) = 1: Then y = 〈y′〉 for some y′, snoc(y, z) = 〈y′, z〉, and

    beginning(x) = 〈(x)0〉 = 〈(〈y′, z〉)0〉 = 〈y′〉 = y .

    Case lh(y) > 1: Let lh(y) = n + 2, so that

    y = 〈y0, . . . , yn+1〉 = 1 + π(n + 1, π^(n+2)(y0, . . . , yn+1)) .

    Then snoc(y, z) = 1 + π(n + 2, π(π1(y −· 1), z)), and lh(snoc(y, z)) = n + 3. Therefore

    beginning(snoc(y, z)) = 1 + π((lh(x) −· 1) −· 1, π0(π1(snoc(y, z) −· 1)))
                          = 1 + π(n + 1, π0(π1((1 + π(n + 2, π(π1(y −· 1), z))) −· 1)))
                          = 1 + π(n + 1, π0(π1(π(n + 2, π(π1(y −· 1), z)))))
                          = 1 + π(n + 1, π0(π(π1(y −· 1), z)))
                          = 1 + π(n + 1, π1(y −· 1))
                          = 1 + π(n + 1, π1((1 + π(n + 1, π^(n+2)(y0, . . . , yn+1))) −· 1))
                          = 1 + π(n + 1, π1(π(n + 1, π^(n+2)(y0, . . . , yn+1))))
                          = 1 + π(n + 1, π^(n+2)(y0, . . . , yn+1))
                          = y .

Proof for last: Define last(x) := (x)_(lh(x) −· 1). If y = 〈y0, . . . , yn−1〉, then

    last(snoc(y, z)) = last(〈y0, . . . , yn−1, z〉)
                     = (〈y0, . . . , yn−1, z〉)_(lh(〈y0, . . . , yn−1, z〉) −· 1)
                     = (〈y0, . . . , yn−1, z〉)_n
                     = z .


Lemma 5.5 The primitive-recursive functions are closed under course-of-values recursion: if g : N^(n+2) → N is primitive-recursive, then f : N^(n+1) → N is primitive-recursive as well, where f is defined by

f(~x, y) = g(~x, y, 〈f(~x, 0), f(~x, 1), . . . , f(~x, y − 1)〉) .

Remark: Informally, this means: if we can define f(~x, y) by an expression which uses previously defined primitive-recursive functions, constants, ~x, y and any f(~x, z) with z < y, then f is primitive-recursive – if we have such an expression, we can form a function g as in the lemma above.

Example: The Fibonacci numbers are primitive-recursive, i.e. the function fib : N → N, defined by

fib(0) := 1 ,
fib(1) := 1 ,
fib(n) := fib(n − 1) + fib(n − 2), if n > 1,

is primitive-recursive, as shown by course-of-values recursion. (This is the inefficient implementation; the more efficient solution can be seen directly to be primitive-recursive.) A small code sketch of course-of-values recursion and of this example follows after the proof.

Proof that the primitive-recursive functions are closed under course-of-values recursion: Let

f(~x, y) := g(~x, y, 〈f(~x, 0), f(~x, 1), . . . , f(~x, y − 1)〉) .

We show that f is primitive-recursive. Define h : N^(n+1) → N,

h(~x, y) := 〈f(~x, 0), f(~x, 1), . . . , f(~x, y − 1)〉 .

(In particular, h(~x, 0) = 〈〉.) It follows that

h(~x, 0) = 〈〉 ,
h(~x, y + 1) = 〈f(~x, 0), f(~x, 1), . . . , f(~x, y − 1), f(~x, y)〉
            = snoc(〈f(~x, 0), . . . , f(~x, y − 1)〉, f(~x, y))
            = snoc(h(~x, y), g(~x, y, 〈f(~x, 0), . . . , f(~x, y − 1)〉))
            = snoc(h(~x, y), g(~x, y, h(~x, y))) .

Therefore h is primitive-recursive. Now

f(~x, y) = (〈f(~x, 0), . . . , f(~x, y)〉)_y = (h(~x, y + 1))_y

is primitive-recursive.
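A sketch of course-of-values recursion in Python (names ad hoc; an ordinary list stands in for the sequence code 〈f(~x, 0), . . . , f(~x, y − 1)〉, and a dummy first argument stands in for the missing parameters ~x), applied to the Fibonacci example:

    # Course-of-values recursion: f(x, k) = g(x, k, [f(x,0), ..., f(x,k-1)]).

    def course_of_values(g):
        def f(x, k):
            history = []                         # plays the role of <f(x,0),...,f(x,i-1)>
            for i in range(k + 1):
                history.append(g(x, i, history[:i]))
            return history[k]
        return f

    # Fibonacci as in the example: fib(0) = fib(1) = 1, fib(n) = fib(n-1) + fib(n-2).
    def g_fib(_x, n, prev):
        return 1 if n < 2 else prev[n - 1] + prev[n - 2]

    fib = course_of_values(g_fib)
    print([fib(0, n) for n in range(10)])   # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]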

We now show that functions which manipulate codes for sequences of natural numbers are primitive-recursive.

Lemma and Definition 5.6 There exist primitive-recursive functions with the following properties:

(a) A function append : N^2 → N s.t.

    append(〈n0, . . . , nk−1〉, 〈m0, . . . , ml−1〉) = 〈n0, . . . , nk−1, m0, . . . , ml−1〉 .

    We write n ∗ m for append(n, m).

(b) A function subst : N^3 → N s.t. if i < n then

    subst(〈x0, . . . , xn−1〉, i, y) = 〈x0, . . . , xi−1, y, xi+1, . . . , xn−1〉 ,

    and if i ≥ n, then

    subst(〈x0, . . . , xn−1〉, i, y) = 〈x0, . . . , xn−1〉 .

    This means that subst(x, i, y) is the result of substituting the ith element of the sequence x by y, if such an element exists – if not, x remains unchanged.

    We write x[i/y] for subst(x, i, y).

(c) A function substring : N^3 → N s.t., if i < n,

    substring(〈x0, . . . , xn−1〉, i, j) = 〈xi, xi+1, . . . , xmin(j−1,n−1)〉 ,

    and if i ≥ n,

    substring(〈x0, . . . , xn−1〉, i, j) = 〈〉 .

(d) A function half : N → N s.t. half(n) = k if n = 2k or n = 2k + 1.

(e) The function bin : N → N s.t. bin(n) = 〈b0, . . . , bk〉, where the bi are in normal form (no leading zeros, unless n = 0) and n = (b0, . . . , bk)2.

(f) A function bin⁻¹ : N → N s.t. bin⁻¹(〈b0, . . . , bk〉) = n, if (b0, . . . , bk)2 = n.

Proof:
(a) We have

    append(〈x0, . . . , xn〉, 0) = append(〈x0, . . . , xn〉, 〈〉) = 〈x0, . . . , xn〉 ,

    and for m > 0

    append(〈x0, . . . , xn〉, 〈y0, . . . , ym〉) = 〈x0, . . . , xn, y0, . . . , ym〉
        = snoc(〈x0, . . . , xn, y0, . . . , ym−1〉, ym)
        = snoc(append(〈x0, . . . , xn〉, 〈y0, . . . , ym−1〉), ym)
        = snoc(append(〈x0, . . . , xn〉, beginning(〈y0, . . . , ym〉)), last(〈y0, . . . , ym〉)) .

    Therefore we have

    append(x, 0) = x ,
    append(x, y) = snoc(append(x, beginning(y)), last(y)), for y > 0 .

    One can see that beginning(x) < x for x > 0, therefore the last equations give a definition of append by course-of-values recursion, so append is primitive-recursive.

(b) We have

    subst(x, i, y) := x, if lh(x) ≤ i;
    subst(x, i, y) := snoc(beginning(x), y), if i + 1 = lh(x);
    subst(x, i, y) := snoc(subst(beginning(x), i, y), last(x)), if i + 1 < lh(x).

    Therefore subst is definable by course-of-values recursion.

(c) We can define

    substring(x, i, j) := 〈〉, if i ≥ lh(x);
    substring(x, i, j) := substring(beginning(x), i, j), if i < lh(x) and j < lh(x);
    substring(x, i, j) := snoc(substring(beginning(x), i, j), last(x)), if i < lh(x) ≤ j;

    which is a definition by course-of-values recursion.

(d) half(x) = µy < x.(2 · y = x ∨ 2 · y + 1 = x).

(e) bin(x) = 〈0〉, if x = 0;  bin(x) = 〈1〉, if x = 1;  bin(x) = snoc(bin(half(x)), x −· 2 · half(x)), if x > 1;

    therefore bin is definable by course-of-values recursion.

(f) bin⁻¹(x) = 0, if lh(x) = 0;  bin⁻¹(x) = (x)0, if lh(x) = 1;  bin⁻¹(x) = bin⁻¹(beginning(x)) · 2 + last(x), if lh(x) > 1;

    therefore bin⁻¹ is definable by course-of-values recursion. (A code sketch of (d)–(f) follows below.)

5.3 Not All Computable Functions are Primitive-Recursive

All primitive-recursive functions are computable. However, there are computable functions which are not primitive-recursive. One such function is the Ackermann function:

Definition 5.7 The Ackermann function Ack : N^2 → N is defined as Ack(n, m) := Ack_n(m). Therefore Ack has the following recursion equations:

Ack(0, y) = y + 1 ,
Ack(x + 1, 0) = Ack(x, 1) ,
Ack(x + 1, y + 1) = Ack(x, Ack(x + 1, y)) .
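The three recursion equations translate directly into a (very inefficient) program; the following Python sketch is not part of the notes, but it illustrates both that Ack is computable and how quickly it grows:

    import sys
    sys.setrecursionlimit(100000)      # the double recursion gets deep even for small inputs

    def ack(x, y):                     # the three equations of Definition 5.7
        if x == 0:
            return y + 1
        if y == 0:
            return ack(x - 1, 1)
        return ack(x - 1, ack(x, y - 1))

    print([ack(1, n) for n in range(5)])   # n + 2
    print([ack(2, n) for n in range(5)])   # 2n + 3
    print([ack(3, n) for n in range(5)])   # 2^(n+3) - 3
    print(ack(3, 5))                       # 253; ack(4, 2) already has 19729 digits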

Lemma 5.8 For all m, n ∈ N the following holds:

(a) Ack(m, n) > n.

(b) Ack_m is strictly monotone, i.e. Ack(m, n + 1) > Ack(m, n).

(c) Ack(m + 1, n) > Ack(m, n).

(d) Ack(m, Ack(m, n)) < Ack(m + 2, n).

(e) Ack(m, 2n) < Ack(m + 2, n).

(f) Ack(m, 2^k · n) < Ack(m + 2k, n).

Proof:
(a) Induction on m.
    m = 0: Ack(0, n) = n + 1 > n.
    m → m + 1: Side induction on n.
    n = 0: Ack(m + 1, 0) = Ack(m, 1) > 1 > 0, using the induction hypothesis.
    n → n + 1:

    Ack(m + 1, n + 1) = Ack(m, Ack(m + 1, n)) > Ack(m + 1, n) > n ,

    where the first inequality uses the main induction hypothesis and the second the side induction hypothesis; therefore Ack(m + 1, n + 1) > n + 1.

(b) Case m = 0: Ack(0, n + 1) = n + 2 > n + 1 = Ack(0, n).
    Case m = m′ + 1:

    Ack(m′ + 1, n + 1) = Ack(m′, Ack(m′ + 1, n)) > Ack(m′ + 1, n) ,

    using (a).

(c) Induction on m.
    m = 0: Ack(1, n) = n + 2 > n + 1 = Ack(0, n).
    m → m + 1: Side induction on n.
    n = 0: Ack(m + 2, 0) = Ack(m + 1, 1) > Ack(m + 1, 0), using (b).
    n → n + 1:

    Ack(m + 2, n + 1) = Ack(m + 1, Ack(m + 2, n))
                      > Ack(m, Ack(m + 2, n))          (main induction hypothesis)
                      > Ack(m, Ack(m + 1, n))          (side induction hypothesis and (b))
                      = Ack(m + 1, n + 1) .

(d) Case m = 0: Ack(0, Ack(0, n)) = n + 2 < 2n + 3 = Ack(2, n).
    Assume now m > 0. Proof of the assertion by induction on n.
    n = 0:

    Ack(m + 2, 0) = Ack(m + 1, 1) = Ack(m, Ack(m + 1, 0)) > Ack(m, Ack(m, 0)) .

    n → n + 1:

    Ack(m + 2, n + 1) = Ack(m + 1, Ack(m + 2, n))
                      > Ack(m + 1, Ack(m, Ack(m, n)))  (induction hypothesis and (b))
                      > Ack(m, Ack(m − 1, Ack(m, n)))  ((b) and (c))
                      = Ack(m, Ack(m, n + 1)) .


(e) Case m = 0:

    Ack(0, 2n) = 2n + 1 < 2n + 3 = Ack(2, n) .

    Case m = m′ + 1: Induction on n.
    n = 0: Ack(m′ + 1, 2 · 0) = Ack(m′ + 1, 0) < Ack(m′ + 3, 0).
    n → n + 1:

    Ack(m′ + 1, 2n + 2) = Ack(m′, Ack(m′, Ack(m′ + 1, 2n)))
                        < Ack(m′ + 2, Ack(m′ + 1, 2n))    (by (d))
                        < Ack(m′ + 2, Ack(m′ + 3, n))     (induction hypothesis and (b))
                        = Ack(m′ + 3, n + 1) .

(f) Induction on k.
    k = 0: trivial.
    k → k + 1:

    Ack(m, 2^(k+1) · n) = Ack(m, 2 · 2^k · n)
                        < Ack(m + 2, 2^k · n)     (by (e))
                        < Ack(m + 2 + 2k, n)      (induction hypothesis)
                        = Ack(m + 2(k + 1), n) .

Lemma 5.9 Every primitive-recursive function f : N^n → N can be majorised by one branch of the Ackermann function, i.e. there exists an N s.t.

f(x0, . . . , xn−1) < Ack_N(x0 + · · · + xn−1)

for all x0, . . . , xn−1 ∈ N. In particular, if f : N → N is primitive-recursive, then there exists an N s.t.

∀x ∈ N.f(x) < Ack_N(x) .

Proof: We write ∑(~x) for x0 + · · · + xn−1, if ~x = x0, . . . , xn−1. We prove the assertion by induction on the definition of the primitive-recursive functions.

Basic functions:

zero: zero(x) = 0 < x + 1 = Ack_0(x) .

succ: succ(x) = Ack(0, x) < Ack(1, x) = Ack_1(x) .


proj^n_i:

proj^n_i(x0, . . . , xn−1) = xi < x0 + · · · + xn−1 + 1 = Ack_0(x0 + · · · + xn−1) .

Composition: Assume the assertion holds for f : N^k → N and gi : N^n → N. We show the assertion for h := f ◦ (g0, . . . , gk−1). Assume

f(~y) < Ack_l(∑(~y))   and   gi(~x) < Ack_{mi}(∑(~x)) .

Let N := max{l, m0, . . . , mk−1}. By 5.8 (c) it follows that

f(~y) < Ack_N(∑(~y)) ,
gi(~x) < Ack_N(∑(~x)) .

Then, with M s.t. k < 2^M, we have

h(~x) = f(g0(~x), . . . , gk−1(~x))
      < Ack_N(g0(~x) + · · · + gk−1(~x))
      < Ack_N(Ack_N(∑(~x)) + · · · + Ack_N(∑(~x)))        (by (b))
      = Ack_N(Ack_N(∑(~x)) · k)
      < Ack_N(Ack_N(∑(~x)) · 2^M)
      < Ack_{N+2M}(Ack_N(∑(~x)))                           (by (f))
      ≤ Ack_{N+2M}(Ack_{N+2M}(∑(~x)))
      < Ack_{N+2M+2}(∑(~x))                                (by (d)) .

Primitive recursion, n ≥ 1: Assume the assertion holds for f : N^n → N and g : N^(n+2) → N. We show the assertion for h := primrec(f, g) : N^(n+1) → N. Assume

f(~x) < Ack_l(∑(~x)) ,
g(~x, y, z) < Ack_r(∑(~x) + y + z) .

Let N := max{l, r}. Then

f(~x) < Ack_N(∑(~x)) ,
g(~x, y, z) < Ack_N(∑(~x) + y + z) .

We show

h(~x, y) < Ack_{N+3}(∑(~x) + y)

by induction on y.
y = 0:


h(~x, 0) = f(~x) < Ack_N(∑(~x)) < Ack_{N+3}(∑(~x) + 0) .

y → y + 1:

h(~x, y + 1) = g(~x, y, h(~x, y))
             < Ack_N(∑(~x) + y + h(~x, y))
             < Ack_N(∑(~x) + y + Ack_{N+3}(∑(~x) + y))        (induction hypothesis)
             < Ack_N(Ack_{N+3}(∑(~x) + y) + Ack_{N+3}(∑(~x) + y))
             = Ack_N(2 · Ack_{N+3}(∑(~x) + y))
             < Ack_{N+2}(Ack_{N+3}(∑(~x) + y))                 (by (e))
             = Ack_{N+3}(∑(~x) + y + 1) .

Primitive recursion, n = 0: Assume l ∈ N and g : N^2 → N. We show the assertion for h := primrec(l, g) : N → N. Define

f′ : N → N, f′(x) = l ,
g′ : N^3 → N, g′(x, y, z) = g(y, z) ,
h′ : N^2 → N, h′ := primrec(f′, g′) .

Using the constructions already shown and the induction hypothesis it follows that

h′(x, y) < Ack_N(x + y)

for some N. Therefore

h(y) = h′(0, y) < Ack_N(y) .

Lemma 5.10 The Ackermann function is not primitive-recursive.

Proof: Assume Ack were primitive-recursive. Then f : N → N, f(n) := Ack(n, n) would be primitive-recursive as well. Then there exists an N ∈ N s.t. f(n) < Ack(N, n) for all n, in particular

Ack(N, N) = f(N) < Ack(N, N) ,

a contradiction.

Remark: We can show more directly that there are non-primitive-recursive computable functions. Assume all computable functions are primitive-recursive. Define h : N^2 → N as follows:

h(e, n) := f(n), if e encodes a string in ASCII which is a term denoting a unary primitive-recursive function f; and h(e, n) := 0 otherwise.


So h(e, n) computes, if e is a code for a primitive-recursive function f, the result of applying f to n. For other e, the result of h(e, n) doesn't matter as long as it is defined; we have set it to 0. h can be considered as an interpreter for the primitive-recursive functions.
So if e is a code for f, then ∀n.f(n) = h(e, n), i.e. f = λn.h(e, n). Therefore for each primitive-recursive function f : N → N there exists an e s.t. f = λn.h(e, n) (choose e to be the code for f).
h is computable, since the defining laws for the primitive-recursive functions give us a way of computing h(e, n). Using the assumption that all computable functions are primitive-recursive, it follows that h is primitive-recursive.
Define

f : N → N, f(n) := h(n, n) + 1 .

f is primitive-recursive, since h is primitive-recursive. But f is chosen in such a way that it cannot be of the form λn.h(e, n): this is violated at input e, since

f(e) = h(e, e) + 1 ≠ h(e, e) = (λn.h(e, n))(e) .

But since f is primitive-recursive, there exists a code e for f, and therefore f = λn.h(e, n) for some e, which cannot be by the above, so we get a contradiction. This completes the proof.
A short version of the proof reads as follows: Assume all computable functions were primitive-recursive. Define h as above. h is computable, therefore primitive-recursive. Define f as above, and let e be a code for f. Then we have

f(n) = h(e, n)

for all n. But then

h(e, e) = f(e) = h(e, e) + 1 ,

a contradiction.

Remark 2: The above proof can be used in other contexts as well. It shows that there is no programming language which computes all computable functions and in which all definable functions are total. If we had such a language, then we could define codes e for programs in this language and therefore we could define h : N^2 → N,

h(e, n) := f(n), if e is a program in this language for a unary function f; and h(e, n) := 0 otherwise.

If f is a unary function computable in this language, then f = λn.h(e, n) for the code e of the program for f. Now define as above f(n) := h(n, n) + 1. h is computable, therefore so is f, therefore f is computable in this language. Let f = λn.h(e, n). Then we obtain

h(e, e) + 1 = f(e) = (λn.h(e, n))(e) = h(e, e) ,

a contradiction.

Remark 3: The above proof also shows, for instance, that if we extend the primitive-recursive functions by adding the Ackermann function as a basic function, or any other computable functions, we still won't obtain all computable functions – the above argument can be used for such sets of functions in just the same way as for the set of primitive-recursive functions.

5.4 The Partial Recursive Functions

The primitive-recursive functions allow us to define functions which

• compute the result of running n steps of a URM or TM,

• check whether a URM or TM has stopped,

• obtain, depending on n arguments, the initial configuration for computing P^(n) for a URM P or M^(n) for a TM M,

• extract from a configuration of P or M, in which this machine has stopped, the result obtained by P^(n) or M^(n).

We will show the existence of such functions for TMs in Subsection 5.6. However, TMs and URMs do not allow us to compute an n s.t. after n steps the URM or TM has stopped. In order to obtain such an n, we will extend the set of primitive-recursive functions by demanding closure under the µ-operator. The resulting set will be called the set of partial recursive functions.
Using the µ-operator, we will be able to find the first n s.t. a TM or URM stops. Together with the above we will then be able to show that all TM- and URM-computable functions are partial recursive.
Note that µ(f) might be partial even if f is total, therefore we will necessarily obtain a set of partial functions rather than a set of total functions; hence the name partial recursive functions. The recursive functions will be the partial recursive functions which are total.

Definition 5.11 We define inductively the set of partial recursive functions f together with their arity, i.e. together with the k s.t. f : N^k ∼→ N. We write "f : N^k ∼→ N is partial recursive" for "f is partial recursive with arity k", and N for N^1.

• The following basic functions are partial recursive:

  – zero : N ∼→ N,
  – succ : N ∼→ N,
  – proj^k_i : N^k ∼→ N (0 ≤ i < k).

• If g : N^k ∼→ N is partial recursive, and for i = 0, . . . , k − 1 we have that hi : N^n ∼→ N is partial recursive, then g ◦ (h0, . . . , hk−1) : N^n ∼→ N is partial recursive as well.
  As before we write g ◦ h instead of g ◦ (h), and g0 ◦ g1 ◦ g2 ◦ · · · ◦ gn for g0 ◦ (g1 ◦ (g2 ◦ · · · ◦ gn)).

• If g : N^n ∼→ N and h : N^(n+2) ∼→ N are partial recursive, then primrec(g, h) : N^(n+1) ∼→ N is partial recursive as well.

• If k ∈ N and h : N^2 ∼→ N is partial recursive, then primrec(k, h) : N ∼→ N is partial recursive as well.

• If k ∈ N and g : N^(k+2) ∼→ N is partial recursive, then µ(g) : N^(k+1) ∼→ N is partial recursive as well.

  (Remember that f := µ(g) has defining equation

  f(~x) ' min{k ∈ N | g(~x, k) ' 0 ∧ ∀l < k.g(~x, l) ↓}, if such a k exists, and f(~x) is undefined otherwise.

  A small code sketch of this operator follows below.)

Definition 5.12 (a) A recursive function is a partial recursive function which is total.

(b) A recursive relation is a relation R ⊆ N^n s.t. χ_R is recursive.

Example: One can show that the Ackermann function is recursive. Note that it is not primitive-recursive.


5.5 Closure Properties of the Recursive Functions

• Every primitive-recursive function (relation) is recursive.

• The recursive functions and relations have the same closure properties as those discussed forthe primitive-recursive functions and relations in Sect. 5.2, using essentially the same proofs.

• For a predicate U ⊆ N^(n+1) let

  µz.U(~n, z) := min{z | U(~n, z)}, if such a z exists, and undefined otherwise.

  Then, if U is recursive, the function

  f(~n) :' µz.U(~n, z)

  is partial recursive (and recursive if it is total).

  Proof: f(~n) ' µz.(χ_{N^(n+1)\U}(~n, z) ' 0), i.e. f = µ(χ_{N^(n+1)\U}), which is partial recursive since the characteristic function of the complement of U is recursive and is 0 exactly when U holds.

5.6 Equivalence of URM-Computable, TM-Computable and Partial Recursive Functions

Lemma 5.13 All partial recursive functions are URM computable.

Proof: By Lemma 3.3.

Turing-computable functions are partial recursive.

We are going to show that TM-computable functions are partial recursive. We will show an even stronger result: we will encode Turing machines M as natural numbers code(M) and show that for every n ∈ N there exists a partial recursive function fn : N^(n+1) ∼→ N s.t. for every TM M we have

∀~m ∈ N^n.fn(code(M), ~m) ' M^(n)(~m) .

These functions fn will be called universal partial recursive functions. The functions fn are essentially interpreters for Turing machines – fn evaluates (simulates) a Turing machine (given by its code e) applied to arguments ~m.
We will later write {e}n(~m) for fn(e, ~m). The brackets in {e}n are called "Kleene brackets", since Kleene introduced this notation.

Encoding of Turing Machines as Natural Numbers.

We will encode TMs using the following steps:

• In general we sometimes use Gödel brackets in order to denote the code of some object, e.g. one writes ⌈x⌉ for the code of x, ⌈y⌉ for the code of y.

• We assume some encoding ⌈x⌉ of the symbols x of the alphabet Σ as natural numbers, s.t.

  – ⌈0⌉ = 0,
  – ⌈1⌉ = 1,
  – ⌈xy⌉ = 2.

• We assume that each state q is encoded as a natural number ⌈q⌉, where ⌈q0⌉ = 0 for the initial state q0 of the TM.

• We encode the directions in the instructions by

  – ⌈L⌉ = 0,
  – ⌈R⌉ = 1.

• We assume that the alphabet consists of {0, 1, xy} and those symbols mentioned in the instructions. Any other symbols of the TM will never occur during a run of the TM, therefore omitting those symbols doesn't change the behaviour of the TM. Therefore we don't need to write down the alphabet explicitly.

• Similarly, we can assume that the set of states consists of q0 and all states mentioned in the instructions, therefore we don't need to write down the set of states explicitly. Again, omitting any other states won't change the behaviour of the TM.

• Since we assumed that q0 is always the initial state with code 0, and that xy has code 2, we don't need to mention those symbols. Therefore a TM is given by its set of instructions.

• An instruction I = (q, a, q′, a′, D), where q, q′ are states, a, a′ are symbols and D ∈ {L, R}, will be encoded as

  code(I) = π^5(⌈q⌉, ⌈a⌉, ⌈q′⌉, ⌈a′⌉, ⌈D⌉) .

• A set of instructions {I0, . . . , Ik−1} will be encoded as 〈code(I0), . . . , code(Ik−1)〉. This number is also the encoding code(M) of the corresponding TM.

Encoding of the Configuration of a TM as a Natural Number.

The configuration of a TM can be given by:

• The code ⌈q⌉ of its state q.

• A segment (i.e. a consecutive sequence of cells) of the tape which includes the cell the head is pointing to and all cells which are not blank. (Note that during a run of a TM, only finitely many cells contain a symbol which is non-blank: initially this is the case, and in finitely many steps only finitely many cells are changed from blank to non-blank. Therefore there always exists a segment as just stated.) A segment a0, . . . , an−1 is encoded as code(a0, . . . , an−1) := 〈⌈a0⌉, . . . , ⌈an−1⌉〉.

• The position of the head on this tape. This can be given as a number i s.t. 0 ≤ i < n, if the segment represented is a0, . . . , an−1.

A configuration of a TM, given by a state q, a segment a0, . . . , an−1 and a head position i, will be encoded as π^3(⌈q⌉, code(a0, . . . , an−1), i).
Note that the segment represented is not uniquely determined, but that doesn't cause any problems. We only require that the functions below generate and use arbitrary codes for configurations.

Primitive-Recursive Functions Simulating the Steps of a Turing Machine

We will now introduce primitive-recursive functions, as outlined at the beginning of Section 5.4, which create the initial configuration of a TM, carry out one step of the TM, check whether the TM has stopped, and extract the result from its configuration.

• We define primitive-recursive functions symbol, state : N → N, which extract the symbol at the head and the state of the TM from the current configuration:

  symbol(a) = (π^3_1(a))_(π^3_2(a)) ,
  state(a) = π^3_0(a) .

  Here a = π^3(⌈q⌉, 〈⌈a0⌉, . . . , ⌈an−1⌉〉, i), so π^3_1(a) is the code of the tape segment, π^3_2(a) the head position, and therefore (π^3_1(a))_(π^3_2(a)) is the code of the symbol under the head.


• We define a primitive-recursive function lookup : N^3 → N s.t. if e is the code of a TM, q a state and a a symbol, then lookup(e, q, a) is the number π^3(⌈q′⌉, ⌈a′⌉, ⌈D⌉) for the first instruction of the TM e of the form (q, a, q′, a′, D), if it exists. If no such instruction exists, the result will be 0 (= π^3(0, 0, 0)).

  lookup is defined as follows:

  – First find, using bounded search, the index of the first instruction starting with ⌈q⌉, ⌈a⌉.
  – Then extract the corresponding values from this instruction.

  Formally the definition is as follows:

  – Define an auxiliary primitive-recursive function g : N^3 → N,

    g(e, q, a) = µi < lh(e).(π^5_0((e)i) = q ∧ π^5_1((e)i) = a) ,

    which finds the index of the first instruction starting with q, a.

  – Now define

    lookup(e, q, a) := π^3(π^5_2((e)_(g(e,q,a))), π^5_3((e)_(g(e,q,a))), π^5_4((e)_(g(e,q,a)))) .

• There exists a primitive-recursive relation hasinstruction ⊆ N^3 s.t. hasinstruction(e, ⌈q⌉, ⌈a⌉) holds iff the TM corresponding to e has an instruction (q, a, q′, a′, D) for some q′, a′, D.

  Using the function g from the previous item, it is defined by

  χ_hasinstruction(e, q, a) = sig(lh(e) −· g(e, q, a)) .

  If there is such an instruction, then g(e, q, a) < lh(e), therefore lh(e) −· g(e, q, a) > 0. Otherwise g(e, q, a) = lh(e) and lh(e) −· g(e, q, a) = 0.

• There exists a primitive-recursive function next : N^2 → N s.t., if e encodes a TM and c is a code for a configuration, then next(e, c) is a code for the configuration after one step of the TM is executed, or is equal to c if the TM has halted.

  Informal description of next:

  – Assume the configuration is c = π^3(q, 〈a0, . . . , an−1〉, i).
  – Check, using hasinstruction, whether the TM has stopped. If it has stopped, return c.
  – Otherwise, use lookup to obtain the codes of the next state q′, the symbol a′ to be written and the direction D.
  – Replace in 〈a0, . . . , an−1〉 the ith element by a′. Let the result be x.
  – It might be that the head will leave the current segment in this step. Then we have to extend the segment by one blank to the left or right:

    ∗ If the direction is D = ⌈L⌉ and i = 0, we have to extend the segment to the left by one blank. So the result of next is

      π^3(q′, 〈⌈xy⌉〉 ∗ x, 0) .

    ∗ If the direction is D ≠ ⌈L⌉ and i ≥ n − 1, we have to extend the segment to the right by one blank. Then the result of next is

      π^3(q′, x ∗ 〈⌈xy⌉〉, n) .

    ∗ Otherwise let i′ = i − 1 if D = ⌈L⌉, and i′ = i + 1 if D ≠ ⌈L⌉. Then the result of next is

      π^3(q′, x, i′) .

  The above can easily be formalised as a primitive-recursive function, using the previously defined functions, the list operations (especially substitution and concatenation of lists) and case distinction.

• There exists a primitive-recursive predicate terminate ⊆ N^2 s.t. terminate(e, c) holds iff the TM e in configuration c has stopped:

  terminate(e, c) :⇔ ¬hasinstruction(e, state(c), symbol(c)) .

• There exists a primitive-recursive function iterate : N^3 → N s.t. iterate(e, c, n) is the result of iterating the TM e with initial configuration c for n steps. The definition is

  iterate(e, c, 0) = c ,
  iterate(e, c, n + 1) = next(e, iterate(e, c, n)) .

• There exists a primitive-recursive function initn : N^n → N s.t. initn(~m) is the initial configuration for computing M^(n)(~m) for any TM M whose alphabet contains {0, 1, xy}:

  initn(m0, . . . , mn−1) = π^3(⌈q0⌉, bin(m0) ∗ 〈⌈xy⌉〉 ∗ bin(m1) ∗ 〈⌈xy⌉〉 ∗ · · · ∗ 〈⌈xy⌉〉 ∗ bin(mn−1), 0) .

• There exists a primitive-recursive function extract : N → N s.t. if c is a configuration in which the TM stops, then extract(c) is the natural number corresponding to the result returned by the TM.

  – Assume c = π^3(q, x, i).
  – We first need to find the end of the largest subsequence of x which starts at i and consists of zeros and ones only; this is

    j = µz < lh(x).(z ≥ i ∧ (x)z ≠ 0 ∧ (x)z ≠ 1) .

    Now extract(c) = bin⁻¹(substring(x, i, j)) .

• There exist primitive-recursive predicates Tn ⊆ N^(n+2) s.t., if e encodes a TM M, then Tn(e, m0, . . . , mn−1, k) holds iff M, started with the initial configuration corresponding to a computation of M^(n)(m0, . . . , mn−1), stops after π0(k) steps and has final configuration π1(k):

  Tn(e, ~m, k) :⇔ terminate(e, iterate(e, initn(~m), π0(k))) ∧ iterate(e, initn(~m), π0(k)) = π1(k) .

• There exists a primitive-recursive function U : N → N s.t., if M is a TM with code(M) = e and Tn(e, ~m, k) holds, then U(k) is the result of M^(n)(~m):

  U(k) = extract(π1(k)) .

• Now it follows that, if e encodes a TM M, then

  M^(n)(~m) ' U(µz.Tn(e, ~m, z)) .

  A sketch of this "run until termination, then extract" pattern in code follows below.
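Abstracting from the TM details, the shape of this last equation is: iterate a step function, detect termination with a test, find the number of steps by unbounded search, and then extract the result. A Python sketch of this pattern (hypothetical names; a toy step function stands in for next, and the pairing of step count and final configuration via π is dropped) is:

    # Kleene normal form pattern: run a step function until a termination test holds,
    # then extract the output. Only the number of steps is found by unbounded search.

    def run(step, done, extract, config):
        def T(k):                          # "machine has terminated after k steps"
            c = config
            for _ in range(k):
                c = step(c)
            return done(c)
        k = 0
        while not T(k):                    # mu k . T(k)  (may diverge)
            k += 1
        c = config
        for _ in range(k):                 # recompute the halting configuration
            c = step(c)
        return extract(c)                  # the role of U

    # Toy example: halve a number until it reaches 1, then report the step count.
    result = run(step=lambda c: (c[0] // 2, c[1] + 1),
                 done=lambda c: c[0] <= 1,
                 extract=lambda c: c[1],
                 config=(40, 0))
    print(result)   # 5, since 40 -> 20 -> 10 -> 5 -> 2 -> 1

The deliberate recomputation of the configuration mirrors the mathematical definition, where iterate(e, initn(~m), π0(k)) appears both inside Tn and (via π1(k)) inside U; efficiency is not the point here.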


So we have shown the following theorem:

Theorem 5.14 (Kleene’s Normal Form Theorem)

(a) There exist partial recursive functions fn : Nn+1 ∼→ N s.t., if e is the code of a TM M , then

fn(e, ~n) ' M (n)(~n) .

(b) There exist a primitive-recursive function U : N → N and primitive-recursive predicates Tn ⊆ Nn+2 s.t. for the functions fn introduced in (a) we have

fn(e, ~n) ' U(µz.Tn(e, ~n, z)) .

Definition 5.15 Let U, Tn be as in Kleene's Normal Form Theorem 5.14. Then we define for e ∈ N

the partial recursive function {e}n : Nn ∼→ N by

{e}n(~m) ' U(µy.Tn(e, ~m, y)) .

{e}n(~m) is pronounced Kleene-brackets-e in n arguments applied to ~m.

Corollary 5.16

(a) For every Turing-computable function f : Nn ∼→ N there exists an e ∈ N s.t.

f = {e}n .

(b) Especially, all Turing-computable functions are partial recursive.

Proof: (a) Immediate. (b) follows from (a), since each {e}n is partial recursive.

Theorem 5.17 The sets of URM-computable, of Turing-computable, and of partial recursive functions coincide.

Proof: By Theorem 4.3, every URM-computable function is TM-computable. By Corollary 5.16, every TM-computable function is partial recursive. By Lemma 5.13, all partial recursive functions are URM-computable.

Remark:

• By Kleene's Normal Form Theorem every Turing computable function g : Nn ∼→ N is of the form {e}n. Since the Turing computable functions are the partial recursive functions, we obtain the following:

The partial recursive functions g : Nn ∼→ N are exactly the functions {e}n.

This means:

– {e}n is partial recursive for every e ∈ N.

– For every partial recursive function g : Nn ∼→ N there exists an e s.t. g = {e}n.


• Therefore we can say that

fn : Nn+1 ∼→ N , fn(e, ~x) ' {e}n(~x)

forms a universal n-ary partial recursive function, since it encodes all n-ary partial recursive functions.

• So we can assign to each partial recursive function g a number, namely an e s.t. g = {e}n.

– Each number e denotes one partial recursive function {e}n.

– However, several numbers denote the same partial recursive function, i.e. there are e,e′ s.t. e 6= e′ but {e}n = {e′}n.

(Proof: There are several algorithms for computing the same function. Therefore there are several Turing machines which compute the same function. These Turing machines have different codes e.)

Lemma 5.18 The set F of partial recursive functions (and therefore as well of Turing-computable and of URM-computable functions), i.e.

F := {f : Nk ∼→ N | k ∈ N ∧ f partial recursive}

is countable.

Proof: Every partial recursive function is of the form {e}n for some e, n ∈ N. Therefore

f : N2 → F , f(e, n) := {e}n

is surjective, and therefore so is

g : N → F , g(e) := f(π0(e), π1(e)) .

Therefore F is countable.
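The proof uses a coding of N2 into N via π0, π1, which are defined earlier in these notes. Purely as an illustration, one standard choice (which may differ in detail from the one used in the notes) is the Cantor pairing function, sketched here in Java:

// Cantor pairing: one standard bijection N × N → N, given here only as an
// illustration of the kind of coding used for π, π0, π1 in the proof above.
static long pair(long x, long y){            // code the pair (x, y)
    long s = x + y;
    return s * (s + 1) / 2 + y;
}
static long pi0(long z){                     // first component of the pair coded by z
    long w = (long) ((Math.sqrt(8.0 * z + 1.0) - 1.0) / 2.0);
    long t = w * (w + 1) / 2;
    return w - (z - t);
}
static long pi1(long z){                     // second component of the pair coded by z
    long w = (long) ((Math.sqrt(8.0 * z + 1.0) - 1.0) / 2.0);
    long t = w * (w + 1) / 2;
    return z - t;
}

For example, pair(2, 3) = 18, pi0(18) = 2 and pi1(18) = 3; with such a coding, g(e) := f(pi0(e), pi1(e)) runs through all pairs (e, n), and therefore through all functions {e}n.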

6 The Church-Turing Thesis

We have introduced three models of computation:

• The URM-computable functions.

• The Turing-computable functions.

• The partial recursive functions.

Further we have shown that all three models compute the same partial functions.
Lots of other models of computation have been studied:

• The while programs.

• Symbol manipulation systems by Post and by Markov.

• Equational calculi by Kleene and by Gödel.

• The λ-definable functions.

• Any of the programming languages Pascal, C, C++, Java, Prolog, Haskell, ML (and many more).

• Lots of other models of computation.


One can show that the partial functions computable in these models of computation are again exactly the partial recursive functions. So all these attempts to define a complete model of computation result in the same set of partial recursive functions. Because of this, one states the

Church-Turing Thesis: The (in an intuitive sense) computable partial functions are exactly the partial recursive functions (or equivalently the URM-computable or Turing-computable functions).

Note that this thesis is not a mathematical theorem, but a philosophical thesis. Therefore the Church-Turing thesis cannot be proven, but we can only provide philosophical evidence for it. This evidence comes from the following considerations and empirical facts:

• All complete models of computation suggested by researchers (including those mentioned above) define the same set of partial functions.

• Many of these models were carefully designed in order to capture intuitive notions of computability:

– The Turing machine model captures the intuitive notion of computation on a piece of paper in a general sense.

– The URM machine model captures the general notion of computability by a computer.

– Symbolic manipulation systems capture the general notion of computability by manipulation of symbolic strings.

• No intuitively computable partial function which is not partial recursive has been found, despite many researchers having looked for one.

• Using metamathematical investigations, a strong intuition has been developed that in principle programs in any programming language can be simulated by Turing machines and URMs, and that therefore partial functions computable in such a language are partial recursive.

Because of this, only very few researchers seriously doubt the correctness of the Church-Turing thesis.

Remark: Because of the equivalence of the models of computation it follows that the halting problem for any of the above-mentioned models of computation is undecidable. Especially it is undecidable whether a program in one of the programming languages mentioned terminates. If we had a decision procedure which decides whether or not, say, a Pascal program terminates for a given input, then we could, using a translation of URMs into Pascal programs, decide the halting problem for URMs, which is impossible.

7 Kleene’s Recursion Theorem

The main theorem in this section is Kleene's Recursion Theorem, which expresses that the partial recursive functions are closed under a very general form of recursion. In order to prove this we use the S-m-n theorem, which is an important lemma that is used in many proofs in computability theory.


7.1 The S-m-n Theorem

If f : Nn+m ∼→ N is partial recursive, and we fix the first m arguments (say l0, . . . , lm−1), we obtain a function of n arguments

g : Nn ∼→ N , g(x0, . . . , xn−1) = f(l0, . . . , lm−1, x0, . . . , xn−1) ,

which is again partial recursive. The following theorem says that we can compute a Kleene index of g (i.e. an e′ s.t. g = {e′}n) from a Kleene index of f and l0, . . . , lm−1 primitive-recursively: there exists a primitive-recursive function Smn s.t., if f = {e}m+n, then g = {Smn (e, l0, . . . , lm−1)}n:

Theorem 7.1 (S-m-n Theorem). For all m, n ∈ N there exists a primitive-recursive function

Smn : Nm+1 → N

s.t. for all l0, . . . , lm−1, x0, . . . , xn−1 ∈ N

{Smn (e, l0, . . . , lm−1)}n(x0, . . . , xn−1) ' {e}m+n(l0, . . . , lm−1, x0, . . . , xn−1) .

Proof: We write ~x for x0, . . . , xn−1 and ~l for l0, . . . , lm−1. Let M be a Turing machine with code e. The Turing machine M ′ corresponding to Smn (e,~l) should be such that

M ′n(~x) ' Mn+m(~l, ~x) .

Such a Turing machine can be defined as follows:

1. The initial configuration is that x0, . . . , xn−1 are written on the tape and the head is pointing to the leftmost bit:

· · · xy xy bin(x0) xy · · · xy bin(xn−1) xy xy · · ·↑

2. M ′ first writes the binary representations of l0, . . . , lm−1, separated by blanks, in front of this, and terminates this step with the head pointing to the most significant bit of bin(l0). So we obtain after this first step the following configuration:

· · · xy xy bin(l0) xy · · · xy bin(lm−1) xy bin(x0) xy · · · xy bin(xn−1) xy xy · · ·↑

3. Then M ′ will run M , starting in this configuration, and will terminate, if M terminates. The result will be

' Mm+n(~l, ~x) ,

and in total we therefore get
M ′n(~x) ' Mm+n(~l, ~x)

as desired.

A code for M ′ can be obtained from a code for M and from ~l as follows:

• One takes a Turing machine M ′′, which writes the binary representations of l0, . . . , lm−1 in front of its initial position (separated by a blank and with a blank at the end), and terminates at the leftmost bit. It is a straightforward exercise to write a code for the instructions of such a Turing machine, depending on ~l, and to show that the function defining it is primitive-recursive. Assume that the terminating state of M ′′ has Gödel number (i.e. code) s, and that all other states have Gödel numbers < s.


• Then one appends to the instructions of M ′′ the instructions of M , but with the states shifted, so that the new initial state of M is the final state s of M ′′ (i.e. we add s to all the Gödel numbers of states occurring in M). This, too, can be done primitive-recursively.

So a code for M ′ can be defined primitive-recursively depending on a code e for M and ~l, and Smn is the primitive-recursive function computing this. With this function it follows now that, if e is a code for a TM, then

{Smn (e,~l)}n(~x) ' {e}n+m(~l, ~x) .

This equation holds even if e is not a code for a TM: in this case {e}m+n interprets e as if it were the code for a valid TM M (a code for such a valid TM is obtained by

• deleting any instruction code(q, a, q′, a′, D) in e s.t. there exists an instruction code(q, a, q′′, a′′, D′) occurring before it in the sequence e,

• and by replacing all directions > 1 by dRe = 1.)

e′ := Smn (e,~l) will have the same deficiencies as e, but when applying the Kleene-brackets, it will be

interpreted as a TM M ′ obtained from e′ in the same way as we obtained M from e, and therefore

{e′}n(~x) ' M ′n(~x) ' Mn+m(~l, ~x) ' {e}n+m(~l, ~x) .

So we obtain the desired result in this case as well.
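The content of the S-m-n theorem is that the specialised program is obtained from the original one by a simple, purely syntactic transformation, without ever running it. A toy Java illustration of this idea (not the construction on Turing machine codes itself): given the name of a two-argument Java method and a fixed first argument l, we can compute the source text of a one-argument method g with g(x) = f(l, x) by string manipulation alone. The method name fName is of course hypothetical.

// Toy analogue of S^1_1: produce (the text of) a specialised program from
// (the name of) a program and a fixed first argument, by pure text manipulation.
static String s11(String fName, int l){
    return "public static int g(int x){ return " + fName + "(" + l + ", x); }";
}

For example, s11("add", 5) returns the text of a method g with g(x) = add(5, x); note that add is never executed, just as Smn never runs the Turing machine M.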

Notation: In the following we will usually omit the superscript n in {e}n(m0, . . . , mn−1), i.e. we will write {e}(m0, . . . , mn−1) instead of {e}n(m0, . . . , mn−1). Further, {e} not applied to arguments and without superscript usually means {e}1.

7.2 Kleene’s Recursion Theorem

Theorem 7.2 (Kleene's Recursion Theorem). Let f : Nn+1 ∼→ N be a partial recursive function. Then there exists an e ∈ N s.t.

{e}n(x0, . . . , xn−1) ' f(e, x0, . . . , xn−1) .

Before proving this, we give some examples of the use of this theorem.
Examples:

1. There exists an e s.t.
{e}(x) ' e + 1 .

For showing this take in the Recursion Theorem f(e, n) := e + 1. Then we get:

{e}(x) ' f(e, x) ' e + 1 .

Note that it would be rather difficult to find such an e directly (try it yourself). Such an e would be a code for a Turing machine which, independently of its input, writes its own code on the tape.

Remark: Such applications of the Recursion Theorem are usually not very useful. Usually, when using it, one doesn't use the index e directly, but only the application of {e} to some arguments.


2. The function fib computing the Fibonacci numbers can be seen to be recursive by using Kleene's Recursion Theorem. (This is a weaker result than what we obtained above, where we showed that fib is even primitive-recursive. However, it is a nice example which demonstrates very well the use of the recursion theorem.)

Remember the defining equations for fib:

fib(0) = 1 ,

fib(1) = 1 ,

fib(n + 2) = fib(n) + fib(n + 1) .

From these equations we obtain

fib(n) = { 1,                        if n = 0 or n = 1,
           fib(n−· 2) + fib(n−· 1),  otherwise.

We show, using the recursion theorem, that there exists a recursive function g : N → N which fulfils the same equations, i.e. s.t.

g(n) ' { 1,                      if n = 0 or n = 1,
         g(n−· 2) + g(n−· 1),    otherwise.

This can be shown as follows:

Define a partial recursive f : N2 ∼→ N s.t.

f(e, n) ' { 1,                          if n = 0 or n = 1,
            {e}(n−· 2) + {e}(n−· 1),    otherwise.

Now let e be s.t.
{e}(n) ' f(e, n) .

Then e fulfills the equations

{e}(n) ' { 1,                          if n = 0 or n = 1,
           {e}(n−· 2) + {e}(n−· 1),    otherwise.

Let g = {e}. Then g fulfills the equations as desired:

g(n) ' { 1,                      if n = 0 or n = 1,
         g(n−· 2) + g(n−· 1),    otherwise.

It is not a priori clear that g(x) = fib(x), since there might be several ways of solving the same recursion equation.

(For instance, the recursion equation f(x) ' f(x) can be solved by any definition of f , e.g. by f(x) = 0, by f(x) = 1, and by f(x) ↑.)
Therefore we show by induction on n that ∀n ∈ N.g(n) ' fib(n), and that therefore fib is recursive:

– In the base cases n = 0, 1 this is clear.

– In the induction step n → n + 1 with n ≥ 1 we have

g(n + 1) ' g(n−· 1) + g(n) ' fib(n−· 1) + fib(n) = fib(n + 1) , using the IH in the second step.

In a similar way one can introduce arbitrary partial recursive functions g, where g(~n) refers to arbitrary other values g(~m).

Such definitions correspond to the recursive definition of functions in many programming languages. For instance, in Java one defines the Fibonacci numbers recursively as follows:


public static int fib(int n){
    if (n == 0 || n == 1){
        return 1;
    } else {
        return fib(n-1) + fib(n-2);
    }
}

When programming functions recursively, we obtain functions which might terminate or might not terminate. Similarly, when defining functions using the recursion theorem, we obtain partial recursive functions, which might not always be defined, as in the following two examples:

3. There exists a partial recursive function g : N∼→ N s.t.

g(x) ' g(x) + 1 .

(Take in Kleene’s Recursion Theorem

f(e, x) ' {e}(x) + 1 ,

so we obtain an e s.t.
{e}(x) ' f(e, x) ' {e}(x) + 1 .

Then let g = {e}.)
It follows that

g(x) ↑ .

For, if g(x) were defined, say g(x) = n, we would get n = n + 1.

Note that, if g(x) is undefined, g(x) + 1 is undefined as well, so

g(x) ' g(x) + 1

holds in this case.

The definition of g corresponds to the following definition in Java:

public static int g(int n){
    return g(n) + 1;
}

When executing g(x), Java loops (in practice it aborts with a stack overflow).

(Note that a recursion equation for a function f cannot always be solved by setting f(x) ↑. E.g. the recursion equation for fib above can't be solved by setting fib(n) ↑.)

4. There exists a partial recursive function g : N∼→ N s.t.

g(x) ' g(x + 1) + 1 .

Again g(x) is undefined, for if g(x) were defined, it would have to be infinitely large.

A corresponding Java function will again loop for ever.


5. An interesting example, where the Recursion Theorem is of great help in order to show that a function is partial recursive, is the Ackermann function Ack.

Remember that Ack has the following defining equations:

Ack(0, y) = y + 1 ,

Ack(x + 1, 0) = Ack(x, 1) ,

Ack(x + 1, y + 1) = Ack(x, Ack(x + 1, y)) .

So we have

Ack(x, y) = { y + 1,                       if x = 0,
              Ack(x−· 1, 1),               if x > 0 and y = 0,
              Ack(x−· 1, Ack(x, y−· 1)),   otherwise.

In order to show that Ack is recursive, we use Kleene's Recursion Theorem in order to introduce a partial recursive function g which fulfils the same equations, i.e.

g(x, y) ' { y + 1,                     if x = 0,
            g(x−· 1, 1),               if x > 0 ∧ y = 0,
            g(x−· 1, g(x, y−· 1)),     if x > 0 ∧ y > 0.

(In detail this is shown as follows:

There exists an e s.t.

{e}(x, y) ' { y + 1,                         if x = 0,
              {e}(x−· 1, 1),                 if x > 0 ∧ y = 0,
              {e}(x−· 1, {e}(x, y−· 1)),     if x > 0 ∧ y > 0.

Let g := {e}2. Then g fulfils those equations.)

We show by induction on x that g(x, y) is defined and equal to Ack(x, y) for all x, y ∈ N:

– Base case x = 0:
g(0, y) = y + 1 = Ack(0, y) .

– Induction Step x → x + 1. Assume

g(x, y) = Ack(x, y) .

We show
g(x + 1, y) = Ack(x + 1, y)

by side-induction on y:

∗ Base case y = 0:

g(x + 1, 0) ' g(x, 1) = Ack(x, 1) = Ack(x + 1, 0) , using the main IH in the second step.

∗ Induction Step y → y + 1:

g(x + 1, y + 1) ' g(x, g(x + 1, y)) ' g(x, Ack(x + 1, y)) ' Ack(x, Ack(x + 1, y)) = Ack(x + 1, y + 1) , using the side IH in the second step and the main IH in the third step.
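As with fib in example 2, the recursion equations for Ack can also be transcribed directly into Java; the following sketch mirrors the three defining equations (the int result overflows already for small arguments, so it is only an illustration):

public static int ack(int x, int y){
    if (x == 0){
        return y + 1;                      // Ack(0, y) = y + 1
    } else if (y == 0){
        return ack(x - 1, 1);              // Ack(x + 1, 0) = Ack(x, 1)
    } else {
        return ack(x - 1, ack(x, y - 1));  // Ack(x + 1, y + 1) = Ack(x, Ack(x + 1, y))
    }
}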


Proof of Kleene's Recursion Theorem:
We write ~x for x0, . . . , xn−1. Assume

f : Nn+1 ∼→ N .

We have to find an e s.t.
∀~x ∈ N.{e}n(~x) ' f(e, ~x) .

The idea is to define e = S1n(e1, e2) for some (yet unknown) e1, e2. Then we get

{e}n(~x) ' {S1n(e1, e2)}n(~x) ' {e1}n+1(e2, ~x) .

So, in order to fulfil our original equation, we need to find e1, e2 s.t.

∀~x. {e1}n+1(e2, ~x) ' f(S1n(e1, e2), ~x) ,

where the left hand side is ' {e}n(~x) and the right hand side is ' f(e, ~x).

Now let e1 be an index for the partial function

(y, ~x) 7→ f(S1n(y, y), ~x) ,

i.e. let e1 be s.t.
{e1}n+1(y, ~x) ' f(S1n(y, y), ~x) .

Using this equation, the above equation becomes

f(S1n(e2, e2), ~x) ' f(S1n(e1, e2), ~x) ,

since {e1}n+1(e2, ~x) ' f(S1n(e2, e2), ~x).

We can fulfill this equation by defining
e2 := e1 .

So, an index solving the problem is

e = S1n(e1, e2) = S1n(e1, e1) ,

and we are done.

In short, the complete proof reads as follows:
Let e1 be s.t.

{e1}n+1(y, ~x) ' f(S1n(y, y), ~x) .

Let e := S1n(e1, e1). Then we have

{e}n(~x) ' {S1n(e1, e1)}n(~x)      (since e = S1n(e1, e1))
         ' {e1}n+1(e1, ~x)         (S-m-n theorem)
         ' f(S1n(e1, e1), ~x)      (definition of e1)
         ' f(e, ~x)                (since e = S1n(e1, e1)) .


8 Recursively Enumerable Predicates

8.1 Introduction

In this section we study predicates P ⊆ Nn which are not decidable, but “half decidable”. The official name is semi-decidable, or recursively enumerable. (The latter name will be explained later.)
Remember that a predicate P is recursive (or, using the Church-Turing thesis, computable) if its characteristic function χP is recursive, so we have a “full” decision procedure:

P (x0, . . . , xn−1) ⇔ χP (x0, . . . , xn−1) = 1, i.e. answer yes ,

¬P (x0, . . . , xn−1) ⇔ χP (x0, . . . , xn−1) = 0, i.e. answer no .

A predicate P will be semi-decidable, if there exists a partial recursive function f s.t.

P (x0, . . . , xn−1) ⇔ f(x0, . . . , xn−1) ↓ .

Therefore

• If P (x0, . . . , xn−1) holds, we will eventually know it – the algorithm for computing f will finally terminate, and then we know that P (x0, . . . , xn−1) holds.

• If P (x0, . . . , xn−1) doesn't hold, then the algorithm computing f will loop for ever, and we never get an answer.

So we have:

P (x0, . . . , xn−1) ⇔ f(x0, . . . , xn−1) ↓ i.e. answer yes ,

¬P (x0, . . . , xn−1) ⇔ f(x0, . . . , xn−1) ↑ i.e. no answer returned by f .
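Operationally, such an f is typically realised by an unbounded search. Assuming, hypothetically, that P can be written as P(x) ⇔ ∃y.R(x, y) for a decidable R (Theorem 8.5 below shows that every r.e. predicate has this form), a semi-decision procedure looks as follows in Java:

// Hypothetical decidable test R with P(x) ⇔ ∃y.R(x, y).
static boolean R(long x, long y){ throw new UnsupportedOperationException(); }

// Prints "yes" and stops if P(x) holds; runs forever without any output otherwise.
static void semiDecide(long x){
    for (long y = 0; ; y++){          // unbounded search for a witness y
        if (R(x, y)){
            System.out.println("yes");
            return;
        }
    }
}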

It seems at first sight that recursively enumerable sets are not interesting from a computational point of view. But it turns out that such sets occur in many practical applications. A typical example is a set A of the form

A(x) ⇔ there exists a proof in a certain formal system of a property ϕ(x) .

For instance, ϕ(x) might express the correctness of an algorithm, e.g. that the result of a digital circuit with x-bit inputs for multiplying two binary numbers is actually the product of its inputs. Under certain conditions (“soundness and completeness”; see other modules) we have:

• If we find a proof for ϕ(x), then ϕ(x) holds.

• If we don’t find a proof, then ϕ(x) doesn’t hold.

There are many formal systems such that the corresponding function f s.t.

A(x) ⇔ f(x) ↓

terminates very fast in most practical cases, despite the fact that for theoretical reasons it is known that it can take arbitrarily long before this happens.
If one runs f(x) under these circumstances, one usually assumes that, if one hasn't obtained an answer after a certain amount of time, A(x) is probably false, and then uses other means in order to determine the reason why A(x) is false.
If A(x) holds, one expects the algorithm for f(x) to terminate after a certain amount of time. If this algorithm actually terminates within the expected amount of time, one knows that ϕ(x) is true.


8.2 Recursively Enumerable Predicates and their Canonical Numbering

Definition 8.1 Assume f : Nn ∼→ N is a partial function.

(a) The domain of f , in short dom(f) is defined as follows:

dom(f) := {(x0, . . . , xn−1) ∈ Nn | f(x0, . . . , xn−1) ↓} .

(b) The range of f , in short ran(f) is defined as follows:

ran(f) := {y ∈ N | ∃x0, . . . , xn−1.(f(x0, . . . , xn−1) ' y)} .

(c) The graph of f is the set Gf defined as

Gf := {(~x, y) ∈ Nn+1 | f(~x) ' y} .

Remark:

• The notion “graph” used here has nothing to do with the notion of “graph” in graph theory.

• The graph of a function is essentially the graph we draw when visualising f . Take as an example

f : N ∼→ N , f(x) = { x/2,        if x is even,
                      undefined,  if x is odd.

Then we can draw f as follows:

[Figure: plot of f for x = 0, . . . , 7, with crosses at the defined values.]

In this example we have

Gf = {(0, 0), (2, 1), (4, 2), (6, 3), . . .}

These are exactly the coordinates of the crosses in the picture:

[Figure: the same plot, with the crosses labelled (0, 0), (2, 1), (4, 2), (6, 3).]


Definition 8.2 • A predicate A ⊆ Nn is recursively enumerable, in short r.e., if there exists a partial recursive function f : Nn ∼→ N s.t.

A = dom(f) .

• Recursively enumerable predicates are sometimes also called

– semi-decidable or

– semi-computable or

– partially computable.

Lemma 8.3 (a) Every recursive predicate is r.e.

(b) The halting problem, defined as

Haltn(e, ~x) :⇔ {e}n(~x) ↓ ,

is r.e., but not recursive.

Proof:
(a) Assume A ⊆ Nk is decidable. Then Nk \ A is recursive, therefore its characteristic function χNk\A is recursive as well. Define

f : Nk ∼→ N, f(~x) :' (µy.χNk\A(~x) ' 0) .

Note that y doesn’t occur in the body of the µ-expression. Then we have

• If A(~x), then
χNk\A(~x) ' 0 ,
so
f(~x) ' (µy.χNk\A(~x) ' 0) ' 0 ,
especially
f(~x) ↓ .

• If (Nk \ A)(~x), then
χNk\A(~x) ' 1 ,
so there exists no y s.t.
χNk\A(~x) ' 0 ,
therefore
f(~x) ' (µy.χNk\A(~x) ' 0) is undefined ,
especially
f(~x) ↑ .

So we get
A(~x) ⇔ f(~x) ↓ ⇔ ~x ∈ dom(f) ,

A = dom(f) is r.e. .

(b) We have
Haltn(e, ~x) :⇔ fn(e, ~x) ↓ ,

where fn is partial recursive as in Sect. 5 s.t.

{e}n(~x) ' fn(e, ~x) .

So
Haltn = dom(fn) is r.e.

We have seen above that Haltn is non-computable, i.e. not recursive.


Theorem 8.4 (The sets Wne .)

There exist r.e. predicates Wn ⊆ Nn+1 s.t., if we define

Wne := {(x0, . . . , xn−1) ∈ Nn | Wn(e, x0, . . . , xn−1)} ,

then we have the following:

• Each of the predicates Wne ⊆ Nn is r.e.

• For each r.e. predicate P ⊆ Nn there exists an e ∈ N s.t. P = Wne , i.e.

∀~x ∈ N.P (~x) ⇔ Wne (~x) .

In other words, the r.e. sets P ⊆ Nn are exactly the sets Wne , where e ranges over all natural numbers.

Remark:

• Wn is therefore a universal recursively enumerable predicate, since it encodes all recursively enumerable predicates A ⊆ Nn.

• The theorem expresses that we can assign to every recursively enumerable predicate A a natural number, namely an e s.t. A = Wne .

– Each number denotes one predicate.

– But several numbers denote the same predicate, i.e. there are e, e′ s.t. e 6= e′ but Wne = Wne′ .

(This is since there are e, e′ s.t. e 6= e′ but {e}n = {e′}n).

Proof idea for Theorem 8.4: If A is r.e., then A = dom(f) for some partial recursive f . Let f = {e}n. Then A = Wne .

Proof of Theorem 8.4: Let fn be s.t.

∀e ∈ N.∀~x.fn(e, ~x) ' {e}n(~x) .

Define
Wn := dom(fn) .

Wn is r.e. We have

~x ∈ Wne ⇔ (e, ~x) ∈ Wn ⇔ fn(e, ~x) ↓ ⇔ {e}n(~x) ↓ ⇔ ~x ∈ dom({e}n) .

Therefore
Wne = dom({e}n) .

Wn is r.e., since fn is partial recursive.
Furthermore, we have for any set A ⊆ Nn

A is r.e. iff A = dom(f) for some partial recursive f

iff A = dom({e}n) for some e ∈ N

iff A = Wne for some e ∈ N.

This shows the assertion.


8.3 Characterisations of the Partial Recursive Functions

Theorem 8.5 (Characterisation of the recursively enumerable predicates). Let A ⊆ Nn. The following are equivalent:

(i) A is r.e.

(ii) A = {~x | ∃y.R(~x, y)}

for some primitive-recursive predicate R.

(iii) A = {~x | ∃y.R(~x, y)}

for some recursive predicate R.

(iv) A = {~x | ∃y.R(~x, y)}

for some recursively enumerable predicate R.

(v) A = ∅ or A = {(f0(x), . . . , fn−1(x)) | x ∈ N}

for some primitive-recursive functions

fi : N → N .

(vi) A = ∅ or A = {(f0(x), . . . , fn−1(x)) | x ∈ N}

for some recursive functions fi : N → N .

Remark:

(a) We can summarise Theorem 8.5 by saying that there are essentially three equivalent ways of defining that A ⊆ Nn is r.e.:

• A = dom(f) for some partial recursive f ;

• A = ∅ or A is the image of primitive recursive/recursive functions f0, . . . , fn−1;

• A = {~x | ∃y.R(~x, y)} for some primitive-recursive/recursive/r.e. R.

(b) Consider the special case n = 1. Taking (i), (v), (vi) we get the following equivalence:

• Let A ⊆ N. Then the following are equivalent:

– A is r.e.

– A = ∅ or A = ran(f) for some primitive-recursive f : N → N.

(This is (v)).

– A = ∅ or A = ran(f) for some recursive f : N → N.

(This is (vi))

This means that A ⊆ N is r.e. iff A = ∅ or there exists a (primitive-)recursive function f which enumerates all its elements. This explains the name “recursively enumerable predicate”.
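As an illustration, the following Java fragment enumerates A = ran(f) for a total Java method f (here, purely as a hypothetical example, f(x) = x*x, so A is the set of squares). Every element of A is printed eventually, possibly with repetitions, and nothing outside A is ever printed:

static long f(long x){ return x * x; }   // hypothetical recursive (total) function

public static void main(String[] args){
    for (long x = 0; ; x++){             // enumerate ran(f) = {f(0), f(1), f(2), . . .}
        System.out.println(f(x));
    }
}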


Proof of Theorem 8.5:
(i) → (ii):
Proof idea for (i) → (ii): If A is r.e., A = dom(f), then

A(~x) ⇔ f(~x) ↓ ⇔ ∃y.the Turing machine for computing f(~x) terminates after y steps

⇔ ∃y.R(~x, y)

where
R(~x, y) ⇔ the Turing machine for computing f(~x) terminates after y steps .

R is primitive-recursive.
Detailed proof of (i) → (ii): (The actual predicate R we will take is slightly different from that in the proof idea – it is technically easier to prove the theorem this way.)
If A is r.e., then for some partial recursive function f : Nn ∼→ N we have

A = dom(f) .

Let f = {e}n. By Kleene's Normal Form Theorem there exist a primitive-recursive function U : N → N and a primitive-recursive predicate Tn ⊆ Nn+2 s.t.

{e}n(~x) ' U(µy.Tn(e, ~x, y)) .

Therefore

A(~x) ⇔ ~x ∈ dom(f)

⇔ ~x ∈ dom({e}n)

⇔ U(µy.Tn(e, ~x, y)) ↓ ⇔ µy.Tn(e, ~x, y) ↓ (since U is primitive-recursive and therefore total)

⇔ ∃y.Tn(e, ~x, y)

⇔ ∃y.R(~x, y) .

where
R(~x, y) ⇔ Tn(e, ~x, y) .

Now R is primitive-recursive, and

A = {~x | ∃y.R(~x, y)} .

(ii) → (iii): Trivial.
(iii) → (iv): By Lemma 8.3.
(iv) → (ii): Assume

A = {~x | ∃y.R(~x, y)} ,

where R is r.e. By “(i) → (ii)” there exists a primitive-recursive predicate S s.t.

R(~x, y) ⇔ ∃z.S(~x, y, z) .

Therefore

A = {~x | ∃y.∃z.S(~x, y, z)} = {~x | ∃y.S(~x, π0(y), π1(y))} = {~x | ∃y.R′(~x, y)} ,


where
R′(~x, y) :⇔ S(~x, π0(y), π1(y)) is primitive-recursive.

(ii) → (v):
Proof idea for (ii) → (v), special case n = 1: Assume

• A = {x ∈ N | ∃y.R(x, y)}, where R is primitive recursive,

• A 6= ∅,

• y ∈ A fixed.

Define f : N → N recursive,

f(x) = { π0(x),   if R(π0(x), π1(x)),
         y,        otherwise.

Then A = ran(f).
Detailed proof for (ii) → (v): Assume A is not empty and R is primitive-recursive s.t.

A = {~x | ∃y.R(~x, y)} .

Let ~y = y0, . . . , yn−1 be some fixed elements s.t. A(~y) holds. Define for i = 0, . . . , n − 1

fi(x) := { πn+1i(x),   if R(πn+10(x), πn+11(x), . . . , πn+1n−1(x), πn+1n(x)),
           yi,          otherwise.

fi are primitive-recursive. We show

A = {(f0(x), . . . , fn−1(x)) | x ∈ N} .

“⊇”: Assume x ∈ N, and show

A(f0(x), . . . , fn−1(x)) .

• If R(πn+10(x), πn+11(x), . . . , πn+1n−1(x), πn+1n(x)), then

∃z.R(πn+10(x), πn+11(x), . . . , πn+1n−1(x), z) ,

therefore
(πn+10(x), πn+11(x), . . . , πn+1n−1(x)) ∈ A ,

therefore
A(f0(x), . . . , fn−1(x)) .

• If (Nn+1 \ R)(πn+10(x), πn+11(x), . . . , πn+1n−1(x), πn+1n(x)), then

fi(x) = yi ,

therefore, by A(~y),
A(f0(x), . . . , fn−1(x)) .


So in both cases we get that
A(f0(x), . . . , fn−1(x)) ,

so
{(f0(x), . . . , fn−1(x)) | x ∈ N} ⊆ A .

“⊆”: Assume

A(x0, . . . , xn−1) ,

and show
∃z.(f0(z) = x0 ∧ · · · ∧ fn−1(z) = xn−1) .

We have for some y
R(x0, . . . , xn−1, y) .

Let
z = πn+1(x0, . . . , xn−1, y) .

Then we have
xi = πn+1i(z) , y = πn+1n(z) ,

therefore
R(πn+10(z), πn+11(z), . . . , πn+1n−1(z), πn+1n(z)) ,

therefore for i = 0, . . . , n − 1
fi(z) = πn+1i(z) = xi ,

therefore

(x0, . . . , xn−1) = (f0(z), . . . , fn−1(z)) ∈ {(f0(x), . . . , fn−1(x)) | x ∈ N} ,

and we have
A ⊆ {(f0(x), . . . , fn−1(x)) | x ∈ N} .

Therefore we have shown
A = {(f0(x), . . . , fn−1(x)) | x ∈ N} ,

and the assertion follows.
(v) → (vi): Trivial.
(vi) → (i):
Proof idea for (vi) → (i), special case n = 1: If A = ran(f), where f is recursive, then A = dom(g), where

g(x) ' (µy.f(y) ' x) .

g is partial recursive.
Detailed proof for (vi) → (i): If A is empty, then A is recursive, therefore r.e. Assume

A = {(f0(x), . . . , fn−1(x)) | x ∈ N} .

for some recursive functions fi. Define

f : Nn ∼→ N ,

s.t.
f(x0, . . . , xn−1) :' µx.(f0(x) ' x0 ∧ · · · ∧ fn−1(x) ' xn−1) .


f can be written as

f(x0, . . . , xn−1) :' µx.( ((f0(x)−· x0) + (x0−· f0(x)))
                           + ((f1(x)−· x1) + (x1−· f1(x)))
                           + · · ·
                           + ((fn−1(x)−· xn−1) + (xn−1−· fn−1(x))) ' 0 ) ,

therefore f is partial recursive.
Further we have

A(x0, . . . , xn−1) ⇔ ∃x ∈ N.x0 = f0(x) ∧ · · · ∧ xn−1 = fn−1(x)

⇔ f(x0, . . . , xn−1) ↓ ,

therefore
A = dom(f) is r.e.

Theorem 8.6 (Relationship between recursive and r.e. predicates). A ⊆ Nk is recursive iff both A and Nk \ A are r.e.

Proof ideas:
“⇒” is easy.
For “⇐”: Assume R, S primitive-recursive s.t.

A(~x) ⇔ ∃y.R(~x, y)

(Nk \ A)(~x) ⇔ ∃y.S(~x, y)

In order to decide A, search simultaneously for a y s.t. R(~x, y) holds and for a y s.t. S(~x, y) holds.
If we find a y s.t. R(~x, y) holds, then A(~x) holds.
If we find a y s.t. S(~x, y) holds, then ¬A(~x) holds.
Detailed Proof:
“⇒”: If A is recursive, then both A and Nk \ A are recursive, therefore as well r.e.
“⇐”: Assume A, Nk \ A are r.e. Then there exist primitive-recursive predicates R and S s.t.

A = {~x | ∃y.R(~x, y)} ,

Nk \ A = {~x | ∃y.S(~x, y)} .

By
A ∪ (Nk \ A) = Nk ,

it follows that
∀~x.((∃y.R(~x, y)) ∨ (∃y.S(~x, y))) ,

therefore as well
∀~x.∃y.(R(~x, y) ∨ S(~x, y)) . (∗)

Define
h : Nk → N , h(~x) := µy.(R(~x, y) ∨ S(~x, y)) .

h is partial recursive. By (∗), h is total, so h is recursive.
We show

A(~x) ⇔ R(~x, h(~x)) .

If
A(~x) ,


then
∃y.R(~x, y)

and
~x 6∈ (Nk \ A) ,

therefore
¬∃y.S(~x, y) .

Therefore we have for the y found by h(~x) that R(~x, y) holds, i.e.

R(~x, h(~x)) .

On the other hand, if R(~x, h(~x)) holds then

∃y.R(~x, y) ,

therefore
A(~x) .

Now we get
A = {~x | R(~x, h(~x))} is recursive.
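The proof idea has a direct operational reading. Assuming hypothetical decidable tests R and S as in the proof (written here for a single argument x for simplicity), the following Java method decides A; by (∗) the loop always terminates:

// Hypothetical decidable tests: A(x) ⇔ ∃y.R(x, y) and ¬A(x) ⇔ ∃y.S(x, y).
static boolean R(long x, long y){ throw new UnsupportedOperationException(); }
static boolean S(long x, long y){ throw new UnsupportedOperationException(); }

// Decides A by one unbounded search; this is h(x) together with the final test.
static boolean decideA(long x){
    for (long y = 0; ; y++){
        if (R(x, y)) return true;    // found a witness for A(x)
        if (S(x, y)) return false;   // found a witness for ¬A(x)
    }
}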

Theorem 8.7 (Characterisation of partial recursive functions). Let f : Nn ∼→ N. Then

f is partial recursive ⇔ Gf is r.e. .

Proof idea for “⇐”: Assume R primitive recursive s.t.

Gf (~x, y) ⇔ ∃z.R(~x, y, z) .

In order to compute f(~x), search for a y s.t. R(~x, π0(y), π1(y)) holds. f(~x) will be the first projection of this y.
Detailed Proof:
“⇒”: Assume f is partial recursive. Then f = {e}n for some e ∈ N. By Kleene's Normal Form Theorem we have

f(~x) ' U(µy.Tn(~x, y)) ,

for some primitive-recursive relation Tn ⊆ Nn+1 and some primitive-recursive function U : N → N.
Therefore

(~x, y) ∈ Gf ⇔ (f(~x) ' y) ⇔ ∃z.(Tn(~x, z) ∧ (∀z′ < z.¬Tn(~x, z′)) ∧ U(z) = y) ,

therefore Gf is r.e.
“⇐”: If Gf is r.e., then there exists a primitive-recursive predicate R s.t.

f(~x) ' y ⇔ (~x, y) ∈ Gf ⇔ ∃z.R(~x, y, z) .

Therefore for any z s.t. R(~x, π0(z), π1(z)) holds we have that

f(~x) ' π0(z) .

Therefore
f(~x) ' π0(µu.R(~x, π0(u), π1(u))) ,

f is partial recursive.
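The “⇐” direction also has a direct computational reading. Assuming a hypothetical decidable test R with f(~x) ' y ⇔ ∃z.R(~x, y, z) (written below for a single argument x), one can compute f(x) by running through all pairs (y, z); this replaces the coded search µu.R(~x, π0(u), π1(u)) used in the proof:

// Hypothetical decidable test: f(x) ' y ⇔ ∃z.R(x, y, z).
static boolean R(long x, long y, long z){ throw new UnsupportedOperationException(); }

// Searches all pairs (y, z) in the order y + z = 0, 1, 2, . . . ;
// returns the unique y with f(x) ' y, and loops forever if f(x) is undefined.
static long f(long x){
    for (long s = 0; ; s++){
        for (long y = 0; y <= s; y++){
            long z = s - y;
            if (R(x, y, z)) return y;
        }
    }
}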


8.4 Closure Properties of the Recursively enumerable Predicates

Lemma 8.8 The recursively enumerable predicates are closed under the following operations:

(a) Union:

If A, B ⊆ Nn are r.e. so is A ∪ B.

(b) Intersection:

If A, B ⊆ Nn are r.e. so is A ∩ B.

(c) Substitution by recursive functions:

If A ⊆ Nn is r.e. and fi : Nk → N are recursive for i = 0, . . . , n − 1, so is

C := {(y0, . . . , yk−1) ∈ Nk | A(f0(y0, . . . , yk−1), . . . , fn−1(y0, . . . , yk−1))} .

(d) (Unbounded) existential quantification:

If D ⊆ Nn+1 is r.e., so is

E := {(x0, . . . , xn−1) ∈ Nn | ∃y.D(x0, . . . , xn−1, y)} .

(e) Bounded universal quantification:

If D ⊆ Nn+1 is r.e., so is

F := {(x0, . . . , xn−1, z) ∈ Nn+1 | ∀y < z.D(x0, . . . , xn−1, y)} .

Proof:
Let A, B ⊆ Nn be r.e. Then there exist primitive-recursive relations R, S s.t.

A = {(x0, . . . , xn−1) ∈ Nn | ∃y.R(x0, . . . , xn−1, y)} ,

B = {(x0, . . . , xn−1) ∈ Nn | ∃y.S(x0, . . . , xn−1, y)} .

(a), (b): One can easily see that

A ∪ B = {(x0, . . . , xn−1) ∈ Nn | ∃y.(R(x0, . . . , xn−1, y) ∨ S(x0, . . . , xn−1, y))} ,

A ∩ B = {(x0, . . . , xn−1) ∈ Nn | ∃y.(R(x0, . . . , xn−1, π0(y)) ∧ S(x0, . . . , xn−1, π1(y)))} .

therefore A ∪ B and A ∩ B are r.e.

(c):

C = {(~y) | A(f0(~y), . . . , fn−1(~y))} = {(~y) | ∃z.R(f0(~y), . . . , fn−1(~y), z)} is r.e.

(d) follows from Theorem 8.5.
(e): Assume T is a primitive-recursive predicate s.t.

D = {(x0, . . . , xn−1, y) ∈ Nn+1 | ∃z.T (x0, . . . , xn−1, y, z)} .

Then we get

F = {(~x, y) | ∀y′ < y.D(~x, y′)}
  = {(~x, y) | ∀y′ < y.∃z.T (~x, y′, z)}
  = {(~x, y) | ∃z.∀y′ < y.T (~x, y′, (z)y′)} is r.e.,


where in the last line we used that

{(~x, z) | ∀y′ < y.T (~x, y′, (z)y′)} is primitive-recursive .

Lemma 8.9 The r.e. predicates are not closed under complement, i.e. there exists an r.e. predicate A ⊆ Nn s.t. Nn \ A is not r.e.

Proof:
Haltn is r.e. If Nn \ Haltn were r.e., then by Theorem 8.6 Haltn would be recursive, which is not the case.

9 Lambda-definable Functions

10 Reducibility.

11 Computational Complexity

