Fundamentals of Undecidability in Computational Theory

Department of Mathematics and Statistics, Indian Institute of Technology

MTH 401 Theory of Computation: Project Report

Arcchit Jain 10142, Minnie Kabra 10400, Oajasavee K. Mourya 10469, Ved Gupta 10790

Contents

1 Preliminaries

2 Godel’s Incompleteness Theorem

3 Post Correspondence Problem

4 Halting Problem

5 Hilbert’s Tenth Problem

6 Wang’s Tile Problem

7 Busy Beaver

8 Some More Undecidable Problems

9 Hypercomputation

Chapter 1

Preliminaries

Definition Turing Machine

A Turing Machine is a machine described by

1. A finite set of states (represented by Q).

2. An input alphabet (represented by Σ).

3. A tape alphabet (represented by Γ; where Σ ⊂ Γ).

4. A transition function (represented by δ).

5. A start state (represented by q0; always belongs to Q).

6. A blank symbol (represented by B; always belongs to Γ but not to Σ).

7. A set of final states (represented by F ; F ⊆ Q).

The transition function δ takes two arguments, a state in Q and a tape symbol in Γ. Its value is either undefined or a triple of the form (p, Y, D), where

• p is the next state (p ∈ Q),

• Y is the tape symbol written over the scanned symbol (Y ∈ Γ), and

• D is the direction (left or right) in which the head moves along the tape.
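The definition above can be made concrete with a small simulator. The following Python sketch is not part of the original report: the function name run_tm, the step budget max_steps, and the example machine EVEN_ONES (which accepts binary strings containing an even number of 1s) are assumptions chosen only for illustration.

```python
def run_tm(delta, start, finals, blank, tape_input, max_steps=10_000):
    """Simulate a Turing machine for at most max_steps steps.

    Returns True if a final state is reached, False if the machine halts
    in a non-final state (delta undefined), and None if the step budget
    runs out, in which case the simulator cannot tell whether the
    machine would ever halt.
    """
    tape = dict(enumerate(tape_input))  # sparse tape: position -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state in finals:
            return True
        symbol = tape.get(head, blank)
        move = delta.get((state, symbol))
        if move is None:  # delta undefined: halt without accepting
            return False
        state, new_symbol, direction = move
        tape[head] = new_symbol
        head += 1 if direction == "R" else -1
    return None


# Hypothetical example machine: accepts strings over {0, 1} that contain
# an even number of 1s. Each entry maps (state, symbol) to (p, Y, D).
EVEN_ONES = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("even", "B"): ("accept", "B", "R"),
}

print(run_tm(EVEN_ONES, "even", {"accept"}, "B", "1001"))  # True
print(run_tm(EVEN_ONES, "even", {"accept"}, "B", "1011"))  # False
```

The None result is the interesting case: a bounded simulation can confirm acceptance or a halt, but it can never confirm that a machine runs forever.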

Definition Decidable Language

A language for which there exists a Turing machine that accepts every string in the language, accepts no other string, and halts on every input string is called a Decidable language, a Turing-decidable language, or a Recursive language.

Definition Partially Decidable Language

A language for which there exists a Turing machine that accepts every string in the language and no other string is called a Partially Decidable language, a Turing-recognizable language, or a Recursively Enumerable language. The machine halts on every string it accepts, but on a string outside the language it may either reject or run forever.

Definition Undecidable Language

A language for which there does not exist any Turing machine that

1. accepts all the strings in the language and no other string, and

2. halts on every input string

is called an Undecidable language or Non-Recursive language. An undecidable language may still be recursively enumerable; a language that is not even recursively enumerable is in particular undecidable.

Definition Undecidable Problem

An Undecidable Problem is a decision problem for which it is impossible to construct a

single algorithm that always leads to a correct yes-or-no answer.

Chapter 2

Godel’s Incompleteness Theorem

Kurt Friedrich Godel was an Austrian American logician, mathematician, and

philosopher. Godel made an immense impact upon scientific and philosophical thinking in

the 20th century. Godel is best known for his two incompleteness theorems, published in

1931. He also showed that neither the axiom of choice nor the continuum hypothesis can

be disproved from the accepted axioms of set theory, assuming these axioms are consistent.

This result opened the door for mathematicians to assume the axiom of choice in their

proofs.

Godel’s Incompleteness Theorems are two theorems of mathematical logic that establish inherent limitations of all but the most trivial axiomatic systems capable of doing arithmetic. In this section we will see the philosophical significance of the Incompleteness Theorems and how they are related to the Liar’s Paradox, the consistency of formal systems, and undecidability. We will also cover the proof of the First Incompleteness Theorem. A few examples supporting Godel’s Theorem will be discussed towards the end of this section.

The two Incompleteness Theorems are widely interpreted as showing that Hilbert’s

program to find a complete and consistent set of axioms for all mathematics is impossible,

giving a negative answer to Hilbert’s second problem. Before we move forward we need to

know what Hilbert’s Second Problem is and how it is related to Godel’s Incompleteness

Theorems.

HILBERT’S SECOND PROBLEM

Hilbert’s problems form a list of 23 problems in mathematics published by David

Hilbert in 1900. But here we are only interested in Hilbert’s Second Problem, that is,

When we are engaged in investigating the foundations of a science, we must set up a system

of axioms which contains an exact and complete description of the relations subsisting

between the elementary ideas of that science. But above all I wish to designate the following

as the most important among the numerous questions which can be asked with regard to

the axioms: To prove that they are not contradictory, that is, that a definite number of

logical steps based upon them can never lead to contradictory results. In geometry, the

proof of the compatibility of the axioms can be effected by constructing a suitable field of

numbers, such that analogous relations between the numbers of this field correspond to

the geometrical axioms. On the other hand a direct method is needed for the proof of the

compatibility of the arithmetical axioms.

In order to understand what Hilbert meant by his problem, there are a few definitions we need to know. To understand its solution we need a clear idea of both of Godel’s Incompleteness Theorems. After that, we relate undecidability to Godel’s theorems and try to explain the occurrence of undecidable sets, statements, and problems.

Later we will see that Godel showed that no such axiomatic system can be both complete and consistent, and that even if the system is consistent, it can never prove its own consistency!

Definition Axiomatic System

An axiomatic system is any set of axioms from which some or all axioms can be used in

conjunction to logically derive theorems. A mathematical theory consists of an axiomatic

system and all its derived theorems. An axiomatic system that is completely described is

a special kind of formal system.

Definition Formal Systems

Formal systems in mathematics consist of the following elements:

1. A finite set of symbols known as alphabet, that can be used for constructing formulae.

2. A grammar, which tells how well-formed formulae are constructed out of the symbols

in the alphabet. It is usually required that there be a decision procedure for deciding

whether a formula is well formed or not.

3. A set of axioms where each axiom must be a well-formed formula.

4. A set of inference rules.

Now we will look at some popular axiomatic systems that are used very frequently in mathematics. We will not go into much detail about them; minor references to these axiomatic systems may be made later.

Peano axioms

The Peano axioms are a set of axioms for the natural numbers. They are widely used in number theory. The set consists of 9 axioms, which use the equality relation (=) and the successor function S(n).

1. 0 is a natural number.

2. For every natural number x, x = x.

3. For all natural numbers x and y, if x = y, then y = x.

4. For all natural numbers x, y and z, if x = y and y = z, then x = z.

5. For all a and b, if a is a natural number and a = b, then b is also a natural number.

6. For every natural number n, S(n) is a natural number.

7. For every natural number n, S(n) = 0 is false.

8. For all natural numbers m and n, if S(m) = S(n), then m = n.

9. If K is a set such that, 0 is in K, and for every natural number n, if n is in K, then

S(n) is in K, then K contains every natural number.

Zermelo-Fraenkel Set Theory with the Axiom of Choice (ZFC)

The ZFC system is commonly used in set theory. In the form given here it also consists of 9 axioms.

1. Two sets are equal (are the same set) if they have the same elements.

∀ x ∀ y [ ∀ z (z ∈ x⇔ z ∈ y)⇒ x = y].

2. Every non-empty set x contains a member y such that x and y are disjoint sets.

∀ x [ ∃ a (a ∈ x)⇒ ∃ y (y ∈ x ∧ ¬∃ z(z ∈ y ∧ z ∈ x))].

3. If z is a set, and φ is any property which may characterize the elements x of z, then

there is a subset y of z containing those x in z which satisfy the property.

∀ z ∀ w1 . . . wn∃ y ∀ x [x ∈ y ⇔ (x ∈ z ∧ φ)].

4. If x and y are sets, then there exists a set which contains x and y as elements.

∀ x ∀ y ∃ z (x ∈ z ∧ y ∈ z).

5. For every set F there is a set A containing every set that is a member of some member of F .

∀ F ∃ A∀ Y ∀ x [(x ∈ Y ∧ Y ∈ F)⇒ x ∈ A].

6. Let φ be any formula in the language of ZFC whose free variables are among

x, y, A, w1, . . . , wn, so that in particular B is not free in φ. Then:

∀ A ∀ w1, . . . , wn [∀ x (x ∈ A ⇒ ∃ y φ) ⇒ ∃ B ∀ x (x ∈ A ⇒ ∃ y (y ∈ B ∧ φ))].

7. Let S(x) abbreviate x ∪{x}, where x is some set. Then there exists a set X such

that the empty set ∅ is a member of X and, whenever a set y is a member of X,

then S(y) is also a member of X.

∃ X [∅ ∈ X ∧ ∀ y(y ∈ X ⇒ S(y) ∈ X)] .

8. Let z ⊆ x abbreviate ∀q(q ∈ z ⇒ q ∈ x). For any set x, there is a set y which is a

superset of the power set of x. The power set of x is the class whose members are

all of the subsets of x.

∀ x ∃ y ∀ z [z ⊆ x⇒ z ∈ y].

9. For any set X, there is a binary relation R which well-orders X. This means R is

a linear order on X such that every nonempty subset of X has a member which is

minimal under R.

∀ X ∃ R (R well-orders X).

GODEL’S INCOMPLETENESS THEOREM

After getting a basic idea of what axiomatic systems are we can now move forward

and see what incompleteness theorem says.

FIRST INCOMPLETENESS THEOREM

Any effectively generated theory capable of expressing elementary arithmetic cannot be

both consistent and complete. In particular, for any consistent, effectively generated formal

theory that proves certain basic arithmetic truths, there is an arithmetical statement that

is true but not provable in the theory.

The statement may seem difficult at first glance, but the theorem is easy to understand. An explanation of the first incompleteness theorem is given below.

The true but unprovable statement referred to by the theorem is often referred to as

“the Godel sentence” for the theory. The proof constructs a specific Godel sentence for

each effectively generated theory, but there are infinitely many statements in the language

of the theory that share the property of being true but unprovable. For example, the

conjunction of the Godel sentence and any logically valid sentence will have this property.

For each consistent formal theory T having the required small amount of number

theory, the corresponding Godel sentence G asserts: “G cannot be proved within the

theory T”. This interpretation of G leads to the following informal analysis. If G were

provable under the axioms and rules of inference of T , then T would have a theorem, G,

which effectively contradicts itself, and thus the theory T would be inconsistent. This

means that if the theory T is consistent then G cannot be proved within it, and so the

theory T is incomplete. Moreover, the claim G makes about its own unprovability is

correct. In this sense G is not only unprovable but true, and provability within the theory

T is not the same as truth.

Each effectively generated theory has its own Godel statement. It is possible to define a larger theory T′ that contains the whole of T, plus G as an additional axiom. This will not result in a complete theory, because Godel’s theorem will also apply to T′, and thus T′ cannot be complete. In this case, G is indeed a theorem in T′, because it is an axiom. Since G states only that it is not provable in T, no contradiction is presented by its provability in T′. However, because the incompleteness theorem applies to T′, there will be a new Godel statement G′ for T′, showing that T′ is also incomplete. G′ will differ from G in that G′ will refer to T′, rather than T.

Godel’s first incompleteness theorem shows that any consistent effective formal system

that includes enough of the theory of the natural numbers is incomplete: there are true

statements expressible in its language that are unprovable. A system may be incomplete

simply because not all the necessary axioms have been discovered. For example, Euclidean

geometry without the parallel postulate is incomplete; it is not possible to prove or disprove

the parallel postulate from the remaining axioms. This will be discussed later towards the

end of topic. Godel’s theorem shows that, in theories that include a small portion of

number theory, a complete and consistent finite list of axioms can never be created, nor

even an infinite list that can be enumerated by a computer program. Each time a new

statement is added as an axiom, there are other true statements that still cannot be proved,

even with the new axiom. If an axiom is ever added that makes the system complete, it

does so at the cost of making the system inconsistent.

There are complete and consistent lists of axioms for arithmetic that cannot be enumerated

by a computer program. For example, one might take all true statements about

the natural numbers to be axioms (and no false statements), which gives the theory known

as “true arithmetic”. The difficulty is that there is no mechanical way to decide, given

a statement about the natural numbers, whether it is an axiom of this theory, and thus

there is no effective way to verify a formal proof in this theory.

Godel’s argument is closely related to the liar paradox. The

liar paradox is the sentence “This sentence is false.” An analysis of the liar sentence shows

that it can be neither true nor false. A Godel sentence G for a theory T makes a

similar assertion to the liar sentence, but with truth replaced by provability: G says “G is

not provable in the theory T .” It is not possible to replace “not provable” with “false” in

a Godel sentence because the predicate “Q is the Godel number of a false formula” cannot

be represented as a formula of arithmetic.

SECOND INCOMPLETENESS THEOREM

For any formal effectively generated theory T including basic arithmetical truths and

also certain truths about formal provability, if T includes a statement of its own consistency

then T is inconsistent.

This strengthens the first incompleteness theorem, because the statement constructed

in the first incompleteness theorem does not directly express the consistency of the theory.

A technical subtlety in the second incompleteness theorem is how to express the consistency

of T as a formula in the language of T . There are many ways to do this, and not all of

them lead to the same result. In particular, different formalizations of the claim that T

is consistent may be inequivalent in T , and some may even be provable. For example,

first-order Peano arithmetic (PA) can prove that the largest consistent subset of PA is

consistent. But since PA is consistent, the largest consistent subset of PA is just PA, so in

this sense PA “proves that it is consistent”. What PA does not prove is that the largest

consistent subset of PA is, in fact, the whole of PA.

For any familiar explicitly axiomatized theory T, it is possible to canonically define a

formula Con(T ) expressing the consistency of T. The formalization of Con(T ) depends on

two factors, formalizing the notion of a sentence being derivable from a set of sentences

and formalizing the notion of being an axiom of T.

Godel’s second incompleteness theorem also implies that a theory T1 satisfying the

technical conditions outlined above cannot prove the consistency of any theory T2 which

proves the consistency of T1. This is because such a theory T1 can prove that if T2 proves

the consistency of T1, then T1 is in fact consistent. For the claim that T1 is consistent has

form “for all numbers n, n has the decidable property of not being a code for a proof of

contradiction in T1”. If T1 were in fact inconsistent, then T2 would prove for some n that

n is the code of a contradiction in T1. But if T2 also proved that T1 is consistent (that

is, that there is no such n), then it would itself be inconsistent. This reasoning can be

formalized in T1 to show that if T2 is consistent, then T1 is consistent. Since, by the second

incompleteness theorem, T1 does not prove its consistency, it cannot prove the consistency

of T2 either.

Undecidable statements

There are two distinct senses of the word “undecidable” in mathematics and computer

science. The first of these is the proof-theoretic sense used in relation to Godel’s theorems,

that of a statement being neither provable nor refutable in a specified deductive system.

The second sense is used in relation to computability theory and applies not to statements

but to decision problems, which are countably infinite sets of questions each requiring a

yes or no answer. Such a problem is said to be undecidable if there is no computable

function that correctly answers every question in the problem set.

Two concrete examples of undecidable statements are:

1. The continuum hypothesis can neither be proved nor refuted in ZFC.

2. The axiom of choice can neither be proved nor refuted in ZF (which is all the ZFC

axioms except the axiom of choice).

Godel’s incompleteness theorems struck a fatal blow to David Hilbert’s second problem,

which asked for a finitary consistency proof for mathematics. The second incompleteness

theorem, in particular, is often viewed as making the problem impossible. Not all

mathematicians agree with this analysis, however, and the status of Hilbert’s second problem is

not yet decided.

A Turing machine can be viewed as an axiomatic system (of sufficient complexity) in which the transition function plays the role of the axioms and the strings the machine derives play the role of theorems. Together, a grammar and a set of Turing machines describe the structure of a formal system. The incompleteness theorem then suggests, by analogy, that there will be questions about strings that no machine can settle. Undecidability is therefore an inherent, basic characteristic of such systems. Having seen why undecidable sets occur, we can now move forward and explore some famous undecidable problems and see some interesting results.

Chapter 3

Post Correspondence Problem

Definition Post Correspondence Problem (PCP)

The Post Correspondence Problem was introduced by Emil Post in 1946, who proved it to be undecidable. Because it is simpler to state than the Halting Problem, PCP is used in the proofs of several other undecidable problems.

Definition of the problem

Input: A finite collection of blocks, each labelled with a top string and a bottom string.

Question: Given an unlimited supply of copies of these particular blocks, can one form

a nonempty finite sequence of these for which the concatenation of the top strings equals

the concatenation of the bottom strings?

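Example Consider two lists of strings, A (the top strings) and B (the bottom strings), each containing three strings.

    A     B

a   10    101

b   011   11

c   101   011

In this particular problem we will never be able to find a sequence of blocks for which the two concatenations are exactly the same. Only block a can start a sequence, since it is the only block in which one string is a prefix of the other, and after it the only block that keeps the two concatenations compatible is c, which always leaves the bottom string one symbol longer than the top.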

Consider another example where the lists are as follows:

    A       B

a   1       111

b   10111   10

c   10      0

For this particular problem, the sequence “baac” works. Check:

A : 10111 + 1 + 1 + 10 = 101111110

B : 10 + 111 + 111 + 0 = 101111110

One more interesting property is that if a PCP instance has a solution then it has infinitely many solutions. In this case the sequence “baac” can be repeated any number of times, and each repetition is again a solution.
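The two examples can be checked mechanically. The Python sketch below is an illustration added here, not part of the original report; the helper names concat, is_solution and bounded_search, and the length bound max_len, are assumptions. Because PCP is undecidable, the search is necessarily bounded: a result of None only means that no solution exists up to the given length.

```python
from collections import deque


def concat(strings, indices):
    """Concatenate the strings selected by a sequence of 0-based indices."""
    return "".join(strings[i] for i in indices)


def is_solution(top, bottom, indices):
    """A solution is a nonempty index sequence with equal concatenations."""
    return len(indices) > 0 and concat(top, indices) == concat(bottom, indices)


def bounded_search(top, bottom, max_len):
    """Breadth-first search for a solution using at most max_len blocks.

    Returns the first solution found, or None if there is none within
    the bound (which says nothing about longer sequences).
    """
    queue = deque([[]])
    while queue:
        seq = queue.popleft()
        if len(seq) == max_len:
            continue
        for i in range(len(top)):
            cand = seq + [i]
            t, b = concat(top, cand), concat(bottom, cand)
            if t == b:
                return cand
            # Keep a partial sequence only if one side is a prefix of the other.
            if t.startswith(b) or b.startswith(t):
                queue.append(cand)
    return None


# First example (no solution) and second example (solution "baac").
A1, B1 = ["10", "011", "101"], ["101", "11", "011"]
A2, B2 = ["1", "10111", "10"], ["111", "10", "0"]

baac = [1, 0, 0, 2]  # blocks b, a, a, c as 0-based indices
print(is_solution(A2, B2, baac))       # True
print(is_solution(A2, B2, baac * 3))   # True: repetitions are solutions too
print(bounded_search(A2, B2, 6))       # [1, 0, 0, 2]
print(bounded_search(A1, B1, 12))      # None
```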

Formal Definition

Input: Two lists of n strings,

A = W1, W2, . . . , Wn

B = V1, V2, . . . , Vn

There is a solution to the Post Correspondence Problem if there is a nonempty sequence of indices i, j, . . . , k such that:

WiWj . . . Wk = ViVj . . . Vk

where indices may be repeated or omitted.

Using the Post Correspondence Problem to prove undecidability of other problems:

1. To Prove: There does not exist an algorithm to decide whether a context-free grammar is ambiguous.

Proof : We reduce Post’s Correspondence Problem to this problem. Suppose we can, in fact, decide the language {〈G〉 | G is a CFG and G is ambiguous}.

Given a PCP instance α1, . . . , αm, β1, . . . , βm:

Construct the following CFG G = (V, Σ, R, S) where

V = {S, S1, S2}, R = {S → S1 | S2, S1 → α1S1σ1 | . . . | αmS1σm | α1σ1 | . . . | αmσm, S2 → β1S2σ1 | . . . | βmS2σm | β1σ1 | . . . | βmσm} (where the σi are new characters added to the alphabet, e.g., σi = i).

If the grammar is ambiguous, then there is a derivation of some string w in two different ways. Suppose first that both derivations start with the rule S → S1 (the case S → S2 is symmetric). Reading the new characters backwards from the end of w determines which production was used at every step, so there can be only one such derivation, and this case is not possible. Hence, the only ambiguity can come from one derivation that starts with S1 and one that starts with S2. But then, taking the substring of w up to the beginning of the new characters, we have a solution to the PCP (since the strings of indices used after that point match).

Conversely, if there is no ambiguity, then the PCP cannot be solved, since a solution would imply an ambiguity of the form S ⇒ S1 ⇒∗ ασ and S ⇒ S2 ⇒∗ βσ, where α = β are the concatenations of the matching α’s and β’s (and the σ’s match).

Hence, we have reduced PCP to the ambiguity problem, and since PCP is undecidable, there cannot exist any algorithm to decide whether a given grammar is ambiguous.
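The link between PCP solutions and ambiguous strings can be seen on a small instance. The following Python sketch is an illustration under stated assumptions, not part of the original proof: the function name derived_word and the representation of the new characters σi as tuples ("sigma", i) are hypothetical choices. It builds the word obtained from S1 (or S2) by applying the productions for a given index sequence, using the solvable PCP instance from the earlier example.

```python
def derived_word(strings, indices):
    """Word derived from S1 (or S2) for a sequence of 0-based block indices.

    Applying X -> s_i X sigma_i for each index, and finishing with
    X -> s_i sigma_i, yields the chosen strings from left to right
    followed by their index markers in reverse order.
    """
    body = [ch for i in indices for ch in strings[i]]
    markers = [("sigma", i + 1) for i in reversed(indices)]
    return tuple(body + markers)


alpha = ["1", "10111", "10"]   # top strings
beta = ["111", "10", "0"]      # bottom strings
solution = [1, 0, 0, 2]        # the sequence "baac" as 0-based indices

via_s1 = derived_word(alpha, solution)  # derivation starting with S -> S1
via_s2 = derived_word(beta, solution)   # derivation starting with S -> S2
print(via_s1 == via_s2)  # True
```

The same word arises from a derivation through S1 and from one through S2, so the grammar built from this instance is ambiguous. For an index sequence that is not a PCP solution the two words differ.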

2. To Prove: Given two CFLs, the problem of deciding whether the CFLs are disjoint is undecidable.

Proof : To solve this problem we require the following lemma.

Lemma: The set of valid computations of a Turing machine M is the intersection of two CFLs L1 and L2, and grammars for these CFLs can be effectively constructed from M.

Let G1 and G2 be two given grammars; the problem is to decide whether L(G1) ∩ L(G2) is empty. Using the lemma, we can construct from any Turing machine M grammars G1 and G2 such that L(G1) ∩ L(G2) is the set of valid computations of M. If there were an algorithm A to tell whether the intersection of the languages of two CFGs is empty, we could construct an algorithm B to tell whether L(M) = ∅ for an arbitrary TM M. Simply design B to construct G1 and G2 from M as in the lemma, then apply algorithm A to tell whether L(G1) ∩ L(G2) is empty. If the intersection is empty, then there are no valid computations of M, so L(M) = ∅. If the intersection is not empty, L(M) ≠ ∅. That is, the problem of emptiness for r.e. sets reduces to the problem of emptiness of intersection for CFGs.

Algorithm B cannot exist, however, since L(M) = ∅ is undecidable by Rice’s Theorem. Therefore A does not exist, so it is undecidable whether the intersection of two CFLs is empty.

3. To Prove: For a given CFG G, it is undecidable whether L(G) = Σ∗.

Proof : Suppose that the problem of deciding L = Σ∗ is decidable. Let G1 and G2 be the grammars constructed from a Turing machine M in the lemma above, and take L to be the complement of L(G1) ∩ L(G2), that is, the set of strings that are not valid computations of M. This L is a CFL, and a grammar for it can be effectively constructed from M. Since L = Σ∗ exactly when L(G1) ∩ L(G2) = ∅, the problem of deciding L(G1) ∩ L(G2) = ∅ for these grammars would be decidable. But by the previous proof, it is undecidable. Hence our assumption was wrong. Thus, it is undecidable for a CFG G whether L(G) = Σ∗.

4. To Prove: For a given CFG G and a regular set R, it is undecidable whether L(G) = R.

Proof : We know that Σ∗ is a regular set. So if we could decide whether the language L of a CFG is equal to a given regular set, then we could decide whether L = Σ∗; but we have proved previously that this problem is undecidable.

5. To Prove: For any two given CFLs L1 and L2, it is undecidable whether L1 = L2.

Proof : Let L1 = L(G1) and L2 = L(G2). Fix G2 to be a grammar generating Σ∗, where Σ is the terminal alphabet of G1. Then deciding L1 = L2 is equivalent to deciding whether L1 = Σ∗, which is undecidable. Hence the given problem is undecidable.

6. To Prove: Given two CFLs L1 and L2, it is undecidable whether L2 ⊆ L1.

Proof : The proof is the same as the one above: with L2 = Σ∗, we have L2 ⊆ L1 exactly when L1 = Σ∗.

7. To Prove: Given a regular set R and a CFL L, it is undecidable whether R ⊆ L.

Proof : Take R = Σ∗; then R ⊆ L exactly when L = Σ∗, so the problem reduces to one already shown to be undecidable. Hence it is also undecidable.

Chapter 4

Halting Problem

Definition Halting Problem

Given a description of an arbitrary computer program, decide whether the program finishes

running or continues to run forever.

This is equivalent to the problem of deciding, given a program and an input, whether

the program will eventually halt when run with that input, or will run forever.

In 1936, Alan Turing proved that a general algorithm to solve the halting problem

for all possible program-input pairs cannot exist. In other words, he proved that the

Halting Problem is undecidable.

Suppose that there exists a program that takes as input a program M and an input w for that program. Let us assume that it always determines correctly whether the program M would halt on input w (it would then return “yes”), or whether it would run forever (it would then return “no”). Let us call this program H(M, w).

Now, we use H(M, w) to write another program, D(w), such that D(w) halts if H(w, w) returns “no” and runs forever if H(w, w) returns “yes”. Since the program H(M, w) has been assumed to halt (and give a result) on each input, D(w) is well defined for every input string w.

And now comes the unanswerable question: does D(D) halt? It would halt if and only if the call H(D, D) returns “no”. In other words, it would halt if and only if it would not halt. This is a contradiction: we must conclude that the only hypothesis that started us on this path is false, that is, the program H(M, w) does not exist. Thus, there does not exist any program or algorithm for the problem that H would solve: to decide whether arbitrary programs would halt or loop.
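The diagonal construction can be written down in Python. This is a sketch under stated assumptions, not part of the original report: halts stands for any claimed decider (none can actually exist), and the names make_d and refutes are hypothetical. Whatever answer a candidate gives about D run on itself, D is built to do the opposite.

```python
def make_d(halts):
    """Build the diagonal program D from a claimed halting decider.

    halts(program, argument) is assumed to return True ("yes, it halts")
    or False ("no, it runs forever") for every pair it is given.
    """
    def d(program):
        if halts(program, program):
            while True:  # predicted to halt, so run forever instead
                pass
        return "halted"  # predicted to run forever, so halt at once
    return d


def refutes(halts):
    """Return True if the candidate decider is wrong about D run on D."""
    d = make_d(halts)
    prediction = halts(d, d)
    # By construction, d(d) halts exactly when the prediction is False.
    actually_halts = not prediction
    return prediction != actually_halts


# Two trivial candidate deciders (they ignore their arguments).
print(refutes(lambda program, argument: True))   # True
print(refutes(lambda program, argument: False))  # True
```

The function refutes never calls d(d) itself, because for a candidate that answers True the call would not return. The candidates shown are deliberately trivial; the point is that the same mismatch arises for any function that always returns an answer.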

We have a full-fledged notation for algorithms: Turing machines. We are thus ready to define a language, the halting language H, and prove that it is recursively enumerable (partially decidable). Let

H = {(M, w) : Turing machine M halts on input string w}.

Notice first that H is recursively enumerable: on input (M, w), the universal Turing machine U halts precisely when the input is in H.

Furthermore, if H is recursive, then every recursively enumerable language is recursive. In other words, all recursively enumerable languages are decidable if and only if H is recursive. For suppose that H is indeed decided by some Turing machine M0. Then, given any particular Turing machine M that partially decides a language L(M), we could design a Turing machine M′ that fully decides L(M): on input w, M′ first runs M0 on (M, w); if M0 answers that M does not halt on w, then M′ rejects, and otherwise M′ simulates M on w, which is now guaranteed to halt, and answers as M does.

Theorem Undecidability of Halting Problem

Let ATM = {〈M,w〉|M is a TM and M accepts w}. Then ATM is undecidable.

Proof The proof is by contradiction. Assume that ATM is decidable, and suppose that H is a decider for ATM, i.e.,

H(〈M, w〉) = accept, if M accepts w
            reject, if M does not accept w

A new Turing machine D with H as a subroutine will now be constructed. The input to D is a description 〈M〉 of a Turing machine M. This information is sent to Turing machine H, which determines what M does when the input to M is its own description. Once D has determined this information, it does the opposite, i.e., it rejects if M accepts and accepts if M does not accept:

D(〈M〉) = accept, if M does not accept 〈M〉
         reject, if M accepts 〈M〉

In the case when D is run with its own description the following is obtained:

D(〈D〉) = accept, if D does not accept 〈D〉
         reject, if D accepts 〈D〉

Thus D accepts 〈D〉 if and only if D does not accept 〈D〉, which is a contradiction. Hence neither D nor H can exist, and ATM is undecidable.

The Halting Problem is noncomputable, but it is an important problem. It would be useful to know whether a procedure application will terminate in a reasonable amount of time, but the Halting Problem does not concern that question. It concerns the question of whether the procedure application will terminate in any finite amount of time, no matter how long it is.

Virus detection

A virus is a program that infects other programs. A virus spreads by copying its own

code into the code of other programs, so when those programs are executed the virus will

execute. In this manner, the virus spreads to infect more and more programs. A typical

virus also includes a malicious payload so when it executes in addition to infecting other

programs it also performs some damaging (corrupting data files) or annoying (popping

up messages) behavior. The Is-Virus Problem is to determine if a procedure specification

contains a virus:

Input: A specification of a Python program.

Output: If the program contains a virus (a code fragment that will infect other files), output True. Otherwise, output False.

To demonstrate that the Is-Virus Problem is noncomputable, we show how to define a halts algorithm given a hypothetical isVirus algorithm. Since we know halts is noncomputable, this shows there is no isVirus algorithm.

Assume infectFiles is a procedure that infects files, so the result of evaluating isVirus('infectFiles()') is True. We could define halts as:

def halts(p):
    # The appended infectFiles() call is reached only if p halts.
    return isVirus(p + '; infectFiles()')

This works as long as the program specified by p does not itself exhibit file-infecting behavior. If it does, p could infect a file and never terminate, and halts would produce the wrong output. To solve this we need to hide the file-infecting behavior of the original program. A rough definition of file-infecting behavior would be to consider any write to an executable file to be an infection. To avoid any file infections in the specified program, we replace all procedures that write to files with procedures that write to shadow copies of these files. For example, we could do this by creating a new temporary directory and prepending that path to all file names. We call this (assumed) procedure sandBox, since it transforms the original program specification into one that would execute in a protected sandbox.

def halts(p):
    # sandBox(p) behaves like p but cannot infect real files.
    return isVirus(sandBox(p) + '; infectFiles()')

Since we know there is no algorithm that solves the Halting Problem, this proves that

there is no algorithm that solves the Is-Virus problem.

Virus scanners such as Symantec’s Norton AntiVirus attempt to solve the Is-Virus Problem, but its noncomputability means that no scanner can give the correct answer for every program. Virus scanners detect known viruses by scanning files for strings that match signatures in a database of known viruses. As long as the signature database is frequently updated they may be able to detect currently spreading viruses, but this approach cannot detect a new virus that does not match the signature of a previously known virus.
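Signature matching and its blind spot can be illustrated in a few lines of Python. This is a toy sketch, not a description of how any real scanner works: the function scan, the database SIGNATURES, and the byte strings in it are all made up for this example.

```python
from typing import Dict, List


def scan(data: bytes, signatures: Dict[str, bytes]) -> List[str]:
    """Return the names of all known signatures that occur in data."""
    return [name for name, pattern in signatures.items() if pattern in data]


# Hypothetical signature database; names and byte patterns are invented.
SIGNATURES = {
    "demo-virus-a": b"\xde\xad\xbe\xef",
    "demo-virus-b": b"INFECT_ALL",
}

known = b"header" + b"\xde\xad\xbe\xef" + b"payload"
variant = b"header" + b"\xde\xad\xbe\xee" + b"payload"  # one byte changed

print(scan(known, SIGNATURES))    # ['demo-virus-a']
print(scan(variant, SIGNATURES))  # []
```

Changing a single byte of the matched pattern is enough to evade this scanner, which is why a signature database alone cannot recognize a virus it has never seen.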

Sophisticated virus scanners employ more advanced techniques to attempt to detect

complex viruses such as metamorphic viruses that alter their own code as they propagate

to avoid detection. But, because the general Is-Virus Problem is noncomputable, we know

that it is impossible to create a program that always terminates and that always correctly

determines if an input procedure specification is a virus.

Chapter 5

Hilbert’s Tenth Problem

The statement of Hilbert’s tenth problem is as follows :

Given a Diophantine equation with any number of unknown quantities and with

rational integral numerical coefficients: To devise a process according to which it can be

determined in a finite number of operations whether the equation is solvable in rational

integers.

The undecidability of the existence of integral roots turned out to be the consequence

of a deep equivalence.

Definition Diophantine Predicates and Relations

A predicate P on N^k is called diophantine iff there is an n ∈ N and a polynomial p with integer coefficients in k + n variables, such that P(x) ⇐⇒ ∃ y ∈ N^n : p(x, y) = 0.

A set or relation S ⊂ N^k is called diophantine iff x ∈ S is a diophantine predicate.

A function f : N^k → N is called diophantine iff its graph {(f(x), x)} ⊂ N^(k+1) is a diophantine set.

The definition allows for polynomials of arbitrary (though finite) degree. Diophantine

problems have fewer equations than unknown variables and involve finding integers that

work correctly for all equations.

The following trick by Skolem shows that we can trade the degree for the number

of variables to the extent that finally we may restrict ourselves to polynomials of degree

at most four.

Lemma If S ⊂ Nk is a diophantine set, then there is an m ∈ N and a polynomial q in

m + k variables with integer coefficients and of degree at most four such that S = {x ∈Nk| ∃ z ∈ Nm : q(x, z) = 0}.

Proof : By assumption there is a polynomial $p$ with integer coefficients such that
$S = \{x \in \mathbb{N}^k \mid \exists y \in \mathbb{N}^n : p(x, y) = 0\}$. The new polynomial is constructed
recursively: for every monomial in $p$ of degree larger than two, introduce a new
variable $u_1, u_2, \ldots$ defined as the product of the first two variables of that
monomial. Inserting the new variables leads to a new polynomial $p_1(x, y, u)$ whose
maximal degree is one less than that of $p$, and $p(x, y) = p_1(x, y, u)$ if we impose the defining
constraints for the $u_i$. Iterating this procedure, we obtain a sequence of polynomials in
more and more variables which eventually is at most quadratic in all variables. Suppose
$p_n$ is this quadratic polynomial. The constraints imposed on the new variables, which
guarantee that $p(x, y) = p_n(x, y, u, \ldots)$, can now be formulated in terms of the existence
of integral roots of quadratic polynomials with integer coefficients.

Denote those polynomials by $c_1, c_2, \ldots$. For instance, if $u_1 := x_2 y_7$ and $u_2 := u_1 x_2$,
then we define $c_1(u_1, x_2, y_7) := u_1 - x_2 y_7$ and $c_2(u_2, u_1, x_2) := u_2 - u_1 x_2$. In this way
we achieve that $p(x, y) = 0$ iff $\exists u : p_n^2 + \sum_i c_i^2 = 0$, where $u$ denotes the collection of
all variables added to $x$ and $y$. Since a sum of squares of integers vanishes only if every
term vanishes, the polynomial $q := p_n^2 + \sum_i c_i^2$, which has degree at most four, leads to
the sought characterization of the diophantine set $S$.
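To make the construction concrete, here is a minimal Python sketch of Skolem's trick on a toy polynomial. The polynomial $p(x, y) = y^5 - x$ and the search bound `MAX_Y` are assumptions chosen only for illustration; they do not come from the text. The sketch checks by brute force that the degree-four polynomial $q$ describes the same set as $p$ for small $x$.

```python
# Toy instance of Skolem's trick (illustrative assumption: p(x, y) = y^5 - x).

def p(x, y):
    return y**5 - x

def q(x, y, u1, u2, u3):
    # New variables: u1 = y*y, u2 = u1*y, u3 = u2*y, hence y^5 = u3*y.
    p3 = u3 * y - x      # quadratic replacement for p
    c1 = u1 - y * y      # defining constraint for u1
    c2 = u2 - u1 * y     # defining constraint for u2
    c3 = u3 - u2 * y     # defining constraint for u3
    return p3**2 + c1**2 + c2**2 + c3**2   # degree at most four

MAX_Y = 2  # sufficient for x <= 40 because 3**5 = 243 > 40

def in_S_via_p(x):
    return any(p(x, y) == 0 for y in range(MAX_Y + 1))

def in_S_via_q(x):
    return any(
        q(x, y, u1, u2, u3) == 0
        for y in range(MAX_Y + 1)
        for u1 in range(MAX_Y**2 + 1)
        for u2 in range(MAX_Y**3 + 1)
        for u3 in range(MAX_Y**4 + 1)
    )

members = [x for x in range(41) if in_S_via_q(x)]
```

Both membership tests agree on the range examined, which is what the lemma predicts; the bounded search is of course only a sanity check, not a proof.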

Proposition Every diophantine set is recursively enumerable.

Proof : Let p(x, y) be a characterizing polynomial for the diophantine set. The statement

follows from observing that p(x, y) = 0 is a primitive recursive predicate which we call

P (x, y) and from recalling that a recursively enumerable set S is exactly one for which

there is a primitive recursive predicate for which S = {x|∃ y : P (x, y)}.
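The proposition can be read as a search procedure: enumerate all candidate tuples $y$ and stop as soon as $p(x, y) = 0$. The following Python sketch illustrates this under stated assumptions; the example polynomial (sums of two squares) and the `max_candidates` cutoff are hypothetical additions so that the demonstration terminates.

```python
from itertools import count, product

def tuples_by_bound(n):
    """Enumerate N^n (n >= 1) so that every tuple appears after finitely many steps."""
    for bound in count(0):
        for y in product(range(bound + 1), repeat=n):
            if max(y, default=0) == bound:   # skip tuples seen for a smaller bound
                yield y

def semi_decide(p, x, n, max_candidates=None):
    """Return a witness y with p(x, y) == 0; without a cutoff this may run forever.

    max_candidates is an artificial limit (an assumption of this sketch);
    a genuine semi-decision procedure has no such limit.
    """
    for i, y in enumerate(tuples_by_bound(n)):
        if max_candidates is not None and i >= max_candidates:
            return None
        if p(x, y) == 0:
            return y

def sum_of_two_squares(x, y):   # hypothetical example polynomial
    return x - y[0] ** 2 - y[1] ** 2

witness = semi_decide(sum_of_two_squares, 13, 2)
gave_up = semi_decide(sum_of_two_squares, 7, 2, max_candidates=1000)
```

The search succeeds for members of the set and simply never returns for non-members, which is exactly the behaviour of a recursively enumerable (but possibly non-recursive) set.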


A basic property of recursively enumerable sets is that the class is closed w.r.t. unions

and intersections. This is easily seen to hold also for diophantine sets:

Proposition: The class of diophantine predicates is closed w.r.t. (i) conjunction, (ii)

disjunction and (iii) the use of existential quantifiers.

Proof : Let upper-case $P$'s be diophantine predicates and lower-case $p$'s their characterizing
polynomials. Then

$$P_1(x) \wedge P_2(x) \iff \exists y_1, y_2 : p_1(x, y_1)^2 + p_2(x, y_2)^2 = 0,$$

$$P_1(x) \vee P_2(x) \iff \exists y_1, y_2 : p_1(x, y_1)\, p_2(x, y_2) = 0,$$

$$\exists x_2 : P(x_1, x_2) \iff \exists y, x_2 : p(x_1, x_2, y) = 0.$$
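The conjunction and disjunction rules can be checked on a small example. In the sketch below the two predicates ("x is a perfect square" and "x is even") and their polynomials are assumptions picked for illustration; the function `check` compares the combined polynomials against the direct logical combination for small $x$.

```python
def p1(x, y):   # P1(x): "x is a perfect square"  (illustrative choice)
    return x - y * y

def p2(x, y):   # P2(x): "x is even"              (illustrative choice)
    return x - 2 * y

def conj(x, y1, y2):   # sum of squares encodes P1 AND P2
    return p1(x, y1) ** 2 + p2(x, y2) ** 2

def disj(x, y1, y2):   # product encodes P1 OR P2
    return p1(x, y1) * p2(x, y2)

def exists_root(poly, x, bound):
    return any(poly(x, y1, y2) == 0
               for y1 in range(bound + 1) for y2 in range(bound + 1))

def check(limit=30):
    for x in range(limit):
        # Witnesses never exceed x for these two polynomials, so the bound x suffices.
        is_square = any(p1(x, y) == 0 for y in range(x + 1))
        is_even = any(p2(x, y) == 0 for y in range(x + 1))
        if exists_root(conj, x, x) != (is_square and is_even):
            return False
        if exists_root(disj, x, x) != (is_square or is_even):
            return False
    return True
```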

Now, we can go through the proof of the fact that every Turing computable function
is recursive. Since the encoding of an arbitrary Turing machine in terms of recursive
functions uses only functions which we now know to be diophantine, one can
with a little bit of effort see that the predicate “the Turing machine halts” is a diophantine
predicate. Following these lines leads to two remarkable consequences:

Theorem Every recursively enumerable set is diophantine.

Theorem There is an $n \in \mathbb{N}$ and a polynomial $p$ with integer coefficients such that for
any recursively enumerable set $S \subseteq \mathbb{N}$ there exists an $s \in \mathbb{N}$ so that

$$S = \{x \in \mathbb{N} \mid \exists y \in \mathbb{N}^n : p(s, x, y) = 0\}.$$

Denoting by $(n, d)$ the number of variables and the maximal degree of such a polynomial,
universal polynomials are known ranging from $(n, d) = (58, 4)$ (note that the possibility of
having $d = 4$ follows from Skolem's trick) to $(n, d) = (9, 1.6 \times 10^{45})$.

The fact that diophantine sets and recursively enumerable sets are the same leads to

the sought undecidability of Hilbert’s tenth problem:


Corollary Let P be the class of polynomials with integer coefficients and of degree at most

four. Then,

1. There is no algorithm which upon input of any element p ∈ P decides whether or

not p has an integral root, and

2. there is no algorithm which upon input of any element p ∈ P decides whether or not

p has a non-negative integral root.

Proof : Assume there were an algorithm for deciding the existence of integral roots. Then there
would be one for non-negative integral roots as well: by Lagrange's four-square theorem
every non-negative integer is a sum of four integer squares, so replacing each variable
$y_i$ of $p$ by $a_i^2 + b_i^2 + c_i^2 + d_i^2$ yields a polynomial $p'$ with
$0 \in p(\mathbb{N}^n) \Leftrightarrow 0 \in p'(\mathbb{Z}^{4n})$.

Then for any diophantine set $S = \{x \in \mathbb{N}^k \mid \exists y \in \mathbb{N}^m : p(x, y) = 0\}$, the hypothetical
algorithm could be used in order to decide $x \in S$ for any $x$. In other words, every
diophantine set would be a recursive set. However, we know that there are non-recursive
sets among the recursively enumerable sets. And since the latter are exactly the diophantine
sets, the assumption of such an algorithm leads to a contradiction. The fact that we can
restrict ourselves to degree at most four follows from Skolem's lemma.
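The four-square substitution used in the first step of the proof can be sketched in a few lines of Python. The two one-variable polynomials below and the finite search boxes are assumptions made for illustration only; the point is that a root of $p$ in the naturals corresponds to a root of $p'$ in the integers, while an integer root of $p$ that is negative does not carry over.

```python
from itertools import product

def lift(p):
    """Turn a one-variable p over N into p' over Z^4 via y = a^2 + b^2 + c^2 + d^2."""
    return lambda a, b, c, d: p(a * a + b * b + c * c + d * d)

def has_root_in_box(poly, arity, values):
    """Brute-force root search over a finite box (only a finite sanity check)."""
    return any(poly(*point) == 0 for point in product(values, repeat=arity))

def p_a(y):   # illustrative: root y = 7 lies in N
    return y - 7

def p_b(y):   # illustrative: only root is y = -3, which is not in N
    return y + 3

naturals = range(0, 11)
integers = range(-3, 4)

a_natural = has_root_in_box(p_a, 1, naturals)          # root in N
a_lifted = has_root_in_box(lift(p_a), 4, integers)     # e.g. 7 = 4 + 1 + 1 + 1
b_natural = has_root_in_box(p_b, 1, naturals)          # no root in N
b_lifted = has_root_in_box(lift(p_b), 4, integers)     # a sum of squares plus 3 is never 0
b_integer = has_root_in_box(p_b, 1, range(-5, 6))      # but p_b itself has an integer root
```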

While for polynomials with maximal degree two, there exists such an algorithm, the

case of maximal degree three is still open. Similarly, for rational (rather than integral)

roots, decidability is an open problem. For real roots, on the other hand, a result of

Tarski implies that the problem then becomes decidable.

One can extend the above undecidability result in the
following direction: let $C$ be any set of cardinal numbers $\leq \aleph_0$ which is neither empty
nor contains all such cardinal numbers. Then the question of whether or not the
number of non-negative integral roots of a polynomial is in $C$ turns out to be undecidable as
well. The proof is a reduction from the case $C = \{0\}$, which is the undecidability of Hilbert's tenth problem.


Proposition There is a polynomial $q(y_1, y_2, \ldots, y_n, x)$ with integer coefficients such that
the positive integers in its range are exactly all prime numbers, in the sense that

$$q(\mathbb{N}^{n+1}) \cap (\mathbb{N} \setminus \{0\}) = \text{the set of all primes}.$$

Proof : Primes form a recursively enumerable and thus diophantine set $S$. This implies
that there is a polynomial $p$ with integer coefficients such that
$S = \{x \in \mathbb{N} \mid \exists y \in \mathbb{N}^n : p(x, y) = 0\}$. Defining $q(y, x) := x(1 - p(x, y)^2)$ then gives the sought polynomial, since
this is positive iff $p$ has a root, in which case indeed $q(y, x)$ takes on the value of the
corresponding prime.
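The construction $q(y, x) := x(1 - p(x, y)^2)$ works for any diophantine set, not only for the primes. Since an explicit polynomial for the primes is far too large to reproduce here, the sketch below applies the same trick to the composite numbers, whose defining polynomial $p(x, y_1, y_2) = x - (y_1 + 2)(y_2 + 2)$ is an assumption chosen for illustration.

```python
def p(x, y1, y2):
    # x is composite iff x = (y1 + 2) * (y2 + 2) for some naturals y1, y2.
    return x - (y1 + 2) * (y2 + 2)

def q(y1, y2, x):
    # Positive exactly when p vanishes (and x > 0); then q equals x.
    return x * (1 - p(x, y1, y2) ** 2)

def positive_values(limit):
    """Positive values taken by q on the box [0, limit]^3."""
    box = range(limit + 1)
    return {q(y1, y2, x) for y1 in box for y2 in box for x in box
            if q(y1, y2, x) > 0}

composites_up_to_30 = sorted(positive_values(30))
```

On this finite box the positive values of $q$ are exactly the composite numbers up to 30; all other values of $q$ are zero or negative.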

Following the remark regarding universal polynomials, we obtain that for prime number

producing polynomials ten variables suffice. A similar construction leads to the following:

Proposition Let $f : \mathbb{N} \to \mathbb{N}$ be any partial recursive function. There exists a polynomial
$q$ with integer coefficients such that for all $x, y \in \mathbb{N}$:

$$y = f(x) \Leftrightarrow \exists x_0, \ldots, x_n \in \mathbb{N} : y = q(x, x_0, \ldots, x_n).$$

Proof : The graph of $f$ is recursively enumerable and thus diophantine. So $y = f(x)$ holds
iff for a certain polynomial $p$ we have $\exists x_0, \ldots, x_n : (1 - p(x_0, \ldots, x_n, x)^2) > 0 \wedge x_0 = y$.
This in turn is equivalent to $\exists x_0, \ldots, x_n : (x_0 + 1)(1 - p(x_0, \ldots, x_n, x)^2) = y + 1$.
Therefore the sought polynomial can be defined as
$q(x, x_0, \ldots, x_n) := (x_0 + 1)(1 - p(x_0, \ldots, x_n, x)^2) - 1$.

Application of Solution

A particular form of Godel’s incompleteness theorem is also a consequence of the
Matiyasevich/MRDP Theorem:

Let $p(a, x_1, \ldots, x_k) = 0$ provide a Diophantine definition of a non-computable set. Let
$A$ be an algorithm that outputs a sequence of natural numbers $n$ such that the corresponding
equation $p(n, x_1, \ldots, x_k) = 0$ has no solutions in natural numbers. Then there is a
number $n_0$ which is not output by $A$ while in fact the equation $p(n_0, x_1, \ldots, x_k) = 0$ has
no solutions in natural numbers.


To see that the theorem is true, it suffices to notice that if there were no such number $n_0$,
one could algorithmically test membership of a number $n$ in this non-computable set by
simultaneously running the algorithm $A$ to see whether $n$ is output while also checking all
possible $k$-tuples of natural numbers seeking a solution of the equation $p(n, x_1, \ldots, x_k) = 0$.

We may associate an algorithm $A$ with any of the usual formal systems such as Peano
Arithmetic or ZFC by letting it systematically generate consequences of the axioms and
then output a number $n$ whenever a sentence of the form

$$\neg \exists x_1, x_2, \ldots, x_k \, [p(n, x_1, x_2, \ldots, x_k) = 0]$$

is generated. Then the theorem tells us that either a false statement of this form is proved
or a true one remains unproved in the system in question.


Chapter 6

Wang’s Tile Problem

Suppose we want to cover the plane with decorated square tiles of the same size. Tiles
are to be chosen from a finite number of types, and an unlimited supply of tiles of each type
is available. Due to the decorations, however, there are local constraints on which tiles can
be put next to each other for the tiling to look appealing. Is it possible to cover the whole
plane with tiles of the given types? What if we require a certain tile to be used at least once?
Can they be used to tile a finite rectangular area with a certain boundary condition? It
turns out that these problems, as formulated by Hao Wang, are all undecidable.
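For a fixed finite square, the local matching constraints can be checked by exhaustive search. The following Python sketch is a plain backtracking search under stated assumptions: tiles are represented as hypothetical tuples `(north, east, south, west)` of edge colours, and no boundary condition is imposed. It says nothing about tiling the whole plane, which is the undecidable question.

```python
def tile_square(tiles, n):
    """Backtracking search for an n-by-n tiling; tiles are (north, east, south, west)."""
    grid = [[None] * n for _ in range(n)]

    def fits(tile, r, c):
        north, _, _, west = tile
        if c > 0 and grid[r][c - 1][1] != west:     # left neighbour's east edge
            return False
        if r > 0 and grid[r - 1][c][2] != north:    # upper neighbour's south edge
            return False
        return True

    def place(k):
        if k == n * n:
            return True
        r, c = divmod(k, n)
        for tile in tiles:
            if fits(tile, r, c):
                grid[r][c] = tile
                if place(k + 1):
                    return True
        grid[r][c] = None
        return False

    return grid if place(0) else None

# Two tiles that force a checkerboard pattern (illustrative example).
A = (1, 1, 2, 2)
B = (2, 2, 1, 1)
checkerboard = tile_square([A, B], 3)

# A single tile whose east and west colours differ cannot sit next to itself.
impossible = tile_square([(0, 0, 1, 1)], 2)
```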

In 1961, Wang conjectured that if a finite set of tiles can tile the plane, then there

exists also a periodic tiling, i.e., a tiling that is invariant under translations by vectors in

a 2-dimensional lattice, like a wallpaper pattern. He also observed that this conjecture

would imply the existence of an algorithm to decide whether a given finite set of tiles can

tile the plane.

This conjecture was refuted by Berger in 1966. He showed that any Turing machine
can be translated into a Wang tile set such that the tile set tiles the plane if and
only if the Turing machine never halts. The halting problem is undecidable and thus
so is Wang’s original problem.

Berger constructed the first aperiodic tile set, consisting of 20426 tiles. This number was
reduced repeatedly, often by well-known scientists such as Donald Knuth. A set of 13 tiles
over 5 colors was long the smallest known aperiodic set of Wang tiles; Jeandel and Rao later
found a set of 11 tiles over 4 colors and showed that no smaller aperiodic set exists.


Proving the undecidability of Tiling problem

To prove that the tiling problem is undecidable, we reduce the halting problem for
Turing machines to it. The encoding is natural and intuitive. Let
$M = (\Sigma, Q, \delta, q_0, q_F, B)$ be a Turing machine with a one-way-infinite tape, where $\Sigma$ is the
alphabet, $Q$ the set of states, $\delta : (Q \setminus \{q_F\}) \times \Sigma \to \Sigma \times \{L, R\} \times Q$ the transition function,
$q_0 \in Q$ the initial state, $q_F \in Q$ the final state, and $B \in \Sigma$ the blank symbol. We construct
a set $T_M$ of Wang tiles as shown in Figures A, B and C and described below.

Fig. A. Alphabet tile. B. Merging tile. C. Action tile.

• For any letter a ∈ Σ, we have a tile of the form depicted in Figure A. These are to

pass the content of an inactive cell of the tape one row upward (which corresponds

to one step later in the computation process of the machine).

• For any state q ∈ Q and letter a ∈ Σ, where δ(q, a) = (b,D, p), there is a tile like
one of those in Figure C, depending on whether D = R or D = L. These correspond
to the action of the transition function and to passing the new state to a neighbor cell.

• The state p received from a neighbor cell is combined with the current content of
the cell by a tile of the form shown in Figure B and passed to the upper tile to be
processed in the next step.

Tiles for fixing the initial configuration of a Turing machine.

(a) Head position. (b) Empty cells.

• The initial configuration of the machine is fixed by the tiles in the second figure. Here
* is a new colour, used to ensure that only one head appears on the tape.

Theorem Plane Tiling Problem is undecidable.

Proof : Let M be a Turing Machine - an instance of the Halting Problem. We use the

tile set TM , constructed above, and a tile set TR, to construct a new set of tiles T that

can tile the plane, if and only if, M on empty input never halts.

Each tile in T is basically a pair (s, t), where s ∈ TR and t is either in TM or is of an
auxiliary type, used to pass information along a vertical or horizontal line. The two components
are interpreted as layers of the tile. Let s0 ∈ TR be any tile that represents the lower-left
corner of a connected region. For any such tile, we place a tile (s0, t0) in T, where t0
is the marked tile of TM as defined above. This is the only pair in which t0 appears,
and it is supposed to trigger the start of the simulation in each connected region. Any other tile
t ∈ TM \ {t0} is paired with tiles s ∈ TR that represent a cell fully inside a connected region.

Let us say a row (or column) of a region is free if it does not pass through a hole.

Otherwise, the row (or column) is blocked by the hole. The intersection of a free row and

free column is a free cell. Only free cells are used for simulation. A cell that belongs to

a blocked row (resp. a blocked column) simply passes the information along the edge of


the blocking hole. More clearly, a cell whose lower (or upper) edge touches a hole simply

passes the colour of its left edge to the right, and asks its upper (resp. lower) neighbor to

do the same. Similarly, a cell whose left (resp. right) edge touches a hole passes the colour

of its lower edge to the upper edge and asks its right (resp. left) neighbor to do the same.

Now, it should be clear that the above-described tiles can tile a connected region C

of the decoration, if and only if, TM can tile a square with the same area as C, provided

t0 is used in the lower-left corner. Since the entire decoration provides connected regions

of arbitrarily large net area, using the Extension Lemma, we conclude that the plane can

be tiled by the tiles in T , if and only if, TM can tile an upper-right quarter of the plane

with t0 in the lower-left corner. This completes the proof.


Chapter 7

Busy Beaver

A busy beaver is a Turing machine that attains the maximum “operational busyness”

among all the Turing machines in a certain class. The Turing machines in this class must

meet certain design specifications and are required to eventually halt after being started

with a blank tape. A busy beaver function quantifies these upper limits on a given type

of “operational busyness”, and is a noncomputable function. In fact, a busy beaver function

can be shown to grow faster asymptotically than does any computable function.

To understand the busy beaver function we need to make a few changes to our Turing machine
and its functioning. So we define our Turing machine as follows:

A Turing machine here is a device consisting of

1. a tape whose cells hold symbols from $T = \{0, 1\}$; we say that a cell is blank if it contains a zero,

2. a read/write head,

3. a finite set of internal states $Q = \{0, \ldots, n\}$,

4. a list of instructions, typically of the form $(t, q) \mapsto (t', q', D)$ with $D \in \{R, L\}$,

where $\{R, L\}$ are the directions in which the head can move. We will sometimes write $M^{(n)}$ if
we want to make the number of internal states explicit.

Example 1: A TM $M^{(k+1)}$ which writes $k$ ones onto a blank tape and then halts above
the leftmost one. $M : (0, t) \mapsto (1, t + 1, L)$ for $t = 0, \ldots, k - 1$ and $M : (0, k) \mapsto (0, k + 1, R)$.


We will use the convention that the initial state is q = 0 and the last state q = n is

the halting state - the only state upon which the machine halts. Since this results in n

active states, the machine is called an n-state Turing machine. Mathematically, the set of

instructions characterizing a Turing machine is a map,

M : T ×Q −→ T ×Q× {R,L}

In order to talk about Turing machines as devices which compute functions of the
form $f : \mathbb{N}^k \to \mathbb{N}$, we need to specify some conventions about how input and output are
represented. We will use unary encoding for both of them. That is, a number $x \in \mathbb{N}$ will
be represented by $x + 1$ consecutive 1s on the tape with the rest of the tape blank (e.g.,
2 would correspond to 0 . . . 01110 . . . 0). Similarly, $(x_1, \ldots, x_k) \in \mathbb{N}^k$ will be represented by
$k$ such blocks of 1s separated by single zeros (e.g. (0, 2) would be 0 . . . 0101110 . . . 0). A
Turing machine $M_f^{(n)}$ is then said to compute the function $f : \mathbb{N}^k \to \mathbb{N}$ iff the machine,
starting with the head placed on the leftmost 1 of the unary encoding of $x \in \mathbb{N}^k$, eventually
halts on the leftmost 1 of the encoded $f(x)$ if $x \in \mathrm{dom}(f)$, and never halts if $x \notin \mathrm{dom}(f)$.

Definition A function $f : \mathbb{N}^k \to \mathbb{N}$ is called Turing computable iff there is an n-state
Turing machine (TM) for some finite n which computes f in the sense that the Turing
machine halts for every input $x \in \mathrm{dom}(f)$ with the tape eventually representing $f(x)$, and
it doesn’t halt if $x \notin \mathrm{dom}(f)$. Here, input and output are encoded in the above specified
unary way, and at the start and (potential) end of the computation the head of the TM
should be positioned above the leftmost non-blank symbol of the tape.

Example 2: the successor function $s(x) = x + 1$ can be computed by the following 2-state
Turing machine $M_{x+1}^{(2)}$: $(1, 0) \mapsto (1, 0, R)$, $(0, 0) \mapsto (1, 1, L)$, $(1, 1) \mapsto (1, 1, L)$,
$(0, 1) \mapsto (0, 2, R)$.

Example 3: the zero function $z(x) = 0$ can be computed by a 2-state TM:
$(1, 0) \mapsto (0, 0, R)$, $(0, 0) \mapsto (1, 1, R)$, $(0, 1) \mapsto (0, 2, L)$.

Example 4: the following 5-state TM $M_{2x}^{(5)}$ implements $x \mapsto 2x$: $(0, 0) \mapsto (0, 3, R)$,
$(1, 0) \mapsto (0, 1, L)$, $(0, 1) \mapsto (1, 2, R)$, $(1, 1) \mapsto (1, 1, L)$, $(0, 2) \mapsto (1, 0, R)$,
$(1, 2) \mapsto (1, 2, R)$, $(0, 3) \mapsto (0, 3, L)$, $(1, 3) \mapsto (0, 4, L)$, $(0, 4) \mapsto (0, 5, R)$,
$(1, 4) \mapsto (1, 4, L)$.
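The instruction tables of Examples 2 to 4 can be executed directly. The following Python sketch is a minimal simulator for the conventions above: instructions map (symbol, state) to (symbol, state, move), the start state is 0, and the machine halts in its last state. The helper names `run` and `compute` and the step budget are assumptions of this sketch, not part of the text.

```python
from collections import defaultdict

# Instruction tables in the document's notation: (symbol, state) -> (symbol, state, move).
SUCC = {(1, 0): (1, 0, "R"), (0, 0): (1, 1, "L"),
        (1, 1): (1, 1, "L"), (0, 1): (0, 2, "R")}            # Example 2, halting state 2
ZERO = {(1, 0): (0, 0, "R"), (0, 0): (1, 1, "R"),
        (0, 1): (0, 2, "L")}                                  # Example 3, halting state 2
DOUBLE = {(0, 0): (0, 3, "R"), (1, 0): (0, 1, "L"), (0, 1): (1, 2, "R"),
          (1, 1): (1, 1, "L"), (0, 2): (1, 0, "R"), (1, 2): (1, 2, "R"),
          (0, 3): (0, 3, "L"), (1, 3): (0, 4, "L"), (0, 4): (0, 5, "R"),
          (1, 4): (1, 4, "L")}                                # Example 4, halting state 5

def run(machine, halt_state, tape, head=0, max_steps=100_000):
    """Run until the halting state is reached; return the final head position."""
    state, steps = 0, 0
    while state != halt_state:
        if steps == max_steps:
            raise RuntimeError("step budget exhausted")
        symbol, state, move = machine[(tape[head], state)]
        tape[head] = symbol
        head += 1 if move == "R" else -1
        steps += 1
    return head

def compute(machine, halt_state, x):
    """Apply the machine to the unary encoding of x and decode the result."""
    tape = defaultdict(int)
    for i in range(x + 1):          # x is encoded as x + 1 consecutive 1s
        tape[i] = 1
    head = run(machine, halt_state, tape)
    ones = sorted(i for i, v in tape.items() if v == 1)
    if ones != list(range(ones[0], ones[0] + len(ones))):
        raise ValueError("output is not a single block of 1s")
    if head != ones[0]:
        raise ValueError("head is not on the leftmost 1")
    return len(ones) - 1
```

For instance, `compute(SUCC, 2, 3)` decodes the tape left by the successor machine on input 3, and `compute(DOUBLE, 5, 3)` does the same for the doubling machine.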


After defining the new Turing machine, we now define the concatenation of two Turing
machines. Let $M_f^{(n_f)}$ and $M_g^{(n_g)}$ be two TMs with $n_f$, $n_g$ internal states, computing
functions $f$ and $g$ respectively. Then we can define a new $(n_f + n_g)$-state TM $M_{gf}^{(n_f + n_g)}$
via

$$M_{gf}(t, q) := \begin{cases} M_f(t, q), & q < n_f \\ M_g(t, q - n_f), & q \geq n_f \end{cases}$$

where the states output by $M_g$ are likewise shifted by $n_f$.

Its action will be such that it first computes f(x) and then uses the resulting output

as an input for g. Hence, Mgf computes the concatenation corresponding to x 7→ g(f(x))

for which we will also write gf(x). Note that the possibility of concatenating two TMs

in this way builds up on our requirements that the output of a computation has to be

encoded in unary on the tape and that the TM (if ever) halts with the head positioned on

the leftmost 1.
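The concatenation can be sketched in Python as a merge of two instruction tables in which the second machine's states are shifted by $n_f$. The helper names `concat`, `writer` and `ones_on_blank_tape` are hypothetical; the tables are those of Examples 1, 2 and 4.

```python
SUCC = {(1, 0): (1, 0, "R"), (0, 0): (1, 1, "L"),
        (1, 1): (1, 1, "L"), (0, 1): (0, 2, "R")}            # 2 active states
DOUBLE = {(0, 0): (0, 3, "R"), (1, 0): (0, 1, "L"), (0, 1): (1, 2, "R"),
          (1, 1): (1, 1, "L"), (0, 2): (1, 0, "R"), (1, 2): (1, 2, "R"),
          (0, 3): (0, 3, "L"), (1, 3): (0, 4, "L"), (0, 4): (0, 5, "R"),
          (1, 4): (1, 4, "L")}                                # 5 active states

def writer(k):
    """Example 1: writes k ones on a blank tape, k + 1 active states."""
    machine = {(0, t): (1, t + 1, "L") for t in range(k)}
    machine[(0, k)] = (0, k + 1, "R")
    return machine

def concat(mf, nf, mg):
    """Run mf (halting state nf) first, then mg with all of its states shifted by nf."""
    combined = dict(mf)
    for (t, q), (t2, q2, move) in mg.items():
        combined[(t, q + nf)] = (t2, q2 + nf, move)
    return combined

def ones_on_blank_tape(machine, halt_state, max_steps=100_000):
    """Run on an initially blank tape and count the 1s left at halting."""
    tape, head, state, steps = {}, 0, 0, 0
    while state != halt_state:
        if steps == max_steps:
            raise RuntimeError("step budget exhausted")
        symbol, state, move = machine[(tape.get(head, 0), state)]
        tape[head] = symbol
        head += 1 if move == "R" else -1
        steps += 1
    return sum(tape.values())

# Write three 1s, then double, then add one: 4 + 5 + 2 = 11 states in total.
chain = concat(concat(writer(3), 4, DOUBLE), 9, SUCC)
total_ones = ones_on_blank_tape(chain, 11)
```

The hand-over works because the halting state of the first machine coincides with the shifted start state of the second, and because each machine leaves the head on the leftmost 1.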

Now we move on to defining the busy beaver function. Let us assign a number $B(M) \in \mathbb{N}$
to every Turing machine $M$ by considering its behavior when run on an initially blank
tape. We set $B(M) := 0$ if $M$ never halts and $B(M) := b$ if it halts and the total number
of (not necessarily consecutive) 1s eventually written on the tape is $b$. Based on this we
can define the busy beaver function $BB : \mathbb{N} \to \mathbb{N}$,

$$BB(n) := \max\{B(M) \mid M \in \{M^{(n)}\}\}.$$

BB(n) is defined as the largest number of 1s eventually written on an initially blank

tape by any n-state TM which halts. Note that the function is well-defined since the

maximum is taken over a finite set.
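For very small n the maximum can be explored by brute force. The sketch below enumerates all n-state machines in the convention above and runs each for a bounded number of steps. The step cutoff is an assumption of the sketch: machines that have not halted by then are treated as non-halting, so in general the result is only a lower bound on BB(n). For n = 1 and n = 2 it reproduces the known values.

```python
from itertools import product

def ones_written(machine, n, max_steps):
    """Number of 1s on the tape if the machine halts within max_steps, else 0."""
    tape, head, state = {}, 0, 0
    for _ in range(max_steps):
        if state == n:                      # state n is the halting state
            break
        symbol, state, move = machine[(tape.get(head, 0), state)]
        tape[head] = symbol
        head += move
    if state != n:
        return 0                            # treated as non-halting (cutoff)
    return sum(tape.values())

def bb_lower_bound(n, max_steps=50):
    """Maximise over all instruction tables T x {0..n-1} -> T x {0..n} x {R, L}."""
    keys = [(t, q) for q in range(n) for t in (0, 1)]
    actions = [(t, q, d) for t in (0, 1) for q in range(n + 1) for d in (1, -1)]
    best = 0
    for choice in product(actions, repeat=len(keys)):
        best = max(best, ones_written(dict(zip(keys, choice)), n, max_steps))
    return best
```

The number of machines grows as $(4(n+1))^{2n}$, so this direct approach is only practical for n up to 2 or so, and no fixed cutoff can work for all n since that would make BB computable.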

Lemma The busy beaver function is strictly monotonically increasing:

$BB(n + 1) > BB(n)$ for all $n \in \mathbb{N}$.

Proof Denote the TM which achieves $BB(n)$ by $M^{(n)}$. Based on this we can define an
$(n + 1)$-state TM $M^{(n+1)}$ whose instructions equal those of $M^{(n)}$ for all internal states
$q < n$ and which in addition follows the rule $(t, n) \mapsto (1, n + 1 - t, R)$. By construction
$BB(n) + 1 = B(M^{(n+1)}) \leq BB(n + 1)$.


This leads us to a common property of all Turing computable functions that they

cannot grow faster than BB.

Theorem Let f : N −→ N be any function which is Turing computable by a k-state TM

Mf . Then for all x > 2k + 13 for which f is defined we have f(x) < BB(x).

Proof We utilize concatenation of the above discussed examples and define a
$(k + n + 8)$-state TM

$$M_{f(2n+1)} := M_f^{(k)} M_{x+1}^{(2)} M_{2x}^{(5)} M^{(n+1)}$$

with $M^{(n+1)}$ being the TM which writes n consecutive 1s.

Running Mf(2n+1) on the blank tape then produces f(2n + 1) consecutive ones before

halting. Thus,

f(2n+ 1) ≤ BB(k + n+ 8).

Moreover, monotonicity of BB implies

BB(k + n+ 8) < BB(2n+ 1) if k + 7 < n.

or, f(2n+ 1) < BB(2n+ 1) if k + 7 < n.

Similarly, we can construct a $(k + n + 6)$-state TM

$$M_{f(2n)} := M_f^{(k)} M_{2x}^{(5)} M^{(n+1)},$$

where every component is the same as defined above. Using the monotonicity of BB we
obtain

f(2n) < BB(2n) if k + 6 < n.

Every argument x is either even (x = 2n) or odd (x = 2n + 1), so from the above two
results we can generalize that f(x) < BB(x) holds for
x > 2k + 13.

The above theorem is known as Rado’s theorem. From its proof we
see that the statement of the theorem would still hold if we required a single
block of consecutive 1s in the definition of BB, rather than counting all 1s on the tape.


Having defined everything required, we now prove that the busy beaver function is not
Turing computable. If BB were computable, then there would be a $k \in \mathbb{N}$ and a k-state TM
computing BB, so that by Rado’s theorem BB(x) < BB(x) for all sufficiently large x,
a contradiction. Hence the busy beaver function is not computable.

This is the formalized version of the following more vague statement:

“if BB(x) is the largest finite number which can be written by an algorithm of length x,

then there cannot be a single, finite algorithm which computes BB(x) for all x.”

The fact that BB is not Turing computable doesn’t mean that BB(x) cannot be

computed for given x. Rado’s theorem just tells us that the complexity of the TM has to

increase unboundedly with x. In fact, BB(x) is known for small values of x,

for x=1, BB(x) = 1

for x=2, BB(x) = 4

for x=3, BB(x) = 6

for x=4, BB(x) = 13

for x=5, BB(5) ≥ 4098

for x=6, $BB(6) \geq 3.5 \times 10^{18267}$.


Chapter 8

Some More Undecidable Problems

In this chapter we will see some famous undecidable problems. These problems come from
different fields of mathematics such as Mathematical Logic, Matrix Theory and Group Theory.
We will also prove some of them undecidable. Some of the proofs are very complex and
require advanced knowledge of their field, so they are omitted.

Entscheidungsproblem

The Entscheidungsproblem asks for an algorithm that takes as input a statement

of a first-order logic (possibly with a finite number of axioms beyond the usual axioms

of first-order logic) and answers "Yes" or "No" according to whether the statement is

universally valid.

By the completeness theorem of first-order logic, a statement is universally valid if

and only if it can be deduced from the axioms, so the Entscheidungsproblem can also be

viewed as asking for an algorithm to decide whether a given statement is provable from

the axioms using the rules of logic.

Suppose that we had a general decision algorithm for statements in a first-order

language. The question whether a given Turing machine halts or not can be formulated

as a first-order statement, which would then be susceptible to the decision algorithm. But

we know that no general algorithm can decide whether a given Turing machine halts.


Mortality problem

The Mortality Problem for Turing machines with an infinite input tape is the problem

to determine, for an arbitrary machine M, whether or not M eventually halts no matter

in what configuration it is started. This is not the Halting Problem, since it means that

we cannot just consider well-behaved machines that always start in their start states,

positioned to the right of their arguments and which always end up to the right of the

answer, which immediately follows these arguments (a convention called Standard Turing

Computation). It also means that we might start with an infinite number of marked

squares on the tape, unlike a normal Turing machine, which must start with its tape only

finitely marked.

As is commonly done with Turing machines, we can, without loss of generality, limit the

tape alphabet to {0,1}, where 0 denotes a blank, and 1 is the only mark (non-blank). Using

that limitation on the tape alphabet, consider a function to compute x+ 1 from x, using

Standard Turing Computation and unary representations of numbers. Such a machine

could copy its one argument to the immediate right of the original scanned square and

then move to the end of the copy appending a 1.

This machine always halts if it is started on a finitely marked tape, with the Standard

Turing conventions obeyed. In fact, it can be written so it will always halt so long as the

tape is finitely marked, even if the machine is started in other than the correct state and

other than on the correct square. However, this machine is not mortal since, for example,

it would run forever if started just to the right of an infinite sequence of 1s; the copy

operation could never end.

Philip K. Hooper proved in 1966 that the mortality problem is undecidable. However,

it can be shown that the set of Turing machines which are mortal i.e. halt on every starting

configuration is recursively enumerable.


Word Problem

The Word Problem for groups is undecidable. This is the problem, given a finite group
presentation and a word, of deciding whether that word is the group identity in that presentation.
A word in the generators of the group represents the identity if and only if it can be
transformed into the empty word by inserting or deleting relators and cancelling adjacent
inverses. The problem can also be viewed as the algorithmic problem
of deciding whether two words in the generators represent the same element.

The basic line of thought is to realize that the set of words forms a semigroup,
which then allows us to extend the undecidability result to presentations of semigroups
and groups. For semigroups it was proven undecidable independently by Post and Markov,
and for groups by Novikov and Boone. The problem is
undecidable because one can encode the halting problem for Turing machines in it. Basically,
for each Turing machine program one can construct a group presentation and a word
such that the program halts if and only if that word is the identity.
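By way of contrast, in the special case of a free group (a presentation with no relators) the word problem is decidable: cancelling adjacent inverse pairs until none remain yields the empty word exactly when the word is the identity. The Python sketch below illustrates only this special case; the encoding of generators as positive integers and of their inverses as the corresponding negative integers is an assumption of the sketch.

```python
def free_reduce(word):
    """Cancel adjacent inverse pairs; generators are 1, 2, ... and inverses -1, -2, ..."""
    stack = []
    for letter in word:
        if stack and stack[-1] == -letter:
            stack.pop()           # x followed by its inverse cancels
        else:
            stack.append(letter)
    return stack

def is_identity_in_free_group(word):
    return free_reduce(word) == []
```

For example, the word a b b⁻¹ a⁻¹, encoded as `[1, 2, -2, -1]`, reduces to the empty word, while a b a⁻¹, encoded as `[1, 2, -1]`, does not. Once relators are present, no such terminating procedure exists in general.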

Conjugacy Problem

The conjugacy problem for a group G with a given presentation is the decision problem
of determining, given two words x and y in G, whether or not they represent conjugate
elements of G. That is, the problem is to determine whether there exists an element z
of G such that $y = z x z^{-1}$. The geometric motivation is to decide whether two loops are
freely homotopic.

The conjugacy problem is undecidable. The conjugacy problem for arbitrary presentations
(and not just one fixed presentation) does reduce to the word problem for arbitrary
presentations, since both are equivalent to the halting problem. That is, the halting
problem famously reduces to the word problem, which reduces to the conjugacy problem
(a word represents the identity if and only if it is conjugate to the identity), and the
conjugacy problem in turn reduces to the halting problem.

Isomorphism Problem

The isomorphism problem is to find an algorithm to determine whether two finite
presentations give isomorphic groups. The geometric motivation is to see whether one can
algorithmically distinguish spaces based on their fundamental groups. The undecidability
of the isomorphism problem can be proved using the Adjan-Rabin theorem. This theorem gives a
general construction which can be applied to any Markov property P of finitely presented
groups to prove undecidability. Also, all varieties of solvable groups with undecidable
word problem have undecidable isomorphism problem.

Undecidable statements in ZFC

Assuming that ZFC is consistent, the areas and statements mentioned below involve propositions
that are provably undecidable in ZFC (the Zermelo-Fraenkel axioms plus the axiom of choice):

1. Set theory of the real line: It is an area of mathematics concerned with the applica-

tion of set theory to aspects of the real numbers. Axiomatic set theory, by Godel’s

incompleteness theorem, contains propositions that are undecidable. Also, the real

numbers are most often formalized using the Zermelo Fraenkel axiomatization of set

theory.

2. Axiom of Constructibility: It is a possible axiom for set theory in mathematics
that asserts that every set is constructible. It implies the axiom of choice over ZF
set theory.

3. Continuum Hypothesis: It states that there is no infinite set with a cardinal number
between that of the “small” infinite set of integers and the “large” infinite set of real
numbers (the continuum).

It was shown by Godel that no contradiction would arise if the continuum hypothesis
were added to conventional ZFC set theory. Using a technique called
forcing, Paul Cohen proved that no contradiction would arise either if the negation of the
continuum hypothesis were added to set theory. Together, these results establish
that the continuum hypothesis can be neither proved nor refuted from the ZFC axioms,
and is therefore undecidable in ZFC.

4. Group Theory: It studies the algebraic structures known as groups. It is an undecid-

able theory. There is no computable process to determine whether a given statement

in the first order language of group theory is true in all groups.


5. Measure Theory: It is about the study of measures. It generalizes the intuitive

notions of length, area, and volume. The earliest and most important examples

are Jordan measure and Lebesgue measure, but other examples are Borel measure,

probability measure, complex measure, and Haar measure.

6. Order Theory: It is a branch of mathematics which investigates our intuitive notion

of order using binary relations. It provides a formal framework for describing

statements such as “this is less than that” or “this precedes that”.

7. Functional Analysis: It is concerned with infinite-dimensional vector spaces (mainly

function spaces) and mappings between them. The spaces may be of different, and

possibly infinite dimensions. These mappings are called operators or, if the range is

on the real line or in the complex plane, functionals.

Matrix - Mortality Problem

Consider a finite set of $d \times d$ matrices $S = \{M_1, \ldots, M_n\} \subset M_d(\mathbb{Z})$ with integer entries.
We call S mortal iff there is a non-empty word $w \in \{1, \ldots, n\}^*$, of length m say, such that
the corresponding product of matrices vanishes:

$$M_{w_1} \cdots M_{w_m} = 0.$$

Example 1 Consider a set consisting of the two matrices

$$\begin{pmatrix} 0 & 1 \\ -3 & 2 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix}.$$

This set cannot be mortal since the matrices have non-zero determinant and the determinant
of any product is just the product of the determinants.

Example 2 The two matrices

$$\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

form a mortal set since their product is a nilpotent matrix whose square vanishes.
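Both examples can be checked with a bounded search over words. The Python sketch below tries all products up to a given length; the function names and the length cutoff are assumptions of the sketch. A bounded search can confirm mortality but can never establish non-mortality on its own, in line with the undecidability result discussed next.

```python
from itertools import product

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def find_mortal_word(mats, max_len):
    """Return the first word (tuple of indices) whose product is zero, or None."""
    for length in range(1, max_len + 1):
        for word in product(range(len(mats)), repeat=length):
            prod = mats[word[0]]
            for i in word[1:]:
                prod = matmul(prod, mats[i])
            if all(v == 0 for row in prod for v in row):
                return word
    return None

example_1 = [[[0, 1], [-3, 2]], [[2, 0], [0, -1]]]
example_2 = [[[0, 0], [0, 1]], [[0, 1], [-1, 0]]]

word_1 = find_mortal_word(example_1, 6)   # determinants are non-zero, so nothing is found
word_2 = find_mortal_word(example_2, 4)   # a vanishing product exists
```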

Before we show that unlike in these simple examples there cannot be a general recipe

for deciding mortality, we will introduce some tools for encoding words into products of

matrices:

For words $w = a_1 \ldots a_m$ over the alphabet $A := \{1, 2, 3\}$ define an injective map
$W(w) := \sum_{k=1}^{m} a_k 4^{m-k}$ from $A^*$ to $\mathbb{N}$. Denote by $|w|$ the length of a word and define a map
from $A^* \times A^*$ into the set of $3 \times 3$ integer matrices by

$$M(u, w) := \begin{pmatrix} 4^{|u|} & 0 & 0 \\ 0 & 4^{|w|} & 0 \\ W(u) & W(w) & 1 \end{pmatrix}.$$

If we use concatenation of words and matrix multiplication as binary operations in the

domain and codomain respectively, then (u,w) 7→ M(u,w) is an injective monoid homo-

morphism. That is, in particular M(u1, v1)M(u2, v2) = M(u1u2, v1v2).

In addition we will need the matrix

$$B := \begin{pmatrix} 1 & 0 & 1 \\ -1 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix},$$

which satisfies $B^2 = B$ and $B M(u, v) B = (4^{|u|} + W(u) - W(v)) B$.

The latter implies that $B M(u, v) B = 0$ iff $W(v) = W(1u)$, which in turn is equivalent
to $v = 1u$. Now let us exploit these relations to prove the following:
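The identities used here are easy to verify numerically. The Python sketch below represents words as tuples over {1, 2, 3} and takes B to be the matrix with rows (1, 0, 1), (−1, 0, −1), (0, 0, 0), which is an assumption of the sketch chosen so that both stated identities hold; the sample words are arbitrary.

```python
def W(w):
    """Base-4 value of a word over {1, 2, 3}: sum of a_k * 4^(m-k)."""
    value = 0
    for a in w:
        value = 4 * value + a
    return value

def M(u, w):
    return [[4 ** len(u), 0, 0],
            [0, 4 ** len(w), 0],
            [W(u), W(w), 1]]

# Assumed form of B (rows as listed in the lead-in).
B = [[1, 0, 1], [-1, 0, -1], [0, 0, 0]]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def bmb(u, v):
    return mul(mul(B, M(u, v)), B)

u1, v1, u2, v2 = (2, 3), (3,), (1,), (2, 2)   # arbitrary sample words
homomorphism_ok = mul(M(u1, v1), M(u2, v2)) == M(u1 + u2, v1 + v2)
idempotent_ok = mul(B, B) == B
formula_ok = bmb(u1, v1) == scale(4 ** len(u1) + W(u1) - W(v1), B)
vanishes = bmb((2, 3), (1, 2, 3))             # here v = 1u, so the product is zero
```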

Proposition Let $k \in \mathbb{N}$ be such that PCP with k “dominos” is undecidable. Then there
is no algorithm which upon input of a set $S \subset M_3(\mathbb{Z})$ of $2k + 1$ integer matrices decides
whether or not S is mortal.

Proof Consider an undecidable PCP with k dominos and choose {2, 3} as a binary
alphabet for it. Denote by $(x_i, y_i)$ with $i = 1, \ldots, k$ the pairs of words appearing in
the PCP. For each of these k dominos we define two matrices $M_i := M(x_i, y_i)$ and
$M'_i := M(x_i, 1y_i)$. So together with B these form a set S of $2k + 1$ integer matrices.

Now assume that PCP has a solution $w \in \{1, \ldots, k\}^*$. Then

$$B M'_{w_1} M_{w_2} \cdots M_{w_{|w|}} B = 0,$$

so the set S is mortal. Conversely, if S is mortal, then there is a product so that
$B M(u_1, v_1) B M(u_2, v_2) B \cdots B = 0$. Since $B^2 = B$ and each $B M(u_i, v_i) B$ is a multiple of
B, the product can only be zero if for at least one i we have $1u_i = v_i$. Observing that
$u_i \in \{2, 3\}^*$, this implies a solution for PCP.
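The forward direction of the proof can be traced on a toy PCP instance. In the Python sketch below the two dominos are a hypothetical example with solution (1, 2); the matrix B is assumed to have rows (1, 0, 1), (−1, 0, −1), (0, 0, 0), and words are tuples over {1, 2, 3}.

```python
def W(w):
    value = 0
    for a in w:
        value = 4 * value + a
    return value

def M(u, w):
    return [[4 ** len(u), 0, 0], [0, 4 ** len(w), 0], [W(u), W(w), 1]]

B = [[1, 0, 1], [-1, 0, -1], [0, 0, 0]]   # assumed form of B
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

# Hypothetical PCP instance over {2, 3}: top words x_i, bottom words y_i.
dominos = [((2,), (2, 3)), ((3, 2), (2,))]

def product_for(indices):
    """B * M'_{w1} * M_{w2} * ... * M_{w|w|} * B for a candidate index sequence."""
    first_x, first_y = dominos[indices[0]]
    prod = mul(B, M(first_x, (1,) + first_y))      # primed matrix for the first domino
    for i in indices[1:]:
        x, y = dominos[i]
        prod = mul(prod, M(x, y))
    return mul(prod, B)

solution = product_for([0, 1])       # top: 2.32 = 232, bottom: 23.2 = 232
non_solution = product_for([1, 0])   # top: 322, bottom: 223
```

The product vanishes for the matching sequence and not for the mismatching one, as the identity for $B M(u, v) B$ predicts.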

Using that PCP is known to be undecidable for seven dominos, we obtain that matrix

mortality is undecidable for sets of fifteen 3 × 3 matrices. One can trade the number of

matrices with their dimension and show that matrix mortality is undecidable as well for

two 24 × 24 matrices. On the positive side, it is known that it is decidable for two 2 × 2

matrices and for instance for an arbitrary number of upper triangular 2 × 2 matrices.

Without such an additional constraint decidability is, however, not known already for

three 2× 2 matrices with integer coefficients.


Chapter 9

Hypercomputation

Hypercomputation or super-Turing computation refers to models of computation that go beyond, or are incomparable to, Turing computability. This includes various hypothetical methods for the computation of non-Turing-computable functions, following super-recursive algorithms. The difference between the two terms is that super-Turing computation usually implies that the proposed model is supposed to be physically realizable, while hypercomputation does not.

A computational model going beyond Turing machines was introduced by Alan Turing himself. His 1939 paper, Systems of Logic Based on Ordinals, investigated mathematical systems in which an oracle was available, which could compute a single arbitrary (non-recursive) function from naturals to naturals. He used this device to prove that even in those more powerful systems, undecidability is still present. Turing’s oracle machines are strictly mathematical abstractions, and are not physically realizable.

The Church-Turing thesis states that any function that is algorithmically computable can be computed by a Turing machine. Hypercomputers compute functions that a Turing machine cannot, and which are hence not computable in the Church-Turing sense. An example of a problem a Turing machine cannot solve is the halting problem: a Turing machine cannot decide whether an arbitrary program halts or runs forever. Some proposed hypercomputers can simulate the program for an infinite number of steps and tell the user whether or not the program halted.


Hypercomputation Proposals

There are many proposals for hypercomputers, but only a few are discussed here. The Zeno machine, the oracle machine and real computers are three famous proposals, among which the oracle machine is the most important: it was proposed by Alan Turing himself and is one of the oldest models of hypercomputation. In this section we cover these three proposals, starting with the Zeno machine, then moving on to the oracle machine, and ending with real computers.

Zeno Machine

The idea of Zeno machines was first discussed by Hermann Weyl; they are named after the ancient Greek philosopher Zeno of Elea, because their method of computation resembles Zeno’s famous paradox. Zeno machines are also called accelerated Turing machines. They are a hypothetical computational model related to Turing machines that allows a countably infinite number of algorithmic steps to be performed in finite time.

Formally, a Zeno machine is a Turing machine that takes $2^{-n}$ units of time to perform its n-th step. Thus the first step takes 0.5 units of time, the second takes 0.25, the third 0.125 and so on, so that after one unit of time a countably infinite number of steps will have been performed. This machine can do quite a few things that a Turing machine cannot, such as solving the halting problem for Turing machines. Keeping in mind that it is just a hypothetical model, the way a Zeno machine can solve the halting problem is described next.

Suppose we have to decide whether a given machine M halts on an input string w. We construct an accelerated Turing machine M’ as defined above, which is given the description of M together with the finite string w and simulates M on w. If M halts on w, then the simulation ends after some finite number n of steps of M’, that is, after $1 - 2^{-n} < 1$ units of time, so M’ halts in less than one unit of time. Conversely, if M’ has not halted after one unit of time, then M never halts on w. Observing M’ for one unit of time therefore tells us whether M halts on w, and hence the halting problem for Turing machines becomes decidable on an accelerated Turing machine. However, in the spirit of Godel’s theorem, the halting problem for Zeno machines is not solvable by a Zeno machine itself, so undecidability remains in this model of hypercomputer. As we have already seen, however powerful a computer becomes, undecidable statements will always exist.
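The timing argument for Zeno machines rests on the geometric series 1/2 + 1/4 + ... A small Python sketch with exact rational arithmetic illustrates it; the function name `elapsed_after` is an assumption made for this example.

```python
from fractions import Fraction


def elapsed_after(n: int) -> Fraction:
    """Total time used by the first n steps: the sum of 2**-k for k = 1..n."""
    return sum((Fraction(1, 2 ** k) for k in range(1, n + 1)), Fraction(0))


for n in (1, 2, 3, 10, 50):
    # The partial sum equals 1 - 2**-n, so any finite computation ends before time 1.
    assert elapsed_after(n) == 1 - Fraction(1, 2 ** n)
    assert elapsed_after(n) < 1

print(elapsed_after(10))
```

A computation that halts after n steps therefore finishes strictly before one unit of time has passed, while a non-halting computation is still running at every moment before time 1.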


Oracle Machine

In computability theory, an oracle machine is an abstract machine used to study

decision problems. It can be visualized as a Turing machine with a black box, called

an oracle, which is able to decide certain decision problems in a single operation. The

problem can be of any complexity class. Even undecidable problems, like the halting

problem, can be used.

An oracle machine is a Turing machine connected to an oracle. The oracle, in this

context, is thought of as an entity capable of answering some collection of questions, and

usually represented as some subset A of the natural numbers. Intuitively then, the oracle

machine can perform all of the usual operations of a Turing machine, and can also query

the oracle for an answer to a specific question of the form “is x in A?”

Informal Definition

An oracle machine has

1. A work tape, a sequence of cells without beginning or end, each of which may

contain a B (for blank) or a 1.

2. A read/write head, which rests on a single cell of the work tape and can read the data there, write new data, and move left or right along the tape.

3. A control mechanism, which can be in one of a finite number of states, and which will perform different actions like reading data, writing data, moving the heads, and changing states depending on the current state and the data being read.

4. An oracle tape, on which an infinite sequence of B’s and 1’s is printed, correspond-

ing to the characteristic function of the oracle set A.

5. An oracle head, which can move left or right along the oracle tape reading data,

but which cannot write.


Formal definition

An oracle Turing machine is a 4-tuple M=(Q, δ, q0, F) where

1. Q is a finite set of states

2. $\delta : Q \times \{B, 1\}^2 \to Q \times \{B, 1\} \times \{L, R\}^2$ is the transition function, where L is a left shift and R is a right shift.

3. q0 ∈ Q is the initial state

4. F ⊆ Q is the set of halting states.

The oracle machine is initialized with the work tape containing some input with finitely many 1’s and the rest of the tape blank, the oracle tape containing the characteristic function $\chi_A$ of the oracle set A, and the Turing machine in state $q_0$ with the read/write head reading the first nonblank cell of the work tape and the oracle head reading the cell of the oracle tape which corresponds to $\chi_A(0)$. Thereafter it operates according to δ: if the Turing machine is currently in state q, the read/write head is reading a symbol $S_1$, and the oracle head is reading $S_2$, then if $\delta(q, S_1, S_2) = (q', S'_1, D_1, D_2)$, the machine enters state q′, the read/write head writes the symbol $S'_1$ in place of $S_1$, and then the read/write head moves one cell in direction $D_1$ and the oracle head moves one cell in direction $D_2$. At this point, if q′ is a halting state, the machine halts; otherwise it repeats this same procedure.
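The operation just described can be sketched as a small simulator. This is only an illustration under several assumptions: the oracle set A is represented by a Python predicate (necessarily a decidable stand-in, here the even numbers), oracle cells to the left of the cell for χ_A(0) are treated as blank, and the state names `scan`, `read`, `yes`, `no` as well as the step budget are hypothetical. The sample machine halts with a single 1 on the work tape exactly when n is in A.

```python
BLANK = "B"


def run_oracle_machine(delta, q0, halting, n, oracle, max_steps=10_000):
    """Simulate the two-tape oracle machine on the unary input n (n + 1 ones).

    delta maps (state, work_symbol, oracle_symbol) to
    (new_state, written_symbol, work_move, oracle_move) with moves "L" or "R".
    Returns the number of 1's left on the work tape after halting.
    """
    work = {i: "1" for i in range(n + 1)}
    state, wpos, opos = q0, 0, 0
    steps = 0
    while state not in halting:
        if steps == max_steps:
            raise RuntimeError("step budget exhausted")
        s1 = work.get(wpos, BLANK)
        s2 = "1" if opos >= 0 and oracle(opos) else BLANK
        state, written, d1, d2 = delta[(state, s1, s2)]
        work[wpos] = written
        wpos += 1 if d1 == "R" else -1
        opos += 1 if d2 == "R" else -1
        steps += 1
    return sum(1 for symbol in work.values() if symbol == "1")


# Sample machine: erase the input while both heads walk right, step back once so
# that the oracle head rests on the cell for chi_A(n), then copy that cell.
delta = {}
for s2 in ("1", BLANK):
    delta[("scan", "1", s2)] = ("scan", BLANK, "R", "R")
    delta[("scan", BLANK, s2)] = ("read", BLANK, "L", "L")
delta[("read", BLANK, "1")] = ("yes", "1", "R", "R")
delta[("read", BLANK, BLANK)] = ("no", BLANK, "R", "R")


def is_even(x):
    """Decidable stand-in for the oracle set A."""
    return x % 2 == 0


print(run_oracle_machine(delta, "scan", {"yes", "no"}, 4, is_even))
print(run_oracle_machine(delta, "scan", {"yes", "no"}, 3, is_even))
```

With a genuinely undecidable set A the predicate could not be implemented, which is precisely why the oracle is an abstraction rather than a program.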

Having seen how an oracle Turing machine (OTM) works, we now look at how the oracle helps in solving certain decision problems. In this formulation, an OTM is a Turing machine M that has a special read-write tape called M’s oracle tape and three special states $q_{query}$, $q_{yes}$ and $q_{no}$ apart from its other states.

To execute M, we specify the input as usual and a language $O \subseteq \{0, 1\}^*$ that is used as an oracle for M. While performing its computation, if M enters the state $q_{query}$, then M checks whether the contents w of the oracle tape satisfy $w \in O$. If $w \in O$, M moves to the state $q_{yes}$; it moves to $q_{no}$ if $w \notin O$. Regardless of the choice of O, a query like $w \in O$ counts as a single computational step of M. $M^O(x)$ denotes the output of the oracle Turing machine M on input $x \in \{0, 1\}^*$ with $O \subseteq \{0, 1\}^*$ as the oracle language.


Oracle Turing machines can compute functions as follows: if f is a function that takes natural numbers to natural numbers, $M^A$ is a Turing machine with oracle A, and whenever $M^A$ is initialized with the work tape consisting of n + 1 consecutive 1’s (and blank elsewhere) $M^A$ eventually halts with f(n) 1’s on the tape, then $M^A$ is said to compute the function f. A similar definition can be made for functions of more than one variable, or partial functions.

If there is an oracle machine M that computes a function f with oracle A, f is said

to be A-computable. If f is the characteristic function of a set B, B is also said to be

A-computable, and M is said to be a Turing reduction from B to A.
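A Turing reduction can be pictured as a program that treats the oracle as a black-box subroutine. The following Python sketch is a hypothetical illustration: the set B = {n : exactly one of n and n + 1 lies in A} is invented for this example, and the oracle for A is a decidable stand-in (multiples of 3) so that the code can actually run.

```python
from typing import Callable


class CountingOracle:
    """Wraps a membership test for A and counts how often it is queried."""

    def __init__(self, member: Callable[[int], bool]):
        self.member = member
        self.queries = 0

    def __call__(self, x: int) -> bool:
        self.queries += 1          # each query counts as one oracle step
        return self.member(x)


def decide_B(n: int, oracle_A: Callable[[int], bool]) -> bool:
    """Decide membership of n in B using exactly two queries to the oracle for A."""
    return oracle_A(n) != oracle_A(n + 1)


oracle = CountingOracle(lambda x: x % 3 == 0)
print(decide_B(2, oracle), oracle.queries)
```

The procedure `decide_B` never looks inside the oracle, so the same code is a reduction from B to A whatever the set A is, computable or not.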

Halting problem

Suppose we assume the existence of an oracle which computes a non-computable function, such as the answer to the halting problem or some equivalent. A machine with such an oracle is a hypercomputer. But the halting paradox still applies to such machines: although they determine whether particular Turing machines will halt on particular inputs, they cannot determine, in general, whether machines equivalent to themselves will halt. This fact creates a hierarchy of machines, called the arithmetical hierarchy, each with a more powerful halting oracle and an even harder halting problem.

Consider a “super halting problem”: given a Turing machine with an oracle for the halting problem, decide if it halts. We can prove that this super halting problem is unsolvable, even given an oracle for the ordinary halting problem. We simply take Turing’s original proof that the halting problem is unsolvable, and “shift everything up a level” by giving all the machines an oracle for the halting problem. Everything in the proof goes through as before.
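The diagonal argument behind this proof can be sketched in Python. This is an illustration only: `halts` stands for any total candidate decider (which may consult any oracle internally), and the string label "loops" replaces a real infinite loop so that the sketch itself terminates.

```python
def diagonal_disagrees(halts) -> bool:
    """Return True if the candidate decider is wrong about the diagonal program.

    halts(program, argument) is any total candidate for a halting decider.
    """
    def diagonal(program):
        # Do the opposite of what the candidate predicts for program run on itself.
        if halts(program, program):
            return "loops"     # stands in for `while True: pass`
        return "halts"

    predicted = halts(diagonal, diagonal)      # the candidate's verdict
    actual = diagonal(diagonal) == "halts"     # what the diagonal program does
    return predicted != actual


# Two trivial candidates; each one is refuted by its own diagonal program.
print(diagonal_disagrees(lambda program, argument: True))
print(diagonal_disagrees(lambda program, argument: False))
```

Nothing in the construction depends on what resources the candidate uses, which is why the argument carries over unchanged when every machine is given the same oracle.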

Friedberg and Muchnik actually proved a stronger result: there are two problems A and B, both of which are solvable given an oracle for the halting problem, but neither of which is solvable given an oracle for the other. These problems are constructed via an infinite process whose purpose is to kill off every Turing machine that might reduce A to B or B to A. The resulting problems are extremely contrived; they don’t look like anything that might arise in practice.


Real Computers

In computability theory, the theory of real computation deals with hypothetical computing machines using infinite-precision real numbers. They are given this name because they operate on the set of real numbers. These hypothetical computing machines can be viewed as idealised analog computers which operate on real numbers, whereas digital computers are limited to computable numbers. In this section we will not go into the working of a real computer. We will only get a rough idea of computable numbers, and see in what sense real computers or analog computers are more powerful than today’s digital computers.

Computable numbers are also known as the recursive numbers or the computable reals. They are the real numbers that can be computed to within any desired precision by a finite, terminating algorithm. Equivalent definitions can be given using Turing machines or the λ-calculus as the formal representation of algorithms. The computable numbers form a real closed field and can be used in the place of real numbers for many, but not all, mathematical purposes.

Definition using a Turing machine

“A sequence of digits interpreted as a decimal fraction between 0 and 1 is a computable number if there exists a Turing machine which, given n on its initial tape, terminates with the nth digit of that number.”

The key notions in the definition are that some n is specified at the start, for any n

the computation only takes a finite number of steps, after which the machine produces

the desired output and terminates.

From this definition it follows that a real computer, which operates on arbitrary real numbers, would be more powerful than a Turing machine, that is, a hypercomputer, because it would have to handle numbers that cannot be generated by any Turing machine.

This is however not the modern definition which only requires the result be accurate to

within any given accuracy. The informal definition above is subject to a rounding problem

called the table-maker’s dilemma whereas the modern definition is not.


Formal Definition

A real number a is said to be computable if it can be approximated by some computable function in the following manner: given any integer n ≥ 1, the function produces an integer k such that:

\[ \frac{k-1}{n} \le a \le \frac{k+1}{n} \]

There is another equivalent definition called ε approximation of computable numbers.

There exists a computable function which, given any positive rational error bound ε,

produces a rational number r such that:

|r − a| ≤ ε
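As an illustration of both definitions, here is a minimal Python sketch for the computable number √2, using exact integer and rational arithmetic. The function names are assumptions made for this example, and √2 is just one convenient choice of a.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_k(n: int) -> int:
    """Return an integer k with (k - 1)/n <= sqrt(2) <= (k + 1)/n."""
    return isqrt(2 * n * n)        # k = floor(n * sqrt(2))


def sqrt2_approx(eps: Fraction) -> Fraction:
    """Return a rational r with |r - sqrt(2)| <= eps, for a rational eps > 0."""
    n = 1
    while Fraction(1, n) > eps:    # smallest power of two n with 1/n <= eps
        n *= 2
    return Fraction(sqrt2_k(n), n)


n = 1000
k = sqrt2_k(n)
# Compare squares to avoid any floating-point arithmetic.
assert (k - 1) ** 2 <= 2 * n * n <= (k + 1) ** 2

eps = Fraction(1, 1000)
r = sqrt2_approx(eps)
assert (r - eps) ** 2 <= 2 <= (r + eps) ** 2
print(k, r)
```

Both functions terminate for every admissible input, which is the essential requirement in the definitions above.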

The set of real numbers is uncountable, whereas the set of computable numbers is only countable, and thus almost all real numbers are not computable.

The arithmetical operations on computable numbers are themselves computable in the sense that whenever real numbers a and b are computable then the following real numbers are also computable: a + b, a − b, ab, and a/b if b is nonzero. For example, there is a Turing machine which on input (A, B, ε) produces output r, where A and B are Turing machines approximating a and b respectively, and r is an ε approximation of a + b.
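The addition case can be sketched as follows, assuming that a computable number is represented by a Python function that maps a rational error bound ε to a rational ε-approximation. The helper `sqrt_approx` and the choice of √2 and √3 are hypothetical examples.

```python
from fractions import Fraction
from math import isqrt


def sqrt_approx(m: int):
    """Return an approximator for sqrt(m): eps -> rational r with |r - sqrt(m)| <= eps."""
    def approx(eps: Fraction) -> Fraction:
        n = 1
        while Fraction(1, n) > eps:
            n *= 2
        return Fraction(isqrt(m * n * n), n)
    return approx


def add(A, B):
    """Approximator for a + b: ask each summand for accuracy eps / 2."""
    def approx(eps: Fraction) -> Fraction:
        return A(eps / 2) + B(eps / 2)
    return approx


total = add(sqrt_approx(2), sqrt_approx(3))
r = total(Fraction(1, 100))
print(r, float(r))
```

The error of the sum is at most ε/2 + ε/2 = ε by the triangle inequality; the other arithmetic operations need more care in choosing the accuracy requested from A and B.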

The least upper bound of a bounded increasing computable sequence of computable

real numbers need not be a computable real number.

The order relation on the computable numbers is not computable, and the same holds for the equality relation. There is no Turing machine which on input A (the description of a Turing machine approximating the number a) outputs YES if a > 0 and NO if a ≤ 0. To see why, suppose the machine described by A keeps outputting 0 as approximations. It is not clear how long to wait before deciding that the machine will never output an approximation which forces a to be positive. Thus the machine will eventually have to guess that the number will equal 0, but the sequence may later become different from 0.


While the full order relation is not computable, the restriction of it to pairs of unequal numbers is computable. That is, there is a program that takes as input two Turing machines A and B approximating numbers a and b respectively, where a ≠ b, and outputs whether a < b or a > b. It is sufficient to use ε-approximations where $\varepsilon < |b - a|/2$; so by taking increasingly small ε, one eventually can decide whether a < b or a > b.
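A sketch of such a comparison program in Python is given below, again assuming that numbers are represented by functions from a rational error bound to a rational approximation. Since |b − a| is not known in advance, the sketch halves ε until the two approximations differ by more than 2ε, which certifies the order and is guaranteed to happen once ε < |b − a|/4. The helpers `sqrt_approx` and `constant` are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def sqrt_approx(m: int):
    """Approximator for sqrt(m): eps -> rational r with |r - sqrt(m)| <= eps."""
    def approx(eps: Fraction) -> Fraction:
        n = 1
        while Fraction(1, n) > eps:
            n *= 2
        return Fraction(isqrt(m * n * n), n)
    return approx


def constant(q):
    """Approximator for the rational number q (exact for every eps)."""
    return lambda eps: Fraction(q)


def compare(A, B) -> str:
    """Return "<" or ">" for approximators of unequal numbers a and b.

    The loop does not terminate when a == b, matching the text above.
    """
    eps = Fraction(1)
    while True:
        ra, rb = A(eps), B(eps)
        if rb - ra > 2 * eps:      # then b >= rb - eps > ra + eps >= a
            return "<"
        if ra - rb > 2 * eps:
            return ">"
        eps /= 2


print(compare(sqrt_approx(2), constant(Fraction(3, 2))))
print(compare(sqrt_approx(3), sqrt_approx(2)))
```

Calling `compare` on two approximators of the same number would loop forever, which is the computational content of the statement that equality is not decidable.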

Halting problem

For real computers it is hard to show that the halting problem for Turing machines is decidable by simulating the working of a Turing machine on them. Instead, we argue informally that a real computer can solve the halting problem for Turing machines. We already know that these hypothetical machines can store and manipulate real numbers with infinite precision, including numbers that are not computable. An uncomputable number is one whose digits cannot all be produced by any Turing machine, since a Turing machine only ever completes finitely many steps; a machine holding such a number exactly has, in effect, the result of countably infinitely many steps (the limit n → ∞ in the definitions above) available in finite time.

Since the halting problem is uncomputable, its answers can be encoded in an uncomputable real number, for example the number whose n-th binary digit is 1 exactly when the n-th Turing machine halts. A real computer that holds this number with infinite precision and can extract its digits can read off the n-th digit in finite time, which implies that a real computer can in fact solve the halting problem for Turing machines. A proof can also be given using Godel’s numbering system, which is omitted here. Also, by Godel’s Incompleteness theorem, there must exist some undecidable problems in this proposed model of a hypercomputer.


Bibliography

[1] Undecidable Problem, http://en.wikipedia.org/wiki/Undecidable_problem

[2] Entscheidungsproblem, http://en.wikipedia.org/wiki/Entscheidungsproblem

[3] Godel’s incompleteness theorems, http://en.wikipedia.org/wiki/Godel_incompleteness_theorem

[4] Post correspondence problem, http://en.wikipedia.org/wiki/Post_correspondance_problem

[5] Halting problem, http://en.wikipedia.org/wiki/Halting_problem

[6] Hilbert’s tenth problem, http://en.wikipedia.org/wiki/Hilbert's_tenth_problem

[7] Wang’s Tile, http://en.wikipedia.org/wiki/Wang_tile

[8] Busy Beaver, http://en.wikipedia.org/wiki/Busy_beaver

[9] Mortality Problem, http://en.wikipedia.org/wiki/Mortality_(computability_theory)

[10] Undecidable problems in group theory, by George S. Sacerdote, Proceedings of the American Mathematical Society, Volume 36, Number 1, November 1972.


[11] List of statements undecidable in ZFC, http://en.wikipedia.org/wiki/List_of_statements_undecidable_in_ZFC

[12] Lecture on Undecidability, by Michael M. Wolf, June 27, 2012

[13] Hypercomputation, http://en.wikipedia.org/wiki/Hypercomputation

[14] Zeno machines and hypercomputation, http://www.sciencedirect.com/science/article/pii/S0304397505009011

[15] Undecidable problems in semigroups theory, Honours Project, by L. Konstantinovskiy

[16] Elements of the Theory of Computation, by Harry R. Lewis and Christos H. Papadimitriou, Second Ed., Prentice-Hall

[17] Introduction to Automata Theory, Languages, and Computation, by John Hopcroft, Rajeev Motwani, Jeffrey Ullman, Third Ed., 1979, Pearson

[18] Computation Beyond Turing Machines, by Peter Wegner and Dina Goldin

[19] Undecidable Problems - A Sampler, by Bjorn Poonen

[20] Lectures on Turing Machine, by Robb T. Koether, 2009

[21] Lectures on Halting Problem, by Costas Busch, 2006

[22] Godel’s incompleteness theorem - An incomplete guide to its use and abuse, by Torkel Franzen, 2005, A. K. Peters Ltd.

[23] The myth of Hypercomputation, by Martin Davis

[24] Limits of Computation: Undecidable Problems, by D. Gorse

[25] Accelerating Turing Machines, by B. Jack Copeland


Date post:	30-Oct-2014
Upload:	oajasavee-mourya