Date post: | 30-Oct-2014 |
Upload: | oajasavee-mourya |
Department of Mathematics and Statistics
Indian Institute of Technology
Fundamentals of Undecidability in Computational Theory
MTH 401 Theory of Computation
Project Report
Arcchit Jain 10142
Minnie Kabra 10400
Oajasavee K. Mourya 10469
Ved Gupta 10790
Contents
1 Preliminaries
2 Godel’s Incompleteness Theorem
3 Post Correspondence Problem
4 Halting Problem
5 Hilbert’s Tenth Problem
6 Wang’s Tile Problem
7 Busy Beaver
8 Some More Undecidable Problems
9 Hypercomputation
Chapter 1
Preliminaries
Definition Turing Machine
A Turing Machine is a machine described by
1. A finite set of states (represented by Q).
2. An input alphabet (represented by Σ).
3. A tape alphabet (represented by Γ; where Σ ⊂ Γ).
4. A transition function (represented by δ).
5. A start state (represented by q0; always belongs to Q).
6. A blank symbol (represented by B; always belongs to Γ but not to Σ).
7. A set of final states (represented by F ; F ⊆ Q).
The transition function takes two arguments, a state in Q and a tape symbol in Γ. It is
either undefined or a triple of the form (p, Y, D), where
• p is the next state (p ∈ Q),
• Y is the tape symbol written over the scanned symbol (Y ∈ Γ), and
• D is the direction, left (L) or right (R), in which the head moves along the tape.
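The definition above can be made concrete with a small simulator. The following is a minimal sketch, not part of the original report: the function name run_tm, the step budget max_steps and the example machine EVEN_ONES (which accepts binary strings containing an even number of 1s) are all assumptions introduced here for illustration.

```python
def run_tm(delta, start, finals, tape_input, blank="B", max_steps=10_000):
    """Simulate a one-tape Turing machine given by the transition table `delta`.

    delta maps (state, symbol) -> (next_state, written_symbol, direction).
    Returns True if a final state is reached, False if the machine halts in a
    non-final state (undefined transition), and None if the step budget
    (an assumption of this sketch) runs out first.
    """
    tape = dict(enumerate(tape_input))  # sparse tape: cell index -> symbol
    state, head = start, 0
    for _ in range(max_steps):
        if state in finals:
            return True
        symbol = tape.get(head, blank)
        if (state, symbol) not in delta:
            return False  # transition undefined: halt without accepting
        state, tape[head], direction = delta[(state, symbol)]
        head += 1 if direction == "R" else -1
    return None


# Hypothetical example machine: accepts strings with an even number of 1s.
EVEN_ONES = {
    ("q_even", "0"): ("q_even", "0", "R"),
    ("q_even", "1"): ("q_odd", "1", "R"),
    ("q_odd", "0"): ("q_odd", "0", "R"),
    ("q_odd", "1"): ("q_even", "1", "R"),
    ("q_even", "B"): ("q_acc", "B", "R"),  # end of input with even count
}

print(run_tm(EVEN_ONES, "q_even", {"q_acc"}, "1010"))
```

Here Q = {q_even, q_odd, q_acc}, Σ = {0, 1}, Γ = {0, 1, B}, q0 = q_even and F = {q_acc}. The step budget is needed only because a simulator cannot know in advance whether the simulated machine halts.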
Definition Decidable Language
A language for which there exists a Turing machine that accepts every string in the
language, accepts no other string, and halts on every input string is called a Decidable
language, Turing-decidable language or Recursive language.
Definition Partially Decidable Language
A language for which there exists a Turing machine that accepts every string in the
language and no other string is called a Partially Decidable language, Turing-recognizable
language or Recursively enumerable language. The machine halts on every string it accepts,
but it may run forever on strings that are not in the language.
Definition Undecidable Language
A language for which there does not exist any Turing machine that
1. accepts all the strings in the language and no other string, and
2. halts on every input string
is called an Undecidable language or Non-recursive language. An undecidable language may
still be recursively enumerable.
Definition Undecidable Problem
An Undecidable Problem is a decision problem for which it is impossible to construct a
single algorithm that always leads to a correct yes-or-no answer.
Chapter 2
Godel’s Incompleteness Theorem
Kurt Friedrich Godel was an Austrian American logician, mathematician, and
philosopher. Godel made an immense impact upon scientific and philosophical thinking in
the 20th century. Godel is best known for his two incompleteness theorems, published in
1931. He also showed that neither the axiom of choice nor the continuum hypothesis can
be disproved from the accepted axioms of set theory, assuming these axioms are consistent.
This result opened the door for mathematicians to assume the axiom of choice in their
proofs.
Godel’s Incompleteness Theorems are two theorems of mathematical logic that establish
inherent limitations of all but the most trivial axiomatic systems capable of doing
arithmetic. In this section we will see the philosophical significance of the Incompleteness
Theorems and how they are related to the Liar’s Paradox, the consistency of formal systems
and undecidability. We will also cover the proof of the First Incompleteness Theorem. A few
examples supporting Godel’s Theorem will be discussed towards the end of this section.
The two Incompleteness Theorems are widely interpreted as showing that Hilbert’s
program to find a complete and consistent set of axioms for all mathematics is impossible,
giving a negative answer to Hilbert’s second problem. Before we move forward we need to
know what Hilbert’s Second Problem is and how it is related to Godel’s Incompleteness
Theorems.
HILBERT’S SECOND PROBLEM
Hilbert’s problems form a list of 23 problems in mathematics published by David
Hilbert in 1900. But here we are only interested in Hilbert’s Second Problem, that is,
When we are engaged in investigating the foundations of a science, we must set up a system
of axioms which contains an exact and complete description of the relations subsisting
between the elementary ideas of that science. But above all I wish to designate the following
as the most important among the numerous questions which can be asked with regard to
the axioms: To prove that they are not contradictory, that is, that a definite number of
logical steps based upon them can never lead to contradictory results. In geometry, the
proof of the compatibility of the axioms can be effected by constructing a suitable field of
numbers, such that analogous relations between the numbers of this field correspond to
the geometrical axioms. On the other hand a direct method is needed for the proof of the
compatibility of the arithmetical axioms.
In order to understand what Hilbert meant by his problem there are a few definitions we
need to know. To understand its solution we need to have a clear idea of both of Godel’s
Incompleteness Theorems. After doing all that, we will relate undecidability to Godel’s
theorems and try to explain the occurrence of undecidable sets, statements, problems etc.
Later we will see that what Godel showed is that no such axiomatic system can be both
complete and consistent, and that even if it is consistent, it cannot prove its own
consistency!
Definition Axiomatic System
Axiomatic system is any set of axioms from which some or all axioms can be used in
conjunction to logically derive theorems. A mathematical theory consists of an axiomatic
system and all its derived theorems. An axiomatic system that is completely described is
a special kind of formal system.
Definition Formal Systems
Formal systems in mathematics consist of the following elements:
1. A finite set of symbols known as alphabet, that can be used for constructing formulae.
2. A grammar, which tells how well-formed formulae are constructed out of the symbols
in the alphabet. It is usually required that there be a decision procedure for deciding
whether a formula is well formed or not.
3. A set of axioms where each axiom must be a well-formed formula.
4. A set of inference rules.
Now we will see some popular axiomatic systems that are used very frequently in mathematics.
We will not go into much detail about these axiomatic systems; they are only referred to
briefly later.
Peano axioms
The Peano axioms are a set of axioms for the natural numbers. They are widely used in number
theory. The system consists of 9 axioms, which use the equality relation (=) and the
successor function S(n).
1. 0 is a natural number.
2. For every natural number x, x = x.
3. For all natural numbers x and y, if x = y, then y = x.
4. For all natural numbers x, y and z, if x = y and y = z, then x = z.
5. For all a and b, if a is a natural number and a = b, then b is also a natural number.
6. For every natural number n, S(n) is a natural number.
7. For every natural number n, S(n) = 0 is false.
8. For all natural numbers m and n, if S(m) = S(n), then m = n.
9. If K is a set such that, 0 is in K, and for every natural number n, if n is in K, then
S(n) is in K, then K contains every natural number.
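The successor-based view of the natural numbers can be sketched in code. The following is an illustrative model only, not part of the original report: the class Nat, the helper to_int and the recursive definition of add (the usual rules m + 0 = m and m + S(k) = S(m + k), which are not among the nine axioms listed above) are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Nat:
    """A natural number built only from 0 and the successor function."""
    pred: Optional["Nat"] = None  # None encodes 0; otherwise this is S(pred)


ZERO = Nat()


def S(n: Nat) -> Nat:
    """Successor function (axiom 6: S(n) is again a natural number)."""
    return Nat(n)


def add(m: Nat, n: Nat) -> Nat:
    """Addition by recursion on n: m + 0 = m and m + S(k) = S(m + k)."""
    return m if n.pred is None else S(add(m, n.pred))


def to_int(n: Nat) -> int:
    """Count the successor applications, for display only."""
    count = 0
    while n.pred is not None:
        count += 1
        n = n.pred
    return count


two = S(S(ZERO))
print(to_int(add(two, S(ZERO))))
```

In this model S(n) is never equal to ZERO (axiom 7), and S(m) == S(n) holds only when m == n (axiom 8), because equality of Nat values is structural.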
Zermelo Fraenkel set Theory with the Axiom of Choice or ZFC
ZFC System is commonly used in Set Theory. It also consists of 9 axioms.
1. Two sets are equal (are the same set) if they have the same elements.
∀ x ∀ y [ ∀ z (z ∈ x⇔ z ∈ y)⇒ x = y].
2. Every non-empty set x contains a member y such that x and y are disjoint sets.
∀ x [ ∃ a (a ∈ x)⇒ ∃ y (y ∈ x ∧ ¬∃ z(z ∈ y ∧ z ∈ x))].
3. If z is a set, and φ is any property which may characterize the elements x of z, then
there is a subset y of z containing those x in z which satisfy the property.
∀ z ∀ w1 . . . wn∃ y ∀ x [x ∈ y ⇔ (x ∈ z ∧ φ)].
4. If x and y are sets, then there exists a set which contains x and y as elements.
∀ x ∀ y ∃ z (x ∈ z ∧ y ∈ z).
5. For every set F there is a set A containing every set that is a member of some member of F .
∀ F ∃ A∀ Y ∀ x [(x ∈ Y ∧ Y ∈ F)⇒ x ∈ A].
6. Let φ be any formula in the language of ZFC whose free variables are among
x,y,A,w1, . . . , wn, so that in particular B is not free in φ. Then:
∀ A ∀ w1, . . . , wn [∀ x (x ∈ A ⇒ ∃ y φ) ⇒ ∃ B ∀ x (x ∈ A ⇒ ∃ y (y ∈ B ∧ φ))].
7. Let S(x) abbreviate x ∪{x}, where x is some set. Then there exists a set X such
that the empty set ∅ is a member of X and, whenever a set y is a member of X,
then S(y) is also a member of X.
∃ X [∅ ∈ X ∧ ∀ y(y ∈ X ⇒ S(y) ∈ X)] .
8. Let z ⊆ x abbreviate ∀q(q ∈ z ⇒ q ∈ x). For any set x, there is a set y which is a
superset of the power set of x. The power set of x is the class whose members are
all of the subsets of x.
∀ x ∃ y ∀ z [z ⊆ x⇒ z ∈ y].
9. For any set X, there is a binary relation R which well-orders X. This means R is
a linear order on X such that every nonempty subset of X has a member which is
minimal under R.
∀ X ∃ R (R well-orders X).
GODEL’S INCOMPLETENESS THEOREM
After getting a basic idea of what axiomatic systems are we can now move forward
and see what incompleteness theorem says.
FIRST INCOMPLETENESS THEOREM
Any effectively generated theory capable of expressing elementary arithmetic cannot be
both consistent and complete. In particular, for any consistent, effectively generated formal
theory that proves certain basic arithmetic truths, there is an arithmetical statement that
is true but not provable in the theory.
This may seem a bit difficult at first glance, but the theorems are easy to understand. An
explanation of the first incompleteness theorem is given below.
The true but unprovable statement referred to by the theorem is often referred to as
”the Godel sentence” for the theory. The proof constructs a specific Godel sentence for
each effectively generated theory, but there are infinitely many statements in the language
of the theory that share the property of being true but unprovable. For example, the
conjunction of the Godel sentence and any logically valid sentence will have this property.
For each consistent formal theory T having the required small amount of number
theory, the corresponding Godel sentence G asserts: ”G cannot be proved within the
theory T”. This interpretation of G leads to the following informal analysis. If G were
provable under the axioms and rules of inference of T , then T would have a theorem, G,
which effectively contradicts itself, and thus the theory T would be inconsistent. This
means that if the theory T is consistent then G cannot be proved within it, and so the
theory T is incomplete. Moreover, the claim G makes about its own unprovability is
correct. In this sense G is not only unprovable but true, and provability within the theory
T is not the same as truth.
Each effectively generated theory has its own Godel statement. It is possible to define
a larger theory T′ that contains the whole of T , plus G as an additional axiom. This will
not result in a complete theory, because Godel’s theorem will also apply to T′, and thus T′
cannot be complete. In this case, G is indeed a theorem in T′, because it is an axiom. Since
G states only that it is not provable in T , no contradiction is presented by its provability
in T′. However, because the incompleteness theorem applies to T′, there will be a new
Godel statement G′ for T′, showing that T′ is also incomplete. G′ will differ from G in that
G′ will refer to T′, rather than T .
Godel’s first incompleteness theorem shows that any consistent effective formal system
that includes enough of the theory of the natural numbers is incomplete: there are true
statements expressible in its language that are unprovable. A system may be incomplete
simply because not all the necessary axioms have been discovered. For example, Euclidean
geometry without the parallel postulate is incomplete; it is not possible to prove or disprove
the parallel postulate from the remaining axioms. This will be discussed later towards the
end of topic. Godel’s theorem shows that, in theories that include a small portion of
number theory, a complete and consistent finite list of axioms can never be created, nor
even an infinite list that can be enumerated by a computer program. Each time a new
statement is added as an axiom, there are other true statements that still cannot be proved,
even with the new axiom. If an axiom is ever added that makes the system complete, it
does so at the cost of making the system inconsistent.
There are complete and consistent lists of axioms for arithmetic that cannot be enu-
merated by a computer program. For example, one might take all true statements about
the natural numbers to be axioms (and no false statements), which gives the theory known
as “true arithmetic”. The difficulty is that there is no mechanical way to decide, given
a statement about the natural numbers, whether it is an axiom of this theory, and thus
there is no effective way to verify a formal proof in this theory.
Godel’s construction is closely related to the Liar paradox. The
liar paradox is the sentence “This sentence is false.” An analysis of the liar sentence shows
that it can be neither true nor false. A Godel sentence G for a theory T makes a
similar assertion to the liar sentence, but with truth replaced by provability: G says “G is
not provable in the theory T .” It is not possible to replace “not provable” with “false” in
a Godel sentence because the predicate “Q is the Godel number of a false formula” cannot
be represented as a formula of arithmetic.
SECOND INCOMPLETENESS THEOREM
For any formal effectively generated theory T including basic arithmetical truths and
also certain truths about formal provability, if T includes a statement of its own consistency
then T is inconsistent.
This strengthens the first incompleteness theorem, because the statement constructed
in the first incompleteness theorem does not directly express the consistency of the theory.
A technical subtlety in the second incompleteness theorem is how to express the consistency
of T as a formula in the language of T . There are many ways to do this, and not all of
them lead to the same result. In particular, different formalizations of the claim that T
is consistent may be inequivalent in T , and some may even be provable. For example,
first-order Peano arithmetic (PA) can prove that the largest consistent subset of PA is
consistent. But since PA is consistent, the largest consistent subset of PA is just PA, so in
this sense PA “proves that it is consistent”. What PA does not prove is that the largest
consistent subset of PA is, in fact, the whole of PA.
For any familiar explicitly axiomatized theory T, it is possible to canonically define a
formula Con(T ) expressing the consistency of T. The formalization of Con(T ) depends on
two factors, formalizing the notion of a sentence being derivable from a set of sentences
and formalizing the notion of being an axiom of T.
Godel’s second incompleteness theorem also implies that a theory T1 satisfying the
technical conditions outlined above cannot prove the consistency of any theory T2 which
proves the consistency of T1. This is because such a theory T1 can prove that if T2 proves
the consistency of T1, then T1 is in fact consistent. For the claim that T1 is consistent has
form “for all numbers n, n has the decidable property of not being a code for a proof of
contradiction in T1”. If T1 were in fact inconsistent, then T2 would prove for some n that
n is the code of a contradiction in T1. But if T2 also proved that T1 is consistent (that
is, that there is no such n), then it would itself be inconsistent. This reasoning can be
formalized in T1 to show that if T2 is consistent, then T1 is consistent. Since, by second
incompleteness theorem, T1 does not prove its consistency, it cannot prove the consistency
of T2 either.
Undecidable statements
There are two distinct senses of the word “undecidable” in mathematics and computer
science. The first of these is the proof-theoretic sense used in relation to Godel’s theorems,
that of a statement being neither provable nor refutable in a specified deductive system.
The second sense is used in relation to computability theory and applies not to statements
but to decision problems, which are countably infinite sets of questions each requiring a
yes or no answer. Such a problem is said to be undecidable if there is no computable
function that correctly answers every question in the problem set.
Two concrete examples of undecidable statements are:
1. The continuum hypothesis can neither be proved nor refuted in ZFC.
2. The axiom of choice can neither be proved nor refuted in ZF (which is all the ZFC
axioms except the axiom of choice).
Godel’s incompleteness theorems struck a fatal blow to David Hilbert’s second problem,
which asked for a finitary consistency proof for mathematics. The second incompleteness
theorem, in particular, is often viewed as making the problem impossible. Not all mathe-
maticians agree with this analysis, however, and the status of Hilbert’s second problem is
not yet decided.
A Turing machine can be viewed as an axiomatic system (of sufficient complexity) in which
the transitions play the role of axioms and the strings it derives play the role of
theorems. A grammar together with a set of Turing machines then describes the structure of a
formal system. By analogy, the incompleteness theorem suggests that there will be certain
strings which cannot be derived by any machine. Undecidable strings are therefore inherent:
they exist as a basic characteristic of such systems. Having seen why undecidable sets
occur, we can now move forward, explore some famous undecidable problems and see some
interesting results.
Chapter 3
Post Correspondence Problem
Definition Post Correspondence Problem (PCP)
The PCP was introduced by Emil Post in 1946, who proved it to be undecidable. Because the
PCP is simpler to work with than the Halting problem, it is often used in proofs of the
undecidability of other problems.
Definition of the problem
Input: A finite collection of blocks, each labelled with a top string and a bottom string.
Question: Given an unlimited supply of copies of these particular blocks, can one form
a nonempty finite sequence of these for which the concatenation of the top strings equals
the concatenation of the bottom strings?
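Example Consider two lists, A (top strings) and B (bottom strings), which contain three
strings each.
       A      B
a      10     101
b      011    11
c      101    011
For this instance we will never be able to find a sequence of blocks whose top and bottom
concatenations are exactly the same. Any match must start with block a, and after that the
bottom string always stays one symbol longer than the top string.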
Consider another example where the lists are as follows:
       A        B
a      1        111
b      10111    10
c      10       0
For this instance, the sequence “baac” works. Check:
A : 10111 + 1 + 1 + 10 = 101111110
B : 10 + 111 + 111 + 0 = 101111110
One more interesting property is that if a PCP instance has a solution then it has infinitely
many solutions. In this case the sequence “baac” can be repeated any number of times to get
another solution.
Formal Definition
Input: Two lists of n strings,
A = W1, W2, . . . , Wn
B = V1, V2, . . . , Vn
There is a solution to the Post Correspondence Problem if there is a nonempty sequence of
indices i, j, . . . , k such that:
WiWj . . . Wk = ViVj . . . Vk
where indices may be repeated or omitted.
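Checking a proposed index sequence is easy; it is the search for one that cannot be bounded in general. The sketch below illustrates both with a breadth-first search that gives up at a chosen length. The function names is_solution and search_pcp and the cut-off parameter max_len are assumptions of this sketch; because the PCP is undecidable, a result of None only means that no solution exists up to that length.

```python
from collections import deque


def is_solution(tops, bottoms, indices):
    """Check WiWj...Wk == ViVj...Vk for a nonempty index sequence."""
    if not indices:
        return False
    top = "".join(tops[i] for i in indices)
    bottom = "".join(bottoms[i] for i in indices)
    return top == bottom


def search_pcp(tops, bottoms, max_len):
    """Breadth-first search for a solution using at most max_len blocks."""
    queue = deque([i] for i in range(len(tops)))
    while queue:
        seq = queue.popleft()
        top = "".join(tops[i] for i in seq)
        bottom = "".join(bottoms[i] for i in seq)
        if top == bottom:
            return seq
        # Prune: a partial match is only extendable if one string is a
        # prefix of the other.
        if not (top.startswith(bottom) or bottom.startswith(top)):
            continue
        if len(seq) < max_len:
            for i in range(len(tops)):
                queue.append(seq + [i])
    return None


# Second example from the text, with a, b, c mapped to indices 0, 1, 2.
A = ["1", "10111", "10"]
B = ["111", "10", "0"]
print(search_pcp(A, B, max_len=4))  # the sequence b, a, a, c as indices
```

Running the same search on the first example, search_pcp(["10", "011", "101"], ["101", "11", "011"], 8), finds nothing, which is consistent with the claim that this instance has no solution but does not by itself prove it.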
Using the Post Correspondence Problem to prove undecidability of other problems:
1. To Prove: There does not exist an algorithm to decide whether a context-free grammar
is ambiguous.
Proof : We reduce Post’s Correspondence Problem to this problem. Suppose
we can, in fact, decide the language {〈G〉 | G is an ambiguous CFG}.
Given a PCP instance α1, . . . , αm, β1, . . . , βm:
Construct the following CFG G = (V, Σ, R, S) where
V = {S, S1, S2},
R = {S → S1 | S2,
S1 → α1S1σ1 | . . . | αmS1σm | α1σ1 | . . . | αmσm,
S2 → β1S2σ1 | . . . | βmS2σm | β1σ1 | . . . | βmσm}
(where the σi are new characters added to the alphabet, e.g., σi = i).
If the grammar is ambiguous, then some string w has two different derivations.
Suppose both derivations start with the rule S → S1. Reading the new characters σi
backwards from the end of w determines the sequence of rules uniquely, so there can be only
one such derivation; the same holds for S → S2. Hence, we see
that the only ambiguity can come from one S1 and one S2 ‘start’. But then, taking
the substring of w up to the beginning of the new characters, we have a solution to
the PCP (since the strings of indices used after that point match).
Conversely, if there is no ambiguity, then the PCP cannot be solved, since a solution
would imply an ambiguity given by the two derivations
S ⇒ S1 ⇒∗ ασ and S ⇒ S2 ⇒∗ βσ, where α = β are strings of matching α’s and
β’s (since the σ’s match).
Hence, we have reduced the PCP to the ambiguity problem, and since the PCP is undecidable,
there cannot exist any algorithm to decide whether a given grammar is ambiguous.
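The reduction can be illustrated by building the grammar for a concrete PCP instance and comparing the two strings derived through S1 and S2 for a given index sequence. This is a sketch under stated assumptions: the function names build_grammar and derived_strings are introduced here, the markers "#1", "#2", ... stand in for the new characters σi, and the productions include the terminating rules αiσi and βiσi so that derivations can end.

```python
def build_grammar(alphas, betas):
    """Productions of G as a dict: nonterminal -> list of right-hand sides."""
    rules = {"S": [["S1"], ["S2"]], "S1": [], "S2": []}
    for i, (alpha, beta) in enumerate(zip(alphas, betas), start=1):
        marker = f"#{i}"  # plays the role of the new character sigma_i
        rules["S1"] += [[alpha, "S1", marker], [alpha, marker]]
        rules["S2"] += [[beta, "S2", marker], [beta, marker]]
    return rules


def derived_strings(alphas, betas, indices):
    """Strings derived via S1 and via S2 when rules are applied in the order
    given by `indices` (0-based); markers appear in reverse order."""
    markers = "".join(f"#{i + 1}" for i in reversed(indices))
    via_s1 = "".join(alphas[i] for i in indices) + markers
    via_s2 = "".join(betas[i] for i in indices) + markers
    return via_s1, via_s2


alphas = ["1", "10111", "10"]
betas = ["111", "10", "0"]
left, right = derived_strings(alphas, betas, [1, 0, 0, 2])
print(left == right)  # the same string has an S1 and an S2 derivation
```

The two derived strings coincide exactly when the index sequence solves the PCP instance, which is the ambiguity the proof relies on.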
2. To Prove: Given two CFLs, the problem of deciding whether the CFLs are disjoint
or not is undecidable.
Proof : To solve this problem we require the lemma given below.
Lemma: The set of valid computations of a Turing machine M is the intersection of
two CFLs L1 and L2, and grammars for these CFLs can be effectively constructed
from M .
Let G1 and G2 be two given grammars; we have to decide whether L(G1) ∩ L(G2) is empty
or not. Using the lemma we can construct from
M grammars G1 and G2 such that L(G1) ∩ L(G2) is the set of valid computations
of M . If there is an algorithm A to tell whether the intersection of the languages of
two CFGs is empty, we can construct an algorithm B to tell whether L(M) = ∅
for an arbitrary TM M . Simply design B to construct G1 and G2 from M as in the lemma,
then apply algorithm A to tell whether L(G1) ∩ L(G2) is empty. If the intersection is
empty, then there are no valid computations of M , so L(M) = ∅. If the intersection
is not empty, L(M) ≠ ∅. That is, the problem of emptiness for r.e. sets reduces to
the problem of intersection for CFGs.
Algorithm B cannot exist, however, since L(M) = ∅ is undecidable by Rice’s Theorem.
Therefore A does not exist, so it is undecidable whether the intersection of two
CFLs is empty.
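3. To Prove: For a given CFL L, it is undecidable to check whether L = Σ∗.
Proof : Suppose that the problem of deciding L = Σ∗ is decidable. Take
L to be the complement of L(G1) ∩ L(G2), where G1 and G2 are the grammars constructed
from a Turing machine M in the previous proof; we use without proof the fact that this
complement (the set of invalid computations of M ) is itself a CFL. Then L = Σ∗ exactly
when L(G1) ∩ L(G2) = ∅, so the problem of deciding L(G1) ∩ L(G2) = ∅ would be
decidable. But by the previous result, it is undecidable. Hence our assumption was
wrong. Thus, it is undecidable for a CFG G whether L(G) = Σ∗.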
4. To Prove: For a given CFG G and a regular set R, it is undecidable to check whether
L(G) = R.
Proof : We know that Σ∗ is a regular set. So if we could decide whether a language
L is equal to a given regular set, then we could decide whether L = Σ∗, but we have
proved previously that this problem is undecidable.
5. To Prove: For any two given CFLs L1 and L2, it is undecidable to check whether L1 = L2.
Proof : Let L1 = L(G1) and L2 = L(G2). Fix G2 to be a grammar generating
Σ∗, where Σ is the terminal alphabet of G1. Then the question is equivalent to deciding
whether L1 = Σ∗, which is undecidable. Hence the given problem is undecidable.
6. To Prove: Given two CFLs L1 and L2, it is undecidable to check whether L2 ⊆ L1.
Proof : The proof is the same as the one above: take L2 = Σ∗, so that L2 ⊆ L1 holds
exactly when L1 = Σ∗.
7. To Prove: Given a regular set R and a CFL L, it is undecidable to check whether R ⊆ L.
Proof : Take R = Σ∗; then R ⊆ L holds exactly when L = Σ∗, which was shown above to be
undecidable. Hence this problem is also undecidable.
Chapter 4
Halting Problem
Definition Halting Problem
Given a description of an arbitrary computer program, decide whether the program finishes
running or continues to run forever.
This is equivalent to the problem of deciding, given a program and an input, whether
the program will eventually halt when run with that input, or will run forever.
In 1936, Alan Turing proved that a general algorithm to solve the halting problem
for all possible program-input pairs cannot exist. In other words, he proved that the
Halting Problem is undecidable.
Suppose that there exists a program that takes as input a program M and an input
w of that program. Let us assume that this program always determines correctly whether
the program M would halt on input w (it would return “yes” if it does), or whether it
would run forever (it would then return “no”). Let us call this program H(M,w).
Now, we use H(M,w) to write another program, with the name D(w), such that D(w)
halts if H(w,w) returns “no” and loops forever if H(w,w) returns “yes”. Since the program
H(M,w) has been assumed to halt (return a result) on each input, D(w) is well defined for
every input string w.
And now comes the unanswerable question: Does D(D) halt? It would halt if and
only if the call H(D,D) returns “no”, that is, if and only if D(D) does not halt.
This is a contradiction: we must conclude that the only hypothesis that started
us on this path is false, that is, program H(M,w) does not exist. Thus, there does not exist
any program or algorithm for solving the problem that H would solve: to decide whether
arbitrary programs would halt or loop.
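The diagonal argument can be acted out in code against any concrete candidate for H. This is only a sketch: the names make_d and refute are introduced here, programs are modelled as Python functions rather than as encoded strings, and the candidate halts is assumed to be deterministic. Whatever the candidate answers about D run on itself, the construction makes that answer wrong.

```python
def make_d(halts):
    """Build the diagonal program D from a claimed decider halts(program, arg)."""
    def d(w):
        if halts(w, w):   # H answers "yes, w halts on w" ...
            while True:   # ... so D deliberately loops forever
                pass
        return "halted"   # H answers "no", so D halts immediately
    return d


def refute(halts):
    """Show that `halts` gives a wrong answer on the pair (D, D)."""
    d = make_d(halts)
    if halts(d, d):
        # By construction d(d) would now enter the infinite loop, so we do
        # not actually run it.
        return "H says D(D) halts, but D(D) loops forever"
    d(d)  # safe to run: with this answer d(d) returns at once
    return "H says D(D) loops, but D(D) halted"


print(refute(lambda program, arg: False))
print(refute(lambda program, arg: True))
```

The two candidates used above are deliberately trivial; the same refute call works for any other total, deterministic candidate, which is the content of the proof.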
We have a full-fledged notation for algorithms: Turing machines. We are thus ready
to define a specific language and prove that it is recursively enumerable, i.e. partially
decidable. Let
H = {(M, w) : Turing machine M halts on input string w}.
Notice first that H is recursively enumerable: on input (M,w), the universal Turing machine
U halts precisely when the input is in H.
Furthermore, if H is recursive, then every recursively enumerable language is recursive.
In other words, all recursively enumerable languages are decidable if and only if H
is recursive. For suppose that H is indeed decided by some Turing machine M0. Then
given any particular Turing machine M which partially decides a language L(M), we could
design a Turing machine M′ that fully decides L(M): on input w, M′ first runs M0 on (M,w),
rejects if M0 answers that M does not halt on w, and otherwise simulates M on w.
Theorem Undecidability of Halting Problem
Let ATM = {〈M,w〉|M is a TM and M accepts w}. Then ATM is undecidable.
Proof The proof is by contradiction. Assume that ATM is decidable, and suppose
that H is a decider for ATM , i.e.,
H(〈M,w〉) = accept if M accepts w,
           reject if M does not accept w.
A new Turing machine D with H as a subroutine will now be constructed.
The input to D is a description 〈M〉 of a Turing machine M . This information is sent to
Turing machine H, which determines what M does when the input to M is its own description.
Once D has determined this information, it does the opposite, i.e., it rejects if M accepts
and accepts if M does not accept:
D(〈M〉) = accept if M does not accept 〈M〉,
         reject if M accepts 〈M〉.
In the case when D is run with its own description the following is obtained:
D(〈D〉) = accept if D does not accept 〈D〉,
         reject if D accepts 〈D〉.
Thus D accepts 〈D〉 if and only if D does not accept 〈D〉, which is a contradiction. Hence
neither D nor H can exist, and ATM is undecidable.
The Halting Problem is noncomputable, but it is an important problem. It is useful
to know if a procedure application will terminate in a reasonable amount of time, but the
Halting Problem does not answer that question. It concerns the question of whether the
procedure application will terminate in any finite amount of time, no matter how long it is.
Virus detection
A virus is a program that infects other programs. A virus spreads by copying its own
code into the code of other programs, so when those programs are executed the virus will
execute. In this manner, the virus spreads to infect more and more programs. A typical
virus also includes a malicious payload so when it executes in addition to infecting other
programs it also performs some damaging (corrupting data files) or annoying (popping
up messages) behavior. The Is-Virus Problem is to determine if a procedure specification
contains a virus:
Input: A specification of a Python program.
Output: If the program contains a virus (a code fragment that will infect other files)
output True. Otherwise, output False.
To demonstrate that the Is-Virus Problem is noncomputable, we show how to define a halts
algorithm given a hypothetical isVirus algorithm. Since we know halts is noncomputable,
this shows there is no isVirus algorithm.
Assume infectFiles is a procedure that infects files, so the result of evaluating
isVirus('infectFiles()') is True. We could define halts as:
def halts(p):
    return isVirus(p + '; infectFiles()')
This works as long as the program specified by p does not exhibit the file-infecting
behavior. If it does, p could infect a file and never terminate, and halts would produce
the wrong output. To solve this we need to hide the file-infecting behavior of the original
program. A rough definition of file-infecting behavior would be to consider any write to
an executable file to be an infection. To avoid any file infections in the specified program,
we replace all procedures that write to files with procedures that write to shadow copies
of these files. For example, we could do this by creating a new temporary directory and
prepending that path to all file names. We call this (assumed) procedure sandBox, since it
transforms the original program specification into one that would execute in a protected
sandbox.
def halts(p):
    return isVirus(sandBox(p) + '; infectFiles()')
Since we know there is no algorithm that solves the Halting Problem, this proves that
there is no algorithm that solves the Is-Virus problem.
Virus scanners such as Symantec’s Norton AntiVirus attempt to solve the Is-Virus
Problem, but its non-computability means that no scanner can be correct on every input.
Virus scanners
detect known viruses by scanning files for strings that match signatures in a database of
known viruses. As long as the signature database is frequently updated they may be able
to detect currently spreading viruses, but this approach cannot detect a new virus that
will not match the signature of a previously known virus.
Sophisticated virus scanners employ more advanced techniques to attempt to detect
complex viruses such as metamorphic viruses that alter their own code as they propagate
to avoid detection. But, because the general Is-Virus Problem is noncomputable, we know
that it is impossible to create a program that always terminates and that always correctly
determines if an input procedure specification is a virus.
Chapter 5
Hilbert’s Tenth Problem
The statement of Hilbert’s tenth problem is as follows :
Given a Diophantine equation with any number of unknown quantities and with
rational integral numerical coefficients: To devise a process according to which it can be
determined in a finite number of operations whether the equation is solvable in rational
integers.
The undecidability of the existence of integral roots turned out to be the consequence
of a deep equivalence.
Definition Diophantine Predicates and Relations
A predicate P on N^k is called diophantine iff there is an n ∈ N and a polynomial p with
integer coefficients in k + n variables, such that P(x) ⇐⇒ ∃ y ∈ N^n : p(x, y) = 0.
A set or relation S ⊂ N^k is called diophantine iff x ∈ S is a diophantine predicate.
A function f : N^k → N is called diophantine iff its graph {(f(x), x)} ⊂ N^(k+1) is a
diophantine set.
The definition allows for polynomials of arbitrary (though finite) degree. Diophantine
problems have fewer equations than unknown variables and involve finding integers that
work correctly for all equations.
The following trick by Skolem, shows that we can trade the degree with the number
of variables to the extent that finally we may restrict ourselves to polynomials of degree
at most four.
Lemma If S ⊂ Nk is a diophantine set, then there is an m ∈ N and a polynomial q in
m + k variables with integer coefficients and of degree at most four such that S = {x ∈Nk| ∃ z ∈ Nm : q(x, z) = 0}.
Proof : By assumption there is a polynomial $p$ with integer coefficients such that $S = \{x \in \mathbb{N}^k \mid \exists y \in \mathbb{N}^n : p(x, y) = 0\}$. The new polynomial is constructed recursively: for every monomial in $p$ of degree larger than two, introduce a new variable $u_1, u_2, \ldots$ defined as the product of the first two variables of that monomial. Inserting the new variables leads to a new polynomial $p_1(x, y, u)$ whose maximal degree is one less than that of $p$, and $p(x, y) = p_1(x, y, u)$ if we impose the defining constraints for the $u_i$. Iterating this procedure, we obtain a sequence of polynomials in more and more variables which eventually is at most quadratic in all variables. Suppose $p_n$ is this quadratic polynomial. The imposed constraints on the new variables, which guarantee that $p(x, y) = p_n(x, y, u, \ldots)$, can now be formulated in terms of the existence of integral roots of quadratic polynomials with integer coefficients.
Denote those polynomials by $c_1, c_2, \ldots$. That is, if we have for instance $u_1 := x_2 y_7$ and $u_2 := u_1 x_2$, then we define $c_1(u_1, x_2, y_7) := u_1 - x_2 y_7$ and $c_2(u_2, u_1, x_2) := u_2 - u_1 x_2$. In this way we achieve that $p(x, y) = 0$ iff $\exists u : p_n^2 + \sum_i c_i^2 = 0$, where $u$ denotes the collection of all variables added to $x$ and $y$. A sum of squares of integers vanishes only if every term vanishes, so the polynomial $q := p_n^2 + \sum_i c_i^2$ leads to the sought biquadratic characterization of the diophantine set $S$.
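As a small sanity check of Skolem's trick, the following sketch applies it to the toy polynomial $p(x, y) = x^2 y - 12$, which is our own example and not taken from the text. The names `p`, `q` and `BOUND` are illustrative assumptions, and the finite search bound only serves to make the comparison executable.

```python
from itertools import product

def p(x, y):
    # Original polynomial of degree three: p(x, y) = x^2 * y - 12.
    return x * x * y - 12

def q(x, y, u):
    # Substituting u := x * x turns p into the quadratic p1 = u * y - 12;
    # the defining constraint becomes c1 = u - x * x.
    p1 = u * y - 12
    c1 = u - x * x
    return p1 ** 2 + c1 ** 2  # sum of squares, degree at most four

BOUND = 20  # illustrative search bound; the lemma itself is about all of N

# Roots of p inside the search box.
roots_p = {(x, y) for x, y in product(range(BOUND), repeat=2) if p(x, y) == 0}

# Pairs (x, y) for which some auxiliary u makes q vanish.
roots_q = {
    (x, y)
    for x, y in product(range(BOUND), repeat=2)
    for u in range(BOUND * BOUND)
    if q(x, y, u) == 0
}

print(roots_p == roots_q, sorted(roots_p))
```

Within the search box both descriptions single out the same pairs, which is what the lemma predicts for this example.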
Proposition Every diophantine set is recursively enumerable.
Proof : Let p(x, y) be a characterizing polynomial for the diophantine set. The statement
follows from observing that p(x, y) = 0 is a primitive recursive predicate which we call
P (x, y) and from recalling that a recursively enumerable set S is exactly one for which
there is a primitive recursive predicate for which S = {x|∃ y : P (x, y)}.
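The proposition can be read as a search procedure: enumerate candidate witnesses and stop as soon as one is a root. The sketch below is a minimal illustration under our own naming (`semi_decide`, `composite_poly` are hypothetical helpers); it halts on members of the set and runs forever on non-members, which is exactly the behaviour of a semi-decision procedure.

```python
from itertools import count, product

def semi_decide(p, x, n):
    """Return a witness y in N^n with p(x, *y) == 0, searching forever if none exists."""
    for bound in count(0):
        # Visit each tuple exactly once: those whose largest entry equals `bound`.
        for y in product(range(bound + 1), repeat=n):
            if max(y) == bound and p(x, *y) == 0:
                return y

def composite_poly(x, a, b):
    # Illustrative characterizing polynomial: zero iff x = (a + 2) * (b + 2),
    # so the diophantine set it defines is the set of composite numbers.
    return x - (a + 2) * (b + 2)

witness = semi_decide(composite_poly, 15, 2)
print(witness)
```

Calling `semi_decide(composite_poly, 7, 2)` would never return, since 7 is prime; no bound on the search is available in general.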
A basic property of recursively enumerable sets is that the class is closed w.r.t. unions
and intersections. This is easily seen to hold also for diophantine sets:
Proposition: The class of diophantine predicates is closed w.r.t. (i) conjunction, (ii)
disjunction and (iii) the use of existential quantifiers.
Proof : Let upper-case $P$s be diophantine predicates and lower-case $p$s their characterizing polynomials. Then
$$P_1(x) \wedge P_2(x) \iff \exists y_1, y_2 : p_1(x, y_1)^2 + p_2(x, y_2)^2 = 0,$$
$$P_1(x) \vee P_2(x) \iff \exists y_1, y_2 : p_1(x, y_1)\, p_2(x, y_2) = 0,$$
$$\exists x_2 : P(x_1, x_2) \iff \exists y, x_2 : p(x_1, x_2, y) = 0.$$
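The conjunction and disjunction constructions can be tried out on two simple predicates. In this sketch the predicates "x is a perfect square" and "x is even", the helper `exists` and the bound `BOUND` are our own illustrative choices; the bounded search is sufficient here only because the witnesses for x below `BOUND` are themselves below `BOUND`.

```python
from itertools import product

BOUND = 30  # illustrative bound on x and on the witnesses

def exists(poly, x, n):
    """Bounded stand-in for the existential quantifier over N^n."""
    return any(poly(x, *y) == 0 for y in product(range(BOUND), repeat=n))

def p1(x, y):
    return x - y * y   # P1(x): x is a perfect square

def p2(x, y):
    return x - 2 * y   # P2(x): x is even

def p_and(x, y1, y2):
    return p1(x, y1) ** 2 + p2(x, y2) ** 2   # vanishes iff both vanish

def p_or(x, y1, y2):
    return p1(x, y1) * p2(x, y2)             # vanishes iff at least one vanishes

both = [x for x in range(BOUND) if exists(p_and, x, 2)]
either = [x for x in range(BOUND) if exists(p_or, x, 2)]
print(both)
print(either)
```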
Now, we can go through the proof of the fact that every Turing computable function is recursive. Since the encoding of an arbitrary Turing machine in terms of recursive functions uses only functions which we now know to be diophantine, one can see with a little bit of effort that the predicate “the Turing machine halts” is a diophantine predicate. Following these lines leads to two remarkable consequences:
Theorem Every recursively enumerable set is diophantine.
Theorem There is an $n \in \mathbb{N}$ and a polynomial $p$ with integer coefficients such that for any recursively enumerable set $S \subseteq \mathbb{N}$ there exists an $s \in \mathbb{N}$ so that
$$S = \{x \in \mathbb{N} \mid \exists y \in \mathbb{N}^n : p(s, x, y) = 0\}.$$
Denoting by $(n, d)$ the number of variables and the maximal degree of a polynomial, universal polynomials are known ranging from $(n, d) = (58, 4)$ (note that the possibility of having $d = 4$ follows from Skolem's trick) to $(n, d) = (9, 1.6 \times 10^{45})$.
The fact that diophantine sets and recursively enumerable sets are the same leads to
the sought undecidability of Hilbert’s tenth problem:
Corollary Let P be the class of polynomials with integer coefficients and of degree at most
four. Then,
1. There is no algorithm which upon input of any element p ∈ P decides whether or
not p has an integral root, and
2. there is no algorithm which upon input of any element p ∈ P decides whether or not
p has a non-negative integral root.
Proof : Assume there were an algorithm for deciding the existence of integral roots. Then there would be one for non-negative integral roots as well: by Lagrange's four-square theorem every non-negative integer is a sum of four squares, so $0 \in p(\mathbb{N}^n) \iff 0 \in p'(\mathbb{Z}^{4n})$, where $p'$ is obtained from $p$ by replacing each variable with a sum of four squares of new variables.
Then for any diophantine set $S = \{x \in \mathbb{N}^k \mid \exists y \in \mathbb{N}^m : p(x, y) = 0\}$, the hypothetical algorithm could be used to decide $x \in S$ for any $x$. In other words, every diophantine set would be a recursive set. However, we know that there are non-recursive sets among the recursively enumerable sets. And since the latter are exactly the diophantine sets, the assumption of such an algorithm leads to a contradiction. The fact that we can restrict ourselves to degree at most four follows from Skolem's lemma.
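The four-square substitution used in the proof can be illustrated on one-variable polynomials. The helper names `lift` and `has_root` and the two toy polynomials are assumptions made for this sketch; the finite search boxes stand in for the unbounded domains.

```python
from itertools import product

def lift(p):
    """Replace the single variable of p by a sum of four squares."""
    return lambda a, b, c, d: p(a * a + b * b + c * c + d * d)

def has_root(poly, domain, arity):
    """Bounded root search over domain^arity."""
    return any(poly(*v) == 0 for v in product(domain, repeat=arity))

def p_a(x):
    return x - 7   # root 7 is a natural number

def p_b(x):
    return x + 3   # only root is -3, which is not a natural number

N_BOX = range(0, 10)   # finite stand-in for N
Z_BOX = range(-3, 4)   # finite stand-in for Z

for p in (p_a, p_b):
    print(has_root(p, N_BOX, 1), has_root(lift(p), Z_BOX, 4))
```

For `p_a` both searches succeed (7 = 4 + 1 + 1 + 1), while for `p_b` both fail even though `p_b` has the integer root -3: the lifted polynomial asks about natural roots of the original one.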
While for polynomials with maximal degree two, there exists such an algorithm, the
case of maximal degree three is still open. Similarly, for rational (rather than integral)
roots, decidability is an open problem. For real roots, on the other hand, a result of
Tarski implies that the problem then becomes decidable.
One can extend the above undecidability result in the following direction: let $C$ be any set of cardinal numbers $\leq \aleph_0$ which is neither empty nor contains all such cardinal numbers. Then the question of whether or not the number of non-negative integral roots of a polynomial is in $C$ turns out to be undecidable as well. The proof is a reduction from the case $C = \{0\}$, which is the undecidability of Hilbert's tenth problem.
Proposition There is a polynomial $q(y_1, y_2, \ldots, y_n, x)$ with integer coefficients such that the positive integers in its range are exactly the prime numbers, in the sense that
$$q(\mathbb{N}^{n+1}) \cap (\mathbb{N} \setminus \{0\}) = \text{the set of all primes}.$$
Proof : The primes form a recursively enumerable and thus diophantine set $S$. This implies that there is a polynomial $p$ with integer coefficients such that $S = \{x \in \mathbb{N} \mid \exists y \in \mathbb{N}^n : p(x, y) = 0\}$. Defining $q(y, x) := x(1 - p(x, y)^2)$ then gives the sought polynomial, since this is positive iff $x > 0$ and $p(x, y) = 0$, in which case $q(y, x)$ indeed takes on the value of the corresponding prime.
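A characterizing polynomial for the primes is far too large to write down here, but the trick $q := x(1 - p^2)$ works for any diophantine set. The sketch below applies it to the composite numbers instead, using the simple polynomial $p(x, a, b) = x - (a+2)(b+2)$; this substitution of composites for primes, and all names in the code, are our own illustrative choices.

```python
def p(x, a, b):
    # Zero iff x = (a + 2) * (b + 2), i.e. iff x is composite with these factors.
    return x - (a + 2) * (b + 2)

def q(a, b, x):
    # Same construction as in the proof: positive iff x > 0 and p(x, a, b) == 0.
    return x * (1 - p(x, a, b) ** 2)

BOUND = 12  # illustrative bound on a and b, so factors range over 2..13

positive_values = {
    q(a, b, x)
    for a in range(BOUND)
    for b in range(BOUND)
    for x in range(200)
    if q(a, b, x) > 0
}

# Independent description of the composites below 28 for comparison.
composites = {n for n in range(4, 28) if any(n % d == 0 for d in range(2, n))}
print({v for v in positive_values if v < 28} == composites)
```

Every composite below 28 has a factorization with both factors at most 13, which is why the bounded search reproduces them all.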
Following the remark regarding universal polynomials, we obtain that for prime number
producing polynomials ten variables suffice. A similar construction leads to the following:
Proposition Let $f : \mathbb{N} \to \mathbb{N}$ be any partial recursive function. There exists a polynomial $q$ with integer coefficients such that for all $x, y \in \mathbb{N}$:
$$y = f(x) \iff \exists x_0, \ldots, x_n \in \mathbb{N} : y = q(x, x_0, \ldots, x_n).$$
Proof : The graph of $f$ is recursively enumerable and thus diophantine. So $y = f(x)$ holds iff for a certain polynomial $p$ we have $\exists x_0, \ldots, x_n : (1 - p(x_0, \ldots, x_n, x)^2) > 0 \wedge x_0 = y$. This in turn is equivalent to $\exists x_0, \ldots, x_n : (x_0 + 1)(1 - p(x_0, \ldots, x_n, x)^2) = y + 1$. Therefore the sought polynomial can be defined as $q(x, x_0, \ldots, x_n) := (x_0 + 1)(1 - p(x_0, \ldots, x_n, x)^2) - 1$.
Application of Solution
A particular form of Godel's incompleteness theorem is also a consequence of the Matiyasevich/MRDP Theorem:
Let $p(a, x_1, \ldots, x_k) = 0$ provide a Diophantine definition of a non-computable set. Let A be an algorithm that outputs a sequence of natural numbers $n$ such that the corresponding equation $p(n, x_1, \ldots, x_k) = 0$ has no solutions in natural numbers. Then there is a number $n_0$ which is not output by A while in fact the equation $p(n_0, x_1, \ldots, x_k) = 0$ has no solutions in natural numbers.
To see that the theorem is true, it suffices to notice that if there were no such number $n_0$, one could algorithmically test membership of a number $n$ in this non-computable set by simultaneously running the algorithm A to see whether $n$ is output while also checking all possible $k$-tuples of natural numbers in search of a solution of the equation $p(n, x_1, \ldots, x_k) = 0$.
We may associate an algorithm A with any of the usual formal systems such as Peano Arithmetic or ZFC by letting it systematically generate consequences of the axioms and then output a number $n$ whenever a sentence of the form
$$\neg \exists x_1, x_2, \ldots, x_k \, [\, p(n, x_1, x_2, \ldots, x_k) = 0 \,]$$
is generated. Then the theorem tells us that either a false statement of this form is proved or a true one remains unproved in the system in question.
Chapter 6
Wang’s Tile Problem
Suppose we want to cover the plane with decorated square tiles of the same size. Tiles are to be chosen from a finite number of types, and unboundedly many tiles of each type are available. Due to the decorations, however, there are local constraints on which tiles can be put next to each other for the tiling to look appealing. Is it possible to cover the whole plane with tiles of the given types? What if we require a certain tile to be used at least once? Can the tiles be used to tile a finite rectangular area with a certain boundary condition? It turns out that these problems, as formulated by Hao Wang, are all undecidable.
In 1961, Wang conjectured that if a finite set of tiles can tile the plane, then there
exists also a periodic tiling, i.e., a tiling that is invariant under translations by vectors in
a 2-dimensional lattice, like a wallpaper pattern. He also observed that this conjecture
would imply the existence of an algorithm to decide whether a given finite set of tiles can
tile the plane.
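One half of the algorithm Wang had in mind is a search for a finite square that cannot be tiled; by a compactness argument, a tile set tiles the plane iff it tiles every n × n square. The following backtracking sketch tests a single square. The representation of a tile as a tuple of edge colours (north, east, south, west) and the function name `can_tile_square` are assumptions made for this illustration.

```python
def can_tile_square(tiles, n):
    """Return True if the Wang tiles can tile an n x n square.

    Each tile is a tuple (north, east, south, west) of edge colours.
    Tiles may be reused freely but not rotated or reflected.
    """
    grid = [[None] * n for _ in range(n)]

    def place(k):
        if k == n * n:
            return True
        r, c = divmod(k, n)  # fill the square row by row, left to right
        for tile in tiles:
            north, east, south, west = tile
            # The west edge must match the east edge of the left neighbour.
            if c > 0 and grid[r][c - 1][1] != west:
                continue
            # The north edge must match the south edge of the tile above.
            if r > 0 and grid[r - 1][c][2] != north:
                continue
            grid[r][c] = tile
            if place(k + 1):
                return True
        grid[r][c] = None
        return False

    return place(0)

stackable = [("a", "b", "a", "b")]
unstackable = [("a", "b", "c", "b")]
alternating = [("a", "b", "c", "b"), ("c", "b", "a", "b")]
print(can_tile_square(stackable, 4),
      can_tile_square(unstackable, 2),
      can_tile_square(alternating, 4))
```

A failure for some n proves that the plane cannot be tiled, whereas success for every tested n proves nothing; that asymmetry is where undecidability enters.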
This conjecture was refuted by Berger in 1966. He showed that any Turing machine can be translated into a Wang tile set, and that the Wang tile set tiles the plane if and only if the Turing machine never halts. The halting problem is undecidable and thus so is Wang's original problem.
Berger constructed the first aperiodic tile set, consisting of 20426 tiles. This number was reduced repeatedly, often by well-known scientists such as Donald Knuth. A set of 13 tiles over 5 colors was long the smallest known aperiodic set of Wang tiles; Jeandel and Rao later found one with 11 tiles over 4 colors and showed that no smaller aperiodic set exists.
Proving the undecidability of Tiling problem
To prove that the tiling problem is undecidable, we reduce the halting problem for Turing machines to it. The encoding is natural and intuitive. Let $M = (\Sigma, Q, \delta, q_0, q_F, B)$ be a Turing machine with a one-way-infinite tape, where $\Sigma$ is the alphabet, $Q$ the set of states, $\delta : (Q \setminus \{q_F\}) \times \Sigma \to \Sigma \times \{L, R\} \times Q$ the transition function, $q_0 \in Q$ the initial state, $q_F \in Q$ the final state, and $B \in \Sigma$ the blank symbol. We construct a set $T_M$ of Wang tiles as shown in Figures A, B and C and described below.
Fig. A. Alphabet tile B. Merging tile C. Action tile
• For any letter a ∈ Σ, we have a tile of the form depicted in Figure A. These are to
pass the content of an inactive cell of the tape one row upward (which corresponds
to one step later in the computation process of the machine).
• For any state q ∈ Q and letter a ∈ Σ, where δ(q, a) = (b, D, p), there is a tile like
one of those in Figure C, depending on whether D = R or D = L. These correspond
to the action of the transition function and pass the new state to a neighbor cell.
• The state p received from a neighbor cell, is combined with the current content of
the cell, by a tile of the form shown in Figure B and passed to the upper tile to be
processed in next step.
Fig. Tiles for fixing the initial configuration of a Turing machine: (a) Head position. (b) Empty cells.
• The initial configuration of the machine is fixed by the tiles in the second figure. Here
* is a new colour, used to ensure that only one head appears on the tape.
Theorem Plane Tiling Problem is undecidable.
Proof : Let M be a Turing machine, taken as an instance of the Halting Problem. We use the tile set $T_M$ constructed above, and a tile set $T_R$, to construct a new set of tiles $T$ that can tile the plane if and only if M on empty input never halts.
Each tile in $T$ is basically a pair $(s, t)$, where $s \in T_R$ and $t$ is either in $T_M$ or is of an auxiliary type that passes information along a vertical or horizontal line. The two components are interpreted as layers of the tile. Let $s_0 \in T_R$ be any tile that represents the lower-left corner of a connected region. For any such tile, we place a tile $(s_0, t_0)$ in $T$, where $t_0$ is the marked tile of $T_M$ as defined above. These are the only pairs in which $t_0$ appears, and they are supposed to trigger the start of the simulation in each connected region. Any other tile $t \in T_M \setminus \{t_0\}$ is paired with tiles $s \in T_R$ that represent a cell fully inside a connected region.
Let us say a row (or column) of a region is free if it does not pass through a hole.
Otherwise, the row (or column) is blocked by the hole. The intersection of a free row and
free column is a free cell. Only free cells are used for simulation. A cell that belongs to
a blocked row (resp. a blocked column) simply passes the information along the edge of
the blocking hole. More clearly, a cell whose lower (or upper) edge touches a hole simply
passes the colour of its left edge to the right, and asks its upper (resp. lower) neighbor to
do the same. Similarly, a cell whose left (resp. right) edge touches a hole passes the colour
of its lower edge to the upper edge and asks its right (resp. left) neighbor to do the same.
Now, it should be clear that the above-described tiles can tile a connected region C of the decoration if and only if $T_M$ can tile a square with the same area as C, provided $t_0$ is used in the lower-left corner. Since the entire decoration provides connected regions of arbitrarily large net area, using the Extension Lemma we conclude that the plane can be tiled by the tiles in $T$ if and only if $T_M$ can tile an upper-right quarter of the plane with $t_0$ in the lower-left corner. This completes the proof.
Chapter 7
Busy Beaver
A busy beaver is a Turing machine that attains the maximum “operational busyness” among all the Turing machines in a certain class. The Turing machines in this class must meet certain design specifications and are required to eventually halt after being started with a blank tape. A busy beaver function quantifies these upper limits on a given type of “operational busyness”, and is a noncomputable function. In fact, a busy beaver function can be shown to grow asymptotically faster than any computable function.
To understand the busy beaver function we need to make a few changes to our Turing machine and its functioning. So we define our Turing machine as follows:
A Turing machine here is a device consisting of
1. a tape, each cell of which holds a symbol from T = {0, 1}; we say that a cell is blank if it contains a zero,
2. a read/write head,
3. a finite set of internal states Q = {0, . . . , n},
4. a list of instructions of the form (t, q) ↦ (t′, q′, D),
where D ∈ {R, L} is the direction in which the head moves. We will sometimes write $M^{(n)}$ if we want to make the number of internal states explicit.
Example 1: A TM $M^{(k+1)}$ which writes k ones onto a blank tape and then halts above the leftmost one: M : (0, q) ↦ (1, q + 1, L) for q = 0, . . . , k − 1 and M : (0, k) ↦ (0, k + 1, R).
We will use the convention that the initial state is q = 0 and the last state q = n is
the halting state - the only state upon which the machine halts. Since this results in n
active states, the machine is called an n-state Turing machine. Mathematically, the set of
instructions characterizing a Turing machine is a map,
M : T ×Q −→ T ×Q× {R,L}
In order to talk about Turing machines as devices which compute functions of the form $f : \mathbb{N}^k \to \mathbb{N}$, we need to specify some conventions about how input and output are represented. We will use unary encoding for both of them. That is, a number $x \in \mathbb{N}$ will be represented by $x + 1$ consecutive 1s on the tape with the rest of the tape blank (e.g., 2 would correspond to 0 . . . 01110 . . . 0). Similarly, $(x_1, \ldots, x_k) \in \mathbb{N}^k$ will be represented by k such blocks of 1s separated by single zeros (e.g. (0, 2) would be 0 . . . 0101110 . . . 0). A Turing machine $M_f^{(n)}$ is then said to compute the function $f : \mathbb{N}^k \to \mathbb{N}$ iff the machine, starting with the head placed on the leftmost 1 of the unary encoding of $x \in \mathbb{N}^k$, eventually halts on the leftmost 1 of the encoded $f(x)$ if $x \in \mathrm{dom}(f)$ and never halts if $x \notin \mathrm{dom}(f)$.
Definition A function $f : \mathbb{N}^k \to \mathbb{N}$ is called Turing computable iff there is an n-state Turing machine (TM) for some finite n which computes f in the sense that the Turing machine halts for every input $x \in \mathrm{dom}(f)$ with the tape eventually representing $f(x)$ and it doesn't halt if $x \notin \mathrm{dom}(f)$. Here, input and output are encoded in the above specified unary way, and at the start and (potential) end of the computation the head of the TM should be positioned above the leftmost non-blank symbol of the tape.
Example 2: the successor function s(x) = x + 1 can be computed by the following 2-state Turing machine $M_{x+1}^{(2)}$: (1, 0) ↦ (1, 0, R), (0, 0) ↦ (1, 1, L), (1, 1) ↦ (1, 1, L), (0, 1) ↦ (0, 2, R).
Example 3: the zero function z(x) = 0 can be computed by a 2-state TM: (1, 0) ↦ (0, 0, R), (0, 0) ↦ (1, 1, R), (0, 1) ↦ (0, 2, L).
Example 4: the following 5-state TM $M_{2x}^{(5)}$ implements x ↦ 2x: (0, 0) ↦ (0, 3, R), (1, 0) ↦ (0, 1, L), (0, 1) ↦ (1, 2, R), (1, 1) ↦ (1, 1, L), (0, 2) ↦ (1, 0, R), (1, 2) ↦ (1, 2, R), (0, 3) ↦ (0, 3, L), (1, 3) ↦ (0, 4, L), (0, 4) ↦ (0, 5, R), (1, 4) ↦ (1, 4, L).
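The three example machines can be checked with a small simulator. The sketch below assumes the instruction format (symbol, state) ↦ (new symbol, new state, direction) and reads the garbled arrows in Examples 2 to 4 as "maps to"; the names `run`, `decode`, `SUCC`, `ZERO` and `DOUBLE` are our own.

```python
from collections import defaultdict

def run(machine, halt_state, ones, max_steps=10_000):
    """Run `machine` on a tape with `ones` consecutive 1s, head on the leftmost 1.

    `machine` maps (symbol, state) to (new_symbol, new_state, direction).
    Returns (tape, head) once `halt_state` is reached.
    """
    tape = defaultdict(int)          # unwritten cells are blank (0)
    for i in range(ones):
        tape[i] = 1
    head, state = 0, 0               # state 0 is the initial state
    for _ in range(max_steps):
        if state == halt_state:
            return tape, head
        symbol, state, direction = machine[(tape[head], state)]
        tape[head] = symbol
        head += 1 if direction == "R" else -1
    raise RuntimeError("step budget exhausted")

def decode(tape, head):
    """Length of the block of 1s starting at the head, minus one (unary decoding)."""
    n = 0
    while tape[head + n] == 1:
        n += 1
    return n - 1

SUCC = {(1, 0): (1, 0, "R"), (0, 0): (1, 1, "L"),
        (1, 1): (1, 1, "L"), (0, 1): (0, 2, "R")}
ZERO = {(1, 0): (0, 0, "R"), (0, 0): (1, 1, "R"), (0, 1): (0, 2, "L")}
DOUBLE = {(0, 0): (0, 3, "R"), (1, 0): (0, 1, "L"), (0, 1): (1, 2, "R"),
          (1, 1): (1, 1, "L"), (0, 2): (1, 0, "R"), (1, 2): (1, 2, "R"),
          (0, 3): (0, 3, "L"), (1, 3): (0, 4, "L"), (0, 4): (0, 5, "R"),
          (1, 4): (1, 4, "L")}

x = 3                                # encoded as x + 1 = 4 ones
print(decode(*run(SUCC, 2, x + 1)),
      decode(*run(ZERO, 2, x + 1)),
      decode(*run(DOUBLE, 5, x + 1)))
```

The step budget `max_steps` is only a safeguard for experimentation; a machine computing a partial function may legitimately run forever.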
After defining the new Turing machine, we now define the concatenation of two Turing machines. Let $M_f^{(n_f)}$ and $M_g^{(n_g)}$ be two TMs with $n_f$ and $n_g$ internal states, computing functions f and g respectively. Then we can define a new $(n_f + n_g)$-state TM $M_{gf}^{(n_f + n_g)}$ via
$$M_{gf}(t, q) := \begin{cases} M_f(t, q), & q < n_f \\ M_g(t, q - n_f), & q \geq n_f \end{cases}$$
where in the second case the state returned by $M_g$ is shifted up by $n_f$ again, so that the halting state of $M_f$ serves as the initial state of $M_g$.
Its action will be such that it first computes f(x) and then uses the resulting output as an input for g. Hence, $M_{gf}$ computes the concatenation corresponding to x ↦ g(f(x)), for which we will also write gf(x). Note that the possibility of concatenating two TMs in this way builds on our requirements that the output of a computation has to be encoded in unary on the tape and that the TM (if ever) halts with the head positioned on the leftmost 1.
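Concatenation amounts to renumbering the states of the second machine. The following sketch composes the doubling and successor machines from Examples 2 and 4 to obtain x ↦ 2x + 1; the helper names `compose`, `run` and `decode` are assumptions of this illustration, and the shift is applied to both the input and the output state of the second machine.

```python
from collections import defaultdict

def run(machine, halt_state, ones, max_steps=10_000):
    """Run on a tape with `ones` consecutive 1s, head on the leftmost 1."""
    tape = defaultdict(int)
    for i in range(ones):
        tape[i] = 1
    head, state = 0, 0
    for _ in range(max_steps):
        if state == halt_state:
            return tape, head
        symbol, state, direction = machine[(tape[head], state)]
        tape[head] = symbol
        head += 1 if direction == "R" else -1
    raise RuntimeError("step budget exhausted")

def decode(tape, head):
    """Unary decoding of the block of 1s that starts at the head."""
    n = 0
    while tape[head + n] == 1:
        n += 1
    return n - 1

def compose(first, n_first, second):
    """Machine that runs `first` (n_first active states) and then `second`."""
    combined = dict(first)
    for (t, q), (t2, q2, d) in second.items():
        # Shift every state of `second` by n_first, so that the halting
        # state of `first` coincides with the start state of `second`.
        combined[(t, q + n_first)] = (t2, q2 + n_first, d)
    return combined

SUCC = {(1, 0): (1, 0, "R"), (0, 0): (1, 1, "L"),
        (1, 1): (1, 1, "L"), (0, 1): (0, 2, "R")}
DOUBLE = {(0, 0): (0, 3, "R"), (1, 0): (0, 1, "L"), (0, 1): (1, 2, "R"),
          (1, 1): (1, 1, "L"), (0, 2): (1, 0, "R"), (1, 2): (1, 2, "R"),
          (0, 3): (0, 3, "L"), (1, 3): (0, 4, "L"), (0, 4): (0, 5, "R"),
          (1, 4): (1, 4, "L")}

SUCC_AFTER_DOUBLE = compose(DOUBLE, 5, SUCC)   # 5 + 2 = 7 active states
x = 2
print(decode(*run(SUCC_AFTER_DOUBLE, 7, x + 1)))
```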
Now we move on to defining the busy beaver function. Let us assign a number $B(M) \in \mathbb{N}$ to every Turing machine M by considering its behavior when run on an initially blank tape. We set $B(M) := 0$ if M never halts and $B(M) := b$ if it halts and the total number of (not necessarily consecutive) 1s eventually written on the tape is b. Based on this we can define the busy beaver function $BB : \mathbb{N} \to \mathbb{N}$,
$$BB(n) := \max\{B(M) \mid M \in \{M^{(n)}\}\}.$$
BB(n) is defined as the largest number of 1s eventually written on an initially blank
tape by any n-state TM which halts. Note that the function is well-defined since the
maximum is taken over a finite set.
Lemma The busy beaver function is strictly monotonically increasing:
$$BB(n + 1) > BB(n) \quad \text{for all } n \in \mathbb{N}.$$
Proof Denote the TM which achieves BB(n) by $M^{(n)}$. Based on this we can define an (n + 1)-state TM $M^{(n+1)}$ whose instructions equal those of $M^{(n)}$ for all internal states q < n and which in addition follows the rule (t, n) ↦ (1, n + 1 − t, R). That is, in state n the machine moves right over 1s until it finds a blank, writes an additional 1 there and halts. By construction
$$BB(n) + 1 = B(M^{(n+1)}) \leq BB(n + 1).$$
This leads us to a common property of all Turing computable functions that they
cannot grow faster than BB.
Theorem Let $f : \mathbb{N} \to \mathbb{N}$ be any function which is Turing computable by a k-state TM $M_f$. Then for all x > 2k + 15 for which f is defined we have f(x) < BB(x).
Proof We utilize concatenation of the above discussed examples and define a (k + n + 8)-state TM
$$M_{f(2n+1)} := M_f^{(k)} \, M_{x+1}^{(2)} \, M_{2x}^{(5)} \, M^{(n+1)}$$
with $M^{(n+1)}$ being the TM which writes n consecutive 1s.
Running $M_{f(2n+1)}$ on the blank tape then produces f(2n + 1) consecutive ones before halting. Thus,
$$f(2n + 1) \leq BB(k + n + 8).$$
Moreover, monotonicity of BB implies
$$BB(k + n + 8) < BB(2n + 1) \quad \text{if } k + 7 < n,$$
so that f(2n + 1) < BB(2n + 1) if k + 7 < n.
Similarly, we construct a (k + n + 6)-state TM
$$M_{f(2n)} := M_f^{(k)} \, M_{2x}^{(5)} \, M^{(n+1)}$$
where every component is the same as defined above. Using the monotonicity of BB we obtain
$$f(2n) < BB(2n) \quad \text{if } k + 6 < n.$$
The odd case x = 2n + 1 with n > k + 7 covers all odd x ≥ 2k + 17, and the even case x = 2n with n > k + 6 covers all even x ≥ 2k + 14. Together, f(x) < BB(x) holds for every x > 2k + 15.
The above theorem is known as Rado's theorem. From its proof we see that the statement would still hold true if we required a single block of consecutive 1s in the definition of BB, rather than counting all 1s on the tape.
Having defined everything required, we now prove that the busy beaver function is not Turing computable. If BB were computable, then there would be a $k \in \mathbb{N}$ and a k-state TM computing BB, so that by Rado's theorem BB(x) < BB(x) for all sufficiently large x, which is a contradiction. Hence the busy beaver function is not computable.
This is the formalized version of the following more vague statement:
“if BB(x) is the largest finite number which can be written by an algorithm of length x,
then there cannot be a single, finite algorithm which computes BB(x) for all x.”
The fact that BB is not Turing computable doesn't mean that BB(x) cannot be computed for a given x. Rado's theorem just tells us that the complexity of the TM has to increase unboundedly with x. In fact, BB(x) is known for small values of x:
for x = 1, BB(1) = 1
for x = 2, BB(2) = 4
for x = 3, BB(3) = 6
for x = 4, BB(4) = 13
for x = 5, BB(5) ≥ 4098
for x = 6, BB(6) ≥ $3.5 \times 10^{18267}$.
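For very small n the values above can be reproduced by brute force. The sketch below enumerates every n-state machine in the convention of this chapter and runs each for a bounded number of steps; the function name and the step cap are our own assumptions. In general such a cap only yields a lower bound on BB(n), since no computable cap works for all n.

```python
from itertools import product

def busy_beaver_lower_bound(n, max_steps):
    """Most 1s left by any n-state machine that halts within max_steps steps.

    States 0..n-1 are active and state n is the halting state.
    """
    keys = [(t, q) for q in range(n) for t in (0, 1)]
    actions = [(t, q, move)
               for t in (0, 1) for q in range(n + 1) for move in (-1, 1)]
    best = 0
    for table in product(actions, repeat=len(keys)):
        machine = dict(zip(keys, table))
        tape, head, state = {}, 0, 0          # blank tape, initial state 0
        for _ in range(max_steps):
            symbol, state, move = machine[(tape.get(head, 0), state)]
            tape[head] = symbol
            head += move
            if state == n:                    # halted: count the 1s on the tape
                best = max(best, sum(tape.values()))
                break
    return best

print(busy_beaver_lower_bound(1, 10), busy_beaver_lower_bound(2, 10))
```

For n = 2 this inspects 12^4 = 20736 machines; the number of machines grows so quickly that the same loop is already slow for n = 3 and hopeless beyond n = 4.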
Chapter 8
Some More Undecidable Problems
In this section we will see some famous undecidable problems. These problems come from different fields of mathematics such as Mathematical Logic, Matrix Theory and Group Theory. We will also prove some of them undecidable. Some of the proofs are very complex and require advanced knowledge of their field, so they are omitted.
Entscheidungsproblem
The Entscheidungsproblem asks for an algorithm that takes as input a statement of a first-order logic (possibly with a finite number of axioms beyond the usual axioms of first-order logic) and answers "Yes" or "No" according to whether the statement is universally valid.
By the completeness theorem of first-order logic, a statement is universally valid if
and only if it can be deduced from the axioms, so the Entscheidungsproblem can also be
viewed as asking for an algorithm to decide whether a given statement is provable from
the axioms using the rules of logic.
Suppose that we had a general decision algorithm for statements in a first-order
language. The question whether a given Turing machine halts or not can be formulated
as a first-order statement, which would then be susceptible to the decision algorithm. But
we know that no general algorithm can decide whether a given Turing machine halts.
Mortality problem
The Mortality Problem for Turing machines with an infinite input tape is the problem
to determine, for an arbitrary machine M, whether or not M eventually halts no matter
in what configuration it is started. This is not the Halting Problem, since it means that
we cannot just consider well-behaved machines that always start in their start states,
positioned to the right of their arguments and which always end up to the right of the
answer, which immediately follows these arguments (a convention called Standard Turing
Computation). It also means that we might start with an infinite number of marked
squares on the tape, unlike a normal Turing machine, which must start with its tape only
finitely marked.
As is commonly done with Turing machines, we can, without loss of generality, limit the
tape alphabet to {0,1}, where 0 denotes a blank, and 1 is the only mark (non-blank). Using
that limitation on the tape alphabet, consider a function to compute x+ 1 from x, using
Standard Turing Computation and unary representations of numbers. Such a machine
could copy its one argument to the immediate right of the original scanned square and
then move to the end of the copy appending a 1.
This machine always halts if it is started on a finitely marked tape, with the Standard
Turing conventions obeyed. In fact, it can be written so it will always halt so long as the
tape is finitely marked, even if the machine is started in other than the correct state and
other than on the correct square. However, this machine is not mortal since, for example,
it would run forever if started just to the right of an infinite sequence of 1s; the copy
operation could never end.
Philip K. Hooper proved in 1966 that the mortality problem is undecidable. However,
it can be shown that the set of Turing machines which are mortal i.e. halt on every starting
configuration is recursively enumerable.
Word Problem
The Word Problem for groups is undecidable. This is the problem, given a finite group presentation and a word in its generators, to decide whether that word represents the group identity. In a free group, a word represents the identity if and only if the reduced word obtained by iteratively cancelling adjacent inverses is the empty word; in a general presentation the defining relators may be inserted and deleted as well, and this is what makes the problem hard. The problem can also be viewed as the algorithmic problem of deciding whether two words in the generators represent the same element.
The basic line of thought is to realize that the set of words forms a semigroup, which then allows us to extend the undecidability result to presentations of semigroups and groups. The word problem for semigroups was proven undecidable independently by Post and Markov, and the word problem for groups by Novikov and Boone. The problem is undecidable because one may encode the Halting problem for Turing machines. Basically, for each Turing machine program, one can construct a group presentation and a word such that the program halts if and only if that word is the identity.
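The cancellation of adjacent inverses is easy to implement, and it decides the word problem only when there are no relators, that is, in a free group. In the sketch below lower-case letters stand for generators and the corresponding upper-case letters for their inverses; this encoding and the name `free_reduce` are illustrative assumptions.

```python
def free_reduce(word):
    """Cancel adjacent pairs such as 'aA' or 'Aa' until none remain."""
    stack = []
    for letter in word:
        # A letter cancels against the previous one if they are the same
        # generator with opposite case (generator next to its inverse).
        if stack and stack[-1] != letter and stack[-1].lower() == letter.lower():
            stack.pop()
        else:
            stack.append(letter)
    return "".join(stack)

print(repr(free_reduce("abBA")), repr(free_reduce("abAB")))
```

The commutator `abAB` does not reduce to the empty word, so it is non-trivial in the free group on a and b. In the group presented by the generators a, b and the single relator abAB, however, the same word is the identity, which shows that free reduction alone cannot settle the word problem for general presentations.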
Conjugacy Problem
The conjugacy problem for a group G with a given presentation is the decision problem of determining, given two words x and y in G, whether or not they represent conjugate elements of G. That is, the problem is to determine whether there exists an element z of G such that $y = z x z^{-1}$. The geometric motivation is to decide whether two loops are freely homotopic.
The Conjugacy problem is undecidable. For arbitrary presentations (and not just one fixed presentation), the conjugacy problem and the word problem reduce to each other, since both are equivalent to the halting problem. That is, the halting problem reduces to the word problem, which reduces to the conjugacy problem (a word represents the identity iff it is conjugate to the empty word), and the conjugacy problem in turn reduces to the halting problem, since the pairs of conjugate words can be enumerated by a Turing machine.
Isomorphism Problem
The isomorphism problem is to find an algorithm to determine whether two finite
presentations give isomorphic groups. The geometric motivation is to see if one can
algorithmically distinguish spaces based on their fundamental groups. The undecidability of the isomorphism problem can be proved using the Adjan-Rabin theorem. This theorem gives a general construction which can be applied to any Markov property P of finitely presented groups to prove that P is undecidable. Also, all varieties of solvable groups with undecidable word problem have undecidable isomorphism problem.
Undecidable statements in ZFC
Assuming that ZFC is consistent, the statements mentioned below, or statements arising in the areas mentioned below, are provably undecidable in ZFC (the Zermelo-Fraenkel axioms plus the axiom of choice):
1. Set theory of the real line: It is an area of mathematics concerned with the application of set theory to aspects of the real numbers. Axiomatic set theory, by Godel's incompleteness theorem, contains propositions that are undecidable. Also, the real numbers are most often formalized using the Zermelo-Fraenkel axiomatization of set theory.
2. Axiom of Constructibility: It is a possible axiom for set theory that asserts that every set is constructible. Over ZF it implies the axiom of choice.
3. Continuum Hypothesis: It states that there is no infinite set with a cardinal number between that of the “small” infinite set of integers and the “large” infinite set of real numbers (the continuum).
Godel showed that no contradiction would arise if the continuum hypothesis were added to conventional ZFC set theory. Using a technique called forcing, Paul Cohen proved that no contradiction would arise either if the negation of the continuum hypothesis were added to set theory. Together, these results establish that the validity of the continuum hypothesis depends on the version of set theory being used, and that it is therefore undecidable in ZFC.
4. Group Theory: It studies the algebraic structures known as groups. It is an undecidable theory: there is no computable process to determine whether a given statement in the first-order language of group theory is true in all groups.
5. Measure Theory: It is about the study of measures. It generalizes the intuitive
notions of length, area, and volume. The earliest and most important examples
are Jordan measure and Lebesgue measure, but other examples are Borel measure,
probability measure, complex measure, and Haar measure.
6. Order Theory: It is a branch of mathematics which investigates our intuitive notion
of order using binary relations. It provides a formal framework for describing
statements such as “this is less than that” or “this precedes that”.
7. Functional Analysis: It is concerned with infinite-dimensional vector spaces (mainly
function spaces) and mappings between them. The spaces may be of different, and
possibly infinite dimensions. These mappings are called operators or, if the range is
on the real line or in the complex plane, functionals.
Matrix - Mortality Problem
Consider a finite set of d × d matrices $S = \{M_1, \ldots, M_n\} \subset M_d(\mathbb{Z})$ with integer entries. We call S mortal iff there is a non-empty word $w \in \{1, \ldots, n\}^*$, of length m say, such that the corresponding product of matrices vanishes:
$$M_{w_1} \cdots M_{w_m} = 0.$$
Example 1 Consider a set consisting of the two matrices
$$\begin{pmatrix} 0 & 1 \\ -3 & 2 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 2 & 0 \\ 0 & -1 \end{pmatrix}.$$
This set cannot be mortal, since the matrices have non-zero determinant and the determinant of any product is just the product of the determinants.
Example 2 The two matrices
$$\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
form a mortal set since their product is a nilpotent matrix whose square vanishes.
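For small examples like these, mortality can be explored by brute force over all products up to a given length. The following sketch uses our own helper names `matmul` and `find_mortal_word`; the length cap is an assumption that makes the search terminate, so a result of `None` only means that no zero product of at most that length exists.

```python
from itertools import product

def matmul(a, b):
    """Product of two square integer matrices given as tuples of tuples."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
        for i in range(n)
    )

def find_mortal_word(matrices, max_length):
    """Return the first word (tuple of indices) whose product is zero, or None."""
    n = len(matrices[0])
    zero = tuple(tuple(0 for _ in range(n)) for _ in range(n))
    for length in range(1, max_length + 1):
        for word in product(range(len(matrices)), repeat=length):
            result = matrices[word[0]]
            for index in word[1:]:
                result = matmul(result, matrices[index])
            if result == zero:
                return word
    return None

example_1 = [((0, 1), (-3, 2)), ((2, 0), (0, -1))]
example_2 = [((0, 0), (0, 1)), ((0, 1), (-1, 0))]

print(find_mortal_word(example_1, 5), find_mortal_word(example_2, 5))
```

For Example 2 the search already succeeds at length three, which is shorter than the length-four product obtained by squaring the product of the two matrices.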
Before we show that, unlike in these simple examples, there cannot be a general recipe for deciding mortality, we will introduce some tools for encoding words into products of matrices:
For words $w = a_1 \ldots a_m$ over the alphabet $A := \{1, 2, 3\}$ define an injective map $W(w) := \sum_{k=1}^{m} a_k 4^{m-k}$ from $A^*$ to $\mathbb{N}$. Denote by $|w|$ the length of a word and define a map from $A^* \times A^*$ into the set of 3 × 3 integer matrices by
$$M(u, w) := \begin{pmatrix} 4^{|u|} & 0 & 0 \\ 0 & 4^{|w|} & 0 \\ W(u) & W(w) & 1 \end{pmatrix}.$$
If we use concatenation of words and matrix multiplication as binary operations in the domain and codomain respectively, then $(u, w) \mapsto M(u, w)$ is an injective monoid homomorphism. That is, in particular $M(u_1, v_1) M(u_2, v_2) = M(u_1 u_2, v_1 v_2)$.
In addition we will need the matrix
$$B := \begin{pmatrix} 1 & 0 & 1 \\ -1 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}$$
which satisfies $B^2 = B$ and $B M(u, v) B = (4^{|u|} + W(u) - W(v)) B$.
The latter implies that $B M(u, v) B = 0$ iff $W(v) = W(1u)$, which in turn is equivalent to $v = 1u$. Now let us exploit these relations to prove the following:
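The algebraic identities used here can be verified numerically. In the sketch below words are tuples over {1, 2, 3}, and B is taken to be the matrix with rows (1, 0, 1), (−1, 0, −1), (0, 0, 0), for which both stated identities hold; the function names are our own.

```python
def W(word):
    """Read a word over {1, 2, 3} as a number in base 4."""
    value = 0
    for letter in word:
        value = 4 * value + letter
    return value

def M(u, w):
    """The 3 x 3 integer matrix encoding the pair of words (u, w)."""
    return ((4 ** len(u), 0, 0), (0, 4 ** len(w), 0), (W(u), W(w), 1))

def matmul(a, b):
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def scale(c, a):
    return tuple(tuple(c * entry for entry in row) for row in a)

B = ((1, 0, 1), (-1, 0, -1), (0, 0, 0))

u1, v1, u2, v2 = (2, 3), (3,), (3, 3, 2), (2, 2)

# Homomorphism property: concatenation of words becomes matrix multiplication.
print(matmul(M(u1, v1), M(u2, v2)) == M(u1 + u2, v1 + v2))

# B is idempotent, and B M(u, v) B is a scalar multiple of B.
print(matmul(B, B) == B)
factor = 4 ** len(u1) + W(u1) - W(v1)
print(matmul(matmul(B, M(u1, v1)), B) == scale(factor, B))

# The product vanishes exactly when v = 1u.
print(matmul(matmul(B, M(u1, (1,) + u1)), B) == scale(0, B))
```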
Proposition Let $k \in \mathbb{N}$ be such that PCP with k “dominos” is undecidable. Then there is no algorithm which upon input of a set $S \subset M_3(\mathbb{Z})$ of 2k + 1 integer matrices decides whether or not S is mortal.
Proof Consider an undecidable PCP with k dominos and choose {2, 3} as a binary alphabet for it. Denote by $(x_i, y_i)$ with i = 1, . . . , k the pairs of words appearing in the PCP. For each of these k dominos we define two matrices $M_i := M(x_i, y_i)$ and $M'_i := M(x_i, 1y_i)$. So together with B these form a set S of 2k + 1 integer matrices.
Now assume that PCP has a solution $w \in \{1, \ldots, k\}^*$. Then
$$B M'_{w_1} M_{w_2} \cdots M_{w_{|w|}} B = 0,$$
so the set S is mortal. Conversely, if S is mortal, then there is a product so that $B M(u_1, v_1) B M(u_2, v_2) B \cdots B = 0$. Since $B^2 = B$ and each $B M(u_i, v_i) B$ is a multiple of B, the product can only be zero if for at least one i we have $1 u_i = v_i$. Observing that $u_i \in \{2, 3\}^*$, this implies a solution for PCP.
Using that PCP is known to be undecidable for seven dominos, we obtain that matrix
mortality is undecidable for sets of fifteen 3 × 3 matrices. One can trade the number of
matrices with their dimension and show that matrix mortality is undecidable as well for
two 24 × 24 matrices. On the positive side, it is known that it is decidable for two 2 × 2
matrices and for instance for an arbitrary number of upper triangular 2 × 2 matrices.
Without such an additional constraint decidability is, however, not known already for
three 2× 2 matrices with integer coefficients.
Chapter 9
Hypercomputation
Hypercomputation or super-Turing computation refers to models of computation that go beyond, or are incomparable to, Turing computability. This includes various hypothetical methods for the computation of non-Turing-computable functions, following super-recursive algorithms. The difference between the two terms is that super-Turing computation usually implies that the proposed model is supposed to be physically realizable, while hypercomputation does not.
A computational model going beyond Turing machines was introduced by Alan Turing.
His paper investigated mathematical systems in which an oracle was available, which could
compute a single arbitrary (non-recursive) function from naturals to naturals. He used this
device to prove that even in those more powerful systems, undecidability is still present.
Turing’s oracle machines are strictly mathematical abstractions, and are not physically
realizable.
The Church-Turing thesis states that any function that is algorithmically computable
can be computed by a Turing machine. Hypercomputers compute functions that a Turing
machine cannot, and which are hence not computable in the Church-Turing sense. An
example of a problem a Turing machine cannot solve is the halting problem: a Turing
machine cannot decide whether an arbitrary program halts or runs forever. Some proposed
hypercomputers can simulate the program for an infinite number of steps and tell the user
whether or not the program halted.
Hypercomputation Proposals
There are many proposals for hypercomputers, but only a few are discussed here. The Zeno
machine, the oracle machine and real computers are three famous proposals. Among these
the oracle machine is the most important: it was proposed by Alan Turing himself and is
one of the oldest proposals for a hypercomputer. In this section we cover these three
proposals, starting with the Zeno machine, then moving on to the oracle machine, and
ending with real computers.
Zeno Machine
The idea of Zeno machines was first discussed by Hermann Weyl; they are named
after the ancient Greek philosopher Zeno of Elea, because their method of computation
resembles Zeno’s famous paradox. Zeno machines are also called accelerated Turing
machines. They are a hypothetical computational model related to Turing machines that
allows a countably infinite number of algorithmic steps to be performed in finite time.
Formally, a Zeno machine is a Turing machine that takes 2^(−n) units of time to
perform its n-th step. Thus the first step takes 0.5 units of time, the second takes
0.25, the third 0.125, and so on, so that after one unit of time a countably infinite number
of steps will have been performed. There are quite a few things this machine can do
which a Turing machine cannot, such as solving the halting problem. Keeping in mind that
it is just a hypothetical model, here is how a Zeno machine can solve the halting problem.
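The claim that infinitely many steps fit into one unit of time is just the geometric series.
Writing T_N for the time elapsed after N steps (a symbol introduced here only for
illustration), we have:

```latex
\[
T_N \;=\; \sum_{n=1}^{N} 2^{-n} \;=\; 1 - 2^{-N},
\qquad
\lim_{N \to \infty} T_N \;=\; 1 .
\]
```

So every finite stage of the computation is completed strictly before time 1.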
Suppose we have to decide whether a given machine M halts on an input string w. We
construct an accelerated Turing machine M′ as defined above, which is given descriptions
of M and w and simulates M on w step by step. If M halts on w, it does so after some
finite number of steps, so M′ takes less than one unit of time to finish the simulation
and report that M halts. If M′ has not reported a halt after one unit of time, then M
never halts on w. Hence the halting problem for Turing machines becomes decidable on an
accelerated Turing machine. As with Godel’s theorem, however, self-reference limits the
model: the halting problem for Zeno machines is not solvable by a Zeno machine itself.
Therefore undecidability remains in this model of hypercomputer. As we have already
seen, however powerful a computer becomes, undecidable problems will always exist.
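The timing argument can be imitated on an ordinary computer only up to a step budget. The
sketch below is an illustration under stated assumptions, not a Zeno machine: a "machine" is
modelled as a Python generator that yields once per step, and the names zeno_halting_time,
collatz and loop_forever, as well as the cap max_steps, are hypothetical. A genuine Zeno
machine would need no cap, which is exactly what no real program can provide.

```python
from fractions import Fraction
from typing import Iterator, Optional


def collatz(n: int) -> Iterator[None]:
    """A toy machine that halts when the Collatz iteration started at n reaches 1."""
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield  # one computation step


def loop_forever() -> Iterator[None]:
    """A toy machine that never halts."""
    while True:
        yield


def zeno_halting_time(machine: Iterator[None], max_steps: int) -> Optional[Fraction]:
    """Return the Zeno-clock time at which the machine halts, charging 2^(-k)
    for step k, or None if it is still running after max_steps steps."""
    clock = Fraction(0)
    for step, _ in enumerate(machine, start=1):
        if step > max_steps:
            return None  # budget exhausted: an ordinary simulator cannot conclude anything
        clock += Fraction(1, 2 ** step)
    return clock  # the generator finished, i.e. the machine halted
```

Any machine that halts does so at a clock value strictly below 1, while for loop_forever the
bounded simulation can only give up.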
Oracle Machine
In computability theory, an oracle machine is an abstract machine used to study
decision problems. It can be visualized as a Turing machine with a black box, called
an oracle, which is able to decide certain decision problems in a single operation. The
problem can be of any complexity class. Even undecidable problems, like the halting
problem, can be used.
An oracle machine is a Turing machine connected to an oracle. The oracle, in this
context, is thought of as an entity capable of answering some collection of questions, and
is usually represented as some subset A of the natural numbers. Intuitively then, the oracle
machine can perform all of the usual operations of a Turing machine, and can also query
the oracle for an answer to a specific question of the form “is x in A?”
Informal Definition
An oracle machine has
1. A work tape, a sequence of cells without beginning or end, each of which may
contain a B (for blank) or a 1.
2. A read/write head, which rests on a single cell of the work tape and can read the
data there, write new data, and move left or right along the tape.
3. A control mechanism, which can be in one of a finite number of states, and which
will perform different actions (reading data, writing data, moving the heads,
and changing states) depending on the current state and the data being
read.
4. An oracle tape, on which an infinite sequence of B’s and 1’s is printed, corresponding
to the characteristic function of the oracle set A.
5. An oracle head, which can move left or right along the oracle tape reading data,
but which cannot write.
Formal definition
An oracle Turing machine is a 4-tuple M = (Q, δ, q0, F) where
1. Q is a finite set of states.
2. δ : Q × {B, 1}^2 → Q × {B, 1} × {L, R}^2 is the transition function, where L is a left
shift and R is a right shift.
3. q0 ∈ Q is the initial state.
4. F ⊆ Q is the set of halting states.
The oracle machine is initialized with the work tape containing some input with finitely
many 1’s and the rest of the tape blank, the oracle tape containing the characteristic
function χA of the oracle set A, the Turing machine in state q0 with the read/write head
reading the first nonblank cell of the work tape, and the oracle head reading the cell of the
oracle tape which corresponds to χA(0). Thereafter it operates according to δ: if the Turing
machine is currently in state q, the read/write head is reading a symbol S1, the oracle
head is reading S2, and δ(q, S1, S2) = (q′, S′1, D1, D2), then the machine enters state q′, the
read/write head writes the symbol S′1 in place of S1, and then the read/write head moves
one cell in direction D1 and the oracle head moves one cell in direction D2. At this point,
if q′ is a halting state, the machine halts; otherwise it repeats this same procedure.
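The operation just described can be sketched as a small simulator. This is an illustration
under assumptions, not part of the formal definition: the names run_oracle_machine and
membership_machine, the step cap max_steps, and the convention that oracle cells to the left
of χA(0) are blank are all introduced here. The example machine takes n + 1 consecutive 1’s
as input and halts with a single 1 on the work tape exactly when n ∈ A.

```python
from typing import Callable, Dict, Set, Tuple

BLANK, ONE = "B", "1"
# delta maps (state, work symbol, oracle symbol) to (new state, written symbol, D1, D2)
Delta = Dict[Tuple[str, str, str], Tuple[str, str, str, str]]


def run_oracle_machine(delta: Delta, q0: str, halting: Set[str], n_ones: int,
                       chi_A: Callable[[int], bool], max_steps: int = 10_000) -> int:
    """Run on n_ones consecutive 1's and return the number of 1's left on the work tape."""
    work = {i: ONE for i in range(n_ones)}  # sparse work tape, blank elsewhere
    q, w_pos, o_pos = q0, 0, 0              # heads start on the first 1 and on chi_A(0)
    for _ in range(max_steps):
        s1 = work.get(w_pos, BLANK)
        # Assumption: oracle cells to the left of chi_A(0) read as blank.
        s2 = ONE if o_pos >= 0 and chi_A(o_pos) else BLANK
        q, written, d1, d2 = delta[(q, s1, s2)]
        if written == ONE:
            work[w_pos] = ONE
        else:
            work.pop(w_pos, None)
        w_pos += 1 if d1 == "R" else -1
        o_pos += 1 if d2 == "R" else -1
        if q in halting:
            return len(work)
    raise RuntimeError("step budget exhausted")


def membership_machine() -> Delta:
    """Transition table of a machine computing the characteristic function of A."""
    delta: Delta = {}
    for s2 in (BLANK, ONE):
        # Erase the first 1 and step the oracle head left, so that after all
        # n + 1 ones are erased the oracle head ends up on cell n.
        delta[("start", ONE, s2)] = ("scan", BLANK, "R", "L")
        delta[("scan", ONE, s2)] = ("scan", BLANK, "R", "R")
        delta[("scan", BLANK, s2)] = ("check", BLANK, "R", "R")
    delta[("check", BLANK, ONE)] = ("halt", ONE, "R", "R")      # n in A: leave one 1
    delta[("check", BLANK, BLANK)] = ("halt", BLANK, "R", "R")  # n not in A: leave none
    return delta
```

With the multiples of 3 standing in for A, calling
run_oracle_machine(membership_machine(), "start", {"halt"}, n + 1, lambda k: k % 3 == 0)
leaves one 1 on the tape when n is a multiple of 3 and none otherwise.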
We have seen how an oracle Turing machine (OTM) works; now we will look at how the
oracle helps in solving certain decision problems. In another common formulation, an OTM
is a Turing machine M that has a special read-write tape called M’s oracle tape and three
special states q_query, q_yes and q_no apart from its other states.
To execute M, we specify the input as usual and a language O ⊆ {0, 1}* that is used as
an oracle for M. While performing its computation, if M enters the state q_query, then M
checks whether the contents w of the oracle tape satisfy w ∈ O. If w ∈ O, M moves to the
state q_yes; it moves to q_no if w ∉ O. Regardless of the choice of O, a query like w ∈ O
counts as a single computational step of M. M^O(x) denotes the output of the oracle Turing
machine M on input x ∈ {0, 1}* with O ⊆ {0, 1}* as the oracle language.
Oracle machines can compute more general functions as follows: if f is a function that
takes natural numbers to natural numbers, M^A is a Turing machine with oracle A, and
whenever M^A is initialized with the work tape consisting of n+1 consecutive 1’s (and blank
elsewhere) M^A eventually halts with f(n) 1’s on the tape, then M^A is said to compute the
function f. A similar definition can be made for functions of more than one variable, or
partial functions.
If there is an oracle machine M that computes a function f with oracle A, f is said
to be A-computable. If f is the characteristic function of a set B, B is also said to be
A-computable, and M is said to be a Turing reduction from B to A.
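A Turing reduction can be pictured as an ordinary program that is allowed to call the oracle
as a black box. The sketch below is only an illustration: the class CountingOracle, the
function in_B and the particular set B = {n : exactly one of n and n + 1 lies in A} are
hypothetical, and A is taken to be a decidable set so that the code can actually run. For an
undecidable A the program text would be the same, but no real implementation of the oracle
would exist.

```python
from typing import Callable


class CountingOracle:
    """Black box answering 'is x in A?'; each query counts as a single step."""

    def __init__(self, in_A: Callable[[int], bool]) -> None:
        self._in_A = in_A
        self.queries = 0

    def query(self, x: int) -> bool:
        self.queries += 1
        return self._in_A(x)


def in_B(n: int, oracle: CountingOracle) -> bool:
    """Decide membership in B using exactly two oracle queries about A."""
    first = oracle.query(n)
    second = oracle.query(n + 1)
    return first != second
```

Here in_B plays the role of the machine M, so B is A-computable whatever the set A is.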
Halting problem
Suppose we assume the existence of an oracle which computes a non-computable function,
such as the answer to the halting problem or some equivalent. A machine with such an
oracle is a hypercomputer. But the halting paradox still applies to such machines: although
they can determine whether particular Turing machines will halt on particular inputs, they
cannot determine, in general, whether machines equivalent to themselves will halt. This
fact creates a hierarchy of machines, called the arithmetical hierarchy, each with a more
powerful halting oracle and an even harder halting problem.
Consider a “super halting problem”: given a Turing machine with an oracle for the
halting problem, decide if it halts. We can prove that this super halting problem is
unsolvable, even given an oracle for the ordinary halting problem. We simply take Turing’s
original proof that the halting problem is unsolvable, and “shift everything up a level” by
giving all the machines an oracle for the halting problem. Everything in the proof goes
through as before.
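The relativized argument can be written out as follows. This is a sketch; the symbols K (the
ordinary halting set), H and D, and the encoding ⟨·⟩ of machines as strings, are notation
introduced here for illustration.

```latex
\begin{align*}
&\text{Assume some oracle machine } H^{K} \text{ decides }
  K' = \{\, \langle M, x \rangle : M^{K} \text{ halts on input } x \,\}. \\
&\text{Define } D^{K} \text{ on input } \langle M \rangle:
  \text{ loop forever if } H^{K} \text{ accepts } \langle M, \langle M \rangle \rangle,
  \text{ and halt otherwise.} \\
&\text{Then } D^{K} \text{ halts on } \langle D \rangle
  \iff H^{K} \text{ rejects } \langle D, \langle D \rangle \rangle
  \iff D^{K} \text{ does not halt on } \langle D \rangle .
\end{align*}
```

The last line is a contradiction, so no machine with an oracle for K can decide K′.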
Friedberg and Muchnik actually proved a stronger result: there are two problems
A and B, both of which are solvable given an oracle for the halting problem, but neither
of which is solvable given an oracle for the other. These problems are constructed via an
infinite process whose purpose is to kill off every Turing machine that might reduce A to B
or B to A. The resulting problems are extremely contrived; they do not look like anything
that might arise in practice.
Real Computers
In computability theory, the theory of real computation deals with hypothetical computing
machines using infinite-precision real numbers. They are given this name because
they operate on the set of real numbers. These hypothetical computing machines can
be viewed as idealised analog computers which operate on real numbers, whereas digital
computers are limited to computable numbers. In this section we will not see the
working of a real computer. We will only get a rough idea of computable numbers, and we
will also see why real computers, or analog computers, are more powerful than today’s
digital computers.
Computable numbers are also known as the recursive numbers or the computable
reals. They are the real numbers that can be computed to within any desired precision by
a finite, terminating algorithm. Turing machines or the λ-calculus can serve as the formal
representation of algorithms. The computable numbers form a real closed field and can be
used in the place of real numbers for many, but not all, mathematical purposes.
Definition using a Turing machine
A sequence of digits, interpreted as a decimal fraction between 0 and 1, is a computable
number if there exists a Turing machine which, given n on its initial tape, terminates with
the nth digit of that number.
The key notions in the definition are that some n is specified at the start and that, for
any n, the computation takes only a finite number of steps, after which the machine
produces the desired output and terminates.
From this definition it is clear that a real computer is more powerful than a Turing machine,
i.e. is a hypercomputer, because it has to operate on numbers that cannot be generated by
any Turing machine.
This is however not the modern definition, which only requires that the result be accurate
to within any given accuracy. The informal definition above is subject to a rounding problem
called the table-maker’s dilemma, whereas the modern definition is not.
Formal Definition
A real number a is said to be computable if it can be approximated by some computable
function in the following manner: given any integer n ≥ 1, the function produces an integer
k such that:
$$\frac{k-1}{n} \le a \le \frac{k+1}{n}$$
There is another equivalent definition called ε approximation of computable numbers.
There exists a computable function which, given any positive rational error bound ε,
produces a rational number r such that:
|r − a| ≤ ε
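Both definitions can be illustrated for a = √2. The following is a minimal sketch using exact
rational arithmetic; the function names sqrt2_approx and sqrt2_k are hypothetical, and
bisection is just one convenient choice of algorithm.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_approx(eps: Fraction) -> Fraction:
    """Return a rational r with |r - sqrt(2)| <= eps, found by bisection."""
    lo, hi = Fraction(1), Fraction(2)  # invariant: lo <= sqrt(2) <= hi
    while hi - lo > eps:
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
    return lo


def sqrt2_k(n: int) -> int:
    """Return an integer k with (k - 1)/n <= sqrt(2) <= (k + 1)/n."""
    return isqrt(2 * n * n)  # k = floor(n * sqrt(2))
```

Each call terminates after finitely many steps for every input, which is what makes √2 a
computable number even though its decimal expansion is infinite.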
While the set of real numbers is uncountable, the set of computable numbers is only
countable, and thus almost all real numbers are not computable.
The arithmetical operations on computable numbers are themselves computable, in the
sense that whenever real numbers a and b are computable then the following real numbers
are also computable: a + b, a − b, ab, and a/b if b is nonzero. For example, there is a
Turing machine which on input (A, B, ε) produces output r, where A and B are descriptions
of Turing machines approximating a and b respectively, and r is an ε-approximation of a + b.
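The addition case can be sketched by modelling an approximating machine as a function from
an error bound ε to a rational. The names Approx, truncated and add are assumptions made
for this illustration; the idea is simply to ask each summand for accuracy ε/2.

```python
from fractions import Fraction
from typing import Callable

# An approximator takes eps > 0 and returns a rational within eps of its number.
Approx = Callable[[Fraction], Fraction]


def truncated(q: Fraction) -> Approx:
    """A deliberately imprecise approximator for q: rounds q down to a multiple of eps."""
    return lambda eps: (q // eps) * eps


def add(A: Approx, B: Approx) -> Approx:
    """Approximator for a + b: the two errors of at most eps/2 add up to at most eps."""
    return lambda eps: A(eps / 2) + B(eps / 2)
```

For instance, add(truncated(Fraction(1, 3)), truncated(Fraction(2, 7))) applied to
ε = 1/100 returns a rational within 1/100 of 13/21.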
The least upper bound of a bounded increasing computable sequence of computable
real numbers need not be a computable real number.
The order relation on the computable numbers is not computable, and the same holds for
the equality relation. There is no Turing machine which on input A (the description of a
Turing machine approximating the number a) outputs YES if a > 0 and NO if a ≤ 0.
To see why, suppose the machine described by A keeps outputting 0 as approximations. It
is not clear how long to wait before deciding that the machine will never output an
approximation which forces a to be positive. Thus the deciding machine would eventually
have to guess that the number equals 0, but the sequence may later become different from 0.
While the full order relation is not computable, its restriction to pairs of unequal
numbers is computable. That is, there is a program that takes as input two Turing
machines A and B approximating numbers a and b respectively, where a ≠ b, and outputs
whether a < b or a > b. It is sufficient to use ε-approximations where ε < |b − a|/2; so by
taking increasingly small ε, one can eventually decide whether a < b or a > b.
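A sketch of this comparison procedure follows. Since |b − a| is not known in advance, the
sketch halves ε until the two approximations differ by more than 2ε, at which point their
order must agree with the order of a and b. This stopping rule, and the names compare and
truncated, are assumptions of the illustration. If a = b the loop never terminates, which
matches the fact that equality is not computable.

```python
from fractions import Fraction
from typing import Callable

# An approximator takes eps > 0 and returns a rational within eps of its number.
Approx = Callable[[Fraction], Fraction]


def truncated(q: Fraction) -> Approx:
    """Approximator for q that rounds q down to a multiple of eps."""
    return lambda eps: (q // eps) * eps


def compare(A: Approx, B: Approx) -> int:
    """Return -1 if a < b and 1 if a > b; loops forever if a == b."""
    eps = Fraction(1)
    while True:
        ra, rb = A(eps), B(eps)
        if abs(ra - rb) > 2 * eps:
            # The intervals [ra - eps, ra + eps] and [rb - eps, rb + eps] are disjoint.
            return -1 if ra < rb else 1
        eps /= 2
```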
Halting problem
With real computers it is hard to show that the halting problem for Turing machines is
decidable by simulating the working of a Turing machine. So instead we argue informally
that a real computer can solve the halting problem for Turing machines. We already know
that this hypothetical model can hold real numbers with infinite precision, including
numbers that are not computable. Intuitively, producing all the digits of such a number
would require countably infinitely many steps; a real computer, by the above definitions,
can in effect compute as n → ∞ and so acquire infinite precision.
One way to make this concrete is to consider the real number whose n-th binary digit is 1
exactly when the n-th Turing machine halts. This number is not computable, but a real
computer that holds it with infinite precision could read off any digit in finite time. This
suggests that a real computer can in fact solve the halting problem. A proof using Godel’s
numbering system is omitted here. As with the other models, by Godel’s incompleteness
theorem we expect that some undecidable problems must remain in this proposed model of
a hypercomputer.