1
CD5560
FABER
Formal Languages, Automata and Models of Computation
Lecture 1
Mälardalen University
2005
2
Content
Mathematical Preliminaries
Countable Sets (Uppräkneliga mängder)
Uncountable sets (Överuppräkneliga mängder)
Languages, Alphabets and Strings
Strings & String Operations
Languages & Language Operations
Regular Expressions
3
Lecturer & Examiner
Gordana Dodig-Crnkovic
4
Teaching Assistant
Andreas Ermedahl
5
Course Home Page
http://www.idt.mdh.se/kurser/cd5560/05_04
Visit the home page regularly!
6
Why Theory of Computation?
1. A real computer can be modelled by a mathematical object: a theoretical computer.
2. A formal language is a set of strings, and can represent a computational problem.
3. A formal language can be described in many different ways that ultimately prove to be identical.
4. Simulation: the relative power of computing models can be based on the ease with which one model can simulate another.
7
5. Robustness of a computational model.
6. The Church-Turing thesis: anything that can be computed can be computed by a Turing machine.
7. Nondeterminism: languages can be described by the existence or nonexistence of computational paths.
8. Unsolvability: for some computational problems there is no corresponding algorithm that will unerringly solve them.
8
Practical Applications
1. Efficient compilation of computer languages
2. String searching
3. Identifying the limits; Recognizing difficult problems
4. Applications to other areas:
– circuit verification
– economics and game theory (finite automata as strategy models in decision-making)
– theoretical biology (L-systems as models of organism growth)
– computer graphics (L-systems)
– linguistics (modeling by grammars)
9
History
• Euclid's attempt to axiomatize geometry
(Archimedes realized, during his own efforts to define the area of a planar figure, that Euclid's attempt had failed and that additional postulates were needed. )
• Leibniz's dream of a symbolic logic
• de Morgan, Boole, Frege, Russell, Whitehead:
Mathematics as a branch of symbolic logic!
10
1900 Hilbert's program
1880-1936 first programming languages
1931 Gödel's incompleteness theorem
1936 Turing machine (shown to be equivalent to recursive functions). Commonly accepted: the TM as the ultimate computer
1950 automata
1956 language/automata hierarchy
11
Hilbert's program: every mathematical truth is to be expressed in a formal language consisting of
• a fixed alphabet of admissible symbols, and
• explicit rules of syntax for combining those symbols into meaningful words and sentences
12
Turing used a Universal Turing machine (UTM) to prove an even more powerful incompleteness theorem, more powerful because it destroyed not one but two of Hilbert's dreams:
1. finding a finite list of axioms from which all mathematical truths can be deduced
2. solving the Entscheidungsproblem ("decision problem") by producing a "fully automatic procedure" for deciding whether a given proposition (sentence) is true or false.
13
Mathematical Preliminaries
14
• Sets
• Functions
• Relations
• Graphs
• Proof Techniques
15
SETS
A set is a collection of elements
$A = \{1, 2, 3\}$
$B = \{\text{train}, \text{bus}, \text{bicycle}, \text{airplane}\}$
We write
$1 \in A$
$\text{ship} \notin B$
16
Set Representations
C = { a, b, c, d, e, f, g, h, i, j, k } (finite set)
C = { a, b, …, k }
S = { 2, 4, 6, … } (infinite set)
S = { j : j > 0, and j = 2k for some k > 0 }
S = { j : j is positive and even }
17
A = { 1, 2, 3, 4, 5 }
Universal Set: All possible elements
U = { 1 , … , 10 }
[Figure: Venn diagram with 1, 2, 3, 4, 5 inside A and 6, 7, 8, 9, 10 inside U but outside A]
18
Set Operations
A = { 1, 2, 3 } B = { 2, 3, 4, 5 }
• Union
$A \cup B = \{1, 2, 3, 4, 5\}$
• Intersection
$A \cap B = \{2, 3\}$
• Difference
$A - B = \{1\}$
$B - A = \{4, 5\}$
[Figure: Venn diagram of A and B with the region A − B marked]
19
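• Complement
Universal set = {1, …, 7}
$A = \{1, 2, 3\}$, $\overline{A} = \{4, 5, 6, 7\}$
[Figure: Venn diagram with 1, 2, 3 inside A and 4, 5, 6, 7 in $\overline{A}$]
$\overline{\overline{A}} = A$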
20
$\overline{\{\text{even integers}\}} = \{\text{odd integers}\}$
[Figure: the integers split into even (0, 2, 4, 6) and odd (1, 3, 5, 7)]
21
DeMorgan’s Laws
$\overline{A \cup B} = \overline{A} \cap \overline{B}$
$\overline{A \cap B} = \overline{A} \cup \overline{B}$
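De Morgan's laws can be checked on small finite sets. The following is only an illustrative Python sketch, not part of the course material; the universal set `U`, the example sets and the helper name `complement` are assumptions chosen for the example.

```python
# Universal set and two example subsets (values chosen for illustration).
U = set(range(1, 11))
A = {1, 2, 3}
B = {2, 3, 4, 5}

def complement(X, universe=U):
    """Complement of X relative to the universal set."""
    return universe - X

# De Morgan's laws on this example.
law1 = complement(A | B) == complement(A) & complement(B)
law2 = complement(A & B) == complement(A) | complement(B)
print(law1, law2)
```

A check on one example is of course not a proof; it only illustrates what the two identities say.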
22
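Empty, Null Set: $\emptyset$
$\emptyset = \{\ \}$
$S \cup \emptyset = S$
$S \cap \emptyset = \emptyset$
$S - \emptyset = S$
$\emptyset - S = \emptyset$
$\overline{\emptyset}$ = Universal Set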
23
Subset
A = { 1, 2, 3 } B = { 1, 2, 3, 4, 5 }
$A \subseteq B$
Proper Subset: $A \subset B$
[Figure: Venn diagram with A drawn inside B]
24
Disjoint Sets
A = { 1, 2, 3 } B = { 5, 6 }
$A \cap B = \emptyset$
[Figure: Venn diagram with A and B drawn as non-overlapping regions]
25
Set Cardinality
For finite sets
A = { 2, 5, 7 }
|A| = 3
26
Powersets
A powerset is a set of sets
Powerset of S = the set of all the subsets of S
S = { a, b, c }
$2^S = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}$
Observation: $|2^S| = 2^{|S|}$ ($8 = 2^3$)
27
Cartesian Product
A = { 2, 4 } B = { 2, 3, 5 }
$A \times B = \{(2, 2), (2, 3), (2, 5), (4, 2), (4, 3), (4, 5)\}$
$|A \times B| = |A| \cdot |B|$
Generalizes to more than two sets
$A \times B \times \cdots \times Z$
28
PROOF TECHNIQUES
• Proof by construction
• Proof by induction
• Proof by contradiction
29
Construction
We define a graph to be k-regular
if every node in the graph has degree k.
Theorem. For each even number n > 2 there exists a 3-regular graph with n nodes.
[Figure: 3-regular graphs for n = 4 (nodes 0 to 3) and n = 6 (nodes 0 to 5)]
30
Proof by Construction
Construct a graph G = (V, E) with n > 2 nodes.
$V = \{0, 1, \ldots, n-1\}$
$E = \{\{i, i+1\} : 0 \le i \le n-2\} \cup \{\{n-1, 0\}\}$ (*)
$\quad \cup\ \{\{i, i+n/2\} : 0 \le i \le n/2 - 1\}$ (**)
The nodes of this graph can be written consecutively around the circle.
(*) edges between adjacent pairs of nodes
(**) edges between nodes on opposite sides
END OF PROOF
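The construction can be tried out for small n. The sketch below is an illustration in Python, not part of the original slides; the function names `three_regular_graph` and `degrees` are hypothetical. It builds the edge set from (*) and (**) and counts the degree of every node.

```python
def three_regular_graph(n):
    """Edges of the graph from the construction, for even n > 2."""
    if n <= 2 or n % 2 != 0:
        raise ValueError("n must be an even number greater than 2")
    edges = set()
    # (*) edges between adjacent nodes on the circle, including {n-1, 0}
    for i in range(n):
        edges.add(frozenset((i, (i + 1) % n)))
    # (**) edges between nodes on opposite sides of the circle
    for i in range(n // 2):
        edges.add(frozenset((i, i + n // 2)))
    return edges

def degrees(n, edges):
    """Degree of every node 0..n-1."""
    deg = [0] * n
    for edge in edges:
        for node in edge:
            deg[node] += 1
    return deg

for n in (4, 6, 8, 10):
    print(n, degrees(n, three_regular_graph(n)))
```

Every node gets two edges from the circle and one edge to the opposite node, which is why each degree comes out as 3.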
31
Induction
We have statements $P_1, P_2, P_3, \ldots$
If we know
• for some k that $P_1, P_2, \ldots, P_k$ are true
• for any $n \ge k$ that $P_1, P_2, \ldots, P_n$ imply $P_{n+1}$
Then
Every $P_i$ is true
32
Proof by Induction
• Inductive basis
Find $P_1, P_2, \ldots, P_k$ which are true
• Inductive hypothesis
Let’s assume $P_1, P_2, \ldots, P_n$ are true, for any $n \ge k$
• Inductive step
Show that $P_{n+1}$ is true
33
Example
Theorem: A binary tree of height n has at most $2^n$ leaves.
Proof
Let L(i) be the number of leaves at level i
[Figure: a full binary tree with L(0) = 1 at the root and L(3) = 8 at level 3]
34
We want to show: $L(i) \le 2^i$
• Inductive basis
L(0) = 1 (the root node)
• Inductive hypothesis
Let’s assume $L(i) \le 2^i$ for all i = 0, 1, …, n
• Induction step
we need to show that $L(n+1) \le 2^{n+1}$
35
Induction Step
hypothesis: $L(n) \le 2^n$
[Figure: binary tree with levels n and n+1 marked]
36
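Induction Step
hypothesis: $L(n) \le 2^n$
[Figure: binary tree with levels n and n+1 marked; each node at level n has at most two children at level n+1]
$L(n+1) \le 2 \cdot L(n) \le 2 \cdot 2^n = 2^{n+1}$
END OF PROOF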
37
Proof by induction: cardinality of the power set
Claim: A set with n elements has $2^n$ subsets
Check
• The empty set {} (with zero elements) has only one subset: {}.
• The set {a} (with one element) has two subsets: {} and {a}
38
Claim: A set with n elements has $2^n$ subsets
Check (continued)
• The set {a, b} (with two elements) has four subsets: {}, {a}, {b} and {a,b}
• The set {a, b, c} (with three elements) has eight subsets: {}, {a}, {b}, {c} and {a,b}, {a,c}, {b,c}, {a,b,c}
The claim holds so far.
39
Base step
The simplest case is a set with zero elements (there is only one such set), which has $2^0 = 1$ subsets.
40
Induction step
Assume that the claim holds for all sets with k elements, i.e. assume that every set with k elements has $2^k$ subsets.
Show that the claim then also holds for all sets with k+1 elements, i.e. show that every set with k+1 elements has $2^{k+1}$ subsets.
41
We consider an arbitrary set with k+1 elements. The subsets of this set can be divided into two kinds:
Subsets that do not contain element number k+1: such a subset is a subset of the set of the first k elements, and a set with k elements has (by the assumption) $2^k$ subsets.
Subsets that contain element number k+1: such a subset can be created by taking a subset that does not contain element number k+1 and adding this element. Since there are $2^k$ subsets without element number k+1, one can also create $2^k$ subsets with this element.
In total there are $2^k + 2^k = 2 \cdot 2^k = 2^{k+1}$ subsets of the set under consideration.
END OF PROOF
(Example from the book: Diskret matematik och diskreta modeller, K. Eriksson, H. Gavel)
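The induction step translates directly into a recursive procedure. The following Python sketch is an illustration only (the function name `subsets` is hypothetical and not taken from the book): the subsets of k+1 elements are the subsets that avoid the last element together with the same subsets with the last element added.

```python
def subsets(elements):
    """All subsets of the list `elements`, built as in the induction step."""
    if not elements:
        return [set()]          # base step: the empty set has one subset
    *first_k, last = elements
    without_last = subsets(first_k)                 # subsets avoiding the last element
    with_last = [s | {last} for s in without_last]  # the same subsets plus the last element
    return without_last + with_last

for n in range(5):
    print(n, len(subsets(list(range(n)))))
```

The two halves have equal size, so the count doubles with every added element, exactly as in the proof.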
42
Proof by Contradiction
We want to prove that a statement P is true
• we assume that P is false
• then we arrive at a conclusion that contradicts our assumptions
• therefore, statement P must be true
43
Example
Theorem: $\sqrt{2}$ is not rational
Proof
Assume by contradiction that it is rational
$\sqrt{2} = n/m$
where n and m have no common factors
We will show that this is impossible
44
$\sqrt{2} = n/m \;\Rightarrow\; 2m^2 = n^2$
Therefore, $n^2$ is even $\Rightarrow$ n is even, so n = 2k
$2m^2 = 4k^2 \;\Rightarrow\; m^2 = 2k^2 \;\Rightarrow\;$ m is even, so m = 2p
Thus, m and n have common factor 2
Contradiction!
END OF PROOF
45
Countable Sets (Uppräkneliga mängder)
46
Infinite sets are either
Countable or Uncountable
47
Countable set
There is a one to one correspondence
between elements of the set
and natural numbers
48
We started with the natural numbers, then
• added infinitely many negative whole numbers to get the integers,
• then added infinitely many rational fractions to get the rationals,
• then added infinitely many irrational numbers to get the reals.
Each infinite addition seems to increase cardinality: |N| < |Z| < |Q| < |R|
But is this true? NO! (In fact |N| = |Z| = |Q|; only |R| is strictly larger.)
49
Example
The set of integers is countable
Integers: 0, 1, −1, 2, −2, …
Correspondence:
Natural numbers: 0, 1, 2, 3, 4, …
$f(n) = -n/2$ if n is even; $f(n) = (n+1)/2$ if n is odd
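The correspondence between the natural numbers 0, 1, 2, 3, 4, … and the integers 0, 1, −1, 2, −2, … can be written out as a pair of functions. This is a minimal Python sketch, assuming that even n are sent to −n/2 and odd n to (n+1)/2; the name `f_inverse` is introduced here only for illustration.

```python
def f(n):
    """Map a natural number n = 0, 1, 2, ... to an integer."""
    return -(n // 2) if n % 2 == 0 else (n + 1) // 2

def f_inverse(z):
    """Position of the integer z in the enumeration."""
    return 2 * z - 1 if z > 0 else -2 * z

print([f(n) for n in range(9)])
```

Because `f_inverse` undoes `f`, every integer is reached exactly once, which is what countability requires.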
50
Example
The set of rational numbers is countable
Positive rational numbers: 1/2, 3/4, 7/8, …
51
Naive Idea
Rational numbers: 1/1, 1/2, 1/3, …
Correspondence:
Natural numbers: 1, 2, 3, …
Doesn’t work!
We will never count numbers with numerator 2: 2/1, 2/2, 2/3, …
52
Better Approach
1/1  1/2  1/3  1/4  ...
2/1  2/2  2/3  ...
3/1  3/2  ...
4/1  ...
Rows: constant numerator (täljare)
Columns: constant denominator
53
1/1  1/2  1/3  1/4  ...
2/1  2/2  2/3  ...
3/1  3/2  ...
4/1  ...
[Figure: the same grid, with arrows visiting the entries diagonal by diagonal]
54
We proved:
the set of rational numbers is countable
by describing an enumeration procedure
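One possible enumeration procedure walks the grid diagonal by diagonal, where a diagonal collects all fractions whose numerator and denominator have the same sum. The Python sketch below is an illustration under that assumption; skipping repeated values such as 2/2 = 1/1 is an addition made here so that each rational is produced only once, and the name `positive_rationals` is hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield every positive rational exactly once, diagonal by diagonal."""
    seen = set()
    for total in count(2):                 # numerator + denominator = total
        for numerator in range(1, total):
            q = Fraction(numerator, total - numerator)
            if q not in seen:              # skip repeats such as 2/2 = 1/1
                seen.add(q)
                yield q

print([str(q) for q in islice(positive_rationals(), 10)])
```

Each diagonal is finite, so any given fraction n/m is reached after finitely many steps, unlike in the naive row-by-row idea.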
55
Definition
Let S be a set of strings
An enumeration procedure for S is an algorithm that generates all strings of S one by one
56
Observation
A set is countable if there is an enumeration procedure for it
57
Example
The set of all finite strings over $\{a, b, c\}$ is countable
Proof
We will describe the enumeration procedure
58
Naive procedure:
Produce the strings in lexicographic order:
a, aa, aaa, aaaa, ......
Doesn’t work!
Strings starting with b will never be produced
59
Better procedure
1. Produce all strings of length 1
2. Produce all strings of length 2
3. Produce all strings of length 3
4. Produce all strings of length 4
..........
Proper Order
60
Produce strings in Proper Order
length 1: a, b, c
length 2: aa, ab, ac, ba, bb, bc, ca, cb, cc
length 3: aaa, aab, aac, ......
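Proper order (all strings of length 1, then all strings of length 2, and so on) is easy to turn into a generator. The following Python sketch is only an illustration; the function name `proper_order` is an assumption.

```python
from itertools import count, islice, product

def proper_order(alphabet):
    """Yield all nonempty finite strings over `alphabet`, shortest first."""
    for length in count(1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

print(list(islice(proper_order("abc"), 15)))
```

Since there are only finitely many strings of each length, every string, including those starting with b, appears after finitely many steps.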
61
Theorem
The set of all finite strings is countable
Proof
Find an enumeration procedure
for the set of finite strings
Any finite string can be encoded
with a binary string of 0’s and 1’s
62
Produce strings in Proper Order

String = program    Natural number
length 1:
0                   0
1                   1
length 2:
00                  2
01                  3
10                  4
11                  5
length 3:
000                 6
001                 7
….                  ….
63
PROGRAM = STRING (syntactic way)
PROGRAM = FUNCTION (semantic way)
string → PROGRAM → string
natural number n → PROGRAM → natural number n
64
Uncountable Sets (Överuppräkneliga mängder)
65
Definition
A set is uncountable if it is not countable
66
Theorem
The set of all infinite strings is uncountable
Proof (by contradiction)
We assume we have an enumeration procedure for the set of infinite strings
67
Cantor’s diagonal argument

Infinite string    Encoding
$w_0$  =  $b_{00}\ b_{01}\ b_{02}\ \ldots$
$w_1$  =  $b_{10}\ b_{11}\ b_{12}\ \ldots$
$w_2$  =  $b_{20}\ b_{21}\ b_{22}\ \ldots$
...
68
Cantor’s diagonal argument
We can construct a new string w that is missing in our enumeration!
Conclusion
The set of all infinite strings is uncountable!
69
An infinite string can be seen as a FUNCTION (the n-th output is the n-th bit in the string)
Conclusion
There are some integer functions that cannot be described by finite strings (programs/algorithms).
70
Example of uncountable infinite sets
Theorem
Let S be an infinite countable set
The powerset $2^S$ of S is uncountable
71
Proof
Since S is countable, we can write
$S = \{s_1, s_2, s_3, \ldots\}$
72
Elements of the powerset have the form:
$\{s_1, s_3\}$
$\{s_5, s_7, s_9, s_{10}\}$
……
73
We encode each element of the power set with a binary string of 0’s and 1’s

Powerset element         Encoding
                         $s_1$ $s_2$ $s_3$ $s_4$ ...
$\{s_1\}$                 1  0  0  0  ...
$\{s_2, s_3\}$            0  1  1  0  ...
$\{s_1, s_3, s_4\}$       1  0  1  1  ...
...
74
Let’s assume (for contradiction) that the powerset is countable.
Then: we can enumerate the elements of the powerset
75
Powerset element    Encoding
$p_1$               1 0 0 0 0 ...
$p_2$               1 1 0 0 0 ...
$p_3$               1 1 0 1 0 ...
$p_4$               1 1 0 0 1 ...
...
76
Take the powerset element
whose bits are the complements
in the diagonal
77
$p_1$               1 0 0 0 0 ...
$p_2$               1 1 0 0 0 ...
$p_3$               1 1 0 1 0 ...
$p_4$               1 1 0 0 1 ...
...
New element: 0011...
(binary complement of diagonal)
78
The new element must be some $p_i$ of the powerset
However, that’s impossible:
from the definition of $p_i$, the i-th bit of $p_i$ must be the complement of itself
Contradiction!
79
Since we have a contradiction:
The powerset $2^S$ of S is uncountable
END OF PROOF
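The diagonal step itself is mechanical and can be tried on a finite piece of a listing. The Python sketch below is an illustration only: it works on the first four bits of four encodings (the values are the ones shown for p_1 to p_4), whereas the actual argument applies the same flip to an infinite listing. The name `diagonal_complement` is hypothetical.

```python
def diagonal_complement(rows):
    """Flip the i-th bit of the i-th row; the result differs from every row."""
    return [1 - rows[i][i] for i in range(len(rows))]

# First four bits of the encodings of p_1, p_2, p_3, p_4.
listing = [
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
]
new_element = diagonal_complement(listing)
print(new_element)
```

The new element disagrees with row i in position i, so it cannot equal any row of the listing, no matter how the listing was chosen.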
80
An Application: Languages
Example Alphabet: $\{a, b\}$
The set of all finite strings:
$S = \{a, b\}^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$ (infinite and countable)
The powerset of S contains all languages:
$2^S = \{\{\lambda\}, \{a\}, \{a, b\}, \{aa, ab, aab\}, \ldots\}$ (uncountable infinite)
with the elements labelled $L_1, L_2, L_3, L_4, \ldots$
81
Finite strings (algorithms): countable
Languages (power set of strings): uncountable
There are infinitely many more languages
than finite strings.
82
Conclusion
There are some languages that cannot be described by finite strings (algorithms).
83
Cardinal numbers
Cardinal numbers are a measure of the size of sets. The cardinal number of a finite set is simply the number of elements in the set.
Two sets have the same cardinality if the elements of one set can be paired off with the elements of the other in an exhaustive way, i.e. there is a bijection between them.
This notion of cardinality can be extended to infinite sets. For example, the set of positive integers and the set of integers have the same cardinality.
84
Cardinal numbers
In contrast, the real numbers cannot all be paired off with the integers in this way. The set of real numbers has greater cardinality than the set of integers.
Cardinal numbers can be introduced in such a way that two sets have the same cardinal number if and only if they have the same cardinality. For example, the cardinal number of the integers is called $\aleph_0$ (aleph-null; aleph is the first letter of the Hebrew alphabet).
These infinite cardinal numbers are called transfinite cardinal numbers.
85
More about infinities…
At the end of the 19th century Georg Cantor developed set theory, the logical foundation of mathematics.
Cantor introduced the concept of transfinite cardinal numbers.
He called the simplest, "smallest", infinity $\aleph_0$.
86
More about infinities…
$\aleph_0$ is the cardinal number of the countably infinite set (for example, the set of all integers).
Cantor called the cardinal number of the set of points on a line, and also of the points in a plane and in a solid, $\aleph_1$.
Were there larger infinities?
87
Yes! Cantor was able to show that the number of functions on a line is even more infinite than the number of points on the line, and he called that cardinality $\aleph_2$.
Cantor found that one could calculate with cardinal numbers just as with ordinary numbers, but the rules of arithmetic became somewhat monotonous:
$\aleph_0 + 1 = \aleph_0$, $\aleph_0 + \aleph_0 = \aleph_0$, $\aleph_0 \cdot \aleph_0 = \aleph_0$.
88
But with exponentiation something happened:
$\aleph_0^{\aleph_0}$ ($\aleph_0$ raised to the power $\aleph_0$) $= \aleph_1$.
More generally it turned out that
$2^{\aleph_n}$ (2 raised to the power $\aleph_n$) $= \aleph_{n+1}$
This meant that there were infinitely many infinities, each more powerful than the last!
89
But was it really certain that there was no infinity between the countable one and the points on the line? Cantor tried to prove the so-called continuum hypothesis.
Cantor: two different infinities $\aleph_0$ and $\aleph_1$ http://www.ii.com/math/ch/#cardinals
Continuum Hypothesis: $\aleph_0 < \aleph_1 = 2^{\aleph_0}$
See also: http://www.nyteknik.se/pub/ipsart.asp?art_id=26484
90
Languages, Alphabets and Strings
91
Languages
A language is a set of strings
A string is a sequence of letters
An alphabet is a set of symbols
Strings are defined over an alphabet: $\Sigma = \{a, b, c, \ldots, z\}$
92
Alphabets and Strings
We will use small alphabets: $\Sigma = \{a, b\}$
Strings:
a
ab
abba
baba
baaabbbaaba
u = ab
v = bbbaaa
w = abba
93
Operations on Strings
94
String Operations
$w = a_1 a_2 \cdots a_n$, for example x = abba
$v = b_1 b_2 \cdots b_m$, for example y = bbbaaa
Concatenation (sammanfogning)
$wv = a_1 a_2 \cdots a_n b_1 b_2 \cdots b_m$
xy = abbabbbaaa
95
Reverse (reversering)
$w = a_1 a_2 \cdots a_n$, for example ababaaabbb
$w^R = a_n \cdots a_2 a_1$, for example bbbaaababa
Example:
Longest odd length palindrome in a natural language:
saippuakauppias
(Finnish: soap salesman)
96
String Length
$w = a_1 a_2 \cdots a_n$
Length: $|w| = n$
Examples:
$|abba| = 4$
$|aa| = 2$
$|a| = 1$
97
Recursive Definition of Length
For any letter: $|a| = 1$
For any string wa: $|wa| = |w| + 1$
Example:
$|abba| = |abb| + 1$
$= |ab| + 1 + 1$
$= |a| + 1 + 1 + 1$
$= 1 + 1 + 1 + 1$
$= 4$
98
Length of Concatenation
$|uv| = |u| + |v|$
Example:
$u = aab$, $|u| = 3$
$v = abaab$, $|v| = 5$
$uv = aababaab$, $|uv| = 8$
$|uv| = |u| + |v| = 3 + 5 = 8$
99
Proof of Concatenation Length
Claim: $|uv| = |u| + |v|$
Proof: By induction on the length $|v|$
Induction basis: $|v| = 1$
From definition of length:
$|uv| = |u| + 1 = |u| + |v|$
100
Inductive hypothesis:
$|uv| = |u| + |v|$ for $|v| = n$
Inductive step: we will prove
$|uv| = |u| + |v|$ for $|v| = n + 1$
101
Inductive Step
Write $v = wa$, where $|w| = n$, $|a| = 1$
From definition of length:
$|uv| = |uwa| = |uw| + 1$
$|wa| = |w| + 1$
From inductive hypothesis:
$|uw| = |u| + |w|$
Thus:
$|uv| = |u| + |w| + 1 = |u| + |wa| = |u| + |v|$
END OF PROOF
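The recursive definition of length, and the claim that lengths add under concatenation, can be mirrored in a few lines of Python. This is an illustrative sketch, not part of the slides; it assumes, as a convenience, that the recursion bottoms out at the empty string with length 0 rather than at a single letter, and the function name `length` is hypothetical.

```python
def length(w):
    """Length of a string by the recursive rule |wa| = |w| + 1."""
    if w == "":
        return 0            # assumption: the empty string has length 0
    return length(w[:-1]) + 1

u, v = "aab", "abaab"
print(length(u), length(v), length(u + v))
```

Peeling off the last letter of `u + v` is the same move the inductive step makes when it writes v = wa.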
102
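Empty String
A string with no letters: $\lambda$ (Also denoted as $\varepsilon$)
Observations:
$|\lambda| = 0$
$\lambda w = w \lambda = w$
$\lambda abba = abba \lambda = abba$
$\{\ \} \ne \{\lambda\}$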
103
Substring (delsträng)
Substring of string: a subsequence of consecutive characters

String    Substring
abbab     ab
abbab     abba
abbab     b
abbab     bbab
104
Prefix and Suffix
$w = uv$, where u is a prefix and v is a suffix
Example: abbab

Prefixes    Suffixes
λ           abbab
a           bbab
ab          bab
abb         ab
abba        b
abbab       λ
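Prefixes, suffixes and substrings of a string such as abbab can be listed with string slicing. The Python sketch below is only an illustration; the function names are assumptions, and the empty string `""` plays the role of the empty string λ.

```python
def prefixes(w):
    """All prefixes of w, from the empty string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]

def suffixes(w):
    """All suffixes of w, from w itself down to the empty string."""
    return [w[i:] for i in range(len(w) + 1)]

def substrings(w):
    """All runs of consecutive characters of w (including the empty string)."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}

print(prefixes("abbab"))
print(suffixes("abbab"))
```

Pairing the i-th prefix with the i-th suffix gives every way of writing w = uv.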
105
Repetition
Definition: $w^n = \underbrace{w w \cdots w}_{n}$ (String w repeated n times)
Example: $(abba)^2 = abbaabba$
$w^0 = \lambda$
$(abba)^0 = \lambda$
106
The * (Kleene star) Operation
$\Sigma^*$: the set of all possible strings from alphabet $\Sigma$
$\Sigma = \{a, b\}$
$\Sigma^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$
[Kleene is pronounced "clay-knee"]
107
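The + Operation
$\Sigma^+$: the set of all possible strings from alphabet $\Sigma$ except $\lambda$
$\Sigma = \{a, b\}$
$\Sigma^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$
$\Sigma^+ = \Sigma^* - \{\lambda\} = \{a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$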
108
Example
$\Sigma$ = { oj, fy, usch }
$\Sigma^*$ = { λ, oj, fy, usch, ojoj, fyfy, uschusch, ojfy, ojusch, … }
$\Sigma^+$ = { oj, fy, usch, ojoj, fyfy, uschusch, ojfy, ojusch, … }
109
Operations on Languages
110
Language
A language is any subset of $\Sigma^*$
Example:
$\Sigma = \{a, b\}$
$\Sigma^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, \ldots\}$
Languages:
$\{\lambda\}$
$\{a, aa, aab\}$
$\{\lambda, abba, baba, aa, ab, aaaaaa\}$
111
Example
An infinite language $L = \{a^n b^n : n \ge 0\}$
$\lambda,\ ab,\ aabb,\ aaaaabbbbb \in L$
$abb \notin L$
112
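Operations on Languages
The usual set operations
$\{a, ab, aaaa\} \cup \{bb, ab\} = \{a, ab, bb, aaaa\}$
$\{a, ab, aaaa\} \cap \{bb, ab\} = \{ab\}$
$\{a, ab, aaaa\} - \{bb, ab\} = \{a, aaaa\}$
Complement: $\overline{L} = \Sigma^* - L$
$\Sigma^* = \{\lambda, a, b, aa, ab, ba, bb, aaa, aab, \ldots\}$
$\overline{\{a, ba\}} = \{\lambda, b, aa, ab, bb, aaa, \ldots\}$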
113
Reverse
Definition: $L^R = \{w^R : w \in L\}$
Examples:
$\{ab, aab, baba\}^R = \{ba, baa, abab\}$
$L = \{a^n b^n : n \ge 0\}$
$L^R = \{b^n a^n : n \ge 0\}$
114
Concatenation
Definition: $L_1 L_2 = \{xy : x \in L_1,\ y \in L_2\}$
Example
$\{a, ab, ba\}\{b, aa\} = \{ab, aaa, abb, abaa, bab, baaa\}$
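For finite languages, reverse and concatenation can be computed directly with set comprehensions. The following Python sketch is an illustration only; languages are represented as Python sets of strings and the function names `concat` and `reverse` are assumptions.

```python
def concat(L1, L2):
    """Concatenation L1 L2 = { xy : x in L1, y in L2 }."""
    return {x + y for x in L1 for y in L2}

def reverse(L):
    """Reverse language: every string of L written backwards."""
    return {w[::-1] for w in L}

print(sorted(concat({"a", "ab", "ba"}, {"b", "aa"})))
print(sorted(reverse({"ab", "aab", "baba"})))
```

Concatenating a 3-element language with a 2-element language gives at most 6 strings; fewer if two different pairs happen to produce the same string.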
115
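Repeat
Definition: $L^n = \underbrace{L L \cdots L}_{n}$
$\{a, b\}^3 = \{a, b\}\{a, b\}\{a, b\} = \{aaa, aab, aba, abb, baa, bab, bba, bbb\}$
Special case: $L^0 = \{\lambda\}$
$\{a, bba, aaa\}^0 = \{\lambda\}$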
116
Example
$L = \{a^n b^n : n \ge 0\}$
$L^2 = \{a^n b^n a^m b^m : n, m \ge 0\}$
$aabbaaabbb \in L^2$
117
Star-Closure (Kleene *)
Definition: $L^* = L^0 \cup L^1 \cup L^2 \cup \cdots$
Example:
$\{a, bb\}^* = \{\lambda,\ a, bb,\ aa, abb, bba, bbbb,\ aaa, aabb, abba, abbbb, \ldots\}$
118
Positive Closure
Definition: $L^+ = L^1 \cup L^2 \cup \cdots$ ($= L^* - \{\lambda\}$ when $\lambda \notin L$)
$\{a, bb\}^+ = \{a, bb,\ aa, abb, bba, bbbb,\ aaa, aabb, abba, abbbb, \ldots\}$
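Since L* and L+ are usually infinite, a program can only list their strings up to some length bound. The Python sketch below is an illustration under that assumption; languages are finite sets of strings, the empty string `""` stands for λ, and the names `power`, `star_up_to` and `plus_up_to` are hypothetical.

```python
def concat(L1, L2):
    """Concatenation L1 L2 = { xy : x in L1, y in L2 }."""
    return {x + y for x in L1 for y in L2}

def power(L, n):
    """L^n, with L^0 = {""}."""
    result = {""}
    for _ in range(n):
        result = concat(result, L)
    return result

def star_up_to(L, max_len):
    """All strings of L* whose length is at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        # extend the newly found strings by one more string of L
        frontier = {w for w in concat(frontier, L)
                    if len(w) <= max_len and w not in result}
        result |= frontier
    return result

def plus_up_to(L, max_len):
    """All strings of L+ = L L* whose length is at most max_len."""
    return {w for w in concat(L, star_up_to(L, max_len)) if len(w) <= max_len}

print(sorted(star_up_to({"a", "bb"}, 3), key=lambda w: (len(w), w)))
```

For L = {a, bb} the two results differ only by the empty string, because λ is not in L.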
119
Regular Expressions
120
Regular Expressions: Recursive Definition
Primitive regular expressions: $\emptyset$, $\lambda$, $a$ (for each symbol a of the alphabet)
Given regular expressions $r_1$ and $r_2$:
$r_1 + r_2$
$r_1 \cdot r_2$
$r_1^*$
$(r_1)$
are Regular Expressions
121
Examples
A regular expression: $(a + b \cdot c)^* \cdot (c + \emptyset)$
Not a regular expression: $(a + b +)$
122
Building Regular Expressions (Σ = {a, b, c})
Zero or more.
a* means "zero or more a's."
To say "zero or more ab's," that is, {λ, ab, abab, ababab, ...}, you need to say (ab)*.
ab* denotes {a, ab, abb, abbb, abbbb, ...}.
123
Building Regular Expressions (Σ = {a, b, c})
One or more.
Since a* means "zero or more a's", you can use aa* (or equivalently, a*a) to mean "one or more a's."
Similarly, to describe "one or more ab's," that is, {ab, abab, ababab, ...}, you can use ab(ab)*.
124
Building Regular Expressions (Σ = {a, b, c})
Any string at all.
To describe any string at all (with Σ = {a, b, c}), you can use (a+b+c)*.
Any nonempty string.
This can be written as any character from Σ followed by any string at all: (a+b+c)(a+b+c)*.
125
Building Regular Expressions (Σ = {a, b, c})
Any string not containing....
To describe any string at all that doesn't contain an a (with Σ = {a, b, c}), you can use (b+c)*.
Any string containing exactly one...
To describe any string that contains exactly one a, put "any string not containing an a," on either side of the a, like this: (b+c)*a(b+c)*.
126
Languages of Regular Expressions
$L(r)$: language of regular expression r
Example
$L((a + b \cdot c)^*) = \{\lambda, a, bc, aa, abc, bca, \ldots\}$
127
Definition
For primitive regular expressions:
$L(\emptyset) = \emptyset$
$L(\lambda) = \{\lambda\}$
$L(a) = \{a\}$
128
Definition (continued)
For regular expressions $r_1$ and $r_2$
$L(r_1 + r_2) = L(r_1) \cup L(r_2)$
$L(r_1 \cdot r_2) = L(r_1)\, L(r_2)$
$L(r_1^*) = (L(r_1))^*$
$L((r_1)) = L(r_1)$
129
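Example
Regular expression: $(a + b) \cdot a^*$
$L((a + b) \cdot a^*) = L((a + b))\, L(a^*)$
$= L(a + b)\, L(a^*)$
$= (L(a) \cup L(b))\, (L(a))^*$
$= \{a, b\}\{a\}^*$
$= \{a, b\}\{\lambda, a, aa, aaa, \ldots\}$
$= \{a, aa, aaa, \ldots, b, ba, baa, \ldots\}$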
130
Example
Regular expression
$r = (a + b)^* (a + bb)$
$L(r) = \{a, bb, aa, abb, ba, bbb, \ldots\}$
131
Example
Regular expression
$r = (aa)^* (bb)^* b$
$L(r) = \{a^{2n} b^{2m} b : n, m \ge 0\}$
132
Example
$\Sigma = \{0, 1\}$
Regular expression
$r = (0 + 1)^*\, 00\, (0 + 1)^*$
$L(r)$ = { all strings with at least two consecutive 0 }
133
Example
$\Sigma = \{0, 1\}$
Regular expression
$r_1 = (1 + 01)^* (0 + \lambda)$
(consists of repeating 1’s and 01’s).
$L(r_1)$ = { all strings without two consecutive 0 }
134
Example
L = { all strings without two consecutive 0 }
Equivalent solution:
$r_2 = (1^* 0 1 1^*)^* (0 + \lambda) + 1^* (0 + \lambda)$
(In order not to get 00 in a string, after each 0 there must be a 1, which means that strings of the form 1…101…1 are repeated. That is the first parenthesis. To take into account strings that end with 0, and those consisting solely of 1’s, the rest of the expression is added.)
135
Equivalent Regular Expressions
Definition:
Regular expressions $r_1$ and $r_2$ are equivalent if $L(r_1) = L(r_2)$
136
In order to see that both regular expressions describe the same language, you can even run the a5 program.
137
Example
L = { all strings without two consecutive 0 }
$r_1 = (1 + 01)^* (0 + \lambda)$
$r_2 = (1^* 0 1 1^*)^* (0 + \lambda) + 1^* (0 + \lambda)$
$L(r_1) = L(r_2) = L$
$r_1$ and $r_2$ are equivalent regular expressions.
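One informal way to gain confidence that two regular expressions for "no two consecutive 0" are equivalent is to compare them on all short strings. The Python sketch below is an illustration only and is unrelated to the a5 program mentioned earlier. It assumes the two expressions are (1 + 01)*(0 + λ) and (1*011*)*(0 + λ) + 1*(0 + λ), translated into the syntax of Python's `re` module (union "+" becomes "|", and (0 + λ) becomes "0?"); the names `R1`, `R2`, `binary_strings` and `agree` are hypothetical.

```python
import re
from itertools import product

# Assumed translation into Python's `re` syntax: "+" becomes "|",
# and (0 + lambda) becomes the optional symbol "0?".
R1 = re.compile(r"(1|01)*0?")
R2 = re.compile(r"((1*011*)*0?)|(1*0?)")

def binary_strings(max_len):
    """All strings over {0, 1} of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)

def agree(max_len=10):
    """True if R1, R2 and the direct test '00 not in w' agree on all short strings."""
    return all(
        (R1.fullmatch(w) is not None)
        == (R2.fullmatch(w) is not None)
        == ("00" not in w)
        for w in binary_strings(max_len)
    )

print(agree())
```

Agreement on all strings up to a fixed length is evidence, not a proof; equivalence in general has to be argued from the structure of the expressions or by comparing automata.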