Date post: | 04-Jan-2016 |
Upload: | christal-gibbs |
Sentences
The sentence as a string of words
E.g. I saw the lady with the binoculars
string = a b c d e b f
The relations of parts of a string to each other may be different
I saw the lady with the binoculars
is structurally ambiguous
Who has the binoculars?
[ I ] saw the lady [ with the binoculars ]
= [a] b c d [e b f]
I saw [ the lady with the binoculars]
= a b [c d e b f]
How can we represent the difference?
By assigning them different structures.
We can represent structures with 'trees'.
I read
the book
a. I saw [the lady with the binoculars]
S → NP VP
VP → V NP
NP → NP PP
[S [NP I] [VP [V saw] [NP [NP the lady] [PP with the binoculars]]]]
b. I [ saw the lady ] with the binoculars
S → NP VP
VP → VP PP
[S [NP I] [VP [VP saw the lady] [PP with the binoculars]]]
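The two structures can be counted mechanically. The following is a minimal sketch of a chart (CKY-style) parser that counts the trees for the example sentence. The rules S → NP VP, VP → V NP, NP → NP PP and VP → VP PP come from the trees above; PP → P NP, NP → Det N, the category labels Det and P, and the small lexicon are assumptions added only so that every word can be attached.

```python
from collections import defaultdict

# Binary rules. The first four are read off the two trees; the last two
# (NP -> Det N, PP -> P NP) are assumptions needed to attach the words.
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),
    ("NP", "NP", "PP"),
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
# Hypothetical lexicon: one category per word.
LEXICON = {"I": "NP", "saw": "V", "the": "Det", "lady": "N",
           "with": "P", "binoculars": "N"}

def count_parses(words, root="S"):
    """Return the number of distinct trees with the given root over all words."""
    n = len(words)
    # chart[(i, j)][X] = number of trees with root X spanning words[i:j]
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        chart[(i, i + 1)][LEXICON[word]] += 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    chart[(i, j)][parent] += chart[(i, k)][left] * chart[(k, j)][right]
    return chart[(0, n)][root]

print(count_parses("I saw the lady with the binoculars".split()))  # prints 2
```

The count of 2 corresponds to structures a. and b.; without the prepositional phrase ("I saw the lady") the same grammar gives a single tree.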
Rules
Assumption: natural language grammars are rule-based systems
What kind of grammars describe natural language phenomena?
What are the formal properties of grammatical rules?
Chomsky (1957) Syntactic Structures. The Hague: Mouton
Chomsky, N. and G.A. Miller (1958) Finite-state languages. Information and Control 1, 99-112
Chomsky (1959) On certain formal properties of languages. Information and Control 2, 137-167
Rules in Linguistics 1. PHONOLOGY
/s/ → [θ] / V ___ V
Rewrite /s/ as [θ] when /s/ occurs in context V ___ V
With: V = auxiliary nodes, θ = terminal nodes
Rules in Linguistics 2. SYNTAX
S → NP VP
VP → V
NP → N
Rewrite S as NP VP in any context
With: S, NP, VP = auxiliary nodes
V, N = terminal nodes
PHONOLOGY (sound system)
Maltese – Word-final devoicing
Orthography (spelling)      Pronunciation (sound)
Sabet     sab               [sa-bet]    [sap]
Ħobża     ħobż              [hob-za]    [hops]
Vjaġġi    vjaġġ             [vjağ-ği]   [vjačč]
voiced [+vd]: [b, z, ğ]     voiceless [-vd]: [p, s, č]
[+vd] → [-vd] / ____ # (for # = end of word)
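As an illustration, the devoicing rule can be sketched as a small string rewrite over the transcriptions above. The function name `devoice_final` is hypothetical. One assumption goes beyond the rule as written: the examples [hops] and [vjačč] show the whole word-final run of voiced obstruents devoicing, so the sketch rewrites that whole run rather than only the last segment.

```python
# Voiced -> voiceless pairs listed on the slide: [b, z, ğ] -> [p, s, č].
DEVOICE = {"b": "p", "z": "s", "ğ": "č"}

def devoice_final(transcription):
    """Devoice the run of voiced obstruents at the end of a word.

    Assumption: devoicing applies to the whole final run (as in hobz -> hops),
    not only to the very last segment.
    """
    chars = list(transcription)
    i = len(chars) - 1
    while i >= 0 and chars[i] in DEVOICE:
        chars[i] = DEVOICE[chars[i]]
        i -= 1
    return "".join(chars)

for form in ["sab", "hobz", "vjağğ", "sabet"]:
    print(form, "->", devoice_final(form))
```

Forms such as "sabet", where the voiced consonant is not word-final, come out unchanged.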
MORPHOLOGY (word formation)
Maltese – Progressive assimilation in 3fsg imperfective (present)
Marker for verb in 3rd person feminine singular imperfective: t- (3fsg impf = she)
e.g. she breaks = t-kisser
I break = n-kisser
t-kisser t-ressaq
3fsg-break 3fsg-move
she breaks she moves
s-sakkar d-dur
3fsg-lock 3fsg-turn
she locks she turns
*t-sakkar * t-dur
t → s, d, etc. / ____ [s, d, etc.
|                     [+cor]
μ
[3fsg]
(with μ = morpheme, C = consonant, cor = coronal)
SYNTAX (phrase/sentence formation)
SENTENCE: The boy kissed the girl
SUBJECT            PREDICATE
NOUN PHRASE        VERB PHRASE
ART + NOUN         VERB + NOUN PHRASE
S → NP VP
VP → V NP
NP → ART N
SEMANTICS (meaning) The lion attacks the hunter
ATTACK (a, b)
a λy [ATTACK (y, b)]
λz λy [ATTACK (y, z)] b (with a = the lion, b = the hunter)
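The bottom-up composition above can be mimicked with curried functions. This is only a sketch: representing the proposition ATTACK (a, b) as a Python tuple is an assumption made for illustration.

```python
# Curried predicate: takes the object first, then the subject,
# mirroring λz λy [ATTACK (y, z)].
attack = lambda z: lambda y: ("ATTACK", y, z)

a, b = "the lion", "the hunter"
verb_phrase = attack(b)       # λy [ATTACK (y, b)]
sentence = verb_phrase(a)     # ATTACK (a, b)
print(sentence)               # prints ('ATTACK', 'the lion', 'the hunter')
```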
Chomsky Hierarchy
0. Type 0 (recursively enumerable) languages
Only restriction on rules: left-hand side cannot be the empty string (* Ø → …)
1. Context-Sensitive languages - Context-Sensitive (CS) rules
2. Context-Free languages - Context-Free (CF) rules
3. Regular languages - Regular (right-linear or left-linear) rules
0 ⊃ 1 ⊃ 2 ⊃ 3
a ⊃ b meaning a properly includes b (a is a proper superset of b), i.e. b is a proper subset of a
Generative power
0. Type 0 (recursively enumerable) languages
- only restriction on rules: left-hand side cannot be the empty string (* Ø → …)
- is the most powerful system
3. Type 3 (regular languages)
- is the least powerful
Rule Type – 3
Name: Regular
Example: Finite State Automata (Markov-process Grammar)
Rule type:
a) right-linear
A → xB or A → x
with: A, B = auxiliary nodes and x = terminal node
b) or left-linear
A → Bx or A → x
Generates: a^m b^n with m, n ≥ 1
Cannot guarantee that there are as many a’s as b’s; no embedding
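A minimal sketch of a finite-state automaton for a^m b^n (m, n ≥ 1) makes the limitation concrete. The state names q0, q1, q2 and the function name `accepts_ambn` are illustrative assumptions. The automaton only remembers which state it is in, so it has no way of comparing the number of a's with the number of b's.

```python
# Transition table for a^m b^n with m, n >= 1.
# q0 = start, q1 = reading a's, q2 = reading b's (the only accepting state).
TRANSITIONS = {
    ("q0", "a"): "q1",
    ("q1", "a"): "q1",
    ("q1", "b"): "q2",
    ("q2", "b"): "q2",
}

def accepts_ambn(string):
    """Return True if the automaton ends in the accepting state q2."""
    state = "q0"
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition defined: reject
            return False
    return state == "q2"

print(accepts_ambn("aabb"), accepts_ambn("aab"), accepts_ambn("aba"))
# prints True True False
```

Both "aabb" and "aab" are accepted, which is exactly the sense in which the grammar cannot guarantee equal numbers of a's and b's.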
A regular grammar for natural language sentences
S → the A
A → cat B
A → mouse B
A → duck B
B → bites C
B → sees C
B → eats C
C → the D
D → boy
D → girl
D → monkey
the cat bites the boy
the mouse eats the monkey
the duck sees the girl
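The grammar above is small enough to enumerate completely. The sketch below stores each right-linear rule as a (word, next node) pair, with `None` marking rules that end the derivation; this encoding and the name `generate` are assumptions made for illustration.

```python
# Right-linear rules from the slide: a word followed by an optional next node.
RULES = {
    "S": [("the", "A")],
    "A": [("cat", "B"), ("mouse", "B"), ("duck", "B")],
    "B": [("bites", "C"), ("sees", "C"), ("eats", "C")],
    "C": [("the", "D")],
    "D": [("boy", None), ("girl", None), ("monkey", None)],
}

def generate(node="S"):
    """Yield every word sequence derivable from `node`."""
    for word, next_node in RULES[node]:
        if next_node is None:
            yield [word]
        else:
            for rest in generate(next_node):
                yield [word] + rest

sentences = [" ".join(words) for words in generate()]
print(len(sentences))   # prints 27
print(sentences[0])     # prints "the cat bites the boy"
```

The grammar generates 3 × 3 × 3 = 27 sentences, all with the same flat left-to-right shape.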
Regular grammars
Grammar 1:
A → a
A → a B
B → b A
Grammar 2:
A → a
A → B a
B → A b
Grammar 3:
A → a
A → a B
B → b
B → b A
Grammar 4:
A → a
A → B a
B → b
B → A b
Grammar 5:
S → a A
S → b B
A → a S
B → b b S
S → ε
Grammar 6:
A → A a
A → B a
B → b
B → A b
A → a
Grammars: non-regular
Grammar 6:
S → A B
S → b B
A → a S
B → b b S
S → ε
Grammar 7:
A → a
A → B a
B → b
B → b A
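Whether a rule set has the regular form can be checked mechanically. The sketch below is one possible check, under these assumptions: rules are given as (left-hand side, list of right-hand symbols), uppercase symbols are auxiliary nodes, lowercase symbols are terminals, and a run of several terminals before or after the auxiliary node (as in B → b b S) is tolerated. The label "not regular" refers to the form of the grammar, not to the language it generates.

```python
def classify(rules):
    """Classify a rule set as 'right-linear', 'left-linear' or 'not regular'."""
    right = left = True
    for _lhs, rhs in rules:
        aux_positions = [i for i, sym in enumerate(rhs) if sym.isupper()]
        if len(aux_positions) > 1:          # e.g. S -> A B
            return "not regular"
        if aux_positions:
            if aux_positions[0] != len(rhs) - 1:
                right = False               # auxiliary node is not last
            if aux_positions[0] != 0:
                left = False                # auxiliary node is not first
    if right:
        return "right-linear"
    if left:
        return "left-linear"
    return "not regular"                    # mixes left- and right-linear rules

grammar_1 = [("A", ["a"]), ("A", ["a", "B"]), ("B", ["b", "A"])]
grammar_2 = [("A", ["a"]), ("A", ["B", "a"]), ("B", ["A", "b"])]
grammar_7 = [("A", ["a"]), ("A", ["B", "a"]), ("B", ["b"]), ("B", ["b", "A"])]

print(classify(grammar_1), classify(grammar_2), classify(grammar_7))
# prints right-linear left-linear not regular
```

Grammar 7 fails because it mixes a left-linear rule (A → B a) with a right-linear one (B → b A); a rule such as S → A B fails because it has two auxiliary nodes on the right-hand side.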
Rule Type – 2
Name: Context Free
Example: Phrase Structure Grammars/Push-Down Automata
Rule type:
A → γ
with: A = auxiliary node, γ = any number of terminal or auxiliary nodes
Recursiveness (centre embedding) allowed:
A → α A β
CF Grammar
A Context Free grammar consists of:
a) a finite terminal vocabulary VT
b) a finite auxiliary vocabulary VA
c) an axiom S ∈ VA
d) a finite number of context free rules of form A → γ, where A ∈ VA and γ ∈ {VA ∪ VT}*
In natural language syntax S is interpreted as the start symbol for sentence, as in S → NP VP
CF Grammars
The following languages cannot be generated by a regular grammar:
Language 1: a^n b^n      Language 2: mirror image
ab                       abaaba
aabb                     abbaabba
Context-Free rules:
A → a A a
A → a b
A → b A b
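Both languages can be recognised by peeling one symbol off each end of the string, which is what a self-embedding context-free rule does. The sketch below uses the standard textbook rules S → a S b, S → a b for a^n b^n and S → a S a, S → b S b, S → a a, S → b b for even-length mirror-image strings. These rule sets and the function names are assumptions chosen for illustration, not necessarily the rules intended on the slide.

```python
def is_anbn(s):
    """Recognise a^n b^n (n >= 1) via S -> a S b and S -> a b."""
    if s == "ab":
        return True
    return len(s) > 2 and s[0] == "a" and s[-1] == "b" and is_anbn(s[1:-1])

def is_mirror(s):
    """Recognise even-length mirror-image strings over {a, b} via
    S -> a S a, S -> b S b, S -> a a, S -> b b."""
    if s in ("aa", "bb"):
        return True
    return len(s) > 2 and s[0] == s[-1] and s[0] in "ab" and is_mirror(s[1:-1])

print(is_anbn("aabb"), is_anbn("aab"))           # prints True False
print(is_mirror("abbaabba"), is_mirror("abab"))  # prints True False
```

Each recursive call corresponds to one application of a rule with the auxiliary node in the middle, which is the kind of rule a regular grammar does not allow.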
Natural language
Is English regular or CF?
If centre embedding is required, then it cannot be regular
Centre Embedding:
1. [The cat] [likes tuna fish]
a b
2. The cat the dog chased likes tuna fish
a a b b
3. The cat the dog the rat bit chased likes tuna fish
a a a b b b
4. The cat the dog the rat the elephant admired bit chased likes tuna fish
a a a a b b b b
ab
aabb
aaabbb
aaaabbbb
Natural language
Is English regular or CF?
If centre embedding is required, then it cannot be regular
Centre Embedding
1.[The cat] [likes tuna fish]
a b
= ab
2.[The cat] [the dog] [chased] [likes tuna fish]
a a b b
= aabb
3. [The cat] [the dog] [the rat] [bit] [chased] [likes ...]
a a a b b b
= aaabbb
4. [The cat] [the dog] [the rat] [the elephant] [admired] [bit] [chased] [likes ....]
a a a a b b b b
= aaaabbbb
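The nesting in these examples follows one pattern: the first subject pairs with the last verb phrase, the second subject with the second-to-last, and so on. A minimal sketch, assuming each clause is given as a hypothetical (subject, verb phrase) pair:

```python
def centre_embed(pairs):
    """Build a centre-embedded string from (subject, verb phrase) pairs.

    All subjects come first, then the verb phrases in reverse order,
    so the first subject is matched by the last verb phrase.
    """
    subjects = [subject for subject, _ in pairs]
    verbs = [verb for _, verb in pairs]
    return " ".join(subjects + verbs[::-1])

def pattern(pairs):
    """Return the corresponding a^n b^n pattern."""
    return "a" * len(pairs) + "b" * len(pairs)

pairs = [("the cat", "likes tuna fish"), ("the dog", "chased"), ("the rat", "bit")]
for n in range(1, len(pairs) + 1):
    print(pattern(pairs[:n]), "=", centre_embed(pairs[:n]))
```

With three pairs this produces "the cat the dog the rat bit chased likes tuna fish" with pattern aaabbb; every extra subject requires exactly one extra verb phrase.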
Natural language 2
More Centre Embedding:
1. If S1, then S2
a a
2. Either S3, or S4
b b
3. The man who said S5 is arriving today
4. The man who said S6 is arriving the day after
Sentence with embedding:
If either the man who said S5 is arriving today or the man who said S5 is arriving tomorrow, then the man who said S6 is arriving the day after
= abba
Natural language 2
More Centre Embedding:
1. If S1, then S2
a a
2. Either S3, or S4
b b
Sentence with embedding:
If either the man is arriving today or the woman is arriving tomorrow, then the child is arriving the day after.
a = [if
b = [either the man is arriving today]
b = [or the woman is arriving tomorrow]]
a = [then the child is arriving the day after]
= abba
CS languages
The following languages cannot be generated by a CF grammar (by pumping lemma): a^n b^m c^n d^m
Swiss German: A string of dative nouns (e.g. aa), followed by a string of accusative nouns (e.g. bbb), followed by a string of dative-taking verbs (cc), followed by a string of accusative-taking verbs (ddd)
= aabbbccddd
= a^n b^m c^n d^m
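The crossing dependencies in this pattern (a's matched with c's, b's matched with d's) can be checked directly by counting. The sketch below is a plain membership test, not a grammar; the function name `is_cross_serial` and the requirement n, m ≥ 1 are assumptions.

```python
import re

def is_cross_serial(s):
    """Return True if s has the form a^n b^m c^n d^m with n, m >= 1."""
    match = re.fullmatch(r"(a+)(b+)(c+)(d+)", s)
    if match is None:
        return False
    n_a, n_b, n_c, n_d = (len(group) for group in match.groups())
    # The a-block must match the c-block and the b-block the d-block.
    return n_a == n_c and n_b == n_d

print(is_cross_serial("aabbbccddd"))  # prints True
print(is_cross_serial("aabbbcccdd"))  # prints False
```

The regular expression only checks the order of the four blocks; the two length comparisons are what neither a regular nor a context-free grammar can express for this pattern.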