Post on 14-Jan-2016
1
CS 385 Fall 2006
Chapter 14
Understanding Natural Language
(omit 14.4)
2
The Problem
Language is fuzzy
– I feel funny
– Fruit flies like bananas
– Is there water in the fridge?
Early history:
– dictionary translation, word by word
– out of sight, out of mind → the person is blind and insane
– did not address interrelation among words
– more to it: what you know beyond the simple meaning of a word
– Doug Lenat's CYC project (1984), now Cyc Corporation
  represent world knowledge via logic and frames
  12 years, 35 million dollars
  questionable results
  http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/nlu
3
Levels of Analysis (big picture)
Prosody – rhythm and intonation of language
Phonology – the sounds which comprise language (phonemes)
– speech analysis: identify phonemes and combine them into words
Morphology – the components that make up words (ing, ed, ...)
Syntax – rules for combining words into legal (syntactically correct) sentences
– used to parse a sentence
– the most successful level, because it is formalized
Semantics – attaching meaning to words, phrases, and sentences
Pragmatics – how is language usually used? “How are you” → “fine”
World knowledge – general background necessary to interpret text or conversation
– “My thesis draft is due tomorrow” makes you think of ...?
4
Today
Acceptance that a general conversationalist is unlikely
Scale back to interpretation in restricted applications
– MS Word grammar and style checker
– others?
Audio to text, but little interpretation
– cell phone speed dial
– United Airlines customer service
– medical transcription
– Acura TL recognizes 650 voice commands
Normal steps of linguistic analysis:
– parsing
– semantic interpretation
– expanded representation
5
Specifying a Grammar
1. sentence → np vp
2. np → n
3. np → art n
4. vp → v
5. vp → v np
(rules 1–5: rewrite rules)
6. art → a
7. art → the
8. n → man
9. n → dog
10. v → likes
11. v → bites
(rules 6–11: terminals)
Legal sentence: a string of terminals that can be derived from these rules.
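As a rough illustration of "a string of terminals that can be derived from these rules", the sketch below stores the eleven rules as a Python dictionary and enumerates every legal sentence. The names `GRAMMAR`, `expand` and `LEGAL` are made up for this example, and exhaustive enumeration only works because this particular grammar is not recursive.

```python
from itertools import product

# The eleven rules from the slide: nonterminal -> alternative right-hand sides.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"]],
    "vp": [["v"], ["v", "np"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"]],
}

def expand(symbol):
    """Return every string of terminals derivable from symbol, as tuples of words."""
    if symbol not in GRAMMAR:              # a terminal derives only itself
        return [(symbol,)]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine the expansions of each right-hand-side symbol, left to right.
        for parts in product(*(expand(s) for s in rhs)):
            results.append(tuple(word for part in parts for word in part))
    return results

LEGAL = {" ".join(words) for words in expand("sentence")}

print(len(LEGAL))                          # number of legal sentences
print("the man bites the dog" in LEGAL)    # a legal sentence
print("man the bites" in LEGAL)            # not derivable
```

Note that the grammar accepts sentences such as "man likes" as well; it only describes word order, not meaning.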
6
Parse Tree for Tony growled at Bob
sentence
  noun phrase
    noun: Tony
  verb phrase
    verb: growled
    prepositional phrase
      preposition: at
      noun phrase
        noun: Bob
7
Interpret it with a Semantic Net
Construct a semantic net describing mammals:
– mammals are covered with hair
– tigers are a subclass of mammal that has stripes and growls
– Tony is a tiger
– humans are a subclass of mammal that are frightened by tigers
– Bob is a human
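[Figure: semantic net with the following links]
tiger –subclass→ mammal
human –subclass→ mammal
Tony –instance→ tiger
Bob –instance→ human
mammal –has prop→ hair
tiger –has prop→ stripes
tiger –has prop→ growls
tiger –frightens→ human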
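The net above can be sketched as a list of labelled edges, with properties inherited along instance and subclass links. This is only an illustrative encoding: the triple layout and the function names `ancestors`, `properties` and `frightens` are assumptions made for the example, not something defined on the slide.

```python
# Edges of the net as (source, relation, target) triples, taken from the slide.
EDGES = [
    ("tiger", "subclass", "mammal"),
    ("human", "subclass", "mammal"),
    ("Tony", "instance", "tiger"),
    ("Bob", "instance", "human"),
    ("mammal", "has_prop", "hair"),
    ("tiger", "has_prop", "stripes"),
    ("tiger", "has_prop", "growls"),
    ("tiger", "frightens", "human"),
]

def ancestors(node):
    """Return node followed by every class reachable via instance/subclass links."""
    chain = [node]
    for src, rel, dst in EDGES:
        if src == node and rel in ("instance", "subclass"):
            chain.extend(ancestors(dst))
    return chain

def properties(node):
    """Collect has_prop values inherited along the instance/subclass chain."""
    classes = ancestors(node)
    return {dst for src, rel, dst in EDGES if rel == "has_prop" and src in classes}

def frightens(a, b):
    """True if some class of a has a 'frightens' link to some class of b."""
    return any((x, "frightens", y) in EDGES
               for x in ancestors(a) for y in ancestors(b))

print(sorted(properties("Tony")))   # ['growls', 'hair', 'stripes']
print(frightens("Tony", "Bob"))     # True
print(frightens("Bob", "Tony"))     # False
```

The last two calls show the kind of inference used on the next slide: nothing in the sentence says Bob is frightened, but the net does.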
8
Semantic Interpretation (conceptual graph)
Growling has an agent and an object (from parse tree):
[tiger: Tony] ←(agent)– [growled at] –(object)→ [person: Bob]
From the semantic net, we know Tony is a tiger and Bob should be frightened:
[tiger] ←(agent)– [frightens] –(object)→ [person]
Do a join to get the expanded representation of the sentence meaning:
[tiger: Tony] ←(agent)– [growled at] –(object)→ [person: Bob]
[tiger: Tony] ←(agent)– [frightens] –(object)→ [person: Bob]
9
Fig 14.2 Stages in producing an internal representation of a sentence.
10
Parsing The man bites the dog
Top down: start at sentence symbol and work down to a string of terminals
1. sentence → np vp
2. np → n
3. np → art n
4. vp → v
5. vp → v np
6. art → a
7. art → the
8. n → man
9. n → dog
10. v → likes
11. v → bites
sentence
→ np vp
→ art n vp
→ the n vp
→ the man vp
→ the man v np
→ the man bites np
→ the man bites art n
→ the man bites the n
→ the man bites the dog
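The derivation above can be reproduced with a small top-down, depth-first parser that tries the rules in order and backs up when a choice fails. This is a minimal sketch under stated assumptions: the function name `parse` and the dictionary layout are invented for the example, and the approach relies on the grammar having no left recursion.

```python
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"]],
    "vp": [["v"], ["v", "np"]],
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"]],
}

def parse(symbols, words):
    """Top-down search: derive `symbols` so that they spell out exactly `words`.
    Yields, for each successful parse, the list of (lhs, rhs) rules applied
    in leftmost order."""
    if not symbols:
        if not words:                  # nothing left to derive, nothing left to read
            yield []
        return
    first, rest = symbols[0], symbols[1:]
    if first not in GRAMMAR:           # terminal: must match the next input word
        if words and words[0] == first:
            yield from parse(rest, words[1:])
        return
    for rhs in GRAMMAR[first]:         # nonterminal: try each rule, backtrack on failure
        for used in parse(rhs + rest, words):
            yield [(first, rhs)] + used

rules = next(parse(["sentence"], "the man bites the dog".split()))

# Replay the rules to print the sentential forms, as on the slide.
form = ["sentence"]
print(" ".join(form))
for lhs, rhs in rules:
    i = form.index(lhs)                # leftmost nonterminal
    form[i:i + 1] = rhs
    print("→", " ".join(form))
```

The parser first tries np → n and vp → v, fails, and only then reaches the rules used in the derivation, which is the backing up and retracing that goal-driven search requires.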
11
Resulting parse tree for “The man bites the dog”
Problem: we needed to know where we are going
Goal driven: need to back up and retrace a lot
New approach: transition net parsers
12
Transition Net Parsers
Grammar: a set of finite state machines or transition nets
One for each nonterminal
Successful transition through the network == replacing the nonterminal by the rhs of a grammar rule
E.g. first arc in sentence ATN replaced by a path through the np ATN
13
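The man bites the dog:
[Figure: transition networks]
sentence: Sinitial –noun phrase→ –verb phrase→ Sfinal
np: Sinitial –art→ –n→ Sfinal, or Sinitial –n→ Sfinal
art: Sinitial –a→ Sfinal, or Sinitial –the→ Sfinal
n: Sinitial –man→ Sfinal, or Sinitial –dog→ Sfinal
Words matched so far: the man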
14
How would you augment the grammar to allow "bite the dog"?
sentence: Sinitial –noun phrase→ –verb phrase→ Sfinal, plus a new arc Sinitial –verb phrase→ Sfinal
15
What paths would be examined to parse it?
Begin with sentence network and try to move along top arc
Go to np network
Try to move along bottom arc
Go to noun network
Try man. fail
Try dog. fail
Try to move along top arc
Go to article network
Try “a” fail
Try “the” fail
Fail article
Fail np network
[Figure: sentence network with the added verb phrase arc, and the noun phrase network]
16
Try to move along bottom arc
Go to vp network
Go to v network
Try likes. fail
Try bites. fail
Try bite. succeed
Go to np network
Go to art network
Try a. fail
Try the. succeed
Go to n network
Try man. fail
Try dog. succeed
Succeed (np)
Succeed (vp)
[Figure: sentence network with the added verb phrase arc, and the verb phrase network: Sinitial –v→ –np→ Sfinal, or Sinitial –v→ Sfinal]
17
What Next?
Note, this does not build the parse tree, it just identifies correct sentences
To build a tree:
– Each terminal returns success and a tree with the terminal as a single node
– Each non-terminal network returns a tree whose root is the nonterminal symbol and whose children are the trees returned along the arcs taken
Add the tree to the steps for "bite the dog"
18
Go to vp network
Go to v network
Try likes. fail
Try bites. fail
Try bite. succeed {return verb → bite}
Go to np network
Go to art network
Try a. fail
Try the. succeed {return article → the}
Go to n network
Try man. fail
Try dog. succeed {return noun → dog}
Succeed (np)
Succeed (vp) {return
  vp
    v: bite
    np
      art: the
      n: dog}
[Figure: sentence network and verb phrase network, as on the previous slides]
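The tree-building trace above can be sketched in Python as follows. Each network is written as a list of paths from Sinitial to Sfinal, and two mutually recursive functions walk them. This is a simplified illustration, not the pseudo-code from the text: the names `NETWORKS`, `parse`, `transition` and `parse_sentence` are assumptions, "bite" is added to the verb network as in the trace, and the parser commits to the first path that succeeds in each network rather than backtracking into it later.

```python
NETWORKS = {
    "sentence": [["np", "vp"], ["vp"]],    # second path is the augmentation
    "np": [["art", "n"], ["n"]],
    "vp": [["v", "np"], ["v"]],            # longer path tried first
    "art": [["a"], ["the"]],
    "n": [["man"], ["dog"]],
    "v": [["likes"], ["bites"], ["bite"]],
}

def parse(symbol, words, pos):
    """Traverse the network for `symbol` starting at words[pos].
    Return (tree, new_pos) on success, None on failure."""
    if symbol not in NETWORKS:             # terminal arc: compare with the input word
        if pos < len(words) and words[pos] == symbol:
            return symbol, pos + 1         # tree is the terminal as a single node
        return None
    for path in NETWORKS[symbol]:
        result = transition(path, words, pos)
        if result is not None:
            subtrees, new_pos = result
            return (symbol, *subtrees), new_pos   # root is the nonterminal
    return None

def transition(path, words, pos):
    """Follow the arcs of one path in order, collecting the tree from each arc."""
    subtrees = []
    for label in path:
        result = parse(label, words, pos)
        if result is None:
            return None
        tree, pos = result
        subtrees.append(tree)
    return subtrees, pos

def parse_sentence(text):
    words = text.lower().split()
    result = parse("sentence", words, 0)
    if result is not None and result[1] == len(words):
        return result[0]
    return None

print(parse_sentence("bite the dog"))
print(parse_sentence("The man bites the dog"))
print(parse_sentence("dog the bite"))      # None: not a legal sentence
```

The nested tuples returned for "bite the dog" correspond to the vp tree in the trace, wrapped in a sentence node.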
19
Pseudo-code for a transition network parser
Defined using two mutually recursive functions, parse and transition
function parse(grammar_symbol)
continued…
20
21
Fig 14.5 Trace of a transition network parse of the sentence “Dog bites.”
[Figure labels: parse(sentence); transition(Noun_phrase); parse(Article): terminals don't match "Dog"; parse(Noun): terminal matches "Dog"]
Red corresponds to function calls
22
14.2.3 The Chomsky Hierarchy and Context Sensitive Languages
Chomsky hierarchy:
– ranks languages by increasing linguistic complexity
– we will be concerned with context-free and context-sensitive languages
Context-free:
– exactly one non-terminal symbol on the lhs of a rewrite rule
– problem: no requirement that dog is followed by bites, not bite
– i.e. no relation between dog and its appropriate verb, because the two can't both be on the lhs
Is a programming language (C++) context-free?
– cast expressions
– template syntax
23
Context-Sensitive Grammars
More than one symbol on lhs → a noun and verb can be related
singular and plural are part of the spec via "number"
Example:
sentence ↔ noun_phrase verb_phrase
noun_phrase ↔ article number noun
article singular ↔ a singular
article singular ↔ the singular
article plural ↔ the plural
singular noun ↔ man singular
singular verb_phrase ↔ singular verb
singular verb ↔ bites
Parse: The man bites:
sentence
noun_phrase verb_phrase
article singular noun verb_phrase
The singular noun verb_phrase
The man singular verb_phrase
The man singular verb
The man bites
24
Data-Driven Parse?
Example:
1. sentence ↔ noun_phrase verb_phrase
2. noun_phrase ↔ article number noun
3. article singular ↔ a singular
4. article singular ↔ the singular
5. article plural ↔ the plural
6. singular noun ↔ man singular
7. singular verb_phrase ↔ singular verb
8. singular verb ↔ bites
The man bites:
Rule 8 matches bites
The man singular verb
Rule 7
The man singular verb_phrase
Rule 6
The singular noun verb_phrase
Rule 4
article singular noun verb_phrase
Rule 2 (with number = singular)
noun_phrase verb_phrase
Rule 1
sentence
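The data-driven parse can be sketched as a search over reductions: wherever the right-hand side of a rule appears in the current string, replace it with the left-hand side, and stop when only `sentence` remains. The code below is an illustration under one explicit assumption: the slide's rules never say how `number` relates to `singular` and `plural`, so two extra rules (`NUMBER_RULES`) are added to let `number` stand for either. All names are invented for the example.

```python
# Rules as (lhs, rhs) pairs of token tuples; numbering follows the slide.
RULES = [
    (("sentence",), ("noun_phrase", "verb_phrase")),        # 1
    (("noun_phrase",), ("article", "number", "noun")),      # 2
    (("article", "singular"), ("a", "singular")),           # 3
    (("article", "singular"), ("the", "singular")),         # 4
    (("article", "plural"), ("the", "plural")),             # 5
    (("singular", "noun"), ("man", "singular")),            # 6
    (("singular", "verb_phrase"), ("singular", "verb")),    # 7
    (("singular", "verb"), ("bites",)),                     # 8
]
# Assumption (not on the slide): "number" rewrites to "singular" or "plural".
NUMBER_RULES = [(("number",), ("singular",)), (("number",), ("plural",))]

def reduce_to_sentence(tokens, seen=None):
    """Depth-first search for a chain of reductions ending in ('sentence',).
    Returns the list of intermediate strings, or None if there is none."""
    if seen is None:
        seen = set()
    if tokens == ("sentence",):
        return [tokens]
    if tokens in seen:                      # already explored, known dead end
        return None
    seen.add(tokens)
    for lhs, rhs in RULES + NUMBER_RULES:
        n = len(rhs)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == rhs:      # rhs found: replace it by lhs
                reduced = tokens[:i] + lhs + tokens[i + n:]
                rest = reduce_to_sentence(reduced, seen)
                if rest is not None:
                    return [tokens] + rest
    return None

steps = reduce_to_sentence(tuple("the man bites".split()))
for step in steps:
    print(" ".join(step))
print(reduce_to_sentence(tuple("man bites".split())))   # None: no article
```

The search tries some reductions that lead nowhere (for example applying rule 6 before rule 7) and has to back out of them, which hints at why more rules make context-sensitive parsing expensive.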
25
Problems with Context-Sensitive
More rules
Obscured phrase structure; semantics gets mixed in with syntax
Still no semantic representation
Next step: ATN parsers
Terminals and non-terminals represented as identifiers (frames) with attached features (slots)
Procedures attached to arcs of the network
– executed when ATN traverses an arc
– values assigned to grammatical features
– tests performed and transition can fail, e.g. if no number agreement
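A minimal sketch of the idea, assuming a made-up dictionary in which each word is a frame with `pos`, `root` and `number` slots: each network becomes a function, and the test attached to an arc is an intersection of number features that makes the transition fail when it is empty. The entries and function names are illustrative and are not copied from Fig 14.7 or Fig 14.8.

```python
# Hypothetical dictionary: each word is a frame with pos, root and number slots.
DICTIONARY = {
    "a":     {"pos": "article", "root": "a",    "number": {"singular"}},
    "the":   {"pos": "article", "root": "the",  "number": {"singular", "plural"}},
    "dog":   {"pos": "noun",    "root": "dog",  "number": {"singular"}},
    "dogs":  {"pos": "noun",    "root": "dog",  "number": {"plural"}},
    "man":   {"pos": "noun",    "root": "man",  "number": {"singular"}},
    "men":   {"pos": "noun",    "root": "man",  "number": {"plural"}},
    "likes": {"pos": "verb",    "root": "like", "number": {"singular"}},
    "like":  {"pos": "verb",    "root": "like", "number": {"plural"}},
    "bites": {"pos": "verb",    "root": "bite", "number": {"singular"}},
    "bite":  {"pos": "verb",    "root": "bite", "number": {"plural"}},
}

def word(words, pos, part_of_speech):
    """Terminal arc: return (dictionary frame, next position) if the next word fits."""
    if pos < len(words):
        entry = DICTIONARY.get(words[pos])
        if entry is not None and entry["pos"] == part_of_speech:
            return entry, pos + 1
    return None

def noun_phrase(words, pos):
    """article noun (with an agreement test on the arc), or a bare noun."""
    art = word(words, pos, "article")
    if art is not None:
        noun = word(words, art[1], "noun")
        if noun is None:
            return None
        number = art[0]["number"] & noun[0]["number"]    # test attached to the arc
        if not number:
            return None                                  # no number agreement
        return {"determiner": art[0]["root"], "noun": noun[0]["root"],
                "number": number}, noun[1]
    noun = word(words, pos, "noun")
    if noun is None:
        return None
    return {"determiner": None, "noun": noun[0]["root"],
            "number": noun[0]["number"]}, noun[1]

def verb_phrase(words, pos):
    """verb, optionally followed by an object noun phrase."""
    verb = word(words, pos, "verb")
    if verb is None:
        return None
    frame = {"verb": verb[0]["root"], "number": verb[0]["number"], "object": None}
    obj = noun_phrase(words, verb[1])
    if obj is None:
        return frame, verb[1]
    frame["object"] = obj[0]
    return frame, obj[1]

def sentence(text):
    """Return a nested frame for the sentence, or None if parsing or agreement fails."""
    words = text.lower().split()
    subject = noun_phrase(words, 0)
    if subject is None:
        return None
    predicate = verb_phrase(words, subject[1])
    if predicate is None or predicate[1] != len(words):
        return None
    if not (subject[0]["number"] & predicate[0]["number"]):   # subject-verb agreement
        return None
    return {"subject": subject[0], "verb_phrase": predicate[0]}

print(sentence("The dog likes a man"))
print(sentence("The dogs likes a man"))   # None: subject and verb disagree
```

Because the features live in slots rather than in the rules, adding a plural noun means adding one dictionary entry, not a new family of rewrite rules.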
26
Fig 14.7 Dictionary entries for a simple ATN
27
Fig 14.8 An ATN grammar that checks number agreement and builds a parse tree.
[Figure annotations on the NOUN-PHRASE procedure: "checking for agreement" and "typo"]
28
Fig 14.8 continued from previous slide.
29
Fig 14.8 continued from previous slide.
30
Fig 14.9 Parse tree for “The dog likes a man”
31
Combining Syntax and Semantics
Build a conceptual graph from the parse tree
e.g. representation for sentence:
– get representation for subject from the noun phrase
– get representation for verb phrase
– bind subject to agent of the graph for the verb phrase
When you reach a terminal, retrieve information from a knowledge base
– concepts, e.g. dog, man, as in a type hierarchy (next slide)
– conceptual relations as in next slide
32
Knowledge Base
Type hierarchy:
Frames for likes and bites
33
34
Parse tree → Semantic Representation
1. call sentence
2. sentence calls noun_phrase
3. noun_phrase calls noun
4. noun returns concept for dog (1)
5. article is definite → bind a marker to dog (2)
6. sentence calls verb_phrase
7. verb_phrase calls verb which retrieves frame for like (3)
8. verb_phrase calls noun_phrase which calls noun to retrieve man (4)
9. article is indefinite → leave concept generic (7)
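The steps above can be sketched as a recursive walk over a parse tree for "The dog likes a man". Everything in the knowledge base here is a stand-in: the type hierarchy, the verb frames and their role names (`experiencer`, `agent`, `object`) are assumptions made for the example, and the parse tree is written by hand as nested tuples.

```python
# Hypothetical knowledge base. Type hierarchy: child -> parent.
TYPE_HIERARCHY = {"dog": "animate", "man": "animate", "animate": "entity"}

# Hypothetical verb frames: role name -> type the filler must have.
VERB_FRAMES = {
    "like": {"experiencer": "animate", "object": "entity"},
    "bite": {"agent": "dog", "object": "entity"},
}

def is_a(concept, required):
    """Walk up the type hierarchy to check that concept is a kind of required."""
    while concept is not None:
        if concept == required:
            return True
        concept = TYPE_HIERARCHY.get(concept)
    return False

def noun_phrase(tree, markers):
    """('np', article, noun) -> concept node. A definite article binds an
    individual marker; an indefinite article leaves the concept generic."""
    _, article, noun = tree
    referent = None
    if article == "the":
        markers.append(noun)
        referent = f"#{len(markers)}"
    return {"type": noun, "referent": referent}

def verb_phrase(tree, markers):
    """('vp', verb_root, np_or_None) -> graph with the object role filled."""
    _, verb, obj = tree
    graph = {"relation": verb}
    if obj is not None:
        concept = noun_phrase(obj, markers)
        if not is_a(concept["type"], VERB_FRAMES[verb]["object"]):
            raise ValueError(f"{concept['type']} cannot be the object of {verb}")
        graph["object"] = concept
    return graph

def sentence(tree):
    """('sentence', np, vp) -> graph with the subject bound to the verb's first role."""
    _, np, vp = tree
    markers = []
    subject = noun_phrase(np, markers)
    graph = verb_phrase(vp, markers)
    frame = VERB_FRAMES[graph["relation"]]
    role = next(r for r in frame if r != "object")   # experiencer or agent
    if not is_a(subject["type"], frame[role]):
        raise ValueError(f"{subject['type']} cannot be the {role} of {graph['relation']}")
    graph[role] = subject
    return graph

tree = ("sentence", ("np", "the", "dog"), ("vp", "like", ("np", "a", "man")))
print(sentence(tree))
```

With these made-up frames, "The man bites the dog" is rejected because the agent of bite is required to be a dog, which shows how the knowledge base constrains the interpretation.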
35
14.5 Natural Language Applications
Story understanding and question answering
– goal: a program that can read a story and answer questions
– why useful?
What can we do so far?
– parse and interpret a sentence (perform network joins between semantic interpretation of the input and conceptual graphs in the knowledge base)
– can we expand this?
Yes
– answer questions
– scripts
– join semantic representations for multiple sentences
36
Answer Questions
Answer questions:
fido bit tony
What did fido bite tony with?
Scripts:
fido bit tony
tony has blood on his coat
A script might infer that the blood came from the bite.
37
Join Semantic Representations of Sentences
Given fido bit tony
fido has no teeth
What?
38
14.5.2 Database Front End
Information is structured
What is John Smith's salary?
select salary
from employee_salary
where employee = 'John Smith'
List the salaries of employees who work for Ed Angel
select salary
from employee_salary, manager_of_hire
where manager = 'Ed Angel' and
manager_of_hire.employee = employee_salary.employee
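To see the two queries run, the sketch below builds an in-memory SQLite database and maps the two English questions onto them with simple templates. The table contents are made-up rows for illustration only, and template matching with regular expressions is a much cruder stand-in for the parse-and-interpret approach described in these slides; the function names are invented for the example.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# Table and column names follow the slide; the rows are invented for illustration.
conn.executescript("""
    create table employee_salary (employee text, salary integer);
    create table manager_of_hire (employee text, manager text);
    insert into employee_salary values
        ('John Smith', 50000), ('Ann Lee', 60000), ('Bo Chan', 55000);
    insert into manager_of_hire values
        ('John Smith', 'Ed Angel'), ('Ann Lee', 'Ed Angel'), ('Bo Chan', 'Pat Doe');
""")

def salary_of(name):
    """First query from the slide, with the name passed as a parameter."""
    rows = conn.execute(
        "select salary from employee_salary where employee = ?", (name,)
    ).fetchall()
    return [row[0] for row in rows]

def salaries_under(manager):
    """Second query from the slide: join the two tables on employee."""
    rows = conn.execute(
        "select salary from employee_salary, manager_of_hire "
        "where manager = ? "
        "and manager_of_hire.employee = employee_salary.employee "
        "order by salary",
        (manager,),
    ).fetchall()
    return [row[0] for row in rows]

def answer(question):
    """Map a question onto a query using two fixed templates."""
    match = re.fullmatch(r"What is (.+)'s salary\?", question)
    if match:
        return salary_of(match.group(1))
    match = re.fullmatch(r"List the salaries of employees who work for (.+)", question)
    if match:
        return salaries_under(match.group(1))
    return None                      # question not understood

print(answer("What is John Smith's salary?"))
print(answer("List the salaries of employees who work for Ed Angel"))
```

Any rewording of the questions falls outside the templates, which is the gap a real natural language front end has to close.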
39
Entity-relationship diagrams
Knowledge base entry
40
Database query from natural language input "Who hired john smith?"
41
14.5.3 Information extraction from the Web
42
Fig 14.20 An architecture for information extraction, from Cardie (1997).
As on preceding slide