Approximate Factoring for A* Search
Aria Haghighi, John DeNero, and Dan Klein
Computer Science Division
University of California, Berkeley
Inference for NLP Tasks
A* Search
Inference as Search
[Figure: a partial hypothesis y extended by actions a1, a2, a3; the example hypothesis is a partial parse tree over S, NP, VP]
Bitext Parsing as Search
Example sentence pair: "translation is hard" / "la traducción es difícil"
Weighted Synchronous Grammar
Parsing: O(n^6)
Modified CKY over bi-spans (X[i,j], X’[i’,j’])
[Figure: a source tree (S, NP, VP) and a target tree, paired at their roots S and S’]
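The O(n^6) figure for bi-span parsing can be made concrete by counting. The sketch below is only an illustration, not the authors' parser: it enumerates every bi-span over a source sentence and a target sentence and counts the ways a binary synchronous rule could split each one. The function names `spans` and `bispan_work` are hypothetical, and the count is for a single nonterminal pair.

```python
def spans(n):
    """All spans (i, j) with 0 <= i < j <= n over a sentence of n words."""
    return [(i, j) for i in range(n) for j in range(i + 1, n + 1)]


def bispan_work(n_source, n_target):
    """Count bi-spans and binary split choices for one nonterminal pair."""
    num_bispans = 0
    num_splits = 0
    for i, j in spans(n_source):
        for k, m in spans(n_target):
            num_bispans += 1
            # A binary synchronous rule picks one interior split point on
            # each side. Monotone versus inverted order of the children
            # only changes the count by a constant factor, so it is ignored.
            num_splits += (j - i - 1) * (m - k - 1)
    return num_bispans, num_splits


if __name__ == "__main__":
    for n in (3, 6, 12):
        print(n, bispan_work(n, n))
```

The number of bi-spans grows as the square of a quadratic term (n^4), and the split count is the product of two cubic terms, which is where the n^6 comes from.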
A* Search
[Figure: a hypothesis y, annotated with its Score So Far and its Completion Score]
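A minimal sketch of the search loop this slide describes. It is written with costs (lower is better) rather than scores, which is an assumption made for simplicity: each hypothesis is ranked by its cost so far plus an estimate of its completion cost. The names `a_star`, `EDGES`, and `H` are illustrative and do not come from the talk.

```python
import heapq


def a_star(start, is_goal, successors, heuristic):
    """Return (cost, path) for a cheapest path to a goal, or None.

    The result is optimal when `heuristic` never overestimates the true
    completion cost (admissibility). States may be re-expanded if a
    cheaper route to them is found later.
    """
    counter = 0  # tie-breaker so the heap never compares states directly
    frontier = [(heuristic(start), counter, 0.0, start, [start])]
    best_cost = {start: 0.0}
    while frontier:
        _, _, cost, state, path = heapq.heappop(frontier)
        if cost > best_cost.get(state, float("inf")):
            continue  # stale entry: a cheaper route was already found
        if is_goal(state):
            return cost, path
        for nxt, step_cost in successors(state):
            new_cost = cost + step_cost
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                counter += 1
                priority = new_cost + heuristic(nxt)  # so far + completion
                heapq.heappush(
                    frontier, (priority, counter, new_cost, nxt, path + [nxt])
                )
    return None


# Toy graph: edge lists with step costs, and an admissible estimate H.
EDGES = {
    "A": [("B", 1.0), ("C", 4.0)],
    "B": [("C", 1.0), ("D", 5.0)],
    "C": [("D", 1.0)],
    "D": [],
}
H = {"A": 2.0, "B": 2.0, "C": 1.0, "D": 0.0}

result = a_star("A", lambda s: s == "D", lambda s: EDGES[s], lambda s: H[s])
```

On this toy graph the search returns the path A, B, C, D with total cost 3.0.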
A* Search
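Heuristic Design
Tight: small gap between the estimate and the true completion score
Admissible
Efficient to compute
[Cartoon labeled "A* Heuristic Man" with the caption "This way, hypothesis!" and the label "Optimal Result"]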
A* Example: Bitext Search
Viterbi Inside Score
Cost So Far
Bi-Span
A* Bitext Search
Viterbi Outside Score
Completion Score
Ideal Heuristic: O(n^6)
Projections π
[Figure: a bitext tree pair rooted at S and S’, with children NP, VP and NP’, VP’, shown next to its source projection and its target projection]
A* Bitext Search
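[Figure and equations not recovered: a "Suppose ..., then ..." statement illustrated with a bitext tree pair (S, NP, VP and S’, NP’, VP’) and its source and target projections]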
Projection Heuristic
O(n^3) (source projection), O(n^3) (target projection), O(n^6) (joint bitext parse)
Klein and Manning [2003]
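The idea of a projection heuristic can be sketched on a toy problem. The code below is not bitext parsing: two small graphs stand in for the source and target projections, a joint action pairs one source edge with one target edge, and joint costs are assumed to be the sum of the two projected costs (a fully factored model). The heuristic for a joint state is the sum of the exact completion costs in each projection, each computed cheaply on its own. All names (`SOURCE`, `TARGET`, `projection_heuristic`, and so on) are hypothetical.

```python
import heapq


def cost_to_goal(edges, goal):
    """Dijkstra over reversed edges: cheapest cost from each node to goal."""
    reverse = {}
    for u, outs in edges.items():
        for v, w in outs:
            reverse.setdefault(v, []).append((u, w))
    dist = {goal: 0.0}
    queue = [(0.0, goal)]
    while queue:
        d, v = heapq.heappop(queue)
        if d > dist.get(v, float("inf")):
            continue  # stale queue entry
        for u, w in reverse.get(v, []):
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                heapq.heappush(queue, (d + w, u))
    return dist


SOURCE = {"s0": [("s1", 1.0), ("s2", 4.0)], "s1": [("s2", 1.0)], "s2": []}
TARGET = {"t0": [("t1", 2.0), ("t2", 3.0)], "t1": [("t2", 2.0)], "t2": []}
GOAL = ("s2", "t2")

# Each projection is solved once, independently of the other.
H_SOURCE = cost_to_goal(SOURCE, "s2")
H_TARGET = cost_to_goal(TARGET, "t2")


def projection_heuristic(state):
    """Sum of the completion costs of the two projections of `state`."""
    s, t = state
    return H_SOURCE[s] + H_TARGET[t]


def joint_successors(state):
    """A joint action pairs one source edge with one target edge."""
    s, t = state
    return [((s2, t2), cs + ct) for s2, cs in SOURCE[s] for t2, ct in TARGET[t]]


def true_cost(state):
    """Exact joint completion cost (the toy graphs are acyclic)."""
    if state == GOAL:
        return 0.0
    options = [c + true_cost(nxt) for nxt, c in joint_successors(state)]
    return min(options, default=float("inf"))
```

For the start state `("s0", "t0")` the heuristic is 5.0 while the true joint completion cost is 6.0: the two projections are free to pick paths of different lengths, so their sum is a lower bound rather than an exact value.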
When models don’t factorize
Pointwise Admissibility
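[Figure: an action a with cost c(a) takes hypothesis x to hypothesis y; in the source projection it takes π_s(x) to π_s(y) with cost φ_s(a), and in the target projection it takes π_t(x) to π_t(y) with cost φ_t(a)]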
When models don’t factorize
Admissibility
[Figure: a hypothesis y with its projections π_s(y) and π_t(y)]
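The step from pointwise admissibility to admissibility can be written out as a short derivation. The inequalities themselves did not survive in these slides, so the following rests on assumptions: costs are minimized, the cost of a completion is the sum of its action costs, and pointwise admissibility means \phi_s(a) + \phi_t(a) \le c(a) for every action a. Here \pi_s, \pi_t are the projections, \phi_s, \phi_t the factored costs, h_s and h_t the cheapest completion costs in each projection, and \mathcal{A}^*(y) (a name introduced here) the actions in a cheapest joint completion of y.

```latex
\begin{align*}
h(y) &= h_s(\pi_s(y)) + h_t(\pi_t(y)) \\
     &\le \sum_{a \in \mathcal{A}^*(y)} \phi_s(a)
        + \sum_{a \in \mathcal{A}^*(y)} \phi_t(a)
        && \text{(projecting } \mathcal{A}^*(y) \text{ gives a valid completion in each projection)} \\
     &= \sum_{a \in \mathcal{A}^*(y)} \bigl(\phi_s(a) + \phi_t(a)\bigr) \\
     &\le \sum_{a \in \mathcal{A}^*(y)} c(a) = h^*(y)
        && \text{(pointwise admissibility)}
\end{align*}
```

Under these assumptions the factored heuristic h never exceeds the true completion cost h^*, which is the admissibility condition A* needs.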
Finding Factored Costs
Pointwise Gap
How to find φ_s and φ_t?
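A small sketch of how a candidate pair of factored costs could be checked. It assumes the pointwise gap of an action a is c(a) minus the sum of its two factored costs, and that pointwise admissibility means every gap is non-negative; the exact definitions are not recoverable from these slides. The rule strings, cost values, and function names below are made up for illustration.

```python
def pointwise_gaps(cost, phi_s, phi_t):
    """gap(a) = c(a) - phi_s(source part of a) - phi_t(target part of a).

    `cost` maps a joint action (source_rule, target_rule) to its cost;
    `phi_s` and `phi_t` map projected rules to their factored costs.
    """
    return {
        (src, tgt): c - phi_s[src] - phi_t[tgt]
        for (src, tgt), c in cost.items()
    }


def is_pointwise_admissible(gaps, tol=1e-9):
    """True when no factored cost pair overestimates its joint action."""
    return all(gap >= -tol for gap in gaps.values())


# Hypothetical joint rule costs and one candidate factoring.
COST = {
    ("S -> NP VP", "S' -> NP' VP'"): 3.0,
    ("S -> NP VP", "S' -> VP' NP'"): 4.0,
    ("NP -> N", "NP' -> N'"): 1.5,
}
PHI_S = {"S -> NP VP": 1.0, "NP -> N": 0.5}
PHI_T = {"S' -> NP' VP'": 2.0, "S' -> VP' NP'": 2.5, "NP' -> N'": 1.0}

GAPS = pointwise_gaps(COST, PHI_S, PHI_T)
```

In this example the gaps are 0.0, 0.5, and 0.0, so the candidate factoring is pointwise admissible and loose on only one rule pair.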
Finding Factored Costs
Small gaps
Pointwise Admissibility
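One way to combine the two requirements, small gaps and pointwise admissibility, is as a linear program over the factored costs. The slides do not show the objective, so the sum-of-gaps objective below is an assumption, as is the slack-variable name \gamma_a; only the two ingredients come from the slides.

```latex
\begin{align*}
\min_{\phi_s,\ \phi_t,\ \gamma} \quad & \sum_{a} \gamma_a
    && \text{(small gaps)} \\
\text{subject to} \quad
    & \gamma_a = c(a) - \phi_s(a) - \phi_t(a) && \forall a \\
    & \gamma_a \ge 0 && \forall a
    \quad \text{(pointwise admissibility)}
\end{align*}
```

A program of this shape has one constraint per joint action and one variable per projected action plus the slacks, so its size is tied to the grammar rather than to any particular sentence.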
Bitext Experiments
Synchronous Tree-to-Tree Transducer
Trained on 40k sentences of English-Spanish Europarl [Galley et al., 2004]
Rare words replaced with POS tags
Tested on 1,200 sentences, max length 5-15
Optimization Problem
Solved only once per grammar
206K Variables
160K Constraints
29 minutes
Bitext Experiments
[Results charts not recovered; these slides cite Zhang and Gildea (2006)]
Lexicalized Parsing
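[Figure: a lexicalized tree with nodes S-(is,VBZ), NP-(translation,NN), VP-(is,VBZ), shown with its projections: a lexical structure over (is,VBZ) and (translation,NN), and an unlexicalized tree over S, NP, VP]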
Klein and Manning [2003]
Lexicalized Parsing
Too many constraints to efficiently solve!
Over 64e13 possible lexicalized rules
Lexicalized Model Experiments
Standard Setup
Train on sections 2-21 of the treebank
Test on section 23 (length ≤ 40)
Models Tested
Factored Model [Klein and Manning, 2003]
Non-Factored Model
Lexicalized Parsing
Factored Model [Klein and Manning, 2003]
Lexicalized Parsing
Non-Factored Model
Conclusions
General technique for generating A* estimates
Can explicitly control the admissibility/tightness trade-off
Future Work: Explore different objectives and applications
Thanks
http://nlp.cs.berkeley.edu