Machine Learning Group
Department of Computer Sciences
University of Texas at Austin
Learning Semantic Parsers Using Statistical Syntactic Parsing Techniques
February 8, 2006
Ruifang Ge
Supervising Professor: Raymond J. Mooney
Semantic Parsing
• Semantic Parsing: maps a natural-language sentence to a complete, detailed, and formal meaning representation (MR) in a meaning representation language
• Applications
– Core component in practical spoken language systems:
• JUPITER (MIT weather 1-888-573-talk)
• MERCURY (MIT flight 1-877-MIT-talk)
– Advice taking (Kuhlmann et al., 2004)
CLang: RoboCup Coach Language
• In the RoboCup Coach competition, teams compete to coach simulated soccer players
• The coaching instructions are given in a formal language called CLang
[Figure: the coach sends instructions in CLang to players on a simulated soccer field]

Semantic parsing example:
NL: If our player 2 has the ball, our player 4 should stay in our half
CLang: ((bowner our {2}) (do our {4} (pos (half our))))
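Since CLang advice is written as nested s-expressions, a small reader makes the structure concrete. A minimal sketch in Python; parse_sexpr is an illustrative helper, not part of the RoboCup toolchain:

def parse_sexpr(text):
    """Read a CLang-style s-expression into nested Python lists."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    def read(pos):
        if tokens[pos] == "(":
            node, pos = [], pos + 1
            while tokens[pos] != ")":
                child, pos = read(pos)
                node.append(child)
            return node, pos + 1      # skip the closing ")"
        return tokens[pos], pos + 1   # atom, e.g. "bowner", "our", "{2}"
    return read(0)[0]

mr = parse_sexpr("((bowner our {2}) (do our {4} (pos (half our))))")
# mr == [['bowner', 'our', '{2}'], ['do', 'our', '{4}', ['pos', ['half', 'our']]]]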
Motivating Example
Semantic parsing is a compositional process. Sentence structures are needed for building meaning representations.
NL: If our player 2 has the ball, our player 4 should stay in our half
MR: ((bowner our {2}) (do our {4} (pos (half our))))
Roadmap
• Related work on semantic parsing
• SCISSOR
• Experimental results
• Proposed work
• Conclusions
Category I: Syntax-Based Approaches
• Meaning composition follows the tree structure of a syntactic parse
• The meaning of a constituent is composed from the meanings of its sub-constituents in a syntactic parse
– Composition is specified using syntactic relations and semantic constraints in the application domain
• Miller et al. (1996), Zettlemoyer & Collins (2005)
Category I: Example

(S-bowner(player(our,2))
  (NP-player(our,2) (PRP$-our our) (NN-player(_,_) player) (CD-2 2))
  (VP-bowner(_) (VB-bowner(_) has)
    (NP-null (DT-null the) (NN-null ball))))

player(team,unum) and bowner(player) require arguments; "null" marks semantically vacuous words that require no arguments.
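A minimal sketch of the composition the tree illustrates, in Python: each predicate is a template whose "_" slots are filled, left to right, by the MRs of its arguments (the string-template encoding is illustrative):

def compose(template, args):
    """Fill the template's "_" slots, left to right, with argument MRs."""
    for arg in args:
        template = template.replace("_", arg, 1)
    return template

np_player = compose("player(_,_)", ["our", "2"])   # "player(our,2)"
s_bowner = compose("bowner(_)", [np_player])       # "bowner(player(our,2))"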
Category II: Purely Semantic-Driven Approaches
• No syntactic information is used in building tree structures
• Non-terminals in this category correspond to semantic concepts in application domains
• Tang & Mooney (2001), Kate (2005), Wong (2005)
Category III: Hybrid Approaches
• Utilizing syntactic information in semantic parsing approaches driven by semantics
– Syntactic phrase boundaries
– Syntactic category of semantic concepts
– Word dependencies
• Kate, Wong & Mooney (2005)
Our Approach
• We introduce an approach falling into category I: a syntax-driven approach
• Reason
– Employ state-of-the-art statistical syntactic parsing techniques to help build tree structures for meaning composition
– State-of-the-art statistical parsers are becoming increasingly robust and accurate [Collins (1997), Charniak & Johnson (2005)]
Roadmap
• Related work on semantic parsing
• SCISSOR
• Experimental results
• Proposed work
• Conclusions
SCISSOR: Semantic Composition that Integrates Syntax and Semantics to get Optimal Representations
SCISSOR
• An integrated syntax-based approach
– Allows both syntax and semantics to be used simultaneously to build meaning representations
• A statistical parser is used to generate a semantically augmented parse tree (SAPT)
• A SAPT is translated into a complete formal meaning representation (MR) using a meaning composition process
MR: bowner(player(our,2))

(S-bowner
  (NP-player (PRP$-team our) (NN-player player) (CD-unum 2))
  (VP-bowner (VB-bowner has)
    (NP-null (DT-null the) (NN-null ball))))
• Allows statistical modeling of semantic selectional constraints in the application domain
– e.g., (AGENT pass) = PLAYER
Overview of SCISSOR
TRAINING: SAPT training examples → learner → integrated semantic parser
TESTING: NL sentence → integrated semantic parser → SAPT → ComposeMR → MR
Extending Collins’ (1997) Syntactic Parsing Model
• Collins (1997) introduced a lexicalized, head-driven syntactic parsing model
• Bikel (2004) provides an easily extended open-source version of the Collins statistical parser
• We extend the parsing model to generate semantic labels simultaneously with syntactic labels, constrained by semantic constraints in the application domain
Example: Probabilistic Context Free Grammar (PCFG)
(S (NP (PRP$ our) (NN player) (CD 2))
   (VP (VB has) (NP (DT the) (NN ball))))

Rules:
S → NP VP          0.4
NP → PRP$ NN CD    0.06
VP → VB NP         0.3
PRP$ → our         0.01
NN → player        0.001
CD → 2             0.0001
VB → has           0.02
NN → ball          0.01
DT → the           0.1

P(Tree, S) = 0.4 × 0.06 × 0.3 × … × 0.01

Rule probabilities are independent of the words involved.
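A minimal sketch of this computation: multiply the probability of every rule used in the tree (the rule table below is the one on the slide; the NP → DT NN probability is elided there, so the example scores only the subject NP):

rules = {
    ("S", ("NP", "VP")): 0.4,
    ("NP", ("PRP$", "NN", "CD")): 0.06,
    ("VP", ("VB", "NP")): 0.3,
    ("PRP$", ("our",)): 0.01,
    ("NN", ("player",)): 0.001,
    ("CD", ("2",)): 0.0001,
    ("VB", ("has",)): 0.02,
    ("NN", ("ball",)): 0.01,
    ("DT", ("the",)): 0.1,
}

def tree_prob(tree):
    """P(tree) = product of P(rule) over every expansion in the tree."""
    label, children = tree
    if isinstance(children, str):                       # preterminal -> word
        return rules[(label, (children,))]
    p = rules[(label, tuple(c[0] for c in children))]   # the expansion itself
    for child in children:
        p *= tree_prob(child)
    return p

np = ("NP", [("PRP$", "our"), ("NN", "player"), ("CD", "2")])
print(tree_prob(np))   # 0.06 * 0.01 * 0.001 * 0.0001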
Example: Lexicalized PCFG
Unlexicalized tree:
(S (NP (PRP$ our) (NN player) (CD 2))
   (VP (VB has) (NP (DT the) (NN ball))))

Lexicalized tree, with head words in parentheses:
(S(has) (NP(player) (PRP$ our) (NN player) (CD 2))
        (VP(has) (VB has) (NP(ball) (DT the) (NN ball))))

Each non-terminal is annotated with the head word percolated up from the head child of its rule.
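A minimal sketch of the head-word percolation behind lexicalization, assuming a toy table of head children (real head-finding rules are considerably more detailed):

HEAD_CHILD = {"S": "VP", "VP": "VB", "NP": "NN"}  # toy head rules

def lexicalize(tree):
    """Return (label, children, head_word), percolating heads upward."""
    label, children = tree
    if isinstance(children, str):          # preterminal: the word is the head
        return (label, children, children)
    kids = [lexicalize(c) for c in children]
    head = next(k[2] for k in kids if k[0] == HEAD_CHILD[label])
    return (label, kids, head)

tree = ("S", [("NP", [("PRP$", "our"), ("NN", "player"), ("CD", "2")]),
              ("VP", [("VB", "has"),
                      ("NP", [("DT", "the"), ("NN", "ball")])])])
# lexicalize(tree)[2] == "has"; the two NPs get heads "player" and "ball".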
Example: Estimating Rule Probability
P(NP(player) VP(has) | S(has))
  = P(VP(has) | S(has)) × P(NP(player) | S(has), VP(has))

The expansion of a non-terminal is decomposed into primitive steps.

In Collins' model, syntactic subcategorization frames are used to constrain the generation of modifiers; e.g., "has" requires an NP as its subject.
Integrating Semantics into the Model
(S-bowner(has)
  (NP-player(player) (PRP$-team our) (NN-player player) (CD-unum 2))
  (VP-bowner(has) (VB-bowner has)
    (NP-null(ball) (DT-null the) (NN-null ball))))
Non-terminals now have both syntactic and semantic labels
Estimating Rule Probability Including Semantic Labels

Expansion of S-bowner(has) into NP-player(player) VP-bowner(has), decomposed into primitive steps:

Step 1: generate the head:
P_h(VP-bowner | S-bowner, has)

Step 2: generate the left and right subcategorization frames:
P_lc({NP}-{player} | S-bowner, VP-bowner, has) ×
P_rc({}-{} | S-bowner, VP-bowner, has)

{NP}: syntactic constraint to the left; {player}: semantic constraint to the left.
("has" also requires an NP as its object, but that NP is generated within the VP.)

Step 3: generate the modifier, conditioned on the subcat frames:
P_d(NP-player(player) | S-bowner, VP-bowner, has, LEFT, {NP}-{player})

Once NP-player(player) has been generated, both subcat frames are empty: {}-{} and {}-{}.

Full product:
P_h(VP-bowner | S-bowner, has) ×
P_lc({NP}-{player} | S-bowner, VP-bowner, has) ×
P_rc({}-{} | S-bowner, VP-bowner, has) ×
P_d(NP-player(player) | S-bowner, VP-bowner, has, LEFT, {NP}-{player})
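A sketch of how the product is assembled, with toy lookup tables standing in for the model's smoothed estimates of P_h, P_lc, P_rc and P_d (the numbers are made up for illustration):

# Toy estimates; a real model smooths these from SAPT counts.
P_h  = {("VP-bowner", ("S-bowner", "has")): 0.5}
P_lc = {("{NP}-{player}", ("S-bowner", "VP-bowner", "has")): 0.6}
P_rc = {("{}-{}", ("S-bowner", "VP-bowner", "has")): 0.9}
P_d  = {("NP-player(player)",
         ("S-bowner", "VP-bowner", "has", "LEFT", "{NP}-{player}")): 0.7}

rule_prob = (P_h[("VP-bowner", ("S-bowner", "has"))]
             * P_lc[("{NP}-{player}", ("S-bowner", "VP-bowner", "has"))]
             * P_rc[("{}-{}", ("S-bowner", "VP-bowner", "has"))]
             * P_d[("NP-player(player)",
                    ("S-bowner", "VP-bowner", "has", "LEFT", "{NP}-{player}"))])
# rule_prob == 0.5 * 0.6 * 0.9 * 0.7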
Parser Implementation
• Supervised training on annotated SAPTs is simply frequency counting
• An augmented smoothing technique is employed to account for the additional data sparsity created by semantic labels
• Test sentences are parsed for the most probable SAPT using a variant of the standard CKY chart-parsing algorithm (a plain probabilistic CKY sketch follows)
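The actual parser is a lexicalized chart parser; this minimal probabilistic CKY sketch for a binarized grammar only conveys the chart idea (textbook CKY, not Bikel's implementation):

from collections import defaultdict

def cky(words, lexicon, binary_rules):
    """chart[(i, j)][label] = best probability of deriving words[i:j].
    lexicon: {(label, word): prob}; binary_rules: {(parent, left, right): prob}."""
    n = len(words)
    chart = defaultdict(dict)
    for i, w in enumerate(words):                       # fill width-1 spans
        for (label, word), p in lexicon.items():
            if word == w:
                chart[(i, i + 1)][label] = p
    for width in range(2, n + 1):                       # grow wider spans
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):                   # split point
                for (parent, left, right), p in binary_rules.items():
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        q = p * chart[(i, k)][left] * chart[(k, j)][right]
                        if q > chart[(i, j)].get(parent, 0.0):
                            chart[(i, j)][parent] = q
    return chart[(0, n)]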
Roadmap
• Related work on semantic parsing
• SCISSOR
• Experimental results
• Proposed work
• Conclusions
Experimental Results: Experimental Corpora
• CLang
– 300 randomly selected rules from the log files of the 2003 RoboCup Coach Competition
– Coaching advice annotated with NL sentences by 4 annotators independently
– 22.52 words per sentence
• GeoQuery [Zelle & Mooney, 1996]
– 250 queries for a U.S. geography database
– 6.87 words per sentence
Experimental Methodology
• Evaluated using standard 10-fold cross-validation
• Correctness
– CLang: output exactly matches the correct representation
– GeoQuery: query retrieves the correct answer
Experimental Methodology
• Metrics

Precision = |Correct Completed Parses| / |Completed Parses|

Recall = |Correct Completed Parses| / |Sentences|

F-measure = 2 × Precision × Recall / (Precision + Recall)
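In code, with counts as inputs (a trivial but unambiguous restatement of the formulas):

def prf(correct, completed, sentences):
    """Precision, recall and F-measure as defined above."""
    precision = correct / completed
    recall = correct / sentences
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f

# e.g. 80 correct parses, 90 completed parses, 100 sentences:
# prf(80, 90, 100) -> (0.889, 0.8, 0.842) approximately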
Compared Systems
• COCKTAIL (Tang & Mooney, 2001)
– A purely semantic-driven approach that learns a deterministic shift-reduce parser using inductive logic programming techniques
• WASP (Wong, 2005)
– A purely semantic-driven approach using machine translation techniques
• KRISP (Kate, 2005)
– A purely semantic-driven approach based on string kernels

The above systems all learn from sentences paired with meaning representations; SCISSOR needs extra annotation (SAPTs).
Results on Sentences within Different Length Ranges
• How does sentence complexity affect parsing performance?
• Sentence complexity is difficult to measure
• Use sentence length as an indicator
Sentence Length Distribution (CLang)
Sentence length:      10   20   30   40   50+
Number of sentences:  22   98  137   38    5
Detailed CLang Results on Sentence Length
Syntactic structure is needed for longer sentences, where semantic constraints alone cannot sufficiently eliminate ambiguities.
Zettlemoyer & Collins (2005)
• Introduces a syntax-based semantic parser based on combinatory categorial grammar (CCG) (Steedman, 2000)
• Requires a set of hand-built rules specifying the possible syntactic categories for each type of semantic concept
Zettlemoyer & Collins (2005)
• Provides results on a larger GeoQuery dataset (880 examples)
– Using a different experimental setup
– Prec/Recall: 96.25/79.29 (SCISSOR Prec/Recall: 92.08/72.27)
• Performance on more complex domains such as CLang is unclear
– Would require designing another set of hand-built template rules
Roadmap
• Related work on semantic parsing
• SCISSOR
• Experimental results
• Proposed work
– Discriminative Reranking for Semantic Parsing
– Automating the SAPT-Generation
– Other issues
• Conclusions
Reranking for Semantic Parsing
Input sentence → SCISSOR (local features) → current ranked SAPTs: S1, S2, S3, S4
→ Reranker (global features) → SAPTs after reranking: S3, S1, S2, S4
Reranking has been successfully used in parsing, tagging, machine translation, …
Reranking Features
• Collins (2000) introduces syntactic features for reranking syntactic parses
– One-level rules: f(NP → PRP$ NN CD) = 1
– Bigrams, two-level rules, …
• To rerank SAPTs, we can introduce a semantic feature type for each syntactic feature type
– Based on the coupling of syntax and semantics
– Example: one-level rules, f(PLAYER → TEAM PLAYER UNUM) = 1, from the subtree (NP-PLAYER (PRP$-TEAM …) (NN-PLAYER …) (CD-UNUM …))
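A minimal sketch of extracting one-level rule features from a (S)APT, with trees as (label, children) pairs (the encoding is illustrative):

def one_level_rules(tree, feats=None):
    """Indicator features: one per (parent, child labels) rule in the tree."""
    if feats is None:
        feats = {}
    label, children = tree
    if isinstance(children, list):                # internal node
        feats[(label, tuple(c[0] for c in children))] = 1
        for child in children:
            one_level_rules(child, feats)
    return feats

sapt = ("PLAYER", [("TEAM", "our"), ("PLAYER", "player"), ("UNUM", "2")])
# one_level_rules(sapt) == {("PLAYER", ("TEAM", "PLAYER", "UNUM")): 1}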
Reranking Evaluation
• Rerank the top 50 parses generated by SCISSOR
• Reranking algorithm: averaged perceptron (Collins, 2002)
– Simple, fast and effective

CLang results:
                 P       R       F
SCISSOR          86.94   78.19   82.33
Oracle score     -       85.58   -
sem              89.55   80.54   84.81 (14.0)
syn              87.31   78.52   82.68
sem+syn          88.81   79.87   84.10

sem: significantly better than SCISSOR.
• Reranking does not improve the results on GeoQuery
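A minimal averaged-perceptron reranker in the spirit of Collins (2002), over sparse feature dictionaries; the training-loop details are illustrative:

from collections import defaultdict

def train_reranker(examples, epochs=10):
    """examples: list of (candidate_feature_dicts, index_of_gold_candidate).
    Returns averaged feature weights."""
    w, total, updates = defaultdict(float), defaultdict(float), 0
    for _ in range(epochs):
        for candidates, gold in examples:
            scores = [sum(w[f] * v for f, v in c.items()) for c in candidates]
            pred = max(range(len(candidates)), key=scores.__getitem__)
            if pred != gold:                       # perceptron update
                for f, v in candidates[gold].items():
                    w[f] += v
                for f, v in candidates[pred].items():
                    w[f] -= v
            updates += 1
            for f in list(w):                      # accumulate for averaging
                total[f] += w[f]
    return {f: total[f] / updates for f in total}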
Further Investigation of Reranking Features
• Semantic Role Labeling (SRL) features
– Identify the semantic relations, or semantic roles, of a target word in a given sentence
– [giver John] gave [entity-given-to Mary] [thing-given a pen]
Roadmap
• Related work on semantic parsing
• SCISSOR
• Experimental results
• Proposed work
– Discriminative Reranking for Semantic Parsing
– Automating the SAPT-Generation
– Other issues
• Conclusions
Automating the SAPT-Generation

Training set {(NL, MR)} → syntactic parser → training set {(NL, MR, SynT)} → discriminative learner → training set {(NL, MR, SAPT)} → SCISSOR

NL: natural language sentence; MR: meaning representation; SynT: syntactic parse tree; SAPT: semantically augmented parse tree

Correct SAPTs are not available; only MRs are available.
Step 1: Obtaining Automatic Syntactic Parses
• Automatically generated syntactic parses have been used successfully in many NLP tasks
• High-performance parsers: Collins (1997), Charniak (2000), Hockenmaier & Steedman (2000)
• Charniak & Johnson (2005) report the highest F-measure on parsing the Penn Treebank: 91.02%
Syntactic F-measure Learning Curve for CLang
[Figure: learning curve; annotations: "statistics inherent in application", "reduce generalization error"]
Step 2: Discriminating Good SAPTs from Bad SAPTs
• Generating candidate SAPTs given a syntactic parse tree
– Initialize each word with its candidate semantic labels using co-occurrence measures, word alignment systems, or dictionary learning methods
– Recursively label each non-terminal with a semantic label passed up from one of its children, using a function of compositional semantics (see the sketch below)
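A sketch of the label-passing step under simple assumptions: leaves carry candidate semantic labels, and every internal node takes the label of one of its children, enumerating the alternatives (a real system would also check arity and selectional constraints):

from itertools import product

def candidate_sapts(tree):
    """Yield (label, sem, children) candidates for every way of passing one
    child's semantic label up to each non-terminal."""
    label, children = tree
    if isinstance(children, tuple):                 # leaf: (word, sem labels)
        word, sems = children
        for sem in sems:
            yield (label, sem, word)
        return
    for kids in product(*[list(candidate_sapts(c)) for c in children]):
        child_sems = {k[1] for k in kids} - {"null"}
        for sem in (child_sems or {"null"}):
            yield (label, sem, list(kids))

np = ("NP", [("PRP$", ("our", ["team"])),
             ("NN", ("player", ["player", "null"]))])
# candidates include NP-team and NP-player variants of the subtree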
Step 2: Discriminating Good SAPTs from Bad SAPTs (Cont.)
• Discriminative features: semantic labels of words, predicate-argument pairs, …
• Maximum Entropy (ME) models can be used for learning with incomplete data (Riezler et al., 2002)
– Acquire the empirical statistics required for training a ME model from the SAPTs that lead to correct MRs
• The training process is still integrated, because syntactic parse trees that cannot lead to correct MRs are rejected; the parser can then provide an alternative syntactic parse tree.
Roadmap
• Related work on semantic parsing
• SCISSOR
• Experimental results
• Proposed work
– Discriminative Reranking for Semantic Parsing
– Automating the SAPT-Generation
– Other issues
• Conclusions
Future Work: Other Issues
• Apply to other application domains
– Air Travel Information Service (ATIS) data [Price, 1990]
• Investigate parsers in the CCG formalism (Hockenmaier & Steedman, 2002; Clark & Curran, 2004)
– Elegant treatment of a variety of linguistic phenomena
• Compare WASP, KRISP and SCISSOR trained with the same amount of supervision
– Sentences annotated with tree structures
– Sentences only paired with MRs
Conclusions
• Introduced SCISSOR for semantic parsing
• Evaluated it on two real-world corpora
• Produced more accurate semantic representations than other approaches, especially on long sentences
• Future work:
– Discriminative reranking for semantic parsing
– Automating the SAPT-generation
– Other issues