Date post: 14-Dec-2015
Progress update Lin Ziheng
Progress update

Lin Ziheng


System overview


Components – Connective classifier

• Features from Pitler and Nenkova (2009):
  – Connective: because
  – Self category: IN
  – Parent category: SBAR
  – Left sibling category: none
  – Right sibling category: S
  – Right sibling contains a VP: yes
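
A minimal sketch of how the syntactic features above could be read off a constituent parse. The tree encoding (nested `(label, children)` tuples with `(POS, word)` leaves), the function names and the example sentence are assumptions made for illustration; only single-word connectives are handled.

```python
def contains_label(node, target):
    """Return True if `node` or any node below it carries the label `target`."""
    label, children = node
    if label == target:
        return True
    if isinstance(children, str):  # leaf: (POS tag, word)
        return False
    return any(contains_label(child, target) for child in children)


def path_to_word(node, word, path=()):
    """Return ((parent, child_index), ...) from the root down to the leaf for `word`."""
    _, children = node
    if isinstance(children, str):
        return path if children == word else None
    for i, child in enumerate(children):
        found = path_to_word(child, word, path + ((node, i),))
        if found is not None:
            return found
    return None


def connective_features(tree, connective):
    """Syntactic features for a single-word connective (hypothetical helper)."""
    path = path_to_word(tree, connective)
    if not path:
        raise ValueError(f"connective {connective!r} not found below the root")
    parent, idx = path[-1]
    siblings = parent[1]
    right = siblings[idx + 1] if idx + 1 < len(siblings) else None
    return {
        "connective": connective,
        "self_category": siblings[idx][0],
        "parent_category": parent[0],
        "left_sibling": siblings[idx - 1][0] if idx > 0 else "none",
        "right_sibling": right[0] if right is not None else "none",
        "right_sibling_contains_VP": right is not None and contains_label(right, "VP"),
    }


# Invented example: "He left because he was tired"
tree = ("S", [
    ("NP", [("PRP", "He")]),
    ("VP", [
        ("VBD", "left"),
        ("SBAR", [
            ("IN", "because"),
            ("S", [
                ("NP", [("PRP", "he")]),
                ("VP", [("VBD", "was"), ("ADJP", [("JJ", "tired")])]),
            ]),
        ]),
    ]),
])
features = connective_features(tree, "because")
```

On this toy tree the extracted values agree with the ones listed on the slide (IN, SBAR, none, S, yes).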


Components – Connective classifier

• New features:
  – Conn POS
  – Prev word + conn: even though, particularly since
  – Prev word POS
  – Prev word POS + conn POS
  – Conn + next word
  – Next word POS
  – Conn POS + next word POS
  – All lemmatized verbs in the sentence containing conn
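
The new lexical features can be sketched as a function over a tokenized, POS-tagged sentence. Everything here is an illustrative assumption: the function name, the boundary symbols `<s>` and `</s>`, the lowercasing, the precomputed `lemmas` list, and treating any tag starting with `VB` as a verb.

```python
def context_features(tokens, pos_tags, lemmas, i):
    """Context features for a single-word connective at token index i."""
    n = len(tokens)
    conn, conn_pos = tokens[i].lower(), pos_tags[i]
    prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
    prev_pos = pos_tags[i - 1] if i > 0 else "<s>"
    next_word = tokens[i + 1].lower() if i + 1 < n else "</s>"
    next_pos = pos_tags[i + 1] if i + 1 < n else "</s>"
    # Lemmas of all verbs in the sentence, deduplicated and sorted.
    verbs = sorted({lemmas[j] for j, tag in enumerate(pos_tags) if tag.startswith("VB")})
    return {
        "conn_pos": conn_pos,
        "prev+conn": f"{prev_word} {conn}",
        "prev_pos": prev_pos,
        "prev_pos+conn_pos": f"{prev_pos} {conn_pos}",
        "conn+next": f"{conn} {next_word}",
        "next_pos": next_pos,
        "conn_pos+next_pos": f"{conn_pos} {next_pos}",
        "verbs": tuple(verbs),
    }


# Invented example sentence with hand-assigned tags and lemmas.
tokens = ["He", "stayed", "even", "though", "it", "rained", "."]
pos_tags = ["PRP", "VBD", "RB", "IN", "PRP", "VBD", "."]
lemmas = ["he", "stay", "even", "though", "it", "rain", "."]
feats = context_features(tokens, pos_tags, lemmas, 3)
```

The "prev word + conn" feature yields `even though` here, which is the kind of pattern the slide gives as an example.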


Components – Argument labeler


Argument labeler – Argument position classifier

• Relative positions of Arg1:
  – Arg1 and Arg2 in the same sentence: SS (60.9%)
  – Arg1 in the immediately previous sentence: IPS (30.1%)
  – Arg1 in some non-adjacent previous sentence: NAPS (9.0%)
  – Arg1 in some following sentence: FS (0%, only 8 instances)

• FS ignored
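
One way to derive the gold position labels for training, sketched under the assumption that each annotated relation gives the sentence index of Arg1 and of the connective (function names are hypothetical):

```python
def arg1_position(arg1_sentence, conn_sentence):
    """Map the sentence indexes of Arg1 and the connective to a position class."""
    offset = conn_sentence - arg1_sentence
    if offset == 0:
        return "SS"    # same sentence
    if offset == 1:
        return "IPS"   # immediately previous sentence
    if offset > 1:
        return "NAPS"  # non-adjacent previous sentence
    return "FS"        # following sentence


def training_labels(index_pairs):
    """Label (arg1_sentence, conn_sentence) pairs and drop the FS instances."""
    labels = [arg1_position(a, c) for a, c in index_pairs]
    return [label for label in labels if label != "FS"]


labels = training_labels([(4, 4), (3, 4), (1, 4), (5, 4)])
```

Dropping FS leaves a three-way classification problem (SS, IPS, NAPS).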


Argument labeler – Argument position classifier

• Features:
  – Connective string
  – Conn POS
  – Conn position in the sentence: first, second, third, third last, second last, or last
  – Prev word
  – Prev word POS
  – Prev word + conn
  – Prev word POS + conn POS
  – Second prev word
  – Second prev word POS
  – Second prev word + conn
  – Second prev word POS + conn POS
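
A sketch of the position feature and the previous-word features. The slide does not say how connectives in the middle of a sentence are encoded or how short sentences are handled; the `"other"` bucket, the precedence of start buckets over end buckets, and the `<s>` padding symbol are assumptions.

```python
def conn_position(i, n):
    """Bucket token index i in a sentence of n tokens."""
    from_start = {0: "first", 1: "second", 2: "third"}
    from_end = {n - 1: "last", n - 2: "second_last", n - 3: "third_last"}
    if i in from_start:          # assumed: start buckets win in short sentences
        return from_start[i]
    return from_end.get(i, "other")


def position_features(tokens, pos_tags, i):
    """Argument position features for a single-word connective at index i."""
    def word(j):
        return tokens[j].lower() if j >= 0 else "<s>"

    def tag(j):
        return pos_tags[j] if j >= 0 else "<s>"

    conn, conn_pos = tokens[i].lower(), pos_tags[i]
    return {
        "conn": conn,
        "conn_pos": conn_pos,
        "conn_position": conn_position(i, len(tokens)),
        "prev": word(i - 1),
        "prev_pos": tag(i - 1),
        "prev+conn": f"{word(i - 1)} {conn}",
        "prev_pos+conn_pos": f"{tag(i - 1)} {conn_pos}",
        "prev2": word(i - 2),
        "prev2_pos": tag(i - 2),
        "prev2+conn": f"{word(i - 2)} {conn}",
        "prev2_pos+conn_pos": f"{tag(i - 2)} {conn_pos}",
    }


# Invented example sentence with hand-assigned tags.
tokens = ["Prices", "fell", "but", "demand", "stayed", "firm", "."]
pos_tags = ["NNS", "VBD", "CC", "NN", "VBD", "JJ", "."]
feats = position_features(tokens, pos_tags, 2)
```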


Argument labeler – Argument extractor

• SS cases: handcrafted a set of syntactically motivated rules to extract Arg1 and Arg2


Argument labeler – Argument extractor

• An example:


Argument labeler – Argument extractor

• IPS cases: label the sentence containing the connective as Arg2 and the immediately previous sentence as Arg1

• NAPS cases:
  – Arg1 is located in the second previous sentence in 45.8% of the NAPS cases
  – Use the majority decision and assume Arg1 is always in the second previous sentence
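
The IPS and NAPS heuristics amount to a fixed sentence offset. A minimal sketch, assuming the document is a list of sentence strings; taking the connective's sentence as Arg2 in the NAPS case as well is an assumption, since the slide states it only for IPS.

```python
def label_arguments(sentences, conn_index, position):
    """Pick whole sentences as Arg1/Arg2 for the IPS and NAPS classes."""
    offsets = {"IPS": 1, "NAPS": 2}  # NAPS: majority choice, second previous sentence
    if position not in offsets:
        raise ValueError("only IPS and NAPS are handled by this heuristic")
    arg1_index = conn_index - offsets[position]
    arg1 = sentences[arg1_index] if arg1_index >= 0 else None  # no such sentence
    return {"Arg1": arg1, "Arg2": sentences[conn_index]}


# Invented three-sentence document; the connective is in the last sentence.
doc = ["Sales rose.", "Costs were flat.", "However, profit fell."]
ips = label_arguments(doc, 2, "IPS")
naps = label_arguments(doc, 2, "NAPS")
```

SS cases are not covered here because they rely on the handcrafted syntactic rules.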


Components – Explicit classifier

• Prasad et al. (2008) reported human agreements of 94% on Level 1 classes and 84% on Level 2 types

• A baseline using only connectives as features gives 95.7% and 86% on Sec. 23
  – Difficult to improve accuracy on the test section

• 3 types of features:
  – Connective string
  – Conn POS
  – Conn + prev word
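
A connective-only baseline can be approximated by a lookup of the most frequent sense per connective. This is a sketch of one possible realization, not necessarily the learner behind the reported numbers; the function names and the toy training pairs are invented for illustration and are not drawn from the PDTB.

```python
from collections import Counter, defaultdict


def train_connective_baseline(examples):
    """Learn the most frequent sense for each connective string."""
    per_conn = defaultdict(Counter)
    overall = Counter()
    for conn, sense in examples:
        per_conn[conn.lower()][sense] += 1
        overall[sense] += 1
    table = {conn: counts.most_common(1)[0][0] for conn, counts in per_conn.items()}
    fallback = overall.most_common(1)[0][0]  # used for unseen connectives
    return table, fallback


def predict_sense(table, fallback, conn):
    return table.get(conn.lower(), fallback)


# Toy (connective, Level 1 class) pairs, invented for this sketch.
toy = [
    ("because", "Contingency"), ("because", "Contingency"),
    ("since", "Temporal"), ("since", "Temporal"), ("since", "Contingency"),
    ("but", "Comparison"),
]
table, fallback = train_connective_baseline(toy)
```

Ambiguous connectives such as "since" are where the extra features (conn POS, conn + prev word) have room to help over this lookup.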


Components – Non-explicit classifier

• Non-explicit: Implicit, AltLex, EntRel, NoRel
  – 11 Level 2 types for Implicit/AltLex, plus EntRel and NoRel: 13 types in total

• 4 feature sets from Lin et al. (2009):
  – Contextual features
  – Constituent parse features
  – Dependency parse features
  – Word-pair features

• 3 features to capture AltLex: Arg2_word1, Arg2_word2, Arg2_word3
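
The three AltLex features are the first three words of Arg2. A small sketch; the lowercasing and the `<none>` padding for arguments shorter than three tokens are assumptions.

```python
def altlex_features(arg2_tokens, n=3, pad="<none>"):
    """Return Arg2_word1..Arg2_wordn from the first n tokens of Arg2."""
    words = [tok.lower() for tok in arg2_tokens[:n]]
    words += [pad] * (n - len(words))  # pad short arguments
    return {f"Arg2_word{k}": w for k, w in enumerate(words, start=1)}


# Invented Arg2 token lists.
full = altlex_features(["That", "compares", "with", "a", "loss"])
short = altlex_features(["Why", "?"])
```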


Components – Attribution span labeler

• Two steps: split the text into clauses, and decide which clauses are attribution spans

• Rule-based clause splitter:
  – First split a sentence into clauses at punctuation marks
  – For each clause, split it further if one of the following production links is found: VP→SBAR, S→SINV, S→S, SINV→S, S→SBAR, VP→S
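
The first step of the clause splitter can be sketched as below. Which punctuation marks trigger a split, and keeping each mark attached to the clause it closes, are assumptions; the second step (splitting on production links) needs a parse tree and is left out.

```python
DEFAULT_MARKS = frozenset({",", ";", ":", "."})  # assumed set of split points


def split_by_punctuation(tokens, marks=DEFAULT_MARKS):
    """Split a tokenized sentence into clauses, cutting after each punctuation mark."""
    clauses, current = [], []
    for tok in tokens:
        current.append(tok)
        if tok in marks:
            clauses.append(current)
            current = []
    if current:  # trailing tokens without closing punctuation
        clauses.append(current)
    return clauses


# Invented example sentence.
clauses = split_by_punctuation(["The", "deal", "is", "done", ",", "he", "said", "."])
```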


Components – Attribution span labeler

• Attr span classifier features (curr, prev and next clauses):
  – Unigrams of curr
  – Lowercased and lemmatized verbs in curr
  – The first and last terms of curr
  – The last term of prev
  – The first term of next
  – The last term of prev + the first term of curr
  – The last term of curr + the first term of next
  – The position of curr in the sentence
  – Punctuation rules extracted from curr
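
A sketch of the clause-level features as a set of string-valued indicators. Assumptions: clauses are non-empty token lists, verb lemmas are supplied by an external lemmatizer, the position is encoded as the clause index, and `<none>` marks a missing neighbor. The punctuation rules are omitted because the slide does not define them.

```python
def attribution_features(clauses, k, verb_lemmas=()):
    """Indicator features for clause k given its neighboring clauses."""
    curr = [tok.lower() for tok in clauses[k]]
    prev_last = clauses[k - 1][-1].lower() if k > 0 else "<none>"
    next_first = clauses[k + 1][0].lower() if k + 1 < len(clauses) else "<none>"
    feats = {f"unigram={tok}" for tok in curr}
    feats |= {f"verb={lemma.lower()}" for lemma in verb_lemmas}
    feats |= {
        f"first={curr[0]}",
        f"last={curr[-1]}",
        f"prev_last={prev_last}",
        f"next_first={next_first}",
        f"prev_last+first={prev_last}_{curr[0]}",
        f"last+next_first={curr[-1]}_{next_first}",
        f"position={k}",
    }
    return feats


# Invented example: two clauses, features for the second one.
clauses = [["The", "deal", "is", "done", ","], ["he", "said", "."]]
feats = attribution_features(clauses, 1, verb_lemmas=["say"])
```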


Evaluation

• Train: 02-21, dev: 22, test: 23
• Each component is tested:
  – without and with error propagation (EP) from the previous component
  – with gold standard (GS) parse trees and sentence boundaries, and with an automatic (Auto) parser and sentence splitter


Evaluation – Connective classifier

• GS: increased acc and F1 by 2.05% and 3.05%
• Auto: increased acc and F1 by 1.71% and 2.54%
• Contextual info is helpful


Evaluation – Argument position classifier

• Able to accurately label SS
• But performs badly on the NAPS class
  – Due to the similarity between the IPS and NAPS classes


Evaluation – Argument extractor

• Human agreements on partial and exact matches: 94.5% and 90.2%

• Exact F1 much lower than partial F1
  – Due to small portions of text deleted


Evaluation – Explicit classifier

• Baseline: using only connective strings
  – 86%

• GS + no EP F1 increased by 0.44%


Evaluation – Non-explicit classifier

• Majority baseline: all classified as EntRel
• Adding EP degrades F1 by ~13%, but still outperforms baseline by ~6%


Evaluation – Attribution span labeler

• When EP added: the decrease of F1 is largely due to the drop in precision

• When Auto added: the decrease of F1 is largely due to the drop in recall


Evaluation – The whole pipeline

• Definition: a relation is correct if its relation type is classified correctly, and both Arg1 and Arg2 are partially or exactly matched

• GS + EP:
  – Partial: 46.38% F1
  – Exact: 31.72% F1
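
The correctness definition for the whole pipeline can be sketched as follows. Argument spans are represented as collections of token indexes. The slide does not define "partial match"; treating it as any token overlap is an assumption made here, and all names are hypothetical.

```python
def spans_match(gold, pred, mode):
    """Compare two argument spans given as iterables of token indexes."""
    gold, pred = set(gold), set(pred)
    if mode == "exact":
        return gold == pred
    if mode == "partial":
        return bool(gold & pred)  # assumed criterion: at least one shared token
    raise ValueError(f"unknown mode: {mode}")


def relation_correct(gold, pred, mode):
    """A relation counts as correct if its type and both arguments match."""
    return (
        gold["type"] == pred["type"]
        and spans_match(gold["Arg1"], pred["Arg1"], mode)
        and spans_match(gold["Arg2"], pred["Arg2"], mode)
    )


def f1_score(n_correct, n_predicted, n_gold):
    """Harmonic mean of precision and recall over relations."""
    if n_correct == 0:
        return 0.0
    precision = n_correct / n_predicted
    recall = n_correct / n_gold
    return 2 * precision * recall / (precision + recall)


# Invented example: the predicted Arg1 misses the first token of the gold Arg1.
gold = {"type": "Contingency", "Arg1": range(0, 4), "Arg2": range(5, 9)}
pred = {"type": "Contingency", "Arg1": range(1, 4), "Arg2": range(5, 9)}
```

In this example the relation counts as correct under partial matching but not under exact matching, which is the kind of gap that separates the two F1 figures.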


On-going changes

• Joint learning
• Change rule-based argument extractor to a machine learning approach

