An Entity-Mention Model for Coreference Resolution with Inductive Logic Programming

Xiaofeng Yang (1), Jian Su (1), Jun Lang (2), Chew Lim Tan (3), Ting Liu (2), Sheng Li (2)

1 Institute for Infocomm Research
2 Harbin Institute of Technology
3 National University of Singapore

ACL 2008

Reporter: Chia-Ying Lee
Advisor: Prof. Hsin-Hsi Chen

Introduction

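Coreference resolution: the process of linking multiple mentions that refer to the same entity

Terminology: coreference (two mentions refer to the same entity); anaphor (the later, referring mention); antecedent (the earlier mention it refers back to)

Inductive logic programming (ILP): supervised learning that induces rules from positive cases (with negative cases as counter-examples)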

Related Work

1. Mention-pair model
   Aone and Bennett (1995); McCarthy and Lehnert (1995); Soon et al. (2001); Ng and Cardie (2002)
   An individual mention usually lacks adequate descriptive information about the referred entity (e.g. Powell vs. he)

2. Entity-mention model
   Luo et al. (2004); Yang et al. (2004)
   Cannot describe each individual mention in an entity

3. Inductive logic programming in NLP
   Parsing (Mooney, 1997)
   POS disambiguation (Cussens, 1996)
   Lexicon construction (Claveau et al., 2003)
   WSD (Specia et al., 2007)

Modeling Coreference Resolution

The probability that a mention belongs to an entity

Example

e1: Microsoft Corp. - its - The company
e2: its new CEO - he
e3: yesterday

Mention-Pair Model

Soon et al. (2001) and Ng and Cardie (2002)

Instance i{mk, mj}: mj is the active mention and mk is a preceding mention

Positive: mj paired with its closest antecedent (only one per mj)

Negative: mj paired with each intervening mention between mj and its closest antecedent

Resolution: mj is linked with the mention that is classified as positive (if any) with the highest confidence value
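
The instance-creation scheme on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function name `mention_pair_instances`, the `gold` mapping, and the mention numbering m1..m6 for the Microsoft example are assumptions made for the sketch (the slides do not number the mentions).

```python
from typing import Dict, List, Tuple


def mention_pair_instances(entity_of: Dict[int, str]) -> List[Tuple[int, int, bool]]:
    """Build (mk, mj, label) training pairs from gold entity assignments.

    `entity_of` maps a mention index (document order) to a gold entity id.
    """
    instances: List[Tuple[int, int, bool]] = []
    mentions = sorted(entity_of)
    for pos, mj in enumerate(mentions):
        preceding = mentions[:pos]
        # Closest preceding mention that belongs to the same gold entity, if any.
        antecedent = next(
            (mk for mk in reversed(preceding) if entity_of[mk] == entity_of[mj]),
            None,
        )
        if antecedent is None:
            continue  # mj has no antecedent, so it yields no training instance
        instances.append((antecedent, mj, True))
        # Every mention strictly between the antecedent and mj is a negative.
        for mk in preceding:
            if mk > antecedent:
                instances.append((mk, mj, False))
    return instances


# Hypothetical numbering: m1 Microsoft Corp., m2 its, m3 its new CEO,
# m4 yesterday, m5 The company, m6 he.
gold = {1: "e1", 2: "e1", 3: "e2", 4: "e3", 5: "e1", 6: "e2"}
pairs = mention_pair_instances(gold)
```

Under this numbering, m6 ("he") gets one positive pair with m3 and negative pairs with m4 and m5, which shows how each decision sees only two mentions at a time.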

Feature Set for Coreference Resolution

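[Feature table (figure) not recoverable from the transcript. Surviving annotations: group markers 1, 2, 3 and the glosses "appositive" and "predicate".]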

Entity-Mention Model

Weakness of the mention-pair model: an individual mention lacks adequate descriptive information. Example: “Mr. Powell”, “Powell”, and “she”; judged pairwise, “Powell” and “she” can be linked, although “Mr. Powell” in the same entity signals a gender conflict

Instance i{ek, mj}: mj is the active mention and ek is a partial entity found before mj
  Positive: mj and the entity to which mj belongs
  Negative: mj and every entity whose last mention occurs between mj and its closest antecedent

If no positive entity exists, mj forms a new entity

Entity features
  First-order features: Any-X, Most-X, All-X
  Distance feature: the minimum distance between the mentions in the entity and the active mention
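
A minimal sketch of how the entity-level features above could be derived from a pairwise feature X. The function name `entity_features`, the "more than half" reading of Most-X, and measuring distance in mention indices are assumptions for illustration; the slide does not fix these details.

```python
from typing import Callable, Dict, List


def entity_features(
    entity: List[int],
    mj: int,
    pair_feature: Callable[[int, int], bool],
    name: str,
) -> Dict[str, object]:
    """Aggregate a pairwise feature X over all mentions mk of a non-empty entity."""
    values = [pair_feature(mk, mj) for mk in entity]
    n_true = sum(values)
    return {
        f"Any-{name}": n_true > 0,
        # Assumption: "most" means strictly more than half of the mentions.
        f"Most-{name}": n_true * 2 > len(values),
        f"All-{name}": n_true == len(values),
        # Assumption: distance is counted in mention positions.
        "Distance": min(abs(mj - mk) for mk in entity),
    }


# Toy pairwise feature (hypothetical): only mentions 1 and 5 agree with mj.
def agrees(mk: int, mj: int) -> bool:
    return mk in {1, 5}


features = entity_features([1, 2, 5], 6, agrees, "NumAgree")
```

Here two of the three mentions agree, so Any-NumAgree and Most-NumAgree hold while All-NumAgree does not, and the distance is 1 because mention 5 immediately precedes mention 6.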

Entity-mention Model with ILP (1/3)

Tool: ALEPH by Srinivasan (2000) (Oxford)

Input:
  positive examples E+
  negative examples E-
  background knowledge K

Output: hypotheses h

Example representation: link(e1_6, m6), where e1_6 denotes the part of e1 before m6
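
As a rough sketch, the positive and negative examples could be emitted as `link(ei_j, mj)` facts like this. The helper name `entity_mention_examples`, the `gold` mapping, and the mention numbering m1..m6 are assumptions for illustration, and how ALEPH expects these facts to be stored on disk is not covered here.

```python
from typing import Dict, List, Tuple


def entity_mention_examples(entity_of: Dict[int, str]) -> Tuple[List[str], List[str]]:
    """Return (positives, negatives) as Prolog-style link/2 facts."""
    positives: List[str] = []
    negatives: List[str] = []
    mentions = sorted(entity_of)
    for pos, mj in enumerate(mentions):
        # Partial entities: the members of each gold entity that precede mj.
        partial: Dict[str, List[int]] = {}
        for mk in mentions[:pos]:
            partial.setdefault(entity_of[mk], []).append(mk)
        own = partial.get(entity_of[mj])
        if own is None:
            continue  # mj starts a new entity, so no example is produced
        antecedent = own[-1]
        positives.append(f"link({entity_of[mj]}_{mj}, m{mj}).")
        # Negative: other entities whose last mention lies after the antecedent.
        for eid, members in partial.items():
            if eid != entity_of[mj] and members[-1] > antecedent:
                negatives.append(f"link({eid}_{mj}, m{mj}).")
    return positives, negatives


# Hypothetical numbering: m1 Microsoft Corp., m2 its, m3 its new CEO,
# m4 yesterday, m5 The company, m6 he.
gold = {1: "e1", 2: "e1", 3: "e2", 4: "e3", 5: "e1", 6: "e2"}
pos_facts, neg_facts = entity_mention_examples(gold)
```

Under this numbering, `link(e2_6, m6).` comes out as a positive example and `link(e1_6, m6).` as a negative one.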

Entity-mention Model with ILP (2/3)

Background knowledge K: predicates

1. Information related to ei_j and mj

2. Relations between ei_j and its mentions

has_mention(ei_j, mk), for each mention mk in ei_j

3. Information related to mj and each mention mk in ei_j

Entity-mention Model with ILP (3/3)

Hypothesis rule

link(A,B) :- has_mention(A,C), numAgree(B,C,1), strMatch_Head(B,C,1), bareNP(C,1).
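
To make the reading of such a rule concrete, here is a small Python sketch that evaluates it over a set of ground facts: entity A links to mention B if some mention C of A satisfies every body literal. The tuple encoding of facts, the function name `link`, and the toy identifiers (`e_demo`, `m_a`, `m_b`, `m_new`) are assumptions for illustration; in practice the rule would be executed by a Prolog engine.

```python
from typing import Set, Tuple

Fact = Tuple[str, ...]


def link(entity: str, mention: str, facts: Set[Fact]) -> bool:
    """Evaluate the rule body with C ranging over the mentions of `entity`."""
    members = [f[2] for f in facts if f[0] == "has_mention" and f[1] == entity]
    return any(
        ("numAgree", mention, c, "1") in facts
        and ("strMatch_Head", mention, c, "1") in facts
        and ("bareNP", c, "1") in facts
        for c in members
    )


# Hypothetical background facts for one partial entity and one active mention.
facts: Set[Fact] = {
    ("has_mention", "e_demo", "m_a"),
    ("has_mention", "e_demo", "m_b"),
    ("numAgree", "m_new", "m_a", "1"),  # m_a fails the remaining literals
    ("numAgree", "m_new", "m_b", "1"),
    ("strMatch_Head", "m_new", "m_b", "1"),
    ("bareNP", "m_b", "1"),
}

linked = link("e_demo", "m_new", facts)
```

The variable C is existentially quantified, so a single mention in the entity (here `m_b`) that satisfies all three conditions is enough for the link to hold.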

Experiments and Results (1/4)

Corpus: ACE-2 V1.0 corpus (NIST, 2003)

Modified settings of the ILP tool ALEPH:
  Rule accuracy threshold: from 100% to 50%
  Rule length limit: from 3 predicates to 10 predicates

Experiments and Results (2/4)

Baseline model: C4.5 algorithm

Preprocessing
  Tokenizer
  Part-of-Speech tagger: accuracy of 97% on Penn WSJ TreeBank
  NP chunker (Zhou and Su, 2000): F-measure above 94% on Penn WSJ TreeBank
  Named-Entity Recognizer (Zhou and Su, 2002): F-measure of 96.6% (MUC-6) and 94.1% (MUC-7)

Experiments and Results (3/4)

F-measure is 2-4% lower than the state-of-the-art systems, which used sophisticated semantic or real-world knowledge

Statistically significant under a two-tailed t-test (p < 0.05)

Experiments and Results (4/4)

Multiple non-instantiated arguments (e.g. C and D) can appear in the same rule

Conclusion & Future Work

The model can express the relations between an entity and its mentions and automatically learns first-order rules

The ILP-based entity-mention model performs better than the mention-pair model (up to 2.3% increase in F-measure)

Future work: investigate more sophisticated clustering methods that would lead to global optimization
  keeping a large search space (Luo et al., 2004)
  using integer programming (Denis and Baldridge, 2007)

Thank you!
