Textual Relations

Date post: 06-Feb-2016
Relational Inference for Wikification. Xiao Cheng and Dan Roth.
Task Definition

Annotate input text with disambiguated Wikipedia titles:

Motivation

Current state-of-the-art Wikifiers, using purely statistical methods, already achieve good performance, leveling off at around 75% to 80% F1.

• Limitation of Bag-of-words representation

Evaluations

• Achieves significant improvement over the previous state-of-the-art systems

• Run “as-is”, without retraining on the target domain, the Relational Inference Wikifier (RI) still obtains a significant gain over our previously submitted Entity Linking system (CogComp).

Relational Wikification Approach

1. Identify Textual Relations

2. Retrieve Relational Knowledge

3. Formulate the inference problem

4. Rerank via constraints

Promote candidate pair: Slobodan_Milošević Socialist_Party_of_Serbia

Relational Inference for Wikification
Xiao Cheng and Dan Roth

This research is sponsored by DARPA under agreement number FA8750-13-2-0008, and partly supported by the IARPA under contract number D11PC20155, by the ARL under agreement W911NF-09-2-0053, and by the Multimodal Information Access & Synthesis Center at UIUC.

Demo: http://cogcomp.cs.illinois.edu/demo/wikify/

Blumenthal (D) is a candidate for the U.S. Senate seat now held by Christopher Dodd (D), and he has held a commanding lead in the race since he entered it. But the Times report has the potential to fundamentally reshape the contest in the Nutmeg State.

Wikipedia pages linked in this example: Chris Dodd; The New York Times; Connecticut; Democratic Party (United States); United States Senate; Richard Blumenthal.

...ousted long time Yugoslav President Slobodan Milošević in October. Mr. Milošević's Socialist Party…

$e_4^k$ = Socialist Party | $s_4^k$
Socialist_Party_(France) | .23
Socialist_Party_(Portugal) | .16
Socialist_Party_of_America | .07
Socialist_Party_(Argentina) | .06
… |
Socialist_Party_of_Serbia | .0

$e_3^k$ = Milošević | $s_3^k$
Slobodan_Milošević | .7
Milošević_(surname) | .1
Boki_Milošević | .1
Alexander_Milošević | .05
… |

Argument 1 | Relation Type | Argument 2
Yugoslav President | apposition | Slobodan Milošević
Slobodan Milošević | coreference | Milošević
Milošević | possessive | Socialist Party
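The relation table above can be held in a small data structure. The following is a minimal sketch, not the authors' implementation: the `TextualRelation` class and the `resolve` helper are hypothetical names, and the rule that a coreference row maps its second argument onto its first is an assumption made for illustration. It shows how the coreference row lets the possessive relation be read as holding between “Slobodan Milošević” and “Socialist Party”.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextualRelation:
    """One row of the relation table: (argument 1, relation type, argument 2)."""
    arg1: str
    relation_type: str
    arg2: str


# The three rows of the table above.
RELATIONS = [
    TextualRelation("Yugoslav President", "apposition", "Slobodan Milošević"),
    TextualRelation("Slobodan Milošević", "coreference", "Milošević"),
    TextualRelation("Milošević", "possessive", "Socialist Party"),
]


def resolve(relations):
    """Rewrite non-coreference relations so short mentions use their fuller form.

    Assumption: in a coreference row, arg2 is the shorter mention and arg1 the
    fuller one it corefers with.
    """
    canonical = {
        rel.arg2: rel.arg1
        for rel in relations
        if rel.relation_type == "coreference"
    }
    resolved = []
    for rel in relations:
        if rel.relation_type == "coreference":
            continue  # used only to build the mapping
        resolved.append(
            TextualRelation(
                canonical.get(rel.arg1, rel.arg1),
                rel.relation_type,
                canonical.get(rel.arg2, rel.arg2),
            )
        )
    return resolved


resolved = resolve(RELATIONS)
for rel in resolved:
    print(rel.arg1, "|", rel.relation_type, "|", rel.arg2)
```

After resolution, the possessive row pairs the full name with the party mention, which is the pair of mentions whose candidates the later inference step tries to make coherent.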

[Figure: TAC KBP 2011 Entity Linking Performance. Series: Micro Average, B³ F1. X-axis: System Names (RI, CogComp, LCC, MS_MLI, NUShime, *Median). Y-axis ticks: 68 to 88.]

[Figure: F1 Performance on Wikification datasets. X-axis: ACE, MSNBC, AQUAINT, Wikipedia. Series: Milne&Witten, Ratinov&Roth, Relational Inference. Y-axis ticks: 60 to 95.]

Knowledge base relation between the two Wikipedia pages: (Slobodan Milošević, founded, Socialist Party of Serbia)

Mubarak, the wife of deposed Egyptian President Hosni Mubarak, …

[Figure labels: Mubarak, wife, Egyptian President, Hosni Mubarak]

$r_{34}^{(1,21)} = 1$

Textual Relations

We are interested in extracting high-precision textual relations that help with disambiguation. Specifically, we focus on the following types of relations:

• Syntactico-semantic relations (Chan & Roth ‘10)
• Coreference relations
• Acronyms, partial names, nominal mentions

Discussion

We show that both linguistic and world knowledge, specifically the ability to use relational information, are crucial in the task of Wikification. To do that, we introduce an extensible and efficient inference framework that leverages better language understanding. Additional work is needed to accumulate and better integrate our knowledge about NIL entities to fully address the Entity Linking task and handle additional encyclopedic resources. The performance gains and error analysis also call for joint entity typing, coreference and disambiguation.

• Bag-of-words loses important relational information
• Modeling constraining interaction between concepts
• Need to link Mubarak to Suzanne Mubarak
• Identify relation (Mubarak, wife, Hosni Mubarak)
• Promote a pair of candidates that is coherent with text meaning

• For each pair of entity candidates $e_i^k$ and $e_j^l$, $r_{ij}^{(k,l)}$ represents whether we found a relation in the text between their mentions AND a relation in our knowledge base.
• The associated weight either rewards or penalizes a relation for its coherency with the text.

High-level algorithm description
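One way to read the two bullets above as an optimization problem is sketched below. It is not a transcription of the poster. Here $e_i^k$ is taken to indicate that mention $i$ is assigned its $k$-th candidate, $s_i^k$ is that candidate's score, and $r_{ij}^{(k,l)}$ indicates that the candidate pair $(k,l)$ for mentions $(i,j)$ is jointly selected and relationally supported. The weight symbol $w_{ij}^{(k,l)}$ and the exact form of the constraints are assumptions; the EMNLP’13 paper listed in the references gives the authors' own formulation.

```latex
\begin{aligned}
\max_{e,\,r}\quad
  & \sum_{i}\sum_{k} s_i^k\, e_i^k
    \;+\; \sum_{i,j}\sum_{k,l} w_{ij}^{(k,l)}\, r_{ij}^{(k,l)} \\
\text{s.t.}\quad
  & e_i^k \in \{0,1\}, \qquad r_{ij}^{(k,l)} \in \{0,1\}, \\
  & \sum_{k} e_i^k = 1 \quad \text{for every mention } i, \\
  & 2\, r_{ij}^{(k,l)} \le e_i^k + e_j^l .
\end{aligned}
```

Under this reading, $r_{34}^{(1,21)} = 1$ is feasible only when candidate 1 of mention 3 and candidate 21 of mention 4 are both selected, so a positive weight on that pair rewards choosing the two candidates together.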

Type | Example
Premodifier | Iranian Ministry of Defense
Possessive | NYC’s stock exchange
Formulaic | Chicago, Illinois
Preposition | President of the US
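To make the four types in the table concrete, here is a toy classifier that labels a relation by the text connecting its two arguments. It is a surface-pattern illustration only and is not the extractor of Chan & Roth ‘10. The function names, the short preposition list, and the way each example is split into two arguments are all assumptions.

```python
from typing import Optional

# A deliberately small, illustrative preposition list (an assumption).
PREPOSITIONS = {"of", "in", "at", "for", "from"}


def connector(text: str, arg1: str, arg2: str) -> str:
    """Return the text between the end of arg1 and the start of arg2."""
    start = text.index(arg1) + len(arg1)
    end = text.index(arg2, start)
    return text[start:end]


def relation_type(text: str, arg1: str, arg2: str) -> Optional[str]:
    """Label the relation between two arguments by their connecting text."""
    between = connector(text, arg1, arg2).strip().lower()
    if between in ("'s", "’s"):
        return "Possessive"
    if between == ",":
        return "Formulaic"
    if between == "":
        # The arguments are adjacent, so arg1 directly modifies arg2.
        return "Premodifier"
    if between.split()[0] in PREPOSITIONS:
        return "Preposition"
    return None


# The argument splits below are assumed for illustration.
EXAMPLES = [
    ("Iranian Ministry of Defense", "Iranian", "Ministry of Defense"),
    ("NYC’s stock exchange", "NYC", "stock exchange"),
    ("Chicago, Illinois", "Chicago", "Illinois"),
    ("President of the US", "President", "US"),
]

labels = [relation_type(text, a1, a2) for text, a1, a2 in EXAMPLES]
print(labels)
```

Classifying on the connector rather than on the whole phrase matters for the first example: “Iranian Ministry of Defense” contains “of”, but the relation of interest is between “Iranian” and the full organization name.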

Relation Retrieval
• Uses DBpedia and Wikipedia page link relations as our knowledge base
• Retrieves lexically similar candidates and filters
• q1=(Socialist Party of France,?, *Milošević*)
• q2=(Slobodan Milošević,?,*Socialist Party*)
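The two queries can be illustrated with a toy lookup. This is a sketch under stated assumptions, not the authors' retrieval code: `TOY_KB` is a hand-made stand-in for DBpedia and page-link relations, the `pagelink` predicate name and the `retrieve` function are hypothetical, and plain substring matching stands in for whatever lexical similarity the system actually uses.

```python
# Toy knowledge base of (subject, predicate, object) triples.
# Entries are illustrative only; "pagelink" is a made-up predicate name.
TOY_KB = [
    ("Slobodan_Milošević", "founded", "Socialist_Party_of_Serbia"),
    ("Slobodan_Milošević", "pagelink", "Yugoslavia"),
    ("Socialist_Party_(France)", "pagelink", "François_Mitterrand"),
]


def normalize(title: str) -> str:
    """Compare titles and surface strings case-insensitively, ignoring underscores."""
    return title.replace("_", " ").lower()


def retrieve(kb, known_title: str, surface: str):
    """Answer a query (known_title, ?, *surface*).

    Returns every triple that links known_title, in either argument position,
    to an entity whose title contains the surface string.
    """
    known = normalize(known_title)
    needle = normalize(surface)
    hits = []
    for subj, pred, obj in kb:
        s, o = normalize(subj), normalize(obj)
        if (s == known and needle in o) or (o == known and needle in s):
            hits.append((subj, pred, obj))
    return hits


# q1: does the French party candidate relate to anything named *Milošević*?
q1 = retrieve(TOY_KB, "Socialist_Party_(France)", "Milošević")
# q2: does Slobodan Milošević relate to anything named *Socialist Party*?
q2 = retrieve(TOY_KB, "Slobodan_Milošević", "Socialist Party")

print(q1)
print(q2)
```

In this toy setting q1 finds nothing, while q2 returns the “founded” relation, which is the evidence that supports the Serbian party candidate.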

Relation Inference
• Relation scoring
  • Relaxes constraint when ambiguity exists
  • Scores each retrieved relation for each query, combining:
    • a relation weight for each knowledge source
    • the lexical similarity between the query and the retrieved relation
    • a normalization factor that keeps the weights for each pair of mentions between 0 and 1
• Special handling of “local knowledge”
  • Creates a NIL entity candidate for inference propagation, so that locally extracted high-precision knowledge can be considered across long-range textual relations
• Uses off-the-shelf Integer Linear Programming (ILP) packages to optimize the objective function
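The effect of the inference step on the Milošević example can be sketched in a few lines. This is an illustration, not the authors' system: exhaustive search over candidate assignments stands in for an ILP package, the candidate scores are the ones shown in the example (further candidates hidden behind the ellipses are omitted), and the relation weight of 1.0 is a hypothetical value chosen for the demonstration.

```python
from itertools import product

# Candidate scores copied from the example; the poster shows ".0" for the
# Serbian party, and candidates hidden behind the ellipses are left out.
CANDIDATES = {
    "Milošević": {
        "Slobodan_Milošević": 0.7,
        "Milošević_(surname)": 0.1,
        "Boki_Milošević": 0.1,
        "Alexander_Milošević": 0.05,
    },
    "Socialist Party": {
        "Socialist_Party_(France)": 0.23,
        "Socialist_Party_(Portugal)": 0.16,
        "Socialist_Party_of_America": 0.07,
        "Socialist_Party_(Argentina)": 0.06,
        "Socialist_Party_of_Serbia": 0.0,
    },
}

# Hypothetical weight for a candidate pair supported by both the text
# (possessive relation) and the knowledge base ("founded" relation).
RELATION_WEIGHTS = {
    ("Slobodan_Milošević", "Socialist_Party_of_Serbia"): 1.0,
}


def best_assignment(candidates, relation_weights):
    """Pick one title per mention, maximizing scores plus relation weights.

    Brute-force enumeration; fine for a toy example, whereas a real system
    would hand the same objective to an ILP solver.
    """
    mentions = list(candidates)
    best, best_score = None, float("-inf")
    for titles in product(*(candidates[m] for m in mentions)):
        score = sum(candidates[m][t] for m, t in zip(mentions, titles))
        chosen = set(titles)
        # A relation weight counts only if both of its candidates are chosen.
        score += sum(
            w for (a, b), w in relation_weights.items()
            if a in chosen and b in chosen
        )
        if score > best_score:
            best, best_score = dict(zip(mentions, titles)), score
    return best, best_score


baseline, _ = best_assignment(CANDIDATES, {})
reranked, _ = best_assignment(CANDIDATES, RELATION_WEIGHTS)
print("without relations:", baseline)
print("with relations:   ", reranked)
```

Without relation weights the highest-scoring candidates are chosen independently, so “Socialist Party” goes to the French party. With the weighted pair, the assignment that selects both Slobodan_Milošević and Socialist_Party_of_Serbia scores highest.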

References:
X. Cheng and D. Roth. Relational Inference for Wikification. EMNLP’13.
L. Ratinov, D. Roth, D. Downey, and M. Anderson. Local and Global Algorithms for Disambiguation to Wikipedia. ACL’11.
