
CS460/626 : Natural Language Processing/Speech, NLP and the Web

(Lecture 38–Universal Networking Language)

Pushpak Bhattacharyya
CSE Dept., IIT Bombay

14th April, 2011

A Perspective
- Morphology
- Lexicon
- Syntax
- Semantics
- Pragmatics
- Discourse

UNL: a United Nations project
- Started in 1996
- 10 year program
- 15 research groups across continents
- First goal: generators
- Next goal: analysers (needs solving various ambiguity problems)
- Current active language groups
  - UNL_French (GETA-CLIPS, IMAG)
  - UNL_English+Hindi
  - UNL_Italian (Univ. of Pisa)
  - UNL_Portuguese (Univ. of São Paulo, Brazil)
  - UNL_Russian (Institute of Linguistics, Moscow)
  - UNL_Spanish (UPM, Madrid)

World-wide Universal Networking Language (UNL) Project

[Figure: UNL at the centre as a language independent meaning representation, connected to English, Russian, Japanese, Hindi, Spanish, Marathi and others.]

Foundations and Applications
- UNL Foundations
  - Semantic Relations
  - Universal Words
  - Attributes
  - How to write UNL expressions
- UNL Applications
  - Machine Translation: Rule based and Statistical
  - Search
  - Text Entailment
  - Sentiment Analysis

UNL represents knowledge: John eats rice with a spoon

[Figure: UNL graph of the sentence, with callouts for semantic relations, attributes and universal words.]

Repository of 42 semantic relations and 84 attribute labels

Sentence embeddings

Deepa claimed that she had composed a poem.

[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[/UNL]
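The scoped expressions above can be read mechanically. What follows is a minimal sketch, not part of the lecture: the names `Relation`, `parse_relation` and `parse_unl` are hypothetical, and the regular expression assumes binary relations whose first argument contains no comma.

```python
import re
from typing import List, NamedTuple, Optional


class Relation(NamedTuple):
    label: str            # relation name, e.g. "agt"
    scope: Optional[str]  # scope id such as "01", or None for the main sentence
    source: str           # first argument (UW plus attributes)
    target: str           # second argument, possibly a scope reference like ":01"


# label, optional ":NN" scope id, then two comma-separated arguments
_REL = re.compile(r"^\s*([a-z]{1,3})(?::(\d{2}))?\s*\(\s*(.+?)\s*,\s*(.+?)\s*\)\s*$")


def parse_relation(line: str) -> Relation:
    """Parse one line such as 'agt:01(compose.@past, she)'."""
    match = _REL.match(line)
    if match is None:
        raise ValueError(f"not a binary UNL relation: {line!r}")
    return Relation(match.group(1), match.group(2), match.group(3), match.group(4))


def parse_unl(text: str) -> List[Relation]:
    """Parse every non-empty line of a UNL expression block."""
    return [parse_relation(line) for line in text.splitlines() if line.strip()]


deepa = """
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
"""
relations = parse_unl(deepa)
for rel in relations:
    print(rel)
```

The second relation has `:01` as its target, which is how the embedded clause "she had composed a poem" is attached as the object of "claim".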

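English sentences: basic structure
- A <verb> B
  John eats bread
  agt(eat.@entry, John)
  obj(eat.@entry, bread)
- A <verb>
  John sleeps
  aoj(sleep.@entry, John)
- A <be> B
  John is good
  aoj(good.@entry, John)

[Figure: graph templates for the three patterns: a verb node linked to A and B by relations R1 and R2; a verb node linked to A by aoj; nodes B and A with relation labels R1 and R2.]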

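Hindi sentences: basic structure
- A B <verb>
  John roti khaataa hai (John eats bread)
  agt(eat.@entry, John)
  obj(eat.@entry, bread)
- A <verb>
  John sotaa hai (John sleeps)
  aoj(sleep.@entry, John)
- A <be> B
  John acchaa hai (John is good)
  aoj(good.@entry, John)

[Figure: the same graph templates as for English: a verb node linked to A and B by relations R1 and R2; a verb node linked to A by aoj; nodes B and A with relation labels R1 and R2.]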

Complex English sentences: Use recursion on the basic structure
- A <verb> B
  John who is a good boy eats bread which is toasted
  agt(eat.@entry, :01)
  obj(eat.@entry, :02)
  aoj:01(boy, John.@entry)
  mod:01(boy, good)
  obj:02(toast, bread.@entry.@focus)

[Figure: UNL graph. eat is linked by agt to scope :01 (aoj between boy and John, mod between boy and good) and by obj to scope :02 (obj between toast and bread). Red arrows indicate entry nodes.]
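The recursion can be made concrete by replacing each scope reference with the entry node of that scope. The sketch below is illustrative only: `entry_nodes` and `resolve` are hypothetical helper names, relations are written as plain tuples, and the toast/bread relation is assumed to belong to scope :02, as the scope labels in the figure suggest.

```python
# (label, scope, source, target); scope None is the main sentence.
relations = [
    ("agt", None, "eat.@entry", ":01"),
    ("obj", None, "eat.@entry", ":02"),
    ("aoj", "01", "boy", "John.@entry"),
    ("mod", "01", "boy", "good"),
    ("obj", "02", "toast", "bread.@entry.@focus"),  # assumed scope :02
]


def entry_nodes(rels):
    """Map each scope id to the headword of the node carrying @entry."""
    entries = {}
    for _label, scope, source, target in rels:
        for node in (source, target):
            head, *attrs = node.split(".@")
            if "entry" in attrs:
                entries[scope] = head
    return entries


def resolve(rels):
    """Return the main-sentence relations with scope references such as ':01'
    replaced by the entry headword of the referenced scope."""
    entries = entry_nodes(rels)

    def head(node):
        if node.startswith(":"):
            return entries[node[1:]]
        return node.split(".@")[0]

    return [(label, head(s), head(t)) for label, scope, s, t in rels if scope is None]


print(entry_nodes(relations))
print(resolve(relations))
```

Under these assumptions the main sentence collapses to the basic A <verb> B pattern, with John and bread standing in for the two embedded clauses.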

Constituents of Universal Networking Language
- Universal Words (UWs)
- Relations
- Attributes
- Knowledge Base

What is a Universal Word (UW)?
- Words of UNL
- Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions
- A UW represents a concept
  - Basic UW (an English word/compound word/phrase with no restrictions or Constraint List)
  - Restricted UW (with a Constraint List)
- Examples:
  “crane(icl>device)”
  “crane(icl>bird)”

The Lexicon

Format of the dictionary entry:
[headword] {} “Universal word” (Attribute list);

e.g., [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN);
(Head word: minister; Universal word: minister(icl>person); Attributes: N,ANIMT,PHSCL,PRSN)

Attribute types:
- Morphological: Pl (plural), V_ed (past tense form)
- Syntactic: V (verb), VOA (verb of action)
- Semantic: ANIMT (animate), PLACE, TIME

The Lexicon (contd.)

He forwarded the mail to the minister.

Content words (each entry gives Headword, Universal Word, Attributes):

[forward] {} “forward(icl>send)” (V,VOA) <E,0,0>;

[mail] {} “mail(icl>message)” (N,PHSCL,INANI) <E,0,0>;

[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;

The Lexicon (contd.)

He forwarded the mail to the minister.

Function words (each entry gives Headword, Universal Word, Attributes):

[he] {} “he” (PRON,SUB,SING,3RD)

[the] {} “the” (ART,THE) <E,0,0>;

[to] {} “to” (PRE,#TO) <E,0,0>;
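The dictionary format shown on these slides is regular enough to read with a short parser. This is a sketch under assumptions rather than the actual UNL dictionary tooling: `LexEntry` and `parse_entry` are hypothetical names, both straight and curly quotes are accepted, and the trailing `<E,0,0>` field is ignored because the slides do not explain it.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class LexEntry:
    headword: str
    uw: str
    attributes: List[str]


_ENTRY = re.compile(
    r'\[(?P<head>[^\]]+)\]\s*\{[^}]*\}\s*'    # [headword] {}
    r'["“”](?P<uw>[^"“”]+)["“”]\s*'           # "Universal word"
    r'\((?P<attrs>[^)]*)\)'                   # (Attribute list)
)


def parse_entry(line: str) -> LexEntry:
    """Parse one dictionary line into headword, universal word and attributes."""
    match = _ENTRY.search(line)
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    attrs = [a.strip() for a in match.group("attrs").split(",") if a.strip()]
    return LexEntry(match.group("head"), match.group("uw"), attrs)


lexicon = [
    parse_entry('[forward] {} "forward(icl>send)" (V,VOA) <E,0,0>;'),
    parse_entry('[mail] {} "mail(icl>message)" (N,PHSCL,INANI) <E,0,0>;'),
    parse_entry('[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;'),
    parse_entry('[to] {} "to" (PRE,#TO) <E,0,0>;'),
]
for entry in lexicon:
    print(entry)
```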

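Multilingual dictionary

Universal Word (सार्वभौम शब्द): farmer(icl>creator)

Head words (मुख्य शब्द): farmer (E, English), शेतकरी (M, Marathi), किसान (H, Hindi)

Attributes (गुण), as listed on the slide:
N,ANIMT,FAUNA,MML,PRSN
N,M,ANIMT,FAUNA,MML,PRSN
N,M,ANIMT,FAUNA,MML,PRSN,Na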

The Features of a UW
- Every concept existing in any language must correspond to a UW
- The constraint list should be as small as necessary to disambiguate the headword
- Every UW should be defined in the UNL Knowledge-Base (now WordNet)

Restricted UWs
- Examples
  He will hold office until the spring of next year.
  The spring was broken.
- Restricted UWs, which are headwords with a constraint list, for example:
  “spring(icl>season)”
  “spring(icl>device)”
  “spring(icl>jump)”
  “spring(icl>fountain)”

How to create UWs?
- Pick up a concept
  - the concept of “crane” as “a device for lifting heavy loads”
  - or as “a long-legged bird that wades in water in search of food”
- Choose an English word for the concept.
  - In the case of “crane”, since it is a word of English, the corresponding word should be ‘crane’
- Choose a constraint list for the word.
  [ ] ‘crane(icl>device)’
  [ ] ‘crane(icl>bird)’

Example: Hindi word ghar
- ghar: house
  usne garmii me ghar kii marammat kii
  he renovated the house in the summer
- ghar: home
  office ke baad ghar louto
  return home after office
- ghar: family
  bade ghar kii betii
  girl from a renowned family

Example: ghar (contd.)
- ghar: own country
  bahut saal bidesh me kaam karke ghar louta aayaa
  returned home after working abroad for many years
- ghar: astrological position
  ashtam ghar par budh hai
  Mercury is in the eighth house

House in English WordNet

1. (1029) house -- (a dwelling that serves as living quarters for one or more families; "he has a house on Cape Cod"; "she felt she had to get out of the house")

3. (51) house -- (a building in which something is sheltered or located; "they had a large carriage house")

4. (39) family, household, house, home, menage -- (a social unit living together; "he moved his family to Virginia"; "It was a good Christian household")

7. (13) house -- (aristocratic family line; "the House of York")

11. sign of the zodiac, star sign, sign, mansion, house, planetary house -- ((astrology) one of 12 equal areas into which the zodiac is divided)

Unambiguous construction of UWs

Use constraints: Ontological, Semantic and Argument

Example: forward a mail to the minister
forward(icl>do, icl>send, agt>thing(icl>animate), obj>thing(icl>inanimate), gol>thing)

Constraint types:
- icl>do: ontological
- icl>send: semantic
- agt>thing, obj>thing, gol>thing: argument
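A restricted UW like the one above can be split into its headword and constraints, and each constraint can be labelled with the three types from the slide. The sketch below is hypothetical: `parse_uw` and `constraint_type` are illustrative names, and the classification rule (argument if the label is agt, obj or gol; ontological only for icl>do; semantic otherwise) is an assumption generalised from this single example.

```python
def split_top_level(text, sep=","):
    """Split on separators that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_uw(uw):
    """Return (headword, [constraints]) for a UW such as 'crane(icl>bird)'."""
    uw = uw.strip()
    head, paren, rest = uw.partition("(")
    if not paren:
        return head, []  # basic UW with no constraint list
    if not rest.endswith(")"):
        raise ValueError(f"unbalanced constraint list: {uw!r}")
    return head.strip(), split_top_level(rest[:-1])


ARGUMENT_LABELS = {"agt", "obj", "gol"}  # the argument constraints named on the slide
ONTOLOGICAL_TARGETS = {"do"}             # assumption: only icl>do is named as ontological


def constraint_type(constraint):
    label, _, target = constraint.partition(">")
    if label in ARGUMENT_LABELS:
        return "argument"
    if label == "icl" and target in ONTOLOGICAL_TARGETS:
        return "ontological"
    return "semantic"


head, constraints = parse_uw(
    "forward(icl>do, icl>send, agt>thing(icl>animate), obj>thing(icl>inanimate), gol>thing)"
)
for c in constraints:
    print(f"{c}: {constraint_type(c)}")
```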

UNL Relations
- Relations constitute the syntax of UNL
- Express how concepts (UWs) constitute a sentence
- Represented as strings of 3 characters or less
- A set of 41 relations specified in UNL (e.g., agt, aoj, ben, gol, obj, plc, src, tim, …)
- Refer to a semantic role between two lexical items in a sentence

AGT / AOJ / OBJ

AGT (Agent)
Definition: Agt defines a thing which initiates an action

AOJ (Thing with attribute)
Definition: Aoj defines a thing which is in a state or has an attribute

OBJ (Affected thing)
Definition: Obj defines a thing in focus which is directly affected by an event or state

Examples

John broke the window.
agt(break.@entry.@past, John)

This flower is beautiful.
aoj(beautiful.@entry, flower)

He blamed John for the accident.
obj(blame.@entry.@past, John)

Example: UNL Graph with agt, obj, ben

He carved a toy for the baby.

[Figure: UNL graph. carve(icl>cut), marked @entry and @past, is linked by agt to he(iof>person), by obj to toy(icl>plaything) and by ben to baby(icl>child); @def marks a definite node.]
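The graph for "He carved a toy for the baby." can also be held as a small data structure and written back out as UNL expressions. This is a sketch, not the lecture's own code: `build_graph` and `to_unl` are hypothetical names, the edges are read off the figure labels, and attaching @def to baby(icl>child) is an assumption based on "the baby" in the sentence.

```python
from collections import defaultdict

# (relation label, source UW, target UW), read off the figure.
edges = [
    ("agt", "carve(icl>cut)", "he(iof>person)"),
    ("obj", "carve(icl>cut)", "toy(icl>plaything)"),
    ("ben", "carve(icl>cut)", "baby(icl>child)"),
]
# Attributes per node; @def on baby is an assumption.
attributes = {
    "carve(icl>cut)": ["@entry", "@past"],
    "baby(icl>child)": ["@def"],
}


def build_graph(edges):
    """Adjacency map: source UW -> {relation label: target UW}."""
    graph = defaultdict(dict)
    for label, source, target in edges:
        if not 1 <= len(label) <= 3:
            raise ValueError(f"relation labels have at most 3 characters: {label!r}")
        graph[source][label] = target
    return dict(graph)


def to_unl(edges, attributes):
    """Serialise the edges as UNL relation expressions."""
    def node(uw):
        return uw + "".join("." + attr for attr in attributes.get(uw, []))
    return [f"{label}({node(s)}, {node(t)})" for label, s, t in edges]


graph = build_graph(edges)
for line in to_unl(edges, attributes):
    print(line)
```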

GOL / SRC

GOL (Goal: final state)
Definition: Gol defines the final state of an object or the thing finally associated with an object of an event

SRC (Source: initial state)
Definition: Src defines the initial state of an object or the thing initially associated with an object of an event

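GOL

I deposited my money in my bank account.

[Figure: UNL graph. deposit(icl>put), marked @entry and @past, is linked by agt to I, by obj to money(icl>currency) and by gol to account(icl>statement). Three mod links attach I to money, and I and bank(icl>possession) to account.]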

SRC

They make a small income from fishing.

[Figure: UNL graph. make(icl>do), marked @entry and @present, is linked by agt to they(icl>persons), by obj to income(icl>gain) and by src to fishing(icl>business). A mod link attaches small(aoj>thing) to income.]

PUR

PUR (Purpose or objective)
Definition: Pur defines the purpose or objectives of the agent of an event, or the purpose for which a thing exists

This budget is for food.
pur(food.@entry, budget)
mod(budget, this)

RSN

RSN (Reason)
Definition: Rsn defines a reason why an event or a state happens

They selected him for his honesty.
agt(select(icl>choose).@entry, they)
obj(select(icl>choose).@entry, he)
rsn(select(icl>choose).@entry, honesty)

TIM

TIM (Time)
Definition: Tim defines the time an event occurs or a state is true

I wake up at noon.
agt(wake up.@entry, I)
tim(wake up.@entry, noon(icl>time))

PLC

PLC (Place)
Definition: Plc defines the place an event occurs or a state is true or a thing exists

Temples are very famous in India.
aoj(famous.@entry, temple.@pl)
man(famous.@entry, very)
plc(famous.@entry, India)

INS

INS (Instrument)
Definition: Ins defines the instrument to carry out an event

I solved it with a computer.
agt(solve.@entry.@past, I)
ins(solve.@entry.@past, computer)
obj(solve.@entry.@past, it)

INS

John covered the baby with a blanket.

[Figure: UNL graph. cover(icl>do), marked @entry and @past, is linked by agt to John(iof>person), by obj to baby(icl>child) and by ins to blanket(icl>object); @def marks a definite node.]

Attributes
- Constitute syntax of UNL
- Play the role of bridging the conceptual world and the real world in the UNL expressions
- Show how and when the speaker views what is said and with what intention, feeling, and so on
- Seven types:
  - Time with respect to the speaker
  - Aspects
  - Speaker’s view of reference
  - Speaker’s emphasis, focus, topic, etc.
  - Convention
  - Speaker’s attitudes
  - Speaker’s feelings and viewpoints

Tense: @past

The past tense is normally expressed by @past

He went there yesterday

{unl}
agt(go.@entry.@past, he)
…
{/unl}

Aspects: @progress

It’s raining hard.

{unl}
man(rain.@entry.@present.@progress, hard)
{/unl}

Speaker’s view of reference
- @def (specific concept, already referred to)
  The house on the corner is for sale.
- @indef (non-specific class)
  There is a book on the desk.
- @not is always attached to the UW which is negated.
  He didn’t come.
  agt(come.@entry.@past.@not, he)

Speaker’s emphasis
- @emphasis
  John his name is.
  mod(name, he)
  aoj(John.@emphasis.@entry, name)
- @entry denotes the entry point or main UW of a UNL expression
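Since attributes are appended to a UW with ".@", they can be separated from the headword with a simple split. The sketch below is illustrative: `split_node` and `describe` are hypothetical names, and the lookup table only covers attributes that appear on these slides, grouped under the slide heading where each one is introduced.

```python
# Attribute -> type, following the slide on which each attribute appears.
ATTRIBUTE_TYPE = {
    "@past": "time with respect to the speaker",
    "@progress": "aspect",
    "@def": "speaker's view of reference",
    "@indef": "speaker's view of reference",
    "@not": "speaker's view of reference",
    "@emphasis": "speaker's emphasis, focus, topic",
    "@entry": "speaker's emphasis, focus, topic",
}


def split_node(node):
    """Split 'come.@entry.@past' into ('come', ['@entry', '@past'])."""
    head, *attrs = node.split(".@")
    return head, ["@" + attr for attr in attrs]


def describe(node):
    """Return the headword and a mapping from each attribute to its type."""
    head, attrs = split_node(node)
    types = {a: ATTRIBUTE_TYPE.get(a, "not classified on these slides") for a in attrs}
    return head, types


for node in ["come.@entry.@past.@not", "rain.@entry.@present.@progress",
             "select(icl>choose).@entry"]:
    print(describe(node))
```

Splitting on ".@" rather than on "." keeps restricted UWs such as select(icl>choose) and multi-word headwords such as "wake up" intact.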

How to generate UNL


Early Enco (1996-98)
- Analysis windows: two in number
  - Left Analysis Window (LAW)
  - Right Analysis Window (RAW)
- Condition windows: many in number
  - Left Condition Window (LCW)
  - Right Condition Window (RCW)

[Figure: a sentence Word1, Word2, Word3, Word4, ..., Wordn with the windows LCW, LAW, RAW and RCW placed over its words.]

UNL Rule for a Semantic Relation

;Create relation between V and N2, after resolving the preposition preceding N2

<{V,VOA,:::}{N,TIME,DAY,ONRES,PRERES::tim:}P25;

IF
  the left analysis window is on a verb (V) which is a verb of action (VOA)
AND
  the right analysis window is on a noun (N) that has the TIME and DAY attributes and for which the preceding preposition (on) has been processed and deleted
THEN
  set up the tim relation between V and N2 (indicated by < at the start of the rule).
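The IF/THEN reading of the rule can be mimicked with two adjacent windows sliding over a node list. This is only a toy sketch and not the EnConverter engine: `apply_tim_rule` is a hypothetical name, the example node list for "He arrived on Monday" is invented for illustration, and other effects of a real `<` rule (such as changes to the node list or the priority P25) are not modelled.

```python
# Conditions taken from the rule: attributes required under each analysis window.
LEFT_CONDITION = {"V", "VOA"}
RIGHT_CONDITION = {"N", "TIME", "DAY", "ONRES", "PRERES"}


def apply_tim_rule(nodes):
    """Slide the left/right analysis windows over adjacent nodes and emit
    tim(left, right) wherever both window conditions are satisfied.

    nodes is a list of (word, attribute set) pairs."""
    relations = []
    for law in range(len(nodes) - 1):
        raw = law + 1  # the right analysis window sits just after the left one
        left_word, left_attrs = nodes[law]
        right_word, right_attrs = nodes[raw]
        if LEFT_CONDITION <= left_attrs and RIGHT_CONDITION <= right_attrs:
            relations.append(("tim", left_word, right_word))
    return relations


# Hypothetical node list for "He arrived on Monday", after the preposition
# "on" has already been processed and deleted (recorded here as ONRES, PRERES).
nodes = [
    ("he", {"PRON", "SUB", "SING", "3RD"}),
    ("arrive", {"V", "VOA"}),
    ("Monday", {"N", "TIME", "DAY", "ONRES", "PRERES"}),
]
print(apply_tim_rule(nodes))
```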

UNL generation using NLP tools and resources


SRS based system

Multi parser based system

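Evaluation

$$\text{Recall} = \frac{\#\,\text{expressions matched in gold and generated UNL}}{\#\,\text{expressions expected in gold UNL}}$$

$$\text{Precision} = \frac{\#\,\text{expressions matched in gold and generated UNL}}{\#\,\text{expressions in generated UNL}}$$

$$F_1\ \text{score} = \frac{2 \cdot \text{recall} \cdot \text{precision}}{\text{recall} + \text{precision}}$$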
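The three measures can be computed directly from two sets of UNL expressions. The sketch below assumes that an expression counts as matched only when the generated string is identical to a gold string; the lecture does not say how matching was done, and `evaluate` and the sample generated output are hypothetical.

```python
def evaluate(gold, generated):
    """Return (precision, recall, f1) for generated UNL expressions against gold,
    counting an expression as matched only on exact string equality."""
    gold_set, generated_set = set(gold), set(generated)
    matched = len(gold_set & generated_set)
    recall = matched / len(gold_set) if gold_set else 0.0
    precision = matched / len(generated_set) if generated_set else 0.0
    total = recall + precision
    f1 = 2 * recall * precision / total if total else 0.0
    return precision, recall, f1


gold = [
    "agt(claim.@entry.@past, Deepa)",
    "obj(claim.@entry.@past, :01)",
    "agt:01(compose.@past.@entry.@complete, she)",
    "obj:01(compose.@past.@entry.@complete, poem.@indef)",
]
# Hypothetical system output: three expressions agree with gold, one does not.
generated = [
    "agt(claim.@entry.@past, Deepa)",
    "obj(claim.@entry.@past, :01)",
    "agt:01(compose.@past.@entry.@complete, she)",
    "obj:01(compose.@past.@entry.@complete, poem.@def)",
]
precision, recall, f1 = evaluate(gold, generated)
print(precision, recall, f1)
```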

Comparison between the two systems

Table Name                 Accuracy of XLE Parser    Accuracy of Multi-parser
                           Based System              Based System
evalTb_OXF_V_TO_INF        0.8376                    0.8591
evalTb_OXF_VN_TO_INF       0.8369                    0.8429
evalTb_OXF_S_TO_DO_VERB    0.7833                    0.7833
evalTb_XTAG                0.7181                    0.7835
evalTb_FRAMENET            0.6618                    0.7591
evalTb_RADFORD             0.8141                    0.8542
evalTb_V                   0.5920                    0.7587
evalTb_VN                  0.7528                    0.7625
evalTb_VNN                 0.7692                    0.7902
evalTb_VING                0.7084                    0.7084
evalTb_VADJ                0.5486                    0.6214
evalTb_VINF                0.7236                    0.7772
evalTb_VTHAT               0.7988                    0.7999
evalTb_TOI_Education       0.3875                    0.3669
evalTb_test                0.4667                    0.4667
evalTb_demo                1.0000                    1.0000
evalTb_Test2               0.3913                    0.5116
evalTb_t3                  0.7155                    0.8553
evalTb_Barcelona           0.3194                    0.3181
Total                      0.6489                    0.7010

Language Processing & Understanding
- Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization
- Machine Learning: Semantic Role labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks
- IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback
- Machine Translation: Statistical, Interlingua Based, English to Indian languages, Indian languages to Indian languages, Indowordnet

Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb

Linguistics is the eye and computation the body

Use of UNL in multiple NLP tasks

Summing up
- Some NLP milestones covered
  - WSD: various approaches
  - SMT
  - Parsing (classical and probabilistic)
  - Phonology, Phonetics, Syllabification, Transliteration
  - Semantics, UNL
- Assignments: to reinforce understanding of lectures
- Important topics left out: IR, Similarity measures
- Seminars: wide range of topics for breadth and exposure
- Lectures: Foundation and depth

God Bless!!