Post on 30-Dec-2015

CS460/626 : Natural Language Processing/Speech, NLP and the Web

(Lecture 37– Semantics; Universal Networking Language)

Pushpak Bhattacharyya, CSE Dept., IIT Bombay

12th April, 2011

Semantics: wikipedia

• Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning.

• It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.

Computational Semantics: wikipedia

• Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.

• Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution.

• Methods employed usually draw from formal semantics or statistical semantics.

• Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving).

• Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.

A hurdle: signifier-denotata dichotomy

Divide between a word and what it stands for

• “red” is NOT red in colour

• “red wine”, “red rose”, “he is in the red” denote very different senses of the word

Translation into another language reveals this difference

A Perspective

Morphology

Lexicon

Syntax

Semantics

Pragmatics

Discourse

Our tryst with semantics:

Universal Networking Language (UNL)

Motivation

• Extraction of semantics, i.e., deep meaning, is important for many applications: Machine Translation, Meaning-based IR, Cross Lingual IR (CLIR)

• Robust, scalable & efficient methods of knowledge extraction are required

• Machine Translation and Cross Lingual IR: a need of the hour for crossing the language barrier

Interlingua: a vehicle for machine translation

[Figure: Interlingua (UNL) at the centre; English, French, Hindi and Chinese connect to it through analysis and generation]

UNL: a United Nations project

• Started in 1996
• 10 year program
• 15 research groups across continents
• First goal: generators
• Next goal: analysers (needs solving various ambiguity problems)
• Current active language groups:
  • UNL_French (GETA-CLIPS, IMAG)
  • UNL_English+Hindi
  • UNL_Italian (Univ. of Pisa)
  • UNL_Portuguese (Univ. of São Paulo, Brazil)
  • UNL_Russian (Institute of Linguistics, Moscow)
  • UNL_Spanish (UPM, Madrid)

World-wide Universal Networking Language (UNL) Project

[Figure: UNL, a language independent meaning representation, at the hub, linked to English, Russian, Japanese, Hindi, Spanish, Marathi and other languages]

The UNL MT System: an Overview

NLP@IITB

Foundations and Applications

UNL Foundations
• Semantic Relations
• Universal Words
• Attributes
• How to write UNL expressions

UNL Applications
• Machine Translation: Rule based and Statistical
• Search
• Text Entailment
• Sentiment Analysis

Language Processing & Understanding

• Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization

• Machine Learning: Semantic Role Labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks

• IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback

• Machine Translation: Statistical, Interlingua Based, English-Indian languages, Indian languages-Indian languages, Indowordnet

Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb

Linguistics is the eye and computation the body

UNL represents knowledge: John eats rice with a spoon

[Figure: UNL graph of the sentence, with callouts for semantic relations, attributes and universal words]

Repository of 42 semantic relations and 84 attribute labels

Sentence embeddings

Deepa claimed that she had composed a poem.

[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
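The scope notation in the expression above (":01" as an argument, "agt:01" and "obj:01" as scoped relations) can be read mechanically. The following is a minimal Python sketch, not part of the lecture: the names Relation, parse_relations and group_by_scope are hypothetical, and the regular expression assumes one relation per line with no commas inside a universal word.

```python
import re
from collections import defaultdict
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    label: str            # relation name, e.g. "agt"
    scope: Optional[str]  # scope id such as "01"; None for the main sentence
    head: str             # first argument
    tail: str             # second argument (may be a scope reference like ":01")


# label, optional ":NN" scope id, then two comma-separated arguments
RELATION_RE = re.compile(
    r"^\s*([a-z]{2,3})(?::(\d+))?\(\s*(.+?)\s*,\s*(.+?)\s*\)\s*$"
)


def parse_relations(text):
    """Parse one relation per line; lines that do not match are skipped."""
    relations = []
    for line in text.splitlines():
        match = RELATION_RE.match(line)
        if match:
            relations.append(Relation(*match.groups()))
    return relations


def group_by_scope(relations):
    """Collect relations under their scope id (None = main sentence)."""
    scopes = defaultdict(list)
    for relation in relations:
        scopes[relation.scope].append(relation)
    return dict(scopes)


EXAMPLE = """
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
"""

relations = parse_relations(EXAMPLE)
scopes = group_by_scope(relations)
for scope_id, members in scopes.items():
    print(scope_id, [m.label for m in members])
```

The main sentence holds the two relations on claim, and scope "01" holds the embedded clause about composing the poem, which the main clause refers to through the argument ":01".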

Constituents of Universal Networking Language

• Universal Words (UWs)
• Relations
• Attributes
• Knowledge Base

UNL Graph

[Figure: UNL graph with entry node forward(icl>send) (@entry, @past) and edges agt, obj and gol to the nodes he(icl>person), mail(icl>collection) (@def) and minister(icl>person) (@def)]

He forwarded the mail to the minister.

UNL Expression

agt(forward(icl>send).@entry.@past, he(icl>person))

obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)

gol(forward(icl>send).@entry.@past, minister(icl>person).@def)
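A UNL expression is a list of labelled edges, so it maps directly onto a graph data structure. The sketch below is illustrative only: it reads the sentence "He forwarded the mail to the minister." with he as agent, the mail as object and the minister as goal, and the helper names build_graph and entry_nodes are hypothetical.

```python
from collections import defaultdict

# (relation, head node, tail node) for "He forwarded the mail to the minister."
TRIPLES = [
    ("agt", "forward(icl>send).@entry.@past", "he(icl>person)"),
    ("obj", "forward(icl>send).@entry.@past", "mail(icl>collection).@def"),
    ("gol", "forward(icl>send).@entry.@past", "minister(icl>person).@def"),
]


def build_graph(triples):
    """Adjacency list: head node -> list of (relation label, tail node)."""
    graph = defaultdict(list)
    for label, head, tail in triples:
        graph[head].append((label, tail))
    return dict(graph)


def entry_nodes(triples):
    """Nodes carrying the @entry attribute, in order of first appearance."""
    found = []
    for _, head, tail in triples:
        for node in (head, tail):
            if "@entry" in node.split(".") and node not in found:
                found.append(node)
    return found


graph = build_graph(TRIPLES)
entries = entry_nodes(TRIPLES)
for label, tail in graph[entries[0]]:
    print(f"{entries[0]} --{label}--> {tail}")
```

The single node marked @entry is the verb, and every other node in this sentence hangs off it by exactly one labelled edge.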

What is a Universal Word (UW)?

• Words of UNL
• Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions
• A UW represents a concept
  • Basic UW (an English word/compound word/phrase with no restrictions or Constraint List)
  • Restricted UW (with a Constraint List)
• Examples: “crane(icl>device)”, “crane(icl>bird)”
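The distinction between a basic UW and a restricted UW can be made concrete by splitting a UW string into its headword, its constraint list and any attached attributes. This is a rough sketch under stated assumptions: UniversalWord and parse_uw are hypothetical names, and the pattern assumes the headword contains no ".", "@" or parentheses.

```python
import re
from typing import NamedTuple


class UniversalWord(NamedTuple):
    headword: str       # the English word or phrase, e.g. "crane"
    constraints: tuple  # constraint list, e.g. ("icl>device",); empty for a basic UW
    attributes: tuple   # attached attributes, e.g. ("@entry", "@past")


# headword, optional "(constraint list)", then zero or more ".@attribute" suffixes
UW_RE = re.compile(r"^([^().@]+)(?:\(([^()]*)\))?((?:\.@\w+)*)$")


def parse_uw(text):
    match = UW_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognisable UW: {text!r}")
    headword, constraint_list, attribute_text = match.groups()
    constraints = (
        tuple(c.strip() for c in constraint_list.split(","))
        if constraint_list
        else ()
    )
    attributes = tuple(a for a in attribute_text.split(".") if a)
    return UniversalWord(headword.strip(), constraints, attributes)


device = parse_uw("crane(icl>device)")
bird = parse_uw("crane(icl>bird)")
verb = parse_uw("forward(icl>send).@entry.@past")
basic = parse_uw("he")

# Same headword, different concepts: only the constraint list tells them apart.
print(device.headword == bird.headword, device.constraints, bird.constraints)
```

A basic UW comes back with an empty constraint tuple, while the two restricted UWs for "crane" share a headword and differ only in their constraints.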

The Lexicon

Format of the dictionary entry:

[headword] {} “Universal word” (Attribute list);

e.g., [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN);

The entry has three parts: head word, universal word, attributes.

Attributes:
• Morphological: Pl (plural), V_ed (past tense form)
• Syntactic: V (verb), VOA (verb of action)
• Semantic: ANIMT (animate), PLACE, TIME

The Lexicon (cntd)

Content words:

[forward] {} “forward(icl>send)” (V,VOA) <E,0,0>;

[mail] {} “mail(icl>message)” (N,PHSCL,INANI) <E,0,0>;

[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;

Headword Universal Word Attributes

He forwarded the mail to the minister.

The Lexicon (cntd)

Function words:

[he] {} “he” (PRON,SUB,SING,3RD) <E,0,0>;

[the] {} “the” (ART,THE) <E,0,0>;

[to] {} “to” (PRE,#TO) <E,0,0>;

Headword Universal Word Attributes

He forwarded the mail to the minister.
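The dictionary entry format shown on these slides is regular enough to parse with a single pattern. The following sketch is an illustration rather than the project's actual tooling: LexiconEntry and parse_entry are hypothetical names, the trailing angle-bracket field (for example <E,0,0>) is kept as raw strings because the slides do not explain it, and curly quotes are normalised to straight quotes before matching.

```python
import re
from typing import NamedTuple


class LexiconEntry(NamedTuple):
    headword: str
    universal_word: str
    attributes: tuple
    flags: tuple  # raw contents of the optional <...> field, left uninterpreted


ENTRY_RE = re.compile(
    r'^\[(?P<headword>[^\]]+)\]\s*'    # [headword]
    r'\{[^}]*\}\s*'                    # {} (contents ignored)
    r'"(?P<uw>[^"]*)"\s*'              # "Universal word"
    r'\((?P<attrs>[^)]*)\)\s*'         # (Attribute list)
    r'(?:<(?P<flags>[^>]*)>\s*)?;$'    # optional <...> field, then ";"
)

# Map typographic quotes to plain double quotes before matching.
QUOTES = str.maketrans({"\u201c": '"', "\u201d": '"'})


def split_list(text):
    return tuple(item.strip() for item in text.split(",") if item.strip())


def parse_entry(line):
    match = ENTRY_RE.match(line.strip().translate(QUOTES))
    if match is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    return LexiconEntry(
        headword=match["headword"].strip(),
        universal_word=match["uw"].strip(),
        attributes=split_list(match["attrs"]),
        flags=split_list(match["flags"] or ""),
    )


content = parse_entry('[forward] {} "forward(icl>send)" (V,VOA) <E,0,0>;')
function = parse_entry("[he] {} \u201che\u201d (PRON,SUB,SING,3RD) <E,0,0>;")
short = parse_entry('[minister] {} "minister(icl>person)" (N,ANIMT,PHSCL,PRSN);')
print(content)
```

Content words and function words go through the same parser; they differ only in whether the universal word carries a constraint list and in which attributes appear.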

Hindi example: noun example 1/2

Universal word: farmer(icl>creator)

Headwords: farmer (E, English), शेतकरी (M, Marathi), किसान (H, Hindi)

Attribute lists shown:
• N,M,ANIMT,FAUNA,MML,PRSN,Na
• N,ANIMT,FAUNA,MML,PRSN
• N,M,ANIMT,FAUNA,MML,PRSN

The Features of a UW

Every concept existing in any language must correspond to a UW

The constraint list should be as small as necessary to disambiguate the headword

Every UW should be defined in the UNL Knowledge-Base

Restricted UWs

Examples:
• He will hold office until the spring of next year.
• The spring was broken.

Restricted UWs, which are headwords with a constraint list, for example:
• “spring(icl>season)”
• “spring(icl>device)”
• “spring(icl>jump)”
• “spring(icl>fountain)”

How to create UWs?

• Pick up a concept, e.g. the concept of “crane”
  • as “a device for lifting heavy loads”, or
  • as “a long-legged bird that wades in water in search of food”

• Choose an English word for the concept. In the case of “crane”, since it is a word of English, the corresponding word should be “crane”.

• Choose a constraint list for the word:
  • “crane(icl>device)”
  • “crane(icl>bird)”

How to create UNL expressions

English sentences: basic structure

• A <verb> B: John eats bread
  agt(eat.@entry, John)
  obj(eat.@entry, bread)

• A <verb>: John sleeps
  aoj(sleep.@entry, John)

• A <be> B: John is good
  aoj(good.@entry, John)

[Figure: graph templates for the three patterns, with nodes labelled verb, A and B and edges labelled R1, R2 and aoj]

Hindi sentences: basic structure

• A B <verb>: John roti khaataa hai
  agt(eat.@entry, John)
  obj(eat.@entry, bread)

• A <verb>: John sotaa hai
  aoj(sleep.@entry, John)

• A <be> B: John acchaa hai
  aoj(good.@entry, John)

[Figure: graph templates for the three patterns, with nodes labelled verb, A and B and edges labelled R1, R2 and aoj]
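The English and Hindi slides give the same UNL for sentences with different word order, which is the point of an interlingua. The sketch below makes that concrete for the three basic patterns. It is a toy illustration: the function names and the small HINDI_TO_UW gloss table are hypothetical, and the table contains only the correspondences implied by the example sentences.

```python
def transitive(a, verb, b):
    """A <verb> B: the verb is the entry node, A its agent, B its object."""
    return [f"agt({verb}.@entry, {a})", f"obj({verb}.@entry, {b})"]


def intransitive(a, verb):
    """A <verb>: follows the slides in using aoj for 'John sleeps'."""
    return [f"aoj({verb}.@entry, {a})"]


def copular(a, b):
    """A <be> B: the predicate B is the entry node, A is what it describes."""
    return [f"aoj({b}.@entry, {a})"]


# Hypothetical gloss table covering only the words in the example sentences.
HINDI_TO_UW = {
    "roti": "bread",
    "khaataa hai": "eat",
    "sotaa hai": "sleep",
    "acchaa": "good",
}

# English: A <verb> B  ->  "John eats bread"
english = transitive("John", "eat", "bread")

# Hindi: A B <verb>  ->  "John roti khaataa hai"
a, b, verb = "John", "roti", "khaataa hai"
hindi = transitive(a, HINDI_TO_UW[verb], HINDI_TO_UW[b])

print(english)
print(english == hindi)
```

The surface order differs (the object precedes the verb in Hindi), but once the words are mapped to universal words the two sentences yield identical relation lists.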

Complex English sentences: Use recursion on the basic structure

A <verb> B: John who is a good boy eats bread which is toasted

agt(eat.@entry, :01)
obj(eat.@entry, :02)
aoj:01(boy, John.@entry)
mod:01(boy, good)
obj:02(toast, bread.@entry.@focus)

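[Figure: UNL graph. eat is linked by agt to scope :01 and by obj to scope :02; scope :01 contains aoj between boy and John and mod between boy and good; scope :02 contains obj between toast and bread]

Red arrows indicate entry nodes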
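The recursion on the basic structure can be sketched as a function that expands each scope reference into the relations of that scope. This is a minimal illustration, assuming the clause about John is scope :01 and the clause about the toasted bread is scope :02, in line with the arguments of agt and obj on eat; the name expand and the tuple layout are hypothetical.

```python
from collections import defaultdict

# (relation, scope id or None for the main sentence, head, tail)
RELATIONS = [
    ("agt", None, "eat.@entry", ":01"),
    ("obj", None, "eat.@entry", ":02"),
    ("aoj", "01", "boy", "John.@entry"),
    ("mod", "01", "boy", "good"),
    ("obj", "02", "toast", "bread.@entry.@focus"),
]

by_scope = defaultdict(list)
for relation in RELATIONS:
    by_scope[relation[1]].append(relation)


def expand(scope, by_scope):
    """Render the relations of one scope, recursing into scope references."""
    lines = []
    for label, _, head, tail in by_scope.get(scope, []):
        if tail.startswith(":"):
            # The argument names another scope: expand it in place.
            lines.append(f"{label}({head}, [")
            lines.extend("  " + inner for inner in expand(tail[1:], by_scope))
            lines.append("])")
        else:
            lines.append(f"{label}({head}, {tail})")
    return lines


nested = expand(None, by_scope)
print("\n".join(nested))
```

Each bracketed group in the output is itself a basic structure with its own entry node, which is how the A <verb> B template is reused for the subject and the object of eat.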