CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 37– Semantics; Universal Networking Language) Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 12th April, 2011
CS460/626 : Natural Language Processing/Speech, NLP and the Web

(Lecture 37– Semantics; Universal Networking Language)

Pushpak Bhattacharyya
CSE Dept., IIT Bombay

12th April, 2011

Semantics: wikipedia

• Semantics (from Greek sēmantiká, neuter plural of sēmantikós) is the study of meaning.

• It typically focuses on the relation between signifiers, such as words, phrases, signs and symbols, and what they stand for, their denotata.

Computational Semantics: wikipedia

• Computational semantics is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.

• Some traditional topics of interest are: construction of meaning representations, semantic underspecification, anaphora resolution, presupposition projection, and quantifier scope resolution.

• Methods employed usually draw from formal semantics or statistical semantics.

• Computational semantics has points of contact with the areas of lexical semantics (word sense disambiguation and semantic role labeling), discourse semantics, knowledge representation and automated reasoning (in particular, automated theorem proving).

• Since 1999 there has been an ACL special interest group on computational semantics, SIGSEM.

A hurdle: signifier-denotata dichotomy

• Divide between a word and what it stands for
• The word “red” is NOT itself red in colour
• “red wine”, “red rose” and “he is in the red” denote very different senses of the word
• Translation into another language reveals this difference

A Perspective

Morphology

Lexicon

Syntax

Semantics

Pragmatics

Discourse

Our tryst with semantics:

Universal Networking Language (UNL)

Motivation

• Extraction of semantics, i.e., deep meaning, is important for many applications: Machine Translation, meaning-based IR, Cross Lingual IR (CLIR)
• Robust, scalable and efficient methods of knowledge extraction are required
• Machine Translation and Cross Lingual IR: a need of the hour for crossing the language barrier

Interlingua: a vehicle for machine translation

[Figure: Interlingua (UNL) at the centre, linked to English, French, Hindi and Chinese; the links are labelled Analysis and Generation.]

UNL: a United Nations project

• Started in 1996
• 10 year program
• 15 research groups across continents
• First goal: generators
• Next goal: analysers (needs solving various ambiguity problems)
• Current active language groups:
  - UNL_French (GETA-CLIPS, IMAG)
  - UNL_English+Hindi
  - UNL_Italian (Univ. of Pisa)
  - UNL_Portuguese (Univ. of Sao Paulo, Brazil)
  - UNL_Russian (Institute of Linguistics, Moscow)
  - UNL_Spanish (UPM, Madrid)

World-wide Universal Networking Language (UNL) Project

[Figure: UNL at the centre, linked to English, Russian, Japanese, Hindi, Spanish, Marathi and others.]

Language independent meaning representation.

The UNL MT System: an Overview

NLP@IITB

Foundations and Applications

UNL Foundations
• Semantic Relations
• Universal Words
• Attributes
• How to write UNL expressions

UNL Applications
• Machine Translation: rule based and statistical
• Search
• Text Entailment
• Sentiment Analysis

Language Processing & Understanding

• Information Extraction: Part of Speech tagging, Named Entity Recognition, Shallow Parsing, Summarization
• Machine Learning: Semantic Role Labeling, Sentiment Analysis, Text Entailment (web 2.0 applications), using graphical models, support vector machines, neural networks
• IR: Cross Lingual Search, Crawling, Indexing, Multilingual Relevance Feedback
• Machine Translation: Statistical, Interlingua Based, English-Indian languages, Indian languages-Indian languages, Indowordnet

Resources: http://www.cfilt.iitb.ac.in
Publications: http://www.cse.iitb.ac.in/~pb

Linguistics is the eye and computation the body

UNL represents knowledge: John eats rice with a spoon

[Figure: UNL graph of the sentence, with callouts marking the semantic relations, the attributes and the universal words.]

Repository of 42 semantic relations and 84 attribute labels

Sentence embeddings

Deepa claimed that she had composed a poem.

[UNL]
agt(claim.@entry.@past, Deepa)
obj(claim.@entry.@past, :01)
agt:01(compose.@past.@entry.@complete, she)
obj:01(compose.@past.@entry.@complete, poem.@indef)
[\UNL]
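
The relation lines of such an expression follow a regular shape: a relation label, an optional scope id such as :01, and two arguments that may carry @-attributes. A minimal Python sketch for reading them is given below. The names Relation, parse_relation and split_attributes are hypothetical helpers introduced here, not part of any UNL toolkit, and the pattern splits the arguments at the first comma, so it assumes that the first argument contains no comma of its own.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Relation(NamedTuple):
    label: str             # relation label, e.g. "agt"
    scope: Optional[str]   # scope id such as "01"; None at the top level
    head: str              # first argument, attributes included
    tail: str              # second argument, attributes included


# label, optional ":NN" scope id, then two comma-separated arguments.
# Assumption: the first argument contains no comma.
_REL = re.compile(r"^\s*([a-z]+)(?::(\d+))?\s*\(\s*(.+?)\s*,\s*(.+?)\s*\)\s*$")


def parse_relation(line: str) -> Relation:
    """Parse one line such as 'agt:01(compose.@past.@entry, she)'."""
    m = _REL.match(line)
    if m is None:
        raise ValueError(f"not a UNL relation: {line!r}")
    label, scope, head, tail = m.groups()
    return Relation(label, scope, head, tail)


def split_attributes(arg: str) -> Tuple[str, List[str]]:
    """Split 'compose.@past.@entry' into ('compose', ['@past', '@entry'])."""
    word, *attrs = arg.split(".@")
    return word, ["@" + a for a in attrs]


if __name__ == "__main__":
    for line in [
        "agt(claim.@entry.@past, Deepa)",
        "obj(claim.@entry.@past, :01)",
        "agt:01(compose.@past.@entry.@complete, she)",
        "obj:01(compose.@past.@entry.@complete, poem.@indef)",
    ]:
        rel = parse_relation(line)
        print(rel.label, rel.scope, split_attributes(rel.head), split_attributes(rel.tail))
```

Keeping the scope id as a separate field makes it easy to group the relations of the embedded clause (scope 01) apart from those of the main clause.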

Constituents of Universal Networking Language

• Universal Words (UWs)
• Relations
• Attributes
• Knowledge Base

UNL Graph

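[Figure: UNL graph for the sentence below. The node forward(icl>send), marked @entry @past, is linked by edges labelled agt, obj and gol to he(icl>person), mail(icl>collection) and minister(icl>person); two of the nodes carry @def.]

He forwarded the mail to the minister.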

UNL Expression

agt(forward(icl>send).@entry.@past, he(icl>person))
obj(forward(icl>send).@entry.@past, mail(icl>collection).@def)
gol(forward(icl>send).@entry.@past, minister(icl>person).@def)

What is a Universal Word (UW)?

• Words of UNL
• Constitute the UNL vocabulary, the syntactic-semantic units to form UNL expressions
• A UW represents a concept
  - Basic UW (an English word/compound word/phrase with no restrictions or Constraint List)
  - Restricted UW (with a Constraint List)
• Examples: “crane(icl>device)”, “crane(icl>bird)”

The Lexicon

Format of the dictionary entry:

[headword] {} “Universal word” (Attribute list);

e.g., [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN);

Here “minister” is the head word, “minister(icl>person)” the universal word, and N,ANIMT,PHSCL,PRSN the attributes.

Attributes:
• Morphological - Pl (plural), V_ed (past tense form)
• Syntactic - V (verb), VOA (verb of action)
• Semantic - ANIMT (animate), PLACE, TIME
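
The entry format above is regular enough to be read with a single pattern. The following Python sketch shows one way to do it. LexEntry and parse_entry are hypothetical names chosen for this illustration; the pattern accepts straight or curly quotes around the universal word and ignores anything after the attribute list.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class LexEntry:
    headword: str
    universal_word: str
    attributes: List[str]


_ENTRY = re.compile(
    r'\[(?P<head>[^\]]+)\]\s*'       # [headword]
    r'\{[^}]*\}\s*'                  # {} (empty in the entries shown on the slides)
    r'["“](?P<uw>[^"”]+)["”]\s*'     # "universal word", straight or curly quotes
    r'\((?P<attrs>[^)]*)\)'          # (attribute list)
)


def parse_entry(line: str) -> LexEntry:
    """Parse one dictionary line into headword, universal word and attributes."""
    m = _ENTRY.search(line)
    if m is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    attrs = [a.strip() for a in m.group("attrs").split(",") if a.strip()]
    return LexEntry(m.group("head").strip(), m.group("uw").strip(), attrs)


if __name__ == "__main__":
    entry = parse_entry('[minister] {} "minister(icl>person)" (N,ANIMT,PHSCL,PRSN);')
    print(entry.headword, entry.universal_word, entry.attributes)
```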

The Lexicon (contd.)

Content words (headword, universal word, attributes):

[forward] {} “forward(icl>send)” (V,VOA) <E,0,0>;
[mail] {} “mail(icl>message)” (N,PHSCL,INANI) <E,0,0>;
[minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) <E,0,0>;

He forwarded the mail to the minister.

The Lexicon (contd.)

Function words (headword, universal word, attributes):

[he] {} “he” (PRON,SUB,SING,3RD) <E,0,0>;
[the] {} “the” (ART,THE) <E,0,0>;
[to] {} “to” (PRE,#TO) <E,0,0>;

He forwarded the mail to the minister.

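Hindi example: example of a noun (संज्ञा का उदाहरण) 1/2

Universal word (सार्वभौम शब्द): farmer(icl>creator)

Head word (मुख्य शब्द) and attributes (गुण) per language:
• E (English): farmer (N,ANIMT,FAUNA,MML,PRSN)
• M (Marathi): शेतकरी (N,M,ANIMT,FAUNA,MML,PRSN)
• H (Hindi): किसान (N,M,ANIMT,FAUNA,MML,PRSN,Na)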

The Features of a UW

Every concept existing in any language must correspond to a UW

The constraint list should be as small as necessary to disambiguate the headword

Every UW should be defined in the UNL Knowledge-Base

Restricted UWs

Examples:
• He will hold office until the spring of next year.
• The spring was broken.

Restricted UWs, which are headwords with a constraint list, for example:
• “spring(icl>season)”
• “spring(icl>device)”
• “spring(icl>jump)”
• “spring(icl>fountain)”

How to create UWs?

• Pick up a concept, e.g. the concept of “crane”
  - as “a device for lifting heavy loads”, or
  - as “a long-legged bird that wades in water in search of food”
• Choose an English word for the concept. In the case of “crane”, since it is a word of English, the corresponding word should be ‘crane’
• Choose a constraint list for the word:
  - ‘crane(icl>device)’
  - ‘crane(icl>bird)’
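
The three steps above amount to pairing a headword with a constraint list. A small Python sketch of that pairing follows. make_uw and split_uw are hypothetical helper names, and joining several constraints with commas is an assumption of this sketch; the slides only show UWs with a single icl constraint.

```python
from typing import List, Sequence, Tuple


def make_uw(headword: str, constraints: Sequence[str] = ()) -> str:
    """Return a basic UW (no constraints) or a restricted UW (with constraints)."""
    if not constraints:
        return headword
    # Assumption: multiple constraints are separated by commas.
    return f"{headword}({','.join(constraints)})"


def split_uw(uw: str) -> Tuple[str, List[str]]:
    """Split 'crane(icl>bird)' into ('crane', ['icl>bird'])."""
    head, sep, rest = uw.partition("(")
    if not sep:
        return head, []          # basic UW: no constraint list
    if not rest.endswith(")"):
        raise ValueError(f"unbalanced constraint list: {uw!r}")
    return head, rest[:-1].split(",")


if __name__ == "__main__":
    # One headword, two concepts, two restricted UWs.
    for concept in ("device", "bird"):
        uw = make_uw("crane", [f"icl>{concept}"])
        print(uw, split_uw(uw))
```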

How to create UNL expressions

English sentences: basic structure

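• A <verb> B: John eats bread
  agt(eat.@entry, John)
  obj(eat.@entry, bread)
• A <verb>: John sleeps
  aoj(sleep.@entry, John)
• A <be> B: John is good
  aoj(good.@entry, John)

[Figure: graph templates for these patterns: a verb node linked to A and B by relations R1 and R2, and a node linked to A by aoj.]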
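
The three patterns on this slide can be written down as three small templates. The Python sketch below does only that: it assumes the sentence has already been analysed into its parts and that the verb is supplied in its base form (eat, not eats). The function names transitive, intransitive and copular are hypothetical labels for the three patterns.

```python
from typing import List


def transitive(a: str, verb: str, b: str) -> List[str]:
    """A <verb> B: the verb is the entry node, A its agent, B its object."""
    return [f"agt({verb}.@entry, {a})", f"obj({verb}.@entry, {b})"]


def intransitive(a: str, verb: str) -> List[str]:
    """A <verb>: the slide attaches A to the verb with aoj."""
    return [f"aoj({verb}.@entry, {a})"]


def copular(a: str, b: str) -> List[str]:
    """A <be> B: B becomes the entry node and A is attached with aoj."""
    return [f"aoj({b}.@entry, {a})"]


if __name__ == "__main__":
    for lines in (
        transitive("John", "eat", "bread"),
        intransitive("John", "sleep"),
        copular("John", "good"),
    ):
        print("\n".join(lines))
```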

Hindi sentences: basic structure

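• A B <verb>: John roti khaataa hai
  agt(eat.@entry, John)
  obj(eat.@entry, bread)
• A <verb>: John sotaa hai
  aoj(sleep.@entry, John)
• A <be> B: John acchaa hai
  aoj(good.@entry, John)

[Figure: graph templates for these patterns: a verb node linked to A and B by relations R1 and R2, and a node linked to A by aoj.]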
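
The Hindi sentences differ from the English ones in word order but lead to the same UNL expressions. A toy Python sketch of that idea follows. It is not the analyser used in the UNL project: LEXICON is a hypothetical four-word mapping from romanised Hindi to UW headwords, the auxiliary "hai" is simply dropped, and the pattern is chosen only by counting the remaining words.

```python
from typing import List

# Hypothetical toy lexicon: romanised Hindi word -> UW headword.
LEXICON = {"roti": "bread", "khaataa": "eat", "sotaa": "sleep", "acchaa": "good"}


def analyse_hindi(sentence: str) -> List[str]:
    """Map a simple verb-final Hindi sentence to UNL relation lines."""
    # Drop the auxiliary/copula "hai"; it gets no UW of its own in this sketch.
    words = [w for w in sentence.split() if w != "hai"]
    # Words missing from the lexicon (e.g. proper nouns) pass through unchanged.
    uws = [LEXICON.get(w, w) for w in words]
    if len(uws) == 3:                  # A B <verb>
        a, b, verb = uws
        return [f"agt({verb}.@entry, {a})", f"obj({verb}.@entry, {b})"]
    if len(uws) == 2:                  # A <verb>  or  A <be> B
        a, predicate = uws
        return [f"aoj({predicate}.@entry, {a})"]
    raise ValueError("pattern not covered by this sketch")


if __name__ == "__main__":
    for s in ("John roti khaataa hai", "John sotaa hai", "John acchaa hai"):
        print(s, "->", analyse_hindi(s))
```

The output lines are identical to those listed for the English sentences "John eats bread", "John sleeps" and "John is good", which is the point of using an interlingua.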

Complex English sentences: use recursion on the basic structure

A <verb> B: John who is a good boy eats bread which is toasted

agt(eat.@entry, :01)
obj(eat.@entry, :02)
aoj:01(boy, John.@entry)
mod:01(boy, good)
obj:02(toast, bread.@entry.@focus)

[Figure: UNL graph in which eat is linked by agt to scope :01 (boy, John and good, with relations aoj and mod) and by obj to scope :02 (toast and Bread, with relation obj).]

Red arrows indicate entry nodes
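
The recursion can be pictured as a top-level graph whose arguments :01 and :02 stand for sub-graphs, each with its own entry node. A minimal Python sketch of that layout is given below. The Triple alias and the render function are hypothetical names for this illustration, and the toast relation is placed in scope 02 because that is the scope the obj argument of eat refers to.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]   # (relation, first argument, second argument)


def render(top: List[Triple], scopes: Dict[str, List[Triple]]) -> List[str]:
    """Write the top-level relations first, then each scope with its ':NN' tag."""
    lines = [f"{rel}({head}, {tail})" for rel, head, tail in top]
    for scope_id in sorted(scopes):
        lines += [f"{rel}:{scope_id}({head}, {tail})" for rel, head, tail in scopes[scope_id]]
    return lines


# Main clause: A <verb> B, where A and B are themselves scopes.
TOP = [("agt", "eat.@entry", ":01"), ("obj", "eat.@entry", ":02")]

# Each scope is again a small graph with its own entry node.
SCOPES = {
    "01": [("aoj", "boy", "John.@entry"), ("mod", "boy", "good")],
    "02": [("obj", "toast", "bread.@entry.@focus")],
}

if __name__ == "__main__":
    print("\n".join(render(TOP, SCOPES)))
```

Because a scope is itself a list of relations, the same render step would apply unchanged if a scope referred to further scopes.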

