
LING 2000 - 2006 NLP1

Introduction to Computational Linguistics

Martha Palmer
April 19, 2006


Natural Language Processing

• Machine Translation
• Predicate argument structures
• Syntactic parses
• Producing semantic representations
• Ambiguities in sentence interpretation


Machine Translation

• One of the first applications for computers
  – bilingual dictionary > word-word translation
• Good translation requires understanding!
  – War and Peace, The Sound and The Fury?
• What can we do? Sublanguages.
  – technical domains, static vocabulary
  – Meteo in Canada, Caterpillar Tractor Manuals, Botanical descriptions, Military Messages


Example translation


Translation Issues: Korean to English

- Word order
- Dropped arguments
- Lexical ambiguities
- Structure vs morphology


Common Thread

• Predicate-argument structure
  – Basic constituents of the sentence and how they are related to each other
• Constituents
  – John, Mary, the dog, pleasure, the store.
• Relations
  – Loves, feeds, go, to, bring


Abstracting away from surface structure


Transfer lexicons


Machine Translation Lexical Choice - Word Sense Disambiguation

Iraq lost the battle.
Ilakuka centwey ciessta.
[Iraq] [battle] [lost].

John lost his computer.
John-i computer-lul ilepelyessta.
[John] [computer] [misplaced].
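
The two sentences above suggest that the object of "lose" can select the Korean verb. Here is a minimal sketch of that idea, assuming a two-entry transfer lexicon keyed on the object's semantic class. The class names, the table layout, and the `choose_verb` helper are illustrative assumptions, not something given on the slides.

```python
# Hypothetical semantic classes for the objects in the two example sentences.
SEMANTIC_CLASS = {"battle": "competition", "computer": "physical-object"}

# Hypothetical transfer lexicon: (English verb, object class) -> Korean verb,
# romanized as on the slide.
TRANSFER_LEXICON = {
    ("lose", "competition"): "ciessta",           # glossed [lost] on the slide
    ("lose", "physical-object"): "ilepelyessta",  # glossed [misplaced] on the slide
}


def choose_verb(english_verb, obj):
    """Return the target verb selected by the object's class, or None if unknown."""
    obj_class = SEMANTIC_CLASS.get(obj)
    return TRANSFER_LEXICON.get((english_verb, obj_class))


print(choose_verb("lose", "battle"))    # ciessta
print(choose_verb("lose", "computer"))  # ilepelyessta
```

An object with no entry in the class table returns `None`, which is where a real system would need a fallback or a richer disambiguation model.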


Natural Language Processing

• Syntax
  – Grammars, parsers, parse trees, dependency structures
• Semantics
  – Subcategorization frames, semantic classes, ontologies, formal semantics
• Pragmatics
  – Pronouns, reference resolution, discourse models


Syntactic Categories

• Nouns, pronouns, proper nouns
• Verbs, intransitive verbs, transitive verbs, ditransitive verbs (subcategorization frames)
• Modifiers, adjectives, adverbs
• Prepositions
• Conjunctions


Syntactic Parsing

• The cat sat on the mat.
  Det Noun Verb Prep Det Noun

• Time flies like an arrow.
  Noun Verb Prep Det Noun

• Fruit flies like a banana.
  Noun Noun Verb Det Noun

Context Free Grammar

• S -> NP VP
• NP -> det (adj) N
• NP -> Proper N
• NP -> N
• VP -> V
• VP -> V PP
• VP -> V NP
• VP -> V NP PP
• PP -> Prep NP
• VP -> V NP NP
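
A grammar this small can be run directly. The sketch below is one way to enumerate every parse these rules license. It is a plain exhaustive splitter, not an efficient chart parser. Two things are assumptions: the toy `LEXICON`, and the extra rule NP -> N N, which is not in the list above but is suggested by the "Noun Noun" tagging of "Fruit flies".

```python
# Rules from the slide; "det (adj) N" is expanded into two rules.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [
        ("Det", "N"), ("Det", "Adj", "N"), ("ProperN",), ("N",),
        ("N", "N"),  # assumption: noun-noun compound, not listed on the slide
    ],
    "VP": [("V",), ("V", "PP"), ("V", "NP"), ("V", "NP", "PP"), ("V", "NP", "NP")],
    "PP": [("Prep", "NP")],
}

# Hypothetical lexicon covering only the example sentences.
LEXICON = {
    "the": {"Det"}, "an": {"Det"},
    "cat": {"N"}, "mat": {"N"}, "time": {"N"}, "arrow": {"N"},
    "sat": {"V"}, "on": {"Prep"},
    "flies": {"N", "V"}, "like": {"V", "Prep"},
}


def parses(symbol, words):
    """Return every tree rooted in `symbol` that spans exactly `words` (a tuple)."""
    trees = []
    if len(words) == 1 and symbol in LEXICON.get(words[0], ()):
        trees.append((symbol, words[0]))
    for rhs in GRAMMAR.get(symbol, ()):
        for children in expand(rhs, words):
            trees.append((symbol,) + children)
    return trees


def expand(rhs, words):
    """Yield tuples of child trees, one per way of splitting `words` across `rhs`."""
    if not rhs:
        if not words:
            yield ()
        return
    first, rest = rhs[0], rhs[1:]
    # Leave at least one word for each remaining symbol.
    for i in range(1, len(words) - len(rest) + 1):
        for left in parses(first, words[:i]):
            for right in expand(rest, words[i:]):
                yield (left,) + right


ambiguous = parses("S", tuple("time flies like an arrow".split()))
print(len(ambiguous))  # 2 readings
for tree in ambiguous:
    print(tree)
```

With this lexicon, "the cat sat on the mat" gets a single parse, while "time flies like an arrow" gets two: one with "flies" as the verb and one with "like" as the verb.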


Parses

The cat sat on the mat

[S [NP [Det the] [N cat]] [VP [V sat] [PP [Prep on] [NP [Det the] [N mat]]]]]


Parses

Time flies like an arrow.

[S [NP [N time]] [VP [V flies] [PP [Prep like] [NP [Det an] [N arrow]]]]]


Parses

Time flies like an arrow.

[S [NP [N time] [N flies]] [VP [V like] [NP [Det an] [N arrow]]]]


Features

• C for Case, Subjective/Objective
  – She visited her.
• P for Person agreement (1st, 2nd, 3rd)
  – I like him, You like him, He likes him.
• N for Number agreement, Subject/Verb
  – He likes him, They like him.
• G for Gender agreement, Subject/Verb
  – English, reflexive pronouns: He washed himself.
  – Romance languages, det/noun
• T for Tense
  – auxiliaries, sentential complements, etc.
  – * will finished is bad
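
The person and number features above can be checked mechanically. The sketch below covers only the "like / likes" examples. The feature table, the `VERB_IS_3SG` map, and the `agrees` helper are hypothetical names chosen for illustration.

```python
# Hypothetical feature bundles for subjects: P = person, N = number.
SUBJECTS = {
    "I": {"P": 1, "N": "sg"},
    "you": {"P": 2, "N": "sg"},
    "he": {"P": 3, "N": "sg"},
    "they": {"P": 3, "N": "pl"},
}

# Whether each present-tense form is the third-person-singular form.
VERB_IS_3SG = {"likes": True, "like": False}


def agrees(subject, verb):
    """True when the verb form matches the subject's person and number."""
    feats = SUBJECTS[subject]
    is_third_singular = feats["P"] == 3 and feats["N"] == "sg"
    return VERB_IS_3SG[verb] == is_third_singular


print(agrees("he", "likes"))    # True
print(agrees("they", "likes"))  # False
```

A fuller treatment would attach such features to grammar rules, so that a rule like S -> NP VP applies only when the NP and VP features are compatible.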


Probabilistic Context Free Grammars

• Adding probabilities
• Lexicalizing the probabilities


Simple Context Free Grammar in BNF

S → NP VP
NP → Pronoun
   | Noun
   | Det Adj Noun
   | NP PP
PP → Prep NP
V → Verb
  | Aux Verb
VP → V
   | V NP
   | V NP NP
   | V NP PP
   | VP PP


Simple Probabilistic CFG

S → NP VP
NP → Pronoun [0.10]
   | Noun [0.20]
   | Det Adj Noun [0.50]
   | NP PP [0.20]
PP → Prep NP [1.00]
V → Verb [0.33]
  | Aux Verb [0.67]
VP → V [0.10]
   | V NP [0.40]
   | V NP NP [0.10]
   | V NP PP [0.20]
   | VP PP [0.20]
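
Under a probabilistic CFG, the probability of a parse is the product of the probabilities of the rules it uses. The sketch below works that out for one small tree using the numbers on this slide. The S → NP VP rule carries no number on the slide, so 1.0 is assumed because it is the only S rule. Word-level probabilities are not given and are ignored here. The tuple encoding of trees and the `tree_probability` helper are illustrative.

```python
import math

# Rule probabilities from the slide; the S rule's 1.0 is an assumption.
PCFG = {
    ("S", ("NP", "VP")): 1.00,
    ("NP", ("Pronoun",)): 0.10,
    ("NP", ("Noun",)): 0.20,
    ("NP", ("Det", "Adj", "Noun")): 0.50,
    ("NP", ("NP", "PP")): 0.20,
    ("PP", ("Prep", "NP")): 1.00,
    ("V", ("Verb",)): 0.33,
    ("V", ("Aux", "Verb")): 0.67,
    ("VP", ("V",)): 0.10,
    ("VP", ("V", "NP")): 0.40,
    ("VP", ("V", "NP", "NP")): 0.10,
    ("VP", ("V", "NP", "PP")): 0.20,
    ("VP", ("VP", "PP")): 0.20,
}


def tree_probability(tree):
    """Multiply the probabilities of every rule used in `tree`.

    A tree is (label, child, child, ...). A bare string is a preterminal
    (Pronoun, Noun, Verb, ...) and contributes 1.0, since the slide gives
    no word probabilities.
    """
    if isinstance(tree, str):
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    prob = PCFG[(label, rhs)]
    for child in children:
        prob *= tree_probability(child)
    return prob


# Structure for a "Pronoun Verb Noun" sentence.
tree = ("S", ("NP", "Pronoun"), ("VP", ("V", "Verb"), ("NP", "Noun")))
print(tree_probability(tree))
assert math.isclose(tree_probability(tree), 1.00 * 0.10 * 0.40 * 0.33 * 0.20)
```

The product here is 1.00 × 0.10 × 0.40 × 0.33 × 0.20, roughly 0.00264. When a sentence has several parses, the one with the highest product is preferred.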


Simple Probabilistic Lexicalized CFG

S → NP VP
NP → Pronoun [0.10]
   | Noun [0.20]
   | Det Adj Noun [0.50]
   | NP PP [0.20]
PP → Prep NP [1.00]
V → Verb [0.33]
  | Aux Verb [0.67]
VP → V [0.87] {sleep, cry, laugh}
   | V NP [0.03]
   | V NP NP [0.00]
   | V NP PP [0.00]
   | VP PP [0.10]


Simple Probabilistic Lexicalized CFG

VP → V [0.30]
   | V NP [0.60] {break, split, crack..}
   | V NP NP [0.00]
   | V NP PP [0.00]
   | VP PP [0.10]

VP → V [0.10]
   | V NP [0.40]
   | V NP NP [0.10]
   | V NP PP [0.20]
   | VP PP [0.20]

What about leave? leave1, leave2?
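
Lexicalizing means the VP expansion probabilities depend on the head verb. The sketch below stores the two verb-class tables from the slides and looks up the preferred frame for a given verb. The class labels `sleep-class` and `break-class` and the helper names are made up for illustration. "leave" is deliberately absent, since the slides leave its treatment as an open question.

```python
# VP expansion probabilities per verb class, copied from the slides.
VP_RULES = {
    "sleep-class": {"V": 0.87, "V NP": 0.03, "V NP NP": 0.00,
                    "V NP PP": 0.00, "VP PP": 0.10},
    "break-class": {"V": 0.30, "V NP": 0.60, "V NP NP": 0.00,
                    "V NP PP": 0.00, "VP PP": 0.10},
}

# Class membership as listed on the slides; the class names are hypothetical.
VERB_CLASS = {
    "sleep": "sleep-class", "cry": "sleep-class", "laugh": "sleep-class",
    "break": "break-class", "split": "break-class", "crack": "break-class",
}


def vp_rule_probability(head_verb, expansion):
    """P(VP -> expansion | head verb), looked up through the verb's class."""
    return VP_RULES[VERB_CLASS[head_verb]][expansion]


def preferred_expansion(head_verb):
    """The most probable VP expansion for this head verb."""
    rules = VP_RULES[VERB_CLASS[head_verb]]
    return max(rules, key=rules.get)


print(preferred_expansion("laugh"))  # V
print(preferred_expansion("crack"))  # V NP
```

The intransitive frame dominates for the sleep class and the transitive frame for the break class, which is information the unlexicalized table cannot express.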


Language to Logic

• John went to the book store.
  John store1, go(John, store1)

• John bought a book.
  buy(John, book1)

• John gave the book to Mary.
  give(John, book1, Mary)

• Mary put the book on the table.
  put(Mary, book1, table1)


Semantics

Same event - different sentences

  John broke the window with a hammer.

  John broke the window with the crack.

  The hammer broke the window.

  The window broke.


Same event - different syntactic frames

  John broke the window with a hammer.
  SUBJ VERB OBJ MODIFIER

  John broke the window with the crack.
  SUBJ VERB OBJ MODIFIER

  The hammer broke the window.
  SUBJ VERB OBJ

  The window broke.
  SUBJ VERB


Semantics - predicate arguments

  break(AGENT, INSTRUMENT, PATIENT)

  AGENT PATIENT INSTRUMENT
  John broke the window with a hammer.

  INSTRUMENT PATIENT
  The hammer broke the window.

  PATIENT
  The window broke.

  Fillmore 68 - The case for case


  AGENT PATIENT INSTRUMENT
  John broke the window with a hammer.
  SUBJ OBJ MODIFIER

  INSTRUMENT PATIENT
  The hammer broke the window.
  SUBJ OBJ

  PATIENT
  The window broke.
  SUBJ


Canonical Representation

  break (Agent: animate, Instrument: tool, Patient: physical-object)

  Agent <=> subj
  Instrument <=> subj, with-pp
  Patient <=> obj, subj
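
The mapping above can be read as a small procedure: use each filler's semantic type to decide which role a syntactic slot realizes. The sketch below does this for "break" only. The `SEMANTIC_TYPE` table and the `break_roles` helper are assumptions for illustration. Typing each noun with a single label is a simplification, since a tool is also a physical object.

```python
# Hypothetical semantic types for the nouns in the example sentences.
SEMANTIC_TYPE = {
    "John": "animate",
    "hammer": "tool",
    "window": "physical-object",
    "crack": "flaw",  # not a tool, so "with the crack" is not an Instrument
}


def break_roles(subj, obj=None, with_pp=None):
    """Map the syntactic slots of a 'break' clause to thematic roles."""
    roles = {}
    subj_type = SEMANTIC_TYPE[subj]
    if subj_type == "animate":
        roles["Agent"] = subj        # Agent <=> subj
    elif subj_type == "tool":
        roles["Instrument"] = subj   # Instrument <=> subj
    else:
        roles["Patient"] = subj      # Patient <=> subj ("The window broke.")
    if obj is not None:
        roles["Patient"] = obj       # Patient <=> obj
    if with_pp is not None and SEMANTIC_TYPE.get(with_pp) == "tool":
        roles["Instrument"] = with_pp  # Instrument <=> with-pp
    return roles


print(break_roles("John", "window", "hammer"))
print(break_roles("hammer", "window"))
print(break_roles("window"))
```

All three hammer and window sentences come out with "window" as Patient even though their syntactic frames differ. "John broke the window with the crack" gets no Instrument, because "crack" fails the tool restriction.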

 


Syntax/semantics interaction

• Parsers will produce syntactically valid parses for semantically anomalous sentences

• Lexical semantics can be used to rule them out


Headlines

• Police Begin Campaign To Run Down Jaywalkers

• Iraqi Head Seeks Arms

• Teacher Strikes Idle Kids

• Miners Refuse To Work After Death

• Juvenile Court To Try Shooting Defendant

