CSA2050 Introduction to Computational Linguistics Lecture 1 What is Computational Linguistics?

Date post: 14-Jan-2016
Upload: isabel-mccoy
Page 1: CSA2050 Introduction to Computational Linguistics Lecture 1 What is Computational Linguistics?

CSA2050 Introduction to Computational Linguistics

Lecture 1

What is Computational Linguistics?

Feb 2005 -- MR CSA2050 - Lecture I: What Is CL? 2

Lecture 1

Course Information
What is CL?
  Linguistics
  CS
Course Contents

Course Information

Web: http://www.cs.um.edu.mt/~mros/csa2050

[email protected]@um.edu.mt

Book (nominally): Jurafsky & Martin, Speech and Language Processing, Prentice Hall 2000, ISBN 0-13-095069-6

CL: Two Main Disciplines

COMP SCI

LINGUISTICS

Computers and Language

Computational Linguistics
  Emphasis on mechanised linguistic theories.
  Grew out of early Machine Translation efforts.

Natural Language Processing
  Computational models of language analysis, interpretation, and generation.
  Syntax/semantics interface.

Language Engineering
  Emphasis on large-scale performance.
  Example: Google.

Speech Technology

Linguistics

Phonetics: The study of speech sounds
Phonology: The study of sound systems
Morphology: The study of word structure
Syntax: The study of sentence structure
Semantics: The study of meaning
Pragmatics: The study of language use

History of Grammar

Until 50 years ago, most linguistic work concerned sound systems (phonology), word structure (morphology), and the historical relationships among languages.

Writings on grammar go back at least 3000 years. Until 200 years ago, almost all of it was prescriptive.

Scientific study of sentence grammar is comparatively recent.

[source: Sag & Wasow]

Grammar: the rules of a language

Prescriptive Grammar
  Subjective
  Rules for and against certain uses
  Proscribed forms that are in current use
  “don’t end a sentence with a preposition”

Descriptive Grammar
  Objective
  Rules characterizing what people actually say
  Goal is to characterize all and only sentences that belong to the language.

Noam Chomsky

Noam Chomsky’s work in the 1950s radically changed linguistics, making syntax central.

Chomsky has been the dominant figure in linguistics ever since.

Chomsky invented the generative approach to grammar.

Generative Grammar: What Follows?

Grammars should be formulated precisely and explicitly.
Grammar is a theory of linguistic knowledge.
Mathematical definition of a grammar as a generative device.
Grammar should generate exactly the strings of the language.

[source: Sag & Wasow]

Generative Power of a Grammar

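[Figure: three diagrams comparing the set G generated by the grammar with the language L]

Undergeneration: only but not all (G inside L)
Overgeneration: all but not only (L inside G)
All and only (G = L)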
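
The three cases can be illustrated by treating the language L and the output of a grammar G as finite sets of strings. The Python sketch below is only an illustration: the function name `classify` and the example sentences are assumptions made for this example, and a real language is usually infinite, so the comparison could not be done by listing its sentences.

```python
def classify(generated, language):
    """Compare the set a grammar generates with the target language."""
    if generated == language:
        return "all and only"
    if generated < language:       # proper subset: only, but not all
        return "undergeneration"
    if generated > language:       # proper superset: all, but not only
        return "overgeneration"
    return "neither"               # misses some sentences and adds others


# Hypothetical toy language and three hypothetical grammar outputs.
LANGUAGE = {"John kicks Bill", "Bill kicks John"}

print(classify({"John kicks Bill"}, LANGUAGE))
print(classify(LANGUAGE | {"kicks John Bill"}, LANGUAGE))
print(classify(set(LANGUAGE), LANGUAGE))
```

The fourth outcome, "neither", does not appear on the slide; it is included so that the function gives an answer for every pair of sets.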

Theories of Sentence and Word Structure: Rewrite Rules

Rewrite rules can be used to specify the sentences of a language.

Rules have the form LHS → RHS.
LHS may be a sequence of symbols.
RHS may be a sequence of symbols or words.

Lexicon specifies words and their categories
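
One possible way to represent rewrite rules in a program is as pairs of symbol sequences. The sketch below uses the small grammar and lexicon from the next slide; the data layout and the helper `rewrite_once` are assumptions chosen for illustration, not a representation prescribed by the course. It derives "John kicks Bill" from S by applying one rule at a time.

```python
# Each rule is a pair (LHS, RHS); both sides are tuples of symbols.
GRAMMAR = [
    (("S",), ("NP", "VP")),
    (("NP",), ("N",)),
    (("VP",), ("V", "NP")),
]
LEXICON = [
    (("V",), ("kicks",)),
    (("N",), ("John",)),
    (("N",), ("Bill",)),
]


def rewrite_once(form, rule):
    """Apply rule at the leftmost place its LHS occurs in form, else return None."""
    lhs, rhs = rule
    for i in range(len(form) - len(lhs) + 1):
        if form[i:i + len(lhs)] == lhs:
            return form[:i] + rhs + form[i + len(lhs):]
    return None


# One derivation of "John kicks Bill", written as a sequence of rule choices.
steps = [GRAMMAR[0], GRAMMAR[1], LEXICON[1], GRAMMAR[2],
         LEXICON[0], GRAMMAR[1], LEXICON[2]]

form = ("S",)
derivation = [form]
for rule in steps:
    form = rewrite_once(form, rule)
    derivation.append(form)

for line in derivation:
    print(" ".join(line))
```

Allowing the LHS to be a tuple rather than a single symbol follows the slide's statement that the LHS may be a sequence of symbols, although every rule used here has a single symbol on the left.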

A Simple Grammar/Lexicon

grammar:
  S → NP VP
  NP → N
  VP → V NP

lexicon:
  V → kicks
  N → John
  N → Bill

Tree for "John kicks Bill":
  [S [NP [N John]] [VP [V kicks] [NP [N Bill]]]]
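
Because this grammar has no recursive rules, every tree it licenses can be listed. The following Python sketch enumerates them; the dictionary layout and the function names `trees`, `words_of` and `bracket` are assumptions made for illustration. The enumeration would not terminate for a grammar with recursive rules.

```python
from itertools import product

RULES = {"S": [["NP", "VP"]], "NP": [["N"]], "VP": [["V", "NP"]]}
WORDS = {"V": ["kicks"], "N": ["John", "Bill"]}


def trees(symbol):
    """Yield every tree rooted in symbol, as nested tuples (label, child, ...)."""
    for word in WORDS.get(symbol, []):
        yield (symbol, word)
    for rhs in RULES.get(symbol, []):
        # Combine every choice of subtree for each symbol on the right-hand side.
        for children in product(*(list(trees(child)) for child in rhs)):
            yield (symbol, *children)


def words_of(tree):
    """Read the sentence off the leaves of a tree, left to right."""
    _label, *children = tree
    out = []
    for child in children:
        out.extend([child] if isinstance(child, str) else words_of(child))
    return out


def bracket(tree):
    """Render a tree in labelled bracket notation."""
    label, *children = tree
    parts = [c if isinstance(c, str) else bracket(c) for c in children]
    return "[" + label + " " + " ".join(parts) + "]"


for t in trees("S"):
    print(" ".join(words_of(t)), "  ", bracket(t))
```

The grammar licenses four sentences, including "John kicks John", which shows that it characterises word order and category only; it says nothing about whether a sentence is sensible.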

Grammar + Lexicon

Defines language = (possibly infinite) set of sentences.
But grammar is finite.
Assigns structures that are in general "closer" to meaning than the sentence itself.
Grammar/Lexicon = linguistic knowledge?
Learnability: grammar is a concrete entity that can be acquired.
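
A finite grammar can define an infinite set of sentences once a rule is allowed to reintroduce a symbol it was derived from. The sketch below extends the simple grammar with one recursive rule, VP → SV S, and a verb "says"; both are hypothetical additions for this illustration and are not part of the course grammar. Counting the sentences reachable within a growing derivation depth shows that the set keeps growing.

```python
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["N"]],
    # The second VP rule is an assumed, recursive addition: it reintroduces S.
    "VP": [["V", "NP"], ["SV", "S"]],
}
WORDS = {"V": ["kicks"], "SV": ["says"], "N": ["John", "Bill"]}


def sentences(symbol, depth):
    """Return the word sequences derivable from symbol within `depth` rule levels."""
    results = [[w] for w in WORDS.get(symbol, [])]
    if depth == 0:
        return results
    for rhs in RULES.get(symbol, []):
        partial = [[]]
        for child in rhs:
            partial = [p + s for p in partial for s in sentences(child, depth - 1)]
        results.extend(partial)
    return results


for depth in (3, 5, 7):
    print(depth, len(sentences("S", depth)))
```

The grammar itself never changes size; only the depth bound does. Removing the bound would make the enumeration run forever, which is the sense in which the language is infinite.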

Formal v. Natural Languages

Formal Languages
  Numbers: 3290 1 1010101
  Logic: ∀x man(x) → mortal(x)
  C: if (i > 10) exit(0);

Natural Languages
  English: John saw the dog
  German: Johann hat den Hund gesehen
  Maltese: Gianni ra kelb

Points of Similarity

A language is considered to be a (possibly infinite) set of sentences.
Sentences are sequences of words.
Formation rules determine which sequences are valid sentences.
Sentences have a definite structure.
Sentence structure is related to meaning.

Points of Difference

Formal Languages
  The grammar defines the language
  Restricted application
  Unambiguous

Natural Languages
  The language defines the grammar
  Universal application
  Highly ambiguous

Ambiguity

Lexical Ambiguity: the sheep is in the pen
Syntactic Ambiguity: small animals and children laugh
Semantic Ambiguity: every girl loves a sailor
Pragmatic Ambiguity: can you pass the salt?

The management of ambiguity is central to the success of CL.
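
Syntactic ambiguity can be made concrete by counting the structures a grammar assigns to a sentence. The Python sketch below parses "small animals and children laugh" with a toy grammar; the rules, the category names (Adj, Conj) and the functions `splits`, `parses` and `bracket` are assumptions made for this illustration. It tries every way of dividing the words among the symbols of a rule, which is simple but far less efficient than the parsing algorithms used in practice.

```python
from itertools import combinations

# Hypothetical toy grammar and lexicon for the example sentence.
RULES = {
    "S": [("NP", "VP")],
    "NP": [("Adj", "NP"), ("NP", "Conj", "NP"), ("N",)],
    "VP": [("V",)],
}
LEXICON = {"small": "Adj", "animals": "N", "children": "N",
           "and": "Conj", "laugh": "V"}


def splits(words, k):
    """Yield every way to cut words into k non-empty contiguous pieces."""
    n = len(words)
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0, *cuts, n)
        yield [words[bounds[i]:bounds[i + 1]] for i in range(k)]


def parses(symbol, words):
    """Return every tree that derives exactly `words` from `symbol`."""
    found = []
    if len(words) == 1 and LEXICON.get(words[0]) == symbol:
        found.append((symbol, words[0]))
    for rhs in RULES.get(symbol, []):
        for pieces in splits(words, len(rhs)):
            partial = [()]
            for child, piece in zip(rhs, pieces):
                partial = [p + (t,) for p in partial for t in parses(child, piece)]
            found.extend((symbol, *children) for children in partial)
    return found


def bracket(tree):
    """Render a tree in labelled bracket notation."""
    label, *children = tree
    parts = [c if isinstance(c, str) else bracket(c) for c in children]
    return "[" + label + " " + " ".join(parts) + "]"


sentence = "small animals and children laugh".split()
for tree in parses("S", sentence):
    print(bracket(tree))
```

Under this grammar the sentence receives two trees: one in which "small" modifies the whole conjunction "animals and children", and one in which it modifies "animals" only.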

Computer Science

The study of basic concepts
  Algorithm
  Program
  Information
  Data

The application of these concepts to practical tasks.

Implementation of information processing models from other fields.

Unimplemented theories can be dangerous

Representational details omitted.
Computer memory requirements omitted.
Nature of individual steps may be unclear.
Difficult to test.
Potentially unimplementable.

Psychological Memory Model

Algorithms and Linguistics

Does linguistic theory make sense without implementing the concepts?

Linguistic theory provides linguistic knowledge in the form of
  grammar rules
  theories about grammar rules

Putting knowledge to some use involves processing issues:
  parsing
  generation

Computational Linguistics – Issues

How are a grammar and a lexicon represented?
How is the structure of a given sentence actually discovered?
How can we actually generate a sentence to express a particular meaning?
How can linguistic theory be made concrete enough to test algorithmically?
Can an artificial system learn a language with limited exposure to grammatical sentences?

Computational Linguistics: Twin Goals

Scientific Goal: Contribute to Linguistics by adding a computational dimension.

Technological Goal: Develop the basis for machinery capable of handling human language that can support “language engineering”.

Applications of Computational Linguistics

Machine Translation
Information Retrieval/Extraction
Document Classification
Question Answering
Style and Spell Checking
Integrated Multimodal Tasks

Course Contents

1 (MR) Overview

2 (RF) Chomsky Hierarchy

3 (MR) Examples

4 (RF) Grammatical Categories

5, 6 (MR) Tagging

7 (RF) Morphology

8, 9, 10 (MR) Comp Morphology

11 (RF) Syntax

12, 13, 14 (MR) Grammar Formalism

Computational Linguistics – Tools & Resources

Grammar Formalisms, e.g. Definite Clause Grammars
Parsing Algorithms: sentence → structure
Generation Algorithms: structure → sentence
Statistical Methods
Linguistic Corpora
