CSA2050 Introduction to Computational Linguistics

09.04.2003 CSA2050: DCG I 1

Lecture 8: Definite Clause Grammars

Rationale

(Diagram with the labels: Logic, CFG + Sentence, Prolog Program, Sentence Structure.)

Logic Rules and Grammar Rules

Basic Question: what is the connection between logic rules and grammar rules?

∀x ∀y male(x) & parent(x,y) → father(x,y)

S → NP VP

They are both concerned with the definition of predicates.
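
As a small illustration of a logic rule defining a predicate, the father rule can be written as a Prolog clause. This is only a sketch: the individuals tom, sue and ann are made-up examples, not part of the lecture.

```prolog
% Hypothetical facts, invented for illustration only.
male(tom).
parent(tom, ann).
parent(sue, ann).

% The logic rule: X is the father of Y if X is male and a parent of Y.
father(X, Y) :- male(X), parent(X, Y).
```

With these facts, the query ?- father(tom, ann). succeeds and ?- father(sue, ann). fails, because male(sue) cannot be proved.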

Logic: arbitrary n-ary predicates, e.g. raining; clever(x); father(x,y); between(x,y,z)

Grammar rules: predicates over text segments, e.g. np(x); vp(y); s(z).

Text Segments

A text segment is a sequence of consecutive words.

A text segment can be identified by two pointers, if we assign names to the spaces between words:

0 the 1 cat 2 sat 3 on 4 the 5 mat 6

(0,6) is the whole sentence.
(0,2) is the first noun phrase.
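
The pointer idea can be tried out with a short sketch that extracts the words lying between two positions of a word list. The predicate name segment/4 is a hypothetical helper introduced here for illustration; it assumes the list predicates length/2 and append/3 found in most Prolog systems.

```prolog
% segment(+Words, +From, +To, -Segment)
% Segment is the list of words between positions From and To,
% where position 0 lies before the first word.
segment(Words, From, To, Segment) :-
    To >= From,
    Len is To - From,
    length(Prefix, From),          % skip the first From words
    append(Prefix, Rest, Words),
    length(Segment, Len),          % take the next To - From words
    append(Segment, _, Rest).
```

For example, ?- segment([the,cat,sat,on,the,mat], 0, 2, S). binds S to [the,cat], the first noun phrase.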

From Grammar Rules to Logic

The general statement made by the CF rule S → NP, VP

can be summarised using predicates over segments with the following logic statement

NP(p1,p) & VP(p,p2) => S(p1,p2)

0 the 1 cat 2 sat 3 on 4 the 5 mat 6

NP = (0,2), VP = (2,6), S = (0,6)

From Logic to Prolog

Each logic statement of the form

NP(p1,p) & VP(p,p2) => S(p1,p2)

corresponds to the "definite clause"

s(P1,P2) :- np(P1,P), vp(P,P2).
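
A minimal sketch of how this clause behaves, assuming np/2 and vp/2 are simply given as facts about the sentence "the cat sat on the mat". The positions chosen for the verb phrase are an assumption made for illustration.

```prolog
% Assumed facts: where the noun phrase and the verb phrase lie.
np(0, 2).   % "the cat"
vp(2, 6).   % "sat on the mat"

% The definite clause: an S spans P1..P2 if an NP spans P1..P
% and a VP spans P..P2, for some middle position P.
s(P1, P2) :- np(P1, P), vp(P, P2).
```

The query ?- s(0,6). succeeds, with the shared variable P bound to 2, the point where the noun phrase ends and the verb phrase begins.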

Converting a Grammar

S → NP, VP

NP → N

NP → Det N

VP → V NP

s(P1,P2) :- np(P1,P), vp(P,P2).

np(P1,P2) :- n(P1,P2).

np(P1,P2) :- det(P1,P), n(P,P2).

vp(P1,P2) :- v(P1,P), np(P,P2).

Lexical Categories and Rules

Lexical categories are those which are not defined in the grammar itself (e.g. N and V in our grammar).

Instead, they are defined by the words that they rewrite:

V → run, sleep, talk, etc.

Lexical categories always derive exactly one input token.

Lexical Rules

A rule defining lexical category C must express the following information: there is a C between positions p1 and p2 if some word of syntactic category C spans those positions.

There are many different ways to translate such a rule into a Prolog clause.

Each way needs to make reference to how the input sentence is represented.

Defining Lexical Categories

Each category is defined in terms of the words it can rewrite:

d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).

How is the input sentence represented?

Representing the Input

Define the predicate input(P1,P2,L) such that P1 and P2 are positions and L is a list containing the words spanning those positions.

Checkpoint: show how to represent the input sentence "John ate the cat".

John ate the cat

input(0,1,['John']).

input(1,2,[ate]).

input(2,3,[the]).

input(3,4,[cat]).

Checkpoints

Why is John in quotes?
Why use a list of one element rather than an atom?
Is this the only way to do it?
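
On the first question: in Prolog a name beginning with a capital letter is a variable, so the word John has to be quoted to be treated as an atom. On the last question, the four facts are not the only option. As one possible alternative (a sketch, not the lecture's own answer), the sentence can be stored once as a list and input/3 derived from it. The predicate sentence/1 is a hypothetical name, and nth0/3 is assumed to be available as in SWI-Prolog's list library.

```prolog
% The whole sentence, stored once (hypothetical predicate name).
sentence(['John', ate, the, cat]).

% input(P1, P2, [Word]): Word is the single word between P1 and P2.
input(P1, P2, [Word]) :-
    sentence(Words),
    nth0(P1, Words, Word),   % Word is the element at index P1, counting from 0
    P2 is P1 + 1.
```

This definition gives the same answers as the four facts for single-word spans, e.g. ?- input(1, P, [ate]). binds P to 2.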

Complete Program

1. Grammar

s(P1,P2) :- np(P1,P), vp(P,P2).

np(P1,P2) :- n(P1,P2).

np(P1,P2) :- d(P1,P), n(P,P2).

vp(P1,P2) :- v(P1,P2).

vp(P1,P2) :- v(P1,P), np(P,P2).

2. Lexicon

d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).

3. Input

input(0,1,['John']).
input(1,2,[ate]).
input(2,3,[the]).
input(3,4,[cat]).

4. Query

?- s(0,4).

Trace of query ?- vp(1,4)

1 1 Call: vp(1,4) ?
2 2 Call: v(1,4) ?
3 3 Call: input(1,4,[ate]) ?
3 3 Fail: input(1,4,[ate]) ?
2 2 Fail: v(1,4) ?
2 2 Call: v(1,_349) ?
3 3 Call: input(1,_349,[ate]) ?
3 3 Exit: input(1,2,[ate]) ?
2 2 Exit: v(1,2) ?
4 2 Call: np(2,4) ?
5 3 Call: n(2,4) ?
6 4 Call: input(2,4,[cat]) ?
6 4 Fail: input(2,4,[cat]) ?
6 4 Call: input(2,4,[John]) ?
6 4 Fail: input(2,4,[John]) ?
5 3 Fail: n(2,4) ?
5 3 Call: d(2,_1338) ?
6 4 Call: input(2,_1338,[the]) ?
6 4 Exit: input(2,3,[the]) ?
5 3 Exit: d(2,3) ?
7 3 Call: n(3,4) ?
8 4 Call: input(3,4,[cat]) ?
8 4 Exit: input(3,4,[cat]) ?
7 3 Exit: n(3,4) ?
4 2 Exit: np(2,4) ?
1 1 Exit: vp(1,4) ?

Representing the Sentence Using Difference Lists

We can represent the input as a pair of pointers.
The first pointer points to the entire list.
The second pointer points to a suffix of the list.
The represented list is the difference between the two lists.

input(['John',ate,the,cat],['John',ate,the,cat]).
input(['John',ate,the,cat],[ate,the,cat]).
input(['John',ate,the,cat],[the,cat]).
input(['John',ate,the,cat],[]).

input([X|Y],Y,X).
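
A minimal sketch of how a recogniser might look under this representation, assuming the same small grammar and lexicon used in the complete program earlier. Each lexical clause is written directly over the two lists rather than through input/3, which is a simplification chosen for this example.

```prolog
% A lexical category consumes one word from the front of the list.
d([the|Rest], Rest).
n([cat|Rest], Rest).
n(['John'|Rest], Rest).
v([ate|Rest], Rest).

% The grammar clauses keep their shape: the "positions" are now lists.
s(L1, L2)  :- np(L1, L), vp(L, L2).
np(L1, L2) :- n(L1, L2).
np(L1, L2) :- d(L1, L), n(L, L2).
vp(L1, L2) :- v(L1, L2).
vp(L1, L2) :- v(L1, L), np(L, L2).
```

The query ?- s(['John',ate,the,cat], []). succeeds: the difference between the full list and the empty list is the whole sentence.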

DCG Notation

The conversion of CF rules into Prolog is so simple that it can be done automatically.

Clauses in DCG notation:

s --> np, vp.
np --> d, n.
n --> [cat].

are automatically translated when read in to

s(P1,P2) :- np(P1,P), vp(P,P2).
np(P1,P2) :- d(P1,P), n(P,P2).
n([cat|L],L).

Every DCG rule takes the form

nonterminal --> expansion

where expansion is any of:

A nonterminal symbol: np
A list of terminal symbols: [each,other]
A null constituent: []
A plain Prolog goal enclosed in braces: {write('Found')}
A series of any of these expansions joined by commas.
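
The kinds of expansion can be seen together in a small made-up grammar. This is a sketch for illustration only: the rules are not from the lecture, and phrase/2 is assumed to be available, as it is in most Prolog systems that support DCG notation.

```prolog
np --> [each, other].               % a list of terminal symbols
np --> d, n.                        % nonterminals joined by a comma
d  --> [].                          % null constituent: the determiner is optional
d  --> [the].
n  --> [cat], {write('Found')}.     % plain Prolog goal, run when "cat" is matched
```

Both ?- phrase(np, [the, cat]). and ?- phrase(np, [cat]). succeed and write Found, while ?- phrase(np, [each, other]). succeeds without writing anything.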

Complete DCG

1. Grammar

s --> np, vp.

np --> n.

np --> d, n.

vp --> v.

vp --> v, np.

2. Lexicon

d --> [the].
n --> [cat].
n --> ['John'].
v --> [ate].

3. Input

(No separate input facts are needed; the word list is passed in the query.)

4. Query

?- s(['John', ate, the, cat], []).
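
Many Prolog systems also provide phrase/2, which calls a DCG nonterminal without the two list arguments having to be written by hand. The sketch below repeats the grammar so that it can be loaded on its own; recognise/1 is a hypothetical helper name introduced for this example.

```prolog
% 1. Grammar
s  --> np, vp.
np --> n.
np --> d, n.
vp --> v.
vp --> v, np.

% 2. Lexicon
d --> [the].
n --> [cat].
n --> ['John'].
v --> [ate].

% recognise(+Words): succeeds if Words is a sentence of the grammar.
recognise(Words) :- phrase(s, Words).
```

For instance, ?- recognise(['John', ate, the, cat]). succeeds, whereas ?- recognise([ate, the, cat]). fails because no noun phrase precedes the verb.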

Checkpoints

What is your system's translation of the following?

s --> np, vp.
n --> [cat].
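
One way to inspect the translation, assuming a system such as SWI-Prolog in which listing/1 prints the clauses stored for a predicate. The helper show_translation/0 is a hypothetical name, and the variable names and layout of the printed clauses differ between systems.

```prolog
s --> np, vp.
n --> [cat].

% show_translation: print the ordinary clauses generated for s and n.
% Each DCG nonterminal becomes a predicate with two extra arguments.
show_translation :-
    listing(s/2),
    listing(n/2).
```

After loading the file, ?- show_translation. displays whatever clauses the system produced, which can then be compared with the hand translation given earlier.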

