10/23/2015CPSC503 Winter 20081 CPSC 503 Computational Linguistics Natural Language Generation...

04/20/23 CPSC503 Winter 2008 1

CPSC 503 Computational Linguistics

Natural Language Generation

Lecture 15

Giuseppe Carenini


Knowledge-Formalisms Map for Generation

Logical formalisms

(First-Order Logics)

Rule systems

(features and unification)

State Machines Morphology

Syntax

Pragmatics

Discourse and Dialogue

Semantics

AI planners

Understanding

Generation

Intended meaning

Discourse (English)


NLG Systems (see handout)

Input:
• Communicative Goals
• Domain Knowledge
• Context Knowledge

→ NLG System → Text

Examples

• FOG – Input: numerical data about future weather. Output: textual weather forecasts

• IDAS – Input: KB describing a piece of machinery (e.g., bike), user’s level of expertise. Output: hypertext help messages

• ModelExplainer – Input: OO model. Output: textual description of information on aspects of the model

• STOP – Input: user history and attitudes toward smoking. Output: personalized smoking cessation letter


GEA: the Generator of Evaluative Arguments

- NLG pipeline architecture
- Research methodology
- Extrinsic evaluation (vs. Intrinsic)
- Research / commercialization cycle


Four Basic Types of Persuasive Text - “Arguments”… (main claim)

• Factual Argument (e.g., Canada is the only country outside of Asia to record SARS-related deaths….)

• Causal Argument (e.g., Travelers from Hong Kong brought SARS to Toronto….)

• Recommendation (e.g., You should not go to China in the next few weeks…..)

• Evaluative Argument (e.g., Some Asian governments were inefficient in stopping the SARS outbreak…)


Sample Textual Evaluative Arguments

Vancouver is better than Seattle. There is less crime. Also, social services are more accessible.

Comparison

House-A is great! Although it is somewhat old, the house is spacious and is in an excellent location.

Single entity


Evaluative Arguments: Importance

Natural Language Generation Theory: model of argument type which is pervasive in natural human communication.

Ability to generate evaluative arguments is crucial in large classes of systems:
– Personal assistants (e.g., travel advisor)
– Recommender systems (e.g., movie, book)
– Tutoring Systems
– ……


Limitations of Previous Research

[Ardissono and Goy 99] [Chu-Carroll and Carberry 98] [Elhadad 95] [Kolln 95] [Klein 94] [Morik 89]

• Focus on specific aspects of generation
– Selection of content
– Realization of content into language

• Lack of systematic evaluation
– proof-of-concept system
– analyzed on a few examples


Methodology

• Develop generator of evaluative arguments
– complete
– integrate and extend previous work

• Develop evaluation framework

• Perform experiment within framework to test generator


Outline

• Generator of Evaluative Arguments (GEA)

• Evaluation Framework

• Experiment

• More recent results from others


Text Generator Architecture

Pipeline: Text Planner → Text Plan → Text Micro-planner → Sentence Generator → English

Content Selection and Organization: Text Planner, using

- Communicative Strategies

- Knowledge Sources: User Model, Domain Model

Content Realization: Text Micro-planner and Sentence Generator, using

- Linguistic Knowledge Sources: Lexicon, Grammar


GEA User Model

Argumentation Theory tells us [Miller 96, Mayberry 96]

– Supporting (opposing) evidence depends on values and preferences of audience

– Evidence arranged according to importance (i.e., strength of support or opposition)

– Concise: only important evidence included

User Model must

– Represent values and preferences of user

– Enable identification of supporting and opposing evidence

– Provide measure of evidence importance

… and can be elicited in practice ...


Model of User’s Preferences: Additive Multi-attribute Value Function (AMVF)

• Decision Theory and Psychology (Consumer’s Behavior)

• Can be elicited in practice [Edwards and Barron 1994]

OBJECTIVES (weight of each objective relative to its siblings)

HouseValue
  Location (0.7)
    Neighborhood (0.4)
    Park-Distance (0.6)
  Amenities (0.3)
    Deck-Size (0.8)
    Porch-Size (0.2)

COMPONENT VALUE FUNCTIONS

Neighborhood - N: v1(N) = 0 for Southside, 0.6 for Westend, 1 for Eastend

Park-Distance - PD: v2(PD) = 1 - (1/5 * PD) for 0 <= PD <= 5; 0 for PD > 5

Deck-Size - DS: v3(DS) = 1/80 * DS for 0 <= DS <= 80

Porch-Size - PS: v4(PS) = 1/60 * PS for 0 <= PS <= 60


AMVF application

House-A: Neighborhood = Westend, Park-Distance = 0.5 km, Deck-Size = 20 m2, Porch-Size = 36 m2

Each objective is shown with its weight, its value for House-A, and a mark: + Likes it, - Does not like it

HouseValue: value 0.64 (+)
  Location: weight 0.7, value 0.78 (+)
    Neighborhood: weight 0.4, value 0.6 (+)
    Park-Distance: weight 0.6, value 0.9 (+)
  Amenities: weight 0.3, value 0.32 (-)
    Deck-Size: weight 0.8, value 0.25 (-)
    Porch-Size: weight 0.2, value 0.6 (+)
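The AMVF application for House-A can be reproduced with a few lines of Python. This is only a sketch: the names `TREE`, `COMPONENT_VALUE` and `value` are hypothetical, and the assignment of the six weights to branches is an assumption, chosen because it reproduces the values shown for House-A (0.78, 0.32 and 0.64).

```python
# Hypothetical encoding of the AMVF on the slides. Which weight belongs to
# which branch is an assumption that reproduces the House-A values.
TREE = {
    "HouseValue": [("Location", 0.7), ("Amenities", 0.3)],
    "Location": [("Neighborhood", 0.4), ("Park-Distance", 0.6)],
    "Amenities": [("Deck-Size", 0.8), ("Porch-Size", 0.2)],
}

# Component value functions v1..v4, defined on the ranges given on the slide.
COMPONENT_VALUE = {
    "Neighborhood": lambda n: {"Southside": 0.0, "Westend": 0.6, "Eastend": 1.0}[n],
    "Park-Distance": lambda pd: 1 - pd / 5 if 0 <= pd <= 5 else 0.0,
    "Deck-Size": lambda ds: ds / 80,
    "Porch-Size": lambda ps: ps / 60,
}


def value(objective, entity):
    """Component value at a leaf, weighted sum of the children otherwise."""
    if objective in COMPONENT_VALUE:
        return COMPONENT_VALUE[objective](entity[objective])
    return sum(w * value(child, entity) for child, w in TREE[objective])


house_a = {"Neighborhood": "Westend", "Park-Distance": 0.5,
           "Deck-Size": 20, "Porch-Size": 36}

for name in ("Location", "Amenities", "HouseValue"):
    print(name, round(value(name, house_a), 2))
```

The additive form is what makes the later steps cheap: every intermediate objective has its own value, so the generator can talk about Location or Amenities without recomputing anything.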


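Supporting and Opposing Evidence

House-A: Neighborhood = n2, Park-Distance = 0.5 km, Deck-Size = 20 m2, Porch-Size = 36 m2

Marks on objectives: + Likes it, - Does not like it. Each child is labeled with its relation to its parent.

HouseValue: value 0.64 (+)
  Location: weight 0.7, value 0.78 (+), supporting
    Neighborhood: weight 0.4, value 0.6 (+), supporting
    Park-Distance: weight 0.6, value 0.9 (+), supporting
  Amenities: weight 0.3, value 0.32 (-), opposing
    Deck-Size: weight 0.8, value 0.25 (-), supporting
    Porch-Size: weight 0.2, value 0.6 (+), opposing

Relation between an objective o and Parent(o):

o    Parent(o)    relation
+    +            supporting
-    -            supporting
+    -            opposing
-    +            opposing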
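The supporting/opposing rule can be sketched as a comparison of polarities. The function names are hypothetical, and the 0.5 cut-off between "likes it" and "does not like it" is an assumption consistent with the marks on the House-A values.

```python
def polarity(v, midpoint=0.5):
    """'+' if the user likes the value, '-' otherwise (midpoint is assumed)."""
    return "+" if v > midpoint else "-"


def relation(child_value, parent_value):
    """A child supports its parent when both have the same polarity."""
    same = polarity(child_value) == polarity(parent_value)
    return "supporting" if same else "opposing"


# (objective, its value, value of its parent) for House-A
edges = [
    ("Location", 0.78, 0.64),
    ("Amenities", 0.32, 0.64),
    ("Neighborhood", 0.6, 0.78),
    ("Park-Distance", 0.9, 0.78),
    ("Deck-Size", 0.25, 0.32),
    ("Porch-Size", 0.6, 0.32),
]

for name, v, parent_v in edges:
    print(name, relation(v, parent_v))
```

Note that a small deck (disliked) counts as supporting evidence for the claim that the amenities are poor, which is why the relation is defined relative to the parent and not to the main claim.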


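Measure of Importance [Klein 94]

For each objective o:

$\mathrm{Importance}(o) = w_o \cdot \max[v_o,\ 1 - v_o]$

[Plot of max[v_o, 1 - v_o] against v_o: it equals 1 at v_o = 0 and v_o = 1 and drops to 0.5 at v_o = 0.5.]

House-A: Neighborhood = n2, Park-Distance = 0.5 km, Deck-Size = 20 m2, Porch-Size = 36 m2

Marks on objectives: + Likes it, - Does not like it. Each child is labeled with its relation to its parent and its importance.

HouseValue: value 0.64 (+)
  Location: weight 0.7, value 0.78 (+), supporting, importance 0.55
    Neighborhood: weight 0.4, value 0.6 (+), supporting, importance 0.24
    Park-Distance: weight 0.6, value 0.9 (+), supporting, importance 0.54
  Amenities: weight 0.3, value 0.32 (-), opposing, importance 0.2
    Deck-Size: weight 0.8, value 0.25 (-), supporting, importance 0.6
    Porch-Size: weight 0.2, value 0.6 (+), opposing, importance 0.12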
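A minimal sketch of the importance measure, checked against the six importance figures on this slide. The dictionary names are hypothetical, and the pairing of weights with objectives is an assumption that reproduces those figures.

```python
# Assumed pairing of weights and House-A values with objectives.
WEIGHTS = {"Location": 0.7, "Amenities": 0.3, "Neighborhood": 0.4,
           "Park-Distance": 0.6, "Deck-Size": 0.8, "Porch-Size": 0.2}
VALUES = {"Location": 0.78, "Amenities": 0.32, "Neighborhood": 0.6,
          "Park-Distance": 0.9, "Deck-Size": 0.25, "Porch-Size": 0.6}


def importance(weight, value):
    """Weight times the distance of the value from the worst or best case."""
    return weight * max(value, 1 - value)


for name in WEIGHTS:
    print(name, round(importance(WEIGHTS[name], VALUES[name]), 2))
```

Because of the max, an objective the user strongly dislikes is as important as one the user strongly likes; values near 0.5 contribute the least.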


Why AMVF? - summary

An AMVF

• Represents user’s values and preferences

• Enables identification of supporting and opposing evidence

• Provides measure of evidence importance

– Evidence arranged according to importance

– Concise arguments can be generated

• Can be elicited in practice


GEA Architecture

Pipeline: Text Planner → Text Plan → Text Micro-planner → Sentence Generator → English

Knowledge Sources:

- User Model: AMVF

- Domain Model

Content Realization:

- Lexicon

- Grammar


Argumentative Strategy

Selection: include only “important” evidence (i.e., above threshold on measure of importance)

Organization:

(1) Main Claim (e.g., “This house is interesting”)

(2) Opposing evidence

(3) Most important supporting evidence

(4) Further supporting evidence -- ordered by importance with strongest last

Strategy applied recursively on supporting evidence

Based on guidelines from argumentation theory [Miller 96, Mayberry 96]
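The selection and ordering steps can be sketched for a single level of the objective tree. The function name, the tuple layout, the threshold and the importance numbers in the example are all hypothetical; the recursive application to supporting evidence is left out, and opposing evidence is kept in input order because the slide does not say how it is ordered.

```python
def organize(main_claim, evidence, threshold):
    """Order evidence for one claim.

    evidence: list of (name, relation, importance) tuples, where relation
    is "supporting" or "opposing". Returns a list of (role, content) pairs.
    """
    # Selection: keep only evidence above the importance threshold.
    selected = [e for e in evidence if e[2] > threshold]
    opposing = [e for e in selected if e[1] == "opposing"]
    supporting = sorted((e for e in selected if e[1] == "supporting"),
                        key=lambda e: e[2], reverse=True)

    plan = [("main-claim", main_claim)]
    plan += [("opposing", name) for name, _, _ in opposing]
    if supporting:
        # Most important supporting evidence comes right after the opposition.
        plan.append(("supporting", supporting[0][0]))
        # The rest goes weakest first, so the strongest of them comes last.
        plan += [("supporting", name) for name, _, _ in reversed(supporting[1:])]
    return plan


example = [("a", "supporting", 0.3), ("b", "opposing", 0.4),
           ("c", "supporting", 0.6), ("d", "supporting", 0.5),
           ("e", "supporting", 0.05)]
print(organize("This house is interesting", example, threshold=0.1))
```

Raising the threshold makes the argument more concise, which is the knob varied later in the experiment.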


Sample GEA Text Plan

EVALUATIVE ARGUMENT
  MAIN-CLAIM: (VALUE (House-A) 0.72)
  SUPPORTING EVIDENCE, SUB-CLAIM: (VALUE (Location) 0.7)
    OPPOSING EVIDENCE: (VALUE (distance-from-park 1.8m) 0.3)
    SUPPORTING EVIDENCE: (VALUE (distance-from-work 1mi) 0.75)
    SUPPORTING EVIDENCE: (VALUE (distance-from-rap-trans 0.5 mi) 0.75)

Links in the plan: decomposition, ordering, rhetorical relations


GEA Architecture

Pipeline: Text Planner → Text Plan → Text Micro-planner → Sentence Generator → English

Communicative Strategies: Argumentative Strategy

Knowledge Sources:

- User Model: AMVF

- Domain Model

Content Realization:

- Lexicon

- Grammar


Text Micro-Planner

• Aggregation: combining multiple propositions in one single sentence [Shaw 98]

• Lexicalization:

– Scalar Adjectives (e.g., nice, far, convenient) [Elhadad 93]

– Discourse cues (e.g., although, because, in fact) [Knott 96; Di Eugenio, Moore and Paolucci 97]

– Pronominalization: deciding whether to use a pronoun to refer to an entity (centering [Grosz, Joshi and Weinstein 95])


Aggregation (Logical Forms)

• Conjunction via shared participants

“House B-11 is far from a shopping area” +

“House B-11 is far from public transportation” =

“House B-11 is far from a shopping area and public transportation”.

• Syntactic embedding

“House B-11 offers a nice view” +

“House B-11 offers a view on the river” =

“House B-11 offers a nice view on the river”.
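Conjunction via shared participants can be illustrated on simplified propositions. This is a sketch under strong assumptions: GEA aggregates logical forms, whereas the hypothetical `aggregate` function below works on (subject, predicate, object) string triples and only handles the conjunction case, not syntactic embedding.

```python
def aggregate(propositions):
    """Merge propositions that share subject and predicate into one sentence."""
    merged = {}
    for subject, predicate, obj in propositions:
        # Propositions with the same subject and predicate share a sentence.
        merged.setdefault((subject, predicate), []).append(obj)
    return [f"{subject} {predicate} {' and '.join(objects)}."
            for (subject, predicate), objects in merged.items()]


props = [
    ("House B-11", "is far from", "a shopping area"),
    ("House B-11", "is far from", "public transportation"),
]
print(aggregate(props))
```

With more than two shared objects a real realizer would use commas before the final "and"; that detail is omitted here.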


Scalar Adjectives Selection

[Figure: each objective in the domain model (HOUSE-LOCATION with HAS_PARK_DISTANCE, HAS_COMMUTING_DISTANCE and HAS_SHOPPING_DISTANCE; HOUSE-AMENITIES; ...) is associated with a scale that maps value ranges to scalar adjectives.]

Value ranges shown for HOUSE-LOCATION: Value > 0.8; 0.65 < Value < 0.8; 0.5 < Value < 0.65; 0.35 < Value < 0.5; 0.2 < Value < 0.35; Value < 0.2

Adjectives shown: excellent, convenient, reasonable, average, bad, terrible (e.g., “The house has an excellent location”)


Discourse Cues Selection

[Figure: decision tree that selects a discourse cue from three features of the text plan.]

Features: Rel-type (e.g., CONCESSION), Type-of-nesting (ROOT, EVIDENCE, SEQUENCE), Typed-ordering (e.g., ("CORE" "CONCESSION" "EVIDENCE"), ("CONCESSION" "CORE" "EVIDENCE"))

Discourse cues at the leaves: Although (placed on contributor), Even though (placed on contributor), However (placed on core)


Simple Pronominalization Strategy inspired by Centering Theory

Centering tells us: entity providing link preferentially realized as pronoun (within a discourse segment)

Our Strategy:

• Within a discourse segment successive references always pronoun

• First reference in segment definite description unless
– Segment boundary explicitly marked by discourse cue and
– No pronoun was used in previous sentence

“House B-11 is an interesting house. In fact, it has a reasonable…”.
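The strategy above amounts to a small decision rule. The sketch below is one reading of the slide; the function name and its three boolean parameters are hypothetical.

```python
def referring_form(first_in_segment, boundary_marked_by_cue,
                   pronoun_in_previous_sentence):
    """Choose how to refer to the entity that links two sentences."""
    if not first_in_segment:
        # Successive references within a segment are always pronouns.
        return "pronoun"
    if boundary_marked_by_cue and not pronoun_in_previous_sentence:
        # Exception: the cue already signals the boundary, so a pronoun is safe.
        return "pronoun"
    return "definite description"


# Second sentence of the example: a new segment opened by the cue "In fact",
# and the previous sentence ("House B-11 is ...") contained no pronoun.
print(referring_form(first_in_segment=True, boundary_marked_by_cue=True,
                     pronoun_in_previous_sentence=False))
```

The call reproduces the choice of "it" in the example sentence pair.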


Output of MicroPlanning

Sequence of Lexicalized Functional Descriptions (LFDs)

Example: “House B-11 is close to shops and reasonably close to work”

((CAT CLAUSE) (PROCESS ((TYPE ASCRIPTIVE) (MODE ATTRIBUTIVE)((POLARITY POSITIVE(EPISTEMIC-MODALITY NONE))) (PARTICIPANTS ((CARRIER((CAT NP)(COMPLEX APPOSITION) (RESTRICTIVE YES) (DISTINCT ((AND ((CAT COMMON)(DENOTATION ZERO-ARTICLE-THING)(HEAD ((LEX "house")))) ((CAT PROPER) (LEX "B-11")))(CDR NONE)))) (ATTRIBUTE (AND((CAT AP)(HEAD ((CAT ADJ)(LEX "close"))) (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to"))) (NP((CAT COMMON) (NUMBER PLURAL)(DEFINITE NO) (HEAD ((CAT NOUN) (LEX "shop"))))))))) ((CAT AP)(HEAD ((CAT ADJ)(LEX "reasonably close"))) (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to"))) (NP ((CAT COMMON)(DEFINITE NO) (HEAD ((CAT NOUN)(LEX "work"))))))) )))))))))))


Last Step: Sentence Generator

• Unify LFDs with large grammar of English (FUF/SURGE [Elhadad 93, Robin 94])

– fill in syntactic constraints (e.g., agreement, ordering)

– choose closed class words (e.g., prepositions, articles)

• Apply morphology

• Linearize as English sentences


GEA Highlights

• GEA implements a computational model of generating evaluative arguments

• All aspects covered in a principled way:
– argumentation theory (argumentative strategy and requirements on user model)
– decision theory (user model and elicitation method)
– computational linguistics (architecture, micro-planning techniques and sentence generator)


Outline



• Experiment



Evaluation Framework: Task Efficacy

Subtask 1: User presented with info about set of alternatives
- Select preferred N alternatives
- Order them by preference
- Result: Hot List (1st best, 2nd best, ….., nth best)

NewInstance is created

Subtask 2: User presented with Evaluative argument about NewInstance
- Include? NO / YES
- If YES: Where? (position in the Hot List: 1st best, 2nd best, ….., nth best)

Fill-out final questionnaire

End


Selection Task in Real-Estate

• Why Real-Estate?
– No sophisticated background or expertise
– But still presents challenging decision task

• Instructions
– Move to new town
– Buy house
– Use system for data exploration


Data Exploration System


Argument is presented…


Measures of Effectiveness

• Behavior and Attitude change
– Record of user actions
  • Whether or not adopts new instance
  • Position in Hot List
– Final Questionnaire (combined into the Satisfaction Z-score)
  • How much likes new instance
  • How much likes the instances in Hot-List

• Others (Final questionnaire)
– Decision Confidence
– Decision Rationale

SAMPLE SELF-REPORT

How would you judge the new house?

The more you like the house the closer you should put a cross to “good choice”

bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice

(the subject marks one slot with an X)


Outline



• Experiment



Two Empirical Questions

• Arguments should be concise.

Conciseness can be varied, but….

What is the optimal level of conciseness?

• Argument content, structure and phrasing tailored to user-specific AMVF, but . . .

Does this tailoring actually contribute to argument effectiveness?


Experimental Conditions

• Tailored-Concise (~ 50% of objectives)

• Tailored-Verbose (~ 80% of objectives)

• Non-Tailored-Concise (~ 50% of objectives)

• No-Argument


Experimental Hypotheses

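[Figure: the four conditions (Tailored-Concise, Non-Tailored-Concise, Tailored-Verbose, No-Argument) linked pairwise by their hypothesized relative effectiveness. Three links are marked “>”; links with no prediction are marked “?”.]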


Experimental Procedure

PHASE 1
- Online questionnaire to acquire preferences (AMVF: 19 objectives, 3 layers) [Edwards and Barron 1994]

PHASE 2
- Subject randomly assigned to condition
- Interacts with evaluation framework
- Fills out questionnaire

40 subjects (10 for each condition)


AMVF used in the experiment

Objectives: House-value; Location; Quality; Amenities; Distance-park; Distance-work; Distance-shopping; Garden-Size; Porch-size; Neighborhood; Distance-rapid-trans; Crime; Deck-size; #-of-bars; Street-traffic; Appearance-quality; View-quality; Architectural-style (modern, deco, victorian); View-object (river, park, university, houses)


Satisfaction Z-score

Decision Confidence

Decision Rationale

Experiment Results


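Results: Satisfaction Z-score

[Figure: pairwise comparisons of the satisfaction z-score among Tailored-Concise, Non-Tailored-Concise, Tailored-Verbose and No-Argument. Three comparisons are marked “>”, with p=0.08, p=0.02 and p=0.08 (Dunnett test). The chart also shows the values 0.28, 0.28, 0.05 and 1.]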


Summary

Generator of Evaluative Arguments (GEA): generates concise arguments tailored to a model of the user’s preferences (AMVF)

Evaluation Framework
– Basic decision tasks
– Evaluate wide range of generation techniques

Experiment
– Differences in conciseness influence effectiveness
– Tailoring to AMVF seems to be effective


Future Work (in 2001!)

Extend Argument Generator
– More Complex Textual Arguments
– Speech
– Other domains
– Other languages
– Arguments combining text and graphics

More Experiments to test:
– Whether tailoring to AMVF is actually effective
– Extensions

AT&T MATCH system


Multimodal Access to City Help (MATCH)

(AT&T: Johnston, Ehlen, Bangalore, Walker, Stent, Maloor and Whittaker 2002)

Multimodal interface

• Portable Fujitsu tablet

• Input: Pen for deictic gestures and Speech input

• Output: Text, Speech and graphics


MATCH Example:

User: “Show me Italian restaurants in the West Village”

User: “Compare”

• Comparison: evaluative argument comparing at most five alternatives (reasons for choosing each of them)

User: “Recommend”

• Recommendation: evaluative argument about the best alternative

MATCH generates responses using techniques inspired by GEA


MATCH Evaluation [CogSci 2004]

• 16 subjects “overheard” 4x2 dialogues each about selecting a restaurant

• In each dialogue 6 arguments are generated (3 tailored and 3 non-tailored)

• Subjects rate the information quality of each argument on a 0-5 scale: “..is easy to understand and it provides exactly the info I am interested in when choosing a restaurant”

• 768 judgments (vs. 36 in our experiment)

Result: tailored preferred p<.05


Commercial application

• Product by CoGenTex (an NLG company) in 2003

• Recommender: explaining product recommendations for Active Decisions, the leading provider of web-based guided-selling solutions.


Conclusions

• Computational framework for generating and testing user-tailored evaluative arguments:
– Argumentation theory
– Decision Theory
– Computational Linguistics
– Interactive Data Exploration
– Social Psychology

• Independent experiments indicate that proposed tailoring influences user’s behavior/attitudes


Next Time (Mon Nov.3)

• Project proposal deadline (bring your write-up to class)

• Project proposal Presentation
– 6 min presentation + 3 min for questions
– For content, follow instructions at course project web page
– Bring 1 handout (copy of your slides)
– Please send me your presentation by Mon @12pm


Combining Measures of Behavior and Attitude

Record of user interaction
• Whether or not adopts new instance
• Position in Hot List

- Ranking (no equality)

Self-reports
• How much likes new instance
• How much likes (other) instances in Hot List

- Ranking with equality
- Measure of difference

a) How would you judge the houses in your Hot List?

The more you like the house the closer you should put a cross to “good choice”

1st house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
2nd house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
3rd house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
4th house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice

Combine self-reports in a single, precise measure: satisfaction z-score for new instance


Satisfaction z-score

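• For each subject compute:

$z\text{-score}(s_{ni}, S_{HL})$

- $s_{ni}$ is the measure of satisfaction with the new instance
- $S_{HL}$ is the set of measures of satisfaction with the instances in the Hot-List and the new instance

where, for any $x_i \in X$, with $\mu(X)$ the mean and $\sigma(X)$ the standard deviation of $X$:

$z\text{-score}(x_i, X) = \dfrac{x_i - \mu(X)}{\sigma(X)}$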


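Decision-Support: Research Space

[Figure: research space for decision support based on preferences. One dimension lists the tasks: Elicitation; Representation and Inference; Explain, Justify and Present. The other distinguishes Low-stakes vs. High-stakes and Individual vs. Group decisions. GEA and MATCH are placed in the space; other cells are marked “E”, “?” or “? (CF)”.]

Related work placed in the space:
- ValueCharts (J. Lloyd)
- “New” Decision Theory, AI (D. Poole)
- Collaborative Filtering (CF) (J. Smith, D. Poole)
- New Evaluation Measures for (CF) (R. Sharma)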


Natural Language Generation

• Goal:
– (computational model implemented as a) computer software which produces understandable texts in English or other human languages

• Input:
– some underlying non-linguistic representation of information

• Output:
– documents, reports, explanations, help messages, and other kinds of texts

• Knowledge sources required:
– knowledge of language and of the domain


Example z-score

a) How would you judge the houses in your Hot List?

The more you like the house the closer you should put a cross to “good choice”

1st house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : _X_ : good choice
2nd house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : _X_ : ___ : ___ : good choice
3rd house (NEW HOUSE)
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : _X_ : ___ : ___ : good choice
4th house
bad choice : ___ : ___ : ___ : ___ : ___ : _X_ : ___ : ___ : ___ : good choice

Ratings: 9, 7, 7, 6 (new house: 7); mean 7.25; standard deviation 1.25831

$z\text{-score}(7, \{9,7,7,6\}) = \dfrac{7 - \mu(\{9,7,7,6\})}{\sigma(\{9,7,7,6\})} = -0.1987 \approx -0.2$
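The example can be checked with the Python standard library. This is a sketch: `z_score` is a hypothetical helper, and it uses the sample standard deviation (`statistics.stdev`), which is the variant that reproduces the 1.25831 shown for the ratings 9, 7, 7, 6.

```python
import statistics


def z_score(x, xs):
    """Distance of x from the mean of xs, in sample standard deviations."""
    return (x - statistics.mean(xs)) / statistics.stdev(xs)


ratings = [9, 7, 7, 6]   # Hot List ratings, including the new house
new_house = 7            # rating given to the new house

print(round(statistics.mean(ratings), 2))      # mean of the ratings
print(round(statistics.stdev(ratings), 5))     # sample standard deviation
print(round(z_score(new_house, ratings), 4))   # satisfaction z-score
```

Normalizing by each subject's own ratings makes the scores comparable across subjects who use the scale differently.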
