Date post: | 02-Jan-2016 |
Category: |
Documents |
Upload: | loreen-goodman |
04/20/23 CPSC503 Winter 2008 1
CPSC 503 Computational Linguistics
Natural Language Generation
Lecture 15
Giuseppe Carenini
Knowledge-Formalisms Map for Generation
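[Figure: map of knowledge formalisms onto linguistic levels. Other labels: Understanding, Generation, Intended meaning, Discourse (English)]
Formalisms: State Machines; Rule systems (features and unification); Logical formalisms (First-Order Logics); AI planners
Linguistic levels: Morphology; Syntax; Semantics; Pragmatics; Discourse and Dialogue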
NLG Systems (see handout)
Input:
• Communicative Goals
• Domain Knowledge
• Context Knowledge
→ NLG System → Text
Examples
• FOG – Input: numerical data about future weather. Output: textual weather forecasts
• IDAS – Input: KB describing a piece of machinery (e.g., bike), user’s level of expertise. Output: hypertext help messages
• ModelExplainer – Input: OO model. Output: textual description of information on aspects of the model
• STOP – Input: user history and attitudes toward smoking. Output: personalized smoking cessation letter
GEA: the Generator of Evaluative Arguments
- NLG pipeline architecture
- Research methodology
- Extrinsic evaluation (vs. Intrinsic)
- Research / commercialization cycle
Four Basic Types of Persuasive Text - “Arguments” … (main claim)
• Factual Argument (e.g., Canada is the only country outside of Asia to record SARS-related deaths…)
• Causal Argument (e.g., Travelers from Hong Kong brought SARS to Toronto…)
• Recommendation (e.g., You should not go to China in the next few weeks…)
• Evaluative Argument (e.g., Some Asian governments were inefficient in stopping the SARS outbreak…)
Sample Textual Evaluative Arguments
Comparison: Vancouver is better than Seattle. There is less crime. Also, social services are more accessible.
Single entity: House-A is great! Although it is somewhat old, the house is spacious and is in an excellent location.
Evaluative Arguments: Importance
Natural Language Generation Theory: a model of an argument type that is pervasive in natural human communication.
The ability to generate evaluative arguments is crucial in large classes of systems:
– Personal assistants (e.g., travel advisor)
– Recommender systems (e.g., movie, book)
– Tutoring systems
– …
Limitations of Previous Research
[Ardissono and Goy 99] [Chu-Carroll and Carberry 98] [Elhadad 95] [Kolln 95] [Klein 94] [Morik 89]
• Focus on specific aspects of generation
– Selection of content
– Realization of content into language
• Lack of systematic evaluation
– proof-of-concept system
– analyzed on a few examples
Methodology
• Develop generator of evaluative arguments
– complete
– integrate and extend previous work
• Develop evaluation framework
• Perform experiment within framework to test generator
Outline
• Generator of Evaluative Arguments (GEA)
• Evaluation Framework
• Experiment
• More recent results from others
Text Generator Architecture
Pipeline: Text Planner → Text Plan → Text Micro-planner → Sentence Generator → English
Content Selection and Organization: Text Planner
Content Realization: Text Micro-planner, Sentence Generator
Communicative Strategies
Knowledge Sources:
- User Model
- Domain Model
Linguistic Knowledge Sources:
- Lexicon
- Grammar
GEA User Model
Argumentation Theory tells us [Miller 96, Mayberry 96]:
– Supporting (opposing) evidence depends on values and preferences of audience
– Evidence arranged according to importance (i.e., strength of support or opposition)
– Concise: only important evidence included
User Model must:
– Represent values and preferences of user
– Enable identification of supporting and opposing evidence
– Provide measure of evidence importance
… and can be elicited in practice ...
Model of User’s Preferences
Additive Multi-attribute Value Function (AMVF)
• Decision Theory and Psychology (Consumer’s Behavior)
• Can be elicited in practice [Edwards and Barron 1994]
OBJECTIVES (value tree, weights in parentheses)
HouseValue
– Location (0.7)
  – Neighborhood (0.4)
  – Park-Distance (0.6)
– Amenities (0.3)
  – Deck-Size (0.8)
  – Porch-Size (0.2)
COMPONENT VALUE FUNCTIONS
– Neighborhood (N), v1(N): Southside 0; Westend 0.6; Eastend 1
– Park-Distance (PD), v2(PD): 1 - (1/5 * PD) for 0 <= PD <= 5; 0 for PD > 5
– Deck-Size (DS), v3(DS): (1/80 * DS) for 0 <= DS <= 80
– Porch-Size (PS), v4(PS): (1/60 * PS) for 0 <= PS <= 60
AMVF application
House-A: Neighborhood = Westend; Park-Distance = 0.5 km; Deck-Size = 20 m2; Porch-Size = 36 m2
Objective (weight): value for House-A
HouseValue: 0.64
– Location (0.7): 0.78
  – Neighborhood (0.4): 0.6
  – Park-Distance (0.6): 0.9
– Amenities (0.3): 0.32
  – Deck-Size (0.8): 0.25
  – Porch-Size (0.2): 0.6
[Figure legend: each objective is marked + (Likes it) or − (Does not like it)]
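A minimal Python sketch of how the AMVF could be applied to House-A. The assignment of weights to objectives is an assumption inferred from the numbers on the slide (it reproduces 0.78, 0.32 and 0.64); the names `WEIGHTS`, `TREE`, `COMPONENT` and `value` are illustrative and not part of GEA.

```python
# Assumed weight of each objective relative to its parent (inferred from the slide values).
WEIGHTS = {
    "Location": 0.7, "Amenities": 0.3,
    "Neighborhood": 0.4, "Park-Distance": 0.6,
    "Deck-Size": 0.8, "Porch-Size": 0.2,
}

# Value tree: each non-leaf objective lists its children.
TREE = {
    "HouseValue": ["Location", "Amenities"],
    "Location": ["Neighborhood", "Park-Distance"],
    "Amenities": ["Deck-Size", "Porch-Size"],
}

# Component value functions from the slide; behavior outside the
# stated ranges for deck and porch size is not specified there.
COMPONENT = {
    "Neighborhood": lambda n: {"Southside": 0.0, "Westend": 0.6, "Eastend": 1.0}[n],
    "Park-Distance": lambda pd: 1 - pd / 5 if 0 <= pd <= 5 else 0.0,
    "Deck-Size": lambda ds: ds / 80,
    "Porch-Size": lambda ps: ps / 60,
}


def value(objective, house):
    """Leaf: apply the component value function. Otherwise: weighted sum of children."""
    if objective in COMPONENT:
        return COMPONENT[objective](house[objective])
    return sum(WEIGHTS[child] * value(child, house) for child in TREE[objective])


house_a = {"Neighborhood": "Westend", "Park-Distance": 0.5,
           "Deck-Size": 20, "Porch-Size": 36}
print(round(value("HouseValue", house_a), 2))  # 0.64
```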
Supporting and Opposing Evidence
[Figure: House-A value tree with the same weights and values as in the AMVF application (House-A: n2, 0.5 km, 20 m2, 36 m2). Each objective is marked + (Likes it) or − (Does not like it); each link is marked + (Supporting) or − (Opposing)]
[Table: relation between Parent(o) and o, either supporting or opposing, for each combination of their +/− marks]
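One way to read the relation table, sketched in Python. Two assumptions are made here that the slide does not state explicitly: an objective counts as + (liked) when its value is above 0.5, and an objective supports its parent when both carry the same mark. The function names are hypothetical.

```python
def mark(v, threshold=0.5):
    """Assumed rule: '+' (likes it) above the threshold, '-' otherwise."""
    return "+" if v > threshold else "-"


def relation(parent_value, child_value):
    """Assumed rule: same mark as the parent means supporting, otherwise opposing."""
    return "supporting" if mark(parent_value) == mark(child_value) else "opposing"


# House-A values from the AMVF application slide.
print(relation(0.64, 0.78))  # HouseValue vs Location
print(relation(0.64, 0.32))  # HouseValue vs Amenities
print(relation(0.32, 0.25))  # Amenities vs Deck-Size
print(relation(0.32, 0.6))   # Amenities vs Porch-Size
```

Under these assumptions, a low Deck-Size value counts as supporting evidence for the (negative) evaluation of Amenities.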
Measure of Importance [Klein 94]
[Figure: House-A value tree (same weights and values as in the AMVF application) annotated with the importance of each objective: Location 0.55, Amenities 0.2, Neighborhood 0.24, Park-Distance 0.54, Deck-Size 0.6, Porch-Size 0.12]
For each objective o:
\[ \mathrm{Importance}(o) = w_o \cdot \max[\, v_o,\; 1 - v_o \,] \]
[Plot: \max[v_o, 1 - v_o] as a function of v_o, with axis ticks at 0, 0.5 and 1]
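The importance measure can be checked against the House-A numbers with a short Python sketch. The pairing of weights and values with objectives is an assumption inferred from the slide figures, and `importance` is an illustrative name.

```python
def importance(weight, value):
    """Importance(o) = w_o * max(v_o, 1 - v_o)."""
    return weight * max(value, 1 - value)


# (weight, value) per objective for House-A; the pairing is inferred, not stated.
house_a = {
    "Location": (0.7, 0.78),
    "Amenities": (0.3, 0.32),
    "Neighborhood": (0.4, 0.6),
    "Park-Distance": (0.6, 0.9),
    "Deck-Size": (0.8, 0.25),
    "Porch-Size": (0.2, 0.6),
}

for name, (w, v) in house_a.items():
    print(name, round(importance(w, v), 2))
```

Values far from 0.5 in either direction raise importance, so strongly disliked features (such as Deck-Size here) rank as high as strongly liked ones.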
Why AMVF? - summary
An AMVF
• Represents user’s values and preferences
• Enables identification of supporting and opposing evidence
• Provides measure of evidence importance
– Evidence arranged according to importance
– Concise arguments can be generated
• Can be elicited in practice
GEA Architecture
Pipeline: Text Planner → Text Plan → Text Micro-planner → Sentence Generator → English
Content Selection and Organization: Text Planner
Content Realization: Text Micro-planner, Sentence Generator
Communicative Strategies
Knowledge Sources:
- User Model (AMVF)
- Domain Model
Linguistic Knowledge Sources:
- Lexicon
- Grammar
Argumentative Strategy
Selection: include only “important” evidence (i.e., above threshold on measure of importance)
Organization:
(1) Main Claim (e.g., “This house is interesting”)
(2) Opposing evidence
(3) Most important supporting evidence
(4) Further supporting evidence -- ordered by importance with strongest last
Strategy applied recursively on supporting evidence
Based on guidelines from argumentation theory [Miller 96, Mayberry 96]
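A flat (non-recursive) Python sketch of the selection and organization steps. The threshold value, the ordering of opposing evidence, the `Evidence` class and the `organize` function are all assumptions made for illustration; the "View" item is hypothetical, and the importance figures are used only as sample inputs.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    name: str
    importance: float
    supporting: bool


def organize(main_claim, evidence, threshold=0.2):
    # Selection: keep only evidence above the importance threshold (value assumed).
    selected = [e for e in evidence if e.importance > threshold]
    # Opposing evidence order is not specified on the slide; strongest first is assumed.
    opposing = sorted((e for e in selected if not e.supporting),
                      key=lambda e: e.importance, reverse=True)
    # Supporting evidence sorted weakest to strongest.
    supporting = sorted((e for e in selected if e.supporting),
                        key=lambda e: e.importance)
    plan = [main_claim]                                # (1) main claim
    plan += [e.name for e in opposing]                 # (2) opposing evidence
    if supporting:
        plan.append(supporting[-1].name)               # (3) most important supporting
        plan += [e.name for e in supporting[:-1]]      # (4) the rest, strongest last
    return plan


evidence = [
    Evidence("Park-Distance", 0.54, True),
    Evidence("Deck-Size", 0.60, False),
    Evidence("Neighborhood", 0.24, True),
    Evidence("Porch-Size", 0.12, True),
    Evidence("View", 0.30, True),  # hypothetical item, not in the slides
]
print(organize("This house is interesting", evidence))
```

The recursive application of the strategy to each piece of supporting evidence is left out of this sketch.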
Sample GEA Text Plan
[Figure: text plan tree for an EVALUATIVE ARGUMENT. Node labels: MAIN-CLAIM, SUB-CLAIM, OPPOSING EVIDENCE, SUPPORTING EVIDENCE. Link types: decomposition, ordering, rhetorical relations]
Plan nodes:
(VALUE (House-A) 0.72)
(VALUE (Location) 0.7)
(VALUE (distance-from-park 1.8m) 0.3)
(VALUE (distance-from-work 1mi) 0.75)
(VALUE (distance-from-rap-trans 0.5 mi) 0.75)
GEA Architecture
Pipeline: Text Planner → Text Plan → Text Micro-planner → Sentence Generator → English
Content Selection and Organization: Text Planner
Content Realization: Text Micro-planner, Sentence Generator
Communicative Strategies: Argumentative Strategy
Knowledge Sources:
- User Model (AMVF)
- Domain Model
Linguistic Knowledge Sources:
- Lexicon
- Grammar
Text Micro-Planner
• Aggregation: combining multiple propositions in one
single sentence [Shaw 98]
• Lexicalization:
– Scalar Adjectives (e.g., nice, far, convenient) [Elhadad 93]
– Discourse cues (e.g., although, because, in fact) [Knott 96; Di Eugenio, Moore and Paolucci 97]
– Pronominalization: deciding whether to use a pronoun to refer to an entity (centering [Grosz,Joshi and Weinstein 95])
Aggregation (Logical Forms)
• Conjunction via shared participants
“House B-11 is far from a shopping area” +
“House B-11 is far from public transportation” =
“House B-11 is far from a shopping area and public transportation”.
• Syntactic embedding
“House B-11 offers a nice view” +
“House B-11 offers a view on the river” =
“House B-11 offers a nice view on the river”.
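A toy Python sketch of conjunction via shared participants. GEA aggregates logical forms; here, as a simplifying assumption, each proposition is a plain (subject, predicate, object) string triple, and `aggregate_shared` is a hypothetical helper.

```python
def aggregate_shared(propositions):
    """Merge triples that share subject and predicate into one sentence."""
    merged = {}
    for subject, predicate, obj in propositions:
        merged.setdefault((subject, predicate), []).append(obj)
    sentences = []
    for (subject, predicate), objects in merged.items():
        if len(objects) == 1:
            joined = objects[0]
        else:
            # Conjoin the non-shared participants: "a, b and c".
            joined = ", ".join(objects[:-1]) + " and " + objects[-1]
        sentences.append(f"{subject} {predicate} {joined}.")
    return sentences


props = [
    ("House B-11", "is far from", "a shopping area"),
    ("House B-11", "is far from", "public transportation"),
]
print(aggregate_shared(props))
```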
Scalar Adjectives Selection
Objectives: HOUSE-LOCATION, HAS_PARK_DISTANCE, HAS_COMMUTING_DISTANCE, HAS_SHOPPING_DISTANCE, HOUSE-AMENITIES, …
Adjective selected for HOUSE-LOCATION (“The house has … location”):
Value > 0.8: an excellent
0.65 < Value < 0.8: a convenient
0.5 < Value < 0.65: a reasonable
0.35 < Value < 0.5: an average
0.2 < Value < 0.35: a bad
Value < 0.2: a terrible
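A small Python sketch of threshold-based adjective selection for the location objective. The pairing of adjectives with intervals assumes the adjectives are ordered from best to worst, and the treatment of values that fall exactly on a boundary is also an assumption (the slide uses strict inequalities throughout). `location_phrase` is a hypothetical name.

```python
# (lower bound, phrase) pairs, checked from the highest interval down.
ADJECTIVES = [
    (0.8, "an excellent"),
    (0.65, "a convenient"),
    (0.5, "a reasonable"),
    (0.35, "an average"),
    (0.2, "a bad"),
]


def location_phrase(value):
    """Pick the phrase for the first interval whose lower bound the value exceeds."""
    for lower_bound, phrase in ADJECTIVES:
        if value > lower_bound:
            return f"The house has {phrase} location"
    return "The house has a terrible location"


print(location_phrase(0.78))  # The house has a convenient location
```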
Discourse Cues Selection
[Table: discourse cue selection. Columns: Rel-type, Type-of-nesting, Typed-ordering, Discourse cue. Relation and nesting values shown: CONCESSION, EVIDENCE, SEQUENCE, ROOT]
Recoverable rows (typed ordering: discourse cue):
("CORE" "CONCESSION" "EVIDENCE"): Although (placed on contributor)
("CORE" "CONCESSION" "EVIDENCE"): Even though (placed on contributor)
("CONCESSION" "CORE" "EVIDENCE"): However (placed on core)
Simple Pronominalization Strategy inspired by Centering Theory
Centering tells us: entity providing link preferentially realized as pronoun (within a discourse segment)
Our Strategy:
• Within a discourse segment, successive references are always realized as a pronoun
• First reference in a segment is a definite description, unless
– the segment boundary is explicitly marked by a discourse cue, and
– no pronoun was used in the previous sentence
“House B-11 is an interesting house. In fact, it has a reasonable…”.
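The pronominalization strategy amounts to a small decision rule, sketched here in Python. The function name and its boolean parameters are hypothetical; how GEA detects segment boundaries is not modelled.

```python
def referring_expression(name, first_in_segment,
                         boundary_marked_by_cue, pronoun_in_previous_sentence):
    """Return 'it' or the full description, following the strategy on the slide."""
    if not first_in_segment:
        # Successive references within a segment are always pronouns.
        return "it"
    if boundary_marked_by_cue and not pronoun_in_previous_sentence:
        # Exception: the cue marks the boundary, so a pronoun is still used.
        return "it"
    # Default for the first reference in a segment.
    return name


# Second sentence of the slide example: a new segment introduced by "In fact",
# with no pronoun in the previous sentence.
print(referring_expression("House B-11", True, True, False))  # it
```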
Output of MicroPlanning
Sequence of Lexicalized Functional Descriptions (LFDs)
Example: “House B-11 is close to shops and reasonably close to work”
((CAT CLAUSE)
 (PROCESS ((TYPE ASCRIPTIVE) (MODE ATTRIBUTIVE)((POLARITY POSITIVE(EPISTEMIC-MODALITY NONE)))
 (PARTICIPANTS ((CARRIER((CAT NP)(COMPLEX APPOSITION) (RESTRICTIVE YES)
 (DISTINCT ((AND ((CAT COMMON)(DENOTATION ZERO-ARTICLE-THING)(HEAD ((LEX "house")))) ((CAT PROPER) (LEX "B-11")))(CDR NONE))))
 (ATTRIBUTE (AND((CAT AP)(HEAD ((CAT ADJ)(LEX "close")))
 (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to"))) (NP((CAT COMMON) (NUMBER PLURAL)(DEFINITE NO) (HEAD ((CAT NOUN) (LEX "shop")))))))))
 ((CAT AP)(HEAD ((CAT ADJ)(LEX "reasonably close")))
 (QUALIFIER ((CAT PP) (PREP ((CAT PREP) (LEX "to"))) (NP ((CAT COMMON)(DEFINITE NO) (HEAD ((CAT NOUN)(LEX "work"))))))) )))))))))))
Last Step: Sentence Generator
• Unify LFDs with large grammar of English (FUF/SURGE [Elhadad 93, Robin 94])
– fill in syntactic constraints (e.g., agreement, ordering)
– choose closed class words (e.g., prepositions, articles)
• Apply morphology
• Linearize as English sentences
GEA Highlights
• GEA implements a computational model of generating evaluative arguments
• All aspects covered in a principled way:
– argumentation theory (argumentative strategy and requirements on user model)
– decision theory (user model and elicitation method)
– computational linguistics (architecture, micro-planning techniques and sentence generator)
Outline
• Generator of Evaluative Arguments (GEA)
• Evaluation Framework
• Experiment
• More recent results from others
Evaluation Framework: Task Efficacy
Subtask 1
– User presented with info about set of alternatives
– Select preferred N alternatives
– Order them by preference
– Result: Hot List (1st best, 2nd best, …, nth best)
NewInstance is created
Subtask 2
– User presented with evaluative argument about NewInstance
– Include NewInstance in Hot List? (YES / NO); if YES, where?
Fill out final questionnaire
End
Selection Task in Real-Estate
• Why Real-Estate?
– No sophisticated background or expertise
– But still presents challenging decision task
• Instructions
– Move to new town
– Buy house
– Use system for data exploration
Data Exploration System
Argument is presented…
Measures of Effectiveness
• Behavior and Attitude change
– Record of user actions
  • Whether or not adopts new instance
  • Position in Hot List
– Final Questionnaire
  • How much likes new instance
  • How much likes the instances in Hot-List
  → Satisfaction Z-score
• Others (Final questionnaire)
– Decision Confidence
– Decision Rationale
SAMPLE SELF-REPORT
How would you judge the new house?
The more you like the house the closer you should put a cross to “good choice”
bad choice : ___ : ___ : ___ : ___ : __ : ___ : ___ : ___ : ___: good choiceX
Outline
• Generator of Evaluative Arguments (GEA)
• Evaluation Framework
• Experiment
• More recent results from others
Two Empirical Questions
• Arguments should be concise.
Conciseness can be varied, but….
What is the optimal level of conciseness?
• Argument content, structure and phrasing tailored to user-specific AMVF, but . . .
Does this tailoring actually contribute to argument effectiveness?
Experimental Conditions
• Tailored-Concise (~ 50% of objectives)
• Tailored-Verbose (~ 80% of objectives)
• Non-Tailored-Concise (~ 50% of objectives)
• No-Argument
Experimental Hypotheses
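[Figure: hypothesized ordering of the four conditions by argument effectiveness. Conditions: Tailored-Concise, Non-Tailored-Concise, Tailored-Verbose, No-Argument. Three pairwise comparisons are marked “>” and three are marked “?”]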
Experimental Procedure
PHASE 1: Online questionnaire to acquire preferences (AMVF with 19 objectives, 3 layers) [Edwards and Barron 1994]
PHASE 2: Each subject
- is randomly assigned to a condition
- interacts with the evaluation framework
- fills out the final questionnaire
40 subjects (10 for each condition)
House-value
Location
Quality
Amenities
Distance-park
Distance-work
Distance-shopping
Garden-Size
Porch-size
Neighborhood
Distance-rapid-trans
Crime
Deck-size
#-of-bars
Street-traffic
Appearance-quality
View-quality
Architectural-style (modern, deco, victorian)
View-object (river, park, university, houses)
AMVF used in the experiment
Satisfaction Z-score
Decision Confidence
Decision Rationale
Experiment Results
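Results: Satisfaction Z-score
[Figure: pairwise comparisons among Tailored-Concise, Non-Tailored-Concise, Tailored-Verbose and No-Argument, each marked “>”. Values shown: 0.28, 0.28, 0.05, 1. Significance levels shown: p=0.08, p=0.02, p=0.08 (Dunnett test)]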
Summary
Generator of Evaluative Arguments (GEA): generates concise arguments tailored to a model of the user’s preferences (AMVF)
Evaluation Framework
– Basic decision tasks
– Evaluate wide range of generation techniques
Experiment
– Differences in conciseness influence effectiveness
– Tailoring to AMVF seems to be effective
Future Work (in 2001!)
Extend Argument Generator
– More Complex Textual Arguments
– Speech
– Other domains
– Other languages
– Arguments combining text and graphics
More Experiments to test:
– Whether tailoring to AMVF is actually effective
– Extensions
AT&T MATCH system
Multimodal Access to City Help (MATCH)
Multimodal interface
• Portable Fujitsu tablet
• Input: Pen for deictic gestures and Speech input
• Output: Text, Speech and graphics
(AT&T: Johnston, Ehlen, Bangalore, Walker, Stent, Maloor and Whittaker 2002)
MATCH Example:
User: “Show me Italian restaurants in the West Village”
User: “Compare”
• Comparison: evaluative argument comparing at most five alternatives (reasons for choosing each of them)
User: “Recommend”
• Recommendation: evaluative argument about the best alternative
MATCH generates responses using techniques inspired by GEA
MATCH Evaluation [CogSci 2004]
• 16 subjects “overheard” 4x2 dialogues each about selecting a restaurant
• In each dialogue 6 arguments are generated (3 tailored and 3 non-tailored)
• Subjects rate each argument’s information quality on a 0-5 scale: “..is easy to understand and it provides exactly the info I am interested in when choosing a restaurant”
• 768 judgments (vs. 36 in our experiment)
Result: tailored preferred, p<.05
Commercial application
• Product by CoGenTex (an NLG company) in 2003
• Recommender
Explaining product recommendations for Active Decisions, the leading provider of web-based guided-selling solutions.
Conclusions
• Computational framework for generating and testing user-tailored evaluative arguments:
– Argumentation theory
– Decision Theory
– Computational Linguistics
– Interactive Data Exploration
– Social Psychology
• Independent experiments indicate that the proposed tailoring influences user’s behavior/attitudes
Next Time (Mon Nov.3)
• Project proposal deadline (bring your write-up to class)
• Project proposal Presentation
– 6 min presentation + 3 min for questions
– For content, follow instructions at course project web page
– Bring 1 handout (copy of your slides)
– Please send me your presentation by Mon @12pm
Combining Measures of Behavior and Attitude
Record of user interaction
• Whether or not adopts new instance
• Position in Hot List
- Ranking (no equality)
- Ranking with equality
- Measure of difference
Self-reports
• How much likes new instance
• How much likes (other) instances in Hot List
a) How would you judge the houses in your Hot List?
The more you like the house the closer you should put a cross to “good choice”
1st house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
2nd house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
3rd house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
4th house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : good choice
Combine self-reports in a single, precise measure: satisfaction z-score for new instance
Satisfaction z-score
• For each subject compute:
\[ \text{z-score}(sni, SHL) \]
- sni is the measure of satisfaction with the new instance
- SHL is the set of measures of satisfaction with the instances in the Hot-List and the new instance
\[ \forall x_i \in X: \quad \text{z-score}(x_i, X) = \frac{x_i - \mu(X)}{\sigma(X)} \]
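A short Python sketch of the satisfaction z-score, using the standard library. It assumes the sample standard deviation (`statistics.stdev`), which matches the 1.25831 reported for the ratings 9, 7, 7, 6 in the worked example in these slides; `satisfaction_z_score` is an illustrative name.

```python
from statistics import mean, stdev


def satisfaction_z_score(sni, shl):
    """z-score of the new-instance rating sni within the ratings shl (which include sni)."""
    return (sni - mean(shl)) / stdev(shl)


# Ratings for the four houses in the Hot List; the new house was rated 7.
ratings = [9, 7, 7, 6]
print(round(satisfaction_z_score(7, ratings), 4))  # -0.1987
```

A negative score means the new instance was rated below the average of the houses in the Hot List.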
Decision-Support: Research Space
[Figure: decision-support research space for (Preferences). Stages: Elicitation; Representation and Inference; Explain, Justify and Present. Other dimensions: Low-stakes vs. High-stakes; Individual vs. Group. Cells are marked “E” or “?”. Items placed in the space: GEA; MATCH; ValueCharts (J. Lloyd); “New” Decision Theory; AI (D. Poole); Collaborative Filtering (CF) (J. Smith, D. Poole); New Evaluation Measures for (CF) (R. Sharma)]
Natural Language Generation
• Goal:
– (computational model implemented as a) computer software which produces understandable texts in English or other human languages
• Input:
– some underlying non-linguistic representation of information
• Output:
– documents, reports, explanations, help messages, and other kinds of texts
• Knowledge sources required:
– knowledge of language and of the domain
Example z-score
a) How would you judge the houses in your Hot List?
The more you like the house the closer you should put a cross to “good choice”
1st house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : ___ : ___ : X : good choice
2nd house
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : X : ___ : ___ : good choice
3rd house (NEW HOUSE)
bad choice : ___ : ___ : ___ : ___ : ___ : ___ : X : ___ : ___ : good choice
4th house
bad choice : ___ : ___ : ___ : ___ : ___ : X : ___ : ___ : ___ : good choice
Ratings: 9, 7, 7, 6; new house: 7; mean: 7.25; standard deviation: 1.25831
\[ \text{z-score}(7, \{9, 7, 7, 6\}) = \frac{7 - \mu(\{9, 7, 7, 6\})}{\sigma(\{9, 7, 7, 6\})} = -0.1987 \approx -0.2 \]