Qualitative and Analogical modeling of cultural reasoning


Kenneth D. Forbus

Emmett Tomai

Morteza Dehghani

Qualitative Reasoning Group

Northwestern University

Our overall approach

Story Workbench (w/ MIT)

Qualitative Concept Maps

Interviews, surveys, & cultural stories collected

Produce models via analogical generalization, predictions via simulation

Predicate calculus representations of stories, explanations

Catalyze research by speeding encoding.

Improve results by decreasing tailorability.

Eventually, practical modeling tools for analyst & decision-maker support.

Overview

• Key ideas

– Qualitative modeling

– Analogical reasoning and learning

– Practical natural language processing

• Modeling cultural models of food webs

– Qualitative models to capture content

– Analogical modeling to gain insights, construct classifier

• Modeling blame assignment (leave out, lack of time)

– Qualitative model of attribution theory

• Modeling moral decision-making

– Sacred versus secular values represented via qualitative order of magnitude representations

– Input representations via natural language

Qualitative Modeling

• Formalizes intuitive knowledge of systems with continuous aspects

– Levels of knowledge range from the person on the street to expert scientists and engineers

• Has been used in wide range of scientific and engineering modeling

– Design of mechanical, electrical, and hybrid systems, modeling ecosystems, modeling genetic regulation mechanisms

• Has been used as formalism for human mental models

– Cognitive modeling efforts, new educational systems

• Offers useful level of precision for social science work

– Example: Responsibility ∝Q− Coercion

Building Blocks for Analogical Processing

SME models analogical matching (figure labels: Base, Target)

• Consistent with large body of psychological evidence

• Has been used to make novel psychological predictions

• Has been used in performance systems

SEQL models generalization from examples (figure label: Generalize)

• Used to model several learning experiments

• Used to make novel psychological predictions

MAC/FAC models similarity-based retrieval (figure label: Retrieve)

• Does not require hand-indexing of descriptions

• Used to model several psychological experiments

• Has been used in a performance system

[Figure: MAC/FAC retrieval. Labels: memory pool; CVmatch (cheap, fast, non-structural); output = memory item + SME results.]
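The two-stage retrieval idea behind MAC/FAC can be illustrated with a minimal sketch: a cheap content-vector comparison filters the memory pool, and a more expensive comparison runs only on the survivors. This is not the QRG implementation. The function names, the `keep` parameter, the toy memory, and especially `structural_score` (a crude stand-in for SME that just counts shared facts) are assumptions made for illustration.

```python
from collections import Counter

def content_vector(facts):
    """Count how often each predicate occurs (cheap, non-structural summary)."""
    return Counter(fact[0] for fact in facts)

def dot(u, v):
    """Dot product of two sparse count vectors."""
    return sum(count * v[key] for key, count in u.items())

def structural_score(probe, item):
    """Stand-in for SME (an assumption): number of facts shared verbatim."""
    return len(set(probe) & set(item))

def mac_fac(probe, memory, keep=2):
    """Return (name, score) of the best memory item for the probe."""
    probe_cv = content_vector(probe)
    # First stage: rank every memory item by the cheap dot product.
    ranked = sorted(
        memory,
        key=lambda name: dot(probe_cv, content_vector(memory[name])),
        reverse=True,
    )
    survivors = ranked[:keep]
    # Second stage: run the expensive comparison only on the survivors.
    best = max(survivors, key=lambda name: structural_score(probe, memory[name]))
    return best, structural_score(probe, memory[best])

# Toy memory pool (invented for this sketch).
memory = {
    "heat-flow": [("flows", "heat", "brick", "room"),
                  ("greater", "temp-brick", "temp-room")],
    "water-flow": [("flows", "water", "beaker", "vial"),
                   ("greater", "pressure-beaker", "pressure-vial")],
    "food-web": [("eats", "bear", "berry"), ("eats", "deer", "plant")],
}
probe = [("flows", "heat", "brick", "room"),
         ("greater", "temp-brick", "temp-air")]
retrieved = mac_fac(probe, memory)
```

The first stage never looks at argument structure, which is why it can be run over the whole pool without hand-indexing; only the short list it returns pays the cost of the second stage.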

Practical NL Processing

• Most cognitive simulations have used hand-coded representations

• Problematic

– Tailorability

– Scaling up hard

• An alternative: Semi-automatic NL processing

– Simplified English eases parsing, semantic interpretation

[Figure: original workflow. Stimulus text, then hand translation or tagging; Human Results alongside Simulation Results.]

"As a result of a dam on a river, 20 species of fish are threatened with extinction. By opening the dam for a month each year, you can save these species, but 2 species downstream will become extinct because of the changing water level."

New Workflow

[Figure: new workflow. Stimulus text, then hand translation to simplified English, then Story Workbench (semiautomatic practical NLU using the QRG-CE controlled language), yielding predicate calculus versions of stimuli, background knowledge, stories…; Human Results alongside Simulation Results.]

"Because of a dam on a river, 20 species of fish will be extinct. You can save them by opening the dam. The opening would cause 2 species of fish to be extinct."

The EA natural language system

Novel combination of off-the-shelf components:

• ResearchCyc KB contents (1.2M facts)

• Comlex lexicon

• Allen’s parser

• DRT-based semantic interpreter

Originally developed by Kuehne (2004) for modeling roles of qualitative representations in NL semantics

[Figure: EA NLU architecture. Labels: QRG-CE grammar; COMLEX lexicon; Parsing; Semantic frames; Sentence interpretation; Word-sense disambiguation; Semantic role assignment; Quantifier scoping; Intra-sentential anaphora resolution; Discourse interpretation; Anaphora resolution; Sentence attachment; Temporal ordering; Situation reification; Task and domain specific reasoning; QP frame construction; QP frame and process.]

Cultural Models of Food Webs

• How groups conceive of relationships in the natural world

• Experiments carried out by Medin’s group:

– Participants given scenario about a perturbation to a population.

– Asked to predict effects

• Example:

E: Do you think that the disappearance in the bears would affect other plants and animals in the forest?

P: Well, there probably be a lot more berry growth for example, because they wouldn’t be eating the berries. There probably would be a lot less maybe dead trees, because they won’t be climbing on the trees and shredding them.

• QCM provides a scientist-friendly interface for encoding causal models using Qualitative Process (QP) Theory (Forbus, 1984)

• QCM uses a concept map interface (Novak & Gowin, 1984)

• QCM automatically checks for modeling errors, provides detailed feedback.

Bears Disappearing Example

Bears eat berries

there probably be a lot more berries because they wouldn’t be eating the berries

Number of bears influences the eating rate

Number of berries increases as eating rate decreases
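The causal chain in this example can be read as signed influences and propagated mechanically. The following is a minimal sketch of that reading, not QCM itself: the quantity names, the `INFLUENCES` table, and the `propagate` helper are hypothetical, and the sketch ignores the ambiguity that arises when opposing influences reach the same quantity.

```python
# Signed influences read off the example: (cause, effect, sign).
INFLUENCES = [
    ("bears", "eating_rate", +1),    # more bears -> higher eating rate
    ("eating_rate", "berries", -1),  # higher eating rate -> fewer berries
]

def propagate(changed, direction, influences=INFLUENCES):
    """Propagate a qualitative change (+1 or -1) through signed influences."""
    effects = {changed: direction}
    frontier = [changed]
    while frontier:
        cause = frontier.pop()
        for source, target, sign in influences:
            if source == cause and target not in effects:
                # The effect moves with the cause (sign +1) or against it (sign -1).
                effects[target] = effects[cause] * sign
                frontier.append(target)
    return effects

bears_disappearing = propagate("bears", -1)
bears_doubling = propagate("bears", +1)
```

With bears decreasing, the eating rate decreases and the berries increase, which matches the participant's prediction quoted above.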

• Experiment: Automatic classification of the models based on the cultural group they belonged to

• Data: Interviews collected by Medin’s group

Experiment: Detecting culture via models

[Figure: Menominee example models (Menominee0 to Menominee3) are fed to SEQL to produce a Menominee generalization; SEQL likewise produces a European American generalization. A new example is compared against the generalizations: "Example X is more similar to culture Y".]

Construct generalizations from concept maps

Measure similarity of new model to generalizations about cultures

Results: Culture classification via analogical generalization

• 81 models encoded using QCM, in response to 3 food web scenarios

• Results are averaged over 1000 trials. Trial = 4 models from each group chosen randomly for generalization, 8-10 models randomly chosen for test set

• Conjecture for improving accuracy: Increase uniformity in follow-up questions during interviews.

Scenario              Menominee   Non-Menominee   Overall Accuracy
Bears Disappearing    65%         57%             61%
Bears Doubling        82%         52%             67%
Poplar Disappearing   64%         64%             64%

I.e., if a test model for Bears Disappearing was from Menominee, the system correctly categorized it 65% of the time.
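The trial procedure (generalize from four randomly chosen models per group, then label each test model by its most similar generalization) can be sketched as follows. This is a loose illustration under stated assumptions, not the system used in the experiment: `generalize` is a frequency-threshold stand-in for SEQL, `similarity` is Jaccard overlap standing in for an SME score, the test set is simply the models left over rather than 8-10 sampled ones, and the toy data is invented.

```python
import random

def generalize(models, threshold=0.5):
    """Stand-in for SEQL: keep facts found in at least `threshold` of the models."""
    counts = {}
    for model in models:
        for fact in model:
            counts[fact] = counts.get(fact, 0) + 1
    return {fact for fact, n in counts.items() if n / len(models) >= threshold}

def similarity(model, generalization):
    """Stand-in for an SME similarity score: Jaccard overlap of fact sets."""
    union = model | generalization
    return len(model & generalization) / len(union) if union else 0.0

def run_trial(groups, rng, n_train=4):
    """One trial: generalize from n_train random models per group, test on the rest."""
    gens, tests = {}, []
    for label, models in groups.items():
        shuffled = rng.sample(models, len(models))
        gens[label] = generalize(shuffled[:n_train])
        tests += [(label, model) for model in shuffled[n_train:]]
    correct = sum(
        max(gens, key=lambda g: similarity(model, gens[g])) == label
        for label, model in tests
    )
    return correct / len(tests)

# Invented, perfectly separable toy data: one shared fact, one group fact, one noise fact.
toy_groups = {
    "A": [frozenset({"shared", "a", f"noise_a{i}"}) for i in range(6)],
    "B": [frozenset({"shared", "b", f"noise_b{i}"}) for i in range(6)],
}
rng = random.Random(0)
accuracy = sum(run_trial(toy_groups, rng) for _ in range(1000)) / 1000
```

Averaging over many random splits, as the slide describes, keeps a single lucky or unlucky choice of training models from dominating the reported accuracy.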

Can Inspect Generalizations for Insights

• The number of facts that were consistent across individuals was higher in Menominee models

• The number of consistent causal relations was higher among Menominee

– On average, there were 24 facts found consistently across all Menominee models vs. 16 facts for non-Menominee

– On average, Menominee models contained 4 causal relations whereas non-Menominee models only contained 2.

Experiment: Classification via expertise

• What can we learn from automatic classification of the models based on the level of expertise?

– Hunters and fishermen are considered experts within this domain

• Analogical processing results:

– Classifying experts from non-experts within Menominee models: 72.5%.

– Classifying experts from non-experts within European American models: 52% (almost chance)

• Suggests Menominee are more influenced by their daily activities than European Americans

• Consistent with independent manual analysis

– Menominee hunters more likely than Menominee non-hunters to mention ecological relations (19.8 vs 10.14, p < 0.01)

– No significant difference between # ecological relations mentioned by hunters versus non-hunters for European-Americans (16.08 vs 16.22, p < 0.97)

Next Steps

• Possible source of noise: Degree of follow-up questioning varied in interviews

– Working with Medin’s group to figure out practical protocols to get more uniform data

• Closing the loop: Making predictions from automatically generated models

– The same participants could be re-interviewed, or more individuals from same group, depending on level of modeling

– Use initial interviews for gathering training set

– Construct generalizations, make predictions

– Conduct more interviews to test predictions

Modeling Moral Decision-Making

• Goal: Model effects found in literature on moral dilemmas (e.g., Trolley Problem)

– Sacred values vs secular values

– Quantity sensitivity

– Differences in group responses

• Given: a scenario S, outcomes A, B that depend on what action is taken

• Predict: which action someone would prefer

MoralDM model

[Figure: MoralDM architecture. Labels: New Dilemma; Prior Cases w/ Decisions; Protected Values; Utility Calculator; Decision; Rules for extracting relevant quantities, producing valuations; Cultural or life stories; annotation "Vary according to".]

Protected Values and Quantity Insensitivity

• Protected values (PVs) concern acts and not outcomes

• People with PVs show insensitivity to quantity of outcomes (Baron and Spranca 1997, Lim and Baron 1997)

– In trade-off situations, they are less sensitive to the consequences of their choices

• Quantity insensitivity of PVs is not absolute (Bartels and Medin 2007)

Modeling Protected Values

• Idea: Use order of magnitude formalism from QR to model protected versus secular values

– Introduces stratification in values

– Degree of stratification can be varied to model context effects

[Figure: example values: 1 child; $1,000; $1,000,000.]

Order of Magnitude

• Based on Dague (1988) formalization

– A ~K B ⇔ |A − B| ≤ K · max(|A|, |B|)

• A and B are equivalent

– A ~!K B ⇔ |A − B| > K · max(|A|, |B|)

• A and B are comparable – magnitudes tell which is greater

– A «K B ⇔ |A| < K · |B|

• B dominates A, or A negligible w.r.t. B

• K determines how stratified values are

• K can be adjusted to account for different sensitivities towards consequences

– K = 1/10: 20 > 15

– K = 1/3: 20 ~K 15

– K = 2: 20 >>K 15
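The three relations can be written down directly, which makes it easy to check the K examples by hand. This is a minimal sketch rather than MoralDM's code: the function names are hypothetical, and the order in which `compare` tests the relations (dominance first, then equivalence) is an assumption. That order matters because for K ≥ 1 more than one relation can hold for the same pair.

```python
def equivalent(a, b, k):
    """A ~K B: |A - B| <= K * max(|A|, |B|)."""
    return abs(a - b) <= k * max(abs(a), abs(b))

def negligible(a, b, k):
    """A <<K B: |A| < K * |B|, i.e. B dominates A."""
    return abs(a) < k * abs(b)

def compare(a, b, k):
    """Classify a pair; testing dominance before equivalence is an assumption."""
    if negligible(b, a, k):
        return ">>"
    if negligible(a, b, k):
        return "<<"
    if equivalent(a, b, k):
        return "~"
    # Neither dominant nor equivalent: comparable, magnitudes decide.
    return ">" if abs(a) > abs(b) else "<"

examples = {k: compare(20, 15, k) for k in (1 / 10, 1 / 3, 2)}
```

Under these definitions `compare(20, 15, 1/10)` gives `">"`, `compare(20, 15, 1/3)` gives `"~"`, and `compare(20, 15, 2)` gives `">>"`, the same three outcomes listed for K above.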

An Example Dilemma

• A convoy of food trucks is on its way to a refugee camp during a famine in Africa. (Airplanes cannot be used.) You find that a second camp has even more refugees. If you tell the convoy to go to the second camp instead of the first, you will save 1000 people from death, but 100 people in the first camp will die as a result.

– Would you send the convoy to the second camp?

– What is the largest number of deaths in the first camp at which you would send the convoy to the second camp?

Scenarios

• 12 moral decision making scenarios from Ritov and Baron (1999) were chosen as inputs

– civil rights, nature preservers, combating traffic accidents, Jewish settlements, Arab villages,…

• Manually translated into predicate calculus

– Goal: Semi-automatically translate with EA NLU

– Recent Progress: The river scenario was automatically translated from simplified English

Simplification Example

• Original text:

"As a result of a dam on a river, 20 species of fish are threatened with extinction. By opening the dam for a month each year, you can save these species, but 2 species downstream will become extinct because of the changing water level."

• Simplified text:

"Because of a dam on a river, 20 species of fish will be extinct. You can save them by opening the dam. The opening would cause 2 species of fish to be extinct."

Example of EA NLU output

"Because of a dam on a river, 20 species of fish will be extinct."

(explains-Generic
  (thereExists dam44262
    (thereExists river44314
      (and (on-UnderspecifiedSurface dam44262 river44314)
           (isa river44314 River)
           (isa dam44262 Dam))))
  (thereExists set-of-species44351
    (and (isa set-of-species44351 Set-Mathematical)
         (cardinality set-of-species44351 20)
         (forAll species44351
           (implies (elementOf species44351 set-of-species44351)
                    (and (isa species44351 BiologicalSpecies)
                         (generalizes species44351 Fish)
                         (thereExists extinction44579
                           (and (isa extinction44579 Extinction)
                                (objectActedOn extinction44579 species44351)))))))))

Results

• Out of the 12 scenarios, MoralDM makes decisions matching those of participants on 11 scenarios

– In 8 scenarios, both first-principles reasoning and analogical reasoning provide the correct answer

– In 3 scenarios, first-principles reasoning fails, but analogical reasoning provides the correct answer

– In 1 scenario, both reasoning strategies fail

Next steps: MoralDM

• Test on wider range of examples

• Scale up story libraries for different cultural groups

– Incorporate MAC/FAC for retrieval

– Currently using fables, folktales as sources

• Extend EA NLU and QRG-CE coverage to handle range of both cultural stories and interview stories

– Story Workbench = EA NLU + MIT’s GUI

Future Work

• Use automatically constructed cultural models to make novel predictions

– Experiment with two-phase interview structure

• Continue extending EA NLU to broader coverage

– Needed to scale automatic model construction

• Extend automatic cultural model construction to other kinds of data

– Fables, folk-tales, life stories, valuation rules: i.e., the culturally-specific inputs to MoralDM.

• Test MoralDM model on wider range of problems and inputs from different groups

– Data is the limiting factor right now

Details

Elements of QP Theory

• Physical Process

– All causal changes stem from physical processes.

– Example: heat flow between a brick and a room

• Parts of physical processes:

– Participants

• Entities participating in a physical process

• Example: the brick, the room

– Conditions

• Determine when a process is active

• Example: difference in temperature

– Consequences

• Hold as long as a process is active

• Direct influences (derivatives)

• Indirect influences (functional relations)

[Figure: hot brick in a cool room.]
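The heat flow example maps onto the three parts of a process (participants, conditions, consequences) roughly as in the sketch below. It is a toy rendering, not a QP theory implementation: the class and function names are hypothetical, numeric temperatures are used only so the activation condition can be tested, and the link from heat to temperature is an assumed indirect influence.

```python
from dataclasses import dataclass

@dataclass
class Entity:
    """A participant; the numeric temperature is only used for comparison."""
    name: str
    temperature: float

def heat_flow_active(source, destination):
    """Condition: heat flows only while the source is hotter."""
    return source.temperature > destination.temperature

def heat_flow_consequences(source, destination):
    """Signs of change that hold while the process is active."""
    if not heat_flow_active(source, destination):
        return {}
    return {
        # Direct influences: the flow drains heat from the source
        # and adds heat to the destination.
        f"heat({source.name})": -1,
        f"heat({destination.name})": +1,
        # Indirect influences: temperature is assumed to be
        # qualitatively proportional to heat.
        f"temperature({source.name})": -1,
        f"temperature({destination.name})": +1,
    }

brick = Entity("brick", 80.0)
room = Entity("room", 20.0)
consequences = heat_flow_consequences(brick, room)
```

Calling `heat_flow_consequences(room, brick)` returns an empty dictionary, reflecting that consequences hold only as long as the process is active.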

Modeling Blame Assignment

• Context: Computational version of attribution theory from psychology being developed at ICT by Gratch and Mao

– Assigns credit/blame for a consequence C of an action A to an agent P based on

• Did P cause A?

• Did P intend C?

• Did P foresee that C would follow from A?

• Was P coerced by another actor?

– Uses simple axioms to assign binary values of credit/blame to agents based on

• Causal knowledge, expressed by plans

• Simple axioms relating cause, intention, and knowledge

• Rules for inferring knowledge and intent from dialogue

Mao & Gratch’s Computational Model of blame assignment

• Based on Shaver’s theory of moral responsibility (1985)

– Attribution along dimensions of responsibility

– Judgment of responsibility follows

– Responsibility may lead to blame

• Observed behaviors in a simulation environment

• Plan library, using Hierarchical Task Networks, for causal inference

• Speech acts covering order negotiation for dialogue inference

• Attribution variables as Boolean assignments

• Infers which agent in the scenario is to blame

[Figure: pipeline. Observed behaviors; Plans and Speech Acts; Attribution Variables; Responsibility Judgment.]

QR Model (Tomai & Forbus, 2007)

• Same causal/dialogue input, different attribution process

• Qualitative representation provides more rigorous modeling method

– Social science theories describe dimensions of responsibility as continuous parameters

– Predictions, experimental results cast as ordinal relationships

– Qualitative modeling captures this directly, without ad hoc step of constructing quantitative equations and postulating numerical parameters

– Also fits the data better

[Figure: pipeline. Observed behaviors; Plans and Speech Acts; Attribution Variables; Responsibility Judgment.]

Tomai, E., and Forbus, K. 2007. Plenty of Blame to Go Around: A Qualitative Approach to Attribution of Moral Responsibility. Proceedings of QR-07.

Who’s to blame?

The chairman of Beta Corporation is discussing a new program with the vice president of the corporation.

The vice president says, “The new program will help us increase profits, but according to our investigation report, it will also harm the environment.”

The chairman answers, “I only want to make as much profit as I can. Start the new program!”

The vice president says, “Ok,” and executes the new program.

The environment is harmed by the new program.

(From Mao 2006, adapted from Knobe 2003)

Modes of Judgment

• Four distinct modes of judgment

• Responsibility is strictly increasing

• Translates to six views

• Within each mode responsibility is qualitatively proportional to an attribution variable

[Figure: yes/no decision tree over the questions Foreseen?, Intended?, Coerced?, with the modes Unforeseen, Unintended, Voluntary, and Coerced placed along an axis labeled "Increasing responsibility".]

Mao’s Results

Scenario     Human Data (Chair)   Human Data (VP)   Mao Model (Chair / VP / Degree)
Scenario 1   3.00                 3.73              Y, Low
Scenario 4   4.13                 5.20              Y, High

Survey Results

Agent   Scenario 1   Scenario 2   Scenario 3   Scenario 4
Chair   3.00         5.63         5.63         4.13
VP      3.73         3.77         3.23         5.20

[Figure: the eight survey means laid out in an ordering marked by five "<" signs.]

QR Model Results

[Figure, shown in three successive builds: the survey means for Chair 1 to Chair 4 and VP 1 to VP 4 laid out in an ordering marked by four "<" signs, with the groups labeled Unforeseen, Coerced, and Voluntary.]

• Violates strict ordering of modes of judgment

• Challenges an assumption of Shaver’s theory
