Date post: 01-Apr-2015
Emergence of Gricean Maxims from Multi-agent Decision Theory Adam Vogel Stanford NLP Group Joint work with Max Bodoia, Chris Potts, and Dan Jurafsky
Transcript
Page 1: Emergence of Gricean Maxims from Multi-agent Decision Theory Adam Vogel Stanford NLP Group Joint work with Max Bodoia, Chris Potts, and Dan Jurafsky.

Emergence of Gricean Maxims from Multi-agent Decision Theory

Adam Vogel
Stanford NLP Group

Joint work with Max Bodoia, Chris Potts, and Dan Jurafsky

Page 2:

Decision-Theoretic Pragmatics

Gricean cooperative principle:

Make your contribution such as is required, at the stage at which it occurs, by the accepted purpose or direction of the talk exchange in which you are engaged.

Page 3:

Decision-Theoretic Pragmatics

Gricean Maxims:
• Be truthful: speak with evidence
• Be relevant: speak in accordance with goals
• Be clear: be brief and avoid ambiguity
• Be informative: say exactly as much as needed

Page 4:

Emergence of Gricean Maxims

Co-operative principle

• Be truthful
• Be relevant
• Be clear
• Be informative

???

Approach: Operationalize the co-operative principle
Tool: Multi-agent decision theory
Goal: Maxims emerge from rational behavior

Joint utility Rationality

Page 5:

Related Work

• One-shot reference tasks
– Generating spatial referring expressions [Golland et al. 2010]
– Predicting pragmatic reasoning in language games [Stiller et al. 2011]
• Interpreting natural language instructions
– Learning to read help guides [Branavan et al. 2009]
– Learning to follow navigational directions [Vogel and Jurafsky 2010] [Artzi and Zettlemoyer 2013] [Chen and Mooney 2011] [Tellex et al. 2011]

Page 6:

CARDS Task

Page 7:

Outline

• Spatial semantics
• ListenerBot: single-agent advice taker
– Can accept advice, never gives it
• DialogBot: multi-agent decision maker
– Gives advice by tracking the other player’s beliefs

Page 8:

Spatial Semantics

“in the top left of the board” → BOARD(top;left)
“on the left side” → BOARD(left)
“right in the middle” → BOARD(middle)

MaxEnt Classifier w/ Bag of Words

Estimated from Corpus Data
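
A maximum-entropy classifier over bag-of-words features is equivalent to multinomial logistic regression on word counts. The sketch below shows the idea in plain Python. It is only an illustration: the three training pairs are the example utterances from this slide rather than the corpus, and the function names, learning rate, and epoch count are assumptions.

```python
import math
from collections import Counter

def featurize(utterance):
    """Bag-of-words features: token counts, ignoring word order."""
    return Counter(utterance.lower().split())

def predict_proba(weights, feats):
    """Softmax over per-label linear scores."""
    scores = {label: sum(w[word] * c for word, c in feats.items())
              for label, w in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def train_maxent(examples, epochs=200, lr=0.5):
    """Fit weights[label][word] by stochastic gradient ascent on log-likelihood."""
    labels = sorted({label for _, label in examples})
    weights = {label: Counter() for label in labels}
    for _ in range(epochs):
        for utterance, gold in examples:
            feats = featurize(utterance)
            probs = predict_proba(weights, feats)
            for label in labels:
                # Gradient: observed feature count minus expected feature count.
                error = (1.0 if label == gold else 0.0) - probs[label]
                for word, count in feats.items():
                    weights[label][word] += lr * error * count
    return weights

def classify(weights, utterance):
    probs = predict_proba(weights, featurize(utterance))
    return max(probs, key=probs.get)

# Toy training set (assumption): the slide's three example utterances.
TOY_DATA = [
    ("in the top left of the board", "BOARD(top;left)"),
    ("on the left side", "BOARD(left)"),
    ("right in the middle", "BOARD(middle)"),
]
weights = train_maxent(TOY_DATA)
print(classify(weights, "on the left side"))
```

Because the features ignore word order, the classifier cannot tell "top left" from "left top". That fits the slide's flat BOARD(top;left) representation.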

Page 9:

Complexity Ahoy

• Approximate decision making only feasible for problems with <10k states!

[Figure: logarithmic scale with ticks from 100 to 10,000,000,000]

Page 10:

Semantic State Representation
• Divide board into 16 regions
• Cluster squares based on meanings

Page 11:

Outline

• Spatial semantics
• ListenerBot: single-agent advice taker
– Can accept advice, never gives it
• DialogBot: multi-agent decision maker
– Gives advice by tracking the other player’s beliefs

Page 12:

Partially Observable Markov Decision Process (POMDP)

Or: An HMM you get to drive!

Page 13:

State space S: hidden configuration of the world
• Location of card
• Location of player

Page 14:

Action space A: what we can do
• Move around the board
• Search for the card

Page 15:

Observations: sensor information + messages
• Whether we are on top of the card
• BOARD(right;top) etc.

Page 16:

Observation Model: sensor model
• We see the card if we search for it and are on it
• For messages

Page 17:

Reward R(s,a): value of an action in a state
• Large reward if in the same square as the card
• Every action adds small negative reward

Page 18:

Transition T(s’|a,s): dynamics of the world
• Travel actions change player location
• Card never moves

Page 19:

Initial belief state: distribution over S
• Uniform distribution over card location
• Known initial player location
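
The POMDP components listed on the preceding slides can be written down directly for a toy problem. The sketch below is a minimal illustration, not the talk's actual model. It assumes a hypothetical 2x2 board and made-up reward magnitudes (the slides say only "large" and "small negative"), it ignores messages, and all names are invented for the example.

```python
from itertools import product

# Hypothetical 2x2 board (assumption); the real task uses a much larger board.
SQUARES = [(r, c) for r in range(2) for c in range(2)]
MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1), "RIGHT": (0, 1)}
ACTIONS = list(MOVES) + ["SEARCH"]

# State: (player location, card location); the card location is hidden.
STATES = list(product(SQUARES, SQUARES))

def transition(state, action):
    """Deterministic T: travel actions move the player, the card never moves."""
    player, card = state
    if action in MOVES:
        dr, dc = MOVES[action]
        target = (player[0] + dr, player[1] + dc)
        if target in SQUARES:  # moving off the board leaves the player in place
            player = target
    return (player, card)

def observe(state, action):
    """Sensor model: we see the card only if we search while standing on it."""
    player, card = state
    return action == "SEARCH" and player == card

def reward(state, action, found_bonus=10.0, step_cost=-1.0):
    """Assumed magnitudes: every action costs a little, being on the card pays."""
    player, card = state
    return step_cost + (found_bonus if player == card else 0.0)

def initial_belief(start):
    """Known player location, uniform distribution over card location."""
    return {(start, card): 1.0 / len(SQUARES) for card in SQUARES}

b0 = initial_belief((0, 0))
print(len(STATES), sum(b0.values()))
```

Even this toy board has 16 states (4 player squares times 4 card squares), which shows how quickly the state space grows with board size.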

Page 20:

Belief Update:
Action: SEARCH
Observation: (Card not here, )

Page 21:

Belief Update:

Page 22:

Belief Update:
Action: SEARCH
Observation: (Card not here, “left side”)
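
The two belief updates on these slides can be sketched as a Bayes filter over the hidden card location. The sketch assumes a small hypothetical 2x4 board. The message likelihood (0.9 for squares on the left half, 0.1 otherwise) is an invented stand-in for the learned spatial semantics, and the function names are illustrative.

```python
def normalize(belief):
    total = sum(belief.values())
    return {square: p / total for square, p in belief.items()}

def sensor_likelihood(saw_card, player, card):
    """P(sensor reading | card location) after a SEARCH at the player's square."""
    return 1.0 if saw_card == (player == card) else 0.0

def message_likelihood(message, card, n_cols):
    """Hypothetical stand-in for the learned semantics of 'left side'."""
    if message is None:  # no message received: uninformative
        return 1.0
    on_left = card[1] < n_cols // 2
    return 0.9 if on_left else 0.1

def update_belief(belief, player, saw_card, message, n_cols):
    """Bayes rule over the hidden card location; the card never moves."""
    posterior = {
        card: prior
        * sensor_likelihood(saw_card, player, card)
        * message_likelihood(message, card, n_cols)
        for card, prior in belief.items()
    }
    return normalize(posterior)

N_ROWS, N_COLS = 2, 4  # assumed toy board size
squares = [(r, c) for r in range(N_ROWS) for c in range(N_COLS)]
uniform = {sq: 1.0 / len(squares) for sq in squares}

# SEARCH at (0, 0): card not here, no message.
b1 = update_belief(uniform, (0, 0), saw_card=False, message=None, n_cols=N_COLS)
# SEARCH again: card not here, and the partner says "left side".
b2 = update_belief(b1, (0, 0), saw_card=False, message="left side", n_cols=N_COLS)
print(round(b1[(0, 1)], 3), round(b2[(0, 1)], 3))
```

The failed search zeroes out the searched square. The message then shifts the remaining probability mass toward the left half without ruling out the right half entirely.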

Page 23:

Belief Update:

Page 24:

Decision Making

Choose policy

Goal: Maximize expected reward

Solution: Perseus, an approximate value iteration algorithm [Spaan et al. 2005]

Computational complexity: PSPACE!

Expected: Immediate reward + Future reward
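
The slide's equation did not survive in the transcript. In standard POMDP notation, the objective of expected immediate reward plus future reward is usually written as a Bellman equation over beliefs. The discount factor \gamma, the updated belief b^{a,o}, and the value function V^{*} are textbook symbols assumed here, not taken from the slide:

```latex
V^{*}(b) = \max_{a \in A} \Big[ \sum_{s \in S} b(s)\, R(s,a)
         + \gamma \sum_{o} P(o \mid b, a)\, V^{*}\big(b^{a,o}\big) \Big]
```

The first term is the expected immediate reward under belief b. The second is the expected future reward, averaged over the observations o that could follow action a, where b^{a,o} is the belief after taking a and observing o.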

Page 25:

Outline

• Spatial semantics
• ListenerBot: single-agent advice taker
– Can accept advice, never gives it
• DialogBot: multi-agent decision maker
– Gives advice by tracking the other player’s beliefs

Page 26:

DialogBot

• (Approximately) tracks beliefs of other player
• Speech actions change beliefs of other player
• Model: Decentralized POMDP (Dec-POMDP)
– Problem: NEXP-hard!!

Top!

Page 27:

Each agent selects its own action

Page 28:

Each agent receives its own observation

Page 29:

Transition depends on both actions

Page 30:

Reward is shared between agents
Formalization of the co-operative principle
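
The four Dec-POMDP properties from the last few slides (own actions, own observations, joint transition, shared reward) can be shown in a single step function. This is a rough sketch under several assumptions: a hypothetical 2x2 board, invented reward magnitudes, a simplified winning condition (either agent finding the card), and no speech actions.

```python
BOARD = [(r, c) for r in range(2) for c in range(2)]  # assumed toy board
MOVES = {"UP": (-1, 0), "DOWN": (1, 0), "LEFT": (0, -1),
         "RIGHT": (0, 1), "SEARCH": (0, 0)}

def move(position, action):
    """Apply one agent's action; leaving the board keeps the agent in place."""
    dr, dc = MOVES[action]
    target = (position[0] + dr, position[1] + dc)
    return target if target in BOARD else position

def joint_step(state, action_1, action_2, found_bonus=10.0, step_cost=-1.0):
    """One Dec-POMDP step: joint transition, private observations, shared reward."""
    pos_1, pos_2, card = state
    # The transition depends on both actions; the card stays put.
    next_state = (move(pos_1, action_1), move(pos_2, action_2), card)
    # Each agent receives its own observation: did *it* find the card?
    obs_1 = action_1 == "SEARCH" and next_state[0] == card
    obs_2 = action_2 == "SEARCH" and next_state[1] == card
    # A single reward signal is shared by both agents (assumed magnitudes).
    shared_reward = 2 * step_cost + (found_bonus if (obs_1 or obs_2) else 0.0)
    return next_state, (obs_1, obs_2), shared_reward

state = ((0, 0), (1, 1), (1, 1))  # (agent 1, agent 2, card)
next_state, observations, r = joint_step(state, "RIGHT", "SEARCH")
print(next_state, observations, r)
```

Because both agents are scored by the same number, any action that helps the partner, including a helpful message in the full model, is in each agent's own interest.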

Page 31:

Exact Multi-agent Belief Update

Time

Page 32:

Approximate Multi-agent Belief Update

Time

Page 33:

Single-agent POMDP Approximation

Other agent belief transition model

World transition model

Resulting POMDP has states

Page 34:

What to say?

Page 35:

“Top”

Page 36:

“Middle”

Page 37:

“Right”

Page 38:

“Right”

Page 39:

Return to Grice

• Be truthful
• Be relevant
• Be clear
• Be informative

Page 40:

Cooperating DialogBots

Middle of the board

Page 41:

Cooperating DialogBots

Middle of the board

Page 42:

Adolescent DialogBots

Top

Page 43:

Return to Grice

• Be truthful: DialogBot speaks with evidence
• Be relevant: DialogBot gives advice to help win the game
• Be clear
• Be informative

Page 44:

Experimental Results
• Evaluate pairs of agents from 197 random initial states
• Agents have 50 high-level moves to find the card

Bots                        % Success   Average High Level Actions
ListenerBot & ListenerBot   84.4%       19.8
ListenerBot & DialogBot     87.2%       17.5
DialogBot & DialogBot       90.6%       16.6

Page 45:

Emergent Gricean Behavior

• Be truthful: DialogBot speaks with evidence
• Be relevant: DialogBot gives advice to help win
• Be clear: need variable costs on messages
• Be informative: requires levels of specificity

ACL 2013: Implicatures and Nested Beliefs in Approximate Decentralized-POMDPs

From joint reward, not hard coded

Future Work: intentions, joint plans, deeper belief nesting

Thanks!

