Active Perception & Mental Models Nikolaos Mavridis Cognitive Machines MIT Media Lab
Mental Models & Active Perception (alumni.media.mit.edu/~nmav/misc/ActivePerc_May2004.pdf)

Active Perception & Mental Models

Nikolaos Mavridis
Cognitive Machines
MIT Media Lab


Today’s Menu

I. VISION
II. ACTIVE PERCEPTION
III. MENTAL MODELS
IV. FUTURE STEPS
V. CONCLUSION


I. Our Vision

To build intelligent devices that can cooperate with humans in a natural manner

And also: learn about humans!

• Key prerequisites:

– Language

– Mental Models of the world

– Multimodal Active Sensing

• Early examples:

– Ripley the robot, Elvis the lighting system, Intelligent car


General Setting: Internal model of world

[Figure: block diagram. Labels: SENSORY WORLD (Imperfect, Changing); EXTERNAL REALITY (Fixed Spacetime); ACTIVE PERCEPTION (Sensory data & ACTIONS!!); MENTAL MODELS (Partial Descriptions); ACTIONS; DATA. Speech bubble: "A greyhouse!"]


II. Active Perception

Ripley’s Perceptual System

[Figure: block diagram. Blocks, in extracted order:
– CAPTURE
– SEGMENTATION (COLOR-BASED)
– FACE DETECTION
– SALIENT POINT DETECTION
– OBJECT RECOGNITION
– VISOR: PROPOSALS FOR OBJECT INSTANTIATION / UPDATE / DELETION
– PROPRIOCEPTOR: PROPOSALS FOR OBJECT INSTANTIATION / UPDATE / DELETION
– MENTAL MODEL
– STEREO DEPTH CALCULATION
– CAPTURE
– 2D REGION PERMANENCE
– 2D FACE PERMANENCE
– IMAGINATOR: PROPOSALS FOR OBJECT INSTANTIATION / UPDATE / DELETION
– UNDER CONSTRUCTION
– (SAME AS LEFT CHANNEL)
– SPEECH RECOGNITION
– UNDER CONSTRUCTION]


Cameras

• ELMO

• Panasonic KX-HCM280 (Pan/Tilt/Zoom)


Segmentation

• Probabilistic color-based

• Requires uniform background & objects :-(

• Replacement: Yair Ghitza’s method


Face Detection

• Paul Viola’s algorithm: cascade of classifiers, simple features


Salient Point Detection

• Koch/Itti algorithm (multiscale color/intensity/edge maps)

• Bottom-up model of human attention, from neuroscience


Object Recognition

• Andre Ribeiro’s algorithm

• Robust to rotations, background…

• Andre will tell you more!


Stereo Depth Calculation

• SRI Small Vision System: stereo engine using area correlation

• Calibration & filtering!
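The idea of area correlation can be sketched on a single scanline. This is a minimal illustration only, not the SRI engine's code: the sum-of-absolute-differences window score, the function names `disparity` and `depth`, and the focal length and baseline values are all assumptions made for the example.

```python
def disparity(left, right, x, half=1, max_d=4):
    """Shift d for which the window around left[x] best matches
    the window around right[x - d] (sum of absolute differences)."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_d + 1):
        if x - d - half < 0:
            break  # the shifted window would leave the image
        cost = sum(abs(left[x + k] - right[x - d + k])
                   for k in range(-half, half + 1))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth(d, focal_px, baseline_m):
    """Pinhole stereo relation: depth = focal length * baseline / disparity."""
    return focal_px * baseline_m / d if d > 0 else float("inf")


# Toy scanlines: the same intensity bump appears 2 pixels further left
# in the right image than in the left image.
left = [0, 0, 0, 9, 5, 1, 0, 0]
right = [0, 9, 5, 1, 0, 0, 0, 0]
d = disparity(left, right, x=4)
```

Calibration supplies the focal length and baseline used in `depth`; filtering would reject windows whose best score is still poor, which this sketch omits.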


Region/Face Permanence

• “Objecter”: 2D permanence across frames

• Hysteresis before creation/deletion

• Finds optimal across-frame correspondence, based on color/position/size metric

• Keeps indices across frames
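A minimal sketch of how such 2D permanence could work, under stated assumptions: the cost weights, the `max_cost` gate, the brute-force search and the class layout are all hypothetical choices for illustration and are not taken from the actual "Objecter".

```python
from itertools import permutations


def cost(a, b):
    """Hypothetical dissimilarity: position + size + scaled colour difference."""
    dpos = abs(a["x"] - b["x"]) + abs(a["y"] - b["y"])
    dsize = abs(a["size"] - b["size"])
    dcol = sum(abs(p - q) for p, q in zip(a["rgb"], b["rgb"]))
    return dpos + dsize + dcol / 10.0


def match(tracks, detections, max_cost=50.0):
    """Brute-force optimal assignment (only sensible for a handful of regions).
    Leaving a track unmatched costs max_cost. Returns {track_id: detection_index}."""
    ids = list(tracks)
    slots = list(range(len(detections))) + [None] * len(ids)
    best, best_total = {}, float("inf")
    for perm in permutations(slots, len(ids)):
        total = sum(max_cost if j is None else cost(tracks[i], detections[j])
                    for i, j in zip(ids, perm))
        if total < best_total:
            best_total = total
            best = {i: j for i, j in zip(ids, perm) if j is not None}
    return best


class Objecter:
    """Keeps stable indices across frames, with hysteresis on create/delete."""

    def __init__(self, create_after=3, delete_after=3):
        self.tracks, self.hits, self.misses = {}, {}, {}
        self.create_after, self.delete_after = create_after, delete_after
        self.next_id = 0

    def update(self, detections):
        assigned = match(self.tracks, detections)
        used = set(assigned.values())
        for tid in list(self.tracks):
            if tid in assigned:
                self.tracks[tid] = detections[assigned[tid]]
                self.hits[tid] += 1
                self.misses[tid] = 0
            else:
                self.misses[tid] += 1
                if self.misses[tid] >= self.delete_after:
                    del self.tracks[tid], self.hits[tid], self.misses[tid]
        for j, det in enumerate(detections):
            if j not in used:  # unmatched detection starts a tentative track
                self.tracks[self.next_id] = det
                self.hits[self.next_id], self.misses[self.next_id] = 1, 0
                self.next_id += 1
        # Only tracks seen often enough are reported as existing regions.
        return [t for t in self.tracks if self.hits[t] >= self.create_after]
```

With `create_after=2, delete_after=2`, a region is reported from its second sighting and survives one missed frame before it is dropped.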


Visor: Proposals for 3D object instantiation/update/deletion

• Gets state of the world from mental model

• Compares with evidence, proposes changes

• Stochastic / voxel descriptions, too…

…includes Voxeliser!


Voxeliser

• Shape estimation system using “sculpture” by multiple views (app: spatial domains)
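One common way to "sculpt" a shape from several views is silhouette carving: start from a full block of voxels and remove every voxel that falls outside the object's outline in any view. The sketch below assumes axis-aligned orthographic views and a hypothetical `carve` function; the Voxeliser's actual camera model and data structures are not described on the slide.

```python
def carve(n, silhouettes):
    """Carve an n x n x n block.
    silhouettes: {axis: set of 2D cells that look occupied when viewing along that axis}.
    A voxel survives only if its projection lies inside every silhouette."""
    voxels = {(x, y, z) for x in range(n) for y in range(n) for z in range(n)}
    for axis, outline in silhouettes.items():
        kept = set()
        for v in voxels:
            # Drop the coordinate along the viewing axis to project the voxel.
            projection = tuple(c for k, c in enumerate(v) if k != axis)
            if projection in outline:
                kept.add(v)
        voxels = kept
    return voxels


# One view leaves a whole column; three views pin down a single voxel.
one_view = carve(3, {2: {(1, 1)}})
three_views = carve(3, {0: {(1, 1)}, 1: {(1, 1)}, 2: {(1, 1)}})
```

Each additional view can only remove voxels, so the estimate tightens as more views are taken.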


Active Perception

Bottom-up feed-through vs. on-demand active!

(also integrating bottom-up with top-down)

• Theory: visual routines, next best view etc.

• Next Action: current cost & goal-based utility…

• Two models: Resolver, Spectator


Resolver: To ask or to sense?

Planning to integrate Speech and Sensorimotor Acts (ICMI ’04)

Early motivation: Disambiguating referents

“Hand me the ball!”


Resolver

• Selects the next action: question or sensory measurement

• Probabilistic model with one-step planning: utility (goal-oriented information gain) vs. cost

• Human-like performance, double matching, 25% cost gain!
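The utility-versus-cost choice can be sketched as one-step lookahead over candidate referents. This is an illustrative reading only: expected entropy reduction as the utility, a simple gain-minus-cost score, deterministic action outcomes, and the action names and cost values are all assumptions, not the Resolver's published formulation.

```python
import math


def entropy(belief):
    """Shannon entropy in bits of a {hypothesis: probability} belief."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)


def expected_gain(belief, outcome_of):
    """Expected entropy reduction of an action.
    outcome_of maps each hypothesis to the outcome the action would yield."""
    groups = {}
    for h, p in belief.items():
        groups.setdefault(outcome_of[h], {})[h] = p
    remaining = 0.0
    for members in groups.values():
        mass = sum(members.values())
        if mass == 0:
            continue
        posterior = {h: p / mass for h, p in members.items()}
        remaining += mass * entropy(posterior)
    return entropy(belief) - remaining


def next_action(belief, actions):
    """actions: name -> (outcome_of, cost). Picks the best gain minus cost."""
    return max(actions,
               key=lambda a: expected_gain(belief, actions[a][0]) - actions[a][1])


# Four candidate referents for "Hand me the ball!", equally likely.
belief = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
actions = {
    "ask_is_it_small": ({"a": "yes", "b": "no", "c": "no", "d": "no"}, 0.5),
    "measure_weight": ({"a": "light", "b": "light", "c": "heavy", "d": "heavy"}, 0.2),
}
choice = next_action(belief, actions)
```

Here the measurement splits the candidates evenly and is cheaper, so it is chosen over the question; with other costs or beliefs the question would win.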


Resolver: A screenshot

• After: “The heavy one” - “Is it small? No” - measuresize1-3 - “Is it medium?”


Spectator

• Bottom-up attention guiding camera movement

(Alexander Patrikalakis (UROP) & Nikolaos Mavridis)

• Finds & tracks interesting points, zooms in, marks on map, goes on!


III. Mental Models

MOTIVATION:

How are people able to think about things that are not directly accessible to their senses at the moment?

What is required for a machine to be able to talk about things that are out of sight or happened in the past, or to view the world through somebody else’s eyes (and mind)?

What is the machinery required for the comprehension of:

“Give me the green beanbag that was on my left!”


Mental models - why? (p.I)

Goal: Provide an intermediate representation, mediating between perception, language and action.

• In essence:

– an internalized representation of the state of the world as best known so far, in a form convenient for “hooking up” language

(shown below: the revisualisation of the representation)

– and a set of methods for updating this representation given further relevant sensory data, and predicting future states in the absence of such data


Mental models - why? (p.II)

• But also:

– A useful decomposition of a complex problem:

a practical engineering methodology with reusable components

a theoretical framework (dynamical systems)

– A unified platform for the instantiation of hypothetical scenarios:

planning (goal state descriptions)

instantiation of situations communicated through language etc

– A starting point for experimental simulations of:

Multi-agent systems, Theory of mind, Learning


Ripley’s “Internalised World” (early version: IEEE SMC)

Object Permanence & Viewpoint Switching

[Figure: structure diagram. Labels, in extracted order: RED; PROTOTYPES (Coding, tuition); SPACETIME; SENSORY WORLD; WORLD (COMPOUND_AGENT); AGENTs ...; AGENT_RELATIONs ...; AGENT; AGENT_RELATION; BODY (COMPOUND_OBJECT); SOUL; INTERFACE; OBJECTs; OBJECT_RELATIONs; VIEWPOINT; MOVER; MENTAL MODEL; GOALS; AFFECT]


Objects & attributes

3 Layers of Attributes (shape, color, weight… apparent/deep):

– Stochastic – knowing how much you know!!!: for language, curiosity…

– Deterministic - maximum likelihood

– Categorical - quantized for language: “red”, learnt and context-dependent!

EXAMPLE: STOCHASTIC RADIUS AND POSITION
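The three layers can be illustrated for a single attribute. A minimal sketch, assuming a discrete distribution over radius values and hand-picked category boundaries; the helper names, numbers and labels are hypothetical, and in the system described the categories are learnt and context-dependent rather than fixed.

```python
def most_likely(dist):
    """Deterministic layer: the single most probable value."""
    return max(dist, key=dist.get)


def categorise(value, boundaries):
    """Categorical layer: first label whose upper limit covers the value.
    boundaries: ascending list of (upper_limit, label), ending with infinity."""
    for limit, label in boundaries:
        if value <= limit:
            return label


# Stochastic layer: a belief over the radius (cm) of one object.
radius_cm = {2.0: 0.1, 3.0: 0.6, 4.0: 0.3}

best = most_likely(radius_cm)        # deterministic layer
certainty = radius_cm[best]          # "knowing how much you know"
label = categorise(best, [(2.5, "small"), (5.0, "medium"), (float("inf"), "large")])
```

The stochastic layer is what lets the system notice that `certainty` is only 0.6, which could trigger a question or a further measurement before it commits to calling the object "medium".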


The Architecture

[Figure: "MENTAL MODELS: Ripley's case". Preliminary block diagram, Sept '03, Nikolaos Mavridis, MIT Media Lab. Blocks, in extracted order:
– MENTAL MODEL & RECONCILIATOR (mental_model.exe): W[t] and F
– MODALITY-SPECIFIC INSTANTIATORS (visor.exe etc.): (W[t], S[t]) -> Ws[t]
– VIRTUAL OBJECT INSTANTIATOR (imaginer.exe): (W[t], H[t]) -> Wh[t]
– DYNAMICS PREDICTOR (predictor.exe): Wp[t]
– VISUALISER (visualiser.exe)
– SENSES: S[t]
– HYPOTHESIS GENERATION
– VISUAL FEATURE ANALYSIS
– LANGUAGE UNDERSTANDING
– (bishop) viewpoint selection]

• Modality-specific processes:

– Visor

– Proprioceptor

– Imaginator

• Central processes:

– MM: Processes proposals

– Predictor

• Recent Work: Goals, Affect

• Open Questions:

- Cognitive spacetime

- Comms etc.
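The division of labour between modality-specific and central processes can be sketched as one update step. This is a toy reading of the block diagram: the world state W[t] is reduced to one number per object, and the function bodies, the `trust` blending weight and the constant-velocity prediction are assumptions for illustration, not the behaviour of the actual executables.

```python
def visor(world, senses):
    """(W[t], S[t]) -> Ws[t]: proposals from what is currently sensed."""
    return dict(senses)


def predictor(world, velocities, dt):
    """Wp[t]: extrapolate every known object, sensed or not."""
    return {name: x + velocities.get(name, 0.0) * dt for name, x in world.items()}


def reconcile(predicted, proposals, trust=0.7):
    """Central process: blend proposals with predictions.
    New objects are instantiated; unsensed objects keep their prediction."""
    merged = dict(predicted)
    for name, x in proposals.items():
        if name in predicted:
            merged[name] = trust * x + (1 - trust) * predicted[name]
        else:
            merged[name] = x
    return merged


world = {"ball": 0.0}
predicted = predictor(world, {"ball": 1.0}, dt=1.0)
seen = reconcile(predicted, visor(world, {"ball": 2.0, "cup": 5.0}))
unseen = reconcile(predicted, visor(world, {}))
```

When nothing is sensed, the object stays in the model at its predicted position, which is the object permanence the mental model is meant to provide.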


Evaluating performance

• Ground truth: Flock of birds sensors

(Stephen Oney (UROP) & Nikolaos Mavridis)

• Measure systematic errors, noise, time delay, dynamics… & calibrate parameters!


IV: Future Steps

• Imaginator: Language to mental model!

• Voxeliser: Better shapes and categories

• Resolver: Full integration & active sensing

• Multiagent, Theory of Mind, Innate vs. learnt…

• Parts of soul: Affect & goal modelling


Multiagent systems

• Prerequisites:

– Action recognition across agents (not strict prereq)

– Thus, useful to start by embedding everything in virtual world wrapper, and cheating on action recognition

– Also, mixed real/virtual agents (Ripley conversing with a non-existent friend)

• Benefits:

– Systematic external examination of effects of different partial world knowledge or structure/methods of mental models (i.e. contents & form of MM), or even different sensory organs.

– For example, differing categorical boundaries and negotiated alignment (methods difference, i.e. update/prediction function)

– Prerequisite for Theory of Mind!

• First preliminary examples:

– Ashwani’s demo for viewpoint-dependent description generation (using the generic MM)


Theory of mind

• Now, each agent’s MM also contains an estimated mental model of each other agent as part of their descriptions…

• Prerequisites:

– Uncertainty

– Multi-agent models

– Action recognition across agents (strict prereq now!, +gaze)

• Benefits:

– Start playing with intention through action recognition

– Interesting coupling with inferred goals etc.

– “Mind reading” is an immense area for experimentation!

– Collaborative tasks


Innate vs. learnt

• Now that we have a clean architecture to start with, how about learning parameters or structures of the architecture, and experimenting with learned vs. innate (predesigned or evolved) tradeoffs?

• Examples:

– Learning predictive dynamics
  • Where do I expect the object to be?
  • Learning “empirical” Newtonian mechanics

– Learning senses-to-model maps
  • Which property of which object does this sensory signal inform me about, and how do its contents alter the property?

– Learning language-to-model maps (example: Deb’s thesis)
  • Which property of which object does this utterance inform me about, and how does it alter the property?

– Learning mental model structures
  • Which properties should my object descriptions contain?
  • How can I get an empirical derivation of 3D position as a crucial non-apparent property of an object?

– Concatenating parts at the input-output equivalence level
  • Forget about all the internalised fuss. Can I get an equivalent structure without postulating and enforcing the exact architecture?

• In essence:

– How arbitrary is everything that was hardcoded? Are some things redundant? Can they be learnt? If so, how?

• FINALLY, FOR ALL PREVIOUSLY STATED FUTURE PLANS:

– Relation with how humans perform (cognitive modeling) - categorical level


V: Conclusion

The Picture!


General Setting: Internal model of world

[Figure: block diagram. Labels: SENSORY WORLD (Imperfect, Changing); EXTERNAL REALITY (Fixed Spacetime); ACTIVE PERCEPTION (Sensory data & ACTIONS!!); MENTAL MODELS (Partial Descriptions); ACTIONS; DATA. Speech bubble: "A greyhouse!"]


Why Active Perception & Mental Models?

The ultimate goal is clear:

• Let’s make Ripley and co. more fun to interact with!

• And let’s learn more about us on the way…

