Extending Sensorimotor Contingencies to Cognition
Free-energy and active inference
Karl Friston, Wellcome Centre for Neuroimaging, UCL
Abstract
Value-learning and perceptual learning have been an important focus over the past decade, attracting the concerted attention of experimental psychologists, neurobiologists and the machine learning community. Despite some formal connections (e.g., the role of prediction error in optimizing some function of sensory states), both fields have developed their own rhetoric and postulates. In this work, we show that perception is, literally, an integral part of value learning, in the sense that it is necessary to integrate out dependencies on the inferred causes of sensory information. This enables the value of sensory trajectories to be optimized through action. Furthermore, we show that acting to optimize value and perception are two aspects of exactly the same principle; namely, the minimization of a quantity (free-energy) that bounds the probability of sensations, given a particular agent or phenotype. This principle can be derived, in a straightforward way, from the very existence of biological agents, by considering the probabilistic behavior of an ensemble of agents belonging to the same class. Put simply, we sample the world to maximize the evidence for our existence.
“Objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous mechanism” - Hermann Ludwig Ferdinand von Helmholtz
Thomas Bayes
Geoffrey Hinton
Richard Feynman
From the Helmholtz machine to the Bayesian brain and self-organization
Hermann Haken
Richard Gregory
Overview
Ensemble dynamics: Entropy and equilibrium; Free-energy and surprise
The free-energy principle: Action and perception; Hierarchies and generative models
Active inference: Sensorimotor contingencies; Goal-directed reaching; Writing; Action-observation; Forward and inverse models
Policies and priors: Control and attractors; The mountain-car problem
Figure axis: temperature
What is the difference between a snowflake and a bird?
Phase-boundary
…a bird can act (to avoid surprises)
What is the difference between snowfall and a flock of birds?
Ensemble dynamics, clumping and swarming
…birds (biological agents) stay in the same place
They resist the second law of thermodynamics, which says that their entropy should increase
This means biological agents must self-organize to minimise surprise. In other words, to ensure they occupy a limited number of (attracting) states.
$$H = \int_0^T dt\, L(t), \qquad L(t) = -\ln p(s(t) \mid m)$$
But what is the entropy?
Figure: sensory states $s = g(\vartheta)$ and the attracting set $A$
…entropy is just average surprise
Low surprise (I am usually here) High surprise (I am never here)
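The claim that entropy is average surprise can be checked numerically. The sketch below is only an illustration: the four states and their probabilities are hypothetical, and surprise is taken to be the negative log-probability of a state.

```python
import math
import random

def surprise(p):
    """Surprise (negative log-probability) of a state, in nats."""
    return -math.log(p)

# Hypothetical probabilities of finding an agent in each of four states.
p_states = {"nest": 0.7, "tree": 0.2, "field": 0.09, "underwater": 0.01}

# Entropy is the expected surprise under that distribution.
entropy = sum(p * surprise(p) for p in p_states.values())

# An agent that visits states with these probabilities accumulates a
# time-averaged surprise that approaches the entropy.
random.seed(0)
names = list(p_states)
weights = [p_states[name] for name in names]
visits = random.choices(names, weights=weights, k=100_000)
average_surprise = sum(surprise(p_states[name]) for name in visits) / len(visits)

print(f"surprise of 'nest' (I am usually here):     {surprise(p_states['nest']):.3f}")
print(f"surprise of 'underwater' (I am never here): {surprise(p_states['underwater']):.3f}")
print(f"entropy: {entropy:.3f}, time-averaged surprise: {average_surprise:.3f}")
```

Concentrating probability on fewer states lowers the entropy, which is the sense in which occupying a limited number of states keeps average surprise low.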
But there is a small problem… agents cannot measure their surprise
But they can measure their free-energy, which is always bigger than surprise
This means agents should minimize their free-energy. So what is free-energy?
$$F(t) \ge L(t)$$
What is free-energy?
…free-energy is basically prediction error
where small errors mean low surprise
sensations – predictions = prediction error
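More formally,
Figure: the exchange between an agent and its world
External states in the world: $\dot{\vartheta} = f(\vartheta, a)$
Sensations: $s = g(\vartheta)$
Internal states of the agent (m): $\mu = \arg\min_{\mu} F(s, \mu)$
Action: $a = \arg\min_{a} F(s, \mu)$
Action to minimise a bound on surprise
$$F = \underbrace{D(q(\vartheta \mid \mu) \,\|\, p(\vartheta \mid m))}_{\text{Complexity}} - \underbrace{\langle \ln p(s(a) \mid \vartheta, m) \rangle_q}_{\text{Accuracy}}$$
$$a = \arg\max_{a} \text{Accuracy}$$
Perception to optimise the bound
$$F = \underbrace{L(s \mid m)}_{\text{Surprise}} + \underbrace{D(q(\vartheta \mid \mu) \,\|\, p(\vartheta \mid s, m))}_{\text{Divergence}}$$
$$\mu = \arg\min_{\mu} \text{Divergence}$$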
Free-energy is a function of sensations and a proposal density over hidden causes
$$F(s, \mu) = \text{Energy} - \text{Entropy} = \langle G(s, \vartheta) \rangle_q + \langle \ln q(\vartheta \mid \mu) \rangle_q$$
and can be evaluated, given a generative model comprising a likelihood and prior:
$$G(s, \vartheta) = -\ln p(s, \vartheta \mid m) = -\ln p(s \mid \vartheta, m) - \ln p(\vartheta \mid m)$$
So what models might the brain use?
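A small discrete example can make the three readings of free-energy concrete: energy minus entropy, complexity minus accuracy, and surprise plus divergence. This is a minimal sketch under stated assumptions: the two hidden causes, the prior and the likelihood values are hypothetical, and the proposal density is an explicit probability vector rather than a parameterised density.

```python
import math

# Hypothetical generative model: two hidden causes, one observed sensation s.
prior = [0.8, 0.2]        # p(cause | m)
likelihood = [0.1, 0.9]   # p(s | cause, m), evaluated at the observed s

evidence = sum(l * p for l, p in zip(likelihood, prior))          # p(s | m)
surprise = -math.log(evidence)                                     # -ln p(s | m)
posterior = [l * p / evidence for l, p in zip(likelihood, prior)]  # p(cause | s, m)

def kl(q, p):
    """Kullback-Leibler divergence D(q || p) for discrete densities."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def energy_minus_entropy(q):
    energy = -sum(qi * math.log(l * p) for qi, l, p in zip(q, likelihood, prior))
    entropy = -sum(qi * math.log(qi) for qi in q if qi > 0)
    return energy - entropy

def complexity_minus_accuracy(q):
    accuracy = sum(qi * math.log(l) for qi, l in zip(q, likelihood))
    return kl(q, prior) - accuracy

def surprise_plus_divergence(q):
    return surprise + kl(q, posterior)

for q in ([0.5, 0.5], [0.9, 0.1], posterior):
    print(f"q={[round(x, 3) for x in q]}  F={energy_minus_entropy(q):.4f}  "
          f"surprise={surprise:.4f}")
```

All three functions return the same number for any proposal density, that number never falls below the surprise, and the bound is tight when the proposal density equals the posterior.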
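Hierarchical models in the brain
Figure: a cortical hierarchy with forward (driving), backward (modulatory) and lateral connections, linking sensory input $s$ to hidden states $\tilde{x}^{(1)}, \tilde{x}^{(2)}$ and causes $\tilde{v}^{(1)}, \tilde{v}^{(2)}$
$$\tilde{v}^{(i-1)} = \tilde{g}^{(i)} + \tilde{\omega}^{(v,i)}$$
$$D\tilde{x}^{(i)} = \tilde{f}^{(i)} + \tilde{\omega}^{(x,i)}$$
Unknown quantities: $\{x(t), v(t), \theta, \gamma\}$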
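Hierarchical form
$$s = g(x^{(1)}, v^{(1)}) + \omega^{(v,1)}$$
$$\dot{x}^{(1)} = f(x^{(1)}, v^{(1)}) + \omega^{(x,1)}$$
$$\vdots$$
$$v^{(i-1)} = g(x^{(i)}, v^{(i)}) + \omega^{(v,i)}$$
$$\dot{x}^{(i)} = f(x^{(i)}, v^{(i)}) + \omega^{(x,i)}$$
Likelihood and empirical priors
$$p(\tilde{s}, \tilde{x}, \tilde{v} \mid m) = p(\tilde{s} \mid \tilde{x}, \tilde{v})\, p(\tilde{x}, \tilde{v})$$
$$p(\tilde{x}, \tilde{v}) = p(\tilde{v}^{(n)}) \prod_{i=1}^{n-1} p(\tilde{x}^{(i)} \mid \tilde{v}^{(i)})\, p(\tilde{v}^{(i)} \mid \tilde{x}^{(i+1)}, \tilde{v}^{(i+1)})$$
Dynamical priors
$$p(D\tilde{x}^{(i)} \mid \tilde{x}^{(i)}, \tilde{v}^{(i)}) = \mathcal{N}(\tilde{f}^{(i)}, \tilde{\Sigma}^{(x,i)})$$
Structural priors
$$p(\tilde{v}^{(i)} \mid \tilde{x}^{(i+1)}, \tilde{v}^{(i+1)}) = \mathcal{N}(\tilde{g}^{(i+1)}, \tilde{\Sigma}^{(v,i+1)})$$
Prediction errors
$$\tilde{\varepsilon}^{(v,i)} = \tilde{v}^{(i-1)} - \tilde{g}^{(i)}, \qquad \tilde{\varepsilon}^{(x,i)} = D\tilde{x}^{(i)} - \tilde{f}^{(i)}$$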
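Gibbs energy - a simple function of prediction error
$$G = -\ln p(\tilde{s}, \tilde{x}, \tilde{v} \mid m) = \tfrac{1}{2} \tilde{\varepsilon}^T \tilde{\Pi} \tilde{\varepsilon} - \tfrac{1}{2} \ln |\tilde{\Pi}|$$
Prediction errors $\tilde{\varepsilon}$ and their precision $\tilde{\Pi}$ depend on the unknown quantities $\{x(t), v(t), \theta, \gamma\}$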
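To see why Gibbs energy is "a simple function of prediction error", it helps to evaluate it for a few numbers. The sketch below assumes independent Gaussian errors with a diagonal precision (a simplification chosen for illustration); the error and precision values are hypothetical.

```python
import math

def gibbs_energy(errors, precisions):
    """G = 1/2 * sum(pi * e^2) - 1/2 * sum(ln pi), for a diagonal precision."""
    quadratic = sum(p * e * e for e, p in zip(errors, precisions))
    log_det = sum(math.log(p) for p in precisions)
    return 0.5 * quadratic - 0.5 * log_det

def neg_log_gaussian(errors, precisions):
    """Exact -ln N(e; 0, 1/pi), summed over independent errors."""
    return sum(0.5 * p * e * e - 0.5 * math.log(p) + 0.5 * math.log(2 * math.pi)
               for e, p in zip(errors, precisions))

precisions = [4.0, 1.0]      # hypothetical precisions (inverse variances)
small_errors = [0.1, -0.2]   # predictions close to the data
large_errors = [1.0, -2.0]   # predictions far from the data

# The two expressions differ only by a constant that ignores the errors.
constant = 0.5 * len(precisions) * math.log(2 * math.pi)

print(f"G(small errors) = {gibbs_energy(small_errors, precisions):.4f}")
print(f"G(large errors) = {gibbs_energy(large_errors, precisions):.4f}")
print(f"additive constant = {constant:.4f}")
```

Because the energy is a precision-weighted sum of squared errors, raising the precision of one channel makes its errors count for more, which is the role the deck assigns to synaptic gain.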
The recognition density and its sufficient statistics
Laplace approximation: $q(\vartheta \mid \mu) = \mathcal{N}(\mu, \Sigma(\mu))$
Perception and inference: synaptic activity, $\mu^{(x,v)}$
Learning and memory: synaptic efficacy, $\mu^{(\theta)}$ (activity-dependent plasticity, functional specialization)
Attention and salience: synaptic gain, $\mu^{(\gamma)}$ (attentional gain, enabling of plasticity)
How can we minimize prediction error (free-energy)?
Change sensory input
sensations – predictions
Prediction error
Change predictions
Action Perception
…prediction errors drive action and perception to suppress themselves
Adjust hypotheses
sensory input
Backward connections return predictions
…by hierarchical message passing in the brain
prediction
Forward connections convey feedback
So how do prediction errors change predictions?
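Perception and message-passing
Figure: hierarchical message passing driven by sensory input $s(t)$, with forward prediction errors $\xi^{(v,i)}, \xi^{(x,i)}$ and backward predictions $\mu^{(v,i)}, \mu^{(x,i)}$ exchanged between neighbouring levels
Prediction errors
$$\xi^{(v,i)} = \Pi^{(v,i)} \varepsilon^{(v,i)} = \Pi^{(v,i)} \big(\mu^{(v,i-1)} - g(\mu^{(i)})\big)$$
$$\xi^{(x,i)} = \Pi^{(x,i)} \varepsilon^{(x,i)} = \Pi^{(x,i)} \big(D\mu^{(x,i)} - f(\mu^{(i)})\big)$$
Predictions
$$\dot{\mu}^{(v,i)} = D\mu^{(v,i)} - \partial_v \varepsilon^{(i)T} \xi^{(i)} - \xi^{(v,i+1)}$$
$$\dot{\mu}^{(x,i)} = D\mu^{(x,i)} - \partial_x \varepsilon^{(i)T} \xi^{(i)}$$
Synaptic plasticity: optimisation of the parameters, $\mu^{(\theta)}$
Synaptic gain: optimisation of the precisions, $\mu^{(\gamma)}$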
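How prediction errors change predictions can be sketched with a single static cause, leaving out the hierarchy and the generalised motion. Everything in this example is an assumption made for illustration: the nonlinear mapping `g`, the sensation `s`, the prior expectation `eta` and the two precisions. Precisions are held fixed, so their log-determinant terms are dropped from the free-energy.

```python
def g(v):
    """Hypothetical generative mapping from cause to predicted sensation."""
    return v * v

def dg(v):
    """Derivative of g with respect to the cause."""
    return 2.0 * v

s = 2.0      # observed sensation
eta = 3.0    # prior expectation of the cause
pi_s = 1.0   # precision of the sensory prediction error
pi_v = 1.0   # precision of the prior prediction error

def free_energy(mu):
    """Precision-weighted squared prediction errors (constants dropped)."""
    return 0.5 * pi_s * (s - g(mu)) ** 2 + 0.5 * pi_v * (mu - eta) ** 2

def gradient(mu):
    xi_s = pi_s * (s - g(mu))   # forward (sensory) prediction error
    xi_v = pi_v * (mu - eta)    # prediction error on the cause
    return -dg(mu) * xi_s + xi_v

mu = eta     # start from the prior expectation
dt = 0.01
f_start = free_energy(mu)
for _ in range(5000):
    mu -= dt * gradient(mu)     # descend the free-energy gradient
f_end = free_energy(mu)

print(f"expectation after perception: {mu:.4f}")
print(f"free-energy: {f_start:.4f} -> {f_end:.4f}")
```

The expectation settles between the value that would explain the sensation exactly and the prior expectation, weighted by the two precisions.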
David Mumford
More formally,
What about action?
Action can only suppress (sensory) prediction error. This means action fulfils our (sensory) predictions
$$\dot{a} = -\partial_a \varepsilon^T \xi$$
$$\xi^{(v,1)} = \Pi^{(v,1)} \big(s(a) - g(\mu)\big)$$
Reflexes to action
Figure: a reflex arc (labels: predictions, sensory error, action, $s(a)$, dorsal root, ventral horn)
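The idea that action fulfils predictions can be illustrated with a one-dimensional reflex. This is a minimal sketch, assuming a hypothetical sensor whose reading is the action plus a constant disturbance, a fixed descending prediction, and a fixed precision; none of these values come from the slides.

```python
def sense(a, disturbance):
    """Hypothetical sensory mapping: the reading shifts one-to-one with action."""
    return a + disturbance

prediction = 1.0    # descending (e.g. proprioceptive) prediction, held fixed
precision = 4.0     # precision of the sensory prediction error
disturbance = 0.3   # unknown to the agent
ds_da = 1.0         # sensitivity of the sensation to action

a = 0.0
dt = 0.05
for _ in range(400):
    s = sense(a, disturbance)
    xi = precision * (s - prediction)   # precision-weighted prediction error
    a += dt * (-ds_da * xi)             # action descends the error gradient

final_error = sense(a, disturbance) - prediction
print(f"action = {a:.4f}, remaining prediction error = {final_error:.2e}")
```

The prediction never changes here; only the sensation does, and the action settles at whatever value cancels the disturbance.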
Summary
Biological agents resist the second law of thermodynamics
They must minimize their average surprise (entropy)
They minimize surprise by suppressing prediction error (free-energy)
Prediction error can be reduced by changing predictions (perception)
Prediction error can be reduced by changing sensations (action)
Perception entails recurrent message passing in the brain to optimise predictions
Action makes predictions come true (and minimises surprise)
Deep pyramidal cells
Superficial pyramidal cells
Forward connections: bottom-up prediction error
Backward connections: top-down predictions
Sensorimotor contingencies
Stimulus
Motor response
Exteroception
Classical reflex arc
Proprioception
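Action, predictions and priors
Figure: a two-joint arm anchored at $(0,0)$, with joint angles $x_1$, $x_2$ and labelled points $J_1$, $J_2$ and $V = (v_1, v_2, v_3)$; descending predictions are compared with visual input and proprioceptive input
$$\dot{a} = -\partial_a \varepsilon^T \xi$$
$$\xi^{(v,1)} = \Pi^{(v,1)} \big(s(a) - g(\mu)\big)$$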
A generative model of itinerant movement
Figure: Lotka-Volterra dynamics in attractor space (five states, 1 to 5, coupled through a connectivity matrix $A$) produce stable heteroclinic orbits, which map to a sequence of locations in physical space (winnerless competition)
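Winnerless competition can be sketched with the smallest Lotka-Volterra system that shows it. The example below is an illustration only: it uses three units instead of the five in the deck, and the inhibition strengths `ALPHA` and `BETA`, the initial state and the Euler step are all assumptions. Each unit inhibits one neighbour strongly and the other weakly, so no unit wins permanently and the lead passes around the ring.

```python
ALPHA, BETA = 0.5, 2.0   # assumed weak and strong inhibition (ALPHA < 1 < BETA)
N = 3

# RHO[i][j] is the inhibition that unit j exerts on unit i.
RHO = [
    [1.0, ALPHA, BETA],
    [BETA, 1.0, ALPHA],
    [ALPHA, BETA, 1.0],
]

def step(x, dt):
    """One Euler step of dx_i/dt = x_i * (1 - sum_j RHO[i][j] * x_j)."""
    rates = [1.0 - sum(RHO[i][j] * x[j] for j in range(N)) for i in range(N)]
    return [xi + dt * xi * r for xi, r in zip(x, rates)]

def simulate(x0, dt=0.01, steps=15000):
    """Integrate the system and record each change of the leading unit."""
    x = list(x0)
    winners = []
    for _ in range(steps):
        x = step(x, dt)
        leader = max(range(N), key=lambda i: x[i])
        if not winners or winners[-1] != leader:
            winners.append(leader)
    return x, winners

final_state, winner_sequence = simulate([0.5, 0.3, 0.2])
print("order in which units take the lead:", winner_sequence)
```

In a model of itinerant movement, each unit in such a sequence could stand for one location that the trajectory visits in turn.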
Itinerant behavior and action-observation
Figure: two panels, "action" and "observation", each plotting position (y) against position (x); descending predictions derive from hidden attractor states (Lotka-Volterra)
Violations and simulated ERPs
Figure: trajectories for the "observation" and "violation" conditions; below each, prediction error against time (ms), shown for proprioceptive error and for error on hidden states
Free-energy formulation
Environment: $\{x, v, a\} \to s$
Forward model (generative model): $\{x, v\} \to s$
Inverse model: $s \to a$
Desired and inferred states; sensory prediction error; motor command (action); corollary discharge
Forward-inverse formulation
Environment: $\{x, v, a\} \to s$
Forward model: $\{x, v, a\} \to s$
Inverse model (control policy): $\{x, v\} \to a$
Desired and inferred states; sensory prediction error; motor command (action); efference copy
The mountain car problem
The environment
Figure: height of the landscape against position, with the equations of true motion $f$ under action $a$
The cost-function
Figure: cost $c(x, h)$ as a function of position and happiness
Policy (predicted motion)
Adriaan Fokker Max Planck
“I expect to move faster when cost is positive”
With cost (i.e., exploratory dynamics)
Exploring & exploiting the environment
Using just the free-energy principle and itinerant priors on motion, we have solved a benchmark problem in optimal control theory (without any learning).
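For readers who have not met the benchmark, the sketch below shows why the mountain car is hard: the engine is too weak to climb directly, so the car must first move away from the goal. It is not the free-energy scheme from this talk. It assumes the classic textbook dynamics and constants (not the landscape used in the slides), and the `follow_velocity` rule is a simple hand-written heuristic standing in for a policy.

```python
import math

# Classic mountain-car constants, assumed here for illustration.
X_MIN, GOAL, V_MAX = -1.2, 0.5, 0.07
FORCE, GRAVITY = 0.001, 0.0025

def step(x, v, a):
    """Advance position x and velocity v by one step under action a in [-1, 1]."""
    v += FORCE * a - GRAVITY * math.cos(3.0 * x)
    v = max(-V_MAX, min(V_MAX, v))
    x += v
    if x < X_MIN:            # inelastic wall on the left
        x, v = X_MIN, 0.0
    return x, v

def run(policy, max_steps=1000, x=-0.5, v=0.0):
    """Return the step at which the goal is reached, or None if it never is."""
    for t in range(max_steps):
        if x >= GOAL:
            return t
        x, v = step(x, v, policy(x, v))
    return None

def always_right(x, v):
    """Push towards the goal at all times."""
    return 1.0

def follow_velocity(x, v):
    """Push in the direction of travel, so the car gains energy on each swing."""
    return 1.0 if v >= 0.0 else -1.0

print("always push right:", run(always_right))
print("push along velocity:", run(follow_velocity))
```

Pushing straight at the goal leaves the car oscillating in the valley, whereas a policy that first explores away from the goal reaches it.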
Policies and prior expectations
Summary
The free-energy can be minimized by action (by changing sensory input) or perception (by changing predictions of that input)
The only way that action can suppress free-energy is by reducing sensory prediction error (cf. the juxtaposition of motor and sensory cortex)
Action fulfils expectations, which manifests as suppression of prediction error by resampling sensory input;
Or as intentional movement, fulfilling expectations furnished by empirical priors (cf. sensorimotor contingencies)
Many adaptive and realistic behaviors can be formulated in terms of prior expectations about itinerant trajectories (through the autovitiation of fixed-point attractors)
Thank you
And thanks to collaborators:
Jean Daunizeau, Harriet Feldman, Lee Harrison, Stefan Kiebel, James Kilner, Jérémie Mattout, Klaas Stephan
And colleagues:
Peter Dayan, Jörn Diedrichsen, Paul Verschure, Florentin Wörgötter
And many others