
Consciousness as Integrated Information: a Provisional Manifesto

GIULIO TONONI

Department of Psychiatry, University of Wisconsin, Madison, Wisconsin

Abstract. The integrated information theory (IIT) starts

from phenomenology and makes use of thought experiments to claim that consciousness is integrated information.

Specifically: (i) the quantity of consciousness corresponds

to the amount of integrated information generated by a

complex of elements; (ii) the quality of experience is spec-

ified by the set of informational relationships generated

within that complex. Integrated information (Φ) is defined

as the amount of information generated by a complex of

elements, above and beyond the information generated by

its parts. Qualia space (Q) is a space where each axis

represents a possible state of the complex, each point is a

probability distribution of its states, and arrows between

points represent the informational relationships among its

elements generated by causal mechanisms (connections).

Together, the set of informational relationships within a

complex constitute a shape in Q that completely and univo-

cally specifies a particular experience. Several observations

concerning the neural substrate of consciousness fall natu-

rally into place within the IIT framework. Among them are

the association of consciousness with certain neural systems

rather than with others; the fact that neural processes un-

derlying consciousness can influence or be influenced by

neural processes that remain unconscious; the reduction of

consciousness during dreamless sleep and generalized sei-

zures; and the distinct role of different cortical architectures

in affecting the quality of experience. Equating conscious-

ness with integrated information carries several implications

for our view of nature.

INTRODUCTION

Everybody knows what consciousness is: it is what van-

ishes every night when we fall into dreamless sleep and

reappears when we wake up or when we dream. It is also all

we are and all we have: lose consciousness and, as far as

you are concerned, your own self and the entire world

dissolve into nothingness.

Yet almost everybody thinks that understanding con-

sciousness at the fundamental level is currently beyond the

reach of science. The best we can do, it is often argued, is

gather more and more facts about the neural correlates of

consciousness (those aspects of brain function that change

when some aspects of consciousness change) and hope that one day we will come up with an explanation. Others are

more pessimistic: we may learn all about the neural corre-

lates of consciousness and still not understand why certain

physical processes seem to generate experience while others

do not.

It is not that we do not know relevant facts about con-

sciousness. For example, we know that the widespread

destruction of the cerebral cortex leaves people permanently

unconscious (vegetative), whereas the complete removal of

the cerebellum, even richer in neurons, hardly affects con-

sciousness. We also know that neurons in the cerebral

cortex remain active throughout sleep, yet at certain times

during sleep consciousness fades, while at other times we

dream. Finally, we know that different parts of the cortex

influence different qualitative aspects of consciousness:

damage to certain parts of the cortex can impair the expe-

rience of color, whereas other lesions may interfere with the

perception of shapes. In fact, increasingly refined neurosci-

entific tools are uncovering increasingly precise aspects of

the neural correlates of consciousness (Koch, 2004). And

yet, when it comes to explaining why experience blossoms

in the cortex and not in the cerebellum, why certain stages

of sleep are experientially underprivileged, or why some

Received 20 August 2008; accepted 10 October 2008.

* To whom correspondence should be addressed. E-mail: gtononi@wisc.edu

Abbreviations: Φ, integrated information; IIT, integrated information

theory; MIP, minimum information partition.

Reference: Biol. Bull. 215: 216-242. (December 2008) © 2008 Marine Biological Laboratory


cortical areas endow our experience with colors and others

with sound, we are still at a loss.

Our lack of understanding is manifested most clearly

when scientists are asked questions about consciousness in

difficult cases. For example, is a person with akinetic

mutism (awake with eyes open, but mute, immobile, and

nearly unresponsive) conscious or not? How much consciousness

is there during sleepwalking or psychomotor

seizures? Are newborn babies conscious, and to what ex-

tent? Are animals conscious? If so, are some animals more

conscious than others? Can they feel pain? Does a bat feel

space the same way we do? Can bees experience colors, or

merely react to them? Can a conscious artifact be con-

structed with non-neural ingredients? I believe it is fair to

say that no consciousness expert, if there is such a job

description, can be confident about the correct answer to

such questions. This is a remarkable state of affairs. Just

consider comparable questions in physics: Do stars have

mass? Do atoms? How many different kinds of atoms and

elementary particles are there, and of what are they made?

Is energy conserved? And how can it be measured? Or

consider biology: What are species, and how do they

evolve? How are traits inherited? How do organisms de-

velop? How is energy produced from nutrients? How does

echolocation work in bats? How do bees distinguish among

colors? And so on. Obviously, we expect satisfactory an-

swers by any competent physicist and biologist.

What's the matter with consciousness, then, and how

should we proceed? Early on, I came to the conclusion that

a genuine understanding of consciousness is possible only if

empirical studies are complemented by a theoretical analy-

sis. Indeed, neurobiological facts constitute both challeng-

ing paradoxes and precious clues to the enigma of con-

sciousness. This state of affairs is not unlike the one faced

by biologists when, knowing a great deal about similarities

and differences between species, fossil remains, and breed-

ing practices, they still lacked a theory of how evolution

might occur. What was needed, then as now, were not just

more facts, but a theoretical framework that could make

sense of them.

In what follows, I discuss the integrated information

theory of consciousness (IIT; Tononi, 2004), an attempt to

understand consciousness at the fundamental level. To present the theory, I first consider phenomenological

thought experiments indicating that subjective experience

has to do with the generation of integrated information.

Next, I consider how integrated information can be defined

mathematically. I then show how basic facts about con-

sciousness and the brain can be accounted for in terms of

integrated information. Finally, I discuss how the quality of

consciousness can be captured geometrically by the shape

of informational relationships within an abstract space

called qualia space. I conclude by examining some impli-

cations of the theory concerning the place of experience in

our view of the world.

A Phenomenological Analysis: Consciousness as

Integrated Information

The integrated information theory (IIT) of consciousness claims that, at the fundamental level, consciousness is integrated

information, and that its quality is given by the

informational relationships generated by a complex of ele-

ments (Tononi, 2004). These claims stem from realizing

that information and integration are the essential properties

of our own experience. This may not be immediately evi-

dent, perhaps because, being endowed with consciousness

most of the time, we tend to take its gifts for granted. To

regain some perspective, it is useful to resort to two thought

experiments, one involving a photodiode and the other a

digital camera.

Information: the photodiode thought experiment

Consider the following: You are facing a blank screen

that is alternately on and off, and you have been instructed

to say "light" when the screen turns on and "dark" when it

turns off. A photodiode (a simple light-sensitive device)

has also been placed in front of the screen. It contains a

sensor that responds to light with an increase in current and

a detector connected to the sensor that says "light" if the

current is above a certain threshold and "dark" otherwise.

The first problem of consciousness reduces to this: when

you distinguish between the screen being on or off, you

have the subjective experience of seeing light or dark. The photodiode can also distinguish between the screen being on

or off, but presumably it does not have a subjective expe-

rience of light and dark. What is the key difference between

you and the photodiode?

According to the IIT, the difference has to do with how

much information is generated when that distinction is

made. Information is classically defined as reduction of

uncertainty: the more numerous the alternatives that are

ruled out, the greater the reduction of uncertainty, and thus

the greater the information. It is usually measured using the

entropy function, which is the logarithm of the number of

alternatives (assuming they are equally likely). For exam-

ple, tossing a fair coin and obtaining heads corresponds to

log2(2) = 1 bit of information, because there are just two

alternatives; throwing a fair die yields log2(6) ≈ 2.58 bits of

information, because there are six.
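The coin and die figures above can be checked with a few lines of Python. This is a minimal sketch; the function name `bits_for_equally_likely` is a hypothetical helper introduced here for illustration, not something defined in the text.

```python
import math


def bits_for_equally_likely(n_alternatives: int) -> float:
    """Information (in bits) from singling out one of n equally likely alternatives."""
    if n_alternatives < 1:
        raise ValueError("need at least one alternative")
    return math.log2(n_alternatives)


coin_bits = bits_for_equally_likely(2)  # fair coin: two alternatives
die_bits = bits_for_equally_likely(6)   # fair die: six alternatives
print(f"coin: {coin_bits:.3f} bits, die: {die_bits:.3f} bits")
```

The more alternatives are ruled out, the larger the logarithm, which is the sense in which a richer repertoire yields more information.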

Let us now compare the photodiode with you. When the

blank screen turns on, the mechanism in the photodiode tells

the detector that the current from the sensor is above rather

than below the threshold, so it reports "light." In performing

this discrimination between two alternatives, the detector in

the photodiode generates log2(2) = 1 bit of information.

When you see the blank screen turn on, on the other hand,


the situation is quite different. Though you may think you

are performing the same discrimination between light and

dark as the photodiode, you are in fact discriminating

among a much larger number of alternatives, thereby gen-

erating many more bits of information.

This is easy to see. Just imagine that, instead of turning

light and dark, the screen were to turn red, then green, then

blue, and then display, one after the other, every frame from

every movie that was ever produced. The photodiode, in-

evitably, would go on signaling whether the amount of light

for each frame is above or below its threshold: to a photodiode,

things can only be one of two ways, so when it

reports "light," it really means just "this way" versus "that

way." For you, however, a light screen is different not only

from a dark screen, but from a multitude of other images, so

when you say "light," it really means this specific way

versus countless other ways, such as a red screen, a green

screen, a blue screen, this movie frame, that movie frame,

and so on for every movie frame (not to mention for a sound, smell, thought, or any combination of the above).

Clearly, each frame looks different to you, implying that

some mechanism in your brain must be able to tell it apart

from all the others. So when you say "light," whether you

think about it or not (and you typically won't), you have just

made a discrimination among a very large number of alter-

natives, and thereby generated many bits of information.

This point is so deceptively simple that it is useful to

elaborate a bit on why, although a photodiode may be as

good as we are in detecting light, it cannot possibly see light

the way we do; in fact, it cannot possibly see anything at

all. Hopefully, by realizing what the photodiode lacks, we may appreciate what allows us to consciously see the

light.

The key is to realize how the many discriminations we

can make, and the photodiode cannot, affect the meaning of the

discrimination at hand, the one between light and dark. For

example, the photodiode has no mechanism to discriminate

colored from achromatic light, even less to tell which par-

ticular color the light might be. As a consequence, all light

is the same to it, as long as it exceeds a certain threshold. So

for the photodiode, light cannot possibly mean achromatic

as opposed to colored, not to mention of which

particular color. Also, the photodiode has no mechanism to

distinguish between a homogeneous light and a bright

shape (any bright shape) on a darker background. So for

the photodiode, light cannot possibly mean full field as

opposed to a shape (any of countless particular shapes).

Worse, the photodiode does not even know that it is detect-

ing a visual attribute (the visualness of light) as it has no

mechanism to tell visual attributes, such as light or dark,

from non-visual ones, such as hot and cold, light or heavy,

loud or soft, and so on. As far as it knows, the photodiode

might just as well be a thermistor: it has no way of knowing

whether it is sensing light versus dark or hot versus cold.

In short, the only specification a photodiode can make is

whether things are this or that way: any further specification

is impossible because it does not have mechanisms for it.

Therefore, when the photodiode detects light, such light

cannot possibly mean what it means for us; it does not even

mean that it is a visual attribute. By contrast, when we see

light in full consciousness, we are implicitly being much

more specific: we simultaneously specify that things are this

way rather than that way (light as opposed to dark), that

whatever we are discriminating is not colored (in any par-

ticular color), does not have a shape (any particular one), is

visual as opposed to auditory or olfactory, sensory as op-

posed to thought-like, and so on. To us, then, light is much

more meaningful precisely because we have mechanisms

that can discriminate this particular state of affairs we call

light against a large number of alternatives.

According to the IIT, it is all this added meaning, pro-

vided implicitly by how we discriminate pure light from all

these alternatives, that increases the level of consciousness.

This central point may be appreciated either by subtraction

or by addition. By subtraction, one may realize that

our being conscious of light would degrade more and

more (would lose its non-coloredness, its non-shapedness,

would even lose its visualness) as its meaning is progressively

stripped down to just one of two ways, as with the

photodiode. By addition, one may realize that we can only

see light as we see it, as progressively more and more

meaning is added by specifying how it differs from count-

less alternatives. Either way, the theory says that the more

specifically one's mechanisms discriminate between what

pure light is and what it is not (the more they specify what light means), the more one is conscious of it.

Integration: the camera thought experiment

Information (the ability to discriminate among a large

number of alternatives) may thus be essential for consciousness.

However, information always implies a point of

view, and we need to be careful about what that point of

view might be. To see why, consider another thought ex-

periment, this time involving a digital camera, say one

whose sensor chip is a collection of a million binary pho-

todiodes, each sporting a sensor and a detector. Clearly,

taken as a whole, the camera's detectors could distinguish

among 2^1,000,000 alternative states, an immense number,

corresponding to 1 million bits of information. Indeed, the

camera would easily respond differently to every frame

from every movie that was ever produced. Yet few would

argue that the camera is conscious. What is the key differ-

ence between you and the camera?

According to the IIT, the difference has to do with

integrated information. From the point of view of an exter-

nal observer, the camera may be considered as a single

system with a repertoire of 2^1,000,000 states. In reality, however,

the chip is not an integrated entity: since its 1 million

photodiodes have no way to interact, each photodiode per-

forms its own local discrimination between a low and a high

current completely independent of what every other photo-

diode might be doing. In reality, the chip is just a collection

of 1 million independent photodiodes, each with a repertoire

of two states. In other words, there is no intrinsic point of

view associated with the camera chip as a whole. This is

easy to see: if the sensor chip were cut into 1 million pieces

each holding its individual photodiode, the performance of

the camera would not change at all.

By contrast, you discriminate among a vast repertoire of

states as an integrated system, one that cannot be broken

down into independent components each with its own sep-

arate repertoire. Phenomenologically, every experience is

an integrated whole, one that means what it means by virtue

of being one, and that is experienced from a single point of

view. For example, the experience of a red square cannot be

decomposed into the separate experience of red and the separate experience of a square. Similarly, experiencing the

full visual field cannot be decomposed into experiencing

separately the left half and the right half: such a possibility

does not even make sense to us, since experience is always

whole. Indeed, the only way to split an experience into

independent experiences seems to be to split the brain in

two, as in patients who underwent the section of the corpus

callosum to treat severe epilepsy (Gazzaniga, 2005). Such

patients do indeed experience the left half of the visual field

independently of the right side, but then the surgery has

created two separate consciousnesses instead of one. Mechanistically

then, underlying the unity of experience must be causal interactions among certain elements within the brain.

This means that these elements work together as an inte-

grated system, which is why their performance, unlike that

of the camera, breaks down if they are disconnected.

A Mathematical Analysis: Quantifying Integrated

Information

This phenomenological analysis suggests that, to gener-

ate consciousness, a physical system must be able to dis-

criminate among a large repertoire of states (information)

and it must be unified; that is, it should be doing so as a

single system, one that is not decomposable into a collection of causally independent parts (integration). But how can one

measure integrated information? As I explain below, the

central idea is to quantify the information generated by a

system, above and beyond the information generated inde-

pendently by its parts (Tononi, 2001, 2004; Balduzzi and

Tononi, 2008).1

Information

First, we must evaluate how much information is gener-

ated by the system. Consider the system of two binary units

in Figure 1, which can be thought of as an idealized version

of a photodiode composed of a sensor S and a detector D.

The system is characterized by a state it is in, which in this

case is 11 (first digit for the sensor, second digit for the

detector), and by a mechanism. This is mediated by a

connection (arrow) between the sensor and the detector that

implements a causal interaction: in this case, the elementary

mechanism of the system is that the detector checks the state

of the sensor and turns on if the sensor is on, and off

otherwise (more generally, the specific causal interaction

can be described by an input-output table).

[Figure 1. Panel A: a photodiode with a SENSOR unit (1) connected to a DETECTOR unit (2). Panel B: the potential distribution p(X0(maxH)), with P = 1/4 on each of the states 00, 01, 10, 11, and the actual distribution p(X0(mech, x1)), with P = 1/2 on each of two states. ei(X(mech, x1)) = H[p(X0(mech, x1)) || p(X0(maxH))] = 1 bit.]

Figure 1. Effective information. (A) A photodiode consisting of a sensor and detector unit. The photodiode's mechanism is such that the detector unit turns on if the sensor's current is above a threshold. Here both units are on (binary 1, indicated in gray). (B) For the entire system (sensor unit, detector unit) there are four possible states: (00, 01, 10, 11). The potential distribution p(X0(maxH)) = (1/4, 1/4, 1/4, 1/4) is the maximum entropy distribution on the four states. Given the photodiode's mechanism and the fact that the detector is on, the sensor must have been on. Thus, the photodiode's mechanism and its current state specify the following distribution: two of the four possible states (00, 01) are ruled out; the other two states (10, 11) are equally likely since they are indistinguishable to the mechanism (the prior state of the detector makes no difference to the current state of the sensor). The actual distribution is therefore p(X0(mech, x1)) = (0, 0, 1/2, 1/2). Relative entropy (Kullback-Leibler divergence) between two probability distributions p and q is H[p||q] = Σ_i p_i log2(p_i/q_i), so the effective information ei(X(mech, x1)) associated with output x1 = 11 is 1 bit (effective information is the entropy of the actual relative to the potential distributions).

Potentially, a system of two binary elements could be in

any of four possible states (00, 01, 10, 11) with equal probability: p = (1/4, 1/4, 1/4, 1/4). Formally, this potential (a

priori) repertoire is represented by the maximum entropy or

uniform distribution of possible system states at time t0,

which expresses complete uncertainty (p(X0(maxH))). Con-

sidering the potential repertoire as the set of all possible

input states, the particular mechanism X(mech) of this system

can be thought of as specifying a forward repertoire:

the probability distribution of output states produced by the

system when perturbed with all possible input states. But the

system is actually in a particular output state (in this case, at

time t1, x1 = 11). In actuality, a system with this mechanism

being in state 11 specifies that the previous system

state x0 must have been either 11 or 10, rather than 00 or 01,

corresponding to p = (0, 0, 1/2, 1/2) (in this system, there is

no mechanism to specify the detector state, which remains

uncertain). Formally, then, the mechanism and the state 11

specify an actual (a posteriori) distribution or repertoire of

system states p(X0(mech, x1)) at time t0 that could have

caused (led to) x1 at time t1, while ruling out (giving probability zero to) states that could not. In this way, the

system's mechanism and state constitute information (about

the system's previous state), in the classic sense of reduction

of uncertainty or ignorance. More precisely, the system's

mechanism and state generate 1 bit of information by dis-

tinguishing between things being one way (11 or 10, which

remain indistinguishable to it) rather than another way (00

or 01, which also remain indistinguishable to it).

In general, the information generated by a system

characterized by a certain mechanism in a particular state

can be measured by the relative entropy H between the

actual and the potential repertoires ("relative to" is indicated by ||), captured by the effective information (ei):

$$ei(X(\mathrm{mech}, x_1)) = H\big[\,p(X_0(\mathrm{mech}, x_1)) \,\big\|\, p(X_0(\mathrm{maxH}))\,\big]$$

Relative entropy, also known as Kullback-Leibler diver-

gence, is a difference between probability distributions

(Cover and Thomas, 2006): if the distributions are identical,

relative entropy is zero; the more different they are, the

higher the relative entropy.2 Figuratively, the system's

mechanism and state generate information by sharpening

the uniform distribution into a less uniform one; this is

how much uncertainty is reduced. Clearly, the amount of

effective information generated by a system is high if it has

a large potential repertoire and a small actual repertoire,

since a large number of initial states are ruled out. By

contrast, little information is generated if the system's

repertoire is small, or if many states could lead to the current

outcome, since few states are ruled out. For instance, if

noise dominates (any state could have led to the current

one), no alternatives are ruled out, and no information is

generated.
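As a worked illustration (a sketch under a simplifying assumption, not a formula taken from the text), suppose the actual repertoire is uniform over the k prior states compatible with the current state, out of N states in the potential repertoire. The symbols N, k, and the set A of compatible states are introduced here only for this example. Relative entropy then reduces to a single logarithm:

$$
ei = H[p \,\|\, q] = \sum_{x_0 \in A} \frac{1}{k}\,\log_2\frac{1/k}{1/N} = \log_2\frac{N}{k}
$$

For the photodiode, N = 4 and k = 2, which gives 1 bit. A larger potential repertoire (larger N) or a smaller actual repertoire (smaller k) raises the value, whereas if noise dominates, every state remains compatible (k = N) and the result is 0 bits.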

Since effective information is implicitly specified once a

mechanism and state are specified, it can be considered to be

an intrinsic property of a system. To calculate it explic-

itly, from an extrinsic perspective, one can perturb the

system in all possible ways (i.e., try out all possible input

states, corresponding to the maximum entropy distribution

or potential repertoire) to obtain the forward repertoire of

output states given the system's mechanism. Finally, one can

calculate, using Bayes' rule, the actual repertoire given the

system's state (Balduzzi and Tononi, 2008).3
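The procedure just described can be sketched for the two-unit photodiode. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and it assumes that the sensor, which receives no input from inside the system, is treated as fair noise, while the detector deterministically copies the previous sensor state.

```python
from itertools import product
from math import log2

# Prior states x0 as (sensor, detector): 00, 01, 10, 11
STATES = list(product((0, 1), repeat=2))


def p_next_given_prev(x1, x0):
    """p(x1 | x0) for the idealized photodiode.

    Assumption: the sensor's next state is fair noise (probability 1/2 either
    way); the detector copies the previous state of the sensor.
    """
    sensor0, _detector0 = x0
    _sensor1, detector1 = x1
    p_sensor = 0.5
    p_detector = 1.0 if detector1 == sensor0 else 0.0
    return p_sensor * p_detector


def actual_repertoire(x1):
    """Bayes' rule with the uniform (maximum entropy) prior over x0."""
    prior = 1.0 / len(STATES)
    joint = [p_next_given_prev(x1, x0) * prior for x0 in STATES]
    total = sum(joint)
    return [j / total for j in joint]


def relative_entropy(p, q):
    """H[p || q] in bits, skipping terms where p is zero."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


potential = [1.0 / len(STATES)] * len(STATES)
actual = actual_repertoire((1, 1))  # current state x1 = 11
ei = relative_entropy(actual, potential)
print("actual repertoire:", actual)
print("effective information (bits):", ei)
```

Perturbing the system with every input state corresponds to looping over `STATES` with equal prior weight; conditioning on the observed output and normalizing is the Bayes step.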

Integration

Second, we must find out how much of the information

generated by a system is integrated information; that is, how

much information is generated by a single entity, as opposed

to a collection of independent parts. The idea here is to

consider the parts of the system independently, ask how

much information they generate by themselves, and compare it

with the information generated by the system as a whole.

This can be done by resorting again to relative entropy to measure the difference between the probability distribution

generated by the system as a whole (p(X0(mech, x1)), the

actual repertoire of the system x) and the probability distribution

generated by the parts considered independently

(Π p(kM0(mech, μ1)), the product of the actual repertoires of

the parts kM). Integrated information is indicated with the

symbol Φ (the vertical bar "I" stands for information, the

circle "O" for integration):

$$\Phi(X(\mathrm{mech}, x_1)) = H\Big[\,p(X_0(\mathrm{mech}, x_1)) \,\Big\|\, \prod_k p({}^{k}M_0(\mathrm{mech}, \mu_1))\Big] \quad \text{for } {}^{k}M_0 \in \mathrm{MIP}$$

That is, the actual repertoire for each part is specified by

causal interactions internal to each part, considered as a

system in its own right, while external inputs are treated as

a source of extrinsic noise. The comparison is made with the

particular decomposition of the system into parts that leaves

the least information unaccounted for. This minimum infor-

mation partition (MIP) decomposes the system into its

minimal parts.

To see how this works, consider two of the million

photodiodes in the digital camera (Fig. 2, left). By turning

on or off depending on its input, each photodiode generates

1 bit of information, just as we saw before. Considered

independently, then, two photodiodes generate 2 bits of

information, and 1 million photodiodes generate 1 million

bits of information. However, as shown in the figure, the

product of the actual distributions generated independently

by the parts is identical to the actual distribution for the

system. Therefore, the relative entropy between the two

distributions is zero: the system generates no integrated

information (Φ(X(mech, x1)) = 0) above and beyond what

is generated by its parts.
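The two-photodiode case can be checked numerically. The sketch below is illustrative only: the `copies` dictionary is a hypothetical encoding of the mechanism (each detector copies its own sensor), units without internal input are assumed to behave as fair noise, and only the partition into the two photodiodes is evaluated rather than searching for the minimum information partition. Since relative entropy is never negative, a value of zero for this partition already shows that nothing is left unaccounted for.

```python
from itertools import product
from math import log2


def actual_repertoire(n_units, copies, x1):
    """Distribution over prior states x0 that could have led to x1.

    `copies` maps a unit index to the unit whose previous state it copies.
    Units with no entry get no input from inside the system and are treated
    as fair noise (an assumption of this sketch).
    """
    states = list(product((0, 1), repeat=n_units))
    weights = []
    for x0 in states:
        w = 1.0
        for unit in range(n_units):
            if unit in copies:
                w *= 1.0 if x1[unit] == x0[copies[unit]] else 0.0
            else:
                w *= 0.5
        weights.append(w)
    total = sum(weights)
    return {x0: w / total for x0, w in zip(states, weights)}


def relative_entropy(p, q):
    """H[p || q] in bits over a shared set of states."""
    return sum(pv * log2(pv / q[s]) for s, pv in p.items() if pv > 0)


# Whole system: (sensor a, detector a, sensor b, detector b), all units on.
whole = actual_repertoire(4, {1: 0, 3: 2}, (1, 1, 1, 1))
# Parts: each photodiode considered on its own.
part_a = actual_repertoire(2, {1: 0}, (1, 1))
part_b = actual_repertoire(2, {1: 0}, (1, 1))
product_of_parts = {x0: part_a[x0[:2]] * part_b[x0[2:]] for x0 in whole}

uniform = {x0: 1.0 / len(whole) for x0 in whole}
ei_whole = relative_entropy(whole, uniform)
phi = relative_entropy(whole, product_of_parts)
print(f"ei(whole) = {ei_whole} bits, phi = {phi} bits")
```

Because no connection crosses between the two photodiodes, the whole-system repertoire factorizes exactly into the product of the parts, so the 2 bits generated by the whole are fully accounted for by 1 bit from each part.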

Clearly, for integrated information to be high, a system

must be connected in such a way that information is generated

by causal interactions among rather than within its

parts. Thus, a system can generate integrated information

only to the extent that it cannot be decomposed into infor-

mationally independent parts. A simple example of such a

system is shown in Figure 2 (right). In this case, the inter-

action between the minimal parts of the system generates

information above and beyond what is accounted for by the

parts by themselves (Φ(X(mech, x1)) > 0).

In short, integrated information captures the information

generated by causal interactions in the whole, over and

above the information generated by the parts.4

Complexes

Finally, by measuring Φ values for all subsets of elements

within a system, we can determine which subsets form

complexes. Specifically, a complex X is a set of elements

that generate integrated information (Φ > 0) that is not fully

contained in some larger set of higher Φ (Fig. 3). A complex,

then, can be properly considered to form a single

entity having its own, intrinsic point of view (as opposed

to being treated as a single entity from an outside, extrinsic

point of view). Since integrated information is generated

within a complex and not outside its boundaries, experience

is necessarily private and related to a single point of view or

perspective (Tononi and Edelman, 1998; Tononi, 2004). A

given physical system, such as a brain, is likely to contain

more than one complex, many small ones with low Φ

values, and perhaps a few large ones (Tononi and Edelman,

1998; Tononi, 2004). In fact, at any given time there may be

a single main complex of comparatively much higher Φ that

underlies the dominant experience (a main complex is such that its subsets have strictly lower Φ). As shown in Figure

3, a main complex can be embedded into larger complexes

of lower Φ. Thus, a complex can be causally connected,

through ports-in and ports-out, to elements that are not part

of it. According to the IIT, such elements can indirectly

influence the state of the main complex without contributing

directly to the conscious experience it generates (Tononi

and Sporns, 2003).

A Neurobiological Reality Check: Accounting for

Empirical Observations

Can this approach account, at least in principle, for some

of the basic facts about consciousness that have emerged

from decades of clinical and neurobiological observations?

Measuring Φ and finding complexes is not easy for realistic

systems, but it can be done for simple networks that bear

some structural resemblance to different parts of the brain

(Tononi, 2004; Balduzzi and Tononi, 2008).

For example, by using computer simulations, it is possible

to show that high Φ requires networks that conjoin

functional specialization (due to its specialized connectivity,

each element has a unique functional role within the

network) with functional integration (there are many pathways

for interactions among the elements; Fig. 4A). In very

rough terms, this kind of architecture is characteristic of the

mammalian corticothalamic system: different parts of the

cerebral cortex are specialized for different functions, yet a

vast network of connections allows these parts to interact

profusely. And indeed, as much neurological evidence in-

dicates (Posner and Plum, 2007), the corticothalamic system

is precisely the part of the brain that cannot be severely

impaired without loss of consciousness.

Conversely, Φ is low for systems that are made up of

small, quasi-independent modules (Fig. 4B; Tononi, 2004;

Balduzzi and Tononi, 2008). This may be why the cerebel-

lum, despite its large number of neurons, does not contrib-

ute much to consciousness: its synaptic organization is such

that individual patches of cerebellar cortex tend to be acti-

vated independently of one another, with little interaction

between distant patches (Bower, 2002).

Computer simulations also show that units along multiple, segregated incoming or outgoing pathways are not

incorporated within the repertoire of the main complex (Fig.

4C; Tononi, 2004; Balduzzi and Tononi, 2008). This may

be why neural activity in afferent pathways (perhaps as far

as V1), though crucial for triggering this or that conscious

experience, does not contribute directly to conscious expe-

rience; nor does activity in efferent pathways (perhaps start-

ing with primary motor cortex), though it is crucial for

reporting each different experience.

The addition of many parallel cycles also generally does

not change the composition of the main complex, although

Φ values can be altered (Fig. 4D). Instead, cortical and subcortical cycles or loops implement specialized subroutines

that are capable of influencing the states of the main

corticothalamic complex without joining it. Such informa-

tionally insulated cortico-subcortical loops could constitute

the neural substrates for many unconscious processes that

can affect and be affected by conscious experience (Baars,

1988; Tononi, 2004), such as those that enable object rec-

ognition, language parsing, or translating our vague inten-

tions into the right words.

At this stage, it is hard to say precisely which cortical

circuits may work as a large complex of high Φ, and which

instead may remain informationally insulated. Does the

dense mesial connectivity revealed by diffusion spectrum

imaging (Hagmann et al., 2008) constitute the backbone

of a corticothalamic main complex? Do parallel loops

through basal ganglia implement informationally insulated

subroutines? Are primary sensory cortices organized like

massive afferent pathways to a main complex higher up in

the cortical hierarchy (Koch, 2004)? Is much of prefrontal

cortex organized like a massive efferent pathway? Do cer-

tain cortical areas, such as those belonging to the dorsal

visual stream, remain partly segregated from the main complex? Unfortunately, answering these questions and properly testing the predictions of the theory requires a much better understanding of cortical neuroanatomy than is currently available.

[Figure 2 image. Panel titles: (A) Information generated by the system; (B) Information generated by the parts; (C) Integrated information generated by the system above and beyond the parts. Labeled values, left-hand system: effective information of the whole = 2 bits, of each part = 1 bit, integrated information = 0 bits. Right-hand system: effective information of the whole = 4 bits, of the two parts = 1.1 bits and 1 bit, integrated information = 2 bits.]

Figure 2. Integrated information. Left-hand side: two photodiodes in a digital camera. (A) Information generated by the system as a whole. The system as a whole generates 2 bits of effective information by specifying that n1 and n3 must have been on. (B) Information generated by the parts. The minimum information partition (MIP) is the decomposition of a system into (minimal) parts, that is, the decomposition that leaves the least information unaccounted for. Here the parts are two photodiodes. (C) The information generated by the system as a whole is completely accounted for by the information generated by its parts. In this case, the actual repertoire of the whole is identical to the combined actual repertoires of the parts (the product of their

Other simulations show that the effects of cortical dis-

connections are readily captured in terms of integrated

information (Tononi, 2004): a callosal cut produces, out

of a large complex corresponding to the connected cortico-

thalamic system, two separate complexes, in line with many

studies of split-brain patients (Gazzaniga, 2005). However,

because there is great redundancy between the two hemispheres, their Φ value is not greatly reduced compared to

when they form a single complex. Functional disconnec-

tions may also lead to a restriction of the neural substrate of

consciousness, as is seen in neurological neglect phenom-

ena, in psychiatric conversion and dissociative disorders,

and possibly during dreaming and hypnosis. It is also likely

that certain attentional phenomena may correspond to

changes in the composition of the main complex underlying

consciousness (Koch and Tsuchiya, 2007). The attentional blink,⁵ where a fixed sensory input may at times make it to

consciousness and at times not, may also be due to changes

in functional connectivity: access to the main corticotha-

lamic complex may be enabled or not based on dynamics

intrinsic to the complex (Dehaene et al., 2003). Similarly,

binocular rivalry⁶ may be related, at least in part, to dynamic changes in the composition of the main corticothalamic complex caused by transient changes in functional

connectivity. Computer simulations confirm that functional

disconnection can reduce the size of a complex and reduce

its capacity to integrate information (Tononi, 2004). While

it is not easy to determine, at present, whether a particular group of neurons is excluded from the main complex because of hard-wired anatomical constraints or is transiently disconnected due to functional changes, the set of elements underlying consciousness is not static, but forms a "dynamic complex" or "dynamic core" (Tononi and Edelman, 1998).

Computer simulations also indicate that the capacity to

integrate information is reduced if neural activity is ex-

tremely high and near-synchronous, due to a dramatic de-

crease in the repertoire of discriminable states (Fig. 4E;

Balduzzi and Tononi, 2008). This reduction in degrees of

freedom could be the reason that consciousness is reduced

or eliminated in absence seizure (petit mal) and other con-

ditions during which neural activity is both high and syn-

chronous (Blumenfeld and Taylor, 2003).

The most common example of a marked change in the

level of experience is the fading of consciousness that

occurs during certain periods of sleep. Subjects awakened in

deep NREM (nonrapid eye movement) sleep, especially

early in the night, often report that they were not aware of

themselves or of anything else, though cortical and thalamic

neurons remain active. Awakened at other times, mainly

during REM sleep or during lighter periods of NREM sleep

later in the night, they report dreams characterized by vivid

images (Hobson et al., 2000). From the perspective of

integrated information, a reduction of consciousness during

early sleep would be consistent with the bistability of cor-

tical circuits during deep NREM sleep. Due to changes in

intrinsic and synaptic conductances triggered by neuro-

modulatory changes (e.g., low acetylcholine), cortical neu-

rons cannot sustain firing for more than a few hundred

milliseconds and invariably enter a hyperpolarized down-

state. Shortly afterward, they inevitably return to a depolarized up-state (Steriade et al., 2001). Indeed, computer simulations show that values of Φ are low in systems with such

bistable dynamics (Fig. 4F, Balduzzi and Tononi, 2008).

Consistent with these observations, studies using TMS, a

technique for stimulating the brain non-invasively, in con-

junction with high-density EEG, show that early NREM

sleep is associated either with a breakdown of the effective

connectivity among cortical areas, and thereby with a loss of

integration (Massimini et al., 2005, 2007), or with a stereotypical global response suggestive of a loss of repertoire and

thus of information (Massimini et al., 2007). Similar

changes are seen in animal studies of anesthesia (Alkire et

al., 2008).

Finally, consciousness not only requires a neural substrate with appropriate anatomical structure and appropriate

physiological parameters, it also needs time (Bachmann,

2000). The theory predicts that the time requirement for the

generation of conscious experience in the brain emerges

directly from the time requirements for the build-up of an

integrated repertoire among the elements of the corticotha-

lamic main complex so that discriminations can be highly

informative (Tononi, 2004; Balduzzi and Tononi, unpubl.).

To give an obvious example, if one were to perturb half of

the elements of the main complex for less than a millisec-

ond, no perturbations would produce any effect on the other

half within this time window, and Φ would be zero. After, say, 100 ms, however, there is enough time for differential effects to be manifested, and Φ should grow.

respective probability distributions), so that relative entropy is zero. The system generates no information above and beyond the parts, so it cannot be considered a single entity. Right-hand side: an integrated system. Elements in the system are on if they receive two or more spikes. The system is in state x1 = 1000. (A) The mechanism specifies a unique prior state that can cause state x1, so the system generates 4 bits of effective information. All other initial states are ruled out, since they cause different outputs. (B) Effective information generated by the two minimal parts, considered as systems in their own right. External inputs are treated as extrinsic noise. (C) Integrated information is information generated by the whole (black arrows) over and above the parts (gray arrows). In this case, the actual repertoire of the whole is different from the combined actual repertoires of the parts, and the relative entropy is 2 bits. The system generates information above and beyond the parts, so it can be considered a single entity (a complex).




The Quality of Consciousness: Characterizing Informational Relationships

If the amount of integrated information generated by

different brain structures (or by the same structure functioning in different ways) can in principle account for changes in the level of consciousness, what is responsible for the

quality of each particular experience? What determines that

colors look the way they do and are different from the way

music sounds? Once again, empirical evidence indicates

that different qualities of consciousness must be contributed

by different cortical areas. Thus, damage to certain parts of

the cerebral cortex forever eliminates our ability to experi-

ence color (whether perceived, imagined, remembered, or

dreamt), whereas damage to other parts selectively elimi-

nates our ability to experience visual shapes. There is ob-

viously something about different parts of the cortex that

can account for their different contribution to the quality of

experience. What is this something?

The IIT claims that, just as the quantity of consciousness

generated by a complex of elements is determined by the

amount of integrated information it generates above and

beyond its parts, the quality of consciousness is determined

by the set of all the informational relationships its mechanisms generate. That is, how integrated information is generated within a complex determines not only the amount of

consciousness it has, but also what kind of consciousness.

Consider again the photodiode thought experiment. As I

discussed before, when the photodiode reacts to light, it can

only tell that things are one way rather than another way. On

the other hand, when we see light, we discriminate against

many more states of affairs, and thus generate much more

information. In fact, I argued that light means what it

means and becomes conscious light by virtue of being not

just the opposite of dark, but also different from any color,

any shape, any combination of colors and shapes, any frame

of every possible movie, any sound, smell, thought, and so on.

What needs to be emphasized at this point is that dis-

criminating light against all these alternatives implies not

just picking one thing out of everything else (an undif-

ferentiated bunch), but distinguishing at once, in a specific

way, between each and every alternative. Consider a very

simple example: a binary counter capable of discriminating

among the four numbers: 00, 01, 10, 11. When the counter

says binary 3, it is not just discriminating 11 from every-

thing else as an undifferentiated bunch, otherwise it would

not be a counter, but a 11 detector. To be a counter, the

system must be able to tell 11 apart from 00 as well as from 10 as well as from 01 in different, specific ways. It does so,

of course, by making choices through its mechanisms; for

example: is this the first or the second digit? Is it a 0 or a 1?

Each mechanism adds its specific contribution to the dis-

crimination they perform together. Similarly, when we see

light, mechanisms in our brain are not just specifying light

with respect to a bunch of undifferentiated alternatives.

Rather, these mechanisms are specifying that light is what it

is by virtue of being different, in this and that specific way,

from every other alternative, from dark to any color, to any

shape, movie frame, sound or smell, and so on.
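The counter example can be made concrete with a small calculation. The following is a minimal sketch, not part of the original analysis: it assumes a uniform potential repertoire over the four readings and treats each digit as a separate hypothetical mechanism (the names `first_digit_is_1` and `second_digit_is_1` are illustrative only). Under these assumptions each mechanism alone contributes 1 bit, and only together do they single out 11 with 2 bits.

```python
import math
from itertools import product


def relative_entropy_bits(p, q):
    """Relative entropy H[p || q] in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def repertoire(states, constraints):
    """Uniform distribution over the states compatible with every constraint."""
    compatible = [all(c(s) for c in constraints) for s in states]
    k = sum(compatible)
    return [1.0 / k if ok else 0.0 for ok in compatible]


# The four readings of a two-digit binary counter: 00, 01, 10, 11.
states = list(product((0, 1), repeat=2))
uniform = [1.0 / len(states)] * len(states)


# Hypothetical mechanisms: one question per digit, answered for the reading 11.
def first_digit_is_1(s):
    return s[0] == 1


def second_digit_is_1(s):
    return s[1] == 1


bits_first = relative_entropy_bits(repertoire(states, [first_digit_is_1]), uniform)
bits_second = relative_entropy_bits(repertoire(states, [second_digit_is_1]), uniform)
bits_both = relative_entropy_bits(
    repertoire(states, [first_digit_is_1, second_digit_is_1]), uniform
)

print(bits_first, bits_second, bits_both)  # 1.0 1.0 2.0
```

The two 1-bit contributions rule out different alternatives (00 and 01 versus 00 and 10), which is the sense in which each mechanism adds its own specific contribution.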

In short, generating a large amount of integrated information entails having a highly structured set of mechanisms

that allow us to make many nested discriminations (choices)

as a single entity. According to the IIT, these mechanisms

working together generate integrated information by speci-

fying a set of informational relationships that completely

and univocally determine the quality of experience.

Experience as a shape in qualia space

To see how this intuition can be given a mathematical

formulation, let us consider again a complex of n binary

elements X(mech,x1) having a particular mechanism and

being in a particular state. The mechanism of the system is

implemented by a set of connections Xconn among its ele-

ments. Let us now suppose that each possible state of the

system constitutes an axis or dimension of a qualia space

(Q) having 2^n dimensions. Each axis is labeled with the

probability p for that state, going from 0 to 1, so that a

repertoire (i.e., a probability distribution on the possible

states of the complex) corresponds to a point in Q (Fig. 5).
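As a rough illustration of this construction, the sketch below enumerates the axes of Q for a small system and builds the maximum entropy point. The function names are hypothetical and chosen for this example; the only assumptions are that elements are binary and that a repertoire is a probability distribution over the 2^n states.

```python
from itertools import product


def qualia_space_axes(n):
    """Label one axis of Q per possible state of n binary elements."""
    return ["".join(map(str, bits)) for bits in product((0, 1), repeat=n)]


def max_entropy_point(n):
    """The point in Q where every state has probability 1 / 2**n."""
    return [1.0 / 2 ** n] * (2 ** n)


def is_repertoire(point, tol=1e-12):
    """A point in Q is a repertoire if its coordinates form a probability distribution."""
    return all(p >= 0 for p in point) and abs(sum(point) - 1.0) < tol


axes = qualia_space_axes(2)        # ['00', '01', '10', '11']
bottom = max_entropy_point(2)      # [0.25, 0.25, 0.25, 0.25]
print(axes, bottom, is_repertoire(bottom))
print(len(qualia_space_axes(4)))   # 16 axes for a four-element system
```

Because the number of axes doubles with every added element, even modest systems produce spaces that cannot be drawn directly.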

Let us now examine how the connections among the

elements of the complex specify probability distributions;

that is, how a set of mechanisms specifies a set of informa-

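[Figure 3 image: network diagram with the complexes x, s, a, and b shaded and labeled with their Φ values; the numerical labels are too garbled in the extracted text to assign reliably.]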

Figure 3. Complexes. In this system, the mechanism is that elements

fire in response to an odd number of spikes on their afferent connections

(links without arrows are bidirectional connections). Analyzing the system in terms of integrated information shows that the system constitutes a

complex (x, light gray) that contains three smaller complexes (s,a,b, in

different shades of gray). Observe that (i) complexes can overlap; (ii) a

complex can interact causally with elements not part of it; (iii) groups of

elements with identical architectures (a and b) generate different amounts

of integrated information, depending on their ports-in and ports-out.


[Figure 4 image. Panel group "Integrated information & neuroanatomy": (A) corticothalamic system, (B) cerebellar system, (C) afferent pathways, (D) cortical-subcortical loops. Panel group "Integrated information & neurophysiology": (E) comatose, balanced, and epileptic systems, with Φ plotted against the number of elements firing; (F) sleeping system, with Φ and % activity plotted against time (ticks). The Φ values printed on the panels cannot be reliably assigned from the extracted text.]

Figure 4. Relating integrated information to neuroanatomy and neurophysiology. Elements fire in response to two or more spikes (except elements targeted by a single connection, which copy their input); links without arrows are bidirectional. (A) Computing Φ in simple models of neuroanatomy suggests that a functionally integrated and functionally specialized network, like the corticothalamic system, is well suited to generating high values of Φ. (B, C, D) Architectures modeled on the cerebellum, afferent pathways, and cortical-subcortical loops give rise to complexes containing more elements, but with reduced Φ compared to the main corticothalamic complex. (E) Φ peaks in balanced states; if too many or too few elements are active, Φ collapses. (F) In a bistable (sleeping) system (same as in (E)), Φ collapses when the number of firing elements (dotted line) is too high (high % activity), remains low during the DOWN state (zero % activity), and only recovers at the onset of the next UP state.




formational relationship can be represented as an arrow in Q

(q-arrow) that goes from the point corresponding to the

maximum entropy distribution (p = 1/2^n) to the point corresponding to the actual repertoire specified by that connection. The length (divergence) of the q-arrow expresses how

much the connection specifies the distribution (the effective

information it generates, i.e., the relative entropy between

the two distributions); the direction in Q expresses the

particular way in which the connection specifies the distri-

bution, i.e., a change in position in Q. Similarly, if one

considers all other connections taken in isolation, each will

specify another q-arrow of a certain length, pointing in a

different direction.
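For reference, the length of a q-arrow that starts at the maximum entropy distribution can be written out explicitly. The notation below (p for the actual repertoire, u for the uniform distribution, H(p) for the Shannon entropy of p) is assumed here for illustration and is not taken from the original text; the identity itself is the standard definition of relative entropy applied to a uniform reference.

```latex
H[p \,\|\, u] \;=\; \sum_{i=1}^{2^n} p_i \log_2 \frac{p_i}{u_i},
\qquad u_i = \frac{1}{2^n}
\quad\Longrightarrow\quad
H[p \,\|\, u] \;=\; n - H(p),
\qquad H(p) = -\sum_{i=1}^{2^n} p_i \log_2 p_i .
```

For instance, with n = 2 and p = (0, 0, 1/2, 1/2), H(p) = 1 bit, so the q-arrow is 2 - 1 = 1 bit long.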

Next, consider all possible combinations of connections

(Fig. 5B). For instance, consider adding the contribution of

the second connection to that of the first. Together, the first

and second connections specify another actual repertoire (another point in Q-space) and thereby generate more information than either connection alone as they shape the uniform distribution into a more specific distribution. To the

tip of the q-arrow specified by the first connection, one can

now add a q-arrow bent in the direction contributed by the

second connection, forming an edge of two q-arrows in

Q-space (the same final point is reached by adding the

q-arrow due to the first connection on top of the q-arrow

specified by the second one). Each combination of connections therefore specifies a q-edge made of concatenated q-arrows (component q-arrows). In general, the more connections one considers together, the more the actual repertoire

will take shape and differ from the uniform (potential)

distribution.

Finally, consider the joint contribution of all connections

of the complex (Fig. 5B). As was discussed above, all

connections together specify the actual repertoire of the

whole. This is the point where all q-edges converge. To-

gether, these q-edges in Q delimit a quale, that is, a shape

in Q, a kind of 2^n-dimensional solid (technically, in more

than three dimensions, the body of a polytope). The

bottom of the quale is the maximum entropy distribution, its

edges are q-edges made of concatenated q-arrows, and its

top is the actual repertoire of the complex as a whole. The

shape of this solid (polytope) is specified by all informa-

tional relationships that are generated within the complex by

the interactions among its elements (the effective information matrix; Tononi, 2004).⁷ Note that the same complex of

elements, endowed with the same mechanism, will typically

generate a different quale or shape in Q depending on the

particular state it is in.

It is worth considering briefly a few relevant properties of

informational relationships or q-arrows. First, informational

relationships are context-dependent (Fig. 6), in the following sense. A context can be any point in Q corresponding to

the actual repertoire generated by a particular subset of

connections. It can be shown that the q-arrow generated by

considering the effects of an additional connection (how it

further sharpens the actual repertoire) can change in both

magnitude and direction depending on the context in which

it is considered. In Figure 6, when considered in isolation

(null context), the connection r between elements 4 and 3

generates a short q-arrow (0.18 bits) pointing in a certain

direction. When considered in the full context provided by

all other connections (not-r, or ¬r), the same connection r

generates a longer q-arrow (1 bit) pointing in a different

direction.

Another property is how removing or adding a set of

connections folds or unfolds a quale. The portion of the

quale that is generated by a set of connections r (acting in all

contexts) is called a q-fold. If we remove connection r from

the system, all the q-arrows generated by that connection, in

all possible contexts, vanish, so the shape of the quale

folds along the q-fold specified by that connection. Con-

versely, when the connection is added to a system, the shape

of the quale unfolds.

Another important property of q-arrows is entanglement (γ, Balduzzi and Tononi, unpubl.). A q-arrow is entangled (γ > 0) if the underlying connections considered together generate information above and beyond the information they generate separately (note the analogy with Φ). Thus,

entanglement characterizes informational relationships (q-

arrows) that are more than the sum of their component

relationships (component q-arrows, Fig. 6B), just like Φ characterizes systems that are more than the sum of their

parts. Geometrically, entanglement warps the shape of the

quale away from a simple hypercube (where q-arrows are

orthogonal to each other). Entanglement has several relevant consequences (Balduzzi and Tononi, unpubl.). For

example, an entangled q-arrow can be said to specify a

concept, in that it groups together certain states of affairs in

a way that cannot be decomposed into the mere sum of

simpler groupings (see also Feldman, 2003). Moreover, just

as Φ can be used to identify complexes, entanglement can

be used to identify modes. By analogy with complexes,

modes are sets of q-arrows that are more densely entangled

than surrounding q-arrows: they can be considered as clus-

ters of informational relationships constituting distinctive

sub-shapes in Q (see Fig. 8). By analogy with a main

complex, an elementary mode is such that its component

q-arrows have strictly lower γ. As will be briefly discussed

below, modes play an important role in understanding the

structure of experience.

Some properties of qualia space

What is the relevance of these constructs to understand-

ing the quality of consciousness? It is not easy to become

familiar with a complicated multidimensional space nearly

impossible to draw, so it may be useful to resort to some

metaphors. I have argued that the set of informational relationships in Q generated by the mechanisms of a complex in

a given state (q-arrows between repertoires) specify a shape

in Q (a quale). Perhaps the most important notion emerging

from this approach is that an experience is a shape in Q.

According to the IIT, this shape completely and univocally⁸ specifies the quality of experience.

It follows that different experiences are, literally, differ-

ent shapes in Q. For example, when the same system is in a

different state (firing pattern), it will typically generate a

different shape or quale (even for the same value of Φ).

Importantly, if an element turns on, it generates information

and meaning not by signifying something (say red),

which in isolation it cannot, but by changing the shape of

the quale. Moreover, experiences are similar if their shape is

similar, and different to the extent that their shapes are

different. This means that phenomenological similarities

and differences can in principle be quantified as similarities

and differences between shapes. The set of all shapes gen-

erated by the same system in different states provides a

geometrical depiction of all its possible experiences.⁹

Note that a quale can only be specified by a mechanism

and a particular state; it does not make sense to ask about

the quale generated by a mechanism in isolation, or by a

state (firing pattern) in isolation. A consequence is that two

different systems in the same state can generate two differ-

ent experiences (i.e., two different shapes). As an extreme

example, a system that was to copy one by one the state of

the neurons in a human brain, but had no internal connec-

tions of its own, would generate no consciousness and no

quale (Tononi, 2004; Balduzzi and Tononi, 2008).

By the same token, it is possible that two different sys-

tems generate the same experience (i.e., the same shape).

[Figure 6 image. (A) The connection r among elements 1 to 4, shown in the null context (q-arrow of 0.18 bits) and in the full context (q-arrow of 1 bit). (B) Entanglement = 0.42 bits.]

Figure 6. Context and entanglement. (A) Context. The same connection (black arrow between elements 3 and 4) considered in two contexts. At the bottom of the quale (null context, corresponding to the maximum entropy distribution when no other connections are engaged), the connection r generates a q-arrow (called down-set of r, or ↓r) corresponding to 0.18 bits of information pointing up-left in Q. Near the top of the quale (full context, corresponding to the actual distribution specified by all other connections except for r, indicated as ¬r), r generates a q-arrow (called up-set of not-r, or ↑¬r) corresponding to 1 bit of information pointing up-right in Q. (B) Entanglement. Left: the q-arrow generated by the connection r and the q-arrow generated by the complementary connections ¬r at the bottom of the quale (null context). Right: The product of the two q-arrows (corresponding to independence between the informational relationships specified by the two sets of connections) would be a point corresponding to the vertex of the dotted parallelogram opposite to the bottom. However, r and ¬r jointly specify the actual distribution corresponding to the top of the quale (black triangle). The distance between the probability distribution in Q specified jointly by two sets of connections and their product distribution (zigzag arrow) is the entanglement between the two corresponding q-arrows (how much the composite q-arrow specifies above and beyond its component q-arrows).


For example, consider again the photodiode, whose mech-

anism determines that if the current in the sensor exceeds a

threshold, the detector turns on. This simple causal interac-

tion is all there is, and when the photodiode turns on it

merely specifies an actual repertoire where states

(00,01,10,11) have, respectively, probability (0,0,1/2,1/2).

This corresponds in Q to a single q-arrow, one bit long,

going from the potential, maximum entropy repertoire (1/4,1/4,1/4,1/4) to (0,0,1/2,1/2). Now imagine the light sensor is substituted by a temperature sensor with the same threshold and dynamic range: we have a thermistor rather than a

photodiode. Although the physical device has changed,

according to the IIT the experience, minimal as it is, has to

be the same, since the informational relationship that is

generated by the two devices is identical. Similarly, an

AND gate when silent and an OR gate when firing also

generate the same shape in Q, and therefore must generate

the same minimal experience (it can be shown that the two

shapes are isomorphic, that is, have the same symmetries; Balduzzi and Tononi, unpubl.). In other words, different

physical systems (possibly in different states) generate the

same experience if the shape of the informational relation-

ships they specify is the same. On the other hand, more

complex networks of causal interactions are likely to create

highly idiosyncratic shapes, so systems of high Φ are unlikely to generate exactly identical experiences.
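The repertoires quoted in this paragraph can be checked numerically. The sketch below is a simplification made for illustration: it assumes a uniform potential repertoire, models the photodiode as a binary sensor copied by a detector, and looks only at the repertoire over each gate's two inputs. The helper names are hypothetical.

```python
import math
from itertools import product


def relative_entropy_bits(p, q):
    """Relative entropy H[p || q] in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def actual_repertoire(prior_states, mechanism, observed_output):
    """Uniform over the prior states that the mechanism maps to the observed output."""
    compatible = [mechanism(s) == observed_output for s in prior_states]
    k = sum(compatible)
    return [1.0 / k if ok else 0.0 for ok in compatible]


prior_states = list(product((0, 1), repeat=2))  # 00, 01, 10, 11
potential = [0.25] * 4

# Photodiode: state = (sensor, detector); the detector copies the sensor and is on.
photodiode_on = actual_repertoire(prior_states, lambda s: s[0], 1)
# Gates: state = (input_a, input_b).
and_silent = actual_repertoire(prior_states, lambda s: s[0] & s[1], 0)
or_firing = actual_repertoire(prior_states, lambda s: s[0] | s[1], 1)

photodiode_bits = relative_entropy_bits(photodiode_on, potential)
and_bits = relative_entropy_bits(and_silent, potential)
or_bits = relative_entropy_bits(or_firing, potential)

print(photodiode_on, photodiode_bits)  # [0.0, 0.0, 0.5, 0.5] 1.0
print(and_bits, or_bits)               # equal lengths, about 0.415 bits each
```

Equal q-arrow lengths for the silent AND gate and the firing OR gate are a necessary condition for the isomorphism cited above, not a demonstration of it; a thermistor with the same threshold would reproduce the photodiode numbers exactly, since only the informational relationship enters the calculation.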

If experience is integrated information, it follows that

only the informational relationships within a complex (those

that give the quale its shape) contribute to experience.

Conversely, the informational relationships that exist outside the main complex (for example, those involving sensory afferents or cortico-subcortical loops implementing informationally insulated subroutines) do not make it into

the quale, and therefore do not contribute either to the

quantity or to the quality of consciousness.

Note also that informational relationships, and thus the

shape of the quale, are specified both by the elements that

are firing and by those that are not. This is natural consid-

ering that an element that does not fire will typically rule out

some previous states of affairs (those that would have made

it fire), and thereby it will contribute to specifying the actual

repertoire. Indeed, many silent elements can rule out, in

combination, a vast number of previous states and thus be

highly informative. From a neurophysiological point of

view, such a corollary may lead to counterintuitive predic-

tions. For example, take elements (neurons) within the main

complex that happen to be silent when one is having a

particular experience. If one were to temporarily disable

these neurons (e.g., make them incapable of firing), the

prediction is that, though the system state (firing pattern)

would remain the same, the quantity and quality of experience

would change (Tononi, 2004; Balduzzi and Tononi, 2008).

It is important to see what Φ corresponds to in this

representation (Fig. 7A). The minimum information parti-

tion (MIP) is just another point in Q: the one specified by

the connections within the minimal parts only, leaving out

the contribution of the connections among the parts. This

point is the actual repertoire corresponding to the product of

the actual repertoires of the parts taken independently.

Φ corresponds then to an arrow linking this point to the top of

the solid. In this view, the q-edges leading to the minimum

information bipartition provide the natural base upon

which the solid rests: the informational relationships generated within the parts, upon which are built the informational relationships among the parts. The Φ-arrow can then be thought of as the height of the solid, or rather, to employ a metaphor, as the highest pole holding up a tent.

For example, if Φ is zero (say a system decomposes into two independent complexes as in Fig. 7B), the tent corresponding to the system is flat (it has no shape), since the actual repertoire of the system collapses onto its base (MIP). This is precisely what it means when Φ = 0. Conversely, the higher the Φ value of a complex (the higher the tent or solid), the more breathing room there is for the various

informational relationships within the complex (the edges of

the solid or the seams of the tent) to express themselves.
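The flat-tent case can be illustrated with the two-photodiode system of Figure 2. This is a minimal sketch under stated assumptions: repertoires are written down by hand rather than derived from a simulated mechanism, the partition into the two photodiodes is taken as the MIP, and the helper names are hypothetical. The height of the tent is computed as the relative entropy between the repertoire of the whole and the product of the repertoires of the parts.

```python
import math
from itertools import product


def relative_entropy_bits(p, q):
    """H[p || q] in bits for dicts over the same states (requires q[s] > 0 where p[s] > 0)."""
    return sum(p[s] * math.log2(p[s] / q[s]) for s in p if p[s] > 0)


def uniform_over(states):
    """Uniform distribution over the given states."""
    states = list(states)
    return {s: 1.0 / len(states) for s in states}


def with_zeros(dist, n):
    """Extend a repertoire to all 2**n states, filling in zero probabilities."""
    return {s: dist.get(s, 0.0) for s in product((0, 1), repeat=n)}


def product_of_parts(part_a, part_b):
    """Joint repertoire obtained by treating the two parts as independent."""
    return {sa + sb: pa * pb for sa, pa in part_a.items() for sb, pb in part_b.items()}


# Each part is one photodiode, state = (sensor, detector). Its detector is on,
# so the part specifies that its own sensor was on.
part = with_zeros(uniform_over([(1, 0), (1, 1)]), 2)
base = product_of_parts(part, part)  # the point in Q labeled MIP

# Whole system, state = (sensor1, detector1, sensor2, detector2): both sensors were on.
whole = with_zeros(
    uniform_over(s for s in product((0, 1), repeat=4) if s[0] == 1 and s[2] == 1), 4
)

ei_whole = relative_entropy_bits(whole, uniform_over(product((0, 1), repeat=4)))
ei_part = relative_entropy_bits(part, uniform_over(product((0, 1), repeat=2)))
height = relative_entropy_bits(whole, base)

print(ei_whole, ei_part, height)  # 2.0 1.0 0.0
```

The whole generates 2 bits and each part 1 bit, but the top of the tent coincides with its base, so nothing is generated above and beyond the parts. A whole repertoire sharper than the product of its parts would instead give a positive height.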

In summary, and not very rigorously, the generation of an

experience can be thought of as the erection of a tent with

a very complex structure: the edges are the tension lines

generated by each subset of connections (the respective

q-arrow or informational relationship). The tent literally

takes shape when the connections are engaged and specify

actual repertoires. Perhaps an even more daring metaphor

would be the following: whenever the mechanisms of a

complex unfold and specify informational relationships, the flower of experience blooms.

From phenomenology to geometry

The notions just sketched aim at providing a framework

for translating the seemingly ineffable qualitative properties

of phenomenology into the language of mathematics, spe-

cifically, the language of informational relationships (q-

arrows) in Q. Ideally, when sufficiently developed, such

language should permit the geometric characterization of

phenomenological properties generated by the human brain.

In principle, it should also allow us to characterize the

phenomenology of other systems. After all, in this frame-

work the experience of a bat echo-locating in a cave is just

another shape in Q and, at least in principle, shapes can be

compared objectively.

At present, due to the combinatorial problems posed by

deriving the shape of the quale produced by systems of just

a few elements, and to the additional difficulties posed by

representing such high-dimensional objects, the best one

can hope for is to show that the language of Q can capture,

in principle, some of the basic distinctions that can be made

in our own phenomenology, as well as some key neuropsychological observations (Balduzzi and Tononi, unpubl.). A

short list includes the following:

(i) Experience is divided into modalities, like the classic

senses of sight, hearing, touch, smell, and taste (and several

others), as well as submodalities, like visual color and visual

shape. What do these broad distinctions correspond to in Q?

According to the IIT, modalities are sets of densely entan-

gled q-arrows (modes) that form distinct sub-shapes in the

quale; submodalities are subsets of even more densely en-

tangled q-arrows (sub-modes) within a larger mode, thus

forming distinct sub-sub-shapes (Fig. 8). As a two-dimen-

sional analog, imagine a given multimodal experience as the

shape of the three-continent complex constituted by Europe,

Asia, and Africa. The three continents are distinct sub-

shapes, yet they are all part of the same landmass, just as

modalities are parts of the same consciousness. Moreover,

within each continent there are peninsulas (sub-sub-shapes),

like Italy in Europe, just as there are submodalities within

modalities.

(ii) Some experiences appear to be elementary, in that

they cannot be further decomposed. A typical example is

what philosophers call a quale in the narrow sense, say a

pure color like red, or a pain, or an itch: it is difficult, if not

impossible, to identify any further phenomenological struc-

ture within the experience of red. According to the IIT, such

elementary experiences correspond to sub-modes that do

not contain any more densely entangled sub-sub-modes

(elementary modes, Fig. 8).

Figure 7. The tent analogy. (A) The system of Fig. 2A / Fig. 5. (B) The q-edges converging on the

minimum information partition of the system (MIP) form the natural base on which the complex rests, depicted as a tent. The informational relationships among the parts are built on top of the informational relationships

generated independently within the minimal parts. From this perspective the q-arrow (in black) is simply the

tent pole holding the quale up above its base; the length (divergence) of the pole expresses the breathing room

in the system. The thick gray q-arrow represents the information generated by the entire system. (C) The system

of Fig. 2A. The quale (not) generated by the two photodiodes considered as a single system. As shown in Fig.

2A, the system reduces to two independent parts, so it does not exist as a single entity. (D) Note that in this case

the quale reduces to the MIP: the tent collapses onto its base, so there is no breathing room for informational

relationships within the system. The quale generated by each part considered in isolation does exist, corre-

sponding to an identical q-arrow for each couple.

(iii) Some experiences are homogeneous and others are

composite: for example, a full-field experience of blue, as

when watching a cloudless sky, compared to that of a busy market street. In Q, homogeneous experiences translate to a

single homogeneous shape, and composite ones into a com-

posite shape with many distinguishable sub-shapes (modes

and sub-modes).

(iv) Some experiences are hierarchically organized. Take

seeing a face: we see at once that as a whole it is

somebody's face, but we also see that it has parts such as hair,

eyes, nose, and mouth, and that those are made in turn of

specifically oriented segments. The subjective experience is

constructed from informational relationships (q-arrows) that

are entangled (not reducible to a product of independent

components) across hierarchical levels. For example,

informational relationships constituting "face" would be more

densely tangled than unnatural combinations such as seen in

certain Cubist paintings. The sub-shape of the quale corre-

sponding to the experience of seeing a face is then an

overlapping hierarchy of tangled q-arrows, embodying re-

lationships within and across levels.

(v) We recognize intuitively that the way we perceive

taste, smell, and maybe color, is organized phenomenolog-

ically in a categorical manner, quite different from, say,

the topographical manner in which we perceive space in

vision, audition, or touch. According to the IIT, these hard-

to-articulate phenomenological differences correspond to

different basic sub-shapes in Q, such as 2n-dimensional

grid-like structures and pyramid-like structures, which

emerge naturally from the underlying neuroanatomy.

(vi) Some experiences are more alike than others. Blue is

certainly different from red (and irreducible to red), but

clearly it seems even more different from middle C on the

oboe. In the IIT framework, in Q colors correspond to

different sub-shapes of the same kind (say pyramids point-

ing in different directions) and sounds to very different

sub-shapes (say tetrahedra). In principle, such subjective

similarities and differences can be investigated by

employing objective measures of similarity between shapes (e.g.,

considering the number and kinds of symmetries involved

in specifying shapes that are generated in Q by different

neuroanatomical circuits).

(vii) Experiences can be refined through learning and

changes in connectivity. Suppose one learns to distinguish

wine from water, then red wines from whites, then different varietals. Presumably, underlying this phenomenological

refinement is a neurobiological refinement: neurons that

initially were connected indiscriminately to the same affer-

ents become more specialized and split into sub-groups with

partially segregated afferents. This process has a straight-

forward equivalent in Q: the single q-arrow generated ini-

tially by those afferents splits into two or more q-arrows

pointing in different directions, and the overall sub-shape of

the quale is correspondingly refined.

(viii) Qualia in the narrow sense (elementary modes)

exist at the top of experience and not at its bottom.

Consider the experience of seeing a pure color, such as red. The evidence suggests that the neural correlate (Crick and

Koch, 2003) of color, including red, is probably a set of

neurons and connections in the fusiform gyrus, maybe in

area V8 (ideally, neurons in this area are activated whenever

a subject sees red and not otherwise, if stimulated trigger the

experience of red, and if lesioned abolish the capacity to see

red). Certain achromatopsic subjects with dysfunctions in

this general area seem to lack the feeling of what it is like

to see color, its coloredness, including the redness of

red. They cannot experience, imagine, remember, or even

dream of color, though they may talk about it, just as we

could talk about echolocation, from a third-person perspec-

tive (van Zandvoort et al., 2007). Contrast such subjects,

who are otherwise perfectly conscious, with vegetative pa-

tients, who are for all intents and purposes unconscious.

Some of these patients may show behavioral and neuro-

physiological evidence for residual function in an isolated

brain area (Posner and Plum, 2007). Yet it seems highly

unlikely that a vegetative patient with residual activity ex-

clusively in V8 should enjoy the vivid perceptions of color

just as we do, while being otherwise unconscious.

Figure 8. Modes. Schematic depiction of modes and sub-modes (labels in the figure: Quale, Sight, Sound, Color, Form, Red). A mode, indicated by a polygon within the quale (light gray with black border), is a set of q-arrows that are more densely entangled than surrounding q-arrows, and can be considered as a cluster of informational relationships constituting a distinctive sub-shape in Q. Two different modes could correspond, for example, to the modalities of sight and sound. A sub-mode within a mode is a set of q-arrows that is even more densely entangled (a sub-sub-shape in Q). Color and form could correspond to two sub-modes within the visual mode. The thin black polygon represents an elementary mode, which does not contain more densely entangled q-arrows. Elementary modes could correspond to experiential qualities that cannot be further decomposed, such as the color red (qualia in the narrow sense).

The IIT provides a straightforward account for this difference. To see how, consider again Figure 6A: call r the connections targeting the "red" neurons in V8 that confer them their selectivity, and non-r (¬r) all the other connections within the main corticothalamic complex. Adding r in

isolation at the bottom of Q (null context) yields a small

q-arrow (called the down-set of red, or ↓r) that points in a direction representing how r by itself shapes the maximum

entropy distribution into an actual repertoire. Schematically,

this situation resembles that of a vegetative patient with V8

and its afferents intact but the rest of the corticothalamic

system destroyed. The shape of the experience or quale

reduces to this q-arrow, so its quantity is minimal (Φ for this

q-arrow is obviously low) and its quality minimally specified:

as we have seen with the photodiode, r by itself cannot

specify whether the experience is a color rather than some-

thing else such as a shape, whether it is visual or not,

sensory or not, and so on.

By contrast, subtract r from the set of all connections, so

one is left with ¬r. This lesion collapses the q-fold specified

by r in all contexts, including the q-arrow, called the up-set of non-red (↑¬r), which starts from the full context provided by all other connections ¬r and reaches the top of

the quale.10 This q-arrow will typically be much longer and

point in a different direction than the q-arrow generated by

r at the bottom of the quale. This is because, the fuller the

context, the more r can shape the actual repertoire. Sche-

matically, removing r from the top resembles the situation

of an achromatopsic patient with a selective lesion of V8:

the bulk of the experience or quale remains intact (Φ remains

high), but a noticeable feature of its shape collapses

(the up-set of non-red). According to the IIT, the feature of

the shape of the quale specified by the up-set of non-red captures the very quality or "redness" of red.11

It is worth remarking that the last example also shows

why specific qualities of consciousness, such as the red-

ness of red, while generated by a local mechanism, cannot

be reduced to it. If an achromatopsic subject without the r

connections lacks precisely the redness of red, whereas a

vegetative patient with just the r connections is essentially

unconscious, then the redness of red cannot map directly to

the mechanism implemented by the r connections. How-

ever, the redness of red can map nicely onto the informa-

tional relationships specified by r, as these change dramat-

ically between the null context (vegetative patient) and the

full context (achromatopsic subject).
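The dependence of a q-arrow on its context can be sketched with a toy example. The code below is not a model of V8: it assumes a single hypothetical AND unit with two afferent connections, labeled r and not_r (standing in for all other connections), and it assumes that a disengaged connection delivers a fair coin flip instead of its source. The length of a q-arrow is taken here to be the relative entropy between the repertoire at its tip and the repertoire at its base. All names and the observed output value are illustrative choices.

```python
from itertools import product
from math import log2

PRIOR_STATES = list(product((0, 1), repeat=2))  # prior states (a, b) of the two sources


def repertoire(engaged, y):
    """Distribution over prior states (a, b) given output y of the AND unit.

    Only the connections named in `engaged` carry their source;
    the others are replaced by independent fair coin flips (an assumption).
    """
    weights = {}
    for a, b in PRIOR_STATES:
        likelihood = 0.0
        for noise_a, noise_b in product((0, 1), repeat=2):  # each pair has probability 1/4
            in_a = a if "r" in engaged else noise_a
            in_b = b if "not_r" in engaged else noise_b
            likelihood += 0.25 * ((in_a & in_b) == y)
        weights[(a, b)] = likelihood
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def relative_entropy(p, q):
    """Relative entropy H[p || q] in bits."""
    return sum(pv * log2(pv / q[s]) for s, pv in p.items() if pv > 0)


y = 0  # observed output of the unit, chosen for illustration

# r added in the null context: from maximum entropy to the repertoire of r alone.
down_set = relative_entropy(repertoire({"r"}, y), repertoire(set(), y))

# r added in the full context: from the repertoire of not_r to that of all connections.
up_set = relative_entropy(repertoire({"r", "not_r"}, y), repertoire({"not_r"}, y))

print(round(down_set, 3), round(up_set, 3))
```

Under these assumptions the arrow generated by r in the null context is about 0.08 bits long, whereas the arrow generated by the same connection in the full context is about 0.33 bits long: the same mechanism shapes the repertoire more when the context is fuller.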

A Provisional Manifesto

To recapitulate, the IIT claims that the quantity of

consciousness is given by the integrated information (Φ)

generated by a complex of interacting elements, and its quality

by the shape in Q specified by their informational relation-

ships. As I have tried to indicate here, this theoretical

framework can account for basic neurobiological and neu-

ropsychological observations. Moreover, the same frame-

work can be extended to begin translating phenomenology

into the language of mathematics.

At present, the very notion of a theoretical approach to

consciousness may appear far-fetched, yet the nature of the

problems posed by a science of consciousness requires a

combination of experiment and theory: one could say that

theories without experiments are lame, but experiments

without theories are blind. For instance, only a theoretical

framework can go beyond a provisional list of candidate

mechanisms or brain areas and provide a principled expla-

nation of why they may be relevant. Also, only a theory can

account, in a coherent manner, for key but puzzling facts

about consciousness and the brain, such as the association of

consciousness with the corticothalamic but not the cerebel-

lar system, the unconscious functioning of many cortico-

subcortical circuits, or the fading of consciousness during

certain stages of sleep or epilepsy.

A theory should also generate relevant corollaries. For

example, the IIT predicts that consciousness depends exclusively on the ability of a system to generate integrated

information, whether or not the system is interacting with

the environment on the sensory and motor side, and whether or not it deploys

language, capacity for reflection, attention, episodic memory,

a sense of space, of the body, and of the self. These are

obviously important functions of complex brains and help

shape their connectivity. Nevertheless, contrary to some

common intuitions, but consistent with the overall neurological

evidence, none of these functions seems absolutely neces-

sary for the generation of consciousness here and now

(Tononi and Laureys, 2008).

Finally, a theory should be able to help in difficult cases that challenge our intuition or our standard ways to assess

consciousness. For instance, the IIT says that the presence

and extent of consciousness can be determined, in principle,

also in cases in which we have no verbal report, such as

infants or animals, or in neurological conditions such as

minimally conscious states, akinetic mutism, psychomotor

seizures, and sleepwalking. In practice, of course,

measuring Φ accurately in such systems will not be easy, but

approximations and informed estimates are certainly

conceivable. Whether or not these and other predictions turn out to be

compatible with future clinical and experimental evidence,

a coherent theoretical framework should at least help to

systematize a number of neuropsychological and

neurobiological results that might otherwise seem disparate (Albus et al., 2007).

In the remaining part of this article, I briefly consider

some implications of the IIT for the place of experience in

our view of the world.

Consciousness as a fundamental property

According to the IIT, consciousness is one and the same

thing as integrated information. This identity, which is

predicated on the phenomenological thought experiments at

the origin of the IIT, has ontological consequences. Con-

sciousness exists beyond any doubt (indeed, it is the only

thing whose existence is beyond doubt). If consciousness is

integrated information, then integrated information exists.

Moreover, according to the IIT, it exists as a fundamental

quantity, as fundamental as mass, charge, or energy. As

long as there is a functional mechanism in a certain state, it

must exist ipso facto as integrated information; specifically,

it exists as an experience of a certain quality (the shape of

the quale it generates) and quantity (its "height" Φ).12

If one accepts these premises, a useful way of thinking

about consciousness as a fundamental property is as fol-

lows. We are by now used to considering the universe as a

vast empty space that contains enormous conglomerations

of mass, charge, and energy: giant bright entities (where

brightness reflects energy or mass) from planets to stars to

galaxies. In this view (that is, in terms of mass, charge, or

energy), each of us constitutes an extremely small, dim portion of what exists; indeed, hardly more than a speck of

dust.

However, if consciousness (i.e., integrated information)

exists as a fundamental property, an equally valid view of

the universe is this: a vast empty space that contains mostly

nothing, and occasionally just specks of integrated

information (Φ), mere dust indeed, even there where the

mass-charge-energy perspective reveals huge conglomerates. On

the other hand, one small corner of the known universe

contains a remarkable concentration of extremely bright

entities (where brightness reflects high Φ), orders of

magnitude brighter than anything around them. Each bright "Φ-star" is the main complex of an individual human being

(and most likely, of individual animals).13 I argue that such

Φ-centric view i
