7/27/2019 BrainImagingandBehavior2011.pdf
1/17
ORIGINAL RESEARCH
The perception of harmonic triads: an fMRI study
Takashi X. Fujisawa & Norman D. Cook
© Springer Science+Business Media, LLC 2011
Abstract We have undertaken an fMRI study of harmony
perception in order to determine the relationship between
the diatonic triads of Western harmony and brain activation.
Subjects were 12 right-handed, male non-musicians. All
stimuli consisted of two harmonic triads that did not contain
dissonant intervals of 1 or 2 semitones, but differed
between them by 0, 1, 2 or 3 semitones and therefore
differed in terms of their inherent stability (major and minor
chords) or instability (diminished and augmented chords).
These musical stimuli were chosen on the basis of a
psychoacoustical model of triadic harmony that has
previously been shown to explain the fundamental regularities
of traditional harmony theory. The brain response to the
chords could be distinguished within the right orbitofrontal
cortex and cuneus/posterior cingulate gyrus. Moreover, the
strongest hemodynamic responses were found for conditions
of rising pitch leading from harmonic tension to modal
resolution.
Keywords fMRI . Harmony . Major . Minor . Orbitofrontal
cortex . Psychoacoustics . Sound symbolism . Frequency
code . Harmony map
Introduction
The relative consonance (dissonance) of two-tone musical
intervals has been studied psychophysically since Helmholtz
(1877) and quantitative models have successfully explained
the experimental pattern of interval perception, as reported
by children and adults, musicians and non-musicians, and
peoples from the East and the West (e.g., Plomp and Levelt
1965; Kameoka and Kuriyagawa 1969). The key insight that
has allowed for successful modeling of interval perception is
consideration of the role of the higher harmonics (~upper
partials). Unfortunately, application of the same interval
perception model to three-tone triads has not been
successful in explaining the phenomena of musical harmony.
If only interval dissonance effects (including those entailed
by higher harmonics) are considered, the distinction between
resolved and unresolved triads cannot be explained, and the
different affective valence of major and minor chords
remains a mystery.
Such difficulties have led us to develop a model of
harmony perception that includes a three-tone tension
factor. Harmonic tension, as defined by Leonard Meyer
(1956), is a consequence of three-tone pitch patterns where
the middle-tone lies exactly midway between the upper and
lower tones. We have converted that musical insight into a
psychophysical model by proposing a theoretical tension
curve that can be used to calculate the tension effects of all
combinations of upper partials. Details of the model can be
found in the literature (Cook 2001, 2002, 2007, 2009, 2011;
Cook and Fujisawa 2006; Cook et al. 2006, 2007; Cook
and Hayashi 2008; Fujisawa 2004). Suffice it to say that,
when both 2-tone dissonance and 3-tone tension are
included in theoretical calculations, the well-known perceptual
regularities of the harmonic triads and the incidence
of their historical usage in classical music (Eberlein 1994)
No animals were used in this research, which was partially supported
by university research grants and does not involve any financial
relationship between the authors and those institutions.
T. X. Fujisawa
Graduate School of Biomedical Sciences, Nagasaki University,
Nagasaki, Japan
N. D. Cook (*)
Department of Informatics, Kansai University,
Takatsuki, Osaka, Japan 569-1095
e-mail: [email protected]
Brain Imaging and Behavior
DOI 10.1007/s11682-011-9116-5
can be explained psychophysically without borrowing
qualitative notions from traditional harmony theory and
without resorting to cultural explanations of harmony
perception. On the basis of that model, we have undertaken
an fMRI study of harmony perception in order to determine
the relationship between the common harmonic triads
(as defined psychophysically) and brain activation.
The psychophysics of interval perception and harmony
perception
Psychophysical models from the 1960s coherently explain
the regularities of interval perception (Sethares 1999) by
postulating: (i) the presence of a critical band of roughness
(dissonance) in the vicinity of 1 to 2 semitones (Fig. 1a), and
(ii) the cumulative effects of the dissonance among all
combinations of fundamental frequencies and upper partials
(Fig. 1b). The resulting dissonance curve for all intervals
within one octave shows notable decreases in the total
dissonance at intervals corresponding to most of the tones
of the diatonic scales (explained in terms of the physiology
of the cochlear membrane rather than on the basis of
Renaissance ideas concerning integer ratios). Experimental
data on interval perception match this theoretical curve
reasonably well (e.g., Kameoka and Kuriyagawa 1969)
(Fig. 1b) and suggest why diatonic scales and their subsets
(principally, pentatonic scales) are used worldwide in so
many different musical traditions.
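The two postulates above can be illustrated with a short numerical sketch. It assumes a Sethares-style exponential fit to the Plomp and Levelt roughness curve (the constants 3.5, 5.75, 0.24, 0.021 and 19 belong to that fit, not to this paper), six equal-amplitude harmonics per tone, and a lower tone at middle C. The function names are hypothetical.

```python
import math

def pair_dissonance(f1, f2, a1=1.0, a2=1.0):
    """Roughness of two sine partials (assumed Sethares-style fit)."""
    f_low, f_high = min(f1, f2), max(f1, f2)
    s = 0.24 / (0.021 * f_low + 19.0)  # critical band widens with register
    x = s * (f_high - f_low)
    return a1 * a2 * (math.exp(-3.5 * x) - math.exp(-5.75 * x))

def interval_dissonance(semitones, f0=261.63, n_partials=6):
    """Sum roughness over all pairs of partials between the two tones."""
    lower = [f0 * k for k in range(1, n_partials + 1)]
    ratio = 2.0 ** (semitones / 12.0)  # equal-tempered interval
    upper = [f * ratio for f in lower]
    return sum(pair_dissonance(p, q) for p in lower for q in upper)

# Dissonance curve for every interval within one octave
curve = {st: interval_dissonance(st) for st in range(0, 13)}
for st, d in curve.items():
    print(f"{st:2d} semitones: {d:.3f}")
```

Under these assumptions the curve peaks at small intervals and dips at the perfect fifth (7 semitones) relative to the tritone (6 semitones), which is the qualitative shape described for Fig. 1b.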
Although the dissonance curve (Fig. 1) continues to be
an important success in the science of music perception, the
total dissonance of triads (as calculated from all combina-
tions of partials in the triads) does not explain the results
from behavioral experiments on the perception of such
chords (Table 1; Fig. 2). The usual explanation of this
theoretical failure is that there is a firm (perhaps universal)
psychophysical basis for the perception of 2-tone intervals
(Fig. 1b), but that the perception of more complex musical
stimuli, starting with 3-tone triads, is dominated by
learning effects (musical traditions, training, etc.) that make
the psychophysics of interval perception relatively unim-
portant in the perception of real music.
Empirically, the main objection to a cultural explana-
tion of harmony is the fact that normal subjects from
various musical cultures and very young children with only
minimal exposure to music and without musical training,
distinguish among the common triads, and perceive the
resolved/unresolved character and affective valence of
major and minor chords in a consistent way. Specifically,
augmented and diminished chords are perceived as being
rather unstable, as compared to major and minor chords
(Roberts 1986; Cook et al. 2007). Similarly, keeping all
other variables constant, major chords are perceived as
being relatively strong, happy and bright with a
positive affective valence, compared to the slight negative
affect of the minor chords (Kastner and Crowder 1990).
These common perceptions remain inexplicable on the
basis of the calculated consonance/dissonance of intervals.
We have consequently introduced a three-tone tension
factor to the psychophysical model specifically to solve the
theoretical difficulty of explaining diatonic chord perception
solely on the basis of interval dissonance. By
considering the relative size of the two neighboring
intervals in any triad (Fig. 2d) (Meyer 1956), the total
instability can be calculated as the sum of two-tone
dissonance and three-tone tension, and the results com-
pared against experimental data on chord perception. As
shown in Table 1, the model predictions of the relative
stability of the most important triads in Western diatonic
music (last column) agree well with both the results of
behavioral experiments (columns 3 and 4) and historical
usage (column 2).
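A minimal sketch of the three-tone tension idea follows. The text states only that tension depends on the relative size of the two neighboring intervals and is greatest when the middle tone lies exactly midway. The Gaussian shape and the width parameter `alpha` used here are assumptions for illustration, and the sketch considers fundamentals only (no upper partials).

```python
import math

def tension(lower_interval, upper_interval, alpha=0.6):
    """Tension of a triad from its two stacked intervals (in semitones).

    Assumed form: a Gaussian of the interval difference, maximal (1.0)
    when the intervals are equal. alpha is a hypothetical width.
    """
    diff = upper_interval - lower_interval
    return math.exp(-((diff / alpha) ** 2))

triads = {
    "major (4-3)": (4, 3),
    "minor (3-4)": (3, 4),
    "diminished (3-3)": (3, 3),
    "augmented (4-4)": (4, 4),
}
for name, (low, high) in triads.items():
    print(f"{name}: tension = {tension(low, high):.3f}")
```

With this assumed curve, the diminished and augmented triads receive maximal tension, while major and minor triads receive equal, low tension; in the model described above, total instability would add the two-tone dissonance to this term.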
The circle of fifths and the cycle of modes
By making a distinction between interval dissonance
(Fig. 1a) and triadic tension (Fig. 2d), we have found that
it is possible to explain the regularities of traditional
diatonic harmony on a strictly psychophysical basis. That
is, any triad containing a whole-tone or semitone disso-
nance will be unstable (requiring harmonic resolution
through the movement of one or more tones to produce a
stable triad) solely as a consequence of the dissonant
Fig. 1 The psychophysical model of 2-tone interval perception (Plomp and Levelt 1965) (a) and the total dissonance curve (b) obtained when upper partials are also included in the calculations. The theoretical curve in (b) and the experimental data (filled circles) are from Kameoka and Kuriyagawa (1969)
interval (regardless of the position of the third tone).
Among the remaining ten non-dissonant triads in traditional
harmony theory (combinations of three tones within one
octave of the 12-tone scale without small intervals of 1 or 2
semitones and without pitch class repetition), some are
perceived as harmonically stable and resolved, while
others are perceived as unstable and unresolved. All ten
of these chords consist of intervals of 3~6 semitones, but
their harmonic qualities differ remarkably depending on the
interval substructure. Some have the sonority of the major
mode (triads with intervals of 4-3, 3-5 and 5-4 semitones),
some are minor (intervals of 3-4, 4-5 and 5-3 semitones)
and some are inherently tense, unresolved and amodal
(the diminished chords with intervals of 3-3, 3-6 and 6-3
semitones and the augmented chord with intervals of 4-4
semitones, i.e., the tension chords). Clearly, no interval
alone determines the major/minor/tension character of the
chord, nor is the total span of the triad (6~9 semitones) a
decisive factor. On the contrary, the factor that determines
the harmonic stability of consonant triads is the difference
in the magnitude of the two intervals contained in any triad.
In other words, harmonic sonority is a function of both 2-
tone effects (consonance and dissonance) and 3-tone effects
(tension and resolution).
Aside from issues concerning the perception of consonant
and dissonant intervals, we have proposed a Cycle of
Modes that summarizes the acoustical relationships among
major, minor and tension chords. The Cycle expresses the
fact that, starting with any of the 10 non-dissonant triads, a
semitone increase or decrease in any tone will alter the
harmonic mode one step clockwise or counter-clockwise
(Fig. 3a). A semitone decrease from tension leads to a
major chord and a semitone increase leads to a minor
chord, regardless of which tone is raised or lowered.
Further semitone steps lead progressively to minor, tension
and major chords (moving counter-clockwise with semitone
falls) or to major, tension and minor chords (moving
clockwise with semitone rises) and continue indefinitely,
provided only that dissonant intervals of 1 to 2 semitones are
avoided (Fig. 3c). Although the regularities of harmony
perception are usually described in terms of the Circle of
Fifths from traditional harmony theory (Fig. 3b), we have
shown that the harmonic relations embodied in the Circle of
Fifths can themselves be explained in terms of the
Table 1 The empirical and theoretical sonority of the common triads
Columns (left to right): chord type; empirical sonority: incidence in classical music (Eberlein 1994), evaluation in laboratory experiments (Roberts 1986; Cook et al. 2007); theoretical sonority predicted by various interval models (Plomp and Levelt 1965; Kameoka and Kuriyagawa 1969; Parncutt 1989; Sethares 1999) and by our model (Cook and Fujisawa 2006)
Major 1 (51%) 1 1 2 2 2 2 1
Minor 2 (37%) 2 2 2 2 3 2 2
Dim 3 (9%) 3 4 5 4 4 4 4
Sus4 4 (2%) 3 1 1 1 3
Aug 5 (
psychophysically-defined Cycle of Modes. Specifically,
three consecutive semitone steps in the Cycle of Modes
lead from, for example, one major chord to a second nearby
major chord, and these correspond to transitions
among the tonic, dominant and subdominant chords in
any chosen key in traditional harmony theory (the neigh-
boring keys in the Circle of Fifths, Fig. 3b).
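The interval-based definition of the ten non-dissonant triads can be checked by enumeration. The sketch below lists every pair of stacked intervals of 3 to 6 semitones whose total span is at most 9 semitones, then labels each pair using the interval structures given in the text. The set and function names are illustrative only.

```python
# Interval structures (lower interval, upper interval), as listed in the text
MAJOR = {(4, 3), (3, 5), (5, 4)}
MINOR = {(3, 4), (4, 5), (5, 3)}
TENSION = {(3, 3), (3, 6), (6, 3), (4, 4)}  # diminished and augmented

def classify(lower, upper):
    """Return 'M', 'm' or 'T' for a non-dissonant triad, else None."""
    pair = (lower, upper)
    if pair in MAJOR:
        return "M"
    if pair in MINOR:
        return "m"
    if pair in TENSION:
        return "T"
    return None

# All triads built from intervals of 3-6 semitones with a span of 6-9 semitones
triads = [(a, b) for a in range(3, 7) for b in range(3, 7) if a + b <= 9]
labels = {pair: classify(*pair) for pair in triads}
for pair, label in labels.items():
    print(pair, label)
```

The enumeration yields exactly ten triads: three major, three minor and four tension chords. The label depends on the pair of intervals rather than on any single interval or on the total span.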
We have discussed the musical implications of the Cycle
of Modes elsewhere (Cook 2002, 2009, 2011; Cook and
Fujisawa 2006; Cook and Hayashi 2008); here we note
only that, in contrast to the complexities of traditional
harmony theory, the extreme simplicity of the relationships
among the triads (as summarized in Fig. 3a) means
that straightforward psychophysical experiments on three-
tone harmonies and triadic cadences are possible. That is,
using the acoustical regularities summarized in the Cycle of
Modes, harmony perception can be reduced to two essential
phenomena: (i) the emotionally-ambivalent tension of
intervallic equivalence (Meyer 1956), and (ii) the resolution
of tension by either a semitone rise in pitch (to the
characteristic affect of the minor mode) or a semitone fall
in pitch (to the characteristic affect of the major mode).
Therefore, before proceeding to the genre-dependent
complexities of harmonic movement in real music, it is
both possible and desirable to explore the basic phenomena
of harmony through semitone steps around the Cycle of
Modes, with full control over the acoustical signal and its
(minimal) musical context. The Cycle of Modes implies
that there are only three classes of (non-dissonant) triad:
major (M), minor (m) and unresolved tension (T), so that
the simplest set of triad-to-triad cadences consists of 21
chord pairs: a first triad (M, m or T) followed by a second
triad (M, m or T) formed by raising or lowering the tones of
the first triad by 0, ±1, ±2 or ±3 semitones (avoiding all
triads containing a dissonant interval). Given the three basic
modes and these seven nearest transitions, the triad
combinations shown in Table 2 constitute the basic set of
all (non-dissonant) harmonic transitions in diatonic music.
This set of triadic transitions was used as the stimuli in the
present fMRI experiment.
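The 21 conditions follow mechanically from the Cycle of Modes. The sketch below encodes the cycle as the order of modes reached by successive semitone rises (tension to minor to major to tension, as stated in the text) and derives the chord pair for each ending mode and each step size from -3 to +3. The variable names are illustrative.

```python
# Order of modes reached by successive semitone rises: T -> m -> M -> T ...
RISE_ORDER = ["T", "m", "M"]

def mode_after(start, steps):
    """Mode reached from `start` after `steps` semitone steps (negative = falls)."""
    return RISE_ORDER[(RISE_ORDER.index(start) + steps) % 3]

STEPS = range(-3, 4)
table = {}
for end in ("M", "m", "T"):
    table[end] = {}
    for k in STEPS:
        # The starting mode is the one that reaches `end` after k steps
        start = mode_after(end, -k)
        table[end][k] = f"{start}-{end}"

print("steps   " + "  ".join(f"{k:+d}".rjust(3) for k in STEPS))
for end, row in table.items():
    print(f"end {end}   " + "  ".join(row[k] for k in STEPS))
```

Three ending modes times seven step sizes give the 21 chord-pair conditions, and a shift of three steps in either direction returns to the starting mode.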
The basic prediction from our psychophysical model was
that, even without a complex musical context (typical of
most brain-imaging studies of harmony), there should be
characteristic brain responses to the three distinct harmonic
modes. In light of previous work indicating particularly
strong activation of the right orbitofrontal and inferior
prefrontal cortex in response to affective stimuli and/or
musical harmony, we anticipated that the affective valence
for harmonic stability would be distinguishable in these
areas of the cerebral cortex of the right hemisphere.
Fig. 3 a The Cycle of Modes expresses the semitone relationships among all non-dissonant triads. b The Circle of Fifths in traditional harmony
theory. c A segment of the endless sequence of modality changes summarized in the Cycle of Modes
Table 2 The harmonic conditions used in the fMRI experiment
Magnitude of change (in st steps)   −3    −2    −1    0     +1    +2    +3
Chord pairs ending in major         M-M   m-M   T-M   M-M   m-M   T-M   M-M
Chord pairs ending in minor         m-m   T-m   M-m   m-m   T-m   M-m   m-m
Chord pairs ending in tension       T-T   M-T   m-T   T-T   M-T   m-T   T-T
The numerals in the top row indicate the number of semitones by which the second chord differs from the first chord. The two characters in each
cell below indicate the musical mode of the two triads: M (major), m (minor), and T (amodal tension, i.e., diminished or augmented chords)
Moreover, as we have argued in detail elsewhere (Cook
2002, 2009, 2011), the direction of pitch changes in moving
among the harmonic modes in the Cycle of Modes parallels
the affect of the so-called frequency code or sound
symbolism known from ethology (Morton 1977) and
linguistics (Ohala 1983). That is, a rise in pitch in human or
animal vocalizations is typically associated with an emotional
state of weakness, deference or withdrawal, whereas
a fall in pitch is associated with strength, assertion or
territorial dominance. In human languages, the universality
of sound symbolism is seen in the cross-cultural use of
pitch rises in interrogatives and pitch falls in commands
(Ohala 1983), again with the affective connotation of
weakness or strength, respectively. In the realm of
music, the Cycle of Modes indicates that a fall in pitch from
a state of harmonic tension corresponds to the affective
valence of the major mode (the positive affect of
strength), while a rise in pitch from tension corresponds
to the affective valence of the minor mode (the negative
affect of weakness). Although we have no prediction
about the brain localization of the affect entailed by sound
symbolism, we anticipated that the unambiguous regularities
of sound symbolism and the Cycle of Modes might be
reflected in related brain responses.
Experimental procedure
Materials and methods
Subjects 12 right-handed, undergraduate, Japanese males
between the ages of 20 and 24 served as subjects. None
were musically-trained, but all were familiar with both
traditional Japanese and popular Western music. Both
handedness and musical training were evaluated by self-
report. Subjects gave written informed consent in compli-
ance with the ethical procedures of BAIC at ATR (Ltd.,
Kyoto), and participated for a monetary reward.
Stimuli All musical stimuli consisted of two sequential
three-tone grand piano chords from an equitempered 12-
tone scale (triads); each chord was 1.5 s in duration. The
two chords were followed by a 3 s pause during which a
motor response was required. All of the chords were of
three types: major chords, minor chords and unresolved
chords that did not contain dissonant intervals (tension
chords). See Table 2. White noise conditions of 3 s
duration were also presented in control blocks.
All stimulus conditions were identical in including a 3.0 s
auditory stimulus, delivered over non-magnetic headphones
(Hitachi Advanced Systems AS-3000H, fMRI-use headsets)
at a comfortable auditory level (approximately 85 to 90 dB
SPL). The headphones were designed to minimize noise
(above 20 dB at 1 kHz) and allow for accurate perception of
the stimulus over the noise of the MRI equipment (see
Behavioral Results, below). Conditions differed solely in
terms of the nature of the two chords presented and,
therefore, the nature of the pitch changes from the first to
the second chord. As shown in Table 2, the second chord can
be characterized as being a transition from the first chord due
to a rise or fall in pitch of 0, 1, 2 or 3 semitones. Because of
the known regularities of traditional Western harmony, this
change in pitch between the two chords can also be
characterized in terms of a change in musical mode, as
specified in the Cycle of Modes (Fig. 3).
Stipulating only that chords with small dissonant
intervals (of 1 or 2 semitones) are excluded, it is a
regularity of traditional harmony theory that a semitone
increase in one of the pitches in a tension chord will result
in a minor chord, whereas a semitone decrease will lead to
a major chord. Using the augmented chord as an example,
the transition from tension to a resolved minor chord meant
a change from a triad consisting of two 4-semitone intervals
(e.g., C-E-G#) to minor chords with intervals of 3 and 4
semitones, 4 and 5 semitones, or 5 and 3 semitones (e.g.,
C#-E-G#, C-F-G# or C-E-A). Contrarily, the transition from
a tension chord to a resolved major chord meant a change
from a 4-4 triad to a major chord with intervals of 4 and 3,
3 and 5, or 5 and 4 semitones (e.g., B-E-G#, C-D#-G# or
C-E-G). Musical definitions of these chords, their inver-
sions and their musical usage are extremely complex, but
their acoustical description in terms of interval size is
unambiguous (Fig. 3a & c).
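The augmented-chord example can be reproduced numerically. The sketch below represents C-E-G# as semitone numbers (using the MIDI convention C = 60, which is an assumption of this illustration), moves each tone up or down by one semitone, and labels the result from its two stacked intervals. The helper names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# (lower interval, upper interval) -> mode, as defined in the text
MODES = {
    (4, 3): "M", (3, 5): "M", (5, 4): "M",
    (3, 4): "m", (4, 5): "m", (5, 3): "m",
    (3, 3): "T", (3, 6): "T", (6, 3): "T", (4, 4): "T",
}

def mode_of(chord):
    """Classify a three-tone chord by the sizes of its two stacked intervals."""
    low, mid, high = sorted(chord)
    return MODES.get((mid - low, high - mid))

def name_of(chord):
    """Spell the chord from lowest to highest tone using sharps."""
    return "-".join(NOTE_NAMES[p % 12] for p in sorted(chord))

augmented = (60, 64, 68)  # C-E-G#, two 4-semitone intervals
neighbors = {}
for step in (+1, -1):
    for i in range(3):
        chord = list(augmented)
        chord[i] += step  # raise or lower one tone by a semitone
        neighbors[(name_of(chord), step)] = mode_of(chord)

for (name, step), mode in neighbors.items():
    print(f"{step:+d} semitone -> {name}: {mode}")
```

Every semitone rise from the augmented chord gives a minor triad (C#-E-G#, C-F-G#, C-E-A) and every semitone fall gives a major triad (B-E-G#, C-D#-G#, C-E-G), matching the examples in the text.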
Furthermore, following the Cycle of Modes, pitch
increases or decreases of two or three semitone steps
produce equally unambiguous changes in modality. For
example, an increase of 2 semitones from a pitch
combination that is a major chord results in a minor chord,
whereas an increase of 3 semitones implies a change from
one major chord to a different major chord. Similarly
unambiguous regularities are found for minor and tension
chords (see Cook 2009, for a full explication of the Cycle
of Modes and its relation to the Circle of Fifths of
traditional harmony theory).
Although traditional harmony theory is not usually
discussed in terms of its underlying psychophysics or the
number of semitone steps in cadences leading from one
chord to another, the regularity inherent to the 12-tone
equitempered scale and the triads implies a distinct pattern
of characteristic harmonies in relation to small (1~3
semitone) alterations in pitch. This regularity allows for
an extremely simple experimental design for the study of
harmony perception, a design that can be described either in
the (complex, genre-dependent) terminology of traditional
music theory or in the relatively simple psychophysical terms
of interval size (Table 2).
Task The task for the experimental subjects was to indicate
with a button press whether the second chord differed from
the first chord due to an increase, a decrease or no change
in pitch(es). No stimuli had both pitch increases and
decreases. The major, minor or tension modality of the
chords was irrelevant to the subjects' task, but was crucial
for data analysis in terms of how the brain responded to the
different harmonic stimuli.
fMRI technique
Procedure A standard block design was used in which 5
chord pairs from the same condition were presented at a
rate of 1 every 6-s (an inter-stimulus interval of 3 s). This
was followed by presentation of 3 white-noise control
stimuli. One block of each of the 21 chord types was
presented in random order for each of 3 runs that
constituted a complete session for each subject. One session
was approximately 50 min in duration (Fig. 4). To record
subject responses, a 3-button response-pad was fitted to the
right hand. Subjects were required to respond with one button
press per stimulus during the three second ISI before the
presentation of the next stimulus: an index finger press
indicated a decrease in pitch, a middle finger press indicated
no pitch change, and a ring finger press indicated an increase
in pitch. Subjects were not informed of the block design and
responded to each stimulus in each block. For the white-noise
control stimuli, they were required to respond at random
with a button press. All subjects practiced with the various
types of stimuli prior to fMRI scanning.
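The timing of the block design can be checked with simple arithmetic. The sketch below uses only figures stated in the text (1.5 s chords, a 3 s response pause, 5 chord pairs and 3 white-noise stimuli per block, 21 blocks per run, 3 runs, and a 6 s repetition time) and assumes that each white-noise trial occupies the same 6 s slot as a chord-pair trial.

```python
CHORD_S = 1.5          # duration of each chord
PAUSE_S = 3.0          # response pause after the second chord
PAIRS_PER_BLOCK = 5    # chord pairs of one condition per block
NOISE_PER_BLOCK = 3    # white-noise control stimuli per block
BLOCKS_PER_RUN = 21    # one block per harmonic condition
RUNS = 3
TR_S = 6.0             # repetition time of the functional scans

trial_s = 2 * CHORD_S + PAUSE_S              # one stimulus every 6 s
harmony_s = PAIRS_PER_BLOCK * trial_s        # harmonic part of a block
# Assumption: noise trials use the same 6 s slot as chord-pair trials
block_s = (PAIRS_PER_BLOCK + NOISE_PER_BLOCK) * trial_s
scans_per_block = block_s / TR_S
session_min = RUNS * BLOCKS_PER_RUN * block_s / 60.0

print(f"trial: {trial_s} s, harmony per block: {harmony_s} s")
print(f"block: {block_s} s = {scans_per_block:.0f} scans")
print(f"session: {session_min:.1f} min")
```

Under that assumption a block lasts 48 s (8 scans, 30 s of which carry the harmonic condition) and a full session lasts 50.4 min, consistent with the stated "approximately 50 min".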
Imaging Brain imaging was performed in a 1.5 Tesla
Marconi Magnex Eclipse scanner using an interleaved
sequence. First, high-resolution anatomical T2 weighted
images were acquired using a fast spin echo sequence.
These scans consisted of 50 contiguous axial slices with a
0.75 × 0.75 × 3 mm voxel resolution covering the cerebral
cortex and cerebellum. Secondly, functional T2* weighted
images were acquired using a gradient echo-planar imaging
sequence (echo time, 55 ms; repetition time, 6,000 ms; flip
angle, 90°). A total of 50 contiguous axial slices were
acquired with a 3 × 3 × 3 mm voxel resolution.
Data analysis Images were preprocessed using programs
within the SPM2 software package (Wellcome Department of
Cognitive Neurology, London, UK). Differences in acquisition
time between slices were accounted for, movement artifact was
removed, and the images were then spatially normalized to a
standard space using a template EPI image (Bounding Box,
x=−90 to 91 mm, y=−126 to 91 mm, z=−72 to 109 mm;
voxel size, 3 × 3 × 3 mm). Images were smoothed using an
8-mm FWHM Gaussian kernel.
Regional brain activity for the various conditions was
assessed on a voxel-by-voxel basis. A random effects
model was employed for group analysis (in a second stage
following individual analysis). The data were modeled
using a box-car function convolved with the hemodynamic
response function. In addition, global normalization and
grand mean scaling were carried out.
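For readers unfamiliar with the box-car approach, the following sketch builds a predicted response for two consecutive blocks. It is not the SPM2 implementation: the double-gamma response function (peak near 6 s, undershoot near 16 s, ratio 1/6) is a commonly used approximation adopted here as an assumption, and the 30 s on / 18 s off timing comes from the block design described above.

```python
import math

def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for non-positive t."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

def hrf(t):
    """Assumed double-gamma hemodynamic response function."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0

TR = 6.0        # repetition time (s), from the text
DT = 0.5        # fine time grid for the convolution (s)
BLOCK_S = 48.0  # full block: 30 s harmony + 18 s white noise
ON_S = 30.0

n_fine = int(2 * BLOCK_S / DT)                       # two blocks
stim = [1.0 if (i * DT) % BLOCK_S < ON_S else 0.0 for i in range(n_fine)]
kernel = [hrf(i * DT) * DT for i in range(int(32.0 / DT))]

# Discrete convolution of the box-car with the response function
pred = [
    sum(stim[i - j] * kernel[j] for j in range(len(kernel)) if i - j >= 0)
    for i in range(n_fine)
]
regressor = pred[:: int(TR / DT)]                    # sample at scan times

for scan, value in enumerate(regressor):
    print(f"scan {scan:2d} (t = {scan * TR:4.0f} s): {value:+.3f}")
```

The sampled values rise during the harmonic part of each block and fall back (with a small undershoot) during the white-noise part; a regressor of this kind is what a general linear model would fit to each voxel's time series.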
In effect, the data for each of the 21 conditions were
collected from the three runs per subject and the grand
mean calculated using the data for all 12 subjects.
Results
Behavioral results
All 12 subjects performed above a chance level (33%) in
the detection of the direction of pitch change (up, down, or
Fig. 4 The experimental protocol. The session for each subject
consisted of three consecutive runs, divided into 21 randomized
blocks (consisting of 8 scans over 48 s), corresponding to the
harmonic conditions in Table 2. The 5 repetitions in each block were
of the same harmonic condition (e.g., M-m: with a semitone fall from
the initial major chord to the final minor chord; T-m: with a semitone
rise from the initial tension chord to the final minor chord; etc.) played
at different regions on the keyboard. This type of block design led to
relatively robust responses to a specific harmonic condition over a
time interval of 30 s. The sequence of conditions in the 21 blocks was
randomized separately in each of the three runs per subject
no change) between the first and second triad. Correct
responses ranged from 42 to 95%, with an average of 69%.
As shown in Fig. 5, there was consistently high detection of
the "no change" conditions. Conditions with 1 semitone
change were significantly more difficult than for cadences
entailing pitch changes of 2 or 3 semitones, but no
significant differences were found among any of the
harmonic conditions beginning (72%, 70% and 67%) or
ending (72%, 69% and 69%) with major, minor or tension
chords, respectively. Although the six most difficult
conditions were those in which the two chords differed by
one semitone, these included both conditions showing
strong hemodynamic increases and weak hemodynamic
increases (Sound Symbolism), suggesting that task difficulty
alone was not the factor determining the brain response.
There was a small improvement in performance over the
course of the three sessions (63, 70 and 76%), reaching
statistical significance (n=21, t=2.45, p
allows one to determine the relative activation in paired
conditions for specific regions of interest (ROI). Here we
have defined ROIs as all areas in which there was
significantly increased activity relative to the white noise
condition, and made pair-wise comparisons among the
major, minor and tension chords. The result of ANOVA
using chord types as factors showed a main effect in two
regions: right orbitofrontal cortex (OFC, BA47) (F(2,22)=
5.93, p
Discussion
Harmony conditions minus baseline (white noise)
Frontal cortex IFC (BA47/13) regions have been reported
to be involved in the processing of emotional responses
(Wright et al. 2004; Janata 2009) and musical priming
(Tillmann et al. 2003). As reported by Khalfa et al. (2005),
Koelsch et al. (2005), and Levitin and Menon (2005),
emotions based on anticipation in musical progressions
evoke IFC responses primarily in the right hemisphere.
Presumably, because we used two-chord progressions as
stimuli, this area was also activated in our study. Activations
of dlPFC (BA46)/dorsal Broca's area (BA 44/45) have
been reported in various studies using musical stimuli.
Because this region in the right hemisphere corresponds to
Broca's area in the left, Koelsch et al. (2006) have
suggested that it is involved in processing musical syntax.
Brown and colleagues have suggested that this area is
involved in template matching for musical elements (Brown
et al. 2004; Brown and Martinez 2007).
Temporal cortex Contrary to our expectations, primary
auditory cortex (BA41/22) showed significant activation
in response to the white noise as a rest condition. Listening
to musical stimuli activates auditory cortex more strongly
(Brown et al. 2004; Brown and Martinez 2007) and
Heschl's gyrus (BA22) is also involved in pitch processing
Table 3 Regions of significant brain activation (harmony minus white noise)
Lobe Region BA Left t-score Right t-score
Talairach coordinate (x, y, z) Talairach coordinate (x, y, z)
Frontal Inferior prefrontal gyrus 47/13 30 19 4 7.15 34 23 3 10.89
Middle frontal gyrus 8/9 44 12 40 6.78
Inferior frontal gyrus 9 42 7 27 6.30 46 11 31 5.48
Middle frontal gyrus 6 24 6 44 6.54
34 2 50 5.69 34 1 53 6.75
Dorsolateral prefrontal cortex 46 40 18 19 8.57
48 34 19 7.40
Dorsal Broca's area 44/45 46 16 14 7.17
IFG/MFG 47 42 35 2 7.22
Orbitofrontal cortex 11/10 26 52 13 6.93
SFG Medial 8/6 6 20 49 8.06
2 14 56 7.24
4 27 37 6.92
Temporal Primary auditory cortex 41/22 57 21 8 9.51 61 21 5 7.10
Heschl's gyrus 22 48 10 1 9.27 50 8 1 6.39
STG/Insula 13/22 44 17 6 7.79 46 2 5 10.30
Superior temporal gyrus 38 48 7 10 6.40 38 3 14 6.51
Parietal Inferior parietal lobule 40 44 39 41 6.56
44 31 42 6.16
40 46 43 5.87 36 43 41 6.54
Occipital Cuneus/posterior cingulate gyrus 18/31/23 10 69 13 9.38
18 0 75 11 6.01
Cerebellum Culmen 4 58 4 7.00
Declive 28 61 19 8.39 30 61 20 6.87
24 65 17 6.34
12 77 16 9.21
Others Caudate 10 3 17 7.89
16 5 13 6.21 14 7 14 5.82
Thalamus 6 3 13 7.81
Lentiform nucleus 20 4 2 7.22
Coordinates refer to standard stereotaxic space (Talairach and Tournoux 1988), t-scores are FDR-corrected, with a threshold set at p=0.005, and
voxel extent k=14
(Zatorre 2001). The musical stimuli have distinct pitches
while white noise does not, so that it is expected that this
region would show activation in response to harmony. The
activation of STG (BA38) has been reported in several
previous studies on harmony processing (e.g. Satoh et al.
2001, 2003; Brown et al. 2004; Brown and Martinez 2007).
We observed similar activation in the present study.
Fig. 7 Comparison of the brain responses to stimuli ending in minor, tension or major chords (FDR-corrected, threshold set at p=0.02)
Fig. 8 The percent signal change detected using the MarsBar algorithm for comparison of the brain responses to stimuli ending in major, minor or tension chords. The statistics are not strong, but the sequence is qualitatively the same as (a) or the inverse of (b) the stability sequence found in behavioral studies (e.g., Roberts 1986)
Cerebellum We have found small foci of activation in the
cerebellum in several different harmonic conditions, particularly
those involving harmonic tension (Figs. 9c and 10b~d).
Previous studies have also pointed out activation in the
cerebellum when listening to music (Levitin and Menon
2003; Tillmann et al. 2003; Pallesen et al. 2005). The
involvement of the cerebellum in rhythm production and
possibly perception is likely, but Levitin (2006) suggests that
it is implicated in emotional processing. This is a topic in
need of further study.
Fig. 9 Subtracting out the activation obtained in a white noise
condition, panels a, b and c show the brain activation in the three
conditions where there was a semitone rise between the first chord and
the second chord. a shows the weak activation in moving from a
major chord to a tension chord. b shows the modest activation in
moving from a minor chord to a major chord. And c shows the
stronger activation in moving from a tension chord to a minor chord.
Panel d demonstrates that the strong activation shown in (c) is not
simply a matter of the starting and ending chords (tension and minor),
but a combination of such chords and the direction of tonal change (a
semitone rise) (all images, uncorrected p
Chord type (major, minor and tension)
Many previous studies have suggested that orbitofrontal
cortex or the cuneus/posterior cingulate gyrus is involved in
harmony perception (e.g. Satoh et al. 2001, 2003; Brown et
al. 2004; Brown and Martinez 2007), consistent with our
own findings. We have also studied the relationship
between the three general types of non-dissonant chords
and brain activation, and found that the right orbitofrontal
cortex has a negative and the cuneus/posterior cingulate
gyrus has a positive correlation with chordal instability. The
same tendency was found in a previous study in which the
orbitofrontal cortex had a negative and the precuneus had a
positive correlation with the dissonance level (Blood et al.
1999). Moreover, Pallesen et al. (2005) showed that minor
and dissonant chords elicited larger activation than did
major chords around the cuneus/posterior cingulate gyrus.
In general, chords containing dissonant intervals give an
impression of instability, but these brain regions showed
relatively strong activation even to the tension chords that
are, technically, not notably dissonant (Cook and Fujisawa
2006). From the similar trends seen in our study, we
suggest that these sites are activated by the higher-level
acoustical feature of 3-tone tension/non-tension rather than
the acoustical feature of 2-tone consonance/dissonance.
Although our findings on the relationship between chord
type and the affective impression of various harmonies are
still preliminary, they clearly indicate that quantitative study
of the brain response to 2-tone vs. 3-tone acoustical
properties of chords is possible. In light of the importance
Table 4 (continued)
Lobe  Region  BA  Left: Talairach coordinate (x, y, z), t-score  Right: Talairach coordinate (x, y, z), t-score
tension(+1)minor
Frontal  Superior Frontal Gyrus  8  4 17 51  8.93
  Precentral Gyrus/MFG  6/9  52 1 27  7.54  43 18 30  9.14
  Middle Frontal Gyrus  6  32 2 51  5.02
  MFG/IFG  10  30 49 11  4.76  41 48 3  5.72
Temporal  Transverse Temporal Gyrus  41  57 21 9  5.14
Parietal  Inferior Parietal Lobule  40  46 35 43  7.66  39 49 42  6.59
Sub-lobar  Insula  13  46 8 1  3.92
  Claustrum  29 18 2  4.41
Cerebellum  Declive  31 71 20  4.60  31 67 21  5.14
  Declive  7 78 18  7.48
  1 37 27  6.71
tension(-2)minor
Frontal  Middle Frontal Gyrus  6  26 8 41  6.56
  Precentral Gyrus  6  46 1 51  4.91
Temporal  Superior Temporal Gyrus  22  41 25 5  6.10
Parietal  Inferior Parietal Lobule  40  43 28 38  6.81
  Postcentral Gyrus  3  32 25 41  6.19
  Postcentral Gyrus  43  57 12 17  5.64
Occipital  Precuneus  31  4 69 21  5.10
Limbic  Cingulate Gyrus  32/24  5 23 40  6.49  9 23 26  4.97
  Cingulate Gyrus  31  19 22 39  6.20
  Posterior Cingulate  30  19 66 6  6.87
Sub-lobar  Insular  13/45  29 24 13  4.93  30 25 3  4.73
Cerebellum  Declive  37 63 22  7.21
  Declive  6 71 9  7.08
  21 33 23  5.90
Uncorrected p
of 3-tone psychophysics for understanding the regularities
of diatonic harmony, we conclude that further brain-
imaging studies in music perception should be undertaken
using psychophysically-defined tonal stimuli, in preference
to musically complex harmonic cadences whose description
in terms of acoustical physics is impossible.
General discussion
Music, together with language and tools, is among the
hallmarks of humanity, and is found in every known
human culture. Most music has affective valence, and
musical elements as simple as pitch triads often have
Fig. 10 Again subtracting out the activation obtained in a white noise
condition, the brain activation in the three conditions where there was
a semitone fall (-1) between the first chord and the second chord is
shown in panels a, b and c. a shows the negligible activation in
moving from a major chord to a minor chord. b shows the weak
activation in moving from a minor chord to a tension chord. c shows
the strong activation in moving from a tension chord to a major chord.
Note that the brain activity in a similar condition (tension to major, but
where there was a two semitone rise (+2) from the first to the second
chord) is again weak (d), indicating that both the starting and finishing
chords and the direction of change are important factors (all images,
uncorrected p
Table 5 (continued)
Lobe  Region  BA  Left: Talairach coordinate (x, y, z), t-score  Right: Talairach coordinate (x, y, z), t-score
  Inferior Frontal Gyrus  45  54 13 23  4.35
  Inferior Frontal Gyrus  47  34 18 3  4.82
  Precentral Gyrus  44  53 9 11  4.96
  Inferior Frontal Gyrus  13  39 29 4  5.86
Temporal  Transverse Temporal Gyrus  41  57 19 10  4.99
Parietal  Inferior Parietal Lobule  40  38 34 38  4.62  33 47 41  4.48
Cerebellum  13 61 23  6.53
  Declive  11 77 16  5.46  0 65 16  5.71
  Declive  35 69 21  5.60  35 67 21  6.26
tension(-1)major
Frontal  Superior Frontal Gyrus  9  43 43 29  5.73
  Superior Frontal Gyrus/MFG  6  1 18 56  7.47
  Middle Frontal Gyrus  6  32 1 55  5.61
  Precentral Gyrus  6  41 0 34  4.69
Temporal  Superior Temporal Gyrus  42  60 21 8  4.85
  Superior Temporal Gyrus  22  48 10 0  5.21
Parietal  Inferior Parietal Gyrus  40  37 41 43  5.45
Sublobar  Insula  13  29 27 4  7.08
Cerebellum  2 45 23  3.93
  Declive  5 73 16  6.45
tension(+2)major
Frontal  Superior Frontal Gyrus  8  4 17 51  13.16
  Middle Frontal Gyrus  9  50 14 30  7.15
  Middle Frontal Gyrus  6  28 10 59  4.72
  Middle Frontal Gyrus  47  38 38 6  5.20
  Inferior Frontal Gyrus  9  43 4 32  7.37
  Inferior Frontal Gyrus  9  54 8 32  4.42
  Precentral Gyrus  6  48 5 51  4.88
  Precentral Gyrus  44  53 9 6  5.74
  Inferior Frontal Gyrus  13  34 9 13  4.75
Temporal  Transverse Temporal Gyrus  41  57 21 10  7.84
Parietal  Inferior Parietal Lobule  40  45 43 56  4.69
  Superior Parietal Lobule  7  37 54 50  4.44
Occipital  Precuneus  7  26 53 51  4.76
  Postcentral Gyrus  5  35 46 61  5.74
  Fusiform Gyrus  19  28 62 5  5.04
Sublobar  Insula  13  36 20 0  5.60
  Caudate  11 14 20  5.23
Cerebellum  Declive  31 65 18  7.11
  Inferior Semi-Lunar Lobule  3 68 39  5.09
  Culmen of Vermis  6 61 2  4.63
Uncorrected p
emotional implications. It remains debatable to what extent
such affective associations are learned or are innate aspects
of the pitch structure in melodies and chords, but it is
certain that the major-minor dichotomy is pervasive (cross-
cultural if not universal), can be perceived by both adults
and young children (Kastner and Crowder 1990), and is not
dependent on formal musical training.
As shown in Fig. 7, brain activation was found to be
similar in response to all harmonic modes in our experiment
and the pattern of activation is remarkably similar to that
reported by Koelsch et al. (2005). We interpret this finding
to be clear confirmation of their main conclusions
concerning the brain regions involved in the perception of
harmony. While our findings are supportive of the
conclusions drawn by Koelsch et al. (2005), we emphasize
a crucial difference in the selection of auditory stimuli in
their work and ours. That is, as stimuli they used relatively
complex harmonic cadences (consisting of 5 chords, each
consisting of 4 tones) that differed in terms of their
likelihood in traditional Western, classical music. In
contrast, our stimuli were much less acoustically complex,
but led to virtually identical brain activation. We therefore
conclude that stimuli that are musically sophisticated within
the Western idiom, but too complex to describe acoustically,
are not necessary to elicit brain activations that are
characteristic of brain responses to musical harmony.
As in many other brain-imaging studies, the musical
stimuli were presented over 6 s, which is
considerably longer than is typical of visual experiments.
Presentation of auditory stimuli over shorter intervals
should be studied in the future, but the replication of the
pattern of brain activity in response to well-known musical
cadences suggests that brain-imaging of harmony percep-
tion can be as consistent as imaging in visual, language and
motor tasks of various kinds.
In a previous fMRI study of harmony perception, we
compared the brain activation in response to three-tone
tension chords and three-tone chords containing a dissonant
interval (Cook et al. 2002; Cook 2002). Despite the fact that
both types of chord are subjectively perceived as unstable
(rough, dissonant, not beautiful, etc.), relative to
major or minor chords, the dissonant chords produced
activation in the right parietal lobe (see, also, Suzuki et al.
2008), whereas the tension chords activated bilateral frontal
cortex (RH > LH). Having thus established the reality of a
distinction between interval dissonance and triadic tension
in terms of brain activation, the present experiment was
undertaken to identify the sites involved in the three distinct
forms of non-dissonant harmony (major, minor, and
amodal tension).
A less robust, but interesting new finding of the present
study concerns the mapping of the three harmonic modes in
right orbitofrontal cortex (Fig. 11). Although we did not
predict the configuration of the cortical mapping of the
Cycle of Modes, in retrospect it is understandable that the
harmonic representation would be instantiated in 2D
cortical maps that distinguish between the dimensions of
resolved/unresolved harmonies (in a medial-lateral direction)
and the dimension of positive/negative valence
(orthogonally in a ventrocaudal-dorsorostral direction).
Insofar as (i) the cerebral neocortex makes abundant use of
2D maps for representing both sensory and motor information,
and (ii) the most salient perceptual features of musical
harmonies are their resolved/unresolved sonority and their
positive/negative valence, it is parsimonious to suppose that orthogonal
dimensions on 2D cortical maps would correspond to these
perceptual features. This is a topic in need of further study.
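The relationships among the three harmonic modes under a one-semitone change can be illustrated with a short sketch. It is not the authors' stimulus-generation code: it assumes 12-tone equal temperament, represents tones as pitch classes (C = 0), ignores inversions, and treats augmented and diminished triads as the tension chords. The names `classify` and `shift_one_tone` are hypothetical.

```python
# Illustrative sketch only: pitch classes 0-11 (C = 0) and 12-tone equal
# temperament are assumptions; all names here are hypothetical.

MAJOR = (4, 3, 5)       # semitone steps around the octave, e.g. C-E-G-(C)
MINOR = (3, 4, 5)       # e.g. C-Eb-G-(C)
AUGMENTED = (4, 4, 4)   # e.g. C-E-G#-(C)
DIMINISHED = (3, 3, 6)  # e.g. C-Eb-Gb-(C)


def classify(pitches):
    """Label a three-tone chord as 'major', 'minor', 'tension' or 'other'.

    Inversions and octave placement are ignored by reducing the chord to
    pitch classes and comparing the cyclic pattern of semitone steps.
    """
    pcs = sorted({p % 12 for p in pitches})
    if len(pcs) != 3:
        return "other"
    steps = tuple((pcs[(i + 1) % 3] - pcs[i]) % 12 for i in range(3))
    rotations = {steps[i:] + steps[:i] for i in range(3)}
    if MAJOR in rotations:
        return "major"
    if MINOR in rotations:
        return "minor"
    if AUGMENTED in rotations or DIMINISHED in rotations:
        return "tension"
    return "other"


def shift_one_tone(pitches, index, delta):
    """Return a copy of the chord with one tone moved by `delta` semitones."""
    moved = list(pitches)
    moved[index] += delta
    return tuple(moved)


if __name__ == "__main__":
    augmented = (0, 4, 8)
    for index in range(3):
        up = classify(shift_one_tone(augmented, index, +1))
        down = classify(shift_one_tone(augmented, index, -1))
        print(index, up, down)
```

Under these assumptions, raising any single tone of an augmented triad by a semitone gives a minor chord and lowering any single tone gives a major chord, which corresponds to the tension(+1)minor and tension(-1)major conditions. Other shifts can produce chords containing a 1- or 2-semitone interval, which this sketch labels "other".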
As shown in Fig. 11, there were also indications of a
harmony map in the cerebellum. In light of the known role
of the cerebellum in maintaining body equilibrium, the
rather large region of harmonic tension represented there is
of interest. By definition, the sense of harmonic tension is
concerned with the balance between two distinct pitch
intervals, suggesting that the cerebellum may also be
Fig. 11 Sites of activation in response to major (orange), minor (blue)
and tension (green) chords in right orbitofrontal cortex. The implied
2D cortical harmony map has dimensions of tension/resolution
(medial-lateral) and major/minor modality (ventrocaudal-dorsorostral).
The distinct foci of activation for major, minor and tension chords in
posterior regions are all cerebellar, suggesting the presence of a related
harmony map there. The data from the present study are not
statistically robust enough to draw firm conclusions, but suggest an
interesting direction of brain-imaging research
involved in detecting the equivalence/non-equivalence of
interval size in triads.
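The notion of equivalence versus non-equivalence of interval size can be made concrete with a minimal sketch. It assumes root-position, close-voiced triads given as MIDI note numbers (an assumption of this example, not a detail of the experiment), and the function names are hypothetical.

```python
def stacked_intervals(triad):
    """Return the (lower, upper) interval sizes of a triad in semitones."""
    low, mid, high = sorted(triad)
    return mid - low, high - mid


def intervals_are_equal(triad):
    """True when the two stacked intervals are the same size."""
    lower, upper = stacked_intervals(triad)
    return lower == upper


# Root-position triads on middle C as MIDI note numbers (assumed encoding).
examples = {
    "major": (60, 64, 67),       # 4 + 3 semitones
    "minor": (60, 63, 67),       # 3 + 4 semitones
    "augmented": (60, 64, 68),   # 4 + 4 semitones
    "diminished": (60, 63, 66),  # 3 + 3 semitones
}

if __name__ == "__main__":
    for name, triad in examples.items():
        print(name, stacked_intervals(triad), intervals_are_equal(triad))
```

In this simplified encoding the augmented and diminished triads have two equal intervals, while the major and minor triads have unequal ones. Inverted chords would need additional handling that is not attempted here.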
High-resolution mapping of the 2D representation of
harmony in the orbitofrontal cortex remains to be done. For
such a purpose, precise acoustical description of the
musical stimuli will be required, which precludes the use of
musically-complex cadences that are necessarily a mixture
of many musical components (melody, rhythm and timbre,
as well as harmony). On the one hand, we have previously
argued that pitch phenomena, in general, and harmonic
phenomena, in particular, cannot be reduced solely to 2-
tone interval consonance/dissonance effects. On the other
hand, provided that the 3-tone configurations of harmonic
triads are also brought into consideration, many of the
fundamental phenomena of traditional harmony theory can
be explained on a fully psychophysical basis (Cook 2009,
2011). In principle, it should be possible to determine the
relationships between acoustical properties, emotional
responses to auditory stimuli and activated brain regions
with a precision comparable to that already obtainable in,
for example, visual neuroscience.
Acknowledgment This work was supported by a grant from the
Japan Society for the Promotion of Science (JSPS KAKENHI,
Grant No. 18800068) and the CrestMuse project, CREST, JST. This
research was also supported in part by an Academic Frontiers Project
at Kansai University (2003–2007).
References
Blood, A. J., Zatorre, R. J., Bermudez, P., & Evans, A. C. (1999).
Emotional responses to pleasant and unpleasant music correlate
with activity in paralimbic brain regions. Nature Neuroscience, 2,
382–387.
Brett, M., Anton, J.-L., Valabregue, R., & Poline, J.-B. (2002). Region
of interest analysis using an SPM toolbox [abstract] Presented at
the 8th International Conference on Functional Mapping of the
Human Brain, June 26, 2002, Sendai, Japan. Available on CD-
ROM in NeuroImage, Vol 16, No 2.
Brown, S., & Martinez, M. J. (2007). Activation of premotor vocal
areas during musical discrimination. Brain and Cognition, 63(1),
59–69.
Brown, S., Martinez, M. J., Hodges, D. A., Fox, P. T., & Parsons, L.
M. (2004). The song system of the human brain. Cognitive Brain
Research, 20(3), 363–375.
Cook, N. D. (2001). Explaining harmony: the roles of interval
dissonance and chordal tension. Annals of the New York Academy
of Sciences, 930, 382–385.
Cook, N. D. (2002). Tone of voice and mind: The connections between
music, language, cognition and consciousness. Amsterdam: John
Benjamins.
Cook, N. D. (2007). The sound symbolism of major and minor modes.
Music Perception, 24(3), 315–319.
Cook, N. D. (2009). Harmony perception: harmoniousness is more
than the sum of interval consonance. Music Perception, 27(1),
25–41.
Cook, N. D. (2011). Harmony, perspective and triadic cognition. New
York: Cambridge University Press.
Cook, N. D., & Fujisawa, T. X. (2006). The psychophysics of
harmony perception: harmony is a 3-tone phenomenon. Empirical
Musicology Review, 1(2), 106–126.
Cook, N. D., & Hayashi, T. (2008). The psychoacoustics of harmony
perception. American Scientist, 96(4), 311–319.
Cook, N. D., Callan, D. E., & Callan, A. A. (2002). Frontal areas
involved in the perception of harmony. Eighth International
Conference on Functional Mapping of the Human Brain, Sendai,
Japan (June 26).
Cook, N. D., Fujisawa, T. X., & Takami, K. (2006). Evaluation of the
affective valence of speech using pitch substructure. IEEE
Transactions on Audio, Speech and Language Processing, 14,
142–151.
Cook, N. D., Fujisawa, T. X., & Konaka, H. (2007). Why not study
polytonal psychophysics? Empirical Musicology Review, 2(1),
58–64.
Eberlein, R. (1994). Die Entstehung der tonalen Klangsyntax.
Frankfurt: Lang.
Fujisawa, T. X. (2004). Unpublished PhD thesis, Kansai University.
Helmholtz, H. L. F. (1877/1954). On the sensations of tone as a
physiological basis for the theory of music. New York: Dover.
Janata, P. (2009). The neural architecture of music-evoked autobiographical
memories. Cerebral Cortex, 19, 2579–2594.
Kameoka, A., & Kuriyagawa, M. (1969). Consonance theory. Parts I
and II. Journal of the Acoustical Society of America, 45, 1452–1459;
45, 1460–1469.
Kastner, M. P., & Crowder, R. G. (1990). Perception of the major/
minor distinction: IV. Emotional connotations in young children.
Music Perception, 8, 189–202.
Khalfa, S., Schon, D., Anton, J. L., & Liegeois-Chauvel, C. (2005).
Brain regions involved in the recognition of happiness and
sadness in music. NeuroReport, 16(18), 1981–1984.
Koelsch, S., Fritz, T., Schulze, K., Alsop, D., & Schlaug, G. (2005).
Adults and children processing music: an fMRI study.
Neuroimage, 25, 1068–1076.
Koelsch, S., Fritz, T., von Cramon, D. Y., Müller, K., & Friederici, A.
D. (2006). Investigating emotion with music: an fMRI study.
Human Brain Mapping, 27(3), 239–250.
Levitin, D. J. (2006). This is your brain on music: The science of a
human obsession. New York: Dutton/Penguin.
Levitin, D. J., & Menon, V. (2003). Musical structure is processed in
language areas of the brain: a possible role for Brodmann area
47 in temporal coherence. Neuroimage, 20, 2142–2152.
Levitin, D. J., & Menon, V. (2005). The neural locus of temporal
structure and expectancies in music: evidence from functional
neuroimaging at 3 Tesla. Music Perception, 22(3), 563–575.
Meyer, L. B. (1956). Emotion and meaning in music. Chicago:
Chicago University Press.
Morton, E. W. (1977). On the occurrence and significance of
motivation-structural rules in some bird and mammal sounds.
American Naturalist, 111, 855–869.
Ohala, J. J. (1983). Cross-language use of pitch: An ethological view.
Phonetica, 40, 1–18.
Pallesen, K. J., Brattico, E., Bailey, C., Korvenoja, A., Koivisto, J.,
Gjedde, A., et al. (2005). Emotion processing of major, minor and
dissonant chords: a functional magnetic resonance imaging study.
Annals of the New York Academy of Sciences, 1060, 450–453.
Parncutt, R. (1989). Harmony: A psychoacoustical approach. Berlin:
Springer.
Plomp, R., & Levelt, W. J. M. (1965). Tonal consonances and critical
bandwidth. The Journal of the Acoustical Society of America, 38,
548–560.
Roberts, L. (1986). Consonant judgments of musical chords by
musicians and untrained listeners. Acustica, 62, 163–171.
Satoh, M., Takeda, K., Nagata, K., Hatazawa, J., & Kuzuhara, S.
(2001). Activated brain regions in musicians during an
ensemble: a PET study. Cognitive Brain Research, 12(1),
101–108.
Satoh, M., Takeda, K., Nagata, K., Hatazawa, J., & Kuzuhara, S.
(2003). The anterior portion of the bilateral temporal lobes
participates in music perception: a positron emission tomography
study. American Journal of Neuroradiology, 24, 1843–1848.
Sethares, W. A. (1999). Tuning, timbre, spectrum, scale. Berlin: Springer.
Suzuki, M., Okamura, N., Kawachi, Y., Tashiro, M., Arao, H.,
Hoshishiba, T., et al. (2008). Discrete cortical regions associated
with the musical beauty of major and minor chords. Cognitive,
Affective & Behavioral Neuroscience, 8(2), 126–131.
Talairach, J., & Tournoux, P. (1988). Co-planar stereotaxic atlas of
the human brain. 3-dimensional proportional system: An
approach to cerebral imaging. Stuttgart: Thieme.
Tillmann, B., Janata, P., & Bharucha, J. J. (2003). Activation of the
inferior frontal cortex in musical priming. Cognitive Brain
Research, 16, 145–161.
Wright, P., He, G., Shapira, N. A., Goodman, W. K., & Liu, Y.
(2004). Disgust and the insula: fMRI responses to pictures of
mutilation and contamination. NeuroReport, 15(15),
2347–2351.
Zatorre, R. J. (2001). Neural specializations for tonal processing.
In R. J. Zatorre & I. Peretz (Eds.), The biological foundations
of music (pp. 193–210). New York: New York Academy of
Science.
Zatorre, R. J., Evans, A. C., & Meyer, E. (1994). Neural mechanisms
underlying melodic perception and memory for pitch. The
Journal of Neuroscience, 14, 1908–1919.