University of Iowa
Iowa Research Online
Theses and Dissertations
2010
Musical time and information theory entropy
Sarah Elizabeth Culpepper
University of Iowa
Copyright 2010 Sarah Elizabeth Culpepper
This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/659
Follow this and additional works at: http://ir.uiowa.edu/etd
Part of the Music Commons
Recommended Citation
Culpepper, Sarah Elizabeth. "Musical time and information theory entropy." MA (Master of Arts) thesis, University of Iowa, 2010. http://ir.uiowa.edu/etd/659.
MUSICAL TIME AND INFORMATION THEORY ENTROPY
by
Sarah Elizabeth Culpepper
A thesis submitted in partial fulfillment of the requirements for the Master of
Arts degree in Music in the Graduate College of
The University of Iowa
July 2010
Thesis Supervisor: Assistant Professor Robert C. Cook
Graduate College The University of Iowa
Iowa City, Iowa
CERTIFICATE OF APPROVAL
_______________________
MASTER'S THESIS
_______________
This is to certify that the Master's thesis of
Sarah Elizabeth Culpepper
has been approved by the Examining Committee for the thesis requirement for the Master of Arts degree in Music at the July 2010 graduation.
Thesis Committee: ___________________________________ Robert C. Cook, Thesis Supervisor
___________________________________
Nicole Biamonte
___________________________________
Jerry Cain
Is their wish so unique
To anthropomorphize the inanimate
With a love that masquerades as pure technique?

Donald Justice, "Nostalgia of the Lakefronts"
TABLE OF CONTENTS
LIST OF TABLES ............................................................................................................. iv
LIST OF FIGURES ........................................................................................................... vi
CHAPTER
I. INTRODUCTION ............................................................................................1
II. INFORMATION THEORY ENTROPY ..........................................................6
III. EXISTING MUSIC-THEORETIC SCHOLARSHIP ON INFORMATION THEORY ENTROPY ........................................................21
IV. ALPHABETS FOR ENTROPY-BASED ANALYSIS ..................................36
Interval Entropy ..............................................................................................36
CSEG Entropy ................................................................................................46
PC-Set Entropy ...............................................................................................61
V. INFORMATION AND TIME ........................................................................67
VI. ANALYSES ...................................................................................................80
Op. 16, no. 1: Christus factus est ................................................................80
Op. 5, no. 4 .....................................................................................................97
VII. CONCLUSION.............................................................................................115
BIBLIOGRAPHY ............................................................................................................117
LIST OF TABLES
Table
3.1. Pitch entropies from Youngblood .............................................................................21
4.1. Pitch entropies in Webern works, compared with Babbitt and Schubert .................36
4.2. Interval class entropies comparing serial and non-serial works ...............................38
4.3. Vertical and horizontal entropy on one serial and one non-serial work ...................39
4.4. Registrally-ordered interval class entropy in Webern and Babbitt ...........................40
4.5. Ordered directional interval class entropy in serial and non-serial works ................42
4.6. Interval entropy in Webern and Babbitt ...................................................................44
4.7. CSEG entropies for random and motivic strings ......................................................49
4.8. CSEG entropies, random string versus Webern, op. 5, no. 1 ...................................51
4.9. CSEGs, random string versus Webern, op. 5, no. 1 .................................................52
4.10. CSEG entropies, op. 5, no. 1, versus op. 18 .............................................................54
4.11. CSEG entropies, op. 5, no. 1, versus op. 15 .............................................................56
4.12. CSEG entropies for serial works ..............................................................................60
4.13. Pc-set entropies in op. 16 and op. 25 using discrete segmentation algorithm ..........62
4.14. Pc-set entropies in op. 16 and op. 25 using window algorithm ................................64
4.15. Vertical pc-set entropy in op. 16 and op. 25, no. 1 ...................................................65
6.1. Pitch class entropy in op. 16, no. 1 ...........................................................................87
6.2. Pitch class entropy in the vocal line of op. 16, no. 1. ...............................................88
6.3. CSEG entropies in op. 16, no. 1 ...............................................................................89
6.4. Interval class entropy in op. 16, no. 1 .......................................................................91
6.5. Discrete pc-set entropies in op. 16, no. 1 ..................................................................91
6.6. Pitch entropy in sections of op. 5, no. 4 ..................................................................104
6.7. Interval class entropies in op. 5, no. 4 ....................................................................105
6.8. Discrete pc-set entropies in op. 5, no. 4 ..................................................................106
6.9. Pitch entropy in op. 5, no. 4, A and B.....................................................................108
6.10. Interval entropies in op. 5, no. 4, A and B ..............................................................109
LIST OF FIGURES
Figure
2.1. A corrupted tonal work .........................................................................................16
2.2. A corrupted contextual work .................................................................................16
3.1. 95% confidence intervals for Youngblood's entropy calculations ...........................23
3.2. Passage with pitch class entropy 2.52 .......................................................................28
3.3. Passage with pitch class entropy 2.52 .......................................................................28
4.1. Interval class entropies comparing serial and non-serial works ...............................38
4.2. Registrally-ordered interval class entropy in Webern and Babbitt ...........................41
4.3. Ordered directional interval class entropy in serial and non-serial works ...............42
4.4. Interval entropy in Webern and Babbitt ...................................................................44
4.5. A randomly-generated string of pitches....................................................................48
4.6. A motivic string of pitches........................................................................................49
4.7. CSEG entropies for random and motivic strings ......................................................50
4.8. CSEG entropies, random string versus Webern, op. 5, no. 1 ...................................51
4.9. CSEGs, random string versus Webern, op. 5, no. 1 .................................................53
4.10. CSEG entropies, op. 5, no. 1, versus op. 18 .............................................................55
4.11. CSEG entropies, op. 5, no. 1, versus op. 15 .............................................................56
4.12. Melody generated using the CSEG distribution of Webern, op. 5, no. 1 .................57
4.13. Melody generated using the CSEG distribution of a string of random pitches. .......58
4.14. CSEG entropies for serial works ..............................................................................60
4.15. Op. 27, no. 1, mm. 20-21 ..........................................................................................61
4.16. Pc-set entropies in op. 16 and op. 25 using discrete segmentation algorithm ..........63
4.17. Pc-set entropies in op. 16 and op. 25 using window algorithm ................................64
4.18. Vertical pc-set entropy in op. 16 and op. 25, no. 1 ...................................................65
6.1. Vertical ic1s in op. 16, no. 1 ....................................................................................83
6.2. Pitch class entropy in op. 16, no. 1 ...........................................................................88
6.3. Pitch class entropy in the vocal line of op. 16, no. 1 ................................................89
6.4. CSEG entropies in op. 16, no. 1 ...............................................................................90
6.5. Interval class entropy in op. 16, no. 1 .......................................................................91
6.6. Discrete pc-set entropies in op. 16, no. 1 ..................................................................92
6.7. Lewin's depiction of the three flyaway motives ........................................................98
6.8. Pc-set analysis of op. 5, no. 4 .................................................................................100
6.9. Clampitt's analysis of op. 5, no. 4, mm. 1-6 ...........................................................101
6.10. Pitch entropy in op. 5, no. 4 ....................................................................................104
6.11. Interval class entropy in op. 5, no. 4. ......................................................................105
6.12. Registrally-ordered interval class entropy in op. 5, no. 4 .......................................106
6.13. Discrete pc-set entropies in op. 5, no. 4 ..................................................................107
6.14. Pitch entropy in op. 5, no. 4, A and B.....................................................................108
6.15. Interval class entropy in op. 5, no. 4, A and B........................................................109
6.16. Registrally-ordered interval class entropy in op. 5, no. 4, A and B ........................110
CHAPTER I
INTRODUCTION
In the conclusion of The Time of Music (1988), Jonathan Kramer gives two anecdotes of his personal experiences with what he calls musical timelessness. The first
recalls a performance of the middle movement of Satie's Pages mystiques, a collection of
phrases repeated 840 times in succession:
For a brief time I felt myself getting bored, becoming imprisoned by a hopelessly repetitious piece. Time was getting slower and slower, threatening to stop. But then I found myself moving into a different listening mode. I was entering the vertical time of the piece. My present
expanded, as I forgot about the music's past and future.... After what seemed forty minutes I left. My watch told me that I had listened for three hours. I felt exhilarated, refreshed, renewed.1
The second anecdote concerns the opposite condition, a happening dense enough to
induce sensory overload:
The production began at 7:00 p.m. The noise level was consistently high, and the visual panorama was dizzying. I found myself, although performing, focusing my attention on one layer, then another, and then various combinations of layers.... After what seemed to be a couple of hours, everyone spontaneously agreed that it was time to stop... I loaded my tape and slides into my car. Only then did I glance at my watch. It was not yet 8:00! What had seemed like a two-hour performance must have lasted under 25 minutes by the clock.2
Kramer attributes the disparity between these temporal experiences to the amount and
density of information each performance contained. Music that is predictable and easily
chunked, he argues, takes up less mental storage space and seems shorter than music
that is less predictable: "Thus a two-minute pop tune will probably seem shorter than a two-minute Webern movement."3

1 Jonathan Kramer, The Time of Music (New York: Schirmer, 1988): 379.

2 Ibid., 380.
The connection between musical predictability and perception of musical time is a
common one. Kramer characterizes musical temporalities as directed, multiply-directed,
and non-directed based on their movement towards a predictable goal.4 Re-ordered
temporal progressions, such as the misplaced closing gestures Levy finds in Haydn and
the evolving themes Hatten finds in Beethoven, draw power from their violation of
listener expectations.5 Although complicating factors abound (the audience's familiarity
with a musical idiom; the tendency to disengage from overly predictable works; how
comfortable the chairs are), the existence of a connection between time and predictability
is clear.
This thesis examines the relationship between time and predictability through the
lens of information theory entropy. Just as traditional entropy speaks to the degree of
randomness in a system, information theory entropy speaks to the randomness of a
message or, alternately, to that message's predictability. Although information theory
entropy was initially developed to determine the most efficient way to encode a message
for radio transmission, it has since been adopted as an analytical tool by a variety of
fields, including linguistics, literary criticism, and music theory.
In particular, information theory entropy seems relevant to Webern's music.
Adorno refers to Webern's work as possessing a skeletal simplicity, a comparative
economy of musical materials that seems well-suited for analysis in terms of information
3 Ibid., 337.
4 Ibid., 16ff.
5 Janet Levy, "Gesture, Form, and Syntax in Haydn's Music," in Haydn Studies: Proceedings of the International Haydn Conference (New York: Norton, 1981), 355-362; Robert Hatten, "The Troping of Temporality in Music," in Approaches to Meaning in Music, ed. Byron Almén and Edward Pearsall (Bloomington: Indiana University Press, 2006), 66ff.
theory in the sense that no pitch or gesture seems superfluous or reducible, as though its
omission would not have a marked effect on the passage, or as though it had only been
added to fill space before the beginning of the next phrase.6 (In Adornos words: Every single note in Webern fairly crackles with meaning.7) Literary applications of information theory entropy speak meaningfully to this economy as a feature of poetry, as
will be shown in a later chapter; I believe entropy can speak to these same qualities in
Webern's work.
Webern's music is also of interest to this project because of the relationship between information content and the listener's perception of time, as will be discussed in
chapter 5. Certainly perception of time is salient to analysis of Webern's work. As
Stockhausen writes, "If we realise, at the end of a piece of music... that we have 'lost all
sense of time', then we have in fact been experiencing time most strongly. This is how we
always react to Webern's music."8 In a different vein, Ligeti describes Webern's music as
"the spatialization of time."9 Perception and analysis of time in Webern is, at the very least,
complicated, but entropy provides a useful metaphor for its description and a useful tool
for its examination.
In the 2009 article "Number Fetishism," Vanessa Hawes criticizes music-theoretic
use of information theory as... well, as number fetishism: as a component of the claim
6 Theodor Adorno, "The Aging of the New Music," in Essays on Music, ed. Richard Leppert, trans. Susan Gillespie (Berkeley, Los Angeles: University of California Press, 2002), 187.

7 Theodor Adorno, Quasi una Fantasia: Essays on Modern Music, trans. Rodney Livingstone (New York: Verso, 1998), 180.

8 Karlheinz Stockhausen, "Structure and Experiential Time," Die Reihe 2 (Bryn Mawr, PA: Presser, 1959), 65.

9 György Ligeti, "Metamorphoses of Musical Form," Die Reihe 7 (Bryn Mawr, PA: Presser, 1965), 16.
that music theorists can consider themselves scientists who refute or uphold hypotheses
based on empirical evidence, a notion she depicts as quaint and outdated.10 Indeed, early
uses of information theory often relied upon questionable assumptions, as Hessert (1971) claims, and were often divorced from diachronic perception of music.11 Nevertheless,
insofar as information theory entropy measures predictability a very salient factor in
diachronic perception of music it can be a relevant lens for the examination of musical
time.
Using information theory to quantify subjective musical temporality would be questionable indeed, but using information theory to analyze and discuss temporality
seems much less problematic. Writing about traditional entropy, Eddington clarifies the
situation:
Suppose that we were asked to arrange the following in two categories: distance, mass, electric force, entropy, beauty, melody....
I think there are the strongest grounds for placing entropy alongside beauty and melody, and not with the first three. Entropy is only found when the parts are viewed in association, and it is by viewing or hearing the parts in association that beauty and melody are discerned. All three are features of arrangement. It is a pregnant thought that one of these three associates should be able to figure as a commonplace quantity of science. The reason why this stranger can pass itself off among the aborigines of the physical world is that it is able to speak their language, viz., the language of arithmetic.12
Entropy is discussed in terms of number but is not the fetishism of number; rather, it is a
10 Vanessa Hawes, "Number Fetishism: The History of the Use of Information Theory as a Tool for Musical Analysis," in Music's Intellectual History, ed. Zdravko Blazekovic and Barbara Dobbs Mackenzie (New York: RILM, 2009), esp. 836-838.

11 Norman Hessert, "The Use of Information Theory in Musical Analysis" (Ph.D. diss., Indiana University, 1971).

12 A. Eddington, The Nature of the Physical World (Ann Arbor: University of Michigan Press, 1935), 105.
powerful and elegant principle that can be expressed quantitatively. Similarly,
information theory entropy need not be a formula divorced from musical experience, but
can instead be an analytical tool and metaphor for the discussion of something deeply
experiential and even, as Meyer (1957) claims, a way to approach the question of musical meaning.13
This thesis begins with an explication of information theory entropy (chapter 2) and a history of its use in music theory (chapter 3). In chapter 4, a variety of alternative approaches to entropy are developed, including entropy calculations based on CSEGs and
pc-sets (as opposed to single pitch classes). Chapter 5 makes a more in-depth argument for the relationship between information theory entropy and time, recasting analyses of
temporality in Webern in terms of entropy. Finally, in chapter 6, information theory
entropy will be used to analyze time in the first of the Fünf Canons, op. 16, and the fourth of the Fünf Sätze, op. 5, two movements in which form is created by perceptible shifts among differing depictions of temporality, shifts prompted by varying degrees of
predictability in a variety of musical domains.
13 Leonard Meyer, "Meaning in Music and Information Theory," Journal of Aesthetics and Art Criticism 15, no. 4 (1957): 412-424.
CHAPTER II
INFORMATION THEORY ENTROPY
Information theory entropy is based on the idea that in most alphabets, some
letters communicate more information than other letters do, because they occur less
frequently. If a word has been corrupted during transmission and all that remains is q _ _
_ k, the recipient can easily guess what the original word was, since there are very few
words that contain both a q and a k. By contrast, if all that remains of the word is _ _ i c
_, the original word is much more difficult to guess. Since q and k are uncommon, they
communicate more information about the original message than more common letters
can.14
In general, the more unequal the frequencies of letters in an alphabet are, the
easier it is to determine what letters have been corrupted. If an alphabet only has two
letters, A and B, but the former occurs 90% of the time and the latter occurs 10% of the
time, the message recipient has an excellent chance of guessing any letters that have been
corrupted (since there is a 90% chance any given letter will be an A). By contrast, if A and B each appear 50% of the time, our ability to guess a missing letter is diminished.
From the perspective of a person sending a telegram, the former language is very
inefficient. Assume, for simplicity, that any message in this language must contain exactly
90% As and 10% Bs (although in a real language, these would be averages). If the
14 Some more in-depth sources on information theory entropy:
A. Khinchin, Mathematical Foundations of Information Theory (New York: Dover, 1957); Abraham Moles, Information Theory and Esthetic Perception, trans. Joel Cohen (Urbana and London: University of Illinois Press, 1966); Lawrence Rosenfield, Aristotle and Information Theory (Paris, The Hague: Mouton, 1971); Claude Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal 27 (1948): 379-423; Claude Shannon and Warren Weaver, The Mathematical Theory of Communication (Urbana: University of Illinois Press, 1949). Information in this chapter is drawn heavily from these sources, as well as from the music-theoretic sources cited in Chapter 3.
transmitter is limited to ten characters, there are exactly ten words s/he can send:
AAAAAAAAAB, AAAAAAAABA, AAAAAAABAA, and so on. The letter A is so
common that it is practically meaningless; only the position of the less common letter
differentiates between words, but it occurs very rarely. By contrast, in a language that is
50% A and 50% B, the transmitter would have 2^10 or 1024 word choices. By creating
an alphabet in which all letters occur with the same frequency, the efficiency of
transmission is maximized.
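The two word counts above can be verified directly (a short Python sketch, not part of the original text):

```python
import math

# Language 1: a ten-character message must contain exactly nine As and
# one B, so the only freedom is where the single B goes: C(10, 1) words.
print(math.comb(10, 1))  # 10

# Language 2: A and B are unconstrained, so each of the ten positions is
# a free binary choice: 2**10 words.
print(2 ** 10)  # 1024
```

In information terms, a ten-character message in the second language distinguishes log2(1024) = 10 bits' worth of alternatives, against only log2(10) ≈ 3.3 bits in the first.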
Of course, in addition to being more efficient, the latter alphabet is less resistant
to corruption. Ideally, one must find a balance between the most efficient language
possible and the most robust language possible, to be sure that the message arrives at its
recipient intact but without wasting time or resources during transmission. Finding this
balance (generally for the purpose of data compression or encoding) was one of the
first goals of the field of information theory, pioneered by Bell Labs engineer Claude
Shannon in the late 1940s.
The inequality of the amount of information contributed by each letter in an
alphabet is called the Shannon entropy of that alphabet. If Shannon entropy is low, the
language is inefficient but robust; a few letters occur very frequently and the rest are
uncommon. If Shannon entropy is high, the language is efficient; each letter occurs with
roughly the same frequency and therefore each letter conveys the most information
possible.
The Shannon entropy of a message or an alphabet is given by the following formula:

H = -Σ p(x) log2 p(x)

where the sum is taken over every letter x of the alphabet. Here, p(x) is the probability that a given event occurs; p(x=6) denotes the probability that
a randomly selected pitch will be an F#, for example.
The example of bits illustrates the purpose of the logarithm in this formula. Each
bit presents two choices; given six bits, the number of combinations that can be
communicated is two to the sixth power. The entropy formula can be seen as taking the
number of possible choices (here, expressed in terms of probability) and returning the number of bits that would be necessary to communicate that much information.15
(Log base two is necessary to express these results in terms of bits. Another log base would create meaningful data if used consistently, but these data would be in terms of
other units of measurement.) Effectively, the use of logarithms in this formula ensures that the highest entropy is created when each possible outcome has an equal probability
of occurring, and that the lowest entropy is created when one event has a very high
probability of occurring. Consider an alphabet that has one letter, A, that occurs 100% of
the time. The entropy for this language is

H = -(1.0 × log2 1.0) = 0

that is, since we are absolutely certain every letter will be an A, the language has an
uncertainty of zero and an entropy of zero. The closer any probability gets to 1, the
smaller the language's entropy becomes. For example, if this language had three letters
instead, in which A occurred 98% of the time, and B and C each occurred 1% of the time,
the entropy of the language would be

H = -(0.98 log2 0.98 + 0.01 log2 0.01 + 0.01 log2 0.01) ≈ 0.16

15 See Khinchin or Shannon and Weaver for more information.

The logarithmic expression makes the contribution of the first term very small, whereas
the small probabilities make the contributions of the second and third terms very small as
well. By contrast, if each option occurs with roughly equal frequency, the entropy of the
language is

H = -(3 × (1/3) × log2 (1/3)) = log2 3 ≈ 1.58

which is the highest possible entropy for an alphabet with three letters. Of course, the
more letters in an alphabet, the higher the maximal entropy becomes. If this same
equally-weighted alphabet had eight letters, its entropy would be

H = -(8 × (1/8) × log2 (1/8)) = log2 8 = 3
An alphabet with twenty-six letters has a maximal entropy of 4.7; an alphabet with a
hundred letters has a maximal entropy of 6.64.
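The worked examples above can be reproduced with a few lines of code (a Python sketch; the entropy function here is mine, written to match the formula as described, not code from the thesis):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: the sum of -p * log2(p), skipping zero terms."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

# One letter occurring 100% of the time: no uncertainty, entropy 0.
print(round(entropy([1.0]), 2))               # 0.0

# Three letters at 98% / 1% / 1%: entropy close to zero.
print(round(entropy([0.98, 0.01, 0.01]), 2))  # 0.16

# Equal frequencies maximize entropy at log2 of the alphabet size.
print(round(entropy([1/3] * 3), 2))           # 1.58
print(round(entropy([1/8] * 8), 2))           # 3.0
print(round(entropy([1/26] * 26), 2))         # 4.7
print(round(entropy([1/100] * 100), 2))       # 6.64
```

Swapping log2 for another base would scale every figure consistently, as the parenthetical above notes; base two is used so that the results are in bits.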
It is clear from these examples that entropy is most useful for comparisons. The
claim that an alphabet with a hundred letters has a maximal entropy of 6.64 is not terribly
meaningful on its own; it only takes on meaning when paired with the statement that an
alphabet with three letters has a maximal entropy of 1.58, or with other entropy
calculations from hundred-letter alphabets.
To allow more meaningful comparisons between entropies of alphabets with
different cardinalities, we introduce the concept of relative entropy, which expresses
entropy values (as computed above) as a percentage of the maximal possible entropy for an alphabet of that cardinality. For example, the relative entropies of the cardinality three
and cardinality eight alphabets discussed above are

1.58 / 1.58 = 1.00

and

3.00 / 3.00 = 1.00

respectively. Thus, we can think of these two alphabets as having equivalent entropies,
even if their absolute entropies are not equal.
Relative entropy also allows entropy calculations to reflect unused letters in a
passage. Intuitively, a passage of English text that uses only thirteen letters should not
have the same entropy as a passage of Hawaiian. One imagines the former would seem
more stilted, more restricted than the latter, since a listener would hear it in the context of
a twenty-six letter alphabet, rather than a thirteen-letter alphabet. Similarly, a piece that
only uses the pitches C, C#, Eb, G#, A, A#, and B with given frequencies is very different
from a passage of chant that uses each of its seven tones with the same frequencies as the
above piece. While it is likely that the former piece will be heard as using a restricted
subset of a twelve-pitch alphabet, the latter piece exhausts its alphabet and would not be
heard as restricted in its materials in the same way as the former. The traditional entropy
formula is unable to reflect this distinction, because any unused letters carry with them a
probability of 0, effectively canceling out any entropic contribution from those letters, but
these unused letters are relevant to the computation of relative entropy through their
inclusion in the maximal entropy for an alphabet of a given cardinality.
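As a sketch of this idea (Python, with function names of my own choosing), relative entropy divides the absolute entropy by the maximum for the assumed alphabet; the chant-versus-chromatic-subset example then looks like this:

```python
import math

def entropy(probs):
    """Shannon entropy in bits over a probability distribution."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

def relative_entropy(probs, alphabet_size):
    """Entropy as a fraction of the maximum for an alphabet of this size.

    Unused letters contribute nothing to the absolute entropy (their
    probability is 0), but they raise the maximum, log2(alphabet_size),
    and so lower the ratio.
    """
    return entropy(probs) / math.log2(alphabet_size)

# Seven pitches used with equal frequency, heard against two alphabets:
seven_equal = [1/7] * 7

# ...as a chant exhausting its own seven-tone alphabet:
print(round(relative_entropy(seven_equal, 7), 2))   # 1.0

# ...as a restricted subset of the twelve-tone chromatic alphabet:
print(round(relative_entropy(seven_equal, 12), 2))  # 0.78
```

The same distribution yields a markedly lower relative entropy when heard against the twelve-tone alphabet, capturing the intuition that the chromatic piece sounds restricted while the chant does not.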
Nevertheless, the use of relative entropy requires caution. A piece of music that
uses three pitches with equal frequency is much more predictable, mathematically and
aurally, than a piece of music that uses twelve pitches with equal frequencies, even
though their relative entropies are equal. In other words, although relative entropy
allows for comparison between alphabets of different cardinalities, such a comparison
must always be considered alongside the alphabets' respective absolute entropies. In this
paper, relative entropy will only be invoked in the presence of corresponding absolute
entropy figures or some sort of intuitive justification for hearing these alphabets as perceptually similar.
It is also clear from these examples that the entropy of a message (at least,
entropy computed on the literal letters of an alphabet) is independent of the meaning of
that message. Entropy only reflects characteristics of the language in which that message
is written or encoded. However, the meaning of a message may become relevant if the
'alphabet' in question is not a literal alphabet. For years, literary critics (especially
modernists and post-modernists, and in particular those interested in the work of Thomas
Pynchon) have completed information theory analyses of texts using words or images,
instead of literal letters, as the letters of an alphabet. In this case, the most commonly
occurring letters are connective words like articles and prepositions. Consider, for
example, a corrupted block of text from F. Scott Fitzgerald's The Great Gatsby, from
which every eighth word has been removed:
When we pulled out into the winter and the real snow, our snow, began stretch out beside us and twinkle against windows, and the dim lights of small Wisconsin echoed by, a sharp wild brace came into the air. We drew in deep of it as we walked back from through the cold vestibules,
unutterably aware of identity with this country for one hour before we melted indistinguishably into it again.
Although the result is disjointed in places, it is certainly still intelligible; in places the reader cannot tell the message has been corrupted at all. Every image found in this
excerpt is repeated; if the word 'winter' were corrupted, 'snow' and 'cold' would still
convey its meaning. Additionally, the passage contains many connective words and
when these words are missing (as in "our snow began stretch out beside us"), the blanks can be filled in easily. We conclude that this passage has low word-based entropy,
regardless of any entropy figures computed on the basis of individual letter frequency.
For comparison, an excerpt from Flann O'Brien's At Swim-Two-Birds (considered the first Irish post-modern novel) has been similarly corrupted below.
I will relate, said Finn. Till a man accomplished twelve books of poetry, the same is not for want of poetry but is forced away. man is taken till a black hole is in the world to the depth of his oxters and he put into it to gaze it with his lonely head and nothing to but his shield and a stick of. Then must nine warriors fly their at him, one with the other and together.
From this, we can gather that we are listening to a narrator named Finn; word repetition
clues us in that poetry and war are somehow involved, but there is little else we can say
about this passage. This same lack of repetition makes the original, non-corrupted
passage more difficult to understand than the non-corrupted Fitzgerald.
I will relate, said Finn. Till a man has accomplished twelve books of poetry, the same is not taken for want of poetry but is forced away. No man is taken till a black hole is hollowed in the world to the depth of his two oxters and he put into it to gaze from it with his lonely head and nothing to him but his shield and a stick of hazel. Then must nine warriors fly their spears at him, one with the other and together.
In examining the original passage, we are in fact examining different sources of
corruption. The Fitzgerald is robust against the corruption of readers lacking context, or
readers being sleepy; in the presence of these forms of corruption the passage is still
readable. The O'Brien is much less robust by comparison. We conclude the passage has
higher entropy.
Alternately, we can conclude that the O'Brien passage is more efficient than the
Fitzgerald, since each individual word communicates more information. If the reader can
easily guess the meaning of a missing word, as in the Fitzgerald, then that word has a
very low information content; with these words removed, the passage becomes less
elegant but not much less intelligible. This is the same quality that makes this passage
easy to summarize. Since removing words from the O'Brien limits the reader's ability to comprehend the passage, we can conclude that the missing words had a higher information content, that overall there are fewer redundant or repeated words, and that, therefore, the O'Brien is a more efficient communication.
Finally, we examine a passage from "Todtnauberg" by Paul Celan.16
Arnica, Eyebright, the drink from the with the star-die on top,
in the
into the book whose name did it in before mine? the line written into this about a hope, today, for a thinker's (un- coming) word in the
The result is nearly unintelligible; the reader cannot guess the original narrator, subject, or purpose of this passage. What remains is interpretable, certainly, but the reader cannot
16 Although this passage is shorter than the others, the same percentage of words has been removed in each case.
be certain of the original text based on this excerpt. Consequently, this passage has high
entropy.
As expected, lack of repeated words and lack of connective words contribute to
higher entropy. Shared meaning contributes to lower entropy as well, as seen in the
Fitzgerald example dealing with 'snow,' 'winter,' and 'cold.' From these examples, though,
we can also see that clear syntactical structures reduce entropy. If the reader can perceive
the sentence structure underlying We drew in deep ____ of it as we walked back from
____ through the cold vestibules, the reader can make more educated guesses as to what
the missing words could be. The second missing word appears to be some sort of place;
the first missing word is a noun that can be an object of the verb 'to draw in,' so perhaps the missing word is 'breaths' or 'gasps' or something along those lines. The general import
of the sentence is still clear. Similarly, in the O'Brien sentence Then must nine warriors
fly their ____ at him, as long as the reader can parse that warriors are throwing things at
an unhappy target, the meaning of the sentence is clear.
By contrast, poetry, especially the works of Paul Celan, is characterized by
economy of words and imagery, in that every word contributes a great deal of meaning to
a passage. This is why the Celan example is the least intelligible of the above: there are
few redundant words, and the associations between words are specifically designed to be
unexpected and novel. In other words, each word is intended to convey the greatest
possible amount of information.
One could conceptualize this new, more meaning-sensitive interpretation of
entropy as occurring on a higher level than entropy computed based on literal letters of an
alphabet. If this Fitzgerald sample were encoded in a different alphabet (if it were written in binary, or encrypted for secure transmission) without changing its vocabulary, its low-level, alphabet-based entropy would be quite different, but its higher-level, word-based entropy would be the same. To achieve a word-based equivalent to encryption, one
would need a paraphrase of this text by another author, or a similar text that
communicates the same images (snowfall; evening; solitude) or the same themes (introspection; nostalgia; the notion that a person's actions and mindsets are influenced by that person's home17) using thriftier vocabulary.
These corrupted blocks of text can be seen as analogous to hearing music in a
static-filled radio broadcast. Listening to a Haydn string quartet in such a broadcast, one
would still be able to identify the key, the time signature, and the instrumentation; one
could make an educated guess as to which movement the quartet was playing, and
probably one could even hum the missing notes. By contrast, listening to such a broadcast
of the Webern Concerto, op. 24, one might not even be able to determine the
orchestration of the piece, let alone guess the missing notes. One can imagine a similar
corruption of the original musical signal being created by a poor ensemble; in this
situation, the Haydn can be considered to have a higher entropy because ensemble
mistakes, whether wrong notes or dynamic mismatches or harmonic misalignments, are
generally much more recognizable than the corresponding mistakes would be in the
Webern. Because the listener is (usually) able to form more confident predictions for upcoming events in the Haydn, violations of these predictions (including mistakes) are more striking.18
Alternately, consider the (comically) corrupted piece of music shown in Figure 2.1.
17 From the next paragraph: "I see now that this has been a story of the West, after all, Tom and Gatsby, Daisy and Jordan and I, were all Westerners, and perhaps we possessed some deficiency in common which made us subtly unadaptable to Eastern life."
18 This is a generalization, of course. Many Webern compositions can be considered to have low entropy in terms of dynamics, in which case a mistake in terms of dynamics would be immediately recognizable as such.
Figure 2.1: A corrupted tonal work
Despite the corruption, the identity of this piece is readily apparent. Even a listener who
had never heard this piece before could make a reasonable guess at every missing note,
based on typical harmonic progressions, repetition, and motive. By comparison, a
similarly corrupted, non-tonal work, shown in Figure 2.2, is less easy to identify.
Figure 2.2: A corrupted contextual work
A listener already familiar with the piece might be able to identify this as the third
movement of Webern's Variations for Piano, op. 27, but a listener unfamiliar with the
piece would not even be able to guess which of the corrupted objects were pitches and which were rests. A listener who expects a serial work based on a derived row may be
able to fill in the blanks, surmising in retrospect that the first missing pitch must be a Bb (creating the ordered interval series to match the of the inverted row form beginning in m. 5), but probably not on first hearing without a score, and certainly
not as readily as in the Bach. In other words, the second work is more efficient, more
condensed. Because the missing pitches cannot be determined easily based on the
surrounding material, these pitches carry a high information content.
Other potential sources of corruption (beyond literal transmission factors like radio static, a corrupted score, or poor acoustics, and figurative transmission factors like poor performance) raise larger questions about the nature of entropy in music. One can
interpret an imprecise piano reduction of an orchestral work as a corruption of that
orchestral work, in roughly the same way one could consider a poorly executed English
translation of a German text a corruption of the original. However, if one considers
corruption as something that can happen within the music itself, as opposed to something
imposed upon the music by external factors (things like radio static or performers' mistakes), it becomes difficult to decide which musical features are the original signal and which are corruption: is a theme an original signal and its variations corruption? Is the original A section of a ternary form an original signal and its altered A corruption?
Since entropy has been characterized here in terms of a message's ability to resist corruption, what can entropy be
said to measure in these cases? It may be meaningful to say that a theme "resists variation" or that a melody "resists ornamentation," if the former is not very memorable or if the latter is already very elaborate, but these states may or may not coincide with
entropy figures generated for these passages (in that a very elaborate melody may still be very predictable and therefore have a low entropy, for example).
More to the point, this approach makes questionable implications about the nature
of musical meaning in such a work. Is it reasonable to consider a Stokowski transcription
as necessarily subsidiary to the work it transcribes, as opposed to an independent work in
its own right even if the aesthetic of the transcription is meaningfully different from the
aesthetic of the original? If so, is it still reasonable to consider a Webern transcription of
Bach, or for that matter a Wendy Carlos performance of Bach, in the same light? In cases
of music not governed by a score, which performance is the canonical performance and
which is the corrupted performance?
Meyer also raises the issue of "cultural noise": corruption that occurs in transmission as the result of "a time-lag between the habit responses which the audience actually possess and those which the more adventurous composer envisages for it."19 This
can be understood as avant-garde music whose language an audience has not yet
internalized, or as pre-modern music heard differently by modern or post-modern
audiences. In this case, the music is not corrupted by any external factors, but the
audience's perception is; the issue is not signal transmission, but signal reception.
It seems most reasonable, for the purposes of this project, to consider each score as an uncorrupted signal, accepting publisher and performer mistakes as corruption but
accepting changes that arise through arrangement as part of an original signal. (That is to say, this project accepts Shelley's philosophy of translation: that a translation is, or should be, a new artwork unto itself rather than a derivative work dependent upon an original.20)
19 Meyer, "Meaning in Music and Information Theory," 420.
20 Percy Shelley, A Defence of Poetry and Other Essays (1840; Project Gutenberg, 2005), http://www.gutenberg.org/etext/5428. See Part I.
The issue of cultural noise is important, because it is important in every work of analysis;
an information-theoretic analysis cannot assume an audience will hear a work the way an
ideal listener would, but nor can any other kind of analysis that wishes to reflect a
practical perceptual reality.
In any case, the factors that lead to high or low entropy in a musical example are
the same as in the excerpted Fitzgerald, O'Brien, and Celan texts. If we analyzed these
texts using literal letters as an alphabet, we would be able to identify the texts as English,
and we would probably be able to make general statements about the author's style: for example, one could determine the average entropy for a passage saturated with Latinate vocabulary and the average entropy for Anglo-Saxon vocabulary, based on which letters occur the most frequently and which letters do not occur at all (such as w and j in Latin), and from this make predictions about the loftiness or folksiness of the author's writing
style. Similarly, if we accept pitch as an alphabet, we can make predictions about how
diatonic or how chromatic a musical excerpt is, based on which pitches occur the most
frequently. However, loftiness of vocabulary does not result from avoiding the letters w
and j, any more than tonality results from using scale degrees 1 and 5 frequently. Low entropy (on a pitch-by-pitch basis) is generally symptomatic of tonality, but does not speak to the harmonic progressions that bring tonality into being.
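The letter-frequency idea can be made concrete in a few lines of code. The sketch below (a minimal illustration in Python; the sample strings are arbitrary, not the Fitzgerald or O'Brien excerpts) computes zero-order entropy over literal letters:

```python
from collections import Counter
from math import log2

def letter_entropy(text):
    """Zero-order entropy over letters, ignoring case and non-letters."""
    letters = [c for c in text.lower() if c.isalpha()]
    counts = Counter(letters)
    n = len(letters)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# A string drawn from few distinct letters scores lower than one
# spread across most of the alphabet.
low = letter_entropy("banana bandana")
high = letter_entropy("the quick brown fox jumps over the lazy dog")
```

As the paragraph above argues, a figure like this can suggest broad stylistic tendencies, but it says nothing about syntax, word choice, or meaning.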
It may be inappropriate to claim that entropy created by pitches is directly
analogous to low-level, letter-based entropy in text. In some contexts a pitch may be
operating as a part of a word (for example, a single pitch within an arpeggiation), while in other contexts that pitch may be a word unto itself. For this reason, pitch-based entropy
may be more relevant to musical analysis than letter-based entropy is to literary analysis.
Nevertheless, it seems reasonable to claim that the analysis of more complex musical
alphabets may strengthen the link between musical style or predictability and entropy
calculations, creating something more broadly comparable to word-based entropy in
text. In both music and in text, entropy (as perceived intuitively by the listener or reader) is lowered by the presence of connective material (arpeggiations, passing tones, parsimonious voice leading), repetition (motivic material, canons, imitation), and larger structures (a T-P-D-T phrase structure, a serial row). If alphabets are built that can address the existence or nonexistence of these elements and structures, a more intuitive
interpretation of entropy will result.
Generally speaking, entropy is less of a commentary on musical meaning than it is
a commentary on musical style, and the degree of redundancy or predictability with
which that meaning is communicated. With that said, though, it is impossible to divorce
the two concepts, just as the meaning of a text cannot be separated from the words with which it is conveyed, or, arguably, from the audience's interpretative creation of
meaning. As Meyer writes,
Both meaning and information are thus related through probability to uncertainty. For the weaker the probability of a particular consequent in any message, the greater the uncertainty (and information) involved in the antecedent-consequent relationship.21
Earlier, Meyer highlights this same relationship as the source of musical meaning:
"Musical meaning arises when an antecedent situation, requiring an estimate as to the probable modes of pattern continuation, produces uncertainty as to the temporal-tonal nature of the expected consequent."22 Although this relationship has not always been the
focus of music theory's use of information theory entropy, Meyer's comments imply that
information theory entropy has potential insight into musical meaning as well as musical
style.
21 Meyer, "Meaning in Music and Information Theory," 416.
22 Ibid.
CHAPTER III

EXISTING MUSIC-THEORETIC SCHOLARSHIP ON INFORMATION THEORY ENTROPY
Use of entropy in music theory is generally thought to begin with Youngblood's 1958 article "Style as Information," in which entropies are calculated for eight songs from Schubert's Die schöne Müllerin, six arias from Mendelssohn's St. Paul, and six songs from Schumann's Frauenliebe und -leben. Only melodies in major keys are considered. In each case, a modified system of scale degrees is used as an alphabet: 1 indicates tonic, 2 indicates a raised tonic or a lowered supertonic, and so forth up to 12. His
zero-order results for these composers can be summarized as follows:
Composer        Zero-order Entropy    Zero-order Relative Entropy
Mendelssohn     3.03                  84.60%
Schumann        3.05                  85.00%
Schubert        3.13                  87.00%
Table 3.1: Pitch entropies from Youngblood
Youngblood finds the Mendelssohn sample to have the lowest entropy (or, alternately, the greatest redundancy/inefficiency) of the three, although he finds all three composers to have very similar entropies overall.23
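Youngblood's zero-order figures are mechanical to reproduce. The sketch below (with a made-up scale-degree string standing in for an actual melody, not data from any of Youngblood's samples) computes entropy and relative entropy for his twelve-letter alphabet:

```python
from collections import Counter
from math import log2

def zero_order_entropy(symbols):
    """H = -sum(p * log2(p)) over the relative frequency of each letter."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def relative_entropy(symbols, alphabet_size):
    """Entropy as a proportion of the maximum possible, log2(alphabet_size)."""
    return zero_order_entropy(symbols) / log2(alphabet_size)

# A hypothetical melody encoded in Youngblood's twelve-letter
# scale-degree alphabet (1 = tonic, 2 = raised tonic, ... up to 12).
melody = [1, 3, 5, 5, 6, 5, 3, 1, 8, 5, 3, 1]
h = zero_order_entropy(melody)
hr = relative_entropy(melody, 12)
```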
Youngblood also compares the entropy values for these composers to the
23 Joseph Youngblood, Style as Information, Journal of Music Theory 2, no. 1 (1958): 24-35.
entropies of a collection of randomly chosen Mode I chants. When these chants are
considered as representatives of a seven-note alphabet, they are found to have a much
higher relative entropy than the lieder and arias (HR=96.7%). Youngblood attributes this to the chants' more regular use of non-final and non-tenor tones, as compared to the
lieder's marked preference for diatonic pitches over chromatic ones. Of course, when
considered as representative of a twelve-note alphabet, the chant selections have a lower
entropy than the works of all three later composers (H=2.72, HR=76%).24

Knopoff and Hutchinson question Youngblood's non-chant results for statistical reasons, claiming that Youngblood's sample size is too small for the differences he finds in Mendelssohn's and Schubert's entropies to be significant. In support of this argument,
they construct confidence intervals for Youngblood's data, shown in Figure 3.1. When
Youngblood says the entropy of his Mendelssohn sample is 3.03, he makes the implicit
claim that this sample is representative of all of Mendelssohn: that if an analyst were to
compute a total entropy for all extant Mendelssohn works, that result would be fairly
close to Youngblood's. Confidence intervals measure how certain we are that the sample's
entropy is comparable to that of Mendelssohn's complete body of work. Figure 3.1 shows Knopoff and Hutchinson's confidence intervals for Youngblood's entropy calculations.
In this example, Knopoff and Hutchinson state with 95% confidence that
Mendelssohn's entropy falls between 2.895 and 3.183, and that Schubert's entropy falls
between 3.016 and 3.244. Since the confidence intervals overlap, one cannot conclude
based on this data that Schubert's and Mendelssohn's total entropies differ; it is entirely
possible, based on this data, that Schubert's total entropy is in fact lower than
24 Although most listeners probably hear chant in terms of a seven-note alphabet, one can imagine factors that would lead the listener to hear chant in terms of a twelve-note alphabet, such as placement of the chant between or within tonal works (as in the fragment of chant that concludes Bruckner's Os Justi), or a listener's lack of familiarity with the repertoire.
Mendelssohn's, or that the two are equal.25
For simple random samples, confidence intervals are generally computed using
some variant of the following formula:

    x̄ − 1.96(s/√n) ≤ μ ≤ x̄ + 1.96(s/√n)
25 Leon Knopoff and William Hutchinson, "Entropy as a Measure of Style: The Influence of Sample Length," Journal of Music Theory 27, no. 1 (1983): 75-97.
Figure 3.1: 95% confidence intervals for Youngblood's entropy calculations
where x̄ is the mean of the sample (in this case, the sample's entropy), s is the sample's standard deviation, n is the sample size, and μ is the quantity we wish to
establish: the predicted entropy for the musical style or composer in question.26 As is
clear from this formula, there are two factors that influence the size of a confidence
interval: sample size (Knopoff and Hutchinson's focus) and sample variance (the focus of a 1990 Snyder article). The former is reasonably intuitive; a very small sample could be a fluke, but if a large sample of Mendelssohn's work supports the conclusion that his total
entropy is 3.03, then it seems much more probable that Mendelssohn's overall entropy
really is close to 3.03. Snyder adds that variance within the sample can also make us
more or less confident. If an analyst looks at four Mendelssohn samples of comparable
length and finds them to have entropy values of 2, 4.98, 3.6, and 1.4, that analyst would
have difficulty predicting Mendelssohn's total entropy, since the samples are so disparate.
By contrast, if the first four samples came back as 3.01, 3.08, 2.97, and 2.93, the
conclusions drawn about these data would seem much more reasonable, even if the
sample were smaller.27
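The interval arithmetic itself is straightforward. The sketch below implements the generic simple-random-sample formula (not the binomial-proportion method Knopoff and Hutchinson actually use), and the example values are hypothetical:

```python
from math import sqrt

def confidence_interval(sample_mean, sample_std, n, z=1.96):
    """z-based interval for the population mean: mean ± z * (s / sqrt(n)).

    z = 1.96 gives roughly a 95% interval; z = 1.645 roughly 90%.
    """
    margin = z * sample_std / sqrt(n)
    return (sample_mean - margin, sample_mean + margin)

def overlaps(a, b):
    """Two intervals overlap when neither lies wholly below the other."""
    return a[0] <= b[1] and b[0] <= a[1]

# Hypothetical values only: a sample entropy of 3.03 with an assumed
# standard deviation and sample size, not Youngblood's actual data.
low, high = confidence_interval(3.03, 0.45, 40)
```

Overlapping intervals, as in Knopoff and Hutchinson's critique, mean the data cannot distinguish the two composers' total entropies.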
In addition to statistical concerns, Snyder and Knopoff and Hutchinson highlight
26 The multiplier 1.96 specifies a 95% confidence interval that is, if we take 100 samples, the means of 95 of the samples will fall within this interval. Multiplying by 1.645 instead would return a 90% confidence interval. This formula is provided only to illustrate the concept; confidence intervals in this paper were calculated for binomial proportions, taking propagation of error into account. See Appendix A of Knopoff and Hutchinson, 1993, for more information.
27 John Snyder, Entropy as a Measure of Musical Style: The Influence of A Priori Assumptions, Music Theory Spectrum 12, no. 1 (1990): 121-160.
several methodological problems, the clearest of which is modulation. In a piece that
modulates from C minor to Eb major, one would expect the pitches C, G, Eb, and Bb to occur with the greatest frequency, which increases that piece's entropy quite sharply,
since such a piece contains four pitches that occur frequently, instead of the two such
pitches found in a nonmodulatory work. (This logic could be expanded to include scale degree 7 of each key as well as scale degree 5, but the result would be the same.) In a piece that modulates from C minor to G major, the shift to a new diatonic collection would result in a higher entropy, as well. Youngblood's analyses make no accommodation
for this; although he computes entropies based on a scale-degree system, these scale
degrees are never adjusted for modulations. He notes that this lack of regard for modulation may have disguised the differences between his Schumann and Schubert samples, observing that (at least in these samples) Schubert's chromatic pitches tended to arise from modulation, whereas chromatic pitches in his Schumann samples tended to be more ornamental: very different phenomena that lead to similar results.28
In their analyses, Knopoff and Hutchinson compensate for modulations by
normalizing all passages to C major or A minor, although this normalization is only initiated by changes in written key signature. Snyder finds this disregard of implied
modulations quite problematic, as well as the implied prioritization of la-minor. Since a
modulation between relative keys is never accompanied by a change in key signature, a
piece that modulates from, say, F major to D minor would register as having higher entropy insofar as the latter tonal area deviates from its la-tonic. Snyder also questions
the assumption that modulations should be normalized away, arguing that a piece that
begins and ends in distantly related keys ought to have a higher entropy than a piece that
28 Youngblood, 78.
begins and ends in the same key.29 Alternately, one could argue that a piece that
modulates from I to V ought to have a lower entropy than a piece that modulates from,
say, I to bII: that the predictability (and perhaps even the smoothness) of a modulation ought to be a factor in that piece's entropy calculations.
Unfortunately, there are few solutions to the problem of accurate representation of
modulations in entropy. Arguably, Youngblood's system does associate distant keys with
higher entropies, since a modulation from C major to G major would contribute much less to a piece's entropy than a modulation from C major to Ab major would, based on the number of pitches held in common between the two respective diatonic collections. One
could imagine combining this method with a weighting system, in which pitches
belonging to passages in non-tonic keys contribute less to the piece's total entropy than
pitches in the tonic key do. Ideally, these weights would be determined in part by the
amount of time spent in the new key (as the listener's ability to remember the home key diminishes over time), but any such system would almost certainly be criticized as arbitrary.
The debate over how best to represent modulation in entropy calculations for
tonal repertoire highlights the concern that underlies most if not all entropy-based
analyses: what alphabet best reflects listeners' perceptions of musical language?
Uninterpreted pitch or pitch class is rejected as an alphabet because it is a poor reflection of listeners' interpretative hearings, since it shows no connection between the roles of C
and G in C minor and Eb and Bb in Eb major. Similarly, when Snyder adopts a twenty-eight-letter alphabet in which enharmonic spellings are taken as separate, up to double flats (of scale degrees 7, 3, 6, and 2) and double sharps (of scale degrees 1, 4, and 5), his motivation is the notion that listeners hear F and E# as distinct pitches in certain contexts, rather than the creation of an exhaustive system.

29 Snyder, 126-128. While the average listener may not realize that a piece has ended in C# major instead of C major, this same listener would probably notice if the piece begins in C major and ends in F# minor, if only for reasons of mode and register: a distinction that cannot be made within this system.
The variety of options available to analysts (even in terms of pitch alone) speaks to the expressive potential of these alphabets, since they can be altered to best reflect
listeners' perceptions of any given repertoire. This same flexibility can limit the analyst's
ability to compare samples from sufficiently different styles, though. It would seem
unfair to compare Wagner's entropy within a twenty-eight-note system with late serial
Schoenberg's, for example, since Schoenberg's disuse of double sharps does not speak to
any increased predictability in his music as compared to Wagner's, nor would it be fair to
say the listener finds Schoenberg's style more constricted because these letters are
omitted. One of the unstated goals of such analysis, then, is the selection of an alphabet
that is sensitive to perceptual concerns for specific repertoires but also general enough in
its applicability that its use on music from other repertoires seems reasonable.
This challenge is even greater for contextual music. The most common and most
universally applicable alphabet, either pitch names or scale degrees accepting octave and
enharmonic equivalence, is all but useless for serial music or any sort of music that
exhausts the aggregate regularly. Any such piece will have maximal entropy for that
alphabet cardinality, regardless of whether the piece is based on a derived row or an all-
interval row and, indeed, regardless of whether or not the piece is atonal at all. This
entropic equality implies that Webern's Variations for Piano is exactly as predictable as
Boulez's Piano Sonata no. 2, which would be in turn just as predictable as the first few bars of Coltrane's Giant Steps: an unintuitive claim, to say the least.
One potential solution is the incorporation of higher-order entropies, often
accomplished through the guise of Markov chains. Such constructs would allow the
analyst to look for patterns in the ordering of pitches, rather than relying on their
frequency alone. With a simple pitch alphabet, Markov chains could not differentiate
between serial rows, but could at least distinguish between a serial piece and a non-serial
piece that happens to use each pitch equally. Higher-order constructs have even clearer
applicability in entropic analyses of tonal music, since they measure predictability of succession, something of particular importance if entropy is taken to be a measure of tonality, since entropy on its own is order-blind. Thus, from the perspective of zero-order
entropy, the progressions in Figures 3.2 and 3.3 are exactly the same, although certainly
one is more predictable than the other within a tonal paradigm, and certainly one is more
tonal than the other. By contrast, Markov chains could differentiate between these two
strings easily.
Figure 3.2: Passage with pitch-class entropy 2.52
Figure 3.3: Passage with pitch-class entropy 2.52
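The order-blindness at issue can be demonstrated directly. In the sketch below (using arbitrary pitch-class strings, not the actual contents of Figures 3.2 and 3.3), two sequences share identical zero-order statistics, but a first-order alphabet of consecutive pairs distinguishes them:

```python
from collections import Counter
from math import log2

def entropy(symbols):
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

# Two arbitrary pitch-class strings with identical zero-order
# statistics (eight each of 0, 4, and 7) but different orderings.
ordered = [0, 4, 7] * 8
scrambled = [0, 4, 7, 7, 0, 4, 4, 7, 0, 0, 7, 4,
             7, 4, 0, 0, 4, 7, 7, 4, 0, 4, 0, 7]

# Zero-order entropy is order-blind: both strings score the same.
assert entropy(ordered) == entropy(scrambled)

# A first-order alphabet of consecutive pairs captures ordering:
# the repeating pattern now scores lower than the scrambled string.
pairs_ordered = list(zip(ordered, ordered[1:]))
pairs_scrambled = list(zip(scrambled, scrambled[1:]))
assert entropy(pairs_ordered) < entropy(pairs_scrambled)
```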
In his 1958 analyses, Youngblood computes entropies on first-order combinations,
in addition to entropies based on zero-order data. That is, rather than accepting C, D, and
E as the most basic units of music, Youngblood accepts C followed by C, C followed by
C#, C followed by D, and so forth as individual letters, creating an alphabet with 144
letters. However, Hessert notes that the continued effectiveness of this strategy is limited;
an alphabet built from consecutive pitch pairs is almost reasonable at 144 letters, but
three consecutive letters lead to 1728 possibilities, which leads to unwieldy
calculations.30 One can only imagine the complexity of a higher-order alphabet that does
not accept octave equivalence.
Hessert advocates the use of an alphabet based on intervals as a potential solution
to this problem, since a computation based on intervals is effectively first-order without
requiring any first-order computations. He also notes that an alphabet based on intervals
avoids the issue of modulation quite nicely, while reflecting motivic content more
accurately than pitch-based analysis can and potentially allowing for more meaningful
comparisons across disparate repertoire.31 Rhodes advocates a similar solution: an
alphabet that combines each pitch with its preceding interval.32 Potentially, such an
alphabet would allow the analyst to distinguish between typical and non-typical
resolutions of dissonant tones; a piece in which any scale degree can be left by any
interval is probably less tonal than a piece in which certain scale degrees (4 and 7, e.g.) can usually only be left by certain intervals (down by step and up by step, respectively). Of course, in the eyes of this computation, a composer who always resolves 7 to b5
30 Hessert, 16ff.
31 Ibid., 43-44.
32 James Rhodes, "Musical Data as Information: A General-Systems Perspective on Musical Analysis," Computing in Musicology 10 (1995-1996): 165-180.
would be no more or less predictable than a composer who always resolves 7 to 1, or
even a composer whose 7s can resolve anywhere but whose b3s always resolve to b6.
Through its reliance on pitch, this sort of analysis nullifies many of the benefits Hessert
ascribes to intervallic analysis.
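Hessert's claim that an interval alphabet sidesteps modulation can be illustrated in a few lines. In the sketch below (the melody is an arbitrary assumption, not an example from Hessert or Rhodes), restating a melody a perfect fourth higher inflates pitch-based entropy far more than interval-based entropy:

```python
from collections import Counter
from math import log2

def entropy(symbols):
    """Zero-order entropy over any alphabet of hashable symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * log2(c / n) for c in counts.values())

def intervals(pitches):
    """Melodic intervals in semitones between consecutive pitches."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

# An arbitrary melody as MIDI-style pitch numbers, then the same
# melody restated a perfect fourth higher, as a crude "modulation."
melody = [60, 62, 64, 65, 67, 65, 64, 62, 60]
modulating = melody + [p + 5 for p in melody]

# Under a pitch alphabet, the restatement introduces new letters and
# inflates the entropy; under an interval alphabet, it contributes
# almost nothing new, so the increase is much smaller.
pitch_increase = entropy(modulating) - entropy(melody)
interval_increase = entropy(intervals(modulating)) - entropy(intervals(melody))
assert interval_increase < pitch_increase
```

Rhodes's variant would instead pair each pitch with its preceding interval, reintroducing the pitch-dependence the paragraph above criticizes.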
Lewin conceptualizes the importance of higher-order analytical capacity in terms
of charge, defined as the listener's degree of uncertainty as to what an upcoming
interval will be based on the intervals directly preceding it. His analysis goes up to sixth-
order strings (that is, fifth-order computations based on intervals), but it occurs within a highly idealized environment: a twelve-tone row independent of musical context, and
therefore independent of irregularities (e.g., partial presentations or reorderings of a row) or complications (e.g., the division of a row into verticalities, leading to the creation of melodic intervals not present in the original row) that would make such higher-order analysis impractical.33
Based on this ideal environment, Lewin determines that if a listener is able to
remember the previous five intervals of Schoenberg's String Quartet, no. 4, the listener can predict the sixth interval with complete certainty (assuming the row form in question has not been altered or truncated). This certainty is not an accurate reflection of listeners' perceptions of this row, though, even under ideal circumstances; if it were, Lewin argues,
the associated musical experience would be quite dull. Therefore, he concludes, the
listener probably only hears back two or three intervals (perhaps more or fewer, depending on motivic structure, complexity of the line's presentation, repetitiveness of the line, and other factors), but probably not six. Thus, even if such higher-order analyses
were practical, they may not be a reasonable reflection of the listener's experience.
33 David Lewin, "Some Applications of Communication Theory to the Study of Twelve-Tone Music," Journal of Music Theory 12, no. 1 (1968): 50-84.
Of course, one can imagine situations in which a less literal sixth-order analysis
would be appropriate. Although it seems unreasonable to expect a listener to remember
and consider six successive intervals, it seems quite reasonable for a listener to remember
a six-element contour, or six intervals expressed in the form of two or three verticalities.
To date, this form of entropy "chunking" has had almost no mention in the relevant
literature.
Hessert also raises the question of duration, finding it problematic that a half-note
chord root C is treated as equal to a sixteenth-note neighbor tone D in most entropic
analyses.34 Hiller and Bean reflect this same concern in their 1966 analyses of sonata
expositions, in which longer notes are weighted more heavily than shorter notes, but
Hessert criticizes this approach for its lack of attention to attack, arguing that sixteen
sixteenth-note Cs are quite different perceptually from a single whole-note C.35 One
imagines that any method of computation that addresses both concerns would be
prohibitively complex; it seems most likely that any interested analyst must choose
whichever approach seems least inappropriate for that analyst's particular repertoire.
In any case, the most salient of Hessert's concerns (that an ornamental tone is treated as equal to a chord tone) seems to be more an issue of interval than of duration, since ornamental tones are approached by step more often than not (and since an ornamental tone approached by a large leap is probably aurally surprising enough that it ought to contribute as much to the piece's entropy as the chord tone it ornaments). It seems perceptually reasonable to claim that a whole step is a whole step, regardless of whether that whole step connects a C and a passing D or a chord tone C and an adjacent
34 Hessert, 68.
35 Lejaren Hiller and Calvert Bean, "Information Theory Analyses of Four Sonata Expositions," Journal of Music Theory 10, no. 1 (1966): 96-137.
chord tone D. From a pitch-based perspective, the distance a line must travel to arrive at the next pitch may be more relevant to musical predictability than how long the line stays on that pitch; and even in the absence of pitch, it seems the primary determinant of predictability is not the duration of each individual pitch but rather the rhythmic pattern in which these pitches present themselves, or the presence or absence of attacks at certain metric positions.
Hessert cites one example of entropy calculations based on rhythmic patterns, an
unpublished 1959 Master's thesis by John Brawley (Indiana University). He finds this analysis problematic, since it relies upon an implicit invocation of an alphabet of
infinite cardinality, which makes the computation of relative entropy and redundancy
impossible. Additionally, Brawley sets forth no predetermined limits to what constitutes a
pattern. Does a dotted quarter followed by an eighth note constitute a rhythmic pattern?
If this configuration begins on a weak beat or is preceded by an eighth note, is it the same
pattern? Is a pattern perceptually the same at M.M.=160 as it is at M.M.=40?36
Snyder advocates the exploration of duration-sensitive entropy calculations, but
he notes that such calculations almost necessarily conflate clock time with perceptual
time.37 In other words, by basing calculations on the notated tempo we implicitly privilege the former, a choice that is difficult to defend given the degree to which entropy-based analysis is meant to measure listeners' perceptions of predictability. Of course, any analysis that claims to reflect perceptual time must almost certainly encompass multiple musical domains beyond rhythm, tempo, and duration. In privileging longer notes over shorter ones, we run the risk of (for example) privileging extended neighbor notes over the shorter chord tones they ornament.
36 Ibid., 45-50.
37 Snyder, 125-126.
Other than Rhodes, few analysts have attempted to deal with more than one
musical alphabet simultaneously. A notable exception is Hiller and Fuller's 1967
analysis of the op. 21 Symphony, in which pitch (not pitch class) is combined with the number of eighth notes between successive attacks. Entropies are also computed on
various types of intervals. These entropy calculations are then used to draw conclusions about each formal section of the first movement. When pitch is considered alone, results from zero-order entropy and from first- or higher-order chains are inconclusive; although the development is (as one would expect) the least predictable in terms of individual pitches, its higher-order results are more predictable than those of either the exposition or the recapitulation.38 These inconsistencies carry over into interval-based and attack-point-based entropies.39
As mentioned, entropy is the quantity of information (measured in the number of bits the message would require to store or transmit) that each letter of an alphabet conveys. Hiller and Fuller also express their entropy in terms of bits per second based on the notated tempo; that is, they examine entropy in terms of the rate at which information is presented. Their hope is to distinguish between the listener's experience of a great deal of information presented quickly and the same amount of information presented over a longer timespan. Interpreting entropy in terms of bits per second does not change the entropy results for op. 21, but the idea bears investigation: that the speed with which information is presented influences the audience's perception of its complexity.
Unfortunately, this measure cannot describe how evenly information is distributed across a passage; it cannot distinguish, for example, a burst of information followed by silence
38 Lejaren Hiller and Ramon Fuller, "Structure and Information in Webern's Symphonie, op. 21," Journal of Music Theory 11, no. 1 (Spring 1967): 78.
39 Ibid., 84ff.
from a passage with a continuous information rate. Of course, the accuracy of Webern's
notated tempos is problematic in any case, and the frequent ritardandos in his music make
it less plausible that a calculation of this type could be relevant to a performance. Despite
any practical limitations, though, the fact that entropy was considered in terms of the rate
at which information is received hints at an early connection between entropy and
diachronic analysis, and arguably an early connection between entropy and time as well.
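Hiller and Fuller's bits-per-second figure can be sketched in a few lines (a hypothetical illustration with invented values, not their actual procedure or data): an information rate is simply entropy per symbol multiplied by the attack rate implied by the notated tempo.

```python
# Sketch of an information rate: bits per symbol times symbols per second.
# Tempo and note values below are invented for illustration.
def bits_per_second(entropy_per_symbol, notes_per_beat, beats_per_minute):
    symbols_per_second = notes_per_beat * beats_per_minute / 60.0
    return entropy_per_symbol * symbols_per_second

# The same 3.5 bits per note reads very differently at different tempos:
print(bits_per_second(3.5, 2, 60))    # eighth notes at M.M.=60  -> 7.0 bits/s
print(bits_per_second(3.5, 2, 120))   # eighth notes at M.M.=120 -> 14.0 bits/s
```

The sketch also makes Snyder's objection concrete: the figure depends entirely on notated clock time, so a ritardando or an elastic performance tempo changes the rate while leaving the score, and the entropy per symbol, untouched.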
Hessert gives four criteria for effective entropy-based analyses:
1. An alphabet should be finite;
2. Elements in an alphabet should be discrete;
3. Sample sizes should be as large as possible;
4. Analysis should be based on as many musical domains as possible.
The first two are basic criteria without which entropy calculations are impossible; the latter two are desiderata rather than strict requirements. To these one can add that
entropy can most effectively analyze samples with low variances, since entropy is in
some sense a decontextualized measure of central tendency. Smaller sample sizes may
assist in analysis, if they serve to reduce variance; it is more effective to analyze a small
sample that possesses a given characteristic uniformly than to combine this sample with
another sample lacking this characteristic. Imagine a bimodal grade distribution in which
many students have a 90% average and many have a 60% average. Considering these
students in terms of two smaller sample sizes allows one to generalize about the data
easily, but combining the two samples yields both an unhelpful overall average and a
much higher degree of uncertainty. The same logic ought to apply to musical domains.
Considering data across multiple domains is useful, but considering multiple domains simultaneously (that is, combining entropies of different domains into a single entropy measure) may disguise tendencies in the data.
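The grade-distribution analogy can be made concrete (the scores below are invented): each cluster is easy to summarize on its own, but the pooled sample has an unhelpful mean and a far larger spread.

```python
# Two internally uniform samples versus their pooled combination.
import statistics

high = [90, 91, 89, 90, 92, 88]   # invented scores clustered near 90
low = [60, 59, 61, 60, 58, 62]    # invented scores clustered near 60

print(statistics.mean(high), round(statistics.stdev(high), 2))  # mean 90, small spread
print(statistics.mean(low), round(statistics.stdev(low), 2))    # mean 60, small spread

pooled = high + low
# the pooled mean sits near neither mode, and the spread grows roughly tenfold
print(statistics.mean(pooled), round(statistics.stdev(pooled), 2))
```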
The cautions one can draw from the history of entropy in music are, for the most
part, no different from the cautions that apply to all analysis. In particular, entropy-based
analyses are problematic when they do not reflect musical experience. If one accepts that all music analysis is necessarily metaphor, and that quantitative analyses are simply a different way of exploring metaphor, then the most important caution is that these metaphors must be apt, rather than relying upon their quantitative nature to make their arguments. If an analyst is careful to ensure that conclusions based on entropy reflect musical experience and perception, whether diachronic or synchronic, then entropy can prove a useful tool for analysis.
CHAPTER IV
ALPHABETS FOR ENTROPY-BASED ANALYSIS
Interval Entropy
As discussed previously, pitch class entropy is rarely useful for analysis of post-
tonal music. Table 4.1 below gives pitch class entropy figures for a collection of post-tonal vocal works; Youngblood's results for Schubert's pitch entropy provide a baseline from the tonal repertoire.
Work | Style | Pitch class entropy | Relative entropy
Webern, op. 15 (Fünf geistliche Lieder), without no. 5 40 | Freely atonal | 3.58 | 100%
Webern, op. 16 (Fünf Canons) and op. 15, no. 5 | Freely atonal canons | 3.57 | 99.7%
Webern, op. 25 (Drei Lieder) | Serial, based on a derived row | 3.58 | 100%
Babbitt, "Widow's Lament in Springtime" | Serial, based on an all-interval row | 3.58 | 100%
Youngblood's Schubert sample | Tonal | 3.13 | 87.4%

Table 4.1: Pitch entropies in Webern works, compared with Babbitt and Schubert
No measures of statistical significance are necessary to interpret these results.
40 Since op. 15, no. 5 is a canon, it is included with the op. 16 canons throughout this section.
Although pitch entropy is able to distinguish Schubert from Webern, it is unable to
distinguish between serial and freely atonal works, or derived rows and all-interval rows.
Even canons are seen as maximally unpredictable, although one imagines the second and
third voices are quite predictable indeed.
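A minimal sketch of the zero-order computation underlying Table 4.1 (the sample is invented, not tallied from any of these scores): a line that uses all twelve pitch classes equally often reaches the 3.58-bit ceiling, and therefore 100% relative entropy, regardless of how it orders them.

```python
# Zero-order entropy of a symbol sequence, plus relative entropy against
# the alphabet's maximum. The flat sample is invented for illustration.
import math
from collections import Counter

def entropy(symbols):
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

flat_sample = list(range(12)) * 10    # each pitch class appears equally often
h = entropy(flat_sample)
h_max = math.log2(12)                 # ~3.585 bits: the ceiling for a 12-letter alphabet

print(round(h, 2))           # 3.58
print(round(h / h_max, 3))   # relative entropy 1.0
```

Because the measure ignores order entirely, a strict canon, a serial row, and a random cycling of the aggregate all hit the same ceiling, which is precisely the failure the table exhibits.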
Intuitively, it seems entropy based on interval class should be able to distinguish
between these styles. Pitch class entropy can only recognize canons iterated at the same
pitch level, but a canon interpreted as a series of intervals should be recognizable at any
pitch level. Although the order-blindness of entropy somewhat limits its effectiveness for
canons, interval class entropy can at least distinguish a canon from a non-canonic piece in
the same style. The same logic applies to serial works; a serial work will generally have
lower entropy than a freely atonal work since any interval appearing in the row would be
repeated many times, while any interval not appearing in the row would be heard very
infrequently. (Similarly, a serial work based on an all-interval row should contain roughly equal numbers of all interval classes, whereas a work based on a derived row would have roughly proportionate numbers of a few interval classes and very few of any others.) Such a measure would be unable to distinguish between a serial work based on a derived row and a freely atonal work saturated with the pitch class set that forms the basis of the former's derived row, or between a serial work based on an all-interval row and a freely atonal work that simply exhausts the aggregate of interval classes regularly; but arguably, most listeners would not be able to make these distinctions, either.
These intuitions are somewhat flawed, in that they assume an idealized linear
presentation of a serial row. Vertical presentation of a portion of a row or the division of a
row amongst several voices will almost certainly create new intervals not represented in
the original row. Nevertheless, entropy is at heart a measure of predictability, and it seems
reasonable that it should reflect the listener's surprise at hearing an interval not linearly
present in the row, even if reflection of this surprise comes at the expense of the
construct's ability to identify a work as serial or non-serial.
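The alphabet behind this analysis can be sketched as follows (the melodic fragment is invented): successive pitches reduce to interval classes 0-6, with octave and inversional equivalence folding, say, a major seventh and a minor ninth both into interval class 1.

```python
# Reduce a pitch line (MIDI note numbers) to interval classes 0-6.
def interval_classes(pitches):
    ics = []
    for a, b in zip(pitches, pitches[1:]):
        i = abs(b - a) % 12         # fold out octaves and direction
        ics.append(min(i, 12 - i))  # fold out inversion: 11 -> 1, 7 -> 5, ...
    return ics

# An ascending semitone, an ascending major seventh, and two descending
# perfect fifths all reduce to just two interval classes:
print(interval_classes([60, 61, 72, 65, 58]))  # -> [1, 1, 5, 5]
```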
Horizontal interval class analysis of the same works from Table 4.1 provides the
results shown in Table 4.2 and Figure 4.1.
Work | Interval class entropy | Deviation at a 95% confidence level | Relative entropy
Webern, op. 15 | 2.57 | .04 | 91.5%
Webern, op. 16 | 2.48 | .04 | 88.3%
Webern, op. 25 | 2.35 | .05 | 83.6%
Babbitt, "Widow's Lament" | 2.72 | .06 | 96.8%

Table 4.2: Interval class entropies comparing serial and non-serial works
Figure 4.1: Interval class entropies comparing serial and non-serial works
These data indicate that interval class entropy is able to distinguish between
derived and all-interval rows, and between canons and non-canons from approximately
the same period. These are both important tests of the construct's effectiveness; its ability
to make these distinctions speaks toward its ability to reflect musical saturation and
predictability.
These distinctions are not retained when vertical intervals are included.
Work | Entropy (vertical and horizontal intervals) | Deviation | Relative entropy
op. 16 | 3.42 | .05 | 95.5%
op. 25 | 3.34 | .06 | 93.3%

Table 4.3: Vertical and horizontal interval entropy on one serial and one non-serial work
Although op. 25's entropy is still lower than op. 16's, the difference is no longer
significant. In other words, based on these calculations we cannot posit a distinction
between Webern's use of verticalities in op. 16 and op. 25; if both pieces were played as
block chords, it is unlikely the listener would be able to distinguish between them based
solely on intervallic content.
Returning to the question of horizontal intervals, then, we find that removing
inversional equivalence eliminates many of the distinctions between these works, as
shown in Table 4.4 and Figure 4.2. Without inversional equivalence, relative entropies are
higher across the board since what was originally an emphasis on interval class 1
becomes a dual emphasis on registrally-ordered interval classes 1 and 11. Variances
increase for the same reason, which makes statistically significant distinctions less likely.
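The contrast between interval class and registrally-ordered interval class entropy can be illustrated with a toy line (invented, not drawn from Webern): under inversional equivalence every semitone is interval class 1, but registral ordering splits ascending from descending semitones into rics 1 and 11, raising the entropy.

```python
# Interval class (inversionally equivalent) versus registrally-ordered
# interval class entropy on an invented line of rising and falling semitones.
import math
from collections import Counter

def entropy(symbols):
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(symbols).values())

line = [60, 61, 60, 61, 60, 72, 71, 72, 71]
steps = [b - a for a, b in zip(line, line[1:])]

ics = [min(abs(s) % 12, 12 - abs(s) % 12) for s in steps]  # alphabet 0-6
rics = [s % 12 for s in steps]                             # alphabet 0-11

print(round(entropy(ics), 2), round(entropy(rics), 2))
print(entropy(ics) < entropy(rics))  # True: splitting ic 1 raises entropy
```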
Nevertheless, registrally-ordered interval class entropy can still distinguish
meaningfully between canons and non-canons (Webern op. 15 vs. op. 16) and between derived rows and all-interval rows (Webern op. 25 vs. Babbitt). The most interesting difference between Table 4.2 and Table 4.4 is op. 25, which has a lower interval class
entropy than op. 16 but a higher registrally-ordered interval class (ric) entropy. This distinction speaks to a fundamentally different approach to inversion between these two
works. In op. 16, an ric1 is not the same as an ric11, since a melodic ric1 in the clarinet
line could not be answered with an ric11 in the vocal line without breaking the canon.
Assumptions of inversional equivalence seem much more reasonable in op. 25, since the
juxtaposition of prime rows with inversional rows leads the listener to hear intervals and their inversions as at least related, if not equivalent.
Work | Registrally-ordered interval class entropy | Deviation | Relative entropy
Webern, op. 15 | 3.40 | .06 | 95.0%
Webern, op. 16 | 3.24 | .06 | 90.5%
Webern, op. 25 | 3.34 | .06 | 93.3%
Babbitt, "Widow's Lament in Springtime" | 3.50 | .07 | 97.8%

Table 4.4: Registrally-ordered interval class entropy in Webern and Babbitt
The remaining oddity in these data is the similarity between Webern op. 15 and
Babbitt. To investigate this similarity, we expand intervallic entropy into interval entropy
(-72 < x < 72) and ordered directional interval class entropy (-12 < x < 12).41
Ordered directional interval class entropy bears few surprises. The data in Table 4.5 and
Figure 4.3 show the expected distinction between Webern and Babbitt, but from these
data no conclusions can be drawn about any of the Webern works examined, almost the opposite of the results generated by registrally-ordered interval class entropy.
41 72 (or six octaves) is a number chosen out of convenience: the distance between the highest and lowest pitch in any of these pieces, rounded up to the nearest octave.
Figure 4.2: Registrally-ordered interval class entropy in Webern and Babbitt
Work | Ordered directional interval class entropy | Deviation | Relative entropy
Webern, op. 15 | 3.71 | .08 | 82.1%
Webern, op. 16 | 3.64 | .09 | 80.1%
Webern, op. 25 | 3.52 | .11 | 77.9%
Babbitt, "Widow's Lament" | 4.03 | .12 | 89.2%

Table 4.5: Ordered directional interval class entropy in serial and non-serial works
Figure 4.3: Ordered directional interval class entropy in serial and non-serial works
The inferences to be made from this apparent inconsistency (either that ordered interval class is a less relevant structure in these Webern works, or that Webern's predictability in terms of ordered interval class remains consistent across a variety of post-tonal styles) are at first alarming. Either conclusion makes suspect Hessert's claim that interval-based entropy is capable of dealing meaningfully with works from disparate periods and styles, given that registrally-ordered interval class entropy lacks the generality to distinguish between Babbitt and freely atonal Webern, while ordered directional interval class entropy lacks the generality to distinguish between Webern works of different styles and time periods. Perhaps the more useful claim to draw from this perceived lack of generality is that any invocation of intervallic entropy must be nuanced: in computing intervallic entropy we make implicit assumptions about a given composer's approach to the interval, assumptions that should be examined and argued.
One must also keep in mind that although statistically significant differences
between works imply differences in style, the lack of statistically significant differences
does not imply stylistic similarities. The lack of distinction between Webern's op. 15 and
Babbitt's "Widow's Lament in Springtime" in terms of registrally-ordered interval class entropy does
not imply a fundamental similarity between these works' use of registrally-ordered
interval classes; rather, the differences between the works are simply not profound
enough for us to be certain that they imply a genuine stylistic difference. In short, an
unexpected significant difference between two works is noteworthy, but an unexpected
similarity need not be.
At the very least, these results demonstrate the utility of examining repertoire
from multiple perspectives on the interval. These results also hint at the possibility of
using various types of intervallic entropy as evidence in an argument against, for
example, accepting inversional equivalence as a given in analysis of a particular work.
Entropy computations for pure intervals, as opposed to interval classes, provide
the following results:
Work | Interval entropy | Deviation
Webern, op. 15 | 4.92 | .11
Webern, op. 16 | 4.75 | .11
Webern, op. 25 | 4.98 | .16
Babbitt, "Widow's Lament" | 4.91 | .16

Table 4.6: Interval entropy in Webern and Babbitt
Figure 4.4: Interval entropy in Webern and Babbitt
Relative entropy is omitted here because the maximal entropy of a 144-letter alphabet is extraordinarily large. As a result, these works would show extraordinarily small relative entropies, giving an impression of predictability not audible in the music.
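The arithmetic behind the omission is simple (using the op. 15 figure from Table 4.6 for illustration): the ceiling for a 144-letter alphabet is log2 144, so even observed values near 4.9 bits would translate into relative entropies under 70%.

```python
# Maximal entropy of a 144-letter interval alphabet, and the relative
# entropy that the op. 15 figure from Table 4.6 would yield against it.
import math

h_max = math.log2(144)
print(round(h_max, 2))           # 7.17 bits
print(round(4.92 / h_max, 3))    # ~0.686, far below the 90%+ figures above
```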
Although op. 16 seems to have a much smaller entropy than all other works
considered, this deviation is not statistically significant. Even if it were, the conclusions
drawn would be slightly problematic. One could not conclude even from significantly
smaller interval entropy that Webern op. 16 relies more upon smaller interval