University of Iowa
Iowa Research Online

    Theses and Dissertations

    2010

Musical time and information theory entropy
Sarah Elizabeth Culpepper
University of Iowa

    Copyright 2010 Sarah Elizabeth Culpepper

    This dissertation is available at Iowa Research Online: http://ir.uiowa.edu/etd/659

    Follow this and additional works at: http://ir.uiowa.edu/etd

    Part of the Music Commons

Recommended Citation
Culpepper, Sarah Elizabeth. "Musical time and information theory entropy." MA (Master of Arts) thesis, University of Iowa, 2010. http://ir.uiowa.edu/etd/659.

  • MUSICAL TIME AND INFORMATION THEORY ENTROPY

    by

    Sarah Elizabeth Culpepper

    A thesis submitted in partial fulfillment of the requirements for the Master of

    Arts degree in Music in the Graduate College of

    The University of Iowa

    July 2010

    Thesis Supervisor: Assistant Professor Robert C. Cook

Graduate College
The University of Iowa

    Iowa City, Iowa

    CERTIFICATE OF APPROVAL

    _______________________

    MASTER'S THESIS

    _______________

    This is to certify that the Master's thesis of

    Sarah Elizabeth Culpepper

    has been approved by the Examining Committee for the thesis requirement for the Master of Arts degree in Music at the July 2010 graduation.

    Thesis Committee: ___________________________________ Robert C. Cook, Thesis Supervisor

    ___________________________________

    Nicole Biamonte

    ___________________________________

    Jerry Cain


Is their wish so unique
To anthropomorphize the inanimate
With a love that masquerades as pure technique?

Donald Justice, "Nostalgia of the Lakefronts"


    TABLE OF CONTENTS

    LIST OF TABLES ............................................................................................................. iv

    LIST OF FIGURES ........................................................................................................... vi

    CHAPTER

    I. INTRODUCTION ............................................................................................1

    II. INFORMATION THEORY ENTROPY ..........................................................6

    III. EXISTING MUSIC-THEORETIC SCHOLARSHIP ON INFORMATION THEORY ENTROPY ........................................................21

    IV. ALPHABETS FOR ENTROPY-BASED ANALYSIS ..................................36

Interval Entropy ..............................................................................................36
CSEG Entropy ................................................................................................46
PC-Set Entropy ...............................................................................................61

    V. INFORMATION AND TIME ........................................................................67

    VI. ANALYSES ...................................................................................................80

Op. 16, no. 1: Christus factus est ................................................................80
Op. 5, no. 4 .....................................................................................................97

    VII. CONCLUSION.............................................................................................115

    BIBLIOGRAPHY ............................................................................................................117


    LIST OF TABLES

    Table

    3.1. Pitch entropies from Youngblood .............................................................................21

    4.1. Pitch entropies in Webern works, compared with Babbitt and Schubert .................36

    4.2. Interval class entropies comparing serial and non-serial works ...............................38

    4.3. Vertical and horizontal entropy on one serial and one non-serial work ...................39

    4.4. Registrally-ordered interval class entropy in Webern and Babbitt ...........................40

    4.5. Ordered directional interval class entropy in serial and non-serial works ................42

    4.6. Interval entropy in Webern and Babbitt ...................................................................44

    4.7. CSEG entropies for random and motivic strings ......................................................49

    4.8. CSEG entropies, random string versus Webern, op. 5, no. 1 ...................................51

    4.9. CSEGs, random string versus Webern, op. 5, no. 1 .................................................52

    4.10. CSEG entropies, op. 5, no. 1, versus op. 18 .............................................................54

    4.11. CSEG entropies, op. 5, no. 1, versus op. 15 .............................................................56

    4.12. CSEG entropies for serial works ..............................................................................60

    4.13. Pc-set entropies in op. 16 and op. 25 using discrete segmentation algorithm ..........62

    4.14. Pc-set entropies in op. 16 and op. 25 using window algorithm ................................64

    4.15. Vertical pc-set entropy in op. 16 and op. 25, no. 1 ...................................................65

    6.1. Pitch class entropy in op. 16, no. 1 ...........................................................................87

    6.2. Pitch class entropy in the vocal line of op. 16, no. 1. ...............................................88

    6.3. CSEG entropies in op. 16, no. 1 ...............................................................................89

    6.4. Interval class entropy in op. 16, no. 1 .......................................................................91

    6.5. Discrete pc-set entropies in op. 16, no. 1 ..................................................................91

    6.6. Pitch entropy in sections of op. 5, no. 4 ..................................................................104

    6.7. Interval class entropies in op. 5, no. 4 ....................................................................105

    6.8. Discrete pc-set entropies in op. 5, no. 4 ..................................................................106


    6.9. Pitch entropy in op. 5, no. 4, A and B.....................................................................108

    6.10. Interval entropies in op. 5, no. 4, A and B ..............................................................109


    LIST OF FIGURES

    Figure

    2.1. A corrupted tonal work .........................................................................................16

    2.2. A corrupted contextual work .................................................................................16

3.1. 95% confidence intervals for Youngblood's entropy calculations ...........................23

    3.2. Passage with pitch class entropy 2.52 .......................................................................28

    3.3. Passage with pitch class entropy 2.52 .......................................................................28

    4.1. Interval class entropies comparing serial and non-serial works ...............................38

    4.2. Registrally-ordered interval class entropy in Webern and Babbitt ...........................41

    4.3. Ordered directional interval class entropy in serial and non-serial works ...............42

    4.4. Interval entropy in Webern and Babbitt ...................................................................44

    4.5. A randomly-generated string of pitches....................................................................48

    4.6. A motivic string of pitches........................................................................................49

    4.7. CSEG entropies for random and motivic strings ......................................................50

    4.8. CSEG entropies, random string versus Webern, op. 5, no. 1 ...................................51

    4.9. CSEGs, random string versus Webern, op. 5, no. 1 .................................................53

    4.10. CSEG entropies, op. 5, no. 1, versus op. 18 .............................................................55

    4.11. CSEG entropies, op. 5, no. 1, versus op. 15 .............................................................56

    4.12. Melody generated using the CSEG distribution of Webern, op. 5, no. 1 .................57

    4.13. Melody generated using the CSEG distribution of a string of random pitches. .......58

    4.14. CSEG entropies for serial works ..............................................................................60

    4.15. Op. 27, no. 1, mm. 20-21 ..........................................................................................61

    4.16. Pc-set entropies in op. 16 and op. 25 using discrete segmentation algorithm ..........63

    4.17. Pc-set entropies in op. 16 and op. 25 using window algorithm ................................64

    4.18. Vertical pc-set entropy in op. 16 and op. 25, no. 1 ...................................................65

    6.1. Vertical ic1s in op. 16, no. 1 ....................................................................................83


    6.2. Pitch class entropy in op. 16, no. 1 ...........................................................................88

    6.3. Pitch class entropy in the vocal line of op. 16, no. 1 ................................................89

    6.4. CSEG entropies in op. 16, no. 1 ...............................................................................90

    6.5. Interval class entropy in op. 16, no. 1 .......................................................................91

    6.6. Discrete pc-set entropies in op. 16, no. 1 ..................................................................92

6.7. Lewin's depiction of the three flyaway motives ......................................................98

    6.8. Pc-set analysis of op. 5, no. 4 .................................................................................100

6.9. Clampitt's analysis of op. 5, no. 4, mm. 1-6 ...........................................................101

    6.10. Pitch entropy in op. 5, no. 4 ....................................................................................104

    6.11. Interval class entropy in op. 5, no. 4. ......................................................................105

    6.12. Registrally-ordered interval class entropy in op. 5, no. 4 .......................................106

    6.13. Discrete pc-set entropies in op. 5, no. 4 ..................................................................107

    6.14. Pitch entropy in op. 5, no. 4, A and B.....................................................................108

    6.15. Interval class entropy in op. 5, no. 4, A and B........................................................109

    6.16. Registrally-ordered interval class entropy in op. 5, no. 4, A and B ........................110


    CHAPTER I INTRODUCTION

    In the conclusion of The Time of Music (1988), Jonathan Kramer gives two anecdotes of his personal experiences with what he calls musical timelessness. The first

    recalls a performance of the middle movement of Satie's Pages mystiques, a collection of

    phrases repeated 840 times in succession:

    For a brief time I felt myself getting bored, becoming imprisoned by a hopelessly repetitious piece. Time was getting slower and slower, threatening to stop. But then I found myself moving into a different listening mode. I was entering the vertical time of the piece. My present

    expanded, as I forgot about the music's past and future.... After what seemed forty minutes I left. My watch told me that I had listened for three hours. I felt exhilarated, refreshed, renewed.1

    The second anecdote concerns the opposite condition, a happening dense enough to

    induce sensory overload:

    The production began at 7:00 p.m. The noise level was consistently high, and the visual panorama was dizzying. I found myself, although performing, focusing my attention on one layer, then another, and then various combinations of layers.... After what seemed to be a couple of hours, everyone spontaneously agreed that it was time to stop... I loaded my tape and slides into my car. Only then did I glance at my watch. It was not yet 8:00! What had seemed like a two-hour performance must have lasted under 25 minutes by the clock.2

    Kramer attributes the disparity between these temporal experiences to the amount and

    density of information each performance contained. Music that is predictable and easily

chunked, he argues, takes up less mental storage space and seems shorter than music that is less predictable: "Thus a two-minute pop tune will probably seem shorter than a two-minute Webern movement."3

1 Jonathan Kramer, The Time of Music (New York: Schirmer, 1988): 379.

2 Ibid., 380.

    The connection between musical predictability and perception of musical time is a

    common one. Kramer characterizes musical temporalities as directed, multiply-directed,

    and non-directed based on their movement towards a predictable goal.4 Re-ordered

    temporal progressions, such as the misplaced closing gestures Levy finds in Haydn and

    the evolving themes Hatten finds in Beethoven, draw power from their violation of

listener expectations.5 Although complicating factors abound (the audience's familiarity with a musical idiom; tendency to disengage from overly predictable works; how comfortable the chairs are), the existence of a connection between time and predictability is clear.

    This thesis examines the relationship between time and predictability through the

    lens of information theory entropy. Just as traditional entropy speaks to the degree of

    randomness in a system, information theory entropy speaks to the randomness of a

    message or, alternately, to that message's predictability. Although information theory

    entropy was initially developed to determine the most efficient way to encode a message

    for radio transmission, it has since been adopted as an analytical tool by a variety of

    fields, including linguistics, literary criticism, and music theory.

    In particular, information theory entropy seems relevant to Webern's music.

    Adorno refers to Webern's work as possessing a skeletal simplicity, a comparative

    economy of musical materials that seems well-suited for analysis in terms of information

    3 Ibid., 337.

    4 Ibid., 16ff.

    5 Janet Levy, Gesture, Form, and Syntax in Haydn's Music, in Haydn Studies: Proceedings of the International Haydn Conference (New York: Norton, 1981), 355-362; Robert Hatten, The Troping of Temporality in Music, in Approaches to Meaning in Music, ed. Byron Almen and Edward Pearsall (Bloomington: Indiana University Press, 2006), 66ff.


    theory in the sense that no pitch or gesture seems superfluous or reducible, as though its

    omission would not have a marked effect on the passage, or as though it had only been

added to fill space before the beginning of the next phrase.6 (In Adorno's words: "Every single note in Webern fairly crackles with meaning."7) Literary applications of information theory entropy speak meaningfully to this economy as a feature of poetry, as

    will be shown in a later chapter; I believe entropy can speak to these same qualities in

    Webern's work.

    Webern's music is also of interest to this project because of the relationship between information content and the listener's perception of time, as will be discussed in

chapter 5. Certainly perception of time is salient to analysis of Webern's work. As Stockhausen writes, "If we realise, at the end of a piece of music... that we have 'lost all sense of time', then we have in fact been experiencing time most strongly. This is how we always react to Webern's music."8 In a different vein, Ligeti describes Webern's music as "the spatialization of time."9 Perception and analysis of time in Webern is, at the very least,

    complicated, but entropy provides a useful metaphor for its description and a useful tool

    for its examination.

In the 2009 article "Number Fetishism," Vanessa Hawes criticizes music-theoretic use of information theory as... well, as number fetishism: as a component of the claim

    6 Theodor Adorno, The Aging of the New Music, in Essays on Music, ed. Richard Leppert, trans. Susan Gillespie (Berkeley, Los Angeles: University of California Press, 2002), 187.

7 Theodor Adorno, Quasi una Fantasia: Essays on Modern Music, trans. Rodney Livingstone (New York: Verso, 1998), 180.

8 Karlheinz Stockhausen, "Structure and Experiential Time," Die Reihe 2 (Bryn Mawr, PA: Presser, 1959), 65.

9 György Ligeti, "Metamorphoses of Musical Form," Die Reihe 7 (Bryn Mawr, PA: Presser, 1965), 16.


    that music theorists can consider themselves scientists who refute or uphold hypotheses

    based on empirical evidence, a notion she depicts as quaint and outdated.10 Indeed, early

    uses of information theory often relied upon questionable assumptions, as Hessert (1971) claims, and were often divorced from diachronic perception of music.11 Nevertheless,

insofar as information theory entropy measures predictability (a very salient factor in diachronic perception of music), it can be a relevant lens for the examination of musical time.

    Using information theory to quantify subjective musical temporality would be questionable indeed, but using information theory to analyze and discuss temporality

    seems much less problematic. Writing about traditional entropy, Eddington clarifies the

    situation:

Suppose that we were asked to arrange the following in two categories: distance, mass, electric force, entropy, beauty, melody....

    I think there are the strongest grounds for placing entropy alongside beauty and melody, and not with the first three. Entropy is only found when the parts are viewed in association, and it is by viewing or hearing the parts in association that beauty and melody are discerned. All three are features of arrangement. It is a pregnant thought that one of these three associates should be able to figure as a commonplace quantity of science. The reason why this stranger can pass itself off among the aborigines of the physical world is that it is able to speak their language, viz., the language of arithmetic.12

    Entropy is discussed in terms of number but is not the fetishism of number; rather, it is a

    10 Vanessa Hawes, Number Fetishism: The History of the Use of Information Theory as a Tool for Musical Analysis, in Music's Intellectual History 2009, ed. Zdravko Blazekovic and Barbara Dobbs Mackenzie (New York: RILM, 2009), esp. 836-838.

    11 Norman Hessert, The Use of Information Theory in Musical Analysis (Ph.D diss., Indiana University, 1971).

    12 A. Eddington, The Nature of the Physical World (Ann Arbor: University of Michigan Press, 1935), 105.


    powerful and elegant principle that can be expressed quantitatively. Similarly,

    information theory entropy need not be a formula divorced from musical experience, but

can instead be an analytical tool and metaphor for the discussion of something deeply experiential, and even (as Meyer (1957) claims) a way to approach the question of musical meaning.13

    This thesis begins with an explication of information theory entropy (chapter 2) and a history of its use in music theory (chapter 3). In chapter 4, a variety of alternative approaches to entropy are developed, including entropy calculations based on CSEGs and

    pc-sets (as opposed to single pitch classes). Chapter 5 makes a more in-depth argument for the relationship between information theory entropy and time, recasting analyses of

    temporality in Webern in terms of entropy. Finally, in chapter 6, information theory

entropy will be used to analyze time in the first of the Fünf Canons, op. 16, and the fourth of the Fünf Sätze, op. 5: two movements in which form is created by perceptible shifts among differing depictions of temporality, shifts prompted by varying degrees of

    predictability in a variety of musical domains.

    13 Leonard Meyer, Meaning in Music and Information Theory, Journal of Aesthetics and Art Criticism 15, no. 4 (1957): 412-424.


    CHAPTER II INFORMATION THEORY ENTROPY

    Information theory entropy is based on the idea that in most alphabets, some

    letters communicate more information than other letters do, because they occur less

    frequently. If a word has been corrupted during transmission and all that remains is q _ _

    _ k, the recipient can easily guess what the original word was, since there are very few

    words that contain both a q and a k. By contrast, if all that remains of the word is _ _ i c

    _, the original word is much more difficult to guess. Since q and k are uncommon, they

    communicate more information about the original message than more common letters

    can.14

    In general, the more unequal the frequencies of letters in an alphabet are, the

    easier it is to determine what letters have been corrupted. If an alphabet only has two

    letters, A and B, but the former occurs 90% of the time and the latter occurs 10% of the

    time, the message recipient has an excellent chance of guessing any letters that have been

    corrupted (since there is a 90% chance any given letter will be an A). By contrast, if A and B appear 50% of the time, our ability to guess a missing letter is diminished.

    From the perspective of a person sending a telegram, the former language is very

    inefficient. Assume, for simplicity, any message in this language must contain exactly

    90% As and 10% Bs (although in a real language, these would be averages). If the

    14 Some more in-depth sources on information theory entropy:

A. Khinchin, Mathematical Foundations of Information Theory (New York: Dover, 1957); Abraham Moles, Information Theory and Esthetic Perception, trans. Joel Cohen (Urbana and London: University of Illinois Press, 1966); Lawrence Rosenfield, Aristotle and Information Theory (Paris, The Hague: Mouton, 1971); Claude Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal 27 (1948), 379-423; Claude Shannon and Warren Weaver, The Mathematical Theory of Communication (Urbana: University of Illinois Press, 1949). Information in this chapter is drawn heavily from these sources, as well as from the music-theoretic sources cited in Chapter 3.


    transmitter is limited to ten characters, there are exactly ten words s/he can send:

AAAAAAAAAB, AAAAAAAABA, AAAAAAABAA, and so on. The letter A is so

    common that it is practically meaningless; only the position of the less common letter

    differentiates between words, but it occurs very rarely. By contrast, in a language that is

    50% A and 50% B, the transmitter would have 2^10 or 1024 word choices. By creating

    an alphabet in which all letters occur with the same frequency, the efficiency of

    transmission is maximized.
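As a quick check of this counting argument, here is a minimal Python sketch (the names are illustrative, not drawn from the thesis) that enumerates the ten-character messages available in each of the two languages described above.

    from itertools import product

    # Constrained language: exactly nine As and one B per ten-character message,
    # so a message is determined entirely by where the single B falls.
    skewed_words = ["".join("B" if i == pos else "A" for i in range(10))
                    for pos in range(10)]
    print(len(skewed_words))  # 10

    # Balanced language: A and B each occur 50% of the time, so any
    # ten-character string of As and Bs is a possible message.
    balanced_words = list(product("AB", repeat=10))
    print(len(balanced_words))  # 1024, i.e. 2**10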

    Of course, in addition to being more efficient, the latter alphabet is less resistant

    to corruption. Ideally, one must find a balance between the most efficient language

    possible and the most robust language possible, to be sure that the message arrives to its

    recipient intact but without wasting time or resources during transmission. Finding this

balance, generally for the purpose of data compression or encoding, was one of the

    first goals of the field of information theory, pioneered by Bell Labs engineer Claude

    Shannon in the late 1940s.

    The inequality of the amount of information contributed by each letter in an

    alphabet is called the Shannon entropy of that alphabet. If Shannon entropy is low, the

    language is inefficient but robust; a few letters occur very frequently and the rest are

    uncommon. If Shannon entropy is high, the language is efficient; each letter occurs with

    roughly the same frequency and therefore each letter conveys the most information

    possible.

The Shannon entropy of a message or an alphabet is given by the following formula:

H = − Σ p(x) log2 p(x)

where the sum runs over every letter x of the alphabet. Here, p(x) is the probability that a given event occurs; p(x=6) denotes the probability that a randomly selected pitch will be an F#, for example.

    The example of bits illustrates the purpose of the logarithm in this formula. Each

    bit presents two choices; given six bits, the number of combinations that can be

communicated is two to the sixth power. The entropy formula can be seen as taking the

    number of possible choices (here, expressed in terms of probability) and returning the number of bits that would be necessary to communicate that much information.15

    (Log base two is necessary to express these results in terms of bits. Another log base would create meaningful data if used consistently, but these data would be in terms of

    other units of measurement.) Effectively, the use of logarithms in this formula ensures that the highest entropy is created when each possible outcome has an equal probability

    of occurring, and that the lowest entropy is created when one event has a very high

    probability of occurring. Consider an alphabet that has one letter, A, that occurs 100% of

the time. The entropy for this language is

H = − (1.0 × log2 1.0) = 0;

that is, since we are absolutely certain every letter will be an A, the language has an

    uncertainty of zero and an entropy of zero. The closer any probability gets to 1, the

    smaller the language's entropy becomes. For example, if this language had three letters

    instead, in which A occurred 98% of the time, and B and C each occurred 1% of the time,

the entropy of the language would be

H = − (0.98 log2 0.98 + 0.01 log2 0.01 + 0.01 log2 0.01) ≈ 0.16.

15 See Khinchin or Shannon and Weaver for more information.

    The logarithmic expression makes the contribution of the first term very small, whereas

    the small probabilities make the contributions of the second and third terms very small as

    well. By contrast, if each option occurs with roughly equal frequency, the entropy of the

language is

H = − 3 × (1/3) log2 (1/3) = log2 3 ≈ 1.58,

which is the highest possible entropy for an alphabet with three letters. Of course, the

    more letters in an alphabet, the higher the maximal entropy becomes. If this same

equally-weighted alphabet had eight letters, its entropy would be

H = − 8 × (1/8) log2 (1/8) = log2 8 = 3.

An alphabet with twenty-six letters has a maximal entropy of 4.7; an alphabet with a

    hundred letters has a maximal entropy of 6.64.
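These figures are easy to verify computationally. The short Python function below (an illustrative sketch, not part of the thesis) implements the entropy formula and reproduces the worked examples above: the 98%/1%/1% alphabet and the maximal entropies for equally-weighted alphabets of three, eight, twenty-six, and one hundred letters.

    import math

    def shannon_entropy(probabilities):
        """H = -sum of p(x) * log2 p(x); letters with probability 0 contribute nothing."""
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    print(round(shannon_entropy([0.98, 0.01, 0.01]), 2))  # 0.16
    print(round(shannon_entropy([1/3] * 3), 2))           # 1.58  (= log2 3)
    print(round(shannon_entropy([1/8] * 8), 2))           # 3.0   (= log2 8)
    print(round(shannon_entropy([1/26] * 26), 2))         # 4.7
    print(round(shannon_entropy([1/100] * 100), 2))       # 6.64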

    It is clear from these examples that entropy is most useful for comparisons. The

    claim that an alphabet with a hundred letters has a maximal entropy of 6.64 is not terribly

    meaningful on its own; it only takes on meaning when paired with the statement that an

    alphabet with three letters has a maximal entropy of 1.58, or with other entropy

    calculations from hundred-letter alphabets.

    To allow more meaningful comparisons between entropies of alphabets with

    different cardinalities, we introduce the concept of relative entropy, which expresses

    entropy values (as computed above) as a percentage of the maximal possible entropy for an alphabet of that cardinality. For example, the relative entropies of the cardinality three

and cardinality eight alphabets discussed above are

HR = 1.58 / 1.58 = 100%

and

HR = 3.00 / 3.00 = 100%,

respectively. Thus, we can think of these two alphabets as having equivalent entropies,

    even if their absolute entropies are not equal.
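Computationally, relative entropy is simply the ratio of an alphabet's entropy to the maximal entropy for its cardinality. A minimal Python sketch (again with illustrative names), which also anticipates the point about unused letters made below:

    import math

    def shannon_entropy(probabilities):
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    def relative_entropy(probabilities):
        """Entropy as a fraction of the maximum for an alphabet of this cardinality.
        Unused letters should still be listed (with probability 0) so that the
        alphabet's full size is reflected in the denominator."""
        return shannon_entropy(probabilities) / math.log2(len(probabilities))

    print(round(relative_entropy([1/3] * 3), 2))           # 1.0, i.e. 100% of the three-letter maximum
    print(round(relative_entropy([1/8] * 8), 2))           # 1.0, i.e. 100% of the eight-letter maximum
    print(round(relative_entropy([0.98, 0.01, 0.01]), 2))  # 0.1, a highly predictable alphabet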

    Relative entropy also allows entropy calculations to reflect unused letters in a

    passage. Intuitively, a passage of English text that uses only thirteen letters should not

    have the same entropy as a passage of Hawaiian. One imagines the former would seem

    more stilted, more restricted than the latter, since a listener would hear it in the context of

    a twenty-six letter alphabet, rather than a thirteen-letter alphabet. Similarly, a piece that

    only uses the pitches C, C#, Eb, G#, A, A#, and B with given frequencies is very different

    from a passage of chant that uses each of its seven tones with the same frequencies as the

    above piece. While it is likely that the former piece will be heard as using a restricted

    subset of a twelve-pitch alphabet, the latter piece exhausts its alphabet and would not be

    heard as restricted in its materials in the same way as the former. The traditional entropy

    formula is unable to reflect this distinction, because any unused letters carry with them a

    probability of 0, effectively canceling out any entropic contribution from those letters, but

    these unused letters are relevant to the computation of relative entropy through their

inclusion in the maximal entropy for an alphabet of a given cardinality.

    Nevertheless, the use of relative entropy requires caution. A piece of music that


    uses three pitches with equal frequency is much more predictable, mathematically and

    aurally, than a piece of music that uses twelve pitches with equal frequencies, even

though their relative entropies are equal. In other words, although relative entropy

    allows for comparison between alphabets of different cardinalities, such a comparison

must always be considered alongside the alphabets' respective absolute entropies. In this

    paper, relative entropy will only be invoked in the presence of corresponding absolute

    entropy figures or some sort of intuitive justification for hearing these alphabets as perceptually similar.

It is also clear from these examples that the entropy of a message (at least, entropy computed on the literal letters of an alphabet) is independent of the meaning of

    that message. Entropy only reflects characteristics of the language in which that message

    is written or encoded. However, the meaning of a message may become relevant if the

'alphabet' in question is not a literal alphabet. For years, literary critics (especially modernists and post-modernists, and in particular those interested in the work of Thomas Pynchon) have completed information theory analyses of texts using words or images,

    instead of literal letters, as the letters of an alphabet. In this case, the most commonly

    occurring letters are connective words like articles and prepositions. Consider, for

    example, a corrupted block of text from F. Scott Fitzgerald's The Great Gatsby, from

    which every eighth word has been removed:

    When we pulled out into the winter and the real snow, our snow, began stretch out beside us and twinkle against windows, and the dim lights of small Wisconsin echoed by, a sharp wild brace came into the air. We drew in deep of it as we walked back from through the cold vestibules,

    unutterably aware of identity with this country for one hour before we melted indistinguishably into it again.

    Although the result is disjointed in places, it is certainly still intelligible; in places the reader cannot tell the message has been corrupted at all. Every image found in this


    excerpt is repeated; if the word 'winter' were corrupted, 'snow' and 'cold' would still

    convey its meaning. Additionally, the passage contains many connective words and

when these words are missing (as in "our snow began stretch out beside us"), the blanks

    regardless of any entropy figures computed on the basis of individual letter frequency.

    For comparison, an excerpt from Flann O'Brien's At Swim-Two-Birds (considered the first Irish post-modern novel) has been similarly corrupted below.

    I will relate, said Finn. Till a man accomplished twelve books of poetry, the same is not for want of poetry but is forced away. man is taken till a black hole is in the world to the depth of his oxters and he put into it to gaze it with his lonely head and nothing to but his shield and a stick of. Then must nine warriors fly their at him, one with the other and together.

    From this, we can gather that we are listening to a narrator named Finn; word repetition

    clues us in that poetry and war are somehow involved, but there is little else we can say

    about this passage. This same lack of repetition makes the original, non-corrupted

    passage more difficult to understand than the non-corrupted Fitzgerald.

    I will relate, said Finn. Till a man has accomplished twelve books of poetry, the same is not taken for want of poetry but is forced away. No man is taken till a black hole is hollowed in the world to the depth of his two oxters and he put into it to gaze from it with his lonely head and nothing to him but his shield and a stick of hazel. Then must nine warriors fly their spears at him, one with the other and together.

    In examining the original passage, we are in fact examining different sources of

    corruption. The Fitzgerald is robust against the corruption of readers lacking context, or

    readers being sleepy; in the presence of these forms of corruption the passage is still

    readable. The O'Brien is much less robust by comparison. We conclude the passage has

    higher entropy.


    Alternately, we can conclude that the O'Brien passage is more efficient than the

    Fitzgerald, since each individual word communicates more information. If the reader can

    easily guess the meaning of a missing word, as in the Fitzgerald, then that word has a

    very low information content; with these words removed, the passage becomes less

    elegant but not much less intelligible. This is the same quality that makes this passage

    easy to summarize. Since removing words from the O'Brien limits the reader's ability to

    comprehend the passage, we can conclude that the missing words had a higher

information content, that overall, there are fewer redundant or repeated words, and that

    therefore, the O'Brien is a more efficient communication.
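The word-level reading of entropy used in these comparisons can be sketched the same way: treat each distinct word as a letter of the alphabet and compute entropy from word frequencies. The toy strings below are illustrative only, not the actual excerpts.

    import math
    from collections import Counter

    def word_entropy(text):
        """Entropy of a text when each distinct word is treated as one 'letter'."""
        words = text.lower().split()
        counts = Counter(words)
        total = len(words)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    repetitive = "the snow and the cold and the snow fell on the snow"
    varied = "arnica eyebright the draft from the well with the star die on top"

    print(round(word_entropy(repetitive), 2))  # lower: few distinct, often-repeated words
    print(round(word_entropy(varied), 2))      # higher: most words occur only once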

Finally, we examine a passage from "Todtnauberg" by Paul Celan.16

    Arnica, Eyebright, the drink from the with the star-die on top,

    in the

    into the book whose name did it in before mine? the line written into this about a hope, today, for a thinker's (un- coming) word in the

    The result is nearly unintelligible; the reader cannot guess the original narrator, subject, or purpose of this passage. What remains is interpretable, certainly, but the reader cannot

    16 Although this passage is shorter than the others, the same percentage of words has been removed in each case.


be certain of the original text based on this excerpt. Consequently, this passage has high

    entropy.

    As expected, lack of repeated words and lack of connective words contribute to

    higher entropy. Shared meaning contributes to lower entropy as well, as seen in the

    Fitzgerald example dealing with 'snow,' 'winter,' and 'cold.' From these examples, though,

    we can also see that clear syntactical structures reduce entropy. If the reader can perceive

    the sentence structure underlying We drew in deep ____ of it as we walked back from

    ____ through the cold vestibules, the reader can make more educated guesses as to what

    the missing words could be. The second missing word appears to be some sort of place;

    the first missing word is a noun that can be an object of the verb 'to draw in,' so perhaps the missing word is 'breaths' or 'gasps' or something along those lines. The general import

    of the sentence is still clear. Similarly, in the O'Brien sentence Then must nine warriors

    fly their ____ at him, as long as the reader can parse that warriors are throwing things at

    an unhappy target, the meaning of the sentence is clear.

By contrast, poetry (especially the works of Paul Celan) is characterized by

    economy of words and imagery, in that every word contributes a great deal of meaning to

    a passage. This is why the Celan example is the least intelligible of the above: there are

    few redundant words, and the associations between words are specifically designed to be

    unexpected and novel. In other words, each word is intended to convey the greatest

    possible amount of information.

    One could conceptualize this new, more meaning-sensitive interpretation of

    entropy as occurring on a higher level than entropy computed based on literal letters of an

alphabet. If this Fitzgerald sample were encoded in a different alphabet (if it were written in binary, or encrypted for secure transmission) without changing its vocabulary, its low-level, alphabet-based entropy would be quite different, but its higher-level, word-


    based entropy would be the same. To achieve a word-based equivalent to encryption, one

    would need a paraphrase of this text by another author, or a similar text that

communicates the same images (snowfall; evening; solitude) or the same themes (introspection; nostalgia; the notion that a person's actions and mindsets are influenced by that person's home17) using thriftier vocabulary.

    These corrupted blocks of text can be seen as analogous to hearing music in a

    static-filled radio broadcast. Listening to a Haydn string quartet in such a broadcast, one

    would still be able to identify the key, the time signature, and the instrumentation; one

    could make an educated guess as to which movement the quartet was playing, and

    probably one could even hum the missing notes. By contrast, listening to such a broadcast

    of the Webern Concerto, op. 24, one might not even be able to determine the

    orchestration of the piece, let alone guess the missing notes. One can imagine a similar

    corruption of the original musical signal being created by a poor ensemble; in this

    situation, the Haydn can be considered to have a higher entropy because ensemble

    mistakes, whether wrong notes or dynamic mismatches or harmonic misalignments, are

    generally much more recognizable than the corresponding mistakes would be in the

    Webern. Because the listener is (usually) able to form more confident predictions for upcoming events in the Haydn, violations of these predictions (including mistakes) are more striking.18

    Alternately, consider the (comically) corrupted piece of music shown in Figure 2.1.

17 From the next paragraph: "I see now that this has been a story of the West, after all: Tom and Gatsby, Daisy and Jordan and I, were all Westerners, and perhaps we possessed some deficiency in common which made us subtly unadaptable to Eastern life."

    18 This is a generalization, of course. Many Webern compositions can be considered to have low entropy in terms of dynamics in which case a mistake in terms of dynamics would be immediately recognizable as such.


    Figure 2.1: A corrupted tonal work

    Despite the corruption, the identity of this piece is readily apparent. Even a listener who

    had never heard this piece before could make a reasonable guess at every missing note,

    based on typical harmonic progressions, repetition, and motive. By comparison, a

    similarly corrupted, non-tonal work, shown in Figure 2.2, is less easy to identify.

    Figure 2.2: A corrupted contextual work


    A listener already familiar with the piece might be able to identify this as the third

    movement of Webern's Variations for Piano, op. 27, but a listener unfamiliar with the

    piece would not even be able to guess which of the corrupted objects were pitches and which were rests. A listener who expects a serial work based on a derived row may be

able to fill in the blanks (surmising in retrospect that the first missing pitch must be a Bb, creating the ordered interval series to match that of the inverted row form beginning in m. 5), but probably not on first hearing without a score, and certainly

    not as readily as in the Bach. In other words, the second work is more efficient, more

    condensed. Because the missing pitches cannot be determined easily based on the

    surrounding material, these pitches carry a high information content.

Other potential sources of corruption (beyond literal transmission factors like radio static, a corrupted score, or poor acoustics, and figurative transmission factors like poor performance) raise larger questions about the nature of entropy in music. One can

    interpret an imprecise piano reduction of an orchestral work as a corruption of that

    orchestral work, in roughly the same way one could consider a poorly executed English

    translation of a German text a corruption of the original. However, if one considers

    corruption as something that can happen within the music itself, as opposed to something

imposed upon the music by external factors (things like radio static or performers' mistakes), it becomes difficult to decide which musical features are the original signal and which are corruption: is a theme an original signal and its variations corruption? Is the original A section of a ternary form an original signal and its altered A' corruption? Since entropy is defined as a message's ability to resist corruption, what can entropy be

    said to measure in these cases? It may be meaningful to say that a theme resists

    variation or that a melody resists ornamentation, if the former is not very memorable

    or if the latter is already very elaborate, but these states may or may not coincide with


    entropy figures generated for these passages (in that a very elaborate melody may still be very predictable and therefore have a low entropy, for example).

    More to the point, this approach makes questionable implications about the nature

    of musical meaning in such a work. Is it reasonable to consider a Stokowski transcription

    as necessarily subsidiary to the work it transcribes, as opposed to an independent work in

its own right, even if the aesthetic of the transcription is meaningfully different from the

    aesthetic of the original? If so, is it still reasonable to consider a Webern transcription of

    Bach, or for that matter a Wendy Carlos performance of Bach, in the same light? In cases

    of music not governed by a score, which performance is the canonical performance and

    which is the corrupted performance?

Meyer also raises the issue of "cultural noise": corruption that occurs in

    transmission as the result of a time-lag between the habit responses which the audience

    actually possess and those which the more adventurous composer envisages for it.19 This

    can be understood as avant-garde music whose language an audience has not yet

    internalized, or as pre-modern music heard differently by modern or post-modern

    audiences. In this case, the music is not corrupted by any external factors, but the

    audience's perception is; the issue is not signal transmission, but signal reception.

    It seems most reasonable, for the purposes of this project, to consider each score as an uncorrupted signal, accepting publisher and performer mistakes as corruption but

accepting changes that arise through arrangement as part of an original signal. (That is to say, this project accepts Shelley's philosophy of translation: that a translation is, or should be, a new artwork unto itself rather than a derivative work dependent upon an original.20)

    19 Meyer, Meaning in Music and Information Theory, 420.

    20 Percy Shelley, A Defence of Poetry and Other Essays (1840; Project Gutenberg, 2005), http://www.gutenberg.org/etext/5428. See Part I.


    The issue of cultural noise is important, because it is important in every work of analysis;

    an information-theoretic analysis cannot assume an audience will hear a work the way an

    ideal listener would, but nor can any other kind of analysis that wishes to reflect a

    practical perceptual reality.

    In any case, the factors that lead to high or low entropy in a musical example are

    the same as in the excerpted Fitzgerald, O'Brien, and Celan texts. If we analyzed these

    texts using literal letters as an alphabet, we would be able to identify the texts as English,

and we would probably be able to make general statements about the author's style: for

    example, one could determine the average entropy for a passage saturated with Latinate

    vocabulary and the average entropy for Anglo-Saxon vocabulary, based on which letters

    occur the most frequently and which letters do not occur at all (such as w and j in Latin), and from this make predictions about the loftiness or folksiness of the author's writing

    style. Similarly, if we accept pitch as an alphabet, we can make predictions about how

    diatonic or how chromatic a musical excerpt is, based on which pitches occur the most

    frequently. However, loftiness of vocabulary does not result from avoiding the letters w

    and j, any more than tonality results from using scale degrees 1 and 5 frequently. Low entropy (on a pitch-by-pitch basis) is generally symptomatic of tonality, but does not speak to the harmonic progressions that bring tonality into being.

    It may be inappropriate to claim that entropy created by pitches is directly

    analogous to low-level, letter-based entropy in text. In some contexts a pitch may be

    operating as a part of a word (for example, a single pitch within an arpeggiation), while in other contexts that pitch may be a word unto itself. For this reason, pitch-based entropy

    may be more relevant to musical analysis than letter-based entropy is to literary analysis.

    Nevertheless, it seems reasonable to claim that the analysis of more complex musical

    alphabets may strengthen the link between musical style or predictability and entropy


    calculations, creating something more broadly comparable to word-based entropy in

    text. In both music and in text, entropy (as perceived intuitively by the listener or reader) is lowered by the presence of connective material (arpeggiations, passing tones, parsimonious voice leading), repetition (motivic material, canons, imitation), and larger structures (a T-P-D-T phrase structure, a serial row). If alphabets are built that can address the existence or nonexistence of these elements and structures, a more intuitive

    interpretation of entropy will result.
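One minimal way to realize such an alternative alphabet is to compute entropy over derived symbols (melodic intervals, for instance) rather than over raw pitches. The Python sketch below is purely illustrative, assumes pitches encoded as MIDI note numbers, and is not the segmentation or alphabet actually used in the later chapters.

    import math
    from collections import Counter

    def entropy_of_symbols(symbols):
        """Shannon entropy of any sequence of hashable symbols
        (pitches, intervals, CSEGs, pc-sets, and so on)."""
        counts = Counter(symbols)
        total = len(symbols)
        return -sum((c / total) * math.log2(c / total) for c in counts.values())

    # A toy melody as MIDI note numbers (purely illustrative).
    melody = [60, 64, 67, 64, 60, 64, 67, 72, 67, 64, 60]

    pitch_classes = [note % 12 for note in melody]
    intervals = [b - a for a, b in zip(melody, melody[1:])]  # ordered pitch intervals

    print(round(entropy_of_symbols(pitch_classes), 2))  # entropy over a pitch-class alphabet
    print(round(entropy_of_symbols(intervals), 2))      # entropy over an interval alphabet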

    Generally speaking, entropy is less of a commentary on musical meaning than it is

    a commentary on musical style, and the degree of redundancy or predictability with

    which that meaning is communicated. With that said, though, it is impossible to divorce

    the two concepts, just as the meaning of a text cannot be separated from the words with which it is conveyed or, arguably, from the audience's interpretative creation of

    meaning. As Meyer writes,

    Both meaning and information are thus related through probability to uncertainty. For the weaker the probability of a particular consequent in any message, the greater the uncertainty (and information) involved in the antecedent-consequent relationship.21

    Earlier, Meyer highlights this same relationship as the source of musical meaning:

"Musical meaning arises when an antecedent situation, requiring an estimate as to the probable modes of pattern continuation, produces uncertainty as to the temporal-tonal nature of the expected consequent."22 Although this relationship has not always been the

    focus of music theory's use of information theory entropy, Meyer's comments imply that

    information theory entropy has potential insight into musical meaning as well as musical

    style.

    21 Meyer, Meaning in Music and Information Theory, 416

    22 Ibid.


    CHAPTER III EXISTING MUSIC-THEORETIC SCHOLARSHIP ON INFORMATION THEORY

    ENTROPY

    Use of entropy in music theory is generally thought to begin with Youngblood's

1958 article "Style as Information," in which entropies are calculated for eight songs from Schubert's Die schöne Müllerin, six arias from Mendelssohn's St. Paul, and six songs from Schumann's Frauen-Liebe und Leben. Only melodies in major keys are considered. In each case, a modified system of scale degrees is used as an alphabet: 1 indicates tonic, 2 indicates a raised tonic or a lowered supertonic, and so forth up to 12. His

    zero-order results for these composers can be summarized as follows:

Composer       Zero-order Entropy    Zero-order Relative Entropy
Mendelssohn    3.03                  84.60%
Schumann       3.05                  85.00%
Schubert       3.13                  87.00%

    Table 3.1: Pitch entropies from Youngblood

    Youngblood finds the Mendelssohn sample to have the lowest entropy (or, alternately, the greatest redundancy/inefficiency) of the three, although he finds all three composers to have very similar entropies overall.23

    Youngblood also compares the entropy values for these composers to the

    23 Joseph Youngblood, Style as Information, Journal of Music Theory 2, no. 1 (1958): 24-35.


    entropies of a collection of randomly chosen Mode I chants. When these chants are

    considered as representatives of a seven-note alphabet, they are found to have a much

    higher relative entropy than the lieder and arias (HR=96.7%). Youngblood attributes this to the chants' more regular use of non-final and non-tenor tones, as compared to the

    lieder's marked preference for diatonic pitches over chromatic ones. Of course, when

    considered as representative of a twelve-note alphabet, the chant selections have a lower

entropy than the works of all three later composers (H=2.72, HR=76%).24

Knopoff and Hutchinson question Youngblood's non-chant results for statistical

    reasons, claiming that Youngblood's sample size is too small for the differences he finds

in Mendelssohn's and Schubert's entropies to be significant. In support of this argument,

    they construct confidence intervals for Youngblood's data, shown in Figure 3.1. When

    Youngblood says the entropy of his Mendelssohn sample is 3.03, he makes the implicit

claim that this sample is representative of all of Mendelssohn: that if an analyst were to

    compute a total entropy for all extant Mendelssohn works, that result would be fairly

    close to Youngblood's. Confidence intervals measure how certain we are that the sample's

entropy is comparable to that of Mendelssohn's complete body of work. Figure 3.1 shows

Knopoff and Hutchinson's confidence intervals for Youngblood's entropy calculations.

    In this example, Knopoff and Hutchinson state with 95% confidence that

    Mendelssohn's entropy falls between 2.895 and 3.183, and that Schubert's entropy falls

    between 3.016 and 3.244. Since the confidence intervals overlap, one cannot conclude

    based on this data that Schubert's and Mendelssohn's total entropies differ; it is entirely

    possible, based on this data, that Schubert's total entropy is in fact lower than

    24 Although most listeners probably hear chant in terms of a seven-note alphabet, one can

    imagine factors that would lead the listener to hear chant in terms of a twelve-note alphabet, such as placement of the chant between or within tonal works (as in the fragment of chant that concludes Bruckners Os Justi), or a listeners lack of familiarity with the repertoire.


    Mendelssohn's, or that the two are equal.25

    For simple random samples, confidence intervals are generally computed using

some variant of the following formula

x̄ − 1.96 (s / √n) ≤ μ ≤ x̄ + 1.96 (s / √n),

where x̄ is the mean of the sample (in this case, the sample's entropy), s is the sample's standard deviation, n is the sample size, and μ is the quantity we wish to establish: the predicted entropy for the musical style or composer in question.26

25 Leon Knopoff and William Hutchinson, "Entropy as a Measure of Style: The Influence of Sample Length," Journal of Music Theory 27, no. 1 (1983): 75-97.

Figure 3.1: 95% confidence intervals for Youngblood's entropy calculations

As is

    clear from this formula, there are two factors that influence the size of a confidence

    interval: sample size (Knopoff and Hutchinson's focus) and sample variance (the focus of a 1990 Snyder article). The former is reasonably intuitive; a very small sample could be a fluke, but if a large sample of Mendelssohn's work supports the conclusion that his total

    entropy is 3.03, then it seems much more probable that Mendelssohn's overall entropy

    really is close to 3.03. Snyder adds that variance within the sample can also make us

    more or less confident. If an analyst looks at four Mendelssohn samples of comparable

    length and finds them to have entropy values of 2, 4.98, 3.6, and 1.4, that analyst would

    have difficulty predicting Mendelssohn's total entropy, since the samples are so disparate.

By contrast, if the first four samples came back as 3.01, 3.08, 2.97, and 2.93, the

    conclusions drawn about these data would seem much more reasonable, even if the

    sample were smaller.27
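The influence of both factors is visible directly in the formula. The brief Python sketch below is illustrative only; as footnote 26 explains, the intervals in this thesis were actually computed for binomial proportions with propagation of error, which this naive version does not attempt.

    import math

    def confidence_interval_95(sample_mean, sample_sd, n):
        """Naive 95% confidence interval: mean +/- 1.96 * sd / sqrt(n)."""
        margin = 1.96 * sample_sd / math.sqrt(n)
        return (sample_mean - margin, sample_mean + margin)

    # Same mean and standard deviation, different sample sizes:
    print(confidence_interval_95(3.03, 0.3, 8))    # wide interval (small sample)
    print(confidence_interval_95(3.03, 0.3, 200))  # much narrower interval

    # Same sample size, different variability among samples:
    print(confidence_interval_95(3.03, 1.5, 20))   # disparate samples, wide interval
    print(confidence_interval_95(3.03, 0.06, 20))  # consistent samples, narrow interval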

    In addition to statistical concerns, Snyder and Knopoff and Hutchinson highlight

26 The multiplier 1.96 specifies a 95% confidence interval; that is, if we take 100 samples, the means of 95 of the samples will fall within this interval. Multiplying by 1.645 instead would return a 90% confidence interval. This formula is provided only to illustrate the concept; confidence intervals in this paper were calculated for binomial proportions, taking propagation of error into account. See Appendix A of Knopoff and Hutchinson, 1993, for more information.

    27 John Snyder, Entropy as a Measure of Musical Style: The Influence of A Priori Assumptions, Music Theory Spectrum 12, no. 1 (1990): 121-160.


    several methodological problems, the clearest of which is modulation. In a piece that

modulates from C minor to Eb major, one would expect the pitches C, G, Eb, and Bb to occur with the greatest frequency, which increases that piece's entropy quite sharply,

    since such a piece contains four pitches that occur frequently, instead of the two such

    pitches found in a nonmodulatory work. (This logic could be expanded to include scale degree 7 of each key as well as scale degree 5, but the result would be the same.) In a piece that modulates from C minor to G major, the shift to a new diatonic collection would result in a higher entropy, as well. Youngblood's analyses make no accommodation

    for this; although he computes entropies based on a scale-degree system, these scale

    degrees are never adjusted for modulations. He notes that this lack of regard for modulation may have disguised the differences between his Schumann and Schubert

    samples, noting that (at least in these samples) Schubert's chromatic pitches tended to arise from modulation, whereas chromatic pitches in his Schumann samples tended to be

    more ornamental very different phenomena that lead to similar results.28

    In their analyses, Knopoff and Hutchinson compensate for modulations by

    normalizing all passages to C major or A minor, although this normalization is only initiated by changes in written key signature. Snyder finds this disregarding of implied

    modulations quite problematic, as well as the implied prioritization of la-minor. Since a

    modulation between relative keys is never accompanied by a change in key signature, a

    piece that modulates from, say, F major to D minor would register as having higher entropy insofar as the latter tonal area deviates from its la-tonic. Snyder also questions

    the assumption that modulations should be normalized away, arguing that a piece that

    begins and ends in distantly related keys ought to have a higher entropy than a piece that

    28 Youngblood, 78.


begins and ends in the same key.29 Alternatively, one could argue that a piece that modulates from I to V ought to have a lower entropy than a piece that modulates from, say, I to bII; that is, the predictability (and perhaps even the smoothness) of a modulation ought to be a factor in that piece's entropy calculations.

Unfortunately, there are few solutions to the problem of accurately representing modulation in entropy calculations. Arguably, Youngblood's system does associate distant keys with

    higher entropies, since a modulation from C major to G major would contribute much less to a piece's entropy than a modulation from C major to Ab major would, based on the number of pitches held in common between the two respective diatonic collections. One

    could imagine combining this method with a weighting system, in which pitches

    belonging to passages in non-tonic keys contribute less to the piece's total entropy than

    pitches in the tonic key do. Ideally, these weights would be determined in part by the

    amount of time spent in the new key (as the listener's ability to remember the home key diminishes over time), but any such system would almost certainly be criticized as arbitrary.
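A minimal sketch of the weighting imagined here, in which pitches sounding in non-tonic key areas count for less than pitches in the tonic key; the scale-degree labels, the half-weight assigned to the tonicized dominant, and the function name are all hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from math import log2

def key_weighted_entropy(events):
    """events: (scale_degree, weight) pairs, with weight < 1 for non-tonic key areas."""
    totals = defaultdict(float)
    for degree, weight in events:
        totals[degree] += weight
    grand_total = sum(totals.values())
    return -sum((t / grand_total) * log2(t / grand_total) for t in totals.values())

# Hypothetical passage: tonic-key pitches at full weight, a tonicized dominant at half weight
passage = [("1", 1.0), ("3", 1.0), ("5", 1.0), ("1", 1.0),
           ("5", 0.5), ("7", 0.5), ("2", 0.5), ("5", 0.5)]

print(round(key_weighted_entropy(passage), 2))
print(round(key_weighted_entropy([(d, 1.0) for d, _ in passage]), 2))   # unweighted comparison
```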

    The debate over how best to represent modulation in entropy calculations for

    tonal repertoire highlights the concern that underlies most if not all entropy-based

    analyses: what alphabet best reflects listeners' perceptions of musical language?

    Uninterpreted pitch or pitch class is rejected as an alphabet because it is a poor reflection of listeners' interpretative hearings, since it shows no connection between the roles of C

    and G in C minor and Eb and Bb in Eb major. Similarly, when Snyder adopts a twenty-eight-letter alphabet in which enharmonic spellings are taken as separate up to double

29 Snyder, 126-128. While the average listener may not realize that a piece has ended in C# major instead of C major, this same listener would probably notice if the piece begins in C major and ends in F# minor, if only for reasons of mode and register, a distinction that cannot be made within this system.


    flats (of scale degrees 7, 3, 6, and 2) and double sharps (of scale degrees 1, 4, and 5), his motivation is the notion that listeners hear F and E# as distinct pitches in certain contexts,

    rather than the creation of an exhaustive system.

    The variety of options available to analysts (even in terms of pitch alone) speaks to the expressive potential of these alphabets, since they can be altered to best reflect

    listeners' perceptions of any given repertoire. This same flexibility can limit the analyst's

    ability to compare samples from sufficiently different styles, though. It would seem

    unfair to compare Wagner's entropy within a twenty-eight-note system with late serial

    Schoenberg's, for example, since Schoenberg's disuse of double sharps does not speak to

    any increased predictability in his music as compared to Wagner's, nor would it be fair to

    say the listener finds Schoenberg's style more constricted because these letters are

    omitted. One of the unstated goals of such analysis, then, is the selection of an alphabet

    that is sensitive to perceptual concerns for specific repertoires but also general enough in

    its applicability that its use on music from other repertoires seems reasonable.

    This challenge is even greater for contextual music. The most common and most

    universally applicable alphabet, either pitch names or scale degrees accepting octave and

    enharmonic equivalence, is all but useless for serial music or any sort of music that

    exhausts the aggregate regularly. Any such piece will have maximal entropy for that

    alphabet cardinality, regardless of whether the piece is based on a derived row or an all-

    interval row and, indeed, regardless of whether or not the piece is atonal at all. This

    entropic equality implies that Webern's Variations for Piano is exactly as predictable as

Boulez's Piano Sonata no. 2, which would be in turn just as predictable as the first few bars of Coltrane's "Giant Steps," an unintuitive claim, to say the least.

    One potential solution is the incorporation of higher-order entropies, often

accomplished by way of Markov chains. Such constructs would allow the


    analyst to look for patterns in the ordering of pitches, rather than relying on their

    frequency alone. With a simple pitch alphabet, Markov chains could not differentiate

    between serial rows, but could at least distinguish between a serial piece and a non-serial

    piece that happens to use each pitch equally. Higher-order constructs have even clearer

    applicability in entropic analyses of tonal music, since they measure predictability of

succession, something of particular importance if entropy is taken to be a measure of

    tonality, since entropy on its own is order-blind. Thus, from the perspective of zero-order

    entropy, the progressions in Figures 3.2 and 3.3 are exactly the same, although certainly

    one is more predictable than the other within a tonal paradigm, and certainly one is more

    tonal than the other. By contrast, Markov chains could differentiate between these two

    strings easily.
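The distinction can be sketched concretely. The passage below computes a zero-order entropy and a crude first-order measure (the entropy of consecutive pairs, in the spirit of the Markov-chain approach described here); the two note lists are hypothetical stand-ins with identical pitch content in different orderings, not transcriptions of Figures 3.2 and 3.3.

```python
from collections import Counter
from math import log2

def entropy(events):
    """Zero-order entropy in bits per symbol."""
    counts = Counter(events)
    total = len(events)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def pair_entropy(events):
    """Entropy of consecutive pairs: a crude stand-in for a first-order Markov view."""
    return entropy(list(zip(events, events[1:])))

# Two hypothetical lines with the same pitch content in different orders
patterned = ["C", "E", "G", "C", "E", "G", "C", "E", "G", "B", "D", "G"]
scrambled = ["G", "B", "C", "E", "G", "D", "C", "G", "E", "C", "E", "G"]

print(entropy(patterned), entropy(scrambled))            # identical: zero-order entropy is order-blind
print(pair_entropy(patterned), pair_entropy(scrambled))  # differ once ordering is taken into account
```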

    Figure 3.2: Passage with pitch-class entropy 2.52

    Figure 3.3: Passage with pitch-class entropy 2.52


    In his 1958 analyses, Youngblood computes entropies on first-order combinations,

    in addition to entropies based on zero-order data. That is, rather than accepting C, D, and

    E as the most basic units of music, Youngblood accepts C followed by C, C followed by

    C#, C followed by D, and so forth as individual letters, creating an alphabet with 144

    letters. However, Hessert notes that the continued effectiveness of this strategy is limited;

    an alphabet built from consecutive pitch pairs is almost reasonable at 144 letters, but

three consecutive pitches yield 1728 possibilities, which leads to unwieldy

    calculations.30 One can only imagine the complexity of a higher-order alphabet that does

    not accept octave equivalence.
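The combinatorial growth Hessert describes can be stated directly; a trivial sketch, assuming a twelve-letter pitch-class alphabet with octave and enharmonic equivalence:

```python
# Size of an n-gram alphabet built from 12 pitch classes:
# 12 single pitch classes, 144 ordered pairs, 1728 ordered triples, and so on.
for order in (1, 2, 3):
    print(order, 12 ** order)
```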

    Hessert advocates the use of an alphabet based on intervals as a potential solution

    to this problem, since a computation based on intervals is effectively first-order without

    requiring any first-order computations. He also notes that an alphabet based on intervals

    avoids the issue of modulation quite nicely, while reflecting motivic content more

    accurately than pitch-based analysis can and potentially allowing for more meaningful

    comparisons across disparate repertoire.31 Rhodes advocates a similar solution: an

    alphabet that combines each pitch with its preceding interval.32 Potentially, such an

    alphabet would allow the analyst to distinguish between typical and non-typical

    resolutions of dissonant tones; a piece in which any scale degree can be left by any

    interval is probably less tonal than a piece in which certain scale degrees (4 and 7, e.g.) can usually only be left by certain intervals (down by step and up by step, respectively). Of course, in the eyes of this computation, a composer who always resolves 7 to b5

    30 Hessert, 16ff.

    31 Ibid., 43-44.

32 James Rhodes, "Musical Data as Information: A General-Systems Perspective on Musical Analysis," Computing in Musicology 10 (1995-1996): 165-180.


    would be no more or less predictable than a composer who always resolves 7 to 1, or

    even a composer whose 7s can resolve anywhere but whose b3s always resolve to b6.

    Through its reliance on pitch, this sort of analysis nullifies many of the benefits Hessert

    ascribes to intervallic analysis.

    Lewin conceptualizes the importance of higher-order analytical capacity in terms

of "charge," defined as the listener's degree of uncertainty as to what an upcoming

    interval will be based on the intervals directly preceding it. His analysis goes up to sixth-

    order strings (that is, fifth-order computations based on intervals), but it occurs within a highly idealized environment: a twelve-tone row independent of musical context, and

    therefore independent of irregularities (e.g., partial presentations or reorderings of a row) or complications (e.g., the division of a row into verticalities, leading to the creation of melodic intervals not present in the original row) that would make such higher-order analysis impractical.33

    Based on this ideal environment, Lewin determines that if a listener is able to

    remember the previous five intervals of Schoenberg's String Quartet, no. 4, the listener can predict the sixth interval with complete certainty (assuming the row form in question has not been altered or truncated). This certainty is not an accurate reflection of listeners' perceptions of this row, though, even under ideal circumstances; if it were, Lewin argues,

    the associated musical experience would be quite dull. Therefore, he concludes, the

listener probably only hears back two or three intervals, perhaps more or fewer,

    depending on motivic structure, complexity of the line's presentation, repetitiveness of

    the line, and other factors, but probably not six. Thus, even if such higher-order analyses

    were practical, they may not be a reasonable reflection of the listener's experience.
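Lewin's observation can be sketched as a conditional-entropy calculation: on an idealized, endlessly repeating succession of row intervals, the uncertainty about the next interval given the previous k intervals falls to zero once k is large enough. The interval succession below is hypothetical, not the row of Schoenberg's Fourth Quartet, and the function name is invented for illustration.

```python
from collections import Counter, defaultdict
from math import log2

def conditional_entropy(seq, k):
    """Average uncertainty (bits) about the next symbol, given the previous k symbols."""
    contexts = defaultdict(Counter)
    for i in range(len(seq) - k):
        contexts[tuple(seq[i:i + k])][seq[i + k]] += 1
    total = sum(sum(c.values()) for c in contexts.values())
    h = 0.0
    for nexts in contexts.values():
        n = sum(nexts.values())
        h += (n / total) * -sum((c / n) * log2(c / n) for c in nexts.values())
    return h

# A hypothetical eleven-interval row, cycled to stand in for an idealized, uninterrupted presentation
row_intervals = [1, 3, 2, 1, 4, 1, 2, 3, 1, 2, 5]
idealized = row_intervals * 20

for k in range(7):
    print(k, round(conditional_entropy(idealized, k), 3))   # falls to 0 once the context pins down the row
```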

33 David Lewin, "Some Applications of Communication Theory to the Study of Twelve-Tone Music," Journal of Music Theory 12, no. 1 (1968): 50-84.


    Of course, one can imagine situations in which a less literal sixth-order analysis

    would be appropriate. Although it seems unreasonable to expect a listener to remember

    and consider six successive intervals, it seems quite reasonable for a listener to remember

    a six-element contour, or six intervals expressed in the form of two or three verticalities.

    To date, this form of entropy chunking has had almost no mention in the relevant

    literature.

    Hessert also raises the question of duration, finding it problematic that a half-note

    chord root C is treated as equal to a sixteenth-note neighbor tone D in most entropic

    analyses.34 Hiller and Bean reflect this same concern in their 1966 analyses of sonata

expositions, in which longer notes are weighted more heavily than shorter notes, but

    Hessert criticizes this approach for its lack of attention to attack, arguing that sixteen

    sixteenth-note Cs are quite different perceptually from a single whole-note C.35 One

    imagines that any method of computation that addresses both concerns would be

    prohibitively complex; it seems most likely that any interested analyst must choose

    whichever approach seems least inappropriate for that analyst's particular repertoire.
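A minimal sketch of the duration-weighted alternative described above, in which each pitch contributes in proportion to its notated length rather than its number of attacks; the event list and its durations (counted in sixteenths) are hypothetical, and the sketch deliberately ignores Hessert's attack objection.

```python
from collections import defaultdict
from math import log2

def duration_weighted_entropy(events):
    """events: (pitch, duration) pairs; each pitch is weighted by its total duration."""
    weights = defaultdict(float)
    for pitch, duration in events:
        weights[pitch] += duration
    total = sum(weights.values())
    return -sum((w / total) * log2(w / total) for w in weights.values())

# Hypothetical passage: half-note chord roots on C against sixteenth-note neighbours on D
events = [("C", 8), ("D", 1), ("C", 8), ("E", 4), ("D", 1), ("C", 8)]

print(round(duration_weighted_entropy(events), 2))                        # D's contribution shrinks
print(round(duration_weighted_entropy([(p, 1) for p, _ in events]), 2))   # unweighted: attacks count equally
```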

In any case, the most salient of Hessert's concerns, that an ornamental tone is treated as equal to a chord tone, seems to be more an issue of interval than of duration,

    since ornamental tones are approached by step more often than not (and since an ornamental tone approached by a large leap is probably aurally surprising enough that it

    ought well to contribute as much to the piece's entropy as the chord tone it ornaments). It seems perceptually reasonable to claim that a whole step is a whole step, regardless of

    whether that whole step connects a C and a passing D or a chord tone C and an adjacent

    34 Hessert, 68.

35 Lejaren Hiller and Calvert Bean, "Information Theory Analyses of Four Sonata Expositions," Journal of Music Theory 10, no. 1 (1966): 96-137.


chord tone D. From a pitch-based perspective, the distance a line must travel to arrive at the next pitch may be more relevant to musical predictability than how long the line stays on that pitch. Even in the absence of pitch, it seems the

    primary determinant of predictability is not the duration of each individual pitch, but

    instead either the rhythmic pattern in which these pitches present themselves or the

    presence or absence of attacks at certain metric positions.

    Hessert cites one example of entropy calculations based on rhythmic patterns, an

    unpublished 1959 Master's thesis by John Brawley (Indiana University). Hessert finds this analysis problematic, since it relies upon an implicit invocation of an alphabet of

    infinite cardinality, which makes the computation of relative entropy and redundancy

    impossible. Additionally, Brawley sets forth no predetermined limits to what constitutes a

    pattern. Does a dotted quarter followed by an eighth note constitute a rhythmic pattern?

    If this configuration begins on a weak beat or is preceded by an eighth note, is it the same

pattern? Is a "pattern" perceptually the same at M.M.=160 as it is at M.M.=40?36

    Snyder advocates the exploration of duration-sensitive entropy calculations, but

    he notes that such calculations almost necessarily conflate clock time with perceptual

    time.37 In other words, by creating calculations based on the notated tempo we implicitly

    privilege the former, which is less defensible given the degree to which analysis based on

    entropy is meant to be a measure of listeners' perceptions of predictability. Of course, any

    analysis that claims to be a reflection of perceptual time must almost certainly encompass

multiple musical domains beyond rhythm, tempo, and duration. In privileging longer

    notes over shorter ones, we run the risk of (for example) privileging extended neighbor notes over the shorter chord tones they ornament.

    36 Ibid., 45-50.

    37 Snyder, 125-126.


    Other than Rhodes, few analysts have attempted to deal with more than one

    musical alphabet simultaneously. The notable exception is Hiller and Fuller's 1967

analysis of Webern's Symphony, op. 21, in which pitch (not pitch class) is combined with the number of eighth notes between successive attacks. Entropies are also computed on

    various types of intervals. These entropy calculations are then used to draw conclusions

about each formal section of the first movement. When pitch is considered

    alone, results between zero-order entropy and first- or higher-order chains are

    inconclusive; although the development is (as one would expect) the least predictable in terms of individual pitches, its higher-order results are more predictable than either the

    exposition or the recapitulation.38 These inconsistencies carry over into interval-based

    and attack-point-based entropies.39

As mentioned, entropy is the quantity of information (measured in the number of bits the message would require to store or transmit) that each letter of an alphabet conveys. Hiller and Fuller also express their entropy in terms of bits per second (based on the notated tempo); that is, they examine entropy in terms of the rate at which information is presented. Their hope is to distinguish between the listener's experience of a great deal

    of information presented quickly, and the same amount of information presented over a

    longer timespan. Interpreting entropy in terms of bits per second does not change the

    entropy results for op. 21, but the idea bears investigation: that the speed with which

    information is presented influences the audience's perception of its complexity.

Unfortunately, this measure cannot describe how evenly information is distributed across a passage; it cannot distinguish, for example, a burst of information followed by silence

38 Lejaren Hiller and Ramon Fuller, "Structure and Information in Webern's Symphonie, op. 21," Journal of Music Theory 11, no. 1 (Spring 1967): 78.
39 Ibid., 84ff.


    from a passage with a continuous information rate. Of course, the accuracy of Webern's

    notated tempos is problematic in any case, and the frequent ritardandos in his music make

    it less plausible that a calculation of this type could be relevant to a performance. Despite

    any practical limitations, though, the fact that entropy was considered in terms of the rate

    at which information is received hints at an early connection between entropy and

diachronic analysis, and, arguably, an early connection between entropy and time, as

    well.
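A sketch of the bits-per-second reinterpretation: entropy in bits per letter multiplied by the rate at which letters arrive, as implied by the notated tempo. The numbers below are illustrative only and are not Hiller and Fuller's figures for op. 21.

```python
def bits_per_second(entropy_per_note, notes_per_measure, tempo_bpm, beats_per_measure):
    """Convert entropy per note (bits) into an information rate, given a notated tempo."""
    measures_per_second = tempo_bpm / 60.0 / beats_per_measure
    notes_per_second = notes_per_measure * measures_per_second
    return entropy_per_note * notes_per_second

# The same per-note entropy heard at two different tempos: the faster reading carries more bits per second
print(bits_per_second(3.5, notes_per_measure=8, tempo_bpm=120, beats_per_measure=4))
print(bits_per_second(3.5, notes_per_measure=8, tempo_bpm=60, beats_per_measure=4))
```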

    Hessert gives four criteria for effective entropy-based analyses:

1. An alphabet should be finite;
2. Elements in an alphabet should be discrete;
3. Sample sizes should be as large as possible;
4. Analysis should be based on as many musical domains as possible.

    The first two are basic criteria without which entropy calculations are impossible; the

    second two are desiderata but not necessarily requirements. To these one can add that

    entropy can most effectively analyze samples with low variances, since entropy is in

    some sense a decontextualized measure of central tendency. Smaller sample sizes may

    assist in analysis, if they serve to reduce variance; it is more effective to analyze a small

    sample that possesses a given characteristic uniformly than to combine this sample with

    another sample lacking this characteristic. Imagine a bimodal grade distribution in which

    many students have a 90% average and many have a 60% average. Considering these

    students in terms of two smaller sample sizes allows one to generalize about the data

    easily, but combining the two samples yields both an unhelpful overall average and a

    much higher degree of uncertainty. The same logic ought to apply to musical domains.

    Considering data across multiple domains is useful, but considering multiple domains

simultaneously, that is, combining entropies of different domains into a single entropy


measure, may disguise tendencies in the data.
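The grade-distribution analogy can be made concrete with a few hypothetical scores; the point is only that pooling two internally consistent samples inflates the variance and yields a mean that describes neither group.

```python
import statistics

high = [88, 90, 92, 91, 89]   # hypothetical scores clustered near 90%
low  = [58, 60, 62, 61, 59]   # hypothetical scores clustered near 60%

for group in (high, low, high + low):
    print(round(statistics.mean(group), 1), round(statistics.stdev(group), 1))
# The pooled mean (about 75) describes nobody, and the pooled standard deviation balloons.
```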

    The cautions one can draw from the history of entropy in music are, for the most

    part, no different from the cautions that apply to all analysis. In particular, entropy-based

    analyses are problematic when they do not reflect musical experience. If one accepts that

all music analysis is necessarily metaphor, that quantitative analyses are simply a different way of exploring metaphor, then the most important caution is that these

    metaphors must be apt, rather than relying upon their quantitative nature to make their

    arguments. If an analyst is careful to ensure that conclusions based on entropy are

reflective of musical experience and perception, diachronic or synchronic, then entropy

    can prove a useful tool for analysis.


CHAPTER IV
ALPHABETS FOR ENTROPY-BASED ANALYSIS

    Interval Entropy

    As discussed previously, pitch class entropy is rarely useful for analysis of post-

    tonal music. The table below gives pitch class entropy figures for a collection of post-

tonal vocal works; Youngblood's results for Schubert's pitch entropy provide a baseline

    from tonal repertoire.

Work | Style | Pitch class entropy | Relative entropy
Webern op. 15 (Fünf geistliche Lieder), without no. 5 (n. 40) | Freely atonal | 3.58 | 100%
Webern op. 16 (Fünf Canons) and op. 15, no. 5 | Freely atonal canons | 3.57 | 99.7%
Webern op. 25 (Drei Lieder) | Serial, based on a derived row | 3.58 | 100%
Babbitt, "Widow's Lament in Springtime" | Serial, based on an all-interval row | 3.58 | 100%
Youngblood's Schubert sample | Tonal | 3.13 | 87.4%

Table 4.1: Pitch entropies in Webern works, compared with Babbitt and Schubert

    No measures of statistical significance are necessary to interpret these results.

    40 Since op. 15, no. 5 is a canon, it is included with the op. 16 canons throughout this section.


    Although pitch entropy is able to distinguish Schubert from Webern, it is unable to

    distinguish between serial and freely atonal works, or derived rows and all-interval rows.

    Even canons are seen as maximally unpredictable, although one imagines the second and

    third voices are quite predictable indeed.
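For reference, the calculation behind Table 4.1 can be sketched as follows: zero-order pitch-class entropy, with relative entropy taken as the result divided by the log2 12 maximum for a twelve-letter alphabet. The input line below is a hypothetical aggregate-cycling melody, included only to show why such music sits at (or very near) the 3.58 ceiling.

```python
from collections import Counter
from math import log2

def pc_entropy(pitch_classes):
    """Zero-order entropy over pitch classes 0-11."""
    counts = Counter(pitch_classes)
    total = len(pitch_classes)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def relative_entropy(h, alphabet_size=12):
    return h / log2(alphabet_size)

aggregate_cycling = list(range(12)) * 10     # hypothetical line that exhausts the aggregate evenly
h = pc_entropy(aggregate_cycling)
print(round(h, 2), round(relative_entropy(h), 3))   # approximately 3.58 and 1.0
```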

    Intuitively, it seems entropy based on interval class should be able to distinguish

    between these styles. Pitch class entropy can only recognize canons iterated at the same

    pitch level, but a canon interpreted as a series of intervals should be recognizable at any

    pitch level. Although the order-blindness of entropy somewhat limits its effectiveness for

    canons, interval class entropy can at least distinguish a canon from a non-canonic piece in

    the same style. The same logic applies to serial works; a serial work will generally have

    lower entropy than a freely atonal work since any interval appearing in the row would be

    repeated many times, while any interval not appearing in the row would be heard very

infrequently. (Similarly, a serial work based on an all-interval row should have roughly equal numbers of each interval class, whereas a work based on a derived row would

    have roughly proportionate numbers of a few interval classes and very few of any others.) Such a measure could be unable to distinguish between a serial work based on a derived

    row and a freely atonal work saturated with the pitch class set that forms the basis of the

    former's derived row, or a serial work based on an all-interval row and a freely atonal

    work that simply exhausts the aggregate of interval classes regularly, but arguably, most

    listeners would not be able to make this distinction, either.
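A minimal sketch of the interval-class reading used in Table 4.2: successive melodic intervals are reduced mod 12 and then folded by inversional equivalence into classes 0 through 6 before the entropy is computed. The melody (given in MIDI note numbers) is hypothetical.

```python
from collections import Counter
from math import log2

def entropy(symbols):
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def interval_classes(pitches):
    """Successive melodic intervals folded to interval classes 0-6."""
    ics = []
    for a, b in zip(pitches, pitches[1:]):
        pc_interval = (b - a) % 12
        ics.append(min(pc_interval, 12 - pc_interval))
    return ics

melody = [60, 61, 72, 71, 59, 60, 66, 65, 53, 54]   # hypothetical line
print(round(entropy(interval_classes(melody)), 2))
```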

    These intuitions are somewhat flawed, in that they assume an idealized linear

    presentation of a serial row. Vertical presentation of a portion of a row or the division of a

    row amongst several voices will almost certainly create new intervals not represented in

    the original row. Nevertheless, entropy is at heart a measure of predictability, and it seems

    reasonable that it should reflect the listener's surprise at hearing an interval not linearly


    present in the row, even if reflection of this surprise comes at the expense of the

    construct's ability to identify a work as serial or non-serial.

    Horizontal interval class analysis of the same works from Table 4.1 provides the

    results shown in Table 4.2 and Figure 4.1.

Work | Interval class entropy | Deviation at a 95% confidence level | Relative entropy
Webern op. 15 | 2.57 | .04 | 91.5%
Webern op. 16 | 2.48 | .04 | 88.3%
Webern op. 25 | 2.35 | .05 | 83.6%
Babbitt, "Widow's Lament" | 2.72 | .06 | 96.8%

Table 4.2: Interval class entropies comparing serial and non-serial works

    Figure 4.1: Interval class entropies comparing serial and non-serial works


    These data indicate that interval class entropy is able to distinguish between

    derived and all-interval rows, and between canons and non-canons from approximately

    the same period. These are both important tests of the construct's effectiveness; its ability

    to make these distinctions speaks toward its ability to reflect musical saturation and

    predictability.

    These distinctions are not retained when vertical intervals are included.

Work | Entropy (vertical and horizontal intervals) | Deviation | Relative entropy
op. 16 | 3.42 | .05 | 95.5%
op. 25 | 3.34 | .06 | 93.3%

Table 4.3: Vertical and horizontal interval entropy on one serial and one non-serial work

    Although op. 25's entropy is still lower than op. 16's, the difference is no longer

    significant. In other words, based on these calculations we cannot posit a distinction

    between Webern's use of verticalities in op. 16 and op. 25; if both pieces were played as

    block chords, it is unlikely the listener would be able to distinguish between them based

    solely on intervallic content.

    Returning to the question of horizontal intervals, then, we find that removing

    inversional equivalence eliminates many of the distinctions between these works, as

    shown in Table 4.4 and Figure 4.2. Without inversional equivalence, relative entropies are

    higher across the board since what was originally an emphasis on interval class 1

    becomes a dual emphasis on registrally-ordered interval classes 1 and 11. Variances

    increase for the same reason, which makes statistically significant distinctions less likely.
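On one reading of the term, a registrally-ordered interval class is the ascending distance mod 12 (0 through 11), so that a rising semitone and a falling semitone register as 1 and 11 respectively, while plain interval class folds both into 1. A small sketch under that assumption:

```python
def ric(a, b):
    """Registrally-ordered interval class, read here as the distance from a up to b, mod 12 (0-11)."""
    return (b - a) % 12

def ic(a, b):
    """Interval class: ric folded by inversional equivalence (0-6)."""
    r = ric(a, b)
    return min(r, 12 - r)

# A rising semitone and a falling semitone: the same interval class, but different rics
print(ric(60, 61), ic(60, 61))   # 1, 1
print(ric(60, 59), ic(60, 59))   # 11, 1
```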


    Nevertheless, registrally-ordered interval class entropy can still distinguish

    meaningfully between canons and non-canons (Webern op. 15 vs. op. 16) and between derived rows and all-interval rows (Webern op. 25 vs. Babbitt). The most interesting difference between Table 4.2 and Table 4.4 is op. 25, which has a lower interval class

    entropy than op. 16 but a higher registrally-ordered interval class (ric) entropy. This distinction speaks to a fundamentally different approach to inversion between these two

    works. In op. 16, an ric1 is not the same as an ric11, since a melodic ric1 in the clarinet

    line could not be answered with an ric11 in the vocal line without breaking the canon.

    Assumptions of inversional equivalence seem much more reasonable in op. 25, since the

    juxtaposition of prime rows with inversional rows leads the listener to hear intervals and their inversions as at least related, if not equivalent.

Work | Registrally-ordered interval class entropy | Deviation | Relative entropy
Webern op. 15 | 3.40 | .06 | 95.0%
Webern op. 16 | 3.24 | .06 | 90.5%
Webern op. 25 | 3.34 | .06 | 93.3%
Babbitt, "Widow's Lament in Springtime" | 3.50 | .07 | 97.8%

Table 4.4: Registrally-ordered interval class entropy in Webern and Babbitt


    The remaining oddity in these data is the similarity between Webern op. 15 and

    Babbitt. To investigate this similarity, we expand intervallic entropy into interval entropy

    (-72 < x < 72) and ordered directional interval class entropy (-12 < x < 12).41

    Ordered directional interval class entropy bears few surprises. The data in Table 4.5 and

    Figure 4.3 show the expected distinction between Webern and Babbitt, but from these

data no conclusions can be drawn about any of the Webern works examined, almost the

    opposite of the results generated by registrally-ordered interval class entropy.

41 72 (or six octaves) is a number chosen out of convenience: the distance between the highest and lowest pitch in any of these pieces, rounded up to the nearest octave.

    Figure 4.2: Registrally-ordered interval class entropy in Webern and Babbitt


Work | Ordered directional interval class entropy | Deviation | Relative entropy
Webern op. 15 | 3.71 | .08 | 82.1%
Webern op. 16 | 3.64 | .09 | 80.1%
Webern op. 25 | 3.52 | .11 | 77.9%
Babbitt, "Widow's Lament" | 4.03 | .12 | 89.2%

Table 4.5: Ordered directional interval class entropy in serial and non-serial works

    Figure 4.3: Ordered directional interval class entropy in serial and non-serial works


The inferences to be made from this apparent inconsistency (either that ordered interval class is a less relevant structure in these Webern works, or that Webern's predictability in terms of ordered interval class remains consistent across a variety of post-tonal styles) are at first alarming. Either conclusion makes suspect Hessert's claim that interval-based entropy is capable of dealing meaningfully with works from disparate periods and styles, given that registrally-ordered interval class entropy lacks the generality to distinguish

between Babbitt and freely atonal Webern, while ordered directional interval class entropy lacks the generality to

    distinguish between Webern works of different styles and time periods. Perhaps the more

    useful claim to draw from this perceived lack of generality is that any invocation of

intervallic entropy must be nuanced: in computing intervallic entropy we make

    implicit assumptions about a given composer's approach to the interval, assumptions that

    should be examined and argued.

    One must also keep in mind that although statistically significant differences

    between works imply differences in style, the lack of statistically significant differences

    does not imply stylistic similarities. The lack of distinction between Webern's op. 15 and

Babbitt's "Widow's Lament" in terms of registrally-ordered interval class entropy does

    not imply a fundamental similarity between these works' use of registrally-ordered

    interval classes; rather, the differences between the works are simply not profound

    enough for us to be certain that they imply a genuine stylistic difference. In short, an

    unexpected significant difference between two works is noteworthy, but an unexpected

    similarity need not be.

    At the very least, these results demonstrate the utility of examining repertoire

    from multiple perspectives on the interval. These results also hint at the possibility of

    using various types of intervallic entropy as evidence in an argument against, for


    example, accepting inversional equivalence as a given in analysis of a particular work.

    Entropy computations for pure intervals, as opposed to interval classes, provide

    the following results:

Work | Interval entropy | Deviation
Webern op. 15 | 4.92 | .11
Webern op. 16 | 4.75 | .11
Webern op. 25 | 4.98 | .16
Babbitt, "Widow's Lament" | 4.91 | .16

Table 4.6: Interval entropy in Webern and Babbitt

    Figure 4.4: Interval entropy in Webern and Babbitt


Relative entropy is omitted here, because the maximal entropy of a 144-letter alphabet is extraordinarily large. As a result, these works would have extraordinarily small relative entropies, which would give an impression of predictability not audible in the

    music.

    Although op. 16 seems to have a much smaller entropy than all other works

    considered, this deviation is not statistically significant. Even if it were, the conclusions

    drawn would be slightly problematic. One could not conclude even from significantly

    smaller interval entropy that Webern op. 16 relies more upon smaller interval

