
Application of a Discrete-Character Parsimony Phylogeny-Inference Algorithm to Classical Text...

Date post: 31-May-2018
Upload: adam-breindel
  • 8/15/2019 Application of a Discrete-Character Parsimony Phylogeny-Inference Algorithm to Classical Text Stemmata

    Note regarding 2007 republication: Source and binary code referenced in note 34 are no longer at that URL. The same assets are now hosted at http://www.selfmummy.com/mss2dna

    Adam Breindel

    Department of Classics, Brown University

    May 1998

    The Application of a Discrete-Character Parsimony Phylogeny-Inference Algorithm to Classical Text Stemmata

    The purpose of this paper is to present two interdisciplinary observations, a new technique for stemmatic analysis, and preliminary results from an application of this technique.

    The first interdisciplinary observation is that the methods and purpose of stemmatics overlap substantially with the methods and purpose of the biological sub-discipline of cladistic analysis. While this fact is rarely emphasized or exploited, it is not a new discovery, and its history will be discussed. The second interdisciplinary observation is that computer software which has been developed for biologists in order to solve problems in cladistic analysis now offers us the possibility of advances in the construction of textual stemmata, through a non-traditional use of traditional manuscript collations.

    The analytic technique contained herein (which does not appear ever to have been attempted heretofore) is the application of an existing cladistic analysis software package to the stemmatic analysis of a manuscript collation. The use of this technique to analyze part of the Sallustian corpus is thoroughly documented in this study. Preliminary results indicate that the technique produces a stemma nearly identical to that published by L.D. Reynolds, the editor of the Oxford text. Hence, this method appears to offer an effective new approach to evaluating the relationships among extant versions of a text.

    The interplay between the disciplines of biological systematics, genetics, and textual criticism, which makes this paper possible, has a somewhat Byzantine history spanning the last thirty years. I ask the reader to consider with charity my exposition of this history. For it seems that the relative uniqueness of this paper demands an unusually large amount of background information.1

    Background

    In 1968, John G. Griffith published a paper entitled "A Taxonomic Study of the Manuscript Tradition of Juvenal."2 In this study, Griffith applied methods of numerical taxonomy to the classification of Juvenal manuscripts. The taxonomic methods, as he explains in a similar article the following year,3 he had in turn learned from biologist Robert Sokal's 1966 Scientific American article on that topic.

    Griffith describes the biological advances which he exploits in analyzing the texts of Juvenal:

    1 I have found it necessary in the course of this paper to refer to some technical aspects of systematics and genetics. I have attempted to restrict to an elementary level the familiarity required with these disciplines, in order to make this work accessible to a broad audience. Nonetheless, readers seeking an introductory exposition may find appropriate sections of the following textbooks useful:

        Gamblin, Linda and Gail Vines, eds. (1991) The Evolution of Life, Oxford, chapter 3.

        Maxson, Linda R. and Charles H. Daugherty (1992) Genetics: A Human Perspective, Dubuque, Iowa, chapters 8, 10.

        Minkoff, Eli C. (1983) Evolutionary Biology, Reading, Mass., chapter 22.

    2 Griffith, John G. (1968) "A Taxonomic Study of the Manuscript Tradition of Juvenal," Museum Helveticum 25:101-38.

    3 Griffith, John G. (1969) "Numerical Taxonomy and Some Primary Manuscripts of the Gospels," Journal of Theological Studies 20:389-406.


        Scientists have long been aware of the limitations of the traditional methods of classifying specimens; biologists in particular have laboured under this handicap. Within the last 10-15 years considerable advances have been made, largely because techniques developed for computer use have enabled specialists in this activity, who style themselves numerical taxonomists, to sift with speed and precision large masses of unpromisingly heterogeneous material, and thereby to isolate groups or taxa of related specimens, on the basis of which further inquiry may be conducted. ...4

    Thus, Griffith identifies a requirement which textual criticism has in common with taxonomy: in both disciplines, objects must be grouped based on small numbers of distinctions among vast amounts of similarity. The numeric taxonomy methods appear, he says, to offer new quantitative approaches applicable to both problems. He then expresses the hope that we might find associations between specimens by evaluating large amounts of data with machine assistance. In light of the existing resources, though, he remarks that for a textual critic operating with only a few thousand lines of text "it is simply not worth the trouble of programming the data for machine-processing..."5

    The limitations to Griffith's pioneering approach were unfortunately several. His procedure was, first, extraordinarily laborious: for the fourteen Gospel manuscripts analyzed in his article of 1969, up to fifty-six manual recording acts were required for every variant among one or more of the manuscripts. Thus he was constrained to look at only small samples of the data. Moreover, if he had had access to more data, he would likely have lacked access to the technology to evaluate it.

    4 Griffith, op. cit. 1968, pp. 113-14.

    5 ibid.


    Griffith's procedure (and, in all fairness, the biological methods with which he worked) had a more troublesome limitation in that they resulted only in associations of objects. Griffith could assert the distribution of manuscripts into various sub-groups with statistically-argued accuracy, but the mere grouping of the manuscripts does not seem to have accomplished much. His methods said nothing about the genealogical relationships of the manuscripts. For example, if manuscripts A, B, and C are found to be in a single taxon, we have only formalized their external similarity. As useful as such formalization might be, little is indicated about the genealogical relationships likely to inhere between the manuscripts.

    Thus, Griffith succeeded in bringing numerical taxonomy into the arena of textual criticism, but the biological approach upon which he depended was not ambitious enough to describe the relationships among the specimens, and so his textual techniques appear to have fallen into desuetude.

    In 1973, Martin West published a short work on textual criticism, Textual Criticism and Editorial Technique.6 In this work, West explains that computers might theoretically hold some promise for stemma construction, because, under the best possible circumstances, building a stemma demands only simple logic. Such a stemma would naturally be an advance over Griffith's taxonomic manuscript associations. West is, however, skeptical about the idea and holds out some theoretical reservations:

        If provided with suitably prepared transcriptions of the manuscripts, purged of coincidental errors, a computer could draw up a clumsy and unselective critical apparatus; and it could in principle (where there was no contamination!) work out an "unoriented" stemma. That means ... that it could work out a scheme simply by comparing the variants, without regard to whether they were right or wrong; but this scheme would be capable of suspension from any point [i.e., the scheme could not distinguish the subarchetypes] ... The correct orientation could only be determined by evaluating the quality of the variants, which no machine is capable of doing.7

    6 West, Martin L. (1973) Textual Criticism and Editorial Technique, Stuttgart.

    West's objections will be considered in detail later, as they are important to the present investigation. But it is worth noting for now that even if West had wanted to test a computerized construction of a stemma, there would have been obstacles to his progress.

    First, there would not have been readily available technology for his purpose. But more importantly, outside of theoretical computer science or mathematical graph theory, there had not been practical research on automating the construction of stemmata when the data for the specimens are inconsistent or underdetermined. That is, if the variants in a set of manuscripts were completely compatible with a unique stemma, we would need only make the right inferences to generate it. In reality, though, there is usually no stemma which is consistent with every locus in the manuscripts; conversely, if a degree of latitude is allowed so as to overcome such strict inconsistencies, we find a multitude of possible stemmata. These stemmata we must distinguish on the basis of some criterion capable of evaluating the likelihood that each would give rise to the manuscripts as they exist.
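
    The idea that two loci can contradict one another, so that no single stemma accommodates both without a repeated change, can be made concrete with a small check. The following is only an illustrative sketch, not part of the method used in this study. It assumes that each locus has exactly two readings and applies the standard four-combination test for binary characters; the function name and the toy manuscripts A to D are hypothetical.

```python
def compatible(locus_x, locus_y):
    """Return True if two two-reading loci can share a tree, each changing once.

    Each locus maps a manuscript siglum to a reading label (0 or 1).
    Two binary characters conflict only when all four combinations of
    readings (0,0), (0,1), (1,0), (1,1) occur among the manuscripts.
    """
    shared = set(locus_x) & set(locus_y)
    seen = {(locus_x[m], locus_y[m]) for m in shared}
    return len(seen) < 4


# Hypothetical loci over four hypothetical manuscripts.
x = {"A": 0, "B": 0, "C": 1, "D": 1}  # readings group AB against CD
y = {"A": 0, "B": 1, "C": 0, "D": 1}  # readings group AC against BD
z = {"A": 0, "B": 0, "C": 0, "D": 1}  # readings group ABC against D

print(compatible(x, y))  # False
print(compatible(x, z))  # True
```

    Loci with more than two readings, as in a real collation, need a more general test; the sketch is meant only to show what "inconsistent with at least one locus" amounts to in the simplest case.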

    Thus, a variety of difficult problems, theoretical and computational, inhere in the task of mechanically constructing a stemma, and they are not problems which classicists were likely to attack on their own. Fortuitously, however, development had simultaneously been taking place within the biological disciplines of taxonomy and systematics so as to motivate biologists to attempt these same problems. For derivation of the evolutionary relationships of a group of extant specimens was a key part of the emergent study now called cladistics.

    7 West, op. cit., pp. 71-2.

    Biologist Willi Hennig had begun to develop and advocate a strictly phylogenetic approach to arranging organisms.8 Hennig's view, that the evolutionary relationships of organisms formed the best foundation for classifying and systematizing them, was and remains the object of debate.9 Parts of his theory, however, seem to have been adopted or adapted by increasing numbers of systematists throughout the 1970s.

    The cladistic approach seems intuitively obvious, and G.D.C. Griffiths (along with many defenders) insisted that it alone had the advantage of relying on objective fact about the organisms in question (rather than deploying the organisms into classes invented by humans). Griffiths writes, "[Hennig's method] provides the only theoretically sound basis for achieving an objective equivalence between the taxa assigned to particular categories in a phylogenetic system."10 Unfortunately, what seems intuitively obvious can also be deceptively fallacious, and cladistics does have a disingenuous side. It is worth pointing out two objections to the system here, largely so that the reader may see that they do not apply to a textual application of the theory.

    8 Hennig might be called the father of modern cladistics; his work was developed and debated in various publications including

        (1950) Grundzüge einer Theorie der Phylogenetischen Systematik, Berlin.

        (1966) Phylogenetic Systematics, Urbana, Illinois.

        (1971) "Zur Situation der biologischen Systematik," Erlanger Forschungen, R. Siewing ed., Erlangen.

    9 For views on the early intellectual positions in the debates, see Ernst Mayr (1976) Evolution and the Diversity of Life: Selected Essays, Cambridge, Mass., pp. 435-41.

    10 Griffiths, G.D.C. (1972) The Phylogenetic Classification of Diptera Cyclorrhapha with Special Reference to the Structure of the Male Postabdomen, W. Junk, N.V., The Hague.


    First, even if we are granted a thorough knowledge of the evolutionary interrelationships of the specimens in question, no method is thereby presented for determining the level of descent at which class divisions should be made. We are only shown that, having made a choice, we are bound to include and exclude certain specimens.

    Second, given three organisms, A, B, and C, suppose that A and B are similar in form, while C differs greatly from both A and B. Suppose further that A and C are closer evolutionarily to one another than either is to B. In this situation (which is not uncommon in nature) we would be forced under Hennig's system either to class A, B and C all together, or else to class A and C together against B (Figure 1).

    Figure 1

    Neither of these options appeals to our intuition the way that the system at first did. For A and B appear to form a group as against C, and yet this is precisely the classification which we are prohibited from making.

    These two objections, while having much practical import for the classifying of organisms, will clearly be irrelevant when we come to apply this method to manuscripts. First, we needn't classify manuscripts by name (and if we do, we accept that classification as our own production); second, we have no sympathy for similarity of appearance between manuscripts if we have hard evidence that they are unrelated in origin (since it is the origin that is the object of the textual quest).

    These cladistic methods of analysis and classification, even if controversial, prompted research into the creation and evaluation of stemmata (or cladograms) from incomplete and incompatible data. The cladistic approach depends for a starting point on determining the evolutionary relationships of the specimens, and these relationships must be assembled from lists of variations among the specimens. Hence, in a sense, biologists set to work on the problems which had stood in front of Martin West.

    But debate about the philosophical underpinnings of the cladistic methodology did not subside. In 1977, the methodology attracted a defender in University of Michigan classicist and zoologist H. Don Cameron, due to cladistics' evident similarity to established techniques in traditional (i.e., not mechanical) textual criticism.11 Cameron, along with Norman I. Platnick, describes the debate, and situates the two authors in it, thus:

        Recent years have seen an increasing awareness and use among zoological systematists of the theory and methods of phylogenetic analysis (cladistics) developed by Hennig. These methods have been well defended by [E.O.] Wiley from the point of view of Popperian hypothetico-deductive science. Critics, both of the methods themselves and of their application to classification, have not been silent... The purpose of this paper is to point out a fact overlooked during the controversy, namely, that methods analogous to those of Hennig are accepted as the standard tools of analysis in two other fields that resemble phylogenetic systematics in being primarily concerned with constructing and testing hypotheses about the interrelationships of taxa connected by ancestor-descendant sequences.

        The fields referred to are ... textual criticism ... and ... linguistic reconstruction.12

    11 Platnick, Norman I. and H. Don Cameron (1977) "Cladistic Methods in Textual, Linguistic, and Phylogenetic Analysis," Systematic Zoology 26:380-85.

    Cameron and Platnick, writing for an audience of biologists, next summarize the techniques of textual criticism put forth by Paul Maas.13 Differences of technique between biological and textual stemmatics (which Cameron and Platnick view as subordinate to an overarching similarity) are described in moderate detail.14 The paper is intended to provide a critique of a situation within a discipline of biology, but it serves also to indicate that these scholars can recognize and make precise the correspondence between stemma construction and cladistic analysis.

    In a conference concluded in 1983, Cameron again presented his view of textual criticism. The conference had been organized to investigate the biological and cladistic metaphor in other intellectual fields.15 Cameron treated stemmatics, but he did not discuss stemmata as a metaphor from biology, since, as he points out, the stemmatic methods as used in both fields "were developed by classical scholars systematically in the nineteenth century and ... the origins of the method can be found as early as the sixteenth century..."16 Beyond merely recounting the techniques of Maas, Cameron explores the distinction (as far as it impacts his cladistics-stemmatics comparison) between vertical or uncontaminated traditions and horizontal transmissions, those full of "Byzantine, and even ancient, editing and conjecture."17 In the latter cases, cladistic methods give little aid. But in the former, he concludes:

    12 Platnick, op. cit., p. 380.

    13 Maas, P. (1958) Textual Criticism, Oxford; Platnick, op. cit., pp. 381-3.

    14 Platnick, op. cit., p. 384.

    15 "Biological Metaphor Outside Biology" (1982) and "Interdisciplinary Round-Table on Cladistics and Other Graph Theoretical Representations" (1983) symposia at the University of Pennsylvania. Proceedings in Hoenigswald, Henry M. and Linda F. Wiener, eds. (1987) Biological Metaphor and Cladistic Classification, Philadelphia.

    16 Cameron, H.D. (1987) "The Upside-Down Cladogram: Problems in Manuscript Affiliation," in Hoenigswald, op. cit.

        [V]ertical transmission and uncontaminated text tradition make the mechanical application of cladistic methods to reconstruct a single archetype a workable and successful method, with a claim to being scientific...18

    Thus, Cameron argues that, at least in a vertical textual tradition, we ought to be able to use methods from cladistics to derive a stemma and even an archetype.

    At this point, the next move for a textual critic might have appeared obvious: mate West's insight about mechanical production of stemmata with Cameron's insight that cladistics provides the theoretical and algorithmic underpinning for West's operation. That is, use cladistic techniques to attack thorny problems of textual transmission. It is unclear why this approach was not exploited in the 1980s. We might, however, hypothesize a paucity of tools to support such research.

    In the 1980s, three further developments came about which made the project presented herein more practicable.19 One breakthrough was improved DNA sequencing:20 it became possible to put genetic material from various species into an automated process and receive, as output, essentially a collation showing every genetic difference between the samples.21 More abundant data was now available with which cladistic analysis could work.

    17 Cameron, op. cit., p. 238.

    18 ibid.

    19 It is important to note that none of these three developments sprang fully formed from the head of Zeus in the 1980s. It is convenient to describe them here, as their confluence seems to change the research environment at the time, but research on DNA sequencing, parsimony algorithms, and of course computers had a long prior history.

    20 In particular the development of polymerase chain reaction (PCR) duplication of DNA segments.

    21 That is, in the sequenced strands of DNA.


    The second development of this time period was the availability of computers sophisticated enough to compare and evaluate the thousands or tens of thousands of possible cladograms (stemmata) which might result from comparing large numbers of species. That is, computers allowed biologists to overcome that challenge which Maas had identified for textual critics, when he observed that a large number of specimens or witnesses would produce an astronomical number of possible stemmata.22
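
    To give a rough sense of how quickly the number of candidate cladograms grows, the following sketch evaluates a standard combinatorial result: the number of unrooted, fully bifurcating trees on n labelled tips is (2n-5)!!, and each such tree can be rooted on any of its 2n-3 branches. This counts cladograms only, which is a different quantity from Maas's "types of stemma", so the figures are not expected to match his. The function names are illustrative.

```python
def unrooted_tree_count(n_tips):
    """Unrooted, fully bifurcating trees on n labelled tips: (2n-5)!!."""
    if n_tips < 3:
        return 1
    count = 1
    for k in range(3, 2 * n_tips - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count


def rooted_tree_count(n_tips):
    """Rooted trees: one per branch (2n-3 of them) of each unrooted tree."""
    if n_tips < 2:
        return 1
    return unrooted_tree_count(n_tips) * (2 * n_tips - 3)


for n in (4, 5, 11):
    print(n, unrooted_tree_count(n), rooted_tree_count(n))
# 4 3 15
# 5 15 105
# 11 34459425 654729075
```

    Even for eleven specimens, then, an exhaustive comparison by hand is out of the question, which is the practical point of the paragraph above.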

    The last pre-requisite development was software systems to put large quantities of data (whether from DNA or elsewhere) together with the computers. Software to compute likely stemmata involves, at its core, algorithms which have been topics in computer science and mathematics for a half-century or more. Hence, strictly speaking, appropriate software had probably been in development in research universities and corporate labs for some time. But the early 1980s saw the release of packages designed specifically for cladistics, tailored to the needs of practicing biologists, and ready to run on existing microcomputers.

    The present experimental study, described below, is an attempt to establish a stemma for the textual tradition of Sallust's De Coniuratione Catilinae using one such software package, the freely-distributable Phylogeny Inference Package (henceforth PHYLIP).23

    22 Maas, op. cit., p. 47: "If we have four witnesses, the number of possible types of stemma amounts to 250, if we have five, to approximately 4,000, and so on in quasi-geometrical progression."

    23 Felsenstein, J. (1993) PHYLIP (Phylogeny Inference Package) version 3.5c, distributed by the author, Dept. of Genetics, Univ. of Washington, Seattle. See http://evolution.genetics.washington.edu


    Before proceeding to describe the method and outcome of the experiment, it is appropriate to consider two technical objections which textual critics have put forward concerning stemma construction.

    The first objection is one of M.L. West, printed above. West correctly pointed out that any stemma derived by algorithm would be an "unoriented" stemma (or, as the cladists say, an unrooted cladogram).24 That is, the algorithm could determine the branchings of the stemma but could not ascertain which branching belongs at the "top" (in practice, this amounts to identifying the nodes representing the subarchetypes). An unrooted cladogram (Figure 2) can represent several distinct rooted versions (Figure 3). Each rooted cladogram can, in turn, represent several distinct possible phylogenies (Figure 4).25

    Figure 2. Unrooted cladogram. This cladogram shows the relationships of the specimens relative to one another, but does not indicate their relationship to ancestors from which they descend.

    Figure 3. Rooted cladograms. Each of these five rooted cladograms is consistent with the unrooted cladogram above (Figure 2). By postulating the first branching in the descent, the known relationships specify the remainder of the tree. Note, however, that the lengths of branches, and the specimens which might lie on the nodes of the tree, are not indicated.
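
    The relationship between one unrooted cladogram and its several rooted versions can be sketched in a few lines. The example below assumes a hypothetical four-specimen cladogram pairing A with B and C with D (the figures themselves are not reproduced here, so this topology and the node names x and y are assumptions). A bifurcating unrooted tree on four tips has five branches, and placing the root on each branch in turn gives five rooted cladograms, the same number the Figure 3 caption mentions.

```python
def rootings(edges):
    """Yield one rooted tree (nested tuples of tip names) per branch."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    def subtree(node, parent):
        # Everything reachable from node without passing back through parent.
        children = [n for n in neighbours[node] if n != parent]
        if not children:  # a tip
            return node
        return tuple(sorted((subtree(c, node) for c in children), key=repr))

    for a, b in edges:  # place the root in the middle of branch a-b
        yield tuple(sorted((subtree(a, b), subtree(b, a)), key=repr))


# Hypothetical unrooted cladogram: tips A, B join at x; tips C, D join at y.
edges = [("A", "x"), ("B", "x"), ("x", "y"), ("C", "y"), ("D", "y")]
rooted = list(rootings(edges))

print(len(rooted))  # 5
for tree in rooted:
    print(tree)
```

    Choosing among these rootings is exactly the step that West says requires judgment about the variants rather than mere comparison of them.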

    24 West, op. cit., pp. 71-2.

    25 Humphries, C.J. and P.H. Williams (1994) "Cladograms and Trees in Biodiversity," Models in Phylogeny Reconstruction, Robert W. Scotland, Darrell J. Siebert, and David M. Williams, eds., Oxford, pp. 336-7.


    Figure 4. Phylogenetic Trees. All four of these phylogenetic trees are compatible with a single cladogram above (Figure 3.ii). Note that schemata involving direct descent are included.

    West's objection is legitimate. It should not, though, prevent us from pursuing automated stemma construction, for several reasons. First, the unrooted cladogram is, if accurate, a great advance over no stemma and an even greater advance over an incorrect stemma. Second, it may in many cases be tolerably easy to root the cladogram properly, thus producing a traditional stemma, based on our knowledge of the dates and locales of origin for the various manuscripts. Third, computer methods are particularly useful in the frequent circumstance that the collation is not uniquely compatible with any single proposed stemma. In such cases, we shall be happy to have an analysis of the entire collation, a most-likely stemma, and a mathematical justification for excluding many other stemmata.

    The second objection is one advanced by Roger David Dawe in studies of the traditions of Aeschylus and Sophocles.26 Dawe's contention is that there is so much horizontal transmission in the traditions for these authors, as indicated by numerous true readings appearing in dependent manuscripts though absent in other manuscripts, as to invalidate the stemmatic approach.27 Dawe confronts the methodology of Pasquali (and consequently confronts my method, which derives partly through Pasquali, Maas, and West) at least in the case of individual authors such as Aeschylus. He writes:

    26 Dawe, R.D. (1964) The Collation and Investigation of Manuscripts of Aeschylus, Cambridge and (1973) Studies on the Text of Sophocles, 2 vols., Leiden.

    27 Cameron, op. cit., p. 237.

        We believe that the fact of unique preservation has been demonstrated [in the Aeschylean case]; consequently the fault must lie with the theory of descent, and we conclude that the ... stemma does not after all represent, even in the simplest form, the true character of the tradition. ...

        It seems clear that the picture presented by the manuscripts is one of a recension so entangled that it is utterly impossible for us to unravel the threads.28

    Cameron summarizes the problems which Dawe's assertion poses to any method such as the one employed in the present study:

        Dawe denies radically that archetypes can be reconstructed, but he necessarily pays a theoretical price for his conclusion...

        If there are no archetypes or stemmata, and if true readings are uniquely preserved in any manuscript regardless of its stemmatic position, we are then thrown back to a procedure of evaluating readings which is unaided by considerations of outgroup comparison, reconstruction of an archetype, or to push the concept to its logical conclusion, without the consideration of manuscript authority of any kind.29

    In order that we may avoid an imbroglio in Aeschylean Textkritik, we might concede Dawe's assertion to hold true in certain specific textual traditions. But we need not suppose that any particular number of such traditions invalidates the deductive stemmatic method in general. Hence, in the absence of any argument against stemmatic representation of the Sallustian tradition, we can proceed to analyze it via the cladistic approach.

    Experimental Procedure

    28 Dawe, op. cit. 1964, pp. 157-8.

    29 Cameron, op. cit., pp. 237-8.


    In this study, the manuscripts containing the De Coniuratione Catilinae and the De Bello Iugurthino were examined, as these two works are found together in one set of manuscripts. Absent access to a complete collation, an adapted collation was formed by the following method. Eleven manuscripts were selected from those included in L.D. Reynolds's Oxford text of 1991 (Table 1).

    Siglum   Manuscript

    A        Parisinus 16025
    B        Basileensis
    C        Parisinus 6085
    D        Parisinus 10195
    F        Hauniensis Fabricianus
    H        Berolinensis Phillippsianus 1902
    K        Vaticanus Palatinus 887
    N        Vaticanus Palatinus 889
    P        Parisinus 16024
    Q        Parisinus 5748
    V        Vaticanus 3864 (Florilegium Vaticanum)

    Table 1

    Beginning at Catilina 1.1, the first 300 loci were selected which contain variants in one or more of the above eleven manuscripts.30 The adapted collation was then formed by listing, for each locus, the groups of manuscripts which exhibited the same reading. The collation then consisted of a sequence of rows such as appear in Table 2.

    Locus   Group 1   Group 2   Group 3   Group 4

    [rows 1-11]
    12      ABCDFNP   HK        V
    13      C         N         A         BDFHKPV
    [rows 14-300]

    Table 2
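
    As an illustration of how rows of this adapted collation might be held in a program, the sketch below parses the two rows shown in Table 2 into sets of sigla. The whitespace-separated row format and the function names are assumptions made for the example; they are not a description of the files actually used in the study.

```python
SIGLA = "ABCDFHKNPQV"  # the eleven manuscripts listed in Table 1


def parse_row(row):
    """Split a row such as '12 ABCDFNP HK V' into (locus number, groups)."""
    locus, *groups = row.split()
    return int(locus), [set(group) for group in groups]


def unattested(groups):
    """Sigla that appear in no group at this locus."""
    return set(SIGLA) - set().union(*groups)


rows = ["12 ABCDFNP HK V", "13 C N A BDFHKPV"]
collation = dict(parse_row(row) for row in rows)

for locus, groups in collation.items():
    print(locus, [sorted(g) for g in groups], sorted(unattested(groups)))
```

    Run on these two rows, the sketch reports that Q belongs to no group at either locus. The text here does not say why, and the sketch only records the gap rather than guessing at a reading.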

    30 To be more precise, in keeping with the biological metaphor, only the latest markings in the manuscripts were collated. Thus, as corrected markings were ignored, loci containing variants in earlier hands are not included in the 300. The selected loci do, however, include every variant in the last hand (at each locus) of the appropriate manuscript from Catilina 1.1 to 52.35.


    To analyze the collation, the DNAPARS component of the PHYLIP package was to be employed, because it is the only component of PHYLIP which can process multi-state discrete characters (albeit by marking the states with DNA labels).31 DNAPARS is a program which compares DNA base sequences for a set of specimens and evaluates various possible cladograms on the basis of a parsimony criterion.

    A parsimony criterion favors arrangements of the specimens which require the

    fewest character state changes in the course of the specimens' evolution. For example, a

    phylogeny which requires a specimen possessing a DNA sequence of AAA to give rise

    to one possessing ACT and, thereafter, requires the specimen possessing ACT to give

    rise to one possessing the sequence AAA again would not be favored. This proposed

    phylogeny requires two bases to change state (AA to CT) and later to change again

    (back to AA), involving four base changes overall. Instead, a parsimony criterion might

    favor an arrangement where one specimen featuring the AAA sequence gives rise to the

    other with the AAA sequence, and the latter gives rise to that possessing the ACT

    sequence.32

    This latter phylogeny requires only a single change of two bases, or two

    character state changes overall, and is thus more parsimonious than the former.
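
    The arithmetic of the preceding example can be checked with a short sketch. The function names below are illustrative only and do not come from PHYLIP; the sketch simply counts the positions at which successive sequences in a line of descent differ.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def chain_changes(chain: list[str]) -> int:
    """Total character state changes along a simple line of descent."""
    return sum(hamming(parent, child) for parent, child in zip(chain, chain[1:]))


# The disfavored arrangement: AAA gives rise to ACT, which gives rise to AAA.
disfavored = ["AAA", "ACT", "AAA"]
# The arrangement a parsimony criterion might favor: AAA, AAA, then ACT.
favored = ["AAA", "AAA", "ACT"]

print(chain_changes(disfavored))  # 4
print(chain_changes(favored))     # 2
```

    A real parsimony program evaluates branching trees rather than simple chains, so this is a deliberately reduced illustration of the criterion, not of the search.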

    Further assumptions involved in the parsimony method, and differing views

    about them, are listed (or references provided) by Felsenstein.33

    In order to evaluate the collation using DNAPARS, the collation data had to be

    converted from the form illustrated in Table 2 to a form wherein manuscripts grouped

    31 See "Frequently Asked Questions," Felsenstein, op. cit.

    32 This phylogeny might be favored because one can observe other possible phylogenies with only two

    character state changes. Such phylogenies would be equally parsimonious with the one given, and hence

    would be judged equally desirable by a parsimony criterion.

    33 "DNAPARS -- DNA Parsimony Program" (documentation) in Felsenstein, op. cit.


    by a shared reading were each assigned a particular DNA base abbreviation (A, C, G, T,

    or -, which indicates a fifth state to DNAPARS). The DNA base label assigned to a

    manuscript at a particular locus would correspond to the group in which that

    manuscript resided at that locus.
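
    The conversion just described can be sketched as follows. This is not the MSS2DNA source code; the function name is hypothetical, and the "?" used for a manuscript that appears in no group at a locus is an assumption made here for illustration. The assignment of labels in column order (Group 1 to A, Group 2 to C, and so on) does appear consistent with sites 12 and 13 as printed in Appendix B.

```python
LABELS = "ACGT-"  # one label per reading group, taken in column order


def row_to_labels(groups, manuscripts, missing="?"):
    """Translate one collation row into one base label per manuscript.

    groups      -- strings of sigla, one string per reading group
    manuscripts -- the sigla to report, in output order
    missing     -- label for a siglum found in no group (an assumption)
    """
    if len(groups) > len(LABELS):
        raise ValueError("at most five reading groups can be labeled per locus")
    label_of = {}
    for label, group in zip(LABELS, groups):
        for siglum in group:
            label_of[siglum] = label
    return "".join(label_of.get(m, missing) for m in manuscripts)


manuscripts = "PABCNKHDF"            # order of the sequences in Appendix A
row_12 = ["ABCDFNP", "HK", "V"]      # Table 2, locus 12
row_13 = ["C", "N", "A", "BDFHKPV"]  # Table 2, locus 13

print(row_to_labels(row_12, manuscripts))  # AAAAACCAA
print(row_to_labels(row_13, manuscripts))  # TGTACTTTT
```

    Reading the output column by column, rather than row by row, yields the per-manuscript "strands" discussed next.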

    Each row of the collation would yield one DNA base label for each manuscript;

    thus the 300 loci in the collation would produce a 300-base DNA "strand" for each of

    the eleven manuscripts. The creation and data entry of these 3,300 base labels was

    beyond what could easily be accomplished manually. To perform the task, a custom

    application program was written (MSS2DNA) which allows the entry of the collation in

    table form, performs the translation to sequences of DNA base labels for the various

    manuscripts, and mounts the results on the Microsoft Windows clipboard (Figure 5).34

    From the clipboard, the DNA data for the various manuscripts was assembled

    with a text editor into the file format required by DNAPARS, as documented by

    Felsenstein.35

    In order to facilitate comparison to Reynolds' stemmatic work on the

    Sallust manuscripts, and because they represent only parts of the text, data for

    manuscripts V (a florilegium) and Q were removed from the data file, leaving the nine

    manuscripts for which Reynolds had published a stemma. In removing V and Q, some

    27 (i.e., 9%) of the loci were rendered irrelevant, although they remain in the set.36
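
    The assembly step, performed by hand with a text editor in this study, might be sketched as below. The layout follows the PHYLIP sequential format as used in Appendix A: a first line giving the number of sequences and the number of sites, then one line per sequence with the name padded to ten characters. The function name and the toy four-site data are hypothetical.

```python
def phylip_infile(sequences: dict[str, str]) -> str:
    """Build the text of a sequential-format PHYLIP input file."""
    lengths = {len(seq) for seq in sequences.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same number of sites")
    n_sites = lengths.pop()
    lines = [f"{len(sequences):5d}{n_sites:6d}"]
    for name, seq in sequences.items():
        if len(name) > 10:
            raise ValueError(f"name longer than ten characters: {name}")
        lines.append(name.ljust(10) + seq)  # name field is exactly 10 wide
    return "\n".join(lines) + "\n"


# Toy data only: four sites for two of the manuscripts.
toy = {"P.Par16024": "AAAC", "A.Par16025": "CCAC"}
print(phylip_infile(toy), end="")
```

    Dropping manuscripts such as V and Q before assembly amounts to removing their entries from the dictionary; the number of sites is unaffected.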

    34 This program, while not elegant, is publicly available (with source code) so that others may

    independently conduct investigations or repeat and verify the present investigation. The program,

    MSS2DNA, runs on 32-bit Microsoft Windows platforms (Windows 95, Windows 98, Windows NT) and

    may be downloaded in archived (ZIP) form at http://homer.bus.miami.edu/~adbreind/mss2dna.zip

    35 "Molecular Sequence Programs" in Felsenstein, op. cit.

    36 These data points represent loci at which only Q and/or V differed from the consensus of remaining

    manuscripts. These sites can be identified from Appendix B, in the table marked "steps in each site," as

    sites where the table shows 0 steps. That is, the remaining manuscripts show consensus at the site, so no

    character state changes are required for any phylogenetic arrangement of the manuscripts.
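
    The effect described in note 36 can be illustrated with a small sketch that finds the sites at which every retained manuscript carries the same label. The data below are invented four-site sequences, not the Sallust collation, and the function name is hypothetical.

```python
def constant_sites(sequences: dict[str, str]) -> list[int]:
    """Return the 1-based sites at which all sequences share one label."""
    columns = zip(*sequences.values())
    return [i for i, col in enumerate(columns, start=1) if len(set(col)) == 1]


# Toy data: at site 3 only V and Q depart from the other manuscripts.
full = {"P": "AAAC", "A": "ACAC", "B": "AAAC", "V": "AAGC", "Q": "AATC"}
kept = {name: seq for name, seq in full.items() if name not in ("V", "Q")}

newly_constant = sorted(set(constant_sites(kept)) - set(constant_sites(full)))
print(newly_constant)  # [3]

# The proportion reported in the text: 27 of 300 loci.
print(f"{27 / 300:.0%}")  # 9%
```

    Such sites contribute zero steps under any arrangement of the remaining manuscripts, which is why leaving them in the data file does not affect the result.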


    The completed DNAPARS file appears in this report as Appendix A: Infile.

    The DNAPARS program was then run, using this file as its data source.37

    Figure 5. MSS2DNA. The columns collect the manuscripts which share a reading at each locus. The column headings indicate the DNA base labels which will be attached to the manuscript groups.

    DNAPARS produced the output file which appears in this report as Appendix B:

    Outfile, and which includes the preliminary phylogenetic tree (Figure 6). DNAPARS

    was then run on the input data several more times in order that other possible most

    parsimonious trees might be discovered. No other most parsimonious trees were found.

    37 The 386-Windows precompiled PHYLIP executables were used throughout. The program options

    selected for DNAPARS were all defaults with the following exceptions: Randomize order was selected,

    with a seed of 69 (=4*17+1) and 100 permutations of the input rows; terminal type was set to (none); input

    sequences interleaved was set to No; and all printing options for the output were selected.


    One most parsimonious tree found:

                               +--F.Hauniens
                            +--8
                         +--7  +--D.Par10195
                         !  !
                      +--6  +-----H.Beroline
                      !  !
             +--------5  +--------K.VatP_887
             !        !
             !        +-----------N.VatP_889
          +--4
          !  !                 +--C.Par_6085
          !  !              +--3
        --1  +--------------2  +--B.Basileen
          !                 !
          !                 +-----A.Par16025
          !
          +-----------------------P.Par16024

    remember: this is an unrooted tree!

    Figure 6
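
    The "steps in each site" figures of Appendix B can be reproduced in miniature. The sketch below applies Fitch's standard counting procedure for unordered characters to the tree of Figure 6, using only the first three sites of Appendix A. It is offered as an illustration of how the minimum number of changes at a site may be counted on a fixed tree; it is not a claim about how DNAPARS is implemented internally, and the nested-tuple encoding of the tree is an assumption of the sketch.

```python
def fitch(node, states):
    """Return (candidate state set, minimum changes) for a subtree.

    node   -- a siglum (tip) or a pair of subtrees (internal node)
    states -- mapping from siglum to its label at the site in question
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left_set, left_steps = fitch(node[0], states)
    right_set, right_steps = fitch(node[1], states)
    common = left_set & right_set
    if common:  # the children can agree, so no change is needed here
        return common, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


# Figure 6, written from node 1 downward; the rooting is arbitrary here.
TREE = ((((((("F", "D"), "H"), "K"), "N"), (("C", "B"), "A"))), "P")

# First three sites of each sequence in Appendix A.
FIRST_SITES = {
    "P": "AAA", "A": "CCA", "B": "AAC", "C": "AAA", "N": "CAA",
    "K": "ACC", "H": "AAC", "D": "CAC", "F": "CAC",
}

steps = []
for site in range(3):
    labels = {siglum: seq[site] for siglum, seq in FIRST_SITES.items()}
    steps.append(fitch(TREE, labels)[1])

print(steps)  # [3, 2, 2], as in the first row of "steps in each site"
```

    Summing such counts over all 300 sites gives the total length of the tree, the quantity a parsimony search seeks to minimize.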

    In order that the output from this program might be compared to Reynolds'

    published stemma for Sallust, and in recognition of Reynolds' judgments about the

    quality of the textual variants, the tree was re-oriented using PHYLIP's RETREE

    program. Since manuscripts F, D, H, K, and N formed a monophyletic group and

    because they had been collected in Reynolds' presentation of the Sallust stemma, the

    node representing their common ancestor was selected for the outgroup (or

    subarchetype). Note that although the tree was re-oriented no changes were made to

    the genealogical relationships inferred between the manuscripts by DNAPARS.38

    The

    transcript of the RETREE session appears in this report as Appendix C: RETREE

    38 Re-orientation in effect asserts likely positions for the subarchetypes. As described above, West had

    indicated that such a step would be required, and that it should be conducted using a critic's evaluation

    of the variants.


    Session.39

    The session also produced as output a new tree file. This tree file was used as

    input to PHYLIP's DRAWGRAM program, which constructed a graphical

    representation of the stemma (Figure 7).

    Figure 7
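
    The claim that re-orientation leaves the inferred relationships untouched can be checked mechanically. One way, sketched here under the assumption that each tree is written as nested pairs, is to compare the sets of bipartitions ("splits") of the manuscripts induced by the internal branches: two rootings of the same unrooted tree induce identical splits. The function names are illustrative and are not part of RETREE.

```python
def leaves(node):
    """Return the set of sigla found beneath a node."""
    if isinstance(node, str):
        return frozenset([node])
    return leaves(node[0]) | leaves(node[1])


def splits(tree):
    """Return the non-trivial bipartitions induced by a tree's branches."""
    everything = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaves(node)
        rest = everything - clade
        if len(clade) > 1 and len(rest) > 1:
            found.add(frozenset([clade, rest]))
        for child in node:
            visit(child)

    visit(tree)
    return found


# Orientation printed by DNAPARS (Figure 6), read from node 1.
dnapars = ((((((("F", "D"), "H"), "K"), "N"), (("C", "B"), "A"))), "P")
# Orientation after choosing the ancestor of F, D, H, K, N as outgroup.
retree = ((((("F", "D"), "H"), "K"), "N"), ((("C", "B"), "A"), "P"))

print(splits(dnapars) == splits(retree))  # True
print(len(splits(dnapars)))               # 6
```

    A difference in the two sets would indicate that a genealogical relationship, and not merely the orientation, had been altered.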

    For the sake of comparison, Reynolds' stemma is reproduced (Figure 8).40

    Figure 8

    As can be observed from the computer-generated tree and Reynolds' tree

    (Figures 7 and 8), they are nearly identical modulo inversion. There are, however, two

    39 The program options selected for RETREE were all defaults with the following exception: no graphics

    was selected.

    40 Reynolds, L.D., ed. (1991) C. Sallusti Crispi: Catilina, Iugurtha, Historiarum Fragmenta Selecta, Appendix

    Sallustiana, Oxford, p. xi.


    differences. First, Reynolds associates N and K more closely with each other than with

    H, D, or F, while DNAPARS detected no such difference in proximity. Second,

    Reynolds associates A more closely with P than with B or C, while DNAPARS indicated

    no such closer affiliation. This latter distinction can in fact be attributed to differences in

    the text being collated, rather than to differences between the analyses of Reynolds and

    DNAPARS (see below).

    Analysis

    Since several hundred rearrangements of the order of the DNA "strands"

    produced no further most parsimonious trees, it seems reasonable to suppose that the

    manuscript collation data specify a unique most parsimonious tree.41

    The existence of a

    unique most parsimonious tree is itself an indication that the present method may be

    productive, as it obviates the need for a human to insert prejudices into the analysis, by

    selecting one cladogram from a list of many. The similarity of the results derived

    through Reynolds' analysis to those derived through the parsimony analysis can, in

    light of the novelty of the approach, only be called stunning.
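
    The figure of "over 360,000" orderings mentioned in note 41 is simply 9 factorial, and may be set beside the number of distinct unrooted bifurcating trees on nine tips, given by the standard formula (2n-5)!!. The latter count is not drawn from the present study and is supplied only for scale; the function names are illustrative.

```python
import math


def input_orderings(n: int) -> int:
    """Number of ways to order n manuscripts in the input file."""
    return math.factorial(n)


def unrooted_topologies(n: int) -> int:
    """(2n-5)!!: unrooted bifurcating trees on n labeled tips, n >= 3."""
    count = 1
    for k in range(3, 2 * n - 4, 2):  # odd factors 3, 5, ..., 2n-5
        count *= k
    return count


print(input_orderings(9))      # 362880
print(unrooted_topologies(9))  # 135135

# Share of all orderings visited by one run of 100 random permutations.
print(f"{100 / input_orderings(9):.4%}")  # 0.0276%
```

    The small share of orderings sampled is the basis of the reservation expressed in note 41.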

    This similarity is further strengthened when we account for one of the two

    indicated differences between the stemmata. As described above (see n. 30), in keeping

    with the metaphor of biological evolution, only the latest extant markings (corrections,

    not including deletions) on each manuscript were collated. Thus, where the first and

    41 This supposition is based on Felsenstein's implicit assumption that a relatively small number of

    rearrangements of the input data ought to yield multiple most parsimonious trees if they exist. Such an

    assertion seems mathematically suspect, considering the large number of possible permutations of, say,

    nine manuscripts (over 360,000). On this matter, however, I defer to Felsenstein's knowledge as a

    specialist.


    second hands of A differed, the second hand was read for the collation instead of the

    first. Reynolds naturally constructs his stemma indicating the position of the original A

    text. But he notes that "Secunda manus (A2) librum lectionibus instruxit ex aliquo stirpis

    [= B, C] codice petitis." That is, where readings exist in A2, they come from the B-C

    branch, a fact which DNAPARS appears to have recognized in asserting the A-A2

    manuscript to descend both from an ancestor of P and also from a closer ancestor of B

    and C. To test this hypothesis, we would merely need to modify the collation to reflect

    only A-A1

    readings, and then see where DNAPARS places the manuscript.

    Having taken the discrepancies into account, it seems that both the human and

    the machine-assisted analysis derive results from the same underlying pattern among

    the manuscript readings. This study, then, preliminarily suggests that the parsimony

    analysis technique could substantively advance knowledge of textual transmission.

    Furthermore, the parsimony analysis can indicate the readings likely to appear in

    the archetype and subarchetypes, in order that they most efficiently give rise to the

    extant manuscripts. A detailed examination of such archetype reconstruction is beyond

    the scope of this study. But ambitious readers should note that Appendix B to this

    paper (i.e., the DNAPARS output) provides the readings likely to appear at various

    nodes in the cladogram for every locus studied. On Reynolds' view of the transmission,

    the archetype (his ) ought to bear the readings given for node 4.
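
    Readers consulting the node states in Appendix B will meet letters other than A, C, G, and T. The sketch below assumes that these follow the standard IUPAC nucleotide ambiguity codes (M for "A or C," and so on) and translates a symbol back into the candidate reading groups of the collation, numbering Group 1 as A, Group 2 as C, Group 3 as G, and Group 4 as T. Symbols outside that table, such as "?" and "-", are deliberately left unresolved. The function name is hypothetical.

```python
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "AC", "R": "AG", "W": "AT", "S": "CG", "Y": "CT", "K": "GT",
    "V": "ACG", "H": "ACT", "D": "AGT", "B": "CGT", "N": "ACGT",
}
GROUP_NUMBER = {"A": 1, "C": 2, "G": 3, "T": 4}


def candidate_groups(symbol: str):
    """Return the reading groups a state symbol may stand for, or None."""
    bases = IUPAC.get(symbol.upper())
    if bases is None:  # "?", "-", "." and anything else stay unresolved
        return None
    return [GROUP_NUMBER[base] for base in bases]


# States printed for node 1 at sites 11-14 of Appendix B: "MATM".
print([candidate_groups(s) for s in "MATM"])
# [[1, 2], [1], [4], [1, 2]]
```

    An ambiguous symbol thus marks a locus at which more than one reading would serve equally well at that node, and where a critic's judgment would still be required.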

    Future Research


    The future presents a number of immediate challenges and possibilities for the

    cladistic analysis of texts using parsimony techniques. The obvious methods through

    which the procedure may be tested include examining a variety of texts, as well as

    using full collations (in place of collations built from apparatus critici) so as to avoid

    dependence on one editor's opinion of what may be viable manuscript readings.42

    If positive results are indicated, parsimony analysis might be deployed to assist

    the textual critic in determining the relationships of texts, and in reconstructing

    archetypes, for new publications. Perspectives may also be presented for re-evaluating

    existing dogma about traditions which have not been recently examined.43

    In the

    classroom, the use of graphical interactive parsimony programs, which allow one to

    manipulate stemmata on-screen and immediately to observe the consistencies or

    inconsistencies thus fostered, may facilitate integration of stemmatics into the standard

    classics curriculum.44

    Lastly, literary theorists may wish to ponder the existence of

    deeper metaphors connecting the enzymes and mutations of DNA replication with the

    corresponding verbal agents and scribal errors giving rise to many of our textual

    variants.

    42 "Readings which must quite certainly be eliminated have no place under the text," writes Maas (p. 23),

    thus giving editors license to omit even from the app. crit. those readings deemed eliminanda.

    43 We may suppose that parsimony analysis will be effective in evaluating relationships between

    manuscripts of texts in modern, as well as ancient, languages.

    44 MacClade (distributed by Sinauer Associates) is one such program. Many candidates which might be

    useful for heavy-duty analysis as well as pedagogy are described by Felsenstein at

    http://evolution.genetics.washington.edu/phylip/software.html


    Appendix A: Infile

    9 300

    P.Par16024AAACCCCCCCAATACCCCCCAACGCCCACACCCAACCACCACCCCCACGACGGAAACCCCCCGCCCCCCACCACACCACACCCCCCACCCAACCCATACCACCAAACACACGCCCCCGCCACACGACGGCACACACCCCACACAACAC-

    CTCACCCACAACCCCCCCCACCACCCAACAAACCGCCCAAACCCACACCCCCACCACCCAACCCCCACCCCACACCCCCACAAACCCCCCACCTACCCCCCCCCACCAGCCCCACCAACCAAACCAACCCCCGAACCACCCCCACCCCCCAA.Par16025CCACACCCCACAGCCCCCCCACCGCACCCCCCACCCCCCCCCCCCCCCATCGAACCCGCCACGCCCCCCGCAACCCCCCCCCCCCCCACACTCCCCACCACCCACACCCCCGACCCAAACCAAAAACGGCCCCCCCCCCACACCCCCCCCAC-CCACCCAAAACACCCACCCCCCCACACCCAACCAACAAAACCACCCCCCCCAACACCCCCCACCACCCCCCCACACCCCCCAAACAACCCACCCCCACCCCCCCGCCCCACCCACCACCCCACACCCCGACCCCACCCCGACACCGCB.BasileenAACCCCCCCAAATCACCCCCAGCGCCCACACAACCCCCACACCCCCGCCCCGGAACCCCCCCCCACCACCCCCCCCCCCCCCCCCCCACCATACCCAACGAACAAACCCCCGAACCACACACAACCCGGACACCGCCCCACACCCCCCACCACCCACCCCCAACCCCCACACCCCACCAGACAACCACCCCCCCCACCCCCCCCAACCCCCCCCCCCCAACCCCCCCCCCCCCCCCGCCCACCCCCCCCACCACCCCGCACCACCCACCGCCCCACCCCCCGACCCCCCCCCCACACCGAC.Par_6085AAACCCCCCAAAACACCCCCAGCGCACCCCCAACCCCCCCCCCCCCGCACCGGACCCCCCACACACCACTCAACCCCACACCCCCCCACACTCCCCTACGCCCACACCCCCGCACCACACAAAACCCGGCCACCCCCCCACACCCCCCCCCACCCACCCCCAACCCCCCCCCCCCCCCAGCCACCCACCCCCAACACCCCCCCCACCACCCCCCACCCCAACACACCCCCCCCCCCGCCCCCCCCCCCCACCCCCCCGCCCCACCCACCACCCCACCCCCCGACCCCACCCCGACACCGCN.VatP_889CAACCCAACCCACACCACAAACCGCCCCACCCCCCAAAACACCCCCAAGGCAGCCCCACACAAACCCCCACCACCC

    CCCCCAACCCCACCCGCCCCACAAAGAACCCACCCGCCCCCGCCCCCAGCAGGCGGCCGACACCACCCCCCAGCGCCCCCCCCCCCAACCCCCCCCCCCCCCAGACAGCACACACCCAACACCCCCCCAACACCCCCACACCCCACCCCCCCCCCCAACCACAACAACCACCCCCCCCCCAGCCCCCCCAACCACCCCCCCCCCCCCCCACCCCCCGCAGCCGCK.VatP_887ACCCCCCACCCCTCCAACCAAGCGCCCCCCCCCCCAAAACACCACAAAGGAGGCCCCCCCACGACCCCCGCCGCCCCCCCCCCCCACCCCGGACCCCACAAGACCCACCCCACACACGCCACCCGCAGAGGGCCCACACCACCCACCATCCCTCCACCCCCCACCACACACCCCCAACAGCCAGCCCCCCCCCAACCCCACCACAACCCACAAACCCCCCCCCCCCCCCCCCCACCACCCAACACACCCCCACCCAAGCCCCACCACCCCCACCCCCCCCCGCCCACCACCCACCACCGCH.BerolineAACCCACCACACTCCCCCCACGACCCCCACCCCCCCACACGCCCCAAAGCCGGACCCCACCGGCCCCACCACCCCCCCCCCCCCCACACCCTACCCGCCTAGCCCAACCCACCCACCGCCCCCAGCCGGACTCCCCAACCCCCCGCCATCCC-CCACCCCCAACCACACCCCACCACAAGACAGACCCCCCCCCCACCCCCCAAGCCCAAAAACAGACCAACCCCCACCCCCCCCCCCCCAC-GCCCCCCCACCCCCACACCCCCACCAGCCCCACCCACCGCCCCCCCCCAACCTCCGCD.Par10195CACACACCACCATCCACACAATCACCACCCACCCCCAAACGACAAAACGGACCCCCCCCCCAGACAAACACCCACCCCCCACAACACCCCTCAAACGCCAAGCCCCACACAACCAACGCCCCCCGCACCGGGCCACAAACCCCCCCCAGCCCGCCCCACCCACCCCAACAACAACGAAAGCAAGCCCCCCCCCACCCAACCCAACCCAAAAAACCGACCAACCACCACCCCCCCCCCCCCACGCCCACCCCACCACCCCACACCCACCAGCAAACACCCAAGCCCCCCCACCCCCACCAC

    F.HauniensCACACACCCACATCCAAACCAGCAACACCCACCCCCAACAAAACAAACGGACCCCCATCCCACACAAACACACACACCCCACAACCCACCTGAACCGGCAAGCCCCCCACCACAAACGCACCCCGCCAGTGTCCGCACACCCCACCCAGCCCGCCCCACCCACACAAACACCAAAGAAAGCAAGCCCCACCCCACGCAACACAACCCCAAAACCCCAACACCCACCACCCCCCCCCCACCACGCCCACCACAACACCCCACCCCCACCAGCCAAGACCAAAGCCACCCCACCCCC-ACAC


    Appendix B: Outfile

    DNA parsimony algorithm, version 3.572c

    Name         Sequences
    ----         ---------

    P.Par16024   AAACCCCCCC AATACCCCCC AACGCCCACA CCCAACCACC ACCCCCACGA CGGAAACCCC
    A.Par16025   CC..A....A C.GC...... .C...A.C.C ..ACC..C.. C.....C.AT ..A.CC.G..
    B.Basileen   ..C......A ...CA..... .G........ .AACC..CA. ......G.CC .....C....
    C.Par_6085   .........A ..ACA..... .G...A.C.C .AACC..C.. C.....G.AC ....CC....
    N.VatP_889   C.....AA.. C.C...A.AA .C.....CAC ...CCAA.A. .......A.G .A.CCC.A.A
    K.VatP_887   .CC....A.. CC.C.AA..A .G.....C.C ...CCAA.A. ...A.A.A.G A..CCC....
    H.Beroline   ..C..A..A. .C.C.....A CGAC...CAC ...CC.ACA. G....A.A.C ....CC..A.
    D.Par10195   C.CA.A..A. C..C.A.A.A .T.A..AC.C A..CC.A.A. GA.AAA...G ACCCCC....
    F.Hauniens   C.CA.A...A C..C.AAA.. .G.AA.AC.C A..CC.A..A .AA.AA...G ACCCCCAT..

    P.Par16024   CCGCCCCCCA CCACACCACA CCCCCCACCC AACCCATACC ACCAAACACA CGCCCCCGCC
    A.Par16025   A........G .A..C..C.C ......CA.A CT...CAC.A C...C..C.C ..A...AAA.
    B.Basileen   ..C.A..A.C ..C.C..C.C ......CA.. .TA..CA..G .A.....C.C ..AA..ACA.
    C.Par_6085   A.A.A..A.T .A..C..... ......CA.A CT...C...G C...C..C.C ...A..ACA.
    N.VatP_889   .AAA...... ....C..C.C .AA...CA.. CG...CACAA .GA.CC...C ..........
    K.VatP_887   A..A.....G ..G.C..C.C .....AC... GGA..CC..A .GACCCAC.C .A.A.A....
    H.Beroline   .G.....A.C A.C.C..C.C .....ACA.. CTA..CGC.T .G.CC.AC.C AC..A.....
    D.Par10195   .A.A.AAA.. ..CAC..C.C A.AA.AC... TCAAACGC.A .G.CCCACAC AA..AA....
    F.Hauniens   .ACA.AAA.. .ACACA.C.C A.AA..CA.. TGAA.CGG.A .G.CCC.CAC .A.AAA...A

    P.Par16024   ACACGACGGC ACACACCCCA CACAACAC-C TCACCCACAA CCCCCCCCAC CACCCAACAA
    A.Par16025   CA.AA..... C.C.C..... ...CC.C.C. A.-..AC.C. AAA.A..... .C...C...C
    B.Basileen   ...ACC...A CAC.G..... ...CC.C.A. CAC..AC.CC .AA....... AC....C..G
    C.Par_6085   .A.ACC.... CAC.C..... ...CC.C.C. CAC..AC.CC .AA.....C. .C...CC..G
    N.VatP_889   C.CA.CA... GGC.GA.A.C AC.CC.CAG. G.C...C.CC ..AA....C. .C...CC..G
    K.VatP_887   ..C..CA.AG GGC.CA.A.C AC.C..CAT. C.T..AC.CC ..A..A.ACA .C.......G
    H.Beroline   C.CA.C...A CTC.C.AA.C .C.CG.CAT. C.-..AC.CC .AA..A.AC. .CA...CA.G
    D.Par10195   C.C..CACCG GGC...AAAC .C.CC.CAG. C.G...CACC .A....AACA ACAA.G.A.G
    F.Hauniens   C.C..C.A.T GTC.G.A.AC .C..C.CAG. C.G...CACC .A.A.AAACA .CAAAG.A.G

    P.Par16024   ACCGCCCAAA CCCACACCCC CACCACCCAA CCCCCACCCC ACACCCCCAC AAACCCCCCA
    A.Par16025   C.AA..A.C. AAAC...... .C...A.ACC ....AC.A.. C.C..A.AC. CCC.AAA.A.
    B.Basileen   ..AA..ACCC ...C...... .C...A..CC .....C..AA C.C.....C. CCC...G..C
    C.Par_6085   C.AC..ACCC ..A....... .C.....ACC ....AC...A ....A...C. CCC...G..C
    N.VatP_889   ..A..A..C. ....ACA... .C...A.ACC ...A...... ..C.....C. CC.A..A.A.
    K.VatP_887   C.A....CCC ....AC...A .CA..A..C. .AAA.C.... C.C.....C. CCCA..A..C
    H.Beroline   ..A.A..CCC ...C...... .CAAG..... AAA.AGA..A ..C...A.C. CCC......C
    D.Par10195   CAA....CCC .....C.AA. .CAAC..A.. AAA..GA..A ..CA..A.C. CCC......C
    F.Hauniens   CAA....C.C .....G.AA. ACAAC..... AA...CAA.A C.CA..A.C. CCC....A.C

    P.Par16024   CCTACCCCCC CCCACCAGCC CCACCAACCA AACCAACCCC CGAACCACCC CCACCCCCCA
    A.Par16025   ..C......A ...C..C... .....C.... CC...CA... ...C..CA.. ..GA.A..GC
    B.Basileen   A.CC.....A ..AC..C..A .....C...G CC...C.... ...C..C... ..CA.A..G.
    C.Par_6085   ..CC.....A ...C..C... .....C.... CC...C.... ...C..CA.. ..GA.A..GC
    N.VatP_889   .AAC.A.... ...C...... ..C....... CC..CC.... .CCC.AC... ..G.AG..GC
    K.VatP_887   AAC..A.... .A.C.A.... ......C..C C...CC.... ..CC.AC.A. .....A..GC
    H.Beroline   A.-G...... .A.C..CA.A ..C...C.AG CC...C..A. ..CC..C... .A...T..GC
    D.Par10195   A.GC..A... .A.CA.CC.A .AC...C.AG C.AAC....A A.CC..C..A ..C..A..AC
    F.Hauniens   A.GC..A..A .AACA.CC.A ..C...C.AG CCAAG...AA A.CCA.C..A ..C..-A.AC


    One most parsimonious tree found:

                               +--F.Hauniens
                            +--8
                         +--7  +--D.Par10195
                         !  !
                      +--6  +-----H.Beroline
                      !  !
             +--------5  +--------K.VatP_887
             !        !
             !        +-----------N.VatP_889
          +--4
          !  !                 +--C.Par_6085
          !  !              +--3
        --1  +--------------2  +--B.Basileen
          !                 !
          !                 +-----A.Par16025
          !
          +-----------------------P.Par16024

    remember: this is an unrooted tree!

    requires a total of 500.000

    steps in each site:

             0   1   2   3   4   5   6   7   8   9
         *-----------------------------------------
        0!       3   2   2   1   1   1   1   2   2
       10!   2   3   2   3   2   1   2   3   1   1
       20!   2   1   4   1   2   1   2   1   2   2
       30!   2   1   1   1   1   1   2   1   2   3
       40!   1   4   1   1   2   1   1   2   2   2
       50!   4   2   2   2   2   2   1   1   3   1
       60!   1   3   3   4   2   1   1   1   2   0
       70!   5   1   3   3   1   1   1   0   2   0
       80!   2   1   1   2   1   0   2   1   3   0
       90!   2   4   4   2   1   1   1   4   4   1
      100!   3   2   2   2   1   2   2   2   2   1
      110!   1   2   2   2   3   1   2   1   2   1
      120!   1   3   2   1   3   2   2   3   2   2
      130!   4   3   4   1   0   5   2   1   2   1
      140!   1   2   1   0   2   3   0   1   1   5
      150!   0   3   1   5   0   0   3   1   1   1
      160!   2   1   2   2   2   1   2   1   1   2
      170!   2   2   1   1   1   1   4   3   1   0
      180!   2   4   1   1   2   1   1   1   2   2
      190!   2   1   1   2   3   2   3   1   1   1
      200!   1   1   1   1   1   2   3   0   4   2
      210!   2   1   1   2   2   3   4   1   2   1
      220!   2   4   0   2   1   1   1   1   1   1
      230!   0   1   1   2   2   1   1   3   1   2
      240!   2   2   2   4   4   0   2   1   0   0
      250!   2   0   1   2   1   1   1   2   2   0
      260!   2   0   1   2   0   0   1   1   0   1
      270!   3   1   3   1   1   3   2   1   0   2
      280!   1   1   1   1   1   1   2   1   2   1
      290!   1   0   1   4   1   1   4   1   0   2
      300!   2


    From To Any Steps? State at upper node( . means same as in the node below it on tree)

    1 AAACCCCCCC MATMCCCCCC AVCGCCCMCM CCCMMCCACC1 4 maybe .......... .......... .S.....C.C ...CC.....4 5 yes .......M.. C.....M..A .......... .....MA.A.

    5 6 yes ..C....... .M.C.M.... .G........ ..........6 7 yes .....A.CM. .......... ...V...... .....C....7 8 yes C..A...... .A...A.A.. ...A..A... A.........8 F.Hauniens yes ........CA ......A..C ....A..... ........CA8 D.Par10195 yes ........A. ......C... .T........ ..........7 H.Beroline yes ........A. AC...CC... C.AC....A. .......C..6 K.VatP_887 yes .C.....A.. .C...AA... .......... .....A....5 N.VatP_889 yes C.....AA.. ..CA..A.A. .C......A. .....A....4 2 yes .........A ...C...... .....M.... ..A....C..2 3 yes .......... A...A..... .G........ .A........3 C.Par_6085 yes .......... ..A....... .....A.... ..........3 B.Basileen yes ..C....... .......... .....C.A.A ........A.2 A.Par16025 yes CC..A..... C.G....... .C...A.... ..........1 P.Par16024 maybe .......... A..A...... .A.....A.A ...AA.....

    1 ACCCCCACGN CGGAMMCCCC CCGCCCCCCA CCACMCCMCM1 4 maybe .......... ....CC.... .......... ....C..C.C4 5 yes .......A.G ...C...... .M.A...... ..........5 6 yes .....A.... M......... .......... ..V.......6 7 yes R......... .......... .V.....A.. ..C.......7 8 yes .A..A..C.. ACC....... .A...AA... ...A......8 F.Hauniens yes A.A....... ......AT.. ..C....... .A...A....8 D.Par10195 yes G..A...... .......... .......... ..........7 H.Beroline yes G........C C..A....A. .G.C.....C A.........6 K.VatP_887 yes ...A...... A......... AC.......G ..G.......5 N.VatP_889 yes .......... .A.....A.A .AA....... ..........4 2 yes M.....V.A. .......... M........N .M........2 3 yes ......G..C .......... ..V.A..A.. ..........3 C.Par_6085 yes C......... .......... A.A......T .A.....A.A3 B.Basileen yes A.......C. ....A..... C.C......C .CC.......

    2 A.Par16025 yes C.....C..T ..A....G.. A........G .A........1 P.Par16024 maybe .........A ....AA.... .......... ....A..A.A

    1 CCCCCCMMCC MDCCCMWMCM ACCAMACACM CGCCCCCGCC1 4 maybe ......CA.. C....CA..A ....C..M.C ..........4 5 yes .......... .G........ .GM..C.... ..........5 6 yes .....A.... ..A...V... ...C..AC.. .A...M....6 7 yes .......... ......GC.. ..C....... M...A.....7 8 yes A.AA...... T..A...... ........A. .....A....8 F.Hauniens yes .....C.... .......G.. ......C... C..A.....A8 D.Par10195 yes .......C.. .C..A..... .......... A.........7 H.Beroline yes .......... .T.......T .....A.... AC...C....6 K.VatP_887 yes .......C.. G.....CA.. ..A....... ...A.A....5 N.VatP_889 yes .AA....... .......CA. ..A....A.. ..........4 2 yes .........M .T........ M......C.. ..M...AVA.

    2 3 yes .......... .......A.G .......... ...A...C..3 C.Par_6085 yes .........A ......T... C......... ..C.......3 B.Basileen yes .........C A.A....... AA..A..... ..A.......2 A.Par16025 maybe .........A .......C.. C......... ..A....A..1 P.Par16024 maybe ......AC.. AA...ATA.C ....A....A ..........

    1 MCAMGMCGGC VCMCMCCCCA CACMMCMC?C YC?CCMMCMM1 4 maybe .......... ..C.C..... ...CC.C... C.?...C.C.4 5 yes ..C..CM... GG...M.A.C MC.....A.. .........C5 6 yes .........G .......... ........K. ..?.......6 7 yes C......... .K...CA... C......... ..........


    7 8 yes ...C...V.. ....V...A. ........G. ..G..C.A..8 F.Hauniens yes ......CA.T .T..G..C.. ...A...... ..........8 D.Par10195 yes ......ACC. .G..A..... .......... ..........7 H.Beroline yes ...A..C..A CT........ ....G...T. ..-..A....6 K.VatP_887 yes A..C..A.A. .....A.... A...A...T. ..T..A....5 N.VatP_889 yes C..A..A... ....GA.... A.......G. G.C..C....4 2 maybe .M.AV..... C......... ........C. .....A....

    2 3 yes A...CC.... .A........ .......... .AC......C3 C.Par_6085 maybe .A........ .......... .......... ..........3 B.Basileen yes .C.......A ....G..... ........A. ..........2 A.Par16025 yes CA..AA.... .......... .......... A.-......A1 P.Par16024 maybe A..C.A.... A.A.A..... ...AA.A.-. T.A..CA.AA

    1 CCMCCCCCAC CMCCCMACAR MCMGCCCAMA CCCACACCCC1 4 maybe ..A....... .C.......G ..A.....C. ..........4 5 yes ........C. .......... .......... ....MC....5 6 yes .....A.A.M .....A.... .......C.C ..........6 7 yes .A........ ..A....A.. .......... ....C.....7 8 yes ..C...A..A ...A.G.... CA........ .......AA.8 F.Hauniens yes ...A...... ....A..... ........A. .....G....8 D.Par10195 yes .....C.... A......... .......... ..........7 H.Beroline yes .........C ......C... A...A..... ...C.A....

    6 K.VatP_887 yes .........A .......... C......... ....A....A5 N.VatP_889 yes ...A...... .....CC... A....A.... ....A.A...4 2 yes .A........ .......... ...A..A... ..MM......2 3 yes .......... ......C... .......C.C ..........3 C.Par_6085 yes ........C. .....C.... C..C...... ..AA......3 B.Basileen yes .......... A....A.... A......... ..CC......2 A.Par16025 yes A...A..... .....C...C C......... AAAC......1 P.Par16024 maybe ..C....... .A...A...A A.C.....A. ..........

    1 CMCCAMCMMM CCCCCMCCCC ACMCCCCCMC MMACCCMCCA1 4 maybe .C...A..C. .......... ..C.....C. CCM...A...4 5 maybe .......... ...M...... .......... ...M......5 6 yes ..A....C.A .AA..V.... .......... ..C......C6 7 yes ...AVC..A. A..C.SA..A ......A... ...C..C...7 8 yes ....C..... .......... ...A...... ..........

    8 F.Hauniens yes A......... ..C..C.A.. C......... .......A..8 D.Par10195 yes .......A.. .....G.... .......... ..........7 H.Beroline yes ....G..... ....AG.... .......... ..........6 K.VatP_887 yes .......... ...A.C.... C......... ...A......5 N.VatP_889 yes .......A.C ...A.A.... .......... ..AA....A.4 2 maybe .........C ....MC.... M......... ..C.......2 3 yes .......... .........A .......... ......G..C3 C.Par_6085 yes .....C.A.. ....A..... A.A.A..... ..........3 B.Basileen yes .......C.. ....C...A. C......... ..........2 A.Par16025 yes .......A.. ....A..A.. C....A.A.. ....AA..A.1 P.Par16024 maybe .A...C.CAA .....A.... ..A.....A. AA....C...

    1 CCYMCCCCCC CCCMCCAGCC CCACCAACCA MMCCAMCCCC1 4 maybe ..C....... ...C...... .......... CC...C....4 5 yes .M...M.... .......... ..M....... ....C.....

    5 6 yes A......... .A........ ......C..V ..........6 7 yes .C?V.C.... ......CV.A ..C.....AG ........M.7 8 yes ..GC..A... ....A..C.. .......... ..AA.A...A8 F.Hauniens yes .........A ..A....... .......... ....G...A.8 D.Par10195 yes .......... .......... .A........ .A......C.7 H.Beroline yes ..-G...... .......A.. .......... ....A...A.6 K.VatP_887 yes .A.A.A.... .....A.... ..A......C .A........5 N.VatP_889 yes .AAC.A.... .......... ..C....... ..........4 2 yes .........A ......C... .....C.... ..........2 3 maybe ...C...... .......... .......... ..........3 C.Par_6085 no .......... .......... .......... ..........


    3 B.Basileen yes A......... ..A......A .........G ..........2 A.Par16025 yes ...A...... .......... .......... ......A...1 P.Par16024 maybe ..TA...... ...A...... .......... AA...A....

    1 CGAMCCMCCC CCRCCMCCSM1 4 maybe ...C..C... .....A..GC4 5 yes ..C..M.... ..........

    5 6 maybe .......... ..A.......6 7 maybe .....C.... ..........7 8 yes A........A ..C.....A.8 F.Hauniens yes ....A..... .....-A...8 D.Par10195 no .......... ..........7 H.Beroline yes .......... .A...T....6 K.VatP_887 yes .....A..A. ..........5 N.VatP_889 yes .C...A.... ..G.AG....4 2 yes .......M.. ..GA......2 3 no .......... ..........3 C.Par_6085 maybe .......A.. ..........3 B.Basileen yes .......C.. ..C......A2 A.Par16025 maybe .......A.. ..........1 P.Par16024 maybe ...A..A... ..A..C..CA


    Appendix C: RETREE Session

    Tree Rearrangement, version 3.572c

    Settings for this run:
      U   Initial tree (arbitrary, user, specify)?   User tree from tree file
      N   Use the Nexus format to write out trees?   No
      0   Graphics type (IBM PC, VT52, ANSI)?        ANSI
      W   Width of terminal screen, of plotting area?   80, 80
      L   Number of lines on screen?                 24

    Are these settings correct? (type Y or the letter for one to change)
    0

    Tree Rearrangement, version 3.572c

    Settings for this run:
      U   Initial tree (arbitrary, user, specify)?   User tree from tree file
      N   Use the Nexus format to write out trees?   No
      0   Graphics type (IBM PC, VT52, ANSI)?        (none)
      W   Width of terminal screen, of plotting area?   80, 80
      L   Number of lines on screen?                 24

    Are these settings correct? (type Y or the letter for one to change)
    y

    Reading tree file ...

    retree: can't read intree
    Please enter a new filename> treefile

                               ,>>1:F.Hauniens
                            ,>15
                         ,>14  `>>2:D.Par10195
                         !  !
                      ,>13  `>>>>>3:H.Beroline
                      !  !
             ,>>>>>>>12  `>>>>>>>>4:K.VatP 887
             !        !
             !        `>>>>>>>>>>>5:N.VatP 889
          ,>11
          !  !                 ,>>6:C.Par 6085
          !  !              ,>17
        -10  `>>>>>>>>>>>>>16  `>>7:B.Basileen
          !                 !
          !                 `>>>>>8:A.Par16025
          !
          `>>>>>>>>>>>>>>>>>>>>>>>9:P.Par16024

    NEXT? (Options: R . U W O T F B N H J K L C + ? X Q) (? for Help) o
    Which node should be the new outgroup? 12

                               ,>>1:F.Hauniens
                            ,>15
                         ,>14  `>>2:D.Par10195
                         !  !
                      ,>13  `>>>>>3:H.Beroline
                      !  !
          ,>>>>>>>>>>12  `>>>>>>>>4:K.VatP 887
          !           !
          !           `>>>>>>>>>>>5:N.VatP 889
          !
        -10                    ,>>6:C.Par 6085
          !                 ,>17
          !              ,>16  `>>7:B.Basileen
          !              !  !
          `>>>>>>>>>>>>>11  `>>>>>8:A.Par16025
                         !
                         `>>>>>>>>9:P.Par16024

    NEXT? (Options: R . U W O T F B N H J K L C + ? X Q) (? for Help) w
    Enter R if the tree is to be rooted
    OR enter U if the tree is to be unrooted: r

    Tree written to file

