
Understanding and Measuring Morphological Complexity


Edited by MATTHEW BAERMAN, DUNSTAN BROWN, AND GREVILLE G. CORBETT


Great Clarendon Street, Oxford, OX2 6DP, United Kingdom

Oxford University Press is a department of the University of Oxford. It furthers the University’s objective of excellence in research, scholarship, and education by publishing worldwide. Oxford is a registered trade mark of Oxford University Press in the UK and in certain other countries

© editorial matter and organization Matthew Baerman, Dunstan Brown, and Greville G. Corbett 2015
© the chapters their several authors 2015

The moral rights of the authors have been asserted

First Edition published in 2015
Impression: 1

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without the prior permission in writing of Oxford University Press, or as expressly permitted by law, by licence or under terms agreed with the appropriate reprographics rights organization. Enquiries concerning reproduction outside the scope of the above should be sent to the Rights Department, Oxford University Press, at the address above

You must not circulate this work in any other form and you must impose this same condition on any acquirer

Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America

British Library Cataloguing in Publication Data
Data available

Library of Congress Control Number: 2014947260

ISBN 978–0–19–872376–9

Printed and bound by CPI Group (UK) Ltd, Croydon, CR0 4YY

Links to third party websites are provided by Oxford in good faith and for information only. Oxford disclaims any responsibility for the materials contained in any third party website referenced in this work.


Contents

List of Figures and Tables
List of Abbreviations
List of Contributors

Part I. What is Morphological Complexity?

1. Understanding and measuring morphological complexity: An introduction (Matthew Baerman, Dunstan Brown, and Greville G. Corbett)

2. Dimensions of morphological complexity (Stephen R. Anderson)

Part II. Understanding Complexity

3. Rhizomorphomes, meromorphomes, and metamorphomes (Erich R. Round)

4. Morphological opacity: Rules of referral in Kanum verbs (Mark Donohue)

5. Morphological complexity à la Oneida (Jean-Pierre Koenig and Karin Michelson)

6. Gender–number marking in Archi: Small is complex (Marina Chumakina and Greville G. Corbett)

Part III. Measuring Complexity

7. Contrasting modes of representation for inflectional systems: Some implications for computing morphological complexity (Gregory Stump and Raphael A. Finkel)

8. Computational complexity of abstractive morphology (Vito Pirrelli, Marcello Ferro, and Claudia Marzi)

9. Patterns of syncretism and paradigm complexity: The case of Old and Middle Indic declension (Paolo Milizia)

10. Learning and the complexity of Ø-marking (Sebastian Bank and Jochen Trommer)

References
Languages Index
Names Index
Subject Index


List of Figures and Tables

Figures

Figure 3.1 Kayardild clause structure and attachment of features
Figure 3.2 Features which percolate onto word ω
Figure 5.1 A hierarchy of nom-index
Figure 5.2 A hierarchy of person values
Figure 5.3 A hierarchy of gender values
Figure 5.4 A hierarchy of number values
Figure 6.1 Types of simple dynamic verbs according to the phonological shape of their perfective stem (total 142)
Figure 8.1 One-to-one and one-to-many relations in one-node-per-variable and many-nodes-per-variable neural networks
Figure 8.2 A two-level finite state transducer for the Italian irregular form vengono ‘they come’
Figure 8.3 Outline architecture of a TSOM and a two-dimensional 20×20 TSOM
Figure 8.4 Topological propagation of long-term potentiation and long-term depression of temporal re-entrant connections over two successive time steps; and word-graph representation of German past participles
Figure 8.5 Topological dispersion of symbols on temporal and spatio-temporal maps, plotted by their position in input words
Figure 8.6 Alignment plots of the finden paradigm on a temporal (left) and a spatio-temporal (right) map
Figure 8.7 (Topological) dispersion values across activation patterns triggered by selected inflectional endings on temporal and spatio-temporal maps for Italian and German known word forms, and on unknown word forms
Figure 8.8 BMU activation chains for vediamo-vedete-crediamo on a 20×20 map (left) and their word-graph representation (right)
Figure 8.9 Correlation coefficients between alignment scores and recall scores for novel words, plotted by letter positions relative to the stem-ending boundary
Figure 8.10 Stem dispersion over <PREF-X, X> pairs in Italian and German verb forms. Dispersion values are given as a fraction of 100 over the map’s diagonal
Figure 9.1 A graph representing the relative frequencies of morphosyntactic property combinations in Old Indic adjectival paradigms

Tables

Table 3.1 Morphome types and their mode of categorization
Table 3.2 Inflectional features and their potential values
Table 3.3 Features for which lexical stems can inflect
Table 3.4 Inflectional features and their morphomic exponents
Table 3.5 Phonological exponents of morphomes
Table 4.1 Inflection for ampl ‘laugh at’ with different subjects, objects, and tenses
Table 4.2 Free pronouns in Kanum, absolutive, and ergative forms
Table 4.3 Kanum verbal inflection
Table 4.4 Inflection for makr ‘roast’ with different subjects, objects, and tenses
Table 4.5 Opacity and transparency in pronominal systems
Table 4.6 Tense distinctions portmanteau with agreement affixes
Table 4.7 Pronominal distinctions in the idealized agreement affixes
Table 4.8 Takeovers in the object prefixes for two verbs
Table 4.9 Referrals in the object prefixes
Table 4.10 Referrals in tense affixes
Table 4.11 Referrals in the subject suffixes
Table 5.1 Oneida pronominal prefixes (C-stem allomorphs)
Table 5.2 The nineteen possible categories of semantic indices
Table 5.3 Number of morphs in each stem class from which all other fifty-seven forms can be deduced
Table 5.4 Number and percentage of stems of each class in twelve texts
Table 6.1 Agreeing lexical items in the Archi dictionary
Table 6.2 Syncretism pattern A
Table 6.3 Syncretism pattern B
Table 6.4 Archi prefixes
Table 6.5 Archi infixes, Set I
Table 6.6 Archi infixes, Set II
Table 6.7 Archi suffixes
Table 6.8 Partial paradigm of first and second person pronouns
Table 6.9 Verbs of similar phonology, but different patterns of realizing agreement
Table 7.1 Orthographic forms of the verbs in (1)
Table 7.2 Hearer-oriented plat for the verbs in (1)
Table 7.3 Speaker-oriented plat for the verbs in (1)
Table 7.4 Morphological indices employed in the speaker-oriented plat
Table 7.5 Two hypothetical plats
Table 7.6 Plat representing a hypothetical system of ICs
Table 7.7 Inflection classes, exponences, and distillations in the two plats
Table 7.8 4-MPS entropy (× 100) of the four distillations in Table 7.7
Table 7.9 Distinct exponences in each of the four distillations
Table 7.10 Traditional principal parts of five Latin verbs
Table 7.11 Optimal dynamic principal-part sets of three verbs (H-O plat)
Table 7.12 Dynamic principal-part sets in the two plats
Table 7.13 Dynamic principal-part numbers in the two plats
Table 7.14 Candidate principal-part sets for cast in the H-O plat
Table 7.15 Candidate principal-part sets for cast (S-O plat)
Table 7.16 Number of viable optimal dynamic principal-part sets
Table 7.17 Candidate principal-part sets for pass (H-O plat)
Table 7.18 Candidate principal-part sets for pass (S-O plat)
Table 7.19 Density of viable dynamic principal-part sets among all candidate sets having the same number of members
Table 7.20 IC predictability of twelve verbs in the H-O and S-O plats
Table 7.21 Cell predictability measures for thirteen verbs in the H-O and S-O plats
Table 7.22 Predictiveness of a verb’s cells, averaged across verbs
Table 7.23 Predictiveness of a verb’s past-tense cell in the H-O and S-O plats
Table 9.1 Relative frequencies of inflectional values on the basis of Lanman (1880)
Table 9.2 Old Indic -a-/-ā- adjective declension and relative frequencies of the sets of inflectional value arrays associated with the different exponents
Table 9.3 R and D values for some hypothetical variants of the paradigm in Table 9.2
Table 9.4 Pali -a-/-ā- adjective declension
Table 9.5 Syncretism in the marked gender and in the marked number in Pali -a-/-ā- adjectives
Table 9.6 Old Church Slavic (a) and Russian (b) definite adjective
Table 9.7 Gender syncretism and gender semi-separate exponence in the plural of the Latin -o-/-a- and of the Pali -a-/-ā- adjectives
Table 9.8 Vertical and horizontal syncretism in a hypothetical paradigm inflected for number, gender, and case
Table 9.9 Jaina-Māhārāṣṭrī -a-/-ā- adjective declension
Table 10.1 Typological pilot study: language sample


List of Abbreviations

A agent
ABL ablative
ABS absolutive
ACC accusative
ALLO allomorphy
ATTR attributive
BMU Best Matching Unit
BMU(t) Best Matching Unit at time t
CAUS causative
COMP complementization
DAT dative
DP dual or plural (Chapter 5)
DP determiner phrase
DU dual
ERG ergative
EX/EXCL exclusive
F feminine
FACT factual
FEM feminine
FI third person feminine singular or third person indefinite
FZ feminine zoic
GEN genitive
GEND gender
HAB habitual
H-O hearer-oriented
IC inflection class (Chapter 7)
IC information content (Chapter 9)
IE Indo-European
IMP imperfective
IN inclusive (Chapter 5)
INCL inclusive
INDEF indefinite
INSTR instrumental
JN joiner vowel
JUNC juncture
KS known subject
LOC locative
M masculine
MASC masculine
MPS morphosyntactic property set
N neuter
NEG negation
NON-OBL non-oblique
NP noun phrase
NTR neuter
NUM number
OBL oblique
OI Old Indic
OT Optimality Theory
P patient
pastPple past participle
PERS person
PFV perfective
PL plural
PNC punctual
PRS present
presInd present indicative
presPple present participle
PRIM primary morphome
PP prepositional phrase
REP repetitive
SEJ sejunct
SEQ sequential
SG singular
S-O speaker-oriented
SOM self-organizing map
STAT stative/perfective
SUB subjunctive
TAMA athematic tense/aspect/mood
TAMT thematic tense/aspect/mood
TSOM temporal self-organizing map
VOC vocative


List of Contributors

Stephen R. Anderson, Yale University

Matthew Baerman, University of Surrey

Sebastian Bank, University of Leipzig

Dunstan Brown, University of York

Marina Chumakina, University of Surrey

Greville G. Corbett, University of Surrey

Mark Donohue, Australian National University

Marcello Ferro, Institute for Computational Linguistics, CNR Pisa

Raphael A. Finkel, University of Kentucky

Jean-Pierre Koenig, State University of New York at Buffalo

Claudia Marzi, Institute for Computational Linguistics, CNR Pisa

Paolo Milizia, University of Cassino

Karin Michelson, State University of New York at Buffalo

Vito Pirrelli, Institute for Computational Linguistics, CNR Pisa

Erich R. Round, University of Queensland

Gregory Stump, University of Kentucky

Jochen Trommer, University of Leipzig


Part I

What is Morphological Complexity?


Understanding and measuring morphological complexity: An introduction

MATTHEW BAERMAN, DUNSTAN BROWN, AND GREVILLE G. CORBETT

Language is a complex thing, otherwise we as humans would not devote so much of our resources to learning it, either the first time around or on later attempts. And of the many ways languages have of being complex, perhaps none is so daunting as what can be achieved by inflectional morphology. A mere mention of the 1,000,000-form verb paradigm of Archi or the 100+ inflection classes of Chinantec is enough to send shivers down one’s spine. Even the milder flavours of Latin and Greek have caused no end of consternation for students for centuries on end. No doubt much of the impression of complexity that inflectional morphology gives is due to its idiosyncrasy, both at the macro- and micro-level. At the macro-level it appears to be an entirely optional component of language. Other commonly identified components of language have a broader, even universal remit. There is not a single known language to which the notions of semantics, phonology, syntax, and pragmatics cannot be profitably applied. And probably all languages have something that can be described as derivational morphology, though individual traditions differ on how they draw word boundaries; in any case, it would be hard to imagine a language without productive means to create new words. But many languages do completely without inflectional morphology, so that the mere fact that it exists at all in some other languages is something remarkable. At the micro-level, inflectional morphology is idiosyncratic because each language tells its own self-contained story, much more so than with other linguistic components. That is, one can take two unrelated languages and discover that they share similar syntax or phonology, but one would be hard pressed to find two unrelated languages with the same inflectional morphology.


This idiosyncrasy has long stood in the way of describing the properties of inflectional systems across languages in any comprehensive fashion. Although proposals have been advanced over the years about universal constraints on inflectional structure (e.g. Carstairs 1983 on inflection classes, Bobaljik 2012 on patterns of suppletion), the actual task of verifying them is so onerous that they remain suggestive hypotheses. Otherwise, inflection is left to conduct its business in the twilight zone of asystematic lexical specification, free from the scrutiny afforded to syntax. But as in other fields, what looks a mess on the surface may, if subject to the appropriate analysis, reveal itself to be a complex system with its own internal logic. This logic may not map all that readily onto anything else, but for that reason it is all the more worth uncovering; this is because it does not necessarily follow from our prior conceptions of how things ought to work.

Related fields within linguistics have contributed a particular view of complexity that has shaped our vision of the components of language, including morphology. Ackerman et al. (2009) note that morphology can be considered either in syntagmatic or paradigmatic terms. They concentrate on the paradigmatic dimension, but the syntagmatic conceptualization, which dominates several sub-disciplines, has often channelled thinking about complexity in morphology and other components of grammar. That is, there is a view of complexity in terms of the relationships between concatenated elements, rather than in terms of paradigmatic oppositions. This is only natural if one applies the logic associated with the analysis of syntax, but if one wishes to understand distinctions that are unnecessary from the point of view of syntax, then concentrating on one of these dimensions to the exclusion of the other is insufficient.

Computational complexity is an important notion, but not the main focus of this volume. Jurafsky and Martin (2009: 563) note that grammars can be understood in terms of their generative power or the complexity of the phenomena which they are being used to describe. Well-known examples of complexity from this perspective related to morphology include Culy’s (1985) discussion of the whichever X construction in Bambara, a Mande language of Mali. There is, as ever, the important problem that we should not necessarily draw inferences about the overall computational complexity of a language from particular constructions, as pointed out by Mohri and Sproat (2006: 434), because it is possible for a regular language to contain a context-free or context-sensitive subset, for instance.

In this volume we concentrate on morphological complexity as the additional structure that cannot readily be reduced to syntax or phonology. As Anderson (this volume) notes, human languages already have a combinatorial system, the syntax, and they already have a system for the expression of linguistic signs in form, the phonology. Morphology is therefore a kind of complexity which is entirely unnecessary from this perspective. While morphotactics are concerned with the combinatorial or syntagmatic dimension, they are not the same thing as the syntax, because we can find different orderings obtaining within the different components. This need not result in differences of complexity in the sense in which we are used to it within formal language theory, but it represents an unnecessary additional combinatorial system. Layered on top of this issue is the fact that morphology also exhibits a very different, paradigmatic, complexity which cannot be considered in terms of combinatorics. Complexity in the sense in which we are employing it is about distinctions which are excessive from the point of view of the other essential components, either because there is an additional syntagmatic or combinatorial system over and above the one found in syntax, or because the paradigmatic dimension allows for cross-cutting distinctions which are not relevant for syntax. In some approaches it may be a straightforward matter to specify inflectional class information. Indeed, the existence of inflectional features as such is not an issue for finite-state approaches to morphology, where this information is encoded as part of the stem (see discussion in Hippisley 2010: 39–41, for instance). The larger issue here is that what is or is not ‘excessive’ depends on one’s particular model, and inflectional classes are just one example of this problem.

From a purely formal perspective, then, the existence of morphological features does not say much about generative power or complexity in terms of strings. It is really about the need to distinguish different description languages to talk about morphology separately from syntax, for instance. Indeed, if we are to measure and understand the complexity of morphology we need to adapt methods to capture the degree of predictability between different paradigm cells. Entropy-based measures have been employed for this purpose, as discussed in Ackerman et al. (2009); such measures are related to predictability, in that there is a greater degree of entropy in the system if new instances are difficult to predict. Kolmogorov complexity is another measure, where the complexity of the data is seen in terms of the minimum size of the rule required to generate that data (Sagot 2013). Sagot discusses three different approaches to morphological complexity:

• Counting-based
• Entropy-based
• Description-based

The counting-based approach to morphological typology is what most typologists are familiar with. What is counted may differ between theorists, of course. Some will count the number of morphemes found in words, however these are defined; some may count the number of features or feature values available. This approach has a number of disadvantages. If we count features in different languages, we need to be able to establish that the features are comparable. Where one language has an abundance of featural distinctions in one area, and another has an abundance in a different area, how are we to judge which is more complex? More importantly, there is an assumption that the larger the number of values for a given feature, the more complex the system. And yet complexity can arise in languages where the inventory of relevant items is quite small.
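As a deliberately simple illustration of the counting-based approach (our own sketch, not taken from any of the chapters), the following Python fragment computes one familiar counting metric, the mean number of morphemes per word, over hyphen-segmented word forms; the samples and the function name are invented for the example.

# A minimal sketch of a counting-based measure: mean morphemes per word,
# computed from hyphen-segmented word forms. All samples are invented toy data.

def morphemes_per_word(segmented_words):
    """Average number of hyphen-separated morphs per word form."""
    counts = [len(word.split("-")) for word in segmented_words]
    return sum(counts) / len(counts)

isolating_sample = ["dog", "walk", "quick"]              # 1.0
mildly_inflecting_sample = ["cat-s", "walk-ed", "dog"]   # about 1.67
agglutinating_sample = ["do-able-probably-PTCP-1sg"]     # 5.0

for sample in (isolating_sample, mildly_inflecting_sample, agglutinating_sample):
    print(morphemes_per_word(sample))

A measure of this kind immediately runs into the comparability problems just noted: it depends entirely on how the forms are segmented, and says nothing about how predictable the segmented material is.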


It is possible that the description of the system is quite involved, or that there is a significant degree of unpredictability. Entropy- and description-based measures are therefore better able to capture such differences. Sagot and Walther (2011) develop methods for measuring the complexity of various descriptions of morphology. The degree of dependence on the formal description is an important issue. Adherents of entropy-based approaches argue that these are not so dependent on the formal description, but as Sagot (2013) notes, the conditional entropy measure may depend on formalization. Sagot provides an example with reduplication for a language where all stems have the structure C1VC2. Each paradigm has two forms, as in (1).

(1) Sagot’s (2013) example of the formalization-dependence of conditional entropy
    Form 1      Form 2
    C1VC2       C1VC2VC2

Sagot argues that if the formalism which models (1) only allows for concatenative affixation, then the paradigm in (1) will be fairly complex, whereas if reduplication is permitted as an operation of the morphology, then it will be straightforward. In fact, the conditional entropy will be zero, as we can predict with certainty that form 2 is the reduplicated version.
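The toy calculation below, with invented stems, is one way of making this dependence on the formalization concrete: if form 2 is analysed as selecting a lexically arbitrary -VC2 suffix, knowing form 1 still leaves that choice open for each lexeme, whereas if reduplication is available as a single operation of the morphology, the conditional entropy is zero. This is an illustrative reconstruction of the point, not Sagot’s own computation.

from collections import Counter
from math import log2

# Five invented C1VC2 stems; form 1 is the bare stem, form 2 is C1VC2VC2.
stems = ["kap", "tis", "mun", "pok", "ler"]

def conditional_entropy(pairs):
    """H(Y | X), in bits, over a list of (x, y) observations."""
    n = len(pairs)
    by_x = Counter(x for x, _ in pairs)
    h = 0.0
    for x, nx in by_x.items():
        ys = Counter(y for x2, y in pairs if x2 == x)
        h += (nx / n) * -sum((c / nx) * log2(c / nx) for c in ys.values())
    return h

# Analysis A: concatenative affixation only, so each lexeme appears to
# select its own -VC2 suffix in form 2.
concatenative = [("-0", "-" + stem[1:]) for stem in stems]
print(conditional_entropy(concatenative))   # log2(5), about 2.32 bits

# Analysis B: reduplication is a morphological operation, so form 2 is
# 'reduplicate the final VC' for every lexeme alike.
reduplicative = [("-0", "REDUP") for stem in stems]
print(conditional_entropy(reduplicative))   # 0.0 bits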

In Stump and Finkel’s (2013) monograph-length typology a number of detailed measures are developed for quantifying the morphological complexity of different inflectional class systems. This work starts out from the traditional notion of principal parts but takes this much further, moving on to develop novel measures, based on dynamic principal parts and average cell predictor numbers, among others. Stump and Finkel (2013: 109) make an important comparison between entropy-based measures when applied to syntax and morphology. For a given syntactic context certain combinations are more predictable than others, such as the occurrence of little after very, but entropy can never be reduced to zero for active syntactic structures, because they naturally allow for different combinations. Entropy measures have also been applied to the analysis of paradigms, for example by Moscoso del Prado Martín et al. (2004), Milin et al. (2009), and Ackerman et al. (2009). However, as Stump and Finkel (2013: 111) persuasively argue, there is a major contrast when one compares what happens in morphology with what happens in syntax. In morphology, given the right implicative relations, the entropy can be reduced to zero, whereas no matter how much syntactic context is provided, this is not possible in syntax. This shows that two radically different beasts are being described by these measures. Stump and Finkel argue further that while the entropy measures can describe both the syntagmatic dimension (of syntax) and the paradigmatic dimension (of morphology), the set-theoretic measures of inflectional class complexity which Stump and Finkel develop cannot be used to describe syntax. The fact that these do not apply across domains entails that they are specialized for describing the implicative relations between cells. In addition to the fact that the syntagmatic properties of morphology are different from those of syntax, this dimension of complexity which is peculiar to morphology is also worthy of consideration.
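To make the set-theoretic idea concrete, here is a minimal sketch (ours, with an invented three-class plat, not Stump and Finkel’s implementation, and ignoring their distinction between static and dynamic principal parts): it searches for the smallest sets of paradigm cells whose exponents jointly identify a lexeme’s inflection class, the point at which uncertainty about the rest of the paradigm drops to zero.

from itertools import combinations

cells = ["sg", "pl", "gen"]
plat = {                        # inflection class -> exponent in each cell (invented)
    "I":   {"sg": "-a", "pl": "-i",  "gen": "-is"},
    "II":  {"sg": "-a", "pl": "-es", "gen": "-is"},
    "III": {"sg": "-o", "pl": "-i",  "gen": "-onis"},
}

def identifies_class(cell_set):
    """True if the exponents in these cells differ for every pair of classes."""
    signatures = [tuple(row[c] for c in cell_set) for row in plat.values()]
    return len(set(signatures)) == len(plat)

def smallest_principal_part_sets():
    """All minimal-size cell sets that pin down the inflection class."""
    for k in range(1, len(cells) + 1):
        hits = [s for s in combinations(cells, k) if identifies_class(s)]
        if hits:
            return hits
    return []

print(smallest_principal_part_sets())   # [('sg', 'pl'), ('pl', 'gen')]

No single cell suffices in this toy plat, but either of two cell pairs does; no comparable search over syntactic contexts could guarantee that the remaining uncertainty ever reaches zero.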

Concentrating on what is peculiar to the combinatorics of morphology and the additional considerations which arise with the paradigmatic dimension is our purpose with this volume. As such it complements a body of work from the last decade which deals with the notion of complexity in language more generally. For example, Dahl (2004) takes a novel view of complexity. He starts from basic notions of information and redundancy, and works his way through relevant ideas from outside linguistics. He then analyses a range of linguistic phenomena, including inflectional morphology and featurization, showing how complexity results from historical processes. The book’s strengths are the introduction of new ideas into the discussion of linguistic complexity, with critical discussion of their applicability, and the original examination of familiar linguistic material. Sampson et al. (2009) is a volume of papers that responds to a growing recent trend in linguistics to question the principle of invariance of language complexity, the assumption that languages are all in a certain sense equally complex, and that where one particular component of a language is more complex, another will balance this out by being less so. It deals with a wide range of issues including discussion of its relationship with social and cognitive structures. In the introduction to that volume Geoffrey Sampson takes issue with an assertion by Hockett (1958: 180–1) that ‘languages have about equally complex jobs to do’ by arguing that defining what grammar does is challenging if one wishes to make precise predictions about complexity (Sampson 2009: 2). Culicover (2013: 14) makes a similar point in relation to what he terms ‘relative global complexity’. We believe this to be a valuable insight and, instead of dealing with complexity in all its manifestations, we aim for a precise focus on the contrast between featural distinctions relevant for syntax on the one hand, and those systematic elements of morphology that appear to cross-cut syntax on the other. Miestamo et al. (2008) is concerned above all with typology, and with the simplifying effects of contact. Morphology is touched upon in several of the contributions, with complexity assessed according to two parameters: (i) the number of morphosyntactic features and values, and (ii) deviations from a one-to-one mapping between meaning and form. The first parameter treats inflectional morphology itself as an element of complexity, a question we choose to remain agnostic on for the present. But the second parameter coincides with the core concerns of the current volume, and is the particular focus of two chapters. In ‘Complexity in linguistic theory, language learning, and change’, Kusters traces developments within the Quechua verb paradigm, where various many-to-one mappings between morphosyntax and form were disentangled to something like a transparent one-to-one mapping in those varieties of the language more heavily affected by contact. In ‘Complexity in nominal plural allomorphy’, Dammel and Kürschner compare plural marking across the Germanic languages according to a rich set of criteria. Of particular note is the fact that they look not just at the forms, but at the assignment principles, so that phonologically or semantically predictable allomorphy is treated as less complex than that which is arbitrarily stipulated.

More broadly, complexity is a notion which has wide applicability in a variety of different fields. Gershenson and Fernández (2012: 32) note that it can be understood as a balance between order¹ (information growth) and variety. Where the variety involved is great we should expect the information growth not to be high. Morphological complexity fits into this general categorization at some level. Where the relationship between syntax and its expression in form is straightforward, for example, variety in morphology does not interfere to create complexity. So the study of morphological complexity fits within a wider cross-disciplinary research programme, but we need to understand its peculiarities when applying more general techniques.

¹ This is a measure of information transformation which depends on the relationship between input and output.

The present volume is the fruit of a three-day conference held at the British Academy in London in January 2012 on the theme of morphological complexity, and represents a selection of the most interesting and relevant contributions. We have divided them up into three sections. Chapter 2 (Anderson) continues the theme of Part I, giving a typological overview of morphological complexity both in its paradigmatic and its syntagmatic aspects. The rest of the volume concentrates on three aspects of morphological complexity:

(i) different expression of equivalent distinctions in a particular language. At the simplest level, English expresses the plural in different ways (cats, children, cacti) and this allomorphy of inflectional exponence divides the lexicon into different inflection classes according to the particular form taken;

(ii) units which exist within morphology and which cannot be readily defined in terms of syntactic or semantic feature values; for instance, some French verbs have a different form for the first and second persons plural (a dramatic case is the verb ‘go’, with je vais ‘I go’ but nous allons ‘we go’);

(iii) complexities in the realization of morphological form; thus the German Buch ‘book’ has the plural Bücher, which involves both an inflection and a change of the root vowel.

The chapters in Part II deal with questions of theoretical and formal analysis of complexity as a route towards better understanding of what is involved. Chapter 3 (Round) provides a novel typology of the morphome that integrates all three themes: rhizomorphomes divide the lexicon into morphologically specified units (inflection classes), metamorphomes divide the paradigm into morphologically specified units, while meromorphomes are abstract units that serve as the building blocks of morphological form. Chapter 4 (Donohue) describes a system in which rules of referral play a strikingly predominant role, with forms from one cell in the paradigm being co-opted for use in another. While there is no apparent semantic motivation for these patterns, they follow a strict morphological hierarchy, resulting in a network of implicative relationships. The Oneida system described in Chapter 5 (Koenig and Michelson) is also characterized by massive syncretism. While the individual conflations may have a plausible morphosyntactic motivation, the broader principles which determine when and how these conflations are effected are less transparent. In this case inflection classes also play a contributing role to considerable morphotactic complexity otherwise characteristic of Iroquoian languages. In contrast to the morphosyntactic and morphological richness displayed by Oneida, the system of gender and number agreement in Archi, described in Chapter 6 (Chumakina and Corbett), seems quite spartan, with just a handful of affixes realizing eight possible morphosyntactic distinctions. Nevertheless, the realization of this small paradigm is anything but straightforward, as there is a large inventory of agreement targets, yet lexemes belonging to the same part of speech do not necessarily behave alike. It proves difficult to predict which items will realize gender and number, and even more challenging to determine the position of the marking (notably whether it will be prefixal or infixal).

Part III concentrates on measuring and quantifying complexity using computational techniques. Inflection classes are the particular focus of the first two chapters. Chapter 7 (Stump and Finkel) describes a number of different computationally implemented metrics of the complexity of inflection class systems, pointing out an often-overlooked parameter, namely the mode of representation. They contrast a speaker-oriented (i.e. phonological) and hearer-oriented (i.e. morphologically decomposed) representation, which may give quite different results, depending on the metric. Chapter 8 (Pirrelli, Ferro, and Marzi) offers a psycholinguistically plausible computational model of the word-and-paradigm approach to inflection. This approach is motivated in particular by the presence of inflection classes, which blur any obvious segmentation between lexical and inflectional material. The result is an explicit representation of the local analogical relationships between word forms that can be used to generate paradigms.

The final two chapters focus on the internal organization of the paradigm, in particular on the indirect mapping between morphosyntactic values and morphological form that characterizes syncretism. Chapter 9 (Milizia) returns to the long-standing question of markedness as a motivating factor behind syncretism. Given the uncertainties that plague the notion of markedness, he offers instead an information-theoretic approach, suggesting that the information load inherent in a system with inflection classes may be a motivation for the conflation of paradigmatic cells. Chapter 10 (Bank and Trommer) develops an automated method for morphological segmentation which allows a quantificational assessment of different segmentation strategies, balancing the complexities of morphosyntactic representation and morphological exponence.


Each chapter can be seen as a demonstration of morphological complexity. Taken together, with their different data and different methodologies, they reveal a striking picture: inflectional systems are indeed intricate and challenging, but they are also elegant, and when viewed abstractly enough they reveal comparable structures. But they leave open the perplexing question of why this complexity and elegance is so pervasive in some languages, while other languages are devoid of inflection.

Acknowledgements

We would like to thank the contributors to this volume, since the work and ideas presented here are theirs. We also thank Penny Everson and Lisa Mack for their invaluable assistance in the preparation of the manuscript. Finally, none of this would have been possible without the support of the European Research Council (grant ERC-2008-AdG-230268 MORPHOLOGY), which is gratefully acknowledged.


Dimensions of morphological complexity

STEPHEN R. ANDERSON

The question of what aspects of a language’s word structure patterns contribute to its overall complexity has a long tradition. With roots in the notions of linguistic typology associated with traditional grammar, Sapir (1921; see also Anderson 1992: §12.2) organizes these matters along three dimensions. One of these concerns the range of concepts represented by morphological markers, and refers to the extent of elaboration of the inflectional and derivational category structure of a language. A second refers to the range of marker types, and thereby differentiates transparent affixation of the sort associated with pure ‘agglutinating’ languages from a variety of other formal processes by which morphological information can be conveyed. The third dimension is that of the overall internal complexity of words, the sheer number of distinct pieces of information that are combined in a single word, ranging from the simplest case of ‘isolating’ languages that involve (little or) no morphological combination up to the ‘polysynthetic’ type¹ in which most or all of the components of a full sentence are expressed within a single word.

My goal in this chapter is to develop and elaborate a characterization of the morphological characterization of languages along lines similar to Sapir’s, so as not only to serve similar typological goals but also to provide a framework for understanding the questions of linguistic typology that motivate other authors in this volume. In that spirit, I will feel free to propose an agenda of questions to be asked about languages without being obliged to offer a comprehensive set of answers. Before proceeding to that enterprise, however, I want to step back from the details and ask what it is about morphology that constitutes ‘complexity’ in the broader picture of human natural language.

¹ Not to be confused with the distinct technical sense of this term in Baker () and related work.


2.1 What is ‘complex’ about morphology?

Different observers will see different things about a language as making it ‘complex’. In traditional terms, for example, the outer limit of morphological complexity in natural languages was often seen as represented by languages of the polysynthetic type, highlighting the sheer quantity of morphological elaboration as the main contributor to complexity.

While teaching a course on the diversity of the world’s languages to a class of Chinese students in Beijing recently, I was exposed to what was, for me, a somewhat unusual perspective on this matter. The students had been reading some of Mark Baker’s work on the parametric description of grammars, in which he talks about ‘polysynthesis’ as a parameter of grammatical structure. They did not really know anything about the specific languages he discussed in this connection, but it was clear to them that this notion was associated with being very complex. In trying to figure out what it might mean for a language to be ‘polysynthetic’, one of my students offered a clarification in a written exercise:

A ‘Polysynthetic’ language is one in which words are very complex. That is, they have more than one meaning element combined into a single word: for instance, English cat-s.

From the perspective of a speaker of a Chinese language, apparently, any morphological structure seems to be complex. While initially merely amusing to speakers of languages of the ‘Standard Average European’ type, I will suggest that this is a rather more coherent and principled view than it may seem at first sight.

What aspects of a system contribute to its complexity? A standard rhetorical move would be to consult the dictionary for a starting point. The equivalent source of wisdom in the present age is Wikipedia, which provides the following:

A complex system is a system composed of interconnected parts that as a whole exhibit one or more properties (behavior among the possible properties) not obvious from the properties of the individual parts.

‘Complexity’, then, can be seen as the consequence of a system’s displaying characteristics that do not follow as theorems from its nature, as based on its irreducible components. How, then, does this apply to language? What aspects of language are essential, and what properties that languages display represent complications that are not logically necessary?

Languages are systems that provide mappings between meaning, or conceptual structure on the one hand, and expression in sound (or signs) on the other. In order to fulfil this function, there are some kinds of organization that they have to display by virtue of their essential character.

Of necessity, a language has to have a syntax, because it is the syntactic organization of meaningful elements as we find it in human languages that gives them their expressive power, providing their open-ended ability to express and accommodate a full range of novel meanings.

It is also plausible to suggest that languages need to have phonologies. That is because individual meaningful elements—linguistic signs—must have characteristic expressive properties to serve their essential purpose, as stressed by linguists since de Saussure. When these are combined by means of the syntactic system, however, the result may be at odds with the properties of the system through which they are to be implemented: the properties of the vocal tract, or of the signing articulators. There is thus a conflict between the need to preserve the distinguishing characteristics of meaningful elements and the need to express them through a system with its own independent requirements. Optimality Theory articulates this explicitly as the conflict between considerations of Faithfulness and Markedness, but similar considerations are at the foundation of every theory of phonology. Some account of how this tension is to be resolved is inherently necessary, and thus the presence of phonology, like that of syntax, follows from the nature of language.

Both syntax and phonology are thus inherent in any system that is to fulfil the basic requirements of a human language, and their presence (as opposed to their specific stipulated properties) cannot be seen as constituting complexity in itself. The same cannot be said for morphological structure, however, as pointed out forcefully by Carstairs-McCarthy (2010), and that makes it hard to understand why humans should have evolved in such a way that their languages display this kind of organization at all. To clarify this, let us note that the content we ascribe to morphology can be divided into two parts, ‘morphotactics’ and ‘allomorphy’, and in both cases it is difficult to see the existence of such structure as following inherently from the nature of language.

Morphotactics provides a system by which morphological material (grossly, but inaccurately identified with the members of a set of ‘morphemes’) can be organized into larger wholes, the surface words of the language. But in fact language already involves another system for organizing meaningful units into larger structures, the syntax, and so to the extent the morphotactics of a language can be distinguished from its syntax, this would seem to be a superfluous complication.

Allomorphy is the description of the ways in which the ‘same’ element of content can be realized in a variety of distinct expressions. When that variation follows from the language’s particular resolution of the conflict between the requirements of Faithfulness and Markedness, as described previously, this is just phonology, and can be seen as necessary. But when we find allomorphic variation that does not have its roots in the properties of the expression system, it does not have this character of necessity, and so constitutes added complexity.

2.1.1 Morphotactics ≠ syntax

In fact, the two a priori unmotivated kinds of complication are essentially constitutive of morphology. Carstairs-McCarthy (2010) cites examples of morphotactic organization that are unrelated to (and in fact contradict) the syntax of the languages in which they occur; another such example was discussed in Anderson (1992: 22–37) in Kwakw’ala, a Wakashan language of coastal British Columbia.

To summarize, Kwakw’ala is a language with a rather rigid surface syntax. Sentences conform to a fairly strict template, with the verb (or possibly a sequence of verbs) coming in absolute initial position, followed by the subject DP, an object DP marked with a preceding particle beginning with x̣- (if the verb calls for this), possibly another object marked with s- (again, if the verb subcategorizes for an object of this type), optionally followed by a series of PPs. Adjectives strictly precede nouns they modify, and other word order relations are similarly quite narrowly constrained by the grammar.

However, Kwakw’ala also has a rich system of ‘lexical suffixes’ constituting its morphology, and these correspond functionally to independent words in other languages. The point to notice is that when meaningful elements are combined in the morphology, as in the abundant variety of complex words consisting of a stem and one or more lexical suffixes, the regularities of order found in the syntax are quite regularly and systematically violated.

For instance, as just noted, verbs are initial in syntactic constructions, with any objects coming later. When elements corresponding to a verb and its object are combined within a single word, however, they typically appear in the order O-V rather than V-O. Thus, u’ena-gila ‘fish.oil-make, make fish oil’ has the object of ‘make’ initially, and the element with verbal semantics following.

Similarly, although the subject invariably precedes any objects in a syntactic construction, if (and only if) the element expressing the object of the verb is a lexical suffix attached to the verb in the morphology, it can precede the subject DP. Thus, in na’w-әm’y-ida bәgwanәm ‘cover-cheek-the man, the man covered (his) cheek’ the object of ‘cover’ is the suffix -әm’y ‘cheek’, and as a result it precedes the subject -ida bәgwanәm ‘the man’—an ordering that would be quite impossible if the object were separately expressed in the syntax.

As another example, exactly when they are combined in a single word by the morphology rather than composed in the syntax, an adjectival modifier can follow the expression of the noun it modified. The lexical suffix -dzi ‘large’ not only can but must follow an associated nominal stem, as in u’aqwa-dzi ‘copper-large, large copper (ceremonial object)’. Once again, the ordering imposed by the morphotactics of the language is directly contrary to that which would be given in the syntax.

There is actually something of an argument-generating algorithm here: find any systematic regularity of order in the syntax of Kwakw’ala, and it is quite likely that the principles of morphotactic organization will systematically violate it. Overall, the morphotactics of the language bear little or no resemblance to its syntax, which naturally raises the question of why a language should have two quite distinct systems that both serve the purpose of combining meaningful units in potentially novel ways to express potentially novel meanings.

Given the existence of two separate and distinct combinatory systems in Kwakw’ala, there is potential duplication of function: a new complex meaning might be constructed either on the basis of the morphology or on that of the syntax. And this is indeed the case, as illustrated in (1). Kwakw’ala has a suffix -exsd that adds the notion ‘want’ to the semantics of a stem, and the meaning ‘want to (X)’ can be conveyed through the addition of this element to the stem representing a verb. But there is also a semantically empty stem ax̣-, and the suffix can be added to this to yield an independent verb ax̣exsd ‘to want’, which can in turn take another verb as its syntactic complement to yield essentially the same sense.

(1) a. kwakw’ala-exsd-әn
       speak.Kwakw’ala-want-1sg
       ‘I want to speak Kwakw’ala’

    b. ax̣-exsd-әn   q-әn       kwakw’ala
       -want-1sg    that-1sg   speak.Kwakw’ala
       ‘I want to speak Kwakw’ala’

Interestingly, there appears to have been a subtle but significant shift in the language: where traditional speakers relied heavily on the morphology for the composition of novel expressions, modern speakers are much more likely to combine meanings in the syntax. Importantly, neither the morphology nor the syntax has actually changed in any relevant way: what has happened is just that the expressive burden has shifted substantially from one combinatory system to the other. My understanding is that similar developments have occurred in other languages with complex morphology of this sort, as a function of declining active command of that system (though without loss of the ability to interpret morphologically complex words).

2.1.2 Allomorphy ≠ phonology

A similar case can be made concerning the relation between allomorphic alternation and patterns of variation dictated by the phonology of a language. Again, we can illustrate this from Kwakw’ala. As in the other Wakashan languages, Kwakw’ala lexical suffixes each belong to one of three categories, depending on their effect on the stems to which they are attached. These are illustrated in (2).

(2) Hardening (roughly, glottalizing) suffixes, e.g. /qap + alud/ → [qap’alud] ‘to upset on rock’
    Softening (roughly, voicing), e.g. /qap + is/ → [qabis] ‘to upset on beach’
    Neutral (no change), e.g. /qap + a/ → [qapa] ‘(hollow thing is) upside down’


Importantly, the category of a given suffix is not predictable from its phonological shape: it is an arbitrary property of each suffix that it is either ‘hardening’, ‘softening’, or neutral. As a result, the different shapes taken by stem final consonants with different suffixes cannot be regarded as accommodations to the requirements of Markedness conditions.

This example is of course quite similar to that of the initial mutations in Celtic languages and other such systems. There is little doubt that, like these, there was a point in the history of the Wakashan languages at which the ancestors of these suffixes did differ in phonological form, and the changes we see today are the reflexes of what were originally purely phonological alternations. But the important thing is that in the modern languages, by which I mean everything for which we have documentary evidence since the late nineteenth century, this motivation is no longer present, but the allomorphic variation persists.

Languages are perfectly content, that is, to employ principles of variation whose properties do not follow from the necessary resolution of the competing demands of Faithfulness and Markedness. Such variation is the content of the component of morphology we call allomorphy, and its presence in natural languages must be regarded as not following from their nature, and thus as adding complexity.

So we must conclude that from the point of view of what language does and what it needs to fulfil that task, morphological structure is superfluous: neither morphotactic regularities nor non-phonologically conditioned allomorphic variation follow from the basic requirements of linguistic structure. Nonetheless, virtually all languages—even Chinese languages—have at least some morphology that is not reducible to syntax and/or phonology. Since it appears to be the case that in a very basic sense any morphology at all constitutes ‘complexity’, that fact stands in need of an explanation.

A genuinely explanatory account of the basis for morphological organization would have to lie in the evolutionary history by which the human language faculty has emerged in the history of our species. Like many aspects of the structure of language, the search for such an account runs up against a general lack of firm evidence, since language in general leaves no direct trace in the physical record. Of course, just how much morphological complexity there is and where it is located can vary enormously from language to language. It is the structure of this variability to which I turn in the remainder of this chapter.

2.2 The structure of ‘complexity space’ in morphology

While some languages display very little organization that is autonomously morphological, others provide us with rather more to explore. Three North American languages that are notably robust in their morphology are exemplified in (3).²

² My thanks to Marianne Mithun for the Central Alaskan Yupik and Mohawk examples here.


(3) Kwakw’ala:
    hux̣w-sanola-gil-e≠
    vomit-some-continuous-in.house
    ‘some of them vomit in the house’

    Central Alaskan Yupik:
    Piyugngayaaqellrianga-wa
    pi-yugnga-yaaqe-lria-nga=wa
    do-able-probably-intr.participial-1sg=suppose
    ‘I suppose I could probably do that’

    Mohawk:
    wa’koniatahron’kha’tshero’ktáhkwen
    wa’-koni-at-ahronkha-’tsher-o’kt-ahkw-en
    factual-1sg/2sg-middle-speech-nmzr-run.out.of-caus-stative
    ‘I stumped you (left you speechless)’

While obviously displaying more complex morphology than familiar European languages, these differ somewhat from one another. Kwakw’ala presents us with many word forms that incorporate a more diverse collection of information than we are used to in languages like English, but the individual components are relatively transparent and the degree of elaboration in words that occur in actual texts is moderate. ‘Eskimo’-Aleut languages like Central Alaskan Yupik are commonly cited as falling at the extreme end of morphological complexity, because they make use of essentially open-ended combinations of meaningful elements to construct expressions of arbitrary complexity, but the individual components of a word are still relatively easy to tease apart, and the actual degree of complexity in common use is only somewhat greater than in Kwakw’ala. Iroquoian languages like Mohawk or Oneida are somewhat more intricately organized, and the complexities of combination are much harder to disentangle (see Koenig and Michelson, this volume). While the results are often very elaborate, there is a sort of upper bound imposed by the fact that, unlike the other two but similar to the Athabaskan languages, their morphology is based on a word structure template with a limited (if large) number of slots, such that the degree of complexity of the material filling any particular slot is (with some exceptions) bounded.

Other chapters of this volume present a variety of examples of complex morphological systems. My goal here is to characterize the logical space within which such complexity falls, the major dimensions along which languages elaborate the structure of words in ways that do not derive transparently from the essential nature of these linguistic elements. These fall into two broad categories: properties of the overall system, and characteristics of the relation between individual morphological elements and their exponents, ways in which the realization of content in overt form does not follow from the nature of either.

2.2.1 Overall system complexity

Morphological systems taken as wholes can differ in several ways. Some languages simply have more robust inventories of morphological material, more non-root meaningful elements (typically, but not always, affixes) than others. This difference is at least logically distinct from the extent to which languages combine multiple morphological elements within a single word. And where such combined expression is found, the extent to which relations between elements (such as linear order within the word) can be predicted from their nature can also vary.

2.2.1.1 Number of elements in the system

A basic sort of complexity derives from the simple matter of how much morphological elaboration a language makes available. In this regard, the languages of the Eskimo-Aleut family may be more or less at the extreme end: these languages commonly have more than five hundred derivational suffixes, and an inflectional system that involves at least as many more. The Salish and Wakashan languages of the Pacific Northwest are also rich in derivational morphology, though perhaps not quite to the same extent: Kwakw’ala, for instance, has about two hundred and fifty suffixes of this type (along with a few reduplicative processes), as documented by Boas (1947). English plays in a slightly lower league, though it is still not trivial: Marchand (1969) identifies about one hundred and fifty prefixes and suffixes in the language. Even Chinese languages, which are sometimes claimed to have ‘no morphology’, do in fact display some. Packard (2000) describes seven prefixes and eight suffixes in standard Mandarin, and provides arguments for thinking of these as morphological elements. Perhaps there are languages that are absolutely uncomplicated in this respect—Vietnamese is sometimes suggested, although this language exhibits extensive compounding, which is surely morphological structure in the relevant sense—but this is certainly an extremely rare state of affairs, if indeed it exists at all.

Even a system with a comparatively small number of markers may exhibit complexity of a different sort when the factors determining the choice of a particular marker and the conditions of exponence (e.g. as a prefix, suffix, or infix) are themselves complex, as in the case of gender marking in Archi (Chumakina and Corbett, this volume).

2.2.1.2 Number of affixes in a word

The extent to which a language makes full use of its morphological capabilities can vary independently of the structure of the system itself. For example, the Eskimo languages all have more or less the same inventories of morphological possibilities, but some of them seem to put more weight on this aspect of their grammar than others. de Reuse (1994) observes that Central Siberian Yupik

postbases are most often productive and semantically transparent, and can be added one afteranother in sequences of usually two or three, the maximum encountered being seven. Thesesequences are relatively short in comparison to other Eskimo languages, such as C[entral]A[laskan]Y[upik], where one canfindmore than six postbases in aword, andwhere it is possibleto have more than a dozen.

What is at stake here is a difference related to the change mentioned previouslyin the extent to which modern Kwakw’ala speakers rely more on syntactic thanon morphological elaboration to express complex meanings, although the poten-

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Dimensions of morphological complexity

tial expressive capacity of the morphology remains unreduced. For comparison,Kwakw’ala is roughly similar to Central Siberian Yupik in the degree of observedcomplexity of individual words.

... Principles of morphological combination Apart from sheer numbers of pos-sible morphological elaborations of a basic stem, either in the size of the language’ssystem or in what can be observed in individual words, another dimension of alanguage’s morphological complexity is the principles that govern combinations ofmorphological markers. Inmany cases, the content (or ‘meaning’) of various parts of aword’s morphology corresponds to a structure in which some elements take semanticscope over others. The most straightforward way in which the formal correspondentsof these elements can be related is for their combination to reflect such scope relationsdirectly. Where all of the markers in question are identifiable affixes, this is achievedby having these added one after another (working out from the root), with each onetaking all of the material inside it (i.e. preceding if a suffix or following if a prefix) asits scope.We can see this inKwakw’ala, where the same affixes can combine in different orders

depending on the meaning to be expressed as shown in (4).

(4) a. ‘cause to want’:ne’nakw’-exsda-mas-ux. w

go.home-want-cause-3sgJohnJohn

gax-әnto-1sg

John made me want to go home

b. ‘want to cause’:q’aq’oua-madz-exsd-ux.w

learn-cause-want-3sgJohnJohn

gax-әnto-1sg

q-әnthat-1sg

gukwilebuild.house

John wants to teach me to build a house

Here the order follows from the content properties of the elements involved, and sodoes not contribute complexity.Contrast that situation with one in which the order of elements within a word is

specified as an autonomously morphological property, rather than following fromtheir semantics (or something else). This situation is often referred to in terms ofmorphological templates, such as what we find in the Athabaskan languages. Anexample is the templatic order of markers within the Babine-WitsuWit’en verb, givenin (5) and derived from Hargus (1997, apud Rice 2000).

(5) Preverb + iterative + multiple + negative + incorporate + inceptive + dis-tributive # pronominal + qualifier + conjugation/negative + tense + subject+ classifier + stem

This is actually one of the simpler and more straightforward template types foundwithin the Athabaskan family (and is chosen here from among the many examplesprovided in Appendix I to Rice (2000) in part because all of the marker categories are

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Stephen R. Anderson

comparatively self-explanatory). For each of these languages, we can give somewhatsimilar templates, specifying the order in which morphological elements appearwithin a rather complexwhole.Theprinciples governing such templates are notwhollyarbitrary, but various factors are involved including at least some stipulation: theordering of element classes is partly based on semantics, partly on phonology (withprosodically weaker elements located closer to the stem), and partly arbitrary.While the partial arbitrariness of such templatic morphology obviously adds com-

plexity to the language in the sense being developed here, it is notable that suchtemplates appear to be highly stable, at least grossly, over very long periods. Similarityin template structure, for example, is a significant factor in the evidence that supportsa kinship between Tlingit-Athabaskan-Eyak on the one hand, and the Yeniseianlanguages of Siberia on the other (Vajda 2010).We may ask what factors are ‘natural’ predictors of element order within words.

Like other ordering relations we find in grammar, the relative order of morphologicaloperations—commonly, but not exclusively represented by ‘morpheme order’—isgoverned by more than one principle, and these do not always agree.Basic, of course, is the notion of semantic scope: a morphological operator is

expected to take all of the content of the form to which it applies as its base. Anotherfactor, though, is the typical relation between derivational and inflectional material,with the latter coming ‘outside’ the former in the general case. This relation has beenasserted (in Anderson 1992 and elsewhere) to be a theorem of the architecture ofgrammar; a large and contentious literature testifies to the fact that this may need tobe qualified in various ways, but in the present context it is only the general effect thatmatters.There may also be rather finer-grained ordering tendencies of a similar sort (e.g.

mood inside of tense inside of agreement, etc.), as suggested in various work of JoanBybee (e.g. Bybee et al. 1994), although those also tend to have rather a lot of apparentexceptions. Linguistic theory needs to clarify the issue of which of these effects, if any,follow from the architecture of grammar, and which are simply strong tendencies,grounded in some other aspect of language.Somewhat surprisingly, phonological effects also show up, as argued by Rice for

various Athabaskan systems. This is an effect known from clitic systems: for instance,Stanley Insler argues (in unpublished work) that second position clitics in VedicSanskrit show regularities such as high vowels before low, vowel-initial clitics beforeconsonant-initial, etc. Perhaps the appearance of similar effects in morphology isanother example of how at least special clitics are to be seen as the morphology ofphrases (Anderson 2005).

.. Complexity of exponence

The other fundamental ways in which morphological structure can contributeto linguistic complexity derive from the non-trivial ways in which individual

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Dimensions of morphological complexity

morphological elements and their surface realizations can be related. As argued byStump and Finkel (this volume), the nature of this complexity depends in part on theperspective adopted, whether that of the speaker producing a complex form or that ofthe hearer attempting to recover the information it contains.The ‘ideal’ morphologicalelement, with what might be called canonical realization, corresponds to the classicalStructuralist morpheme, with a single discrete, indivisible unit of form linked toexactly one discrete unit of content. But as we know, realmorphology in real languagesis only occasionally like that, and commonly deviates from this ideal in a varietyof ways.

... Complexity in the realization of individual elements The simplest casesinvolve discontinuous aspects of a form that correspond to a single aspect of itscontent, including circumfixes and infixes. These are really two sides of the samecoin, since an infixed form can be regarded as coming to instantiate a ‘circumfixed’root. In both cases, a single morphological element has a discontinuous realization.Both, in turn, are simply the limiting, simplest variety of multiple exponence. Someexamples are rather more exuberant than this, with my personal favourite being theway negation is multiply marked in Muskogean languages such as Choctaw. All ofthese are exemplified in (6)

(6) Circumfixes: Slavey ya–ti ‘preach, bark, say’; cf. yahti ‘s/he preaches, barks,says’, xayadati ‘s/he prayed’, náya’ewíti ‘we will discuss’ (Rice 2012)

Infixes: Mbengokre [Jé] -g- ‘plural’, cf. fãgnãn ‘to spend almost all (pl), sgfãnãn (Salanova 2012)

Multiple Exponence: Choctaw akíiyokiittook ‘I didn’t go’; cf. iyalittook ‘I went’(Broadwell 2006)

In the Choctaw form, negation is marked in five independent ways: (a) substitutionof a- for -li as 1sg subject marker; (b) prefixed k-; (c) suffixed -o(k); (d) an accentualfeature of length on stem; and (e) suffixed -kii.Just as real languages involve cases in which a single element of content corresponds

to multiple components of the form of words, the opposite is also true: a singleelement of form can correspond to several distinct parts of a word’s content, eachsignalled separately in other circumstances. This is the case of cumulative morphs,typified by the -o ending of Latin amo ‘I love’. Indeed, particular elements of formmay correspond to no part of a word’s content, in the case of ‘empty’ or ‘superfluous’morphs. Conversely, a significant part of a word’s content may correspond to no partof its form.The usual ‘solution’ to this difficulty for the classical morphemes is to positmorphological zeroes as the exponents of the content involved, but it is important torealize that this is simply a name for the problem, not a real resolution of it.A variety of forms ofmorphological complexity introduced by these and other non-

canonical types of exponence are abundantly documented in the literature, beginning

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Stephen R. Anderson

explicitly with Hockett (1947) and surveyed in Anderson (1992, to appear). These alsoinclude a variety of cases in which content is indicated not by some augmentationof the form, but rather by a systematic change of one of several sorts: subtractivemorphology; Umlaut, Ablaut, and other kinds of apophony; consonant mutation;metathesis; exchange relations, and others. The bottom line is that languages aboundin relations between form and content that are complex in the basic sense of violatingthe most natural way of expressing the one by the other.

... Complexity of inter-word relations Complexities of exponence are not, ofcourse, limited to those presented by the relation between form and content inindividual words referred to in section 2.2.2.1. Another class of complications to thecanonical involves the range of forms built on the same base—the paradigm of a givenlexeme. Several chapters in this volume are devoted to the complexity of paradigms,and the issue is thus well illustrated elsewhere, so it will suffice here simply to indicatethis as a contribution to morphological complexity overall.The paradigm of a lexeme can be regarded as a structured space of surface word

forms. The independent dimensions of that space are provided by the set of mor-phosyntactic properties that are potentially relevant to a lexeme of its type (definedby its syntactic properties); each dimension has a number of distinct values corre-sponding to the range of variation in its definingmorphosyntactic property. Since eachcombination of possible values for the morphosyntactic features relevant to a givenlexeme represents a different inflectional ‘meaning’, it follows that we should expecta one-to-one correspondence between distinct morphosyntactic representations anddistinct word forms. To the extent we do not find that, the system exhibits additionalmorphological complexity.Several types of complexity of this sort can be distinguished, and each is the subject

of a literature of its own. Syncretism (Baerman et al. 2005) describes the situation inwhich multiple morphosyntactic representations map to the same word form for agiven lexeme (e.g. [hit] represents both the present and the past of the English lexeme{hit}).The opposite situation, variation,wheremultiple word forms correspond to thesame morphosyntactic representation, is less discussed but still exists: e.g. for manyspeakers of American English both the forms [dowv] and [dajvd] can represent thepast tense form of {dive}. In some cases, a paradigm may be defective (Baerman et al.2010), in that one or more possible morphosyntactic representations correspond tono word form at all for the given lexeme. A fourth kind of anomaly arises in thecase of deponency (Baerman et al. 2007), where the word forms corresponding tocertainmorphosyntactic representations appear to bear formalmarkers appropriate tosome other, distinct morphosyntactic content (as when the Latin active verb sequitur‘follows’ appears to bear a marker which is elsewhere distinctive of passive verbs). Allof these types of deviation from the expected mapping between relations of contentand relations of form contribute to morphological complexity.

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Dimensions of morphological complexity

... Complexity of allomorphy To conclude this typology of the complexity intro-duced into language bymorphological structure, it is necessary to mention the factorsthat determine how a given morphological element is to be realized. The simplesttype here, of course, would be for each morphological unit to have a single, distinctrealization in the forms of words in which it appears, but of course morphology hasalways attended to the fact that a single morphological element can take multipleshapes, the very definition of ‘allomorphy’. Allomorphy can contribute to the complex-ity of the system to varying degrees, though, depending on the bases of the principlesunderlying its conditioning.Where the variation results from the independently motivated phonology of the

language, this does not contribute additional complexity to the language in the senseunder discussion here, as already noted in section 2.1.2. In other cases, though,although the conditions for allomorphic variation can be stated in purely phonologicalterms, the actual variants that appear are not predictable from the phonology itself.Thus, in Warlpiri the marker of ergative case is -ŋku if the stem is exactly bisyllabic,but -™u when added to stems that are trisyllabic or longer. A rather more exten-sive instance of such phonologically conditioned allomorphy is found in Surmiran(a Rumantsch language of Switzerland), where essentially every stem in the languagetakes two unpredictably related forms, depending on whether the predictable stressconditioned by its attendant morphology falls on the stem itself or on an ending(Anderson 2011).Since phonological conditioning factors are, at least in principle, transparent, they

contribute less complexity (again, in principle) than cases in which unpredictableallomorphy is based on specific morphological categories or on semantically orgrammatically coherent sets of categories. These, in turn, appear less complex thanones in which the allomorphy is conditioned by (synchronically) arbitrary subsetsof the lexicon, such as the Celtic mutations alluded to in section 2.1.2. Perhaps thesummit of complexity with regard to the conditioning of allomorphy is the case wherespecific, unpredictable variants appear in a set of semantically and grammaticallyunrelated categories whose only unity is the role it plays in determining allomorphy.Such collections of categories, called ‘morphomes’ by Aronoff (1994), may recur in anumber of rules within a language without having any particular coherence beyondthis fact. See also Round (this volume) for a further elaboration of this notion andillustrations of several distinct types of ‘morphomic’ structure.Another type of allomorphic complexity is presented by formally parallel elements

that behave in different ways, something that has to be specified idiosyncraticallyfor the individual elements. This is the issue of distinct arbitrary inflectional classesinto which phonologically and grammatically similar lexemes may be divided. Themembers of a single word class, while all projecting onto the same paradigm space,may nonetheless differ in theways inwhich those paradigm cells are filled. Conversely,affixes that are formally similar may induce different sorts of modifications in the

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Stephen R. Anderson

stems to which they attach, as we saw already in section 2.1.2 with the three affixaltypes in Kwakw’ala.A related kind of complexity is found in languages where morphological elements

can display one of several distinct types of phonological behaviour. In earlier theoriessuch as that of Chomsky andHalle (1968), this was typically represented as a differencein the type of boundary associated with the element. Lexical Phonology incorporatedthis into the architecture of the grammar as the difference between the morphology of‘Level I’, ‘Level II’, etc., where specific elements are characterized by level, distinguishedfrom clitics. Included in this category perhaps should be the case of clitics attached atvarious prosodic levels, as in Anderson (2005).

. The source(s) of morphological complexity

This survey of ways in which word structure contributes complexity to the grammarsof languages naturally raises the question of where these things come from. As arguedin section 2.1, they do not follow from the intrinsic nature of the task of mappingbetween content and form, so where do they come from?Empirically, it seems clear that most of the ways in which grammars are morpho-

logically complex arise as the outcome of historical change, restructurings of varioussorts. Many of these fall under the broad category of ‘Grammaticalization’. Canon-ically, this involves the development of phonologically and semantically reducedforms of originally independent words, leading eventually to grammatical structure.Originally full lexical items may generalize their meanings in such a way as tolimit their specific content, leading to their use as markers of very general situationtypes. When this happens, they may also be accentually reduced, leading to furtherphonological simplification. This, in turn, may lead to their re-analysis as clitics,with an eventual development into grammatical affixes, and so new morphologyis born.But within linguistic systems, there are other possible paths that can lead to

morphology where before there was only phonology and syntax. For example, phono-logical alternations, when they become opaque in some way, can also be reinterpretedas grounded in themorphology instead.The standard example of such a change isGer-man Umlaut, and the overall pattern of development is quite familiar. A similar pointcan be made on the basis of the re-analysis of derived syntactic constructions, oncethey become opaque, being re-analysed as syntactically simple but morphologicallycomplex (Anderson 1988).Historical developments thus often yield systems that aremore complex inmorpho-

logical terms. But the opposite is true as well: when systems become more complex,that may trigger restructuring which reduces the complexity. Paradoxically, changeproduces complexity, but complexity can result in change.

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Dimensions of morphological complexity

It is sometimes assumed that morphology always has its origins in some other partof the grammar, particularly the syntax, as expressed in Givón’s (1971) aphorism thattoday’s morphology is yesterday’s syntax. But there are examples where that cannot bethe case, showing that morphology seems to have some sort of status of its own.This is demonstrated particularly elegantly in phenomena found in a language that

is of demonstrably recent origin, Al Sayyid Bedouin Sign Language as studied byMeiret al. (2010). In this language, more or less the entire history of the emergence ofgrammatical structure can be observed.The interesting point for our purposes here isthat by the third generation of speakers of this language, morphological structure hasbegun to emerge in the form of regularities of compound formation. One of these isthe generalization that in endocentric compounds, modifiers precede their heads: e.g.pray∧house ‘mosque’.This is hardly an exotic structure. But importantly, it is one thatcannot have come from the syntax, because we also find that in syntactic formations,heads precede their modifiers. This demonstrates that this bit of morphology really isnot entirely parasitic on other areas of grammar, however often the origin of specificmorphology can be found elsewhere.

. Conclusion

I started with what seems a logically plausible conception of what makes complexity:basically, some property of a system that cannot be derived from its essential character.When we look at the basic nature of human language, it seems to follow that anymorphological property is of this sort, since syntax and phonology would seem tosuffice unaided to fulfil the needs language serves. But given the pervasiveness ofmorphology in the world’s languages, and the tendency of morphological structureto be created in linguistic change rather than being uniformly eliminated, this wouldappear to be a sort of reductio ad absurbum of this notion of complexity, as applied tolanguage.In particular, what seems complex to us as scientists of language may or may not

pose problems for users of language. Strikingly, we find that little children seem tohave no remarkable difficulty in acquiring languages like Georgian, or Mohawk, orIcelandic along more or less the same time course as children learning English orMandarin. Of course, it might be that little children are just remarkable geniusesat solving problems that seem impenetrable to scientists. But it seems more likelythat morphology, despite the fact that a priori it seems like nothing but unmotivatedand gratuitous complication, is actually deeply embedded in the nature of language.Althoughmorphological structure of any sort would seem to be a serious challenge tothe notion that human languages are ‘optimal’ solutions to the problem of mappingcontent to form, morphology seems to be a fact of life—and a part of the humanlanguage faculty. And that has to give us pause about our ability to say anything seriousabout what is or is not complex inmorphological systems in any deep and basic sense.

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Stephen R. Anderson

Acknowledgements

I am grateful to audiences at a SSILA special session on morphological complexity inthe languages of the Americas at the Linguistic Society of America Annual Meeting inPortland, Oregon in January 2012, and at the Morphological Complexity Conferencein London, for discussion of this material. I am especially grateful to Mark Aronoff,Andrew Carstairs-McCarthy, and Marianne Mithun for ideas, examples, and com-ments incorporated here. My work on Kwakw’ala was supported by grants from theUS National Science Foundation to UCLA.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Part II

Understanding Complexity

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes,and metamorphomes

ERICH R. ROUND

. Three species of morphome

In this chapter, I wish to draw attention to a distinction between three species ofmorphome. A morphome (Aronoff 1994) is a category which figures prominently inthe organization of a language’s morphological system, yet in its most intricate man-ifestations is anisomorphic with all syntactic, semantic and phonological categoriesthat are active elsewhere in the grammar. Research into morphomes has intensified inrecent years and it is possible now to formulate a more nuanced theory of this objectof study. To that end, a distinction can be drawn between what I propose to termrhizomorphomes, meromorphomes, and metamorphomes. All three are equallymorphomic categories, but of different kinds. A summary appears in Table 3.1.Rhizomorphomes are categories that pertain to sets of morphological roots.

They divide the lexicon into classes whose members share similar paradigms (ofinflectional word forms, derived stems, or both). Classic examples of rhizomorphomiccategories are declensions and conjugation classes (Aronoff 1994). Meromorphomesare categories that pertain to sets of word formation operations, whose task is to derive

Table . Morphome types and their mode of categorization

Morphome types Pertain to Divide up By similarity of

Rhizomorphome sets of roots the lexicon paradigmsMeromorphome sets of word formation

operationsmorphologicalmappings

patterns of exponence

Metamorphome sets of cells in aparadigm

paradigm types incidence of(realizations of)meromorphomes

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

the pieces of individual word forms. Thus, meromorphomes inhere in the organisa-tion of a language’s morphological exponence system, as I will detail in this chapter.Metamorphomes are categories that pertain to distributions across a paradigm,of cells which contain pieces of exponence that are realizations of meromorphomiccategories. I will have more to say about metamorphomes towards the end of thechapter.The bulk of the chapter will be devoted to clarifying how meromorphomes differ

from similar concepts in current, formal morphological theory, and to presentingempirical phenomena which would appear to justify their use. The mode of argu-mentation is constructive, and so it will suffice to provide one set of examples, to betaken from the Kayardild language, whose inflectional system is both strongly orga-nized around meromorphomic categories and problematic for existing alternatives.Kayardild is also convenient because it is free from the distraction of rhizomorphomiccategories, that is, it has no conjugation or declension classes. Towards the end of thechapter I come to the topic of metamorphomes, such as the ‘L, U and N morphomes’of Romance languages (Maiden 2005). I also highlight parallels in how the threemorphome species can be complex

. Kayardild and its syntax–morphology interface

Kayardild is a member of the non-Pama-Nyungan, Tangkic language family of north-ern Australia. It was traditionally spoken primarily on Bentinck Island in the southof the Gulf of Carpentaria (Evans 1995a). Typologically, it can be characterized as anagglutinative, purely suffixing, dependent-marking language. Its argument alignmentis nominative–accusative. It has a fixedword order inDPs, but otherwise word order isfree, to the extent that any order appears to be possible under appropriate contextualconditions. DPs and certain verbs of movement and transfer are freely elided if themeaning is recoverable from context. The language is treated in a descriptive gram-mar (Evans 1995a), and in a formal analysis of phonology, morphology and syntax(Round 2009, 2013). The description of Kayardild’s syntax–morphology interfacewhich appears here is based on the formal analysis of Round (2013), which in turnis integrated with a formal analysis of the language’s phonology (Round 2009). Thisallows for reliable distinctions to be drawn between generalizations that are purelymorphological versus those that are phonological. Empirically, the analyses below andinRound (2013) are based on a corpus ofmaterials collected by StephenWurm in 1960,Nick Evans from 1982–2004, plus materials collected by the author in 2005–7. Otheraspects of the analysis to follow are expanded upon in Round (2010, 2011, forthc.,in prep).Round (2013) demonstrates that even though word order provides no evidence for

it, it is possible to infer the existence of a complex clause structure in Kayardild asshown in Figure 3.1, where DPs fit into the empty positions, and S′′ can be embedded

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

S�

VPβ

VPα

V�β

V�α

S�α

S�β

COMP

TAMA

TAMA

TAMA

TAMT

SEJ

VPε

VPγ

VPδ

V

S

NEG; TAMT

Figure . Kayardild clause structure and attachment of features

as a sister of V. The structure in Figure 3.1 follows not from any a priori choiceof syntactic theory, but empirically from the facts of inflectional morphology andfrom alternations in argument structure and discourse function, such as passivization,topicalization, and focalization. Figure 3.1 also displays certain inflectional featuresattaching to various nodes in the tree; we will return to this shortly.Overall, the inflectional system of Kayardild can be analysed in terms of sevenmor-

phosyntactic features (i.e. inflectional features), listed in Table 3.2. As a consequenceof how features are assigned to individual words, a word may be specified for a valueof a given feature, or it may be left unspecified for that feature; in some instances, itmay even be specified for multiple values of a feature. Thus, for example, some nounswill be specified for a certain value of case, while some will have no specification forcase, and some are specified for multiple case values. Table 3.2 lists the values thateach of the seven features can take, when they are specified.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

Table . Inflectional features and their potential values

Feature Abbr. Possible values

Complementization comp: [+]Sejunct sej: [+]Negation neg: [+]Athematictense/aspect/mood

tama: athematic antecedent, athematic directed,athematic incipient, athematic precondition,continuous, emotive, functional, future,instantiated, negatory, present, prior

Thematictense/aspect/mood

tamt: actual, apprehensive, desiderative, hortative,immediate, imperative, nonveridical, past,potential, progressive, resultative, thematicantecedent, thematic directed, thematicincipient, thematic precondition

Case case: ablative, allative, associative, collative,consequential, dative, denizen, donative,genitive, human allative, instrumental,locative, oblique, objective ablative, objectiveevitative, origin, privative, proprietive,purposive, subjective ablative, subjectiveevitative, translative, utilitive

Number num: dual, plural

The distribution of these features across the words in any sentence can be accountedfor in terms of their relationship to syntactic structures as follows. Features attachinitially to some node in the syntactic structure and then percolate down to allsubordinate nodes, and thus onto all subordinate words. The one constraint onpercolation is that S′′ nodes are opaque to it, and thus an embedded S′′ constituent willnot inherit features from its matrix clause. This general model ensures that differentpoints of features’ initial attachment in the syntactic tree lead to different distributionsof those features across the clause, or conversely, that by studying the distributionsof features across the clause, we can infer the syntax shown in Figure 3.1 and theaccompanying attachment points of features.According to thismodel, and as shown in Figure 3.1, the two features associatedwith

so-called complementized clauses, comp and sej, will attach highest in the clause.Attaching further down are a negation feature neg and two tense/aspect/mood (tam)features: a ‘thematic’ feature tamt and an ‘athematic’ feature tama1 Each individual

1 Both tamt and tama express tense/aspect/mood information. The terms ‘thematic’ and ‘athematic’refer to the morphotactic behaviour of the features’ exponents. The thematic feature, tamt is only everrealized on a stem whose morphomic representation ends with a morphomic ‘thematic’ element, whereastama is only ever realized on a stemwhosemorphomic representation does not endwith a thematic (Round: –, forthc.).

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

value of tama and tamt attaches to a specific one of the nodes indicated in Figure3.1. The feature case attaches to DP nodes, and num to either DP or NP (which sitswithin the DP).The near-unrestricted nature of percolation leads to words getting associated with

many inflectional features simultaneously. Consider for example the word ngur-ruwarrawalathinabamaruthurrka, glossed in (1). It was recorded in spontaneousspeech and has the inflectional properties of word ω in Figure 3.2. It inherits thefeatures num:plural and case:ablative that attach to its nearest DP node aboveit, plus case:dative from the DP node above that, tama:present from VPγ ,tamt:immediate from S, sej:[+] from S′α, and comp:[+] from S′β(1) ngurruwarra

fishtrap-walathnum:plural

-inabacase:ablative

-maruthcase:dative

-urrkatamt:immediate&sej:[+]‘for the ones from the many fishtraps’ (Evans 1995a: 66)

Two comments are in order at this point, regarding features which are present butwithout receiving overt realization, and features which are not present at all.Not all of the features that percolate down onto a word will be overtly realized.

As in many languages, some features’ realization is precluded by the realization ofothers, that is, there is blocking or disjunctive ordering of certain features’ realization.Specifically, of the two features comp and sej, a word will inflect overtly only forone. Accordingly, the word ngurruwarrawalathinabamaruthurrka in (1) inherits bothcomp:[+] and sej:[+], yet inflects overtly only for sej:[+]. Similarly, a word willinflect overtly only for one of the features tama or tamt in a given clause.2 The wordin (1) inherits both tamt:immediate and tama:present, but inflects overtly only fortamt.A separate matter is that not all syntactic nodes which potentially associate with a

given feature always do associate with it. That is to say, individual clauses, DPs, andNPs differ from one another not only in terms of the specific value of the featuresthat they associate with, but also in terms of whether they associate with a value ofthe feature at all. For example, the clause shown in Figure 3.2 is complementized, andhence the features comp and sej attach to their appropriate nodes, high in the tree.However, in uncomplementized clauses, the features comp and sej are simply absent;they do not attach in the syntax, and hence do not percolate onto any words, and thusno word in the clause will inflect for them.3 The feature neg only appears in negated,

2 Words may inflect for two such features if they originate in two different clauses (i.e. matrix andsubordinate), on which, see Round (: –).3 For example, the bracketed clause in (a) is complementized, and so associates with comp and sej

(of which only sej is realized), while the bracketed clause in (b) is uncomplementized. A third possibility

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

COMP:[+]

TAMA:PRESENT

SEJ:[+]

TAMT:IMMEDIATE

DP

DP

CASE:DATIVE

NUM:PLURAL, CASE:ABLATIVE

...ω...

S�

VPβ

VPα

V�β

V�α

S�α

S�β

VPε

VPγ

VPδ

V

S

Figure . Features which percolate onto word ω

appears in (c). This contains a ‘nonsejunct’ complementized clause (Round : –) which associateswith comp but not sej. In the absence of sej, the comp feature is overtly realized.

(a) Jinaawhere

bijarrb,dugong

[ ngumbaa2sg.sej

kuruluth-arra-nth ] ?kill-past-sej

‘Where is the dugong which you killed?’ (Evans : )

(b) Jinaawhere

bijarrb?dugong

[ Nyingka2sg

kuruluth-arr ] ?kill-past

‘Where is the dugong? Did you kill it?’

(c) Jinaawhere

bijarrb,dugong

[ nyingka2sg.comp

kuruluth-arra-y ] ?kill-past-comp

‘Where is the dugong which you killed?’ (Evans : )

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

main clauses, thus inmost clausesneg is absent (this is true of the clause in Figure 3.2).Similarly, only some DPs and NPs are associated with case and/or number features.In Figure 3.2, the higher of the two DPs lacks any specification for number.In all cases, the logic behind the analysis of an ‘absent’ feature f is the same. In some

clauses, the distribution of inflection for feature f has allowed us to infer the existenceof a given syntactic node n, to which f attaches. Words below node n inherit f, whilewords above it do not. Then, in a syntactically parallel clause, no such inflection forf is observed; all other evidence suggests that the syntactic structure is equivalent,yet words sitting below node n exhibit a lack of inflection for f equivalent to thatexhibited by words that sit above node n. The analysis is that in such clauses, f isabsent.The system outlined here is one which leads to considerable morphological exuber-

ance, as demonstrated in research by Evans (1995a, 1995b). Our focus, however, willnot be exuberance, but the detail of how Kayardild’s inflectional features are realized.

. Identity of exponence: Rules of referral

In sections 3.3–3.5 I analyse the realizational component of Kayardild’s inflec-tional morphology from the point of view of inferential–realizational morphology(Matthews 1974, Anderson 1992, Stump 2001). My aim is to introduce a range ofempirical properties of inflectional exponence in Kayardild, and as I do so, to orientthem towards certain prominent points in the landscape of formal morphologicaltheory. Since in these few short pages I cannot do justice to the full range of potentialanalyses of the data, my strategy is to draw attention to aspects of the data which arewell- or ill-accommodated by certain, prominent formalisms, with the understandingthat similar issues will arise for at least some related approaches. For explicitness, I willassume that there exists a component of the grammar, the realizational morphology,which takes as its input a lexeme represented by a lexical index L, plus a structure Σ ofinflectional features, and whose output is the underlying phonological representationof an inflected word form. The realization of feature structures Σ is analysed intoconstituent parts, expressible by inflectional realizational rules, the first kind of whichis an inflectional rule of exponence (Infl-RExp) of the type shown in (2), where a singlerule realizes some sub-structure σ of the total feature structure Σ for the lexeme L.Therule takes as one of its inputs the word form as derived so far by prior rules, indicatedas φi.

(2) Infl-RExpσ 〈L, Σ, φi〉 = def 〈L, Σ, φi+/ki/〉The output of an inflectional rule of exponence is a tuple which preserves the lexicalindex L and, following theories such as A-morphous morphology (Anderson 1992)and Paradigm Function Morphology (Stump 2001), also preserves the inflectionalfeature structure Σ. Turning to the realization itself, the rule outputs a phonological

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

form which is typically a modification of the input. In (2) the rule suffixes the string/ki/ to its phonological input.4 To generalize this last expression, we can employan operation variable p as in (3), where p may stand for any kind of phonologicaloperation, such as the suffixing of a string /ki/ (4a), or a nonconcatenative operationsuch as the application of ablaut (4b).

(3) Infl-RExpσ 〈L,Σ,φi〉 = def 〈L,Σ,p(φi)〉(4) a. p(φ) = φ+/ki/

b. p(φ) = ablaut(φ)When analysing the realization of inflectional feature structures in Kayardild, thecentral point of interest is shared exponence, and its many shades of variation. Tobegin, consider for example the set of inflectional feature-values in (5). Each feature-value is distinct from the others. They have different distributions in the clause (i.e.they are analysed as attaching initially to different syntactic nodes), they correlate withdifferent semantics, and they enter into different paradigmatic relationshipswith otherfeature-values in the system. Nevertheless, all of them are realized phonologically asthe suffix /+ki/.(5) a. case:locative → /+ki/

b. tama:present → /+ki/c. tama:instantiated → /+ki/d. comp:[+] → /+ki/e. tamt:immediate → /+ki/

Likewise, all of the feature-values in (6) are realized phonologically by the suffix /inta/.Nor is this unusual in Kayardild. Of the fifty-five feature-values in the Kayardildinflectional system, just under half of them share their phonological exponence exactlywith at least one other feature-value (a full list of feature-values and their exponentsappears in the appendix).

(6) a. case:oblique → /-inta/b. tama:emotive → /-inta/c. tama:continuous → /-inta/d. sej:[+] → /-inta/e. tamt:hortative → /-inta/

Shared exponence, or syncretism, is a core concern for morphological theory. Onestandard method of formalizing syncretism is via a rule of referral (Zwicky 1985,Stump 1993).5 According to this approach, a rule of exponence will exist for one

4 At this juncture, the choice of a suffix /ki/ is purely for illustrative purposes, though as we will see, /ki/is an inflectional suffix in Kayardild.5 Of course many other approaches exist, and for some, similar issues will arise. For example, this will

be true of bi-directional rules of referral (Stump :) and other static declarations of identity ofexponence between cells in the paradigm of a single lexeme.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

Table . Features for which lexical stems can inflect

Feature Nominal lexical stem Verbal lexical stem

case � —tama � —comp � —sej � —tamt — �neg — �

feature structure σ, for example for σ={case:locative} in (7a), and then a rule ofreferral will re-route the realization of a second feature structure τ, for exampleτ={tama:present} in (7b), back to the first rule, (7a).

(7) Felicitous inflectional rules of referral in Kayardild

a. Infl-RExpcase:locative 〈L,Σ,φi〉 = def 〈L,Σ, φi+ki〉b. Infl-RReftama:present 〈L,Σ,φi〉 = def Infl-RExpcase:locative 〈L,Σ,φi〉

By positing a rule of referral, the analyst explicitly expresses the fact that the sharedexponence of these two feature structures σ and τ is non-accidental; it arises specif-ically because the realization of both sets is effected by the same rule of exponence, inthis case, by (7a). Many cases of syncretism can be analysed elegantly using rules ofreferral. However, taken in their simplest form, rules of referral will fail under certainconditions. Relevant to this discussion is the scenario in which two feature structuresσ and τ share their exponents, yet stems that inflect for σ don’t inflect for τ, and stemsthat inflect for τ don’t inflect for σ. This case arises in Kayardild, as follows. Table3.3 indicates the features which nominal and verbal lexical stems can inflect for inKayardild.Nominal stems inflect directly for values of case, tama, comp, and sej, but not

tamt or neg, while verbal stems inflect directly for values of tamt and neg, but notcase, tama, comp, or sej. Consequently, no lexical stem can inflect directly for bothcase andtamt, for example. Consider nowa rule of referral which attempts to expressthe identity of the exponents of case:locative and tamt:immediate (cf. (5))such asrule (8a) or (8b).

(8) Infelicitous inflectional rules of referral in Kayardild

a. ∗ Infl-RReftamt:immediate 〈L,Σ,φi〉 = def Infl-RExpcase:locative〈 L,Σ,φi〉b. ∗ Infl-RExpcase:locative 〈L,Σ,φi〉 = def Infl-RReftamt:immediate 〈L,Σ,φi〉

Both rules fail, since in effect they attempt to define ‘a stem’s inflection fortamt:immediate’ with reference to ‘its inflection for case:locative’, which does

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

not exist, or vice versa, ‘a stem’s inflection for case:locative’ with reference to ‘itsinflection for tamt:immediate’, which does not exist. At this point then, it is apparentthat if we wish to express identities of exponence such as those in (5) and (6) in a non-accidental manner, rules of referral alone will not suffice.One solution to this impasse, following Corbett (2007:29), is to introduce ‘virtual

rules’ for at least one set of lexical stems such as (9a). Virtual rule (9a) is never useddirectly to realize case:locative on lexical verbal stems, but since it is present in therule system, it is available to be routed to by a rule of referral such as (9b).6

(9) Felicitous rules of referral in Kayardild (†= ‘virtual rule’)a. † Infl-RExpcase:locative 〈L,Σ,φi〉 = def 〈L, Σ, φi+ki〉b. Infl-RReftamt:immediate 〈L,Σ,φi〉 = def Infl-RExpcase:locative 〈L,Σ,φi〉

Virtual rules (or in a paradigm, virtual cells) are discussed by Corbett (2007) inthe context of deponency. Their function is to allow the inflectional system to makereference to the inflectional exponence, E, of some feature structure σ, in cases whenE does not function as a realisation of σ. To take a classic example of deponentverbs in Latin the ‘passive’ inflectional exponents, E, for a deponent verb do notfunction to realize morphosyntactically passive feature structures, σ, for the lexeme,because the verb is not used in the passive; nevertheless, the inflectional systemdoes refer to the exponents E, deploying them as the verb’s active inflection. In thecase of Kayardild, we would say that the suffixing of /ki/ to a lexical verbal stem(exponent E, generated by (9a)) does not function in Kayardild as the realizationof case:locative (σ), yet the system still makes use of it, by referring to it in(9b). Notwithstanding these parallels, there are some differences. In the Latin case,the virtual-rule analysis captures an intuition that deponent verbs have well-formedpassive exponents that are nevertheless used for an unusual function.The notion of a‘well-formed passive exponent’ relies on a comparison with non-deponent verbs, forwhich the same exponents do indeed realize the passive. In Kayardild however, no

6 A more radical repair to the infelicitous rules in () would be to propose that Kayardild lexical verbalstems participate in a narrow form of ‘interclass mismatch’ (Spencer ). That is, they do not inflect fortamt:immediate at all but instead literally inflect for case:locative as if they were nominal stems.Thereappears to be little to recommend such an analysis of these facts. Consider first, that the features tamtand tama associated with a given clause are disjunctively ordered: a word may never inflect for both, but awordmay freely inflect for both tama and case. If we supposed that a verbal stem literally inflects for caseinstead of tamt, we would require a further stipulation that on verbal, but not on nominal, stems, case isdisjunctively ordered with tama—that is, we would require the morphological distinction between verbaland nominal stems to be maintained, even while saying that the verbal stems are being morphologicallymismatched with nominals. Second, not every tamt value shares an exponent with case, thus we wouldbe forced to say either that verbal stems inflect sometimes for case and other times for tamt, or thatverbal stems always inflect for case, but that some values of case appear only on verbal stems, and not onnominal.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

lexical verbal stem ever has a true case:locative inflection, and thus there is littlesense in which rule (9a) is reproducing a ‘well-formed exponent’ in the sense thatis true for Latin deponent verbs.7 Consequently, there must be some concern that avirtual-rule analysis of theKayardild facts is something of a sleight of hand: technicallyfeasible, but with little explanation for why. In the next section I present an alternativewhich avoids any such connotation.To summarize thus far, the appeal of rules of referral is that they express the non-

accidental nature of syncretism. They do so by referring the realization of one set offeature-values τ to the exponence defined for another set σ. When this is taken in itssimplest form, a limitation is that the referral must be to an exponent which actuallyexists. In Kayardild though, some identities of exponence occur in the absence of anysingle lexical stem class that will inflect for all of the features involved (in this sense,they contain a dimension of complexity which does not characterize the phenomenaconsidered in the chapters by Donohue or Koenig and Michelson). A solution whichsuffices for the data considered in this section, is to posit virtual rules of exponencesuch as (9a).On at least one interpretation however, a drawback is that wemust define,for example, ‘well formed case inflections’ for lexical verbal stems, even though nosuch stem in the language actually inflects for case.

. Identity of exponence: the Separation Hypothesis

In this section I introduce additional facts of Kayardild inflection for which theeventual analysis will bemorphomic. I will notmove directly to amorphomic analysishowever, since there is another approach available which handles the data elegantly,even though it will ultimately fall short of accounting for the full range of facts. Thisapproach follows from Beard’s Separation Hypothesis (1995).According to the Separation Hypothesis we may suppose that in any language, the

phonological operations referred to by the morphology are represented independentof the rules in which they figure. The language therefore possesses a repertoire ofoperations, of the kind illustrated by the subset of Kayardild operations shown in (10).

(10) Inventory of phonological operations referred to by the morphology

a. Φ1(φ) = φ+/ki/b. Φ2(φ) = φ-/inta/c. Φ3(φ) = φ-/napa/d. Φ4(φ) = φ-/˙iŋ/e. Φ5(φ) = φ+/kurka/

7 One might argue that this is a case of making the wrong comparison: if we compared lexical verbalstemswith nominal stemswewould see that there arewell-formed exponents of case:locative.However,that comparison is also imperfect, cf. fn..

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

These operations can then figure in rules of exponence as constants, as illustratedin (11).

(11) All rules are felicitous:

a. Infl-RExpcase:locative 〈L,Σ,φi〉 = def 〈L,Σ, Φ1(φi)〉b. Infl-RExptama:present 〈L,Σ,φi〉 = def 〈L,Σ, Φ1(φi)〉c. Infl-RExptamt:immediate 〈L′,Σ,φi〉 = def 〈L′,Σ, Φ1(φi)〉

The rules in (11) cover the same facts of exponence discussed in 3.3, but this time allthree rules are rules of exponence; rules of referral are not necessary. In (11), the factthat shared exponence is non-accidental has been expressed through the use of theoperation constant Φ1 in all three instances. This approach offers a straightforwardaccount of identities of exponence, including those which lack any single lexical stemclass that inflects for all of the feature structures involved (i.e. the cases which requiredvirtual rules in 3.3). Moreover, as Beard (1995) emphasizes, these phonological oper-ations need not be active solely in a language’s inflectional component. The sameoperations can also appear in derivational rules, as indicated in (12).

(12) Deriv-RExpδ〈L→L′,�,φi〉 = def 〈L→L′,�,Φ1(φi)〉The rule in (12) is a derivational rule of exponence, which figures in the derivation ofone lexeme L′ from another L. In its formulation, the same phonological operationΦ1, which in (11) features in the realization of inflectional features, now realizes thederivational feature structure, δ. This aspect of Beard’s approach is directly applicableto Kayardild, where most of the exponents employed by the inflectional system arealso used derivationally. For example, the same /+ki/ which realizes the inflectionalfeatures already discussed also figures in the derivation of place names, such as thosein (13).

(13) Orthographic form: a. Jawar -i b. Makarr -kiUnderlying form: /cawa˙ +ki/ /makark +ki/Literal gloss: ‘oyster sp. (+ki)’ ‘anthill (+ki)’

This ease of relating inflectional and derivational exponence is welcome in Kayardild,where of the fifty-five feature-values in the inflectional system, over two-thirds sharetheir exponence with some or other derivational operation (a list appears in theappendix; see Round 2011 for further analysis).8A third area in which Beard’s Separation Hypothesis offers an elegant account of

Kayardildmorphology is in cases such as (14)–(16). Here, the exponents of the feature-

8 Expressing the same relatedness using a rule of referral would presumably require some augmentationof the rule system, to accommodate the fact that the derivational use of the exponent involves a relationshipbetween two different lexemes, whereas the inflectional use does not.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

values in (15) and (16) have parts which are identical to each other and to the wholeof the realization in (14).

(14) case:locative → /+ki/tama:present → /+ki/tama:instantiated → /+ki/comp:[+] → /+ki/tamt:immediate → /+ki/

(15) case:ablative → /+ki-napa/tama:prior → /+ki-napa/tama:precondition → /+ki-napa/

(16) case:allative → /+ki-˙iŋ/tama:directed → /+ki-˙iŋ/tama:directed → /+ki-˙iŋ/

This is an aspect of shared exponence in Kayardild which will present a challengeto formalisms that require paradigm cells to be the basis of shared exponence,since here it is not cells which are identical, but parts of the forms in those cells.That is, the sharing of forms in (14)–(16) involves relationships which are not merelyparadigmatic, but syntagmatic also.The sharing of parts of exponence, as in (14)–(16), is expressed easily if we accept the

Separation Hypothesis. Since phonological operations have an existence independentof the rules which refer to them, and since they are referred to in rules by way ofoperation constants such as Φ1, it is possible to define rules which refer to multipleoperations. Drawing on several of the operations from (10), the inflectional data in(14)–(16) can be analysed in terms of realization rules such as those in (17).

(17) a. Infl-RExptama:present〈L,Σ,φi〉 = def 〈L,Σ, Φ1(φi)〉b. Infl-RExptama:prior〈L,Σ,φi〉 = def 〈L,Σ, Φ3(Φ1(φi))〉c. Infl-RExptama:directed〈L,Σ,φi〉 = def 〈L,Σ, Φ4(Φ1(φi))〉

This is useful in Kayardild, because patterns of partially shared exponence are com-mon. Of the fifty-five feature-values in the inflectional system, more than half enterinto partly shared exponence with some other feature-value.We have now seen three ways in which the inflectional system of Kayardild sys-

tematically exploits identities of exponence in manners which go beyond the simplestcases at the beginning of 3.3. First, identity of exponence regularly transcends anysingle class of stems which could inflect for all of the feature structures involved;second, identity of exponence can cross the inflection–derivation divide; and third,it can exist between the parts, as well as the whole, of feature structures’ exponents.These identities of exponence are expressible in a straightforward manner using anapproach based on Beard’s (1995) Separation Hypothesis, which posits an existence

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

for phonological operations independent of the rules in which they figure; rulesof exponence then refer to phonological operations through the use of operationconstants. Rules of referral are unnecessary.

. Identity of pattern and dimensions of exponence: meromorphomes

As flexible as it is, the approach to identity of exponence based on the SeparationHypothesis is not powerful enough to capture the full range of shared exponence inKayardild inflection. Before proceeding further, one point warrants attention. I havebeen entertaining a formulation of rules that employs operators as constants in orderto generate realizations. A consequence of this is that the full inflection of a lexeme,after a series of realization rules has applied, will be something like (18a), in which aseries of operations applies in an appropriate order to an initial stem form, φ.9

(18) a. 〈L,Σ, Φ17(Φ34(Φ5(Φ1(φ)))) 〉b. 〈L,Σ, 〈Φ17, Φ34, Φ5, Φ1, φ〉〉

This kind of representation could just as well be expressed as in (18b) with the operatorconstants arranged in a list. The list can then itself be regarded as a string over whichcertain wellformedness constraints might hold, and in Kayardild, such constraints doexist. To appreciate their nature, we must first return briefly to the syntax–inflectioninterface.In the general case, the linear order of feature-values’ realization in Kayardild is

reflective of the height of the syntactic node to which the feature-value originallyattaches: the higher in the syntactic tree a feature originates, the further out fromthe stem it is realized. However, there are exceptions to this rule (Evans 1995a: 129–33, 1995b). Some realizations must be last, i.e. farthest from the stem.10 So, what classof realizations falls under such constraints? It is not a coherent morphosyntactic class(e.g. all tama values), nor is it a coherent phonological class (e.g. all unstressed affixes)or a position class (e.g. all inflectional exponents immediately to the right of the stem),rather it is a coherent class expressed in terms of the operator constants themselves.For example, all of the feature-values listed in (6) must be realized last (i.e. they mustoccur leftmost in a list like (18b)), irrespective of their origin in the syntax. Whatunites those feature-values is that their realization adds the operator Φ2 to the overallrepresentation of a word form. Facts such as this (see Round, forthc. for a full account)indicate that a representation like (18b) is not merely a linguists’ convenience, a by-product of some stepwise transformation from inflectional feature structure Σ to a

9 In fact it can be argued that the general case, the stem form φ will possess a similar internal structure(Round ).10 In this discussion I abstract away from the role of a semantically empty suffix, called the ‘termination’

which in fact appears further out than all other suffixes. Regarding the status of the termination see Round(; : –, –; forthc.).

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

phonological form, but a representation which is linguistically significant. It possessesits own set of wellformedness constraints. Moreover, neither its constituent unitsnor the units in terms of which its constraints are expressed are isomorphic withinflectional features; nor are they isomorphic with phonological categories. That isto say, the representation in (18b) is morphomic. The constants which comprise thatrepresentation are a species of morphome, and the constraints on their linearizationare inherently morphomic constraints.In terms of our overarching topic, Kayardild morphomes are an instance of mor-

phological complexity. They feature prominently in the regulation of the morpho-logical system, yet they do not enhance its expressive capability (cf. Anderson, thisvolume). Indeed, it is a question for future research to ascertain what payoffs, if any,such complications provide.11Returning to the specifics of our analysis, and to reflect its morphomic nature, I will

change the symbols Φi to Mj , so for example, the rule in (19a) becomes (19b) and therepresentation in (20a) becomes (20b). This is more than just a notational variation,for aswewill see shortly, the unitsMj prove to be distinct fromphonological operators.

(19) a. Infl-RExptama:directed〈 L,Σ,φi〉 = def 〈 L,Σ, 〈Φ3, Φ1, φi〉〉b. Infl-RExptama:directed〈 L,Σ,φi〉 = def 〈 L,Σ, 〈M3, M1, φi〉〉

(20) a. 〈L,Σ, 〈Φ17, Φ34, Φ5, Φ1, φ〉〉b. 〈L,Σ, 〈M17, M34, M5, M1, φ〉〉

As foreshadowed earlier, the Separation Hypothesis falls short of providing a fullaccount of identity of exponence in Kayardild. Formally, I had used operator constantsΦi within rules of exponence in order to refer to independently listed phonologicaloperations. The limitation of that approach is that by definition, a phonological oper-ator constant Φi always stands in for one and the same phonological operation, suchas suffixing /+ki/ or applying ablaut.This enforces a one-to-onemapping between theconstant as an element in a representation like (20a), and the operation itself. However,the units Mj in the morphomic representation of a Kayardild word, such as (20b), donot always map onto phonological operations in a one-to-one fashion. Consider onceagain the realizations in (5) and (6), repeated here in (21) and (22).

(21) a. case:locative → /+ki/b. tama:present → /+ki/c. tama:instantiated → /+ki/d. comp:[+] → /+ki/e. tamt:immediate → /+ki/

11 One observation is that at least some morphomes appear to have a reality for speakers, which leadsthem to being retained over time (Maiden ). Elsewhere (Round, forthc.) I note that the structureswhich are now morphomic in Kayardild appear to have endured steadfastly, even as the contentful side ofthe morphological system has undergone extensive upheaval.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

(22) a. case:oblique → /-inta/b. tama:emotive → /-inta/c. tama:continuous → /-inta/d. sej:[+] → /-inta/e. tamt:hortative → /-inta/

According to the formalism using morphomic Mj units, the features in (21) and (22)are realized respectively by rules of exponence such as (23a), which makes referenceto M1, and (23b), and which makes reference to M2. In itself this differs little fromrules employing Φi constants. However, consider what happens when a word inflectssimultaneously for features in both (21) and (22).

(23) a. Infl-RExpcase:locative 〈L,Σ,φi〉 = def 〈L,Σ, 〈M1, φi〉〉b. Infl-RExptama:emotive 〈L,Σ,φi〉 = def 〈L,Σ, 〈M2, φi〉〉

If a word inflects simultaneously for any of the feature-values in (21) plus any ofthe feature-values in (22), then the exponent for the two is /+kurka/, a cumulative(or portmanteau) suffix. Examples of this are shown in (24)–(26) for the lexemesbalung ‘western’, narra ‘knife’ and kurda ‘coolamon’. The four lines of glossing,from top to bottom are: orthographic; underlying phonological;12 morphomic; andmorphosyntactic. Each example illustrates a different pair of feature-values, one ofwhich is realized as M1, and one as M2.

(24) a. balungkiya b. balunginja c. balungkurrkapaluŋ+ki-a paluŋ-inta paluŋ+kurkabalung,Σ,〈M1,paluŋ〉 balung,Σ,〈M2, paluŋ〉 balung,Σ,〈M2,M1,paluŋ〉Σ = tama:present Σ = sej:[+] Σ = tama:present

&sej:[+](25) a. narraya b. narrantha c. narrawurrka

≈ara+ki-a ≈ara-inta ≈ara+kurkanarra,Σ,〈M1,≈ara〉 narra,Σ,〈M2,≈ara〉 narra,Σ,〈M2,M1,ŋara〉Σ = case:locative Σ = tama:emotive Σ = case:locative

&tama:emotive

(26) a. kurdaya b. kurdantha c. kurdawurrkaku3a+ki-a ku3a-inta ku3a+kurkakurda,Σ,〈M1,ku3a〉 kurda,Σ,〈M2,ku3a〉 kurda,Σ,〈M2,M1,ku3a〉Σ = tama:instantiated Σ = tama:continuous Σ = tama:instantiated

&tama:continuous

This situation can be accounted for by positing the set of mappings shown in (27),from morphomic units to phonological operations. The analysis then is that all of the

12 The /-a/which appears after /+ki/ in someof the phonological forms is the ‘termination’, ameaninglessmorph which has word-final phonological content in certain contexts; see Round (: –).

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

features in (21) map to M1, and all of the features in (22) map to M2. The morphomicunit M1 usually maps to /+ki/ and M2 usually to /inta/, but the string M2, M1 mapsto /+kurka/.13(27) a. M1 → Φ1 ; Φ1(φ) = φ+/ki/

b. M2 → Φ2 ; Φ2(φ) = φ-/inta/c. M2,M1 → Φ6 ; Φ6(φ) = φ+/kurka/

The success and simplicity of this analysis hinge on the ability of units like M1 andM2to map in a non-one-to-one fashion onto phonological operations. This permits theirmappings to be sensitive to their context within the morphomic representation, andwith that, their mappings are powerful enough to express the subtle relationships ofidentity present in the Kayardild data.More abstractly, rules and representations built on Φi constants are sufficient

for expressing identities of exponence of feature structure (including identitiesof exponence that transcend stem classes, identities of exponence that cross theinflection–derivation divide, and partial identities of exponence), but only rules andrepresentations built onMj units are sufficient for expressing identities of patternsof exponence such as those shared by all of the feature-values in (21) or all of those in(22). All of the inflectional features in (21) share not just an exponent, but a commonpattern of exponence: /+ki/ by default and cumulative /+kurka/ in the environ-ment ofM2. If a formal system is to have the capacity to express identities of patterns ofexponence, it requires rules of exponence that refer to morphomic units (Mj), whichare free to map onto phonological operations in a non-one-to-one manner.Let us now take this further. The morphomic units Mj are abstract; they are

isomorphic neither with inflectional features nor with phonological operations; andthey figure in a linguistically significant representation which is subject to its own,internally defined well formedness constraints. In these respects, the morphomicunits Mj are rather similar to abstract units posited by linguists in other domains ofgrammar. One might ask, therefore, do they exhibit other properties characteristic ofabstract units, such as decomposing into distinctive features? In Kayardild, they do.In the general case, the realization in Kayardild of two morphomic units, Mj and

Mk, can differ from one another in three, independently variable dimensions. In orderto express, in an independently varying manner, whether Mj and Mk are identical toanother or not, on each of these three dimensions, I will decompose our morphomicunits, M, into three distinctive features, with one feature corresponding to each of thethree dimensions of potential difference.The first dimension, which I refer to in Round (2013) as the ‘primary’ morphome,

is the set of phonological strings (i.e. suppletive allomorphs) from which a realization

13 As it happens, the order will always be M2,M1, not M1,M2, because M2 is a unit which must appearleftmost in the list (as mentioned previously).

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

must be drawn. For example, all morphomic units M whose primary morphome is‘μcons’ will be realized by one of the two underlying phonological strings in the set{/ŋarpa/, /ŋara/}; all units whose primary morphome is ‘μprop’ will be realized byone of the two underlying phonological strings in {/ku˙u/, /kuu/}; and all units whoseprimary morphome is ‘μpriv’ will be realized by the same underlying phonologicalstring from the singleton set {/wari/}.The second dimension is the kind of phonologi-cal juncture that precedes the phonological string—Kayardild has two major kinds ofphonological juncture and in the general case a suffix can be preceded by either.Thus,some morphomes whose primary feature is μcons are realized with the first kind ofjuncture before the phonological string, while others are realized with the second;the same is true of morphomes whose primary feature is μpriv. The third dimensionis whether a full set of allomorphs is made available for realization, or whether onlya single, ‘strong’ allomorph is available—Kayardild allomorph sets such as {/ŋarpa/,/ŋara/} and {/ku˙u/, /kuu/}, which contain two members, always contain one ‘strong’form (listed first) and one ‘weak’. Thus, some morphomes whose primary feature isμcons will have realizations in which both {/ŋarpa/, /ŋara/} are potential choices,while others will have realizations in which only the strong form, /ŋarpa/, is possible.The various morphomic units Mj , Mk, Ml . . . in Kayardild make good use of thesesubtle dimensions of variation. As an example wemay take the threemorphosyntacticfeature-values listed in(28), which are realized by three distinct morphomic units.

(28) a. tamt:past →⎡⎣prim : μconsjunc : [+]allo : [+]

⎤⎦ → { /+ŋarpa/, /+ŋara/ }

b. tamt:precondition →⎡⎣prim : μconsjunc : [+]allo : [−]

⎤⎦ → /+ŋarpa/

c case:consequential→⎡⎣prim : μconsjunc : [−]allo : [−]

⎤⎦ → /-ŋarpa/

The morphomic units are displayed in the central column of (28), expressed asvectors of distinctive features. All three units in (28) share the same value, μcons,for the primary morphome feature (prim). This captures the fact that all three willbe realizable as either one or both of the underlying phonological strings in the set{/ŋarpa/, /ŋara/}. Turning to individual forms, in (28a) the allomorphy feature (allo)is set to [+], meaning that the morphomic unit may be realized as either /ŋarpa/or /ŋara/. When the grammar is given an option such as this, the choice for oneallomorph or the other is decided by several factors, including whether the registeris spoken Kayardild or song. So for example, in (29) the tamt:past inflection of thelexeme kurrij ‘see’ is realized using the weak allomorph /ŋara/ in speech, but thestrong allomorph /ŋarpa/ in song.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

(29) Spoken kurrijarra Song kurrijarrbakuric+ŋara kuric+ŋarpa

∑ = TAMT:PAST ∑ = TAMT:PAST

kurrij,∑,⟨ ,kuric⟩prim: μcons

junc:[+]allo:[+]

kurrij,∑,⟨ ,kuric⟩prim: μcons

junc:[+]allo:[+]

Now consider (28b), whose allomorphy feature is set to [–]. This means that themorphomic unit can only be realized by the strong allomorph /ŋarpa/. Thus, evenin the spoken register, we get the strong allomorph /ŋarpa/ when the verb kurrij isinflected for tamt:precondition, as in (30).

(30) kurrijarrba warnginyarrbakuric+ŋarpa waɻŋic-ŋarpa

∑ = tamt: precondition ∑ = case:consequential

kurrij,∑,⟨ ,kuric ⟩prim: μcons

junc:[+]allo:[–]

warngij,∑,⟨ ,waɻŋic ⟩prim: μcons

junc:[–]allo:[–]

(31)

At this point, we can also note that in (28a,b) and (29)–(30), the juncture feature (junc)is set to [+].This ensures that at the juncture between the stem and suffixmorphs, theunderlying string /c+ŋ/ surfaces as the plosive [c], written orthographically as j. Thisis in contrast to (28c), which is illustrated in (31) using the lexeme warngij ‘one’, andwhose juncture feature is [–].Thedifferent juncture feature ensures that at the juncturebetween the underlying morphs, the underlying string /c-ŋ/ surfaces as the nasal[«], written orthographically as ny. Thus, even though (30) and (31) share the sameunderlying suffix morph /ŋarpa/ following the same underlying, stem-final plosive/c/, the contrast in phonological juncture leads to different phonological outcomes.To summarize: morphomic units Mi are realized as phonological operators, Φn.

Moreover, the realizations in Kayardild of any two morphomic units, Mj and Mk, candiffer from one another in up to three systematic yet independently varying dimen-sions.Within a morphomic representation, those three dimensions can be formalizedin terms of three distinctive features. The first morphomic distinctive feature is theprimary morphome feature prim. Morphomic units which share their value of primwill have exponents selected from the same set of underlying phonological-stringallomorphs.The second and third distinctive features are the allomorphy and juncturefeatures, which modulate whether one or two allomorphs are made available for use,and which phonological juncture appears to the left of the phonological suffix. Thesethree features represent three fine-grained dimensions along which the exponents ofmorphomic units in Kayardild will be identical or not.In 3.3–3.5, I have shown that identities of exponence within the Kayardild inflec-

tional system are common and diverse. There are complete identities of exponence

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

(5)–(6), including those which transcend stem classes and those which span theinflection–derivation divide; there are identities between segmentable parts of expo-nents (14)–(16); and there are identities of pattern of exponence across differentcontexts (24)–(26). Finally, there are identities in orthogonal dimensions of expo-nence (28)–(31). This diversity of phenomena defies an account in terms of simplerules of referral or one based on the Separation Hypothesis alone. Evidently ourtheory requires a more powerful formalism if it is to express the full breadth of(partial) identities of (aspects of) exponence exhibited by natural languages. To thatend, a formalism was adduced based on morphomic units Mj . All of the variationson identity of exponence in Kayardild submit to a simple and parsimonious accountif it is assumed that realizational rules make reference to morphomic units Mj , whichare decomposable into distinctive features, and which can map in a non-one-to-onemanner onto phonological operations.Having established thatmorphomic unitsMj have a linguistic existence, and having

shown that they cannot in the general case be reduced to simple rules of referral or tophonological operators, in the next section I consider how these morphomic units Mjrelate to other morphomic categories.

. Meromorphomes and metamorphomes

The categories/units Mj are morphological entities which are isomorphic neither withinflectional features nor with phonological operations, that is, they are morphomic inthe classic sense. On the other hand, they are not categories that divide lexemes intoclasses, so they are not rhizomorphomes, rather they are categories that organize theoperations by which individual word forms are composed, piece by piece: they aremeromorphomes. In this section I turn to a third species of morphome and discusshow it relates to meromorphomes and to rhizomorphomes.In 3.3–3.5 I was concerned with matters of exponence, and with the mapping

from inflectional features to phonological exponents. Dealing with that concern inKayardild leads to the positing ofmeromorphomes, categorieswhichmediate betweenmorphosyntactic feature structures and the phonological operations by which indi-vidual pieces of individual word forms are composed. Shifting focus away from themechanics of exponence, we may also consider its end result, that is, the sets ofword forms in a paradigm and how those word forms relate (or do not relate) to agiven meromorphome. For example, given the entire inflectional and/or derivationalparadigm of a root, we might ask which cells contain words whose surface formconsists in part of the realization of a certain meromorphome.14 In that case, weare concerned with the distribution of a certain meromorphome, or equally, withthe portion of a paradigm characterized by the incidence of realizations of that

14 Put another way, these are the set of words in whose morphomic representations the meromor-phomes appear.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

meromorphome. For example, the celebrated third stem of Latin (Aronoff 1994),viewed as a piece of the words in which it appears, could be considered the realizationof a single meromorphome, insofar as that meromorphome is anisomorphic with anynatural class of morphosyntactic features, and has a distinctive pattern of identity ofexponence as one surveys the inflectional and derivational paradigm of a root. Thethird stem meromorphome also has a certain distribution within the full paradigmof a root, or equally, there is a portion of the paradigm that is characterized bythe incidence of its realizations. Paradigm-internal distributions such as these aremorphomic categories in themselves (Maiden 2005), and are sometimes referred toas ‘morphomes’ in the literature. Yet they are not meromorphomes, nor are theyrhizomorphomes. A suitable term for them is a metamorphome, since they inherein a pattern to be found across multiple word forms, distributed within a paradigm.

. Commonalities in morphomic complexity

To tie back more explicitly to our theme of complexity I would like to accentuate acommon thread in the nature of complexity in all three species of morphome. In 3.5we saw that Kayardild’s meromorphomes are decomposable into distinctive features.It is well established that rhizomorphomes such as inflection classes also fall intosubclasses, expressible for instance in terms of inheritance relationships (Corbett andFraser 1993). Hierarchical organization in metamorphomes, i.e. in the distributionsof realizations of meromorphomes across a paradigm has been discussed by Maidenand O’Neill (2010) and O’Neill (2011). To the extent that these observations reflect ageneral tendency in the structure of morphomic categories, two conclusions can bedrawn regarding the nature of complexity at the morphomic level.First, not only do morphomic categories exist and play a significant role in the

morphological systems of some languages, but in the general case those categoriesare divisible into subcategories. In one sense, this is extreme complexity. If Anderson(this volume) is correct to claim that the very existence of morphology is a form ofcomplexity, then the existence of purely morphological categories, anisomorphic withother categories in the grammar (i.e. morphomes), is a second layer of complexitywhich is parasitic on the first, and the existence of subcategories among them is athird layer parasitic upon the second.Second however, to the extent that all three species of morphome are divisible

into subcategories, their structure appears much like that of any other grammaticalcategory once subjected to a formal analysis. Thus, while the existence of morphomesand their subcategories may be an instantiation of complexity, their structure, andindeed their organization into an autonomous level of representation, appears to beanything but complex, in the sense that it adds little if anything to the range of typesof architectures (as opposed to the number of domains) attested among grammaticalsubsystems. Morphomic categories subdivide just as other grammatical categories do,

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

and they map onto foreign categories anisomorphically, just as other grammaticalcategories do. Morphomes’ existence may be a matter of complexity, but their formalstructures are of a kind which is nothing if not familiar.

. Conclusion

I have proposed that our theory recognize three species of morphome. Rhizomor-phomes are morphomic categories pertaining to sets of roots.They divide the lexiconinto classes such as conjugations and declensions. Meromorphomes are categoriespertaining to sets of word formation operations, which derive the pieces of individualword forms. Meromorphomes can underlie both complete and partial identities ofexponence, identities of pattern of exponence, and identity of dimensions of expo-nence. Metamorphomes are distributions across a paradigm, of cells which containpieces of exponence that are realizations of meromorphomic categories. Morphomicpatterns such as the L, U, and N morphomes of Romance are metamorphomes.15 Allthree species of morphome are divisible into subcategories, though I have suggestedthat this is unremarkable given that morphomes are categories in a grammaticalsubsystem.In the main part of the chapter, I argued that identity of exponence extends well

beyond the expressive capability of simple rules of referral. A parsimonious analysisof sufficient power is obtained by permitting rules of exponence to make reference tomeromorphomes, which can thenmap to phonological operators in a non-one-to-onefashion.

. Appendix

Table 3.4 lists the seven inflectional features of Round’s (2013) analysis of Kayardild,and for each feature, its possible specified values, and the morphomic exponent(s) ofthose values, cited in terms of primary morphome features (allomorphy and juncturefeatures are not shown). The symbol ‘∗’ indicates that the feature-value shares itsmorphomic exponent exactly with at least one other; ‘†’ indicates that it shares atleast part of its morphomic exponence with at least one other; and ‘‡’ indicates that itsmorphomic exponence is also employed derivationally. Table 3.5 lists the phonologicalexponents for primary morphomes in Table 3.4

15 The ‘form cells’ of the inflectional theory proposed by Stump (, , inter alia) are alsometamorphomes. The analysis of Kayardild adduced here shares several properties with Stump’s theory,and further investigation into their similarities is warranted.

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Rhizomorphomes, meromorphomes, and metamorphomes

Table . Inflectional features and their morphomic exponents

Feature Values and morphomic exponents

comp: [+]→ μobl∗sej: [+]→ μloc∗†‡neg: [+]→ μneg∗‡tama: antecedent→ μcons∗†‡, continuous→ μobl∗, directed→ μloc-μall∗†‡,

emotive→ μobl∗, functional→ μutil∗, future→ μprop∗†‡, incipient→ μdat-μmid-j†, instantiated→ μloc∗†‡, negatory→ μpriv∗†‡,precondition→ μloc-μabl†‡, present→ μloc∗†‡, prior→ μloc-μabl∗†‡

tamt: actual→ μpriv†‡a, antecedent→ μn-μcons†, apprehensive→ μappr,desiderative→ μdes, directed→ μloc-μall∗†‡, hortative→ μobl∗,immediate→ μloc∗†‡, imperative→ μneg∗‡b, incipient→ μn-μdat-μmid-j†,past→ μcons†‡, potential→ μprop∗†‡, precondition→ μcons†‡,progressive→ μn∗†‡, resultative→ μres‡, nonveridical→ μn-μpriv†

case: ablative→ μloc-μabl∗†‡, allative→ μloc-μall∗†‡, associative→ μassoc‡,collative→ μlloc-μinch-th†, consequential→ μcons∗†‡, dative→ μdat-th†‡, denizen→ μden-j-μn†, donative→ μdon-j†‡, genitive→ μgen‡, human allative→ μallh-j†, instrumental→ μinst, locative→ μloc∗†‡, objective ablative→ μablo-th†, objective evitative→ μevito-th†, oblique→ μobl∗†‡, origin→ μorig, privative→ μpriv∗†‡,proprietive→ μprop∗†‡, purposive→ μallh-μmid-j†‡, subjective ablative→ μablo-μmid-j†, subjective evitative→ μevito-μmid-j†, translative→ μdat-μmid-j†, utilitive→ μutil∗

num: dual→ μdu‡, plural→ μpl

a Cumulative exponence of tamt:actual & neg:[+]b Cumulative exponence of tamt:imperative & neg:[+]

Table . Phonological exponents of morphomes

μabl→ {napa, naa} μgen→ kara≈ μobl→ intaμall→ {˙iŋ, ˙uŋ} μallh→ cani μevito→ wa‰luμappr→ «ara μinch→ wa μorig→ wa‰«μassoc→ ≈uru μinst→ ŋuni μpl→ palattμcons→ {ŋarpa, ŋara} μlloc→ ki‰ μpriv→ wariμcons→ ŋarpa μloc→ ki μprop→ {ku˙u, kuu}μdat→ma˙u μmid→ i μres→ iri«μden→ wi3i μn→ n μutil→maraμdes→ ta μneg→ ≈aŋ j→ cμdon→ wu μablo→ wula th→ tμdu→ kiarŋ μloc&μobl→ kurka

OUP CORRECTED PROOF–FINAL, 9/3/2015, SPi

Erich R. Round

Acknowledgements

For the opportunity to present and discuss this research in such a collegial andstimulating forum, I would like to express my gratitude to the organizers and theparticipants of the Conference on Morphological Complexity, held in London, Jan-uary 2012. For invaluable comments and discussion during the development of theseideas, my thanks to Steve Anderson, Mark Aronoff, Matthew Baerman, DunstanBrown, Grev Corbett, Nick Evans, Martin Maiden, Paul O’Neill, Andy Spencer, GregStump and two anonymous OUP reviewers. Institutionally, I wish to acknowledge thesupport of theNSF (grant BCS 844550), Australian Research Council (grant ‘Isolation,Insularity and Change in Island Populations’), the Endangered Languages Project(grants IGS0039 and FTG0025), and the School of Languages and Cultures at theUniversity ofQueensland.All shortcomings, errors or oversights in the chapter remainmy own.

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rulesof referral in Kanum verbs

MARK DONOHUE

. Complexity

In recent years the term ‘complexity’ has been used in linguistics publications invarious ways. At least four senses of ‘complexity’ can be discerned:

1. Size. A language (/subsystem of a language) is said to be complex if it has a lot ofmembers. In this sense a complex pronominal system would include a large numberof pronouns; a complex phoneme system would have a lot of phonemes; complexverbal inflection would have many inflectional possibilities (see also section 2.1.1 ofAnderson’s chapter, this volume).

2. Dimensions. A language (/subsystem of a language) is said to be complex if thedescription of its component parts requires a lot of variables. In this sense a complexpronominal system would differentiate multiple numbers, genders, and persons; acomplex phoneme system would require many features to describe; complex verbalinflection would have many inflectional possibilities for a large range of differentgrammatical categories.

3. Rarity. A language (/subsystem of a language) is said to be complex if it includeselements that are cross-linguistically infrequent. In this sense a complex pronominalsystem would include categories that are only occasionally found (such as pronounsthat include reference to generational differences between speaker and addressee, asin various languages of Australia); a complex phoneme system would include unusualphonemes, such as velar laterals or linguolabial fricatives; complex verbal inflectionwould include categories rarely found on verbs, such as night-time status of the event,or heaviness of the object (both attested in Berik).

4. Transparency. A language (/subsystem of a language) is said to be complex ifthere is not a direct mapping from the features present and the expression of thosefeatures. In this sense a complex pronominal system might include irregular corre-

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

spondences of singular and plural pronouns (such as Meryam Mer, in which the firstperson plural inclusive is expressed with the morpheme for second person to which aplural suffix is added); a complex phoneme systemmight require the features [±voice],[±labial], [±coronal], and [±dorsal], but only include the phonemes /p t d k/ (such asFinnish, or Bobot and Emplawas from eastern Indonesia); a complex system of verbalinflection might include suppletive, or portmanteau forms, or irregularly use the‘wrong’morphological forms to express features in some contexts (such as using pluralforms to indicate a passive voice, as in Lingala). (See also section 2.2 of Anderson’schapter.)

These different senses are, at least theoretically, not all independent variables; themore dimensions in a system, the greater the potential size, and the greater thechance that there is non-transparent mapping between potential categories and actualcategories. Similarly, the nature of rarity means that it is more likely to emerge as asmall part of an otherwise ‘well-behaved’ (from a cross-linguistic perspective) system,since rarity is not an absolute, but rather something embedded in commonality.Nonetheless, these different senses have all been used when discussing ‘complexity’.I shall concentrate on some aspects of the last sense, transparency, in the discussionthat follows, while making reference to the other senses as appropriate. The mainmedium for the discussion is verbal inflection in Kanum, a language of southernNew Guinea. In Kanum, which shows verbal inflection for subjects, objects, andtense, we see elaborate systems of oppositions created with a relatively small set ofdistinct morphemes. These morphemes have regular distributions to realize differentinflectional categories, but we also find many instances in which elements of theparadigmaremarked by referral fromother inflectional cells (see discussion inZwicky1985, Stump 2001, Baerman 2004). I will exemplify some of these cases of referral, andshow that there are patterns underlying this irregular behaviour.

. Kanum verbal inflection:

Kanum is a language of southern New Guinea just north-west of the Torres Strait(Boelaars 1950, Drabbe 1947, 1950); the variety described here is known to its speakersas Ngkaolmpw Ngkaontr Knwme. The language has extensive agreement for bothsubject and object on the verb, as well as tense, and has an extensive case-markingsystem, and shows elements of non-configurationality (Donohue 2011).

Verbs agree with the number (and person, if plural) of their subject by suffix,and with person, number, and gender of their object (if bivalent) by prefix; tenseinformation is distributed about the verb.The simplest version of this schema is shownin (1), and is illustrated in the sentences in (2) and (3).1 Comparing the two examples, it

1 Kanum examples are presented in an orthography that follows IPA conventions, with the exceptionthat /ŋ/ is represented by <ng>, /j/ by <y>, /æ/ by <ä> and /'/ by <â>. Many clusters are broken up

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rules of referral in Kanum verbs

is not hard to identify the prefix kn- and the suffix -y in (1) and (2) attached to the rootseyerk and ew. Examining (4), it is clear that the prefix kn- marks the second person(singular) object; comparing with (5) we see that the suffix -y marks the first personplural subject.

Basic verbal agreement

(1) OBJECT-verb.root-SUBJECT

(2) (Nynta1pl.erg

mpw)2abs

kwneyerky.we:snuck.up.on:you:yesterday

‘We snuck up on you (yesterday).’

(3) (Nynta1pl.erg

mpw)2abs

kwnamply.we:laughed.at:you:yesterday

‘We laughed at you (yesterday).’

(4) (Nynta1pl.erg

py)3abs

swamply.we:laughed.at:them:yesterday

‘We laughed at them (yesterday).’

(5) (Pynta3pl.erg

mpw)2abs

kwnample.they:laughed.at:you:yesterday

‘They laughed at you (yesterday).’

Neither the prefixes nor the suffixes in (2)–(4) contain only pronominal informa-tion; they also convey tense categories. The full inflectional paradigm for ampl ‘laughat’, showing subject, object, and tense inflections by prefix and suffix, is shown inTable 4.1; forms that we have already seen in (2)–(4) are shown in bold. From theoutset we note that there is no difference in inflection between 2pl and 3pl objects,in fact, these forms are the same as the 3sg.m forms; similarly, the 1pl forms areidentical to the 2sg forms. Given that we know that these forms are distinguishedin the pronominal system of Kanum, as evidenced in the free pronouns shown inTable 4.2 and by the contrasts made in the subject suffixes, we must assume thatthese categories are subject to rules of referral, shown in (6), which assigns the valuesgiven for the 2sg object to the 1pl cell (a pattern that is well attested in New Guinea),and another that assigns the 3sg.m form to the 3pl and 2pl cells. This means thatthe s-/y- object prefixes are better thought of as unspecified for person, number, orgender.They are blocked from appearing with first person, feminine, or 2sg referencebecause of the existence of more highly specified morphemes that do not incur a classof features.

by epenthesis or the syllabification of a liquid or glide. As an example sentence () can be pronounced as[nınmta‰ mpFw kF mnejerkıj].

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

Table . Inflection for ampl ‘laugh at’ with different subjects, objects, and tenses

object

subject 1sg 2sg 3sg.m 3sg.f 1pl 2pl, 3pl

sg F b-√-nt nt-

√-nt sr-

√-nt ta-

√-nt nt-

√-nt sr-

√-nt

P w-√

n-√

y-√

a-√

n-√

y-√

T w-√-y n-

√-y y-

√-y a-

√-y n-

√-y y-

√-y

Y kww-√

kwn-√

sw-√

kwa-√

kwn-√

sw-√

R w-√-w n-

√-w y-

√-w a-

√-w n-

√-w y-

√-w

1pl F br-√-ntey nt-

√-ntey sr-

√-ntey ta-

√-ntey nt-

√-ntey sr-

√-ntey

P w-√-y n-

√-y y-

√-y a-

√-y n-

√-y y-

√-y

T w-√-ns n-

√-ns y-

√-ns a-

√-ns n-

√-ns y-

√-ns

Y kww-√-y kwn-

√-y sw-

√-y kwa-

√-y kwn-

√-y sw-

√-y

R w-√-ay n-

√-ay y-

√-ay a-

√-ay n-

√-ay y-

√-ay

2pl F br-√-ntey nt-

√-ntey sr-

√-ntey ta-

√-ntey nt-

√-ntey sr-

√-ntey

P w-√-e n-

√-e y-

√-e a-

√-e n-

√-e y-

√-e

T w-√-ns n-

√-ns y-

√-ns a-

√-ns n-

√-ns y-

√-ns

Y kww-√-e kwn-

√-e sw-

√-e kwa-

√-e kwn-

√-e sw-

√-e

R w-√-ay n-

√-ay y-

√-ay a-

√-ay n-

√-ay y-

√-ay

3pl F br-√-nteme nt-

√-nteme sr-

√-nteme ta-

√-nteme nt-

√-nteme sr-

√-nteme

P w-√-e n-

√-e y-

√-e a-

√-e n-

√-e y-

√-e

T w-√-ns n-

√-ns y-

√-ns a-

√-ns n-

√-ns y-

√-ns

Y kww-√-e kwn-

√-e sw-

√-e kwa-

√-e kwn-

√-e sw-

√-e

R w-√-ay n-

√-ay y-

√-ay a-

√-ay n-

√-ay y-

√-ay

F: future; P: present; T: today’s past; Y: yesterday’s past; R: remote past;√: verb root.

Table . Free pronouns in Kanum, absolutive, andergative forms

Absolutive Ergative

sg pl sg pl

1 ngkâ ny 1 ngkay nynta2 mpw mpw 2 mpay mpwnta3 py py 3 pyengkw pynta

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rules of referral in Kanum verbs

(6) Object prefixes in the present (from Table 4.1)1sg object = w + stem2sg object = n + stem3sg.m object = y + stem3sg.f object = a + stem1pl object = n + stem2pl object = y + stem3pl object = y + stem

Since the focus of this chapter is the syncretisms in the agreement morphology, wecan unproblematically segment off the tensemorphemes that are not also specified forfeatures of the subject, and assign approximate features to the other affixes, in (8). Notethat there are two suffixes of the form -y, and two of the form -e; Kanummorphologyadmits high levels of syncretism, as is also clear from the absolutive pronominal formsin Table 4.2 (see Baerman 2004, Baerman et al. 2005 for a more extended discussionof kinds of syncretisms and their modelling). The template for the verb is shown inTable 4.3, which shows which feature types are coded in which positions, and wherewe can see that the syncretic verbal suffixes occupy different positions in the verbtemplate. Since some of the verb roots are suppletive for number of object, or fortense, the verb root too is marked as potentially showing features of the sort found inthe inflectional affixes. Of the three inflectional categories, tense, subject, and object,very few affixes are constrained to mark features of only one category, a fact that willbecome relevant later in the exposition.

(7)Tense affixes not portmanteau with subject features (from Table 4.1)k- YESTERDAY’S PAST Not generic objectsw- YESTERDAY’S PASTr- FUTURE Not 2sg/1pl, or feminine objects-nt FUTURE

Table . Kanum verbal inflection

−4 −3 −2 −1√ +1 +2 +3

Object + + + (+)Subject + + +Tense + + (+) + + +Number + + + (+) + +Person + + + +

k- w- b-/w- r- -nt -e1 -mes-/y nt-/n- -y1 -a -y2

ta-/a- -ns -e2

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

(8)Affixes primarily marking agreement (from Table 4.1)b-/w- 1sg objectnt-/n- 2sg/1pl objectta-/a- 3sg.f objects-/y- object (= 3sg.m, 2pl, 3pl)-y1 sg subject (Today’s past)-w sg subject (Remote past)-e1 pl subject (Future)-y2 pl subject-e2 2pl/3pl subject (Present, Yesterday’s past)-me 3pl subject (Future)-ns pl subject (Today’s past)-a pl subject (Remote past)

Based on the preceding discussion, we can trivially predict the forms in (9),expanding from (2) and (3). When we examine forms with a 2pl object, however, wefind a problem in the regular productivity of the paradigms. The non-future formsare as predicted for both verbs. For ‘laugh at’ the rest of the paradigm follows Table4.1 (unsurprising, since Table 4.1 is based on the regular paradigm found with theverb ampl ‘laugh at’). For eyerk, however, the future member of the paradigm shows atakeover of the 2pl by the forms used for 2sg and 1pl object, and subsequent referral ofthese forms to the 3pl as well (shown in (13)).There are three points of note associatedwith this change in the paradigm. First, the takeover is restricted to the future; thesyncretism between 3sg.m, 2pl, and 3pl still holds in the non-future tenses. Second,the takeover affects only part of the syncretic {3sg.m, 2pl, 3pl} set. In (11) we can seethat a 3pl object does not have the same paradigm as a 2pl object (even excludingthe eligibility for the r- tense prefix): a rule that affects the 2pl need not necessarilyaffect the 3pl forms. Although the new referral of the 2sg/1pl forms to the 2pl cell hasdisrupted the referral of the 3sg.m to the 2pl, it has not disrupted the 3sg.m → 3plreferral, and does not affect the forms seen in (12) for a 3sg.m object.

(9) ‘We ___ you.’ ‘sneak up on’ ‘laugh at’FUTURE nt-eyerk-nt-e-y s-r-ampl-nt-e-yPRESENT n-eyerk-y n-ampl-yTODAY’S PAST n-eyerk-ns n-ampl-nsYESTERDAY’S PAST k-w-n-eyerk-y k-w-n-ampl-yREMOTE PAST n-eyerk-a-y n-ampl-a-y

(10) ‘We ___ you.pl.’ ‘sneak up on’ ‘laugh at’FUTURE nt-eyerk-nt-e-y s-ampl-nt-e-yPRESENT y-eyerk-y y-ampl-y

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rules of referral in Kanum verbs

TODAY’S PAST y-eyerk-ns y-ampl-nsYESTERDAY’S PAST s-w-eyerk-y s-w-ampl-yREMOTE PAST y-eyerk-a-y y-ampl-a-y

(11) ‘We ___ them.’ ‘sneak up on’ ‘laugh at’FUTURE s-r-eyerk-nt-e-y s-r-ampl-nt-e-yPRESENT y-eyerk-y y-ampl-yTODAY’S PAST y-eyerk-ns y-ampl-nsYESTERDAY’S PAST s-w-eyerk-y s-w-ampl-yREMOTE PAST y-eyerk-a-y y-ampl-a-y

(12) ‘We ___ him.’ ‘sneak up on’ ‘laugh at’FUTURE s-r-eyerk-nt-e-y s-r-ampl-nt-e-yPRESENT y-eyerk-y y-ampl-yTODAY’S PAST y-eyerk-ns y-ampl-nsYESTERDAY’S PAST s-w-eyerk-y s-w-ampl-yREMOTE PAST y-eyerk-a-y y-ampl-a-y

Additional referral for object prefixes in the future seen in (10), for eyerk ‘sneakup on’

(13) 1sg object = w + stem2sg object = n + stem3sg.m object = y + stem3sg.f object = a + stem1pl object = n + stem2pl object = n + stem3pl object = y + stem

. Further kinds of rules of referral in Kanum verbs

The two verbs diverge in other parts of their paradigms as well. With a singularsubject we see that the 2sg object prefix in the future is not nt-, as predicted, but s-n-,combining the generic object with the appropriate 2sg/1pl object prefix. With a thirdperson feminine object we see the entire paradigm in all tenses has been taken overby the 1sg forms. These complications are not found with ampl.

(14) ‘I ___ you.’ ‘sneak up on’ ‘laugh at’FUTURE s-n-eyerk-nt nt-ampl-ntPRESENT n-eyerk n-amplTODAY’S PAST n-eyerk-y n-ampl-yYESTERDAY’S PAST k-w-n-eyerk k-w-n-amplREMOTE PAST n-eyerk-w n-ampl-w

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

(15) ‘I ___ her.’ ‘sneak up on’ ‘laugh at’FUTURE b-eyerk-nt t(a)-ampl-ntPRESENT w-eyerk (a)-amplTODAY’S PAST w-eyerk-y (a)-ampl-yYESTERDAY’S PAST k-w-eyerk k-w-a-amplREMOTE PAST w-eyerk-w (a)-ampl-w

Additional referrals seen in (14) and (15) for eyerk

(16) generic object → 2sg object / future1sg object → feminine object

Our analysis becomes more complex when we consider data from different verbs.With wr ‘bite’ we find that the 1sg object forms take over the 3sg.f objects, but onlywith yesterday’s past tense. We do not find the 2sg/1pl takeover of the 2pl that wasseen in (10), but we do see a spread of the singular subject suffixes to cells with pluralsubjects, but only when the object is generic. The forms in (17) show the expectedaffixes for the first column, we>you. With the second (we>him) column we seethat either the expected plural affixes are absent, or else they have been replacedwith singular subject affixes. This is not the case for 2pl and 3pl objects, which arecompatible with plural subject suffixes. In the third column (we>her) we find theexpected plural subject suffixes, but with the 1sg object prefix in place of the feminineobject prefix. The verb ‘bite’ shows takeovers for subjects as well as objects.

(17) ‘bite’ ‘We __ you.’ ‘We __ him.’ ‘We __ her.’FUTURE nt-wr-nt-e-y s-wr-nt-Ø-Ø ta-wr-nt-e-yPRESENT n-wr-y y-wr-Ø a-wr-yTODAY’S PAST n-wr-ns y-wr-y a-wr-nsYESTERDAY’S PAST k-w-n-wr-y s-w-wr-Ø k-w-w-wr-yREMOTE PAST n-wr-a-y y-wr-w a-wr-ay

Referrals seen in (17) for wr

(18) sg subject → pl subject / 3sg.m object1sg object → feminine object / yesterday’s past

I conclude this section with data from one further, highly idiosyncratic verb, makr‘roast’. Table 4.4 shows the expected inflection for this verb, and the attested inflections.We should first note that there is a suppletive form of the verb for the present, today’spast, and yesterday’s past tenses, ekr.2 The contrast between different plurals subjectsis lost entirely, plural subject suffixes invade the singular paradigm and take over othertenses, the present and today’s past object prefixes are taken over, and the yesterday’spast prefix is lost. Combined with the suppletive verb forms there is a high degree

2 The selection of makr, rather than ekr, as the root follows from the nominalized form, makr-ay.

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rules of referral in Kanum verbs

Table . Inflection for makr ‘roast’ withdifferent subjects, objects, and tenses

subject expected attested

sg F s-(r-)√-nt s-r-makr-nt

P y-√

s-ekr-ayT y-

√-y s-ekr-ay

Y sw-√

s-Ø-ekr-ntR y-

√-w y-makr-w

1/2pl F s-√-nt-e-y s-r-makr-nt-e-y

P y-√-y s-ekr-ay

T y-√-ns s-ekr-ay

Y sw-√-y s-w-makr-y

R y-√-ay y-makr-ay

3pl F s-r-√-nt-e-me s-r-makr-nt-e-y

P y-√-e s-ekr-ay

T y-√-ns s-ekr-ay

Y sw-√-e s-w-makr-y

R y-√-ay y-makr-ay

of irregularity with this verb, but only one clear case of suppletion, the makr/ekralternation.

Referrals seen in Table 4.4 for makr

(19) future/yesterday’s past object prefixes → present, today’s pastfuture suffix → yesterday’s past / sg subjectpl subject remote past suffix → present, today’s pastfuture 1/2pl subject suffix → 3pl futureyesterday’s past 1/2pl subject suffix → 3pl yesterday’s past

The forms and analysis presented in Tables 4.1 and 4.2, and in (7) and (8), mightseem, in the light of our extended explication of the verb eyerk ‘sneak up on’, combinedwith data from wr ‘bite’ and makr ‘roast’, to be hopelessly optimistic. The followingsection will consolidate the well-attested patterns that we do find in Kanum verbs,and in section 4.5 we examine patterns of regularity in the rules of referral.

. Opacity and Kanum verbal inflection

To this point we have an agreement system that requires reference to features dis-tinguishing two degrees of number (singular, plural) and two degrees of person(local, nonlocal); this is less dimensions than are attested in other languages, thoughperhaps the features required for the person axis are somewhat unusual (invokingcomplexity-by-rarity). Since the size of the contrast set is smaller than predicted from

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

Table . Opacity and transparency in pronominal systems

(very) transparent Transparent (translucent) Opaquesg pl sg pl sg pl sg pl

local α α− γ local α β local α β local α ‚nonlocal β β − γ nonlocal γ δ nonlocal α γ nonlocal · α

Table . Tense distinctions portmanteau withagreement affixes

object subject

Future α αPresent β βToday’s past β γ

Yesterday’s past γ βRemote past β δ

the dimensions involved, however, there is a level of opacity involved. For example,Table 4.5 uses the dimensions that we have seen are relevant for a description ofKanum, and explores some of the possibilities along a cline between transparent andopaque, using the Greek letters α, β, γ, and δ to designate the contrasts that the systemmarks. In a transparent system all of the possible oppositions defined by the featuresdistinguished along the different dimensions of variation are attested. In a maximallyopaque system the two dimensions, with two degrees of variation each, are requiredto make a distinction between only two marked categories. The Kanum person andnumber system falls between these two extremes, marking three differences by notdistinguishing person in the singular.

We have seen that tense is marked differently on the portmanteau affixes that indexsubjects and objects, as shown inTable 4.6 (based on the data inTable 4.1; again, the useof α, β, etc. is only intended to represent contrasts within each of the paradigms). Onlyfor the future do the categoriesmatch; because of themismatches, five tense categoriesare distinguished in a distributed fashion. Clearly we need to investigate the markingof other types of objects in order to understand the workings of the Kanum verb inorder to understand things better.

. A wider survey of rules of referral in Kanum verbs

We can construct a set of regular agreement affixes for verbs, shown in Table 4.1;based on these forms, which show the syncretisms displayed in Table 4.7, we can plotthe irregularities found with object agreement in eyerk ‘sneak up on’ and wr ‘bite’ in

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rules of referral in Kanum verbs

Table . Pronominal distinctions in the idealizedagreement affixes

Object Subjectsg plsg pl

α (β)α (γ)α (δ)

1233f α

1 α β2 β γ3 γ γ3f δ

Table . Takeovers in the object prefixes for two verbs

wr Idealised eyerk

1sg2sg3sg3sg.f1pl2pl3pl

1sg2sg3sg3sg.f1pl2pl3pl

1sg2sg3sg3sg.f1pl2pl3pl

Table 4.8. While unpredictable, the takeover of paradigmatic cells always proceedsfrom singular forms to plurals. The complexity we see is irregular, but does appearto follow some rules: no new forms are found, and only a small subset of all possibletakeovers is attested.

(Maximally two distinctions are made in the plural forms for subjects as well asobjects. Depending on the tense, β = γ = δ, β = γ �= δ, or β �= γ = δ. Acrosstenses, α = β is attested.)

We do find most variation (from the idealized paradigm in Tables 4.13 and 4.14) inthe object prefixes, but the other inflectional categories, subject and tense, also showreferral. In the following section we shall examine the results of a survey of verb formsin Kanum, for the kinds of rules of referral they employ (see also section 5.5 in Koenigand Michelson’s chapter from this volume).

. Patterns in the rules of referral

We can examine a wider selection of verbs, and examine not just the rules of referralfound with object agreement, but also with subjects and tense. The results of thissurvey are shown in Tables 4.9–4.11, which show which verbs exemplify a particular

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

Table . Referrals in the object prefixes

1sg 2sg 3sg, 2/3pl 3sg.f 1pl 2/3pl

1sg → wr �= eyerk,rmpatwr,wr

âwmp, wr �=

2sg, 1pl → �= arwar, yntakn �= âwmp,eyerk

3sg, 2/3pl → �= e, eyerk,rsa, yntakn

�= wkâ �=

3sg.f → �= �= �= �= �=pl → �= �= �= �= �= �=

Table . Referrals in tense affixes

Future Present, Yesterday Today’s Past,Remote Past

Future → âwmp, e, erm, mak �=Present,

Yesterday’s Past,Remote Past

→ atwa, âw, âwâ, erm,lmpa, nkw

�=

Today’s past → �= ayngkâ, ey, lmpa, wâw �=

Table . Referrals in the subject suffixes

sg 1pl 2pl 3pl

sg → �= ânta, wme, wr ânta, wr1pl → �= âwmp eyr2/3pl → aprmngk ayngkâ, eyr, wâw

extension of their idealized paradigm to another cell.3 Of this sample, fully twenty-twoof the thirty verbs show some area in which the form in one paradigmatic cell extendsto another. In terms of object takeovers, the plural forms, and the feminine form, neverextend to mark other person/numbers.The 1sg object is immune from takeovers, and

3 The thirty verbs examined are: âlmyn ‘track’; ampl ‘laugh (at), have fun (with)’; ânta ‘feverish.pl.pst’;aprmngk ‘make’; arwar ‘call’; atwa ‘vomit’; âw ‘see’; âwâ ‘fetch’; âwmp ‘wash’; ayngkâ ‘fall’; e ‘tell’; erm‘shoot.sg’; ew ‘see’; ey ‘die’; eyerk ‘sneak up on’; eyr ‘sleep’; lmpa ‘be angrywith’;makr ‘roast’; nkw ‘hit linearly’;rmpatwr ‘jump (at)’; rmpwl ‘hit’; rsa ‘hit’; rwar ‘call’; wâ(w) ‘be’; wkâ ‘see’; wme ‘stay at’; wmpe ‘wash (tr.)’;wr ‘bite’; yntakn ‘trick’. Verbs that appear in more than one of Tables – are shown in bold; only one verb,âwmp ‘wash’, shows takeovers for all of subject, object, and tense inflection. A ‘�=’ in a cell indicates that noextension occurs in that cell, and shading indicates the impossibility of extension.

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rules of referral in Kanum verbs

1sg is the only form that can take over the 3sg.f role. In a similar vein, the 2sg/1plforms are the only ones that can extend to the 3sg/2pl/3pl category, in one casebreaking the uniformity of that group, and the converse is also true. In all, nine ofthe thirty verbs in the survey show paradigmatic takeovers for object agreement. Interms of tense takeovers, the Today’s Past and Remote Past are immune from othertense forms taking over their role. The future is never taken over by the Today’s Past,and the spread of a Present form to the Future is the most common form of tensetakeovers. Eleven of the thirty verbs show paradigm takeovers for tense. With thesubject suffixes eight of the thirty verbs show paradigm takeovers, though becauseof the extreme portmanteauing of subject and tense we should consider the numberof cases of takeovers for subjects to be sixteen, almost twice as many as for objects andincluding nearly all of the verbs in the survey that do show referrals. Further, there arefewer constraints on the direction of takeovers for subject suffixes than those seen forobject prefixes, or tense affixes, with the only restriction being on the non-swapabilityof 1pl and sg forms.

Given that there are constraints on the kinds of takeovers attested in Kanum, andthat the forms that occupy the cells are not only not randomly varying, but are inall cases regular members of the paradigms, it seems inelegant to simply describethe differences in inflectional paradigms in the verbs as ‘irregular’. Similarly, we arenot discussing some kind of defective or deponent verb behaviour, since all of theinflectional elements are present on the verb. What we have, rather, is a looselyand erratically constrained pattern of alternations in the realization of inflectionalcategories, and in the kinds and number of contrasts that are realized. Nonetheless,the patterns of alternations are constrained, and do not represent wholesale suppletionwithin the paradigm.While there are no new elements in the paradigm to increase thesize of the system, and no extensions of the dimensions of the inflectional paradigm(even portmanteau forms), there is a clear loss of transparency between the featuresspecified for an inflectional cell and the morphemes representing those features. Thefollowing section briefly offers some examples of a loss of transparency in terms ofhow (much) an inflectional category is realized.

. Opacity in the paradigms realized

In addition to the (partial) collapse in transparency we have seen in Kanum, due to theprevalence of rules of referral, degrees of morphological opacity can also be describedin other languages with perfectly regular inflectional paradigms as a result of whatis essentially lexically unpredictable inflectional ‘exuberance’. This can be illustratedwith the Iha examples in (20) and (21). In (20) we can see that two verbal predicatescompounded together take one agreement suffix that applies to the combined verbalcompound; it is not possible for either of the verbal roots to take separate agreementinflection, even though ‘descend’ is eligible to take the same inflectional suffixes as

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

‘fall’ when predicating. In (21) we see that when the evidential morpheme is addedit is added to an already suffixed verb, and that the evidential morpheme itself takessuffixal agreement. It is not grammatical for one of the agreement morphemes to beomitted, apparently presenting the converse of the case in (20), and so establishing adegree of complexity in the predictability of verbal inflection. The functional reasonbehind the difference is evident in (21c): the evidential suffix is capable of taking‘subject’ agreement suffixes that are referentially distinct from those used on themain verb, with a corresponding difference in the kind of evidentiality asserted.The fact that the inflected evidential morpheme alone is enough to form a (short)sentence, as in (22), indicates that the uses in (9b) and (9c) represent an only slightlymodified grammaticalization from an independent verb, making the parallels withthe compound in (20) more striking. Although there is a functional explanation forthe differences in inflection, we are still faced with the fact that two different kindsof combinations of contentful verb-like morphemes show very different inflectionalbehaviour. Since suffixed auxiliaries in Iha, such as in the example shown in (23),pattern more like the serial verb construction seen in (20) we really cannot finda purely morphological explanation for the differential behaviour of the evidentialmorpheme with respect to inflection.

(20)

Iha

a. Ih-mofruit-that

hu-hoqpow-dya.descend-fall-3tpst

‘That fruit fell down.’

b. ∗Ihmohudya-hoqpowdya

(21) a. Ki-kne-dya.2-see-3tpst‘They saw you yesterday.’

b. Ki-kne-dy(a)-e-da.2-see-3tpst-evid-3‘They saw you yesterday, they say.’

c. ∗kikneedya, *kikneeda, *kiknedyae

d. Ki-kne-dya-te-n.2-see-3tpst-evid-1sg‘They saw you yesterday, I assert.’

(22) Te-n.evid-1‘Really (I saw it with my own eyes)!’

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Morphological opacity: Rules of referral in Kanum verbs

(23) Ki-qpreg-myo-n-do-mb-on.2-follow-just-conj-aux-ypst-1‘I was just following you.’

These different kinds of opacity, the unpredictability of takeovers in Kanum,and the non-predictability of inflectional realization in Iha, shows that there is aspace between regularity and suppletion. Comparing the data on serial verbs fromPalu’e (Austronesian, Southern Indonesia) and Skou (Skou family, north-central NewGuinea; Donohue 2008) we can see that this kind of morphological unpredictabilityextends across word boundaries: in Palu’e serial verbs allow for only one agreementprefix, while in Skou each predicate in a serial verb requires an agreement prefix.This,however, is consistent on a language level, but represents a level of opacity in the designof grammatical forms.

(24) Palu’e

a. Kam-kha1pl-eat

lama-pue.rice-mung.bean

‘We ate rice mixed with mung beans.’

b. Kam-1oa1pl-descend

phalugo

laebe.at

nua.house

‘We went down to the house, . . .’

c. ∗Kamboa kampalu . . .

(25) Skou

a. Mè2sg

habag

mè=m-a.2sg=2sg-carry

‘You carried a bag.’

b. Mè2sg

habag

mè=b-é2sg=2sg-get

m-a2sg-carry

me2sg:go

m-e2sg-ascend

pá=ing a.house=def

‘You carried a bag away up into the house.’

c. ∗Mè ha mè=b-é ha re e pá=ing a.

The occurrence of multiple agreement markers is well attested; Aikhenvald (2003)presents an interesting case from Tariana, where identical morphological agreementis found on verbs which do not share the same arguments, and both Anderson(1992) and Ortmann (1999) discuss the complications found in Dargwa. Cases ofhistorical accretion of inflectional morphology, corresponding to different periods ina language’s history, are also well documented (van Driem 1987, Donohue 1999).

The difference between these examples of the unpredictable appearance of agree-ment morphology, whether multiply on the same verb, or in different places in thecomplex predicate, is that Kanum does not show any optionality in the appearance

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Mark Donohue

of agreement. Rather, the optionality is in how the relevant paradigm is represented.While some verbs are attested in which the inflectional morphology directly repre-sents the features of the arguments, or tense category, we have seen that it is quitecommon for verbs inKanum to extend the range of one cell of a paradigm into another.While there are constraints on the kind of extensions found, the appearance of suchan extension in a verb’s paradigm is not predictable.

The middle ground between productivity and suppletion

The preceding discussion has raised the questions of predictability, productivity, and complexity (see also Bauer 2004). The presence of rules of referral in any verb’s paradigm is lexically unpredictable, but is highly productive. It does not involve the use of novel or suppletive forms, but rather involves the extension (or, in some cases, transfer) of a form from one cell of a paradigm to another. In Kanum we have seen that takeovers operate most strongly for tense, but that object agreement is most strongly implicated in the more complicated (in the sense of unpredictable) verbal paradigms. Whether these patterns can be found in other languages or not remains to be seen.


Morphological complexity à la Oneida

JEAN-PIERRE KOENIG AND KARIN MICHELSON

This chapter is about one particularly rich part of the verbal inflection of Oneida, a polysynthetic Iroquoian language. Morphological referencing of event participants in Oneida is achieved via a system of fifty-eight pronominal prefixes that are an obligatory part of the inflection of verbs. The sheer number of prefixes, and the relations between them, afford us the opportunity to ask what we believe is a unique set of questions about morphological complexity.

Morphological complexity differs from syntactic complexity in significant ways. Issues of morphosyntactic complexity have been mostly approached from two perspectives: computational complexity, which results from the concatenation of components in a construction (a phrase, or a sentence), and algorithmic complexity, which is due to processing sentences in real time. Whether complexity is evaluated at the computational or algorithmic level, the kind of complexity that has interested syntacticians is syntagmatic, that is, it arises from the fact that the formatives of a sentence are linearly sequenced. In contrast, morphological complexity is paradigmatic in nature. Thus issues of morphological complexity stem, for example, from the need to select an affix (in the case of affixal morphology) among many possible choices and the need to segment words into their formatives (or at least analyse words for purposes of classifying inflected words into the right paradigm; see Blevins 2013 for discussion). Of course, that morphological complexity differs from syntactic complexity is not novel, and some of the issues that have been addressed specifically about inflectional systems are the limits on inflectional classes (e.g. Carstairs 1983, Carstairs-McCarthy 1994, and discussion in Blevins 2004), optimal principal part systems (e.g. Finkel and Stump 2007, Finkel and Stump 2013), and entropy (e.g. Ackerman et al. 2009). Our focus here is to show that the two aspects of language production and comprehension that were mentioned as specific to morphology—the paradigmatic notions of selection and segmentation—can lead to enormous complexity even when one considers just a single block of inflectional realizations (or, in traditional terms, a single position class).1

Oneida (Iroquoian) verbs include bound, obligatory prefixes that reference participants in the situations described by verbs. Thus, the verb form in (1) includes the prefix luwa-, which indicates that the described situation includes a third person feminine singular, third person indefinite, or third person dual or plural agent acting on (represented by “>”) a third person masculine singular patient, while the verb form in (2) includes the prefix shukwa-, which indicates that the described situation includes a third person masculine singular agent acting on a first person plural patient.2

(1) luwa-hlo.lí-he≈
    3>3m.sg-tell-hab
    ‘she, someone, or they tell him’

(2) shukwa-hlo.lí-he≈
    3m.sg>1pl-tell-hab
    ‘he tells us’

The prefixes illustrated in (1) and (2)—traditionally labelled pronominal prefixes—occur in a single position class (i.e. in incremental or realizational terms, are the output of a single rule block). There are fifty-eight of these prefixes and ever since Lounsbury’s (1953) seminal work, the complexity of Oneida pronominal prefixes has been considered a hallmark of Iroquoian languages (and a challenge for second language learners; see Abrams 2006). Oneida pronominal prefixes, then, provide a rather unique case of paradigmatic complexity.

The bulk of our chapter is devoted to identifying the formal dimensions along which the Oneida pronominal prefix system may qualify as paradigmatically complex and we come back to the nature of paradigmatic complexity in the conclusion. Section 5.1 briefly describes Oneida pronominal prefixes. Section 5.2 identifies parameters of

1 See Hankamer () on Turkish, Fortescue () on Greenlandic Eskimo, as well as Anderson (this volume), for cases where morphology is syntagmatically relatively complex.

2 In the Oneida examples R is a mid, central, nasalized vowel, and u is a high or mid-to-high, back, nasalized vowel. A raised period represents vowel length. Voicing is not contrastive. Abbreviations used in the morpheme glosses and in Table 5.1 are: a(gent), caus(ative), dp (dual or plural), du(al), ex(clusive person), fact(ual mode), f(eminine), fi (third person feminine singular or third person indefinite), fz (feminine-zoic), hab(itual aspect), indef (third person indefinite), jn (joiner vowel), m(asculine), n(euter), p(atient), pl(ural), pnc (punctual aspect), rep(etitive), sg (singular). The symbol > indicates a proto-agent acting on a proto-patient; for example, 3m.sg>1sg should be understood as third person masculine singular acting on first person singular. / indicates that proto-role is underspecified for the prefix (i.e. the prefix references semantic properties of two participants, but not which of them is a proto-agent and which is a proto-patient). A comma before du or pl indicates that the dual or plural number is specified either for the proto-agent or the proto-patient; for example, 1>2,du means that there is a first person acting on a second person, when either first or second person is dual, or both are dual. The bare numeral 3, unaccompanied by any number or gender, abbreviates third person indefinite, third person feminine singular, third person masculine dual or plural, and third person feminine-zoic dual or plural.


paradigmatic complexity. Sections 5.3, 5.4, and 5.5 focus on how Oneida pronominal prefixes stack up on three of these parameters (size of the space of morphological distinctions to be marked, semantic ambiguity, and directness). Section 5.6 discusses one aspect of paradigmatic complexity specific to Oneida pronominal prefixes that, to our knowledge, has not been discussed in the literature, and that is the possible misidentification of pronominal prefixes in verb forms. Section 5.7 concludes the chapter.

5.1 A brief description of Oneida pronominal prefixes

The Oneida verb has an elaborate internal structure. Stems can be complex, derived via prefixation, suffixation, and noun incorporation. In addition to the obligatory pronominal prefixes, verb forms must have an aspect suffix or an imperative ending (often ‘zero’). The aspectual categories are habitual (basically imperfective), punctual (perfective), and stative (having stative, perfect, or progressive meaning depending on the verb). Verbs in the punctual aspect always occur with one of three modal prefixes, the factual, future, or optative (usually the factual in this chapter). In addition to a modal prefix, verb forms can have one or more of eight other prepronominal prefixes.

An example of a typical verb in Oneida is given in (3).

(3) s-a-huwRn-ákt-a-ht-e≈
    rep-fact-3>3m.dp-go.to.a.point-jn-caus-pnc
    ‘they pushed them back, they made them retreat’

Pronominal prefixes are portmanteau-like. Although it is often possible to associate parts of the prefixes with some features (e.g. ti or wa with plurality), it is widely accepted that a single prefix references one or two participants as a whole. Thus, although one may associate the initial l with masculine in the prefix luwa- ‘3>3m.sg’, one cannot segment the prefix into two subparts, each referencing a distinct participant in the described situation. More generally, even parts of prefixes most easily associated with particular attribute-value pairs do not allow a segmentation into proto-agent and proto-patient parts. Consider the prefixes li- (referencing a 1sg proto-agent acting on a 3m.sg proto-patient) and lak- (referencing a 3m.sg proto-agent acting on a 1sg proto-patient). One can recognize l in both as marking masculine gender, but it marks the masculine gender of the proto-patient in the first case and the masculine gender of the proto-agent in the second case.

Semantic categories distinguished by the pronominal prefixes are person (first, second, third, plus an inclusive/exclusive distinction), number (singular, dual, plural), and gender (masculine, feminine, feminine-zoic, neuter). In the singular, feminine-zoic gender is used for some female persons (see Abbott 1984) and for animals that are not personified and marked with the masculine. It is also used for all female persons in the dual and plural, as the feminine gender is restricted to the singular. Note that neuter gender is a semantic category only (as explained later). In addition, there is a third person indefinite (‘indefinite’ in Lounsbury 1953, or ‘nonspecific’ in Chafe 1977) translated as ‘one, people, they, someone’. Because of the number of properties the prefixes can reference and because, as we will elaborate, prefixes mark up to two semantic arguments, the number of prefixes for this inflectional slot is quite large, fifty-eight in total. The fifty-eight prefixes are given in Table 5.1, based on Table 6 in Lounsbury 1953; the prefixes are numbered as in Lounsbury, and later on, when we refer to specific prefixes, we often identify the prefix with this number, preceded by ‘L’ (for Lounsbury).

The prefixes fall into two categories. ‘Transitive’ prefixes mark two animate arguments, as in the example in (4), repeated from (2). In Table 5.1, the properties of the proto-agent that are marked by transitive prefixes are given in the leftmost column and properties of the proto-patient are given in the top row. The prefix in example (4) is used for a third person masculine singular proto-agent acting on a first person plural proto-patient.

(4) shukwa-hlo.lí-he≈
    3m.sg>1pl-tell-hab
    ‘he tells us’

‘Intransitive’ prefixes mark the single argument of monadic verbs. There are two categories of intransitive prefixes: an A(gent) set (‘subjective’ in Lounsbury 1953) and a P(atient) set (‘objective’ in Lounsbury 1953), exemplified in (5) and (7), respectively. In Table 5.1 agent prefixes are given in the column headed ∅ (for no patient) and patient prefixes are given in the row labelled ∅ (for no agent). Verbs lexically select for agent/patient, although the distribution is not without semantic generalizations. Intransitive agent and patient prefixes are used also with dyadic or triadic verbs when there is only a single animate semantic argument and the other argument(s) is inanimate, as shown in (6), which has the same agent prefix as the example in (5).

(5) wa-ha-ya.kR-ne≈
    fact-3m.sg.a-go.out-pnc
    ‘he went out’

(6) wa-ha-yRtho-≈
    fact-3m.sg.a-plant-pnc
    ‘he planted (it)’

(7) lo-nolú.se-he≈
    3m.sg.p-lazy-hab
    ‘he is lazy’


Table 5.1 Oneida pronominal prefixes (C-stem allomorphs)

(proto-agent, rows)   1sg   1du   1pl   2sg   2du   2pl   3m.sg   3fz.sg   ∅ (agent)   3f.sg/3indef   3m.dp   3fz.dp   (3n)

1sg   52. ku- 53. kni- 54. kwa- 25. li- 1. k- 39. khe-

1 ex.du 26. shakni- 2. yakni- 40. yakhi-

1 ex.pl 27. shakwa- 3. yakwa-

1 in.du 28. ethni- 4. tni- 41. yethi-

1 in.pl 29. ethwa- 5. twa-

2sg 55. sk- 56. skni- 57. skwa- 30. etsh- 6. s- 42. she-

2du 31. etsni- 7. sni- 43. yetshi-

2pl 32. etswa- 8. swa-

3m.sg 34. lak- 35. shukni- 36. shukwa- 37. ya- 31. etsni- 32. etswa- 21. lo- 10. la- 38. shako-

3fz.sg 16. wak- 17. yukni- 18. yukwa- 19. sa- 7. sni- 8. swa- 20. yo- 9. ka- 22. yako-

(patient)

24. loti- 23. yoti- 20. yo-

3f.sg

3indef

46. yuk- 47. yukhi- 48. yesa- 43. yetshi- 33. luwa- 49. kuwa- 11.ye- 58. yutat- 51. luwati- 50. kuwati-

3m.du 14. ni- 45. shakoti-

3m.pl 15. lati-

3fz.du 12. kni- 44. yakoti-

3fz.pl 13. kuti-

(3n) 9. ka-


The example in (6) shows that in dyadic verbs only animate arguments are marked. But since all verbs must have a pronominal prefix, verbs without any animate arguments default to an agent or patient feminine-zoic singular prefix, as shown in (8). In other words, neuter gender (for inanimates) is a semantic category only; it is not relevant morphologically (see Koenig and Michelson 2012).3 (In Table 5.1 the default feminine-zoic prefix is given in a cell labelled ‘(3N).’)

(8) wa≈-ka-ná.nawR-≈
    fact-3fz.sg.a-get.wet-pnc
    ‘it got wet’

All fifty-eight prefixes have varying forms (‘allomorphs’), at least two and up to five, depending on the initial sound of the following stem. There are five stem-classes: C-stems, i-stems, o-/u-stems, e-/R-stems, and a-stems. In addition, thirty-nine of the fifty-eight prefixes have two variants depending on what occurs to their left (e.g. whether they occur word-initially or not). Allomorphy is exemplified in (9) with L39 khe-/khey- and with L33 luwa-/luwR-/luway-/luw-/-huwa-/-huwR-/-huway-/-huw-. In total, there are 326 allomorphs, with an average of about five allomorphs per prefix. Note that the allomorph khe- of L39 occurs with stems that begin with a consonant or i, while the allomorph khey- occurs with o-/u-stems, e-/R-stems, and a-stems. This is one possible grouping, i.e. C- and i-stems, versus o-/u-stems, e-/R-stems, and a-stems. The allomorphs of prefix L33 luwa-/luwR-/luway-/luw-/-huwa-/-huwR-/-huway-/-huw- exhibit a different grouping, namely e-/R-stems and a-stems (-huw-) versus o-/u-stems (-huway-) versus i-stems (-huwR-) versus C-stems (-huwa-). There are eleven such groupings.

(9) a-stem                            a-stem
    wa≈-khey-atnúhtuht-e≈             wa-huw-atnúhtuht-e≈
    fact-1sg>3-wait.for-pnc           fact-3>3m.sg-wait.for-pnc
    ‘I waited for her or them’        ‘she or they waited for him’

    e-/R-stem                         e-/R-stem
    wa≈-khey-Rhahs-e≈                 wa-huw-Rhahs-e≈
    fact-1sg>3-belittle-pnc           fact-3>3m.sg-belittle-pnc
    ‘I belittled her or them’         ‘she or they belittled him’

3 The morphological irrelevance of neuter raises the question of whether neuter is relevant at all for Oneida morphosyntax, as a reviewer points out. The answer is that it still is, as it is a fact of Oneida grammar that only animate arguments are referenced morphologically. The statement of this fact requires the semantic differentiation of animate and inanimate semantic indices (see rule () and Koenig and Michelson for details and Koenig and Michelson, in press, for how depersonalization can be used for communicative purposes).


    o-/u-stem                         o-/u-stem
    wa≈-khey-ótyak-e≈                 wa-huway-ótyak-e≈
    fact-1sg>3-raise-pnc              fact-3>3m.sg-raise-pnc
    ‘I raised her or them’            ‘she or they raised him’

    i-stem                            i-stem
    wa≈-khe-(i)hnúks-a≈               wa-huwR-(i)hnúks-a≈
    fact-1sg>3-fetch-pnc              fact-3>3m.sg-fetch-pnc
    ‘I went after, fetched her or them’   ‘she or they went after, fetched him’

    C-stem                            C-stem
    wa≈-khé-kwaht-e≈                  wa-huwá-kwáht-e≈
    fact-1sg>3-invite-pnc             fact-3>3m.sg-invite-pnc
    ‘I invited her or them’           ‘she or they invited him’

    C-stem (initial allomorph)
    luwa-kwát-ha≈
    3>3m.sg-invite-hab
    ‘she or they invite(s) him’

5.2 Evaluating paradigmatic complexity

As we mentioned in the introduction, our goal in this chapter is to delineate a form of complexity only morphology exhibits, what we call paradigmatic complexity. We will anchor our discussion of paradigmatic complexity to the number and kinds of rules needed for realizing the morphological distinctions expressed by Oneida pronominal prefixes. By discussing the issue in the context of rules for realizing morphological feature bundles (so-called rules of exponence, see Stump 2001), we can more easily provide possible measures of paradigmatic complexity and evaluate Oneida pronominal prefixes on these possible measures. We provide more speculative remarks on how and why these measures can serve as indices of paradigmatic complexity in the conclusion.

Since pronominal prefixes encode properties of participants in the event described by a verb, one can think of the proper use of pronominal prefixes by speakers as the result of the correct application of two sets of rules (or constraints). The first set of rules maps semantic properties of participants in the described situation onto morphosyntactic feature sets; the second maps morphosyntactic feature sets onto phonological marks. The first set of rules relates the semantic categories of participants in the situation types described by verbs and the values of the morphological agr attribute; the second relates the values of the morphological agr attribute to the phonological reflexes of those values. We have nothing to say about the first set of rules, except to note that that set of rules is simple in the case of Oneida, as the morphosyntactic features are easily inferrable from observable properties of participants in situations. In other words, Oneida prefixes exhibit what one could call semantic φ feature sets. Pronominal prefixes reference properties of one or two animate participants in situations, number marking corresponds to model-theoretic plurality, gender marking to model-theoretic gender, and so forth.4

So, what makes the Oneida pronominal prefix system complex? One way to approach this question is to think about what inflectional rules are. At a conceptual level, we conceive of inflectional rules as in (10a); a very informal example from Oneida is provided in (10b) (see below for more formal examples).

(10) a. Morphosyntactic Feature Set ⇒ Form

b. 3m.sg>1pl ⇒ shukwa

With (10) in mind, we can distinguish at least four kinds of complexity:

1. Size:
   (a) What is the number of rules of the form (10)? Obviously morphological complexity increases with the number of inflectional rules. For Oneida pronominal prefixes, we need at most 326 rules (see Zwicky 1985 and later in this section for more details), since there are 326 forms (or allomorphs).
   (b) What is the size of the morphosyntactic feature space? In other words, what is the number of distinctions that can be marked (abstracting away from neutralization)? Assessing that number depends on the linguistic model of the morphosyntactic feature set, as explained later in this section.

2. Semantic ambiguity: On average, how many semantic distinctions are expressed by the same form? In the case of Oneida pronominal prefixes, on average, how many combinations of participant categories are expressed by a single prefix?

3. Directness: Can we account for all the forms with rules that have the form in (10)? The form of inflectional rules in (10) is simple: it assumes a direct link between morphosyntactic feature sets and phonological forms. The question is whether a particular inflectional block (in this case, Oneida pronominal prefixes) needs anything more than that. There is evidence, as we elaborate later, which suggests that the association between bundles of morphosyntactic features and forms can be indirect and in some cases the kind of rule that is required is one where the output (form) of one rule is extended to, or identified with, the output (form) of another rule (i.e. requires reference to a function, akin to a paradigm function in Stump 2001).

4. Segmentation and generalizability: How difficult is it to recognize a pronominal prefix that is present in a particular verb form? For example, given the form shukwahlo.líhe≈ ‘he tells us’, given earlier in (4), how easy is it to separate the prefix from the stem so that speakers can then identify the prefix in lakhlo.líhe≈ ‘he tells me’?

4 As can be expected, this is an oversimplification, but the extent to which features do not correspond to model-theoretic properties can be attributed to ordinary grammar ‘leakage’.


One way to think about this kind of potential difficulty is to ask how difficult it is to backward-chain from the consequent of the conditional to its antecedent (see Russell and Norvig 2009 on backward chaining). Another way of thinking about this question is to ask how difficult it is to infer the entire set of inflectional rules from a newly learned inflected form (or set of inflected forms). So, for example, how difficult is it to predict the form khehlo.líhe≈ ‘I tell her or them’ from either lakhlo.líhe≈ ‘he tells me’ or shukwahlo.líhe≈ ‘he tells us’ (or both)? This issue is sometimes raised in discussions of principal parts or discussions of conditional entropy (see Finkel and Stump 2007; Ackerman et al. 2009), but the specific kinds of problems that result from being led astray in segmenting a verb form and generalizing from it are rarely discussed in the literature, as far as we know, and they are a particularly interesting aspect of what makes Oneida pronominal prefixes complex.

5.3 Size

Oneida pronominal prefixes are rather complex if our measure is number of rules. The upper bound on the number of rules is 326. There are fifty-eight morphs, with on average a little over five allomorphs per morph. We use the term morph here rather than morpheme to stress that we are talking about a class of forms and not committing ourselves to a morpheme-based theory of inflection. The number of allomorphs is an upper bound on this complexity dimension because if inflectional morphology is a set of rules of the form Properties ⇒ Forms, there are at most as many rules as there are distinct forms.

But if 326 is the maximum number of rules—something quite large for a single inflectional block—another possible measure of paradigmatic complexity is the space of possible morphosyntactic properties that the particular inflectional block serves to realize or mark. Oneida prefixes can mark up to two animate arguments, and if they mark one animate argument they can belong to the Agent or Patient series of (intransitive) prefixes. If the features were all orthogonal, a pronominal prefix could reference, in the worst-case scenario, (4 (persons) × 3 (numbers) × 4 (genders)) + 1 indefinite = 49 combinations of features. Since transitive prefixes reference two arguments, transitive pronominal prefixes could reference up to 49 × 49 = 2,401 feature combinations and Agent and Patient intransitive prefixes could reference 49 feature combinations each, resulting in a total of 2,499 feature combinations. The space of possible antecedents of rules of exponence for Oneida pronominal prefixes would be quite large indeed.
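The worst-case arithmetic can be spelled out in a few lines; the following sketch (in Python, purely for illustration) simply recomputes the figures just given, using the category counts from the text.

    # Worst-case count of feature combinations if person, number, and gender
    # were fully orthogonal (category counts as given in the text).
    persons, numbers, genders = 4, 3, 4
    per_argument = persons * numbers * genders + 1   # + third person indefinite = 49

    transitive = per_argument ** 2                   # two arguments referenced: 2,401
    intransitive = 2 * per_argument                  # Agent and Patient series: 98

    print(per_argument, transitive, transitive + intransitive)   # 49 2401 2499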

Whether 2,499 is a useful measure of paradigmatic complexity is not entirely clear, though. This is because argument properties that can be referenced by prefixes are not orthogonal. Much of the work on category structure in the 1980s (see Gazdar et al. 1988 for a nice overview), following more traditional structuralist work, reduces the space of feature combinations we just outlined quite significantly. So, a more realistic measure of how size may lead to the complexity of the Oneida pronominal prefixes is the minimum number of morphological distinctions that may be referenced phonologically if one adopts the most parsimonious and motivated analysis of the space of feature combinations.

In Oneida, as in most, if not all, languages, there are restrictions on the combination of properties of φ feature bundles. For example, gender is a relevant morphosyntactic feature only for third person. Some of these restrictions are typologically, or logically, expected (the one we just mentioned or the fact that the person value inclusive requires dual or plural number). Some are idiosyncrasies of the morphology of Oneida, such as the fact that the feminine gender occurs only in the singular. (As mentioned earlier, when referring to two or more females, the feminine-zoic gender is used.) Formally, restrictions on the space of possible combinations of φ features result from two kinds of mechanisms or constraints: type (or sort) appropriateness conditions and feature co-occurrence restrictions. The first set of constraints encodes systematic restrictions on categories. For example, (11b) says that if the nominal index is of type 3-n-indef-index (denotes a discourse referent that is third person and not required to be unspecified/indefinite), then and only then the feature gend is appropriate. By restricting features to the appropriate categories of nominal indices, type appropriateness conditions are a way of encoding that, as far as Oneida’s morphology is concerned, gender is an attribute that only makes sense for non-indefinite third person participants. The second kind of constraints model idiosyncratic restrictions on combinations of properties. Thus, (12b) says that if a nominal index is of feminine gender, the value of the num attribute is singular. It so happens that dual or plural number and feminine gender are incompatible in Oneida. Feature co-occurrence restrictions encode this unpredictable incompatibility of attribute values.

(11) a. Only third person nominal indices that are not indefinite bear gender information.

     b. 3-n-indef-index ⇒ [gend gender]

(12) a. Feminine nominal indices are singular.

     b. [gend fem] ⇒ [num sg]

The net effect of type appropriateness and feature co-occurrence constraints on combinations of φ features is to reduce the number of combinations from forty-nine (if features were truly orthogonal to each other) to nineteen, shown in Table 5.2.

But the number of possible φ feature combinations is further reduced by one general constraint. Participants in described situations that are inanimate are never referenced morphologically. Thus, one can speak of four semantic genders in Oneida (and Iroquoian, in general), namely, masculine, feminine, feminine-zoic, and neuter,


Table 5.2 The nineteen possible categories of semantic indices

1st sg/du/pl
2nd sg/du/pl
Incl du/pl
3rd-indef
3rd masc/feminine-zoic/neuter sg/du/pl
3rd feminine sg

but only three morphological genders. We dub this constraint the Animate Argument Constraint. It is stated in (13). (Recall that verbs that have only inanimate (neuter) arguments default to the feminine-zoic gender.) The upshot of this constraint is that out of the nineteen semantic categories of nominal indices, only sixteen are morphologically relevant.

(13) All and only indices for animate semantic arguments of verbs are members of the value of the agr attribute.

All in all, linguistic analysis allows us to reduce the space of possible feature combinations from 2,499 to 16 × 16 + 2 × 16 = 288 possible combinations. In addition, for first person acting on first person, and for second person or inclusive acting on second person, a reflexive construction (prefix) is used, and this further reduces the number of possible combinations from 288 to 248. Quite large, but not inconceivable.5
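Again the arithmetic can be checked with a short sketch; note that the split of the forty reflexive-handled combinations into 5 × 5 first-on-first and 5 × 3 second-or-inclusive-on-second categories is our reconstruction of the step from 288 to 248, not a breakdown given in the text.

    # Sixteen morphologically relevant categories (Table 5.2 minus the three
    # neuter categories excluded by the Animate Argument Constraint).
    relevant = 16
    transitive = relevant * relevant        # 256
    intransitive = 2 * relevant             # Agent and Patient series: 32
    total = transitive + intransitive       # 288

    # Assumed breakdown of the combinations handled by the reflexive prefix:
    # 1st person acting on 1st person (5 x 5) and 2nd/inclusive acting on 2nd (5 x 3).
    reflexive = 5 * 5 + 5 * 3               # 40
    print(total, total - reflexive)         # 288 248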

5.4 Semantic ambiguity

Reducing the number of rules by reducing the number of possible φ combinations (from 2,499 to 288/248, in this case) will always lead to a reduction in complexity, as it simply reduces the space of morphosyntactic properties to reference morphologically. However, a reduction in the number of possible antecedents of rules of exponence (and consequently a reduction in the number of rules) does not result in a reduction of complexity unequivocally. This is because a reduction in rule antecedents is possible only because not all of the possible feature combinations are realized by distinct forms and this fact results in semantic ambiguity, as we explain in this section.

5 Another way of measuring the complexity that may arise from the sheer size of the Oneida pronominal prefix paradigm is to use Carstairs’ () approach to complexity and compute the number of logically possible paradigms from the number of affixes and the number of allomorphs for each affix. The number of such ‘possible paradigms’ for the Latin nominal declension is a little over 2 × 10⁴; the number of such ‘possible paradigms’ for Oneida pronominal prefixes is a little under 4 × 10²⁵. Whether Carstairs’ measure is useful or not is a matter of debate (see Blevins ), but the difference in size indicates that Oneida pronominal prefixes are in another league from the Latin nominal declension.


    nominal-index
        3-indef-index [person 3]
        n-indef-index [num number]
            sp-part-index [pers sp-part]
            3-n-indef-index [pers 3, gend gender]

Figure 5.1 A hierarchy of nom-index

    person
        3
        sp-part
            Incl
            Excl
                1-Excl
                2-Excl

Figure 5.2 A hierarchy of person values

    gender
        neuter
        anim
            fem
            other-anim
                fem-zoic
                masc

Figure 5.3 A hierarchy of gender values

    number
        sg
        dp
            pl
            dual

Figure 5.4 A hierarchy of number values


Although the total number of morphosyntactic distinctions which can be marked in Oneida is 288/248, only fifty-eight of those are marked. To model the reduction from the number of potential morphosyntactic distinctions to the number of actual morphosyntactic distinctions, we make use of underspecification. Technically, underspecification in our rules of exponence is achieved by letting rules of exponence make reference to more or less specific types of nominal indices or properties of nominal indices. So, a rule that applies to all animate participants will have the value of the gend attribute of the corresponding nominal index be animate, but a rule that only applies to masculine gender participants will have the value of the gend attribute of the corresponding nominal index be masc. The hierarchies of types of nominal indices as well as the hierarchies of feature-values relevant for Oneida morphology are presented in Figures 5.1–5.4, where each non-leaf node in a hierarchy represents a more general type referred to by at least one rule of exponence.6

An example of an underspecified rule of exponence is given in (14).

(14) a. If a stem belongs to the consonant class and references a first person exclusive dual or plural proto-agent acting on a third person feminine singular proto-patient, prefix yakhi- to its phonological form.7

     b. [morph
            pdgm [ agr ⟨ [pers excl, num dp], [gend fem, num sg] ⟩ ] [2]
            stem [ class c ] [1]
        ]
        ⇒ expo( yakhi ⊕ [1], [2] )

The antecedent of this rule applies to all lexemes that are in the consonant stem class and that describe situations where a first person exclusive dual or plural set of entities acts on a third person feminine singular entity. The antecedent leaves underspecified the number of the proto-agent argument by having as value for the number attribute the non-leaf type dp. In other words, the type dp covers all non-singular numbers, i.e. both plural and dual participants. The consequent simply concatenates the pronominal prefix yakhi- to the consonant-initial stem. Now, the number of distinct values of the agr features that serve in antecedents of rules of exponence is fifty-eight, and each of the fifty-eight distinct agr values is associated with a set of allomorphs. So, in addition to the rule in (14), we also have the rule in (15), which targets the same agr value (that is, references the same participant categories), but applies to lexemes that belong to the a-, e-/R-, and o-/u-stem classes.

6 By Excl in Figure 5.2 we mean first person to the exclusion of second person or second person to the exclusion of first person.

7 The prefix yakhi- also applies when the proto-patient is third person indefinite/unspecified, or masculine or feminine-zoic dual or plural, as the result of the application of three distinct rules of referral, as we mention later on.


(15) a. If a stem belongs to the a, e/R, or o/u class and references a first person exclusive dual or plural proto-agent acting on a third person feminine singular proto-patient, prefix yakhiy- to its phonological form.

     b. [morph
            pdgm [ agr ⟨ [pers excl, num dp], [gend fem, num sg] ⟩ ] [2]
            stem [ class a ∨ e/R ∨ o/u ] [1]
        ]
        ⇒ expo( yakhiy ⊕ [1], [2] )

As alluded to earlier, to avoid committing ourselves to a morpheme-based view of inflection, we will call morph a class of allomorphs that share agr values, that is, a class of consequents of rules whose antecedent agr values are identical. For purposes of presentation, we will refer to morphs by their consonant-stem allomorphs. So, the consequents of (14b) and (15b) belong to the same morph. The set of distinct cells in Table 5.1 (or Table 6 in Lounsbury 1953) is the set of morphs in Oneida. Allomorphs of twenty-three of the fifty-eight morphs realize underspecified agr values.8 Underspecification allows us to further reduce the number of morphs from 248 to fifty-eight, i.e. by almost one-half order of magnitude, the set of agreement distinctions relevant to Oneida pronominal prefix rules of exponence.
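How a single underspecified rule stands in for several fully specified feature combinations can be illustrated with a small sketch (again in Python, and not in the authors’ formalism); the feature values and stem classes follow rules (14) and (15), while the data structures and the stem are ours.

    # Sketch: one rule keyed on an underspecified agr value (num 'dp') covers
    # both dual and plural proto-agents; a second rule supplies the allomorph
    # for the vowel-stem classes named in rule (15).
    RULES = [
        # (proto-agent features, proto-patient features, stem classes, prefix)
        ({"pers": "excl", "num": "dp"}, {"gend": "fem", "num": "sg"},
         {"C"}, "yakhi"),                       # cf. rule (14)
        ({"pers": "excl", "num": "dp"}, {"gend": "fem", "num": "sg"},
         {"a", "e/R", "o/u"}, "yakhiy"),        # cf. rule (15)
    ]

    def subsumes(rule_value, actual_value):
        """An underspecified value such as 'dp' covers both 'du' and 'pl'."""
        return rule_value == actual_value or (rule_value == "dp" and actual_value in {"du", "pl"})

    def exponence(agent, patient, stem_class, stem):
        for r_agent, r_patient, classes, prefix in RULES:
            if (stem_class in classes
                    and all(subsumes(v, agent[k]) for k, v in r_agent.items())
                    and all(subsumes(v, patient[k]) for k, v in r_patient.items())):
                return prefix + "-" + stem
        raise LookupError("no rule applies")

    # A dual and a plural exclusive proto-agent are realized by the same morph:
    print(exponence({"pers": "excl", "num": "du"}, {"gend": "fem", "num": "sg"}, "C", "hlo.lí-he"))
    print(exponence({"pers": "excl", "num": "pl"}, {"gend": "fem", "num": "sg"}, "C", "hlo.lí-he"))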

On the surface, the reduction in number of actual morphosyntactic distinctions through underspecification leads to a simplification of Oneida pronominal morphology. Just as the reduction of possible distinctions to be referenced morphologically is accomplished by structuring nominal indices, the underspecification of agr values reduces the number of rules of exponence. But the reduction in number of rules does not mean unequivocally a reduction in complexity of Oneida pronominal morphology. There is a distinction here between formal complexity and what one could call usage complexity.

The morphological system of Oneida (measured in the number of rules, constraints, and the like) is certainly made simpler by the use of underspecified agr values in the antecedents of rules of exponence. But, underspecification comes at a cost for the user, particularly in comprehension. This is because upon hearing many verb forms (any one of the twenty-three that are described by rules that exploit underspecification), native speakers will not be able to determine important properties of the participants in the situation described by the verb form (all the more so, since less than a quarter

8 Ten morphs realize underspecified number (L23 yoti-, L24 loti-, L40 yakhi-, L41 yethi-, L44 yakoti-, L45 shakoti-, L50 kuwati-, L51 luwati-, L53 kni-, L54 kwa-). Four morphs realize underspecified incl/excl (L17 yukni-, L18 yukwa-, L35 shukni-, L36 shukwa-). Four morphs leave underspecified the proto-role of the two participants they reference (L7 sni-, L8 swa-, L31 etsni-, L32 etswa-). One morph realizes underspecified gender (L21 lo-). Three morphs realize underspecified number and inclusive/exclusive (L47 yukhi-, L56 skni-, L57 skwa-), and one morph realizes underspecified number and role (L43 yetshi-).


of semantic arguments are further specified by external phrases in Oneida). Consider the prefix etsni- that attaches to stems beginning in a consonant. This prefix is used whenever the described situation involves a third person masculine singular participant and a second person dual participant. But, the prefix leaves underspecified which proto-role these two participants play. So, it can be used either when a third person masculine singular proto-agent acts on a second person dual proto-patient, or a second person dual proto-agent acts on a third person masculine singular proto-patient. Underspecification of proto-roles associated with the prefix etsni- means that upon hearing a word form containing this prefix, hearers may be uncertain as to who was the agent and who was the patient. In other words, underspecification of proto-role, while reducing the number of rules of exponence, makes corresponding utterances more ambiguous and is thus a source of complexity for hearers. True, some underspecification may not be resolved by hearers; for example, hearers may not care whether a form like yakhi- references a first person exclusive dual or a first person exclusive plural proto-agent (acting on a third person feminine singular). They may leave that issue unresolved until context resolves it. But the fact that, most plausibly, hearers will, for at least some forms (or some situations), attempt to resolve the semantic ambiguity arising from the underspecification of agr values means that underspecification is not an unqualified blessing.

5.5 Directness

Rules such as (14) and (15) directly associate a morphosyntactic feature set with a form. The question is whether the realization of all morphosyntactically relevant agr values takes this direct form in Oneida. The answer is No. There are good reasons to posit rules of referral of the kind Zwicky (1985) discusses. The rules of exponence we have talked about until now have used underspecification of values of a portion of the agr attribute to model syncretism. Implicit in this use of underspecification is the assumption that the relevant underspecified (nominal) index makes semantic sense. For example, the exponence rule that references 3m.dp acting on 3f.sg (the rule for the prefix shakoti-) can underspecify the dual/plural number distinction of the proto-agent index because non-singularity is a semantically coherent category. In contrast, there is no semantically coherent category corresponding to participants that are either third person indefinite or nonspecific (irrespective of number and gender) or third person feminine singular. The absence of a semantic natural class that would group together these two semantic indices is modelled in our account through the absence of an underspecified index that would cover these two indices.

To account for the formal similarity of exponents of third person indefinite/nonspecific and third person feminine singular arguments, we could, of course, use a disjunction in the agr value of rules of exponence, which amounts to having two sets of rules of exponence (as (p ∨ q) → r is logically equivalent to (p → r) ∧ (q → r)), one for agr values that include third person feminine singular indices, and one for agr values that include third person indefinite/nonspecific. Such an analysis amounts to treating the similarity in exponence as a phonological accident. And certainly those kinds of accidental formal identity arise in inflectional systems. But, in the case of 3indef and 3f.sg this is unlikely since the identity of exponence is systematic: All prefixes that reference a 3indef participant are formally identical to the prefixes referencing a corresponding 3f.sg participant (see Chafe 1977 for a possible historical explanation for that systematic formal identity).

To model what is not an accidental formal identity of exponents we use rules of referral that systematically relate the exponence of one feature bundle to that of another feature bundle. Of course, the need for rules of referral or how to model them is well known (see Stump 2001 for a detailed overview). What interests us here, though, is that rules of referral introduce another, more formal, source of complexity, something that has not been stressed before. In particular, representing rules of referral requires us to mediate the association between a morphosyntactic feature set and a form by making explicit reference to a function that takes the morphosyntactic feature set as input and outputs the form. In plain English, rather than having rules that say something like ‘spell out agr value α as x’, we now have rules that say ‘the value of the function that spells out α is the same as the value of the function that spells out β’. This is because a rule of referral basically says that the form marking a feature set α is the same as the form marking a feature set β. In other words, it says that the values of an (exponence) function are the same. Hence, rules of referral cannot be stated without the introduction of such exponence functions. The need for this additional level of abstraction (i.e. reference to exponence functions) is an additional complexity of the Oneida pronominal prefix system.

There is evidence for five such rules of referral in Oneida. One such rule is stated in (16). (16) says that intransitive prefixes of the A(gent) class that realize the agr value ⟨α⟩ are the same as transitive prefixes that realize the agr value ⟨α, feminine-zoic sg⟩, provided α is a speech participant or third masculine singular. An example prefix for this rule of referral is L5 twa-, which attaches to consonant-initial stems and marks either first person inclusive plural A(gent) participants, or a first person inclusive plural proto-agent acting on a third person feminine-zoic singular proto-patient. A similar rule says that intransitive prefixes of the P(atient) class that realize the agr value ⟨α⟩ are the same as transitive prefixes that realize the agr value ⟨feminine-zoic sg, α⟩, provided α is not third person masculine dual/plural or third person feminine-zoic dual/plural (in which case the prefixes loti- and yoti-, respectively, are used).

(16) a. The pronominal prefix for stems that reference a single A(gent) speech participant or third masculine singular nominal index α is the same as the pronominal prefix for stems that reference a proto-agent α acting on a third person feminine-zoic singular proto-patient.

     b. expo( [2], [morph
                       pdgm [ agr ⟨ [1] (sp-part-index ∨ [gend masc, num sg]) ⟩ ]
                       affix-type a
                   ] )
        = expo( [2], [morph
                         agr ⟨ [1], [gend zoic, num sg] ⟩
                     ] )

Inherent in rule (16) is the assumption that this is a case of asymmetric neutralization, as defined in Stump (2001) and Baerman et al. (2005). Evidence for the asymmetry comes from two prefixes, L7 sni- and L8 swa- (we cite, as usual, the allomorphs for consonant-initial stems), which, as transitive prefixes, neutralize the distinction between proto-agent and proto-patient arguments. When morphologically referencing two participants, sni- references either a third person feminine-zoic singular proto-agent acting on a second person dual proto-patient or a second person dual proto-agent acting on a third person feminine-zoic singular proto-patient. When referencing a single participant for stems that select A(gent) prefixes, sni- references a second person dual participant and when referencing a single participant for stems that select P(atient) prefixes, sni- likewise references a second person dual participant. The prefix swa- has a similar distribution: when referencing two participants, swa- references either a third person feminine-zoic singular proto-agent acting on a second person plural proto-patient or a second person plural proto-agent acting on a third person feminine-zoic singular proto-patient. When referencing a single participant for stems that select A(gent) prefixes, swa- references a second person plural participant, and when referencing a single participant for stems that select P(atient) prefixes, swa- likewise references a second person plural participant. (Note that sni- and swa- are the only morphs that occur in both intransitive agent and intransitive patient paradigms.) Now, if we assume the rules of exponence that correspond to sni- and swa- (and the other rules for the allomorphs of the same morph) have transitive agr values as antecedents, the equality in (16) can derive the corresponding intransitive uses of these prefixes. What is significant is the fact that the underspecified proto-role in the transitive uses of sni- and swa- is paralleled by the underspecified proto-role in three other prefixes: L31 etsni-, L32 etswa-, and L43 yetshi-. Rules of referral that extend transitive uses of sni- and swa- to intransitive uses allow us to capture this parallel and thus lead to a conceptually simpler model (albeit, not by much).

Three additional rules of referral are needed. One ‘extends’ the third person feminine singular to the third person indefinite/nonspecific; the other two extend the third person feminine singular to transitive prefixes with a third person masculine dual/plural or third person feminine-zoic dual/plural proto-agent, and to transitive prefixes with a third person masculine dual/plural or third person feminine-zoic dual/plural proto-patient.
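The indirection that a rule of referral introduces (an identity between the outputs of an exponence function, rather than a direct pairing of features with a form) can also be sketched in a few lines; the single entry below is the L5 twa- example cited for rule (16), and the representation (tuples of person, gender, and number labels) is ours rather than the authors’.

    # Sketch: a direct rule of exponence for one transitive cell, plus a rule of
    # referral that identifies the intransitive A(gent) cell with the
    # 'acting on 3fz.sg' transitive cell, as in rule (16).
    TRANSITIVE = {
        # (proto-agent, proto-patient) -> C-stem allomorph
        (("1", "incl", "pl"), ("3", "fz", "sg")): "twa",
    }

    def expo_transitive(agent, patient):
        return TRANSITIVE[(agent, patient)]

    def expo_intransitive_agent(participant):
        # Rule of referral: expo_A(<alpha>) = expo(<alpha, 3fz.sg>), valid when
        # alpha is a speech participant or third person masculine singular.
        return expo_transitive(participant, ("3", "fz", "sg"))

    print(expo_intransitive_agent(("1", "incl", "pl")))   # twa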

5.6 Segmentation difficulties

The question we set out to answer in this chapter is whether there are aspects of what we call paradigmatic complexity that are quite different from the more often studied syntagmatic complexity. The paradigmatic complexity we have focused on involves a single inflectional block, and one dimension that is specific to paradigmatic complexity is the complexity that arises from combining an affix with a stem. In Oneida, there are two ways that complexity can arise from concatenating a pronominal prefix with a stem. Both are due to phonological adjustments at the boundary between pronominal prefix and stem, and both have the effect of impeding the ability of speakers to generalize from an observed verb form to all the other verb forms based on the same stem.9 Consider first the forms in (17), all three of which include the pronominal prefix L38 shako-, which references a third person masculine singular proto-agent acting on a ‘3’ proto-patient. This prefix induces the deletion of a stem-initial a or i, as (17a) and (17b) indicate. As a result, the three verb forms in (17) are ambiguous as to stem class. Each of the forms could be an a-stem, an i-stem, or a C-stem. This kind of ambiguity is similar to the better-known ambiguities found in Indo-European languages, e.g. the conjugation class ambiguity of Latin amo (is it a first or third conjugation class form?).

(17) a. wa-shako-anúhtuht-e≈          ⇒ washakonúhtuhte≈
        fact-3m.sg>3-wait.for-pnc
        ‘he waited for her or them’

     b. wa-shako-ihnúks-a≈            ⇒ washakohnúksa≈
        fact-3m.sg>3-fetch-pnc
        ‘he went after, fetched her or them’

     c. wa-shako-li≈wanu.tú.s-e≈      ⇒ washakoli≈wanu.tú.se≈
        fact-3m.sg>3-ask.someone-pnc
        ‘he asked her or them’

Consider next the forms in (18) which show different allomorphs of the prefix L34 lak-, which indicates that the described event involves a third person masculine singular proto-agent acting on a first person singular proto-patient. The allomorph hakw- occurs with e-/R-stems, while the allomorph hak- occurs with consonant stems.

9 See Bank and Trommer (this volume) for a discussion of automatic learning of morphological segmentation.


But, the forms in (18) can be parsed two different ways, depending on whether or not one assumes the w is part of the pronominal prefix or part of the stem.

(18) 3m.sg>1sg hak/hakw: C-stem or e-/R-stem?

     a. wa-hakw-Rhahs-e≈
        fact-3m.sg>1sg-belittle-pnc
        ‘he belittled me’

     b. wa-hak-wRnahnóthahs-e≈
        fact-3m.sg>1sg-read.to-pnc
        ‘he read to me’

Both the types of situations exemplified in (17) and (18) lead to complexity because, in both cases, the class to which the stem belongs cannot be unambiguously determined from the forms given. As a consequence, one cannot infer in either case all the other forms in the paradigm of the stem, as selection of pronominal prefix allomorphs depends on stem class. The reason for this latter fact is specific to Iroquoian: Each morph has several allomorphs conditioned by the class to which the stem belongs, and that stem class is determined by the identity of the initial sound of the stem. In fact, the mapping between allomorphs and stem classes is itself rather complex because different morphs associate different groups of stem classes with the different allomorphs. For example, the morph referencing a first person exclusive plural agent has four allomorphs depending on the class to which the stem belongs, yakwa- if the following stem starts with a consonant, yakwR- if it starts with an i, yakw- if it starts with a, e, or R, and, finally, yaky- if it starts with o or u. The morph referencing a first person singular proto-agent acting on a ‘3’ proto-patient, on the other hand, has only two allomorphs conditioned by the stem class, khe- before consonant-stems and i-stems and khey- before a-, e-/R-, and o-/u-stems. Overall, there are eleven distinct groupings of stem classes for the fifty-eight pronominal prefix morphs. (Groupings were mentioned earlier at the end of section 5.1 in connection with the distribution of the allomorphs of prefix L39 khe- and prefix L33 luwa-.)
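The two groupings just described can be written out as simple lookup tables; the groupings themselves are those stated above, while the table representation is only an illustration.

    # Allomorph selection by stem class for two morphs with different groupings
    # of the five stem classes (values as given in the text).
    FIRST_EXCL_PL_AGENT = {
        "C": "yakwa-", "i": "yakwR-", "a": "yakw-", "e/R": "yakw-", "o/u": "yaky-",
    }
    FIRST_SG_ON_3 = {
        "C": "khe-", "i": "khe-", "a": "khey-", "e/R": "khey-", "o/u": "khey-",
    }

    print(FIRST_EXCL_PL_AGENT["o/u"], FIRST_SG_ON_3["o/u"])   # yaky- khey-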

Why do the segmentation difficulties we have just mentioned—i.e. determining the class to which the stem belongs either because the stem-initial segment is obscured due to phonological processes or because the boundary between prefix and stem can be located in more than one place—lead to complexities? First, speakers must learn, for each morph, which group of stem classes goes with each allomorph. There are some subregularities, of course. But, ultimately, nothing can save speakers from having to learn that a-, e-/R-, and o-/u-stems require khey- as the allomorph for the 1sg>3 prefix. Second, even after having learned which allomorph of which morph goes with which grouping of stem classes, speakers cannot necessarily generalize from a given form of a verb to all the other forms of that verb.

We can quantify somewhat the complexity introduced by segmentation difficulties by asking what is the number of morphs in each stem class from which the other


Table 5.3 Number of morphs in each stem class from which all other fifty-seven forms can be deduced

C-stem   i-stem   o-/u-stem   e-/R-stem      a-stem
1        11       45          24–40/20–30    26

fifty-seven forms of the stem class can be deduced. In a perfect system, each of the fifty-eight morphs for all five stem classes would allow one to infer the fifty-seven other forms. Table 5.3 shows that reality is far from perfect. Two notes regarding Table 5.3 are: (1) Lounsbury (1953: 56) reports that some speakers use prefixes with i-stems that are similar to those found on C-stems, while other speakers use prefixes that otherwise occur with stems with an initial vowel. The speakers that Michelson has worked with in Ontario since 1979 use the vowel-stem prefixes with only two verbs, -ihlu- ‘say’ and -ihey- ‘die’. The number of i-stem forms is based on the prefixes that overlap with C-stems since these represent the majority. (2) An innovation is that some of the transitive prefixes have either extended an allomorph ending in y from o-/u-stems to e/R-stems (L33 luway-, L48 yesay-, L49 kuway-) or developed new allomorphs in y before both o-/u-stems and e-/R-stems (e.g. L27 shukway- or L38 shakoy-). The table gives two numbers each for e-stems and R-stems; the first number is the number of morphs from which the other morphs of the same stem class can be deduced assuming the older forms without y; the second number is the number of morphs needed assuming the innovative forms with y are used instead.

Table 5.3 shows that, depending on the stem class, between 1.72 per cent and 78 per cent of verb forms allow one to infer the fifty-seven other verb forms (again, restricting ourselves for now to the combination of pronominal prefixes and stems). As Ackerman et al. (2009) have argued though, the complexity of an inflectional system partially depends on how frequent certain ambiguities are. In this particular case, the generalizability penalty is more or less severe depending on the relative frequency of the various stem classes. If C-stems, for example, are very rare, then the lack of generalizability for all but one morph is not as worrisome as if C-stems were very frequent. Table 5.4 shows the number of stems of each class in twelve pseudo-randomly chosen naturally occurring texts and the proportion of stems of each class across these twelve texts.

It seems that C-stems are the most frequent ones, accounting for 45 per cent of the stems in these twelve texts.10 So, the fact that only one of the fifty-eight forms

10 Despite the much more frequent occurrence of C-stem tokens in our sample texts, there is no intuitive sense in which C-stem forms are the default any more than, hypothetically, finding out that first declension forms are significantly more frequent in some Latin texts would lead us to say that first declension is the default declension class of Latin nouns.


Table 5.4 Number and percentage of stems of each class in twelve texts

              Total     %
C-stems         504    45
a-stems         373    34
e-/R-stems      109    10
i-stems         102     9
o-/u-stems       26     2
Total         1,114   100

for C-stems allows deduction of all other forms is worrisome. The average lack of generalizability from one form to all others is more severe than Table 5.3 would suggest.
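One way to see this is to weight the figures of Table 5.3 by the token proportions of Table 5.4; the sketch below does so purely for illustration, using 30 as a stand-in for the ambiguous e-/R-stem cell (the table gives two pairs of numbers for that class).

    # Illustrative only: expected share of forms from which the whole paradigm
    # can be recovered, weighting Table 5.3 by the text frequencies of Table 5.4.
    deducible = {"C": 1, "a": 26, "e/R": 30, "i": 11, "o/u": 45}   # of 58 morphs
    proportion = {"C": 0.45, "a": 0.34, "e/R": 0.10, "i": 0.09, "o/u": 0.02}

    weighted = sum(proportion[c] * deducible[c] / 58 for c in deducible)
    print(round(weighted, 2))   # roughly 0.24 under these assumptions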

5.7 Conclusion

Much of the discussion on what is a human language over the last sixty years has focused on properties of the syntactic and semantic combinatorics, the ‘discrete infinity’ displayed by both syntax and semantics. Measures of complexities (including, for example, where natural languages fall on the Chomsky hierarchy) have, as a consequence, focused on syntagmatic complexity, i.e. the result of putting morphemes, words, and phrases together. In this chapter, we discussed another kind of possible complexity, namely the complexity that a single position class or rule block can exhibit, what we call paradigmatic complexity. By focusing on a single block we believe it is easier to ask questions about what makes inflection complex and explore to what extent the kind of complexity exhibited by inflection differs from syntagmatic complexities.

The first kind of complexity a single inflectional block can exhibit is the sheer range of choices one can make. In the case of Oneida pronominal prefixes this means fifty-eight morphs and 326 allomorphs (roughly five allomorphs per morph). So, not surprisingly, paradigmatic complexity is first a matter of number of choices. But why are inflectional choices difficult? After all, one’s vocabulary is about three orders of magnitude larger than the fifty-eight morphs in the Oneida pronominal paradigm, at least according to some estimates of the average vocabulary size of English speakers. So learning or choosing words does not seem to be all that difficult. However, at least according to subjective accounts from second-language learners of English versus Oneida, learning or choosing inflected forms of a verb in Oneida is significantly more difficult than learning or choosing content words (unfortunately data suggesting that learning or using pronominal prefixes is easy or hard for native speakers is not available, as no children are learning Oneida as a first language). So, what is it that makes the retrieval and selection of a pronominal prefix difficult?


We can only offer speculative remarks here. Nevertheless, we think the issue is of sufficient theoretical interest to make it worth speculating about. First, the use of a particular pronominal prefix is obligatory. In other words, given the situation being described there is a single appropriate prefix that speakers must choose. Second, in choosing or interpreting pronominal prefixes speakers and hearers must attend to several very general properties that are not necessarily the most salient participant properties in the situation (the properties are not basic level properties, so to speak). We can quantify this putative underlying cause of perceived complexity by language learners by the number of decisions speakers must make when choosing a pronominal prefix (we leave aside allomorphy for now). They must decide if the situation involves one or two participants. If it involves one (animate) participant, they must decide if the verb selects Agent or Patient intransitive prefixes. For each participant that is morphologically referenced, they must make decisions about person and number, and for third person participants, whether it is indefinite/nonspecific or not, and, if not, what the participant’s gender is.

Thus, to choose the pronominal prefix L10 la-, a speaker must have decided that the situation involved two animate participants (1), that the proto-agent was third person (2), and if not-indefinite (3) that it was masculine (4), and singular (5), and that the proto-patient was third person (6), and if not-indefinite (7) that it was feminine-zoic (8), and singular (9). In total, speakers must make, in the worst case, between four and nine decisions (each decision requires making a choice between two and four alternatives) before they can select the appropriate morph.11
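For concreteness, the nine decisions just enumerated for la- can be listed out as a checklist (a restatement of the text, not an additional claim):

    # The decisions enumerated in the text for choosing L10 la-.
    decisions_for_la = [
        "two animate participants?",        # (1)
        "proto-agent third person?",        # (2)
        "proto-agent indefinite?",          # (3) no
        "proto-agent gender?",              # (4) masculine
        "proto-agent number?",              # (5) singular
        "proto-patient third person?",      # (6)
        "proto-patient indefinite?",        # (7) no
        "proto-patient gender?",            # (8) feminine-zoic
        "proto-patient number?",            # (9) singular
    ]
    print(len(decisions_for_la))            # 9, the worst case cited above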

When comparing this number with the typical number of decisions required toselect nominal or verbal suffixes in Indo-European the number is quite high. Toproperly decline Latin nouns speakers must only decide on the noun’s number andgender (we omit case, as it is syntagmatically determined, or declension class, as it ismore similar to the effect of stem-classes in Oneida). To properly conjugate, say, anancient Greek verb only five decisions must be made (voice; mood and tense; personand number). Moreover, these five decisions are not required to select onemorph, butrather a sequence of several morphs.The perceived complexity of Oneida pronominalprefixes, we surmise, is partly due to the large number of decisions speakersmustmakewhen choosing a single morph. These first two possible factors that may lead to thecomplexity of the Oneida pronominal prefix slot can be grouped together under therubric attentional complexity: To make one choice (which pronominal prefix to use),speakersmust attend tomany distinct and concurrent properties of participants in thedescribed situation.

11 Underspecification often helps speakers, as it reduces the number of decisions to be made. Thus, tochoose the prefix L yuk- (3>sg), speakers need only make four choices, as the prefix neutralizes alldistinctions among third person except masculine singular and feminine-zoic singular. The trade-off is anincrease in semantic ambiguity for hearers, as we discussed in section ..

OUP CORRECTED PROOF – FINAL, 6/3/2015, SPi

Oneida morphological complexity

Third, because of neutralization, some properties are relevant some of the time,but not all of the time. For example, the distinction between dual and plural is neverrelevant for proto-patients; it is also not relevant for some proto-agents, namely justwhen a first or second person proto-agent acts on a ‘3’ proto-patient. Speakers andhearers must therefore attend to the contrast between two or more first or secondperson proto-agents . . .unless the situation involves a ‘3’ proto-patient. Furthermore,because most neutralizations are not systematic (except for the distinction betweendual and plural third person proto-patients), speakers and hearers cannot rely onblanket statements on when to ignore some morphologically relevant participantproperties.

Fourth, having chosen which of the fifty-eight morphs is the right one for the situa-tion speakersmust then choose an allomorph. And this is not easy. Eachmorph has onaverage five allomorphs and allomorphy, although often motivated, is not automatic;it is something that speakers must learn for each morph (and appropriately use inproduction). Moreover, as we have illustrated, allomorphy can lead to segmentationambiguities for the hearer.

If number of morphs (or allomorphs) within the block leads to complexities, itwould seem that any reduction in size might lead to reductions in complexities.Interestingly, we tried to show that this is not necessarily the case. We distinguishedbetween two kinds of reduction in number of morphs. The first kind reduces thespace of possible morphosyntactic distinctions to mark or reference. We showedthat a skilled linguistic analysis can reduce the number of such distinctions from2,499 for an unstructured space to 288/248 when general and particular constraintson feature and feature-value combinations are in effect. Reductions in the numberof morphosyntactic distinctions that can be marked always lead to a reduction incomplexity. But, the second kind of reduction is not unmitigatedly a simplification.This is the reduction from the 288/248 possible morphosyntactic combinations tothe 58 Oneida pronominal morphs. We model this reduction mostly through theuse of underspecification in the statement of rules of exponence. Rules of exponencecan leave underspecified the type of nominal indices or features of nominal indices.Underspecification results in a reduction in the number of rules, but at the cost ofsemantic ambiguity. Given a verb form with a particular pronominal prefix hearerscannot be sure of all the properties of participants in the situation that the pronominalprefix references. In certain cases, it is possible hearers do not resolve this ambiguity,but, as we showed, some of the time what is underspecified is so important to thespeaker’s communicative intent (what situation is being described), that hearers aremost likely to have to disambiguate what is being referenced by the prefix, as whenthe proto-role is being underspecified.

In the end, whatOneida pronominal prefixes tell us is that there ismore to linguisticcomplexity than the combinatorial issues we are most familiar with from work insyntax and semantics. The need for speakers to make a set of inter-related choices

OUP CORRECTED PROOF – FINAL, 6/3/2015, SPi

Jean-Pierre Koenig and Karin Michelson

to retrieve and select a form can also lead to complexities. The importance andrelevance of accessing forms when producing or understanding languages will notsurprise psycholinguists.Whatmay bemore interesting is that sometimes a language’sgrammatical system can make this retrieval quite a difficult matter.

Acknowledgements

We thank Karin Michelson’s Oneida collaborators, and especially Norma Kennedy.We are also grateful to Matthew Baerman, Greville Corbett, and Hanni Woodburyfor reading earlier drafts of this chapter, and to Cifford Abbott, Michael K. Foster,and HanniWoodbury for discussion of Iroquoian pronominal prefixes. We thank ourcolleagues Rui Chaves, Matthew Dryer, David Fertig, and Jeff Good for their input.Finally, this chapter would not have been written without Farrell Ackerman and JimBlevins spurring our interest in this topic during their 2011 LSA Institute class.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi:Small is complex

MARINA CHUMAKINA AND GREVILLE G. CORBETT

. The problem

The Nakh–Daghestanian language Archi has a small paradigm of markers real-izing gender and number. Though small, this paradigm proves complex. We seethis complexity in the inventory of inflectional targets, since almost all parts ofspeech can mark gender and number but not all lexemes within the same partof speech behave alike. Predicting which will show gender and number is notstraightforward. More difficult is specifying the position of the gender and numbermarkers: many items have infixal marking, and these are found in some instanceswhere prefixal marking would be felicitous on purely phonological grounds. Thatis, Archi exhibits the typologically rare phenomenon of ‘frivolous’ infixation. Wepropose a number of factors which bear on the presence and position of the gender–number markers, and also on their forms and syncretism pattern; these factorsoverlap in ways which make it hard to isolate the impact of individual factors.Our approach will be to give the defaults, and the more specific overrides to thesedefaults. It is ironic that this complexity is found in the relatively small genderand number paradigm, since Archi is famed rather for the sheer size of its otherparadigms.

. Description of the system

For describing and analysing the morphological marking of gender–number agree-ment in Archi we use two main sources of data. First, there is the extensive workof Aleksandr Kibrik and his colleagues: the Archi grammar published in Russianin 1977 (here we use volumes I, II, and III, referred to as Kibrik et al. 1977,Kibrik 1977a, and Kibrik 1977b respectively) and the online collection of Archi

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

texts (Kibrik et al. 2007). Our second source is the electronic dictionary of Archi(Chumakina et al. 2007) compiled in the Surrey Morphology Group in 2004–7. Thiswork involved several field trips to the Archi village and the creation of a databasewith lexical, phonological, and morphological information on 4,464 Archi words.Examples given in the chapter come mainly from the database and from field-work.

We start with an example which demonstrates that every part of speech in Archi(except nouns) can be an agreement target:

(1) b-isi/ii.pl-1sg.gen

o‹b›q»a-t‰-ib‹i/ii.pl›leave.pfv-attr-pl

χ‰»eleguest(i/ii.pl)[abs]

b-ezi/ii.pl-1sg.dat

dit‰a‹b›usoon‹i/ii.pl›

e‹b›χniforget‹i/ii.pl›pfv

‘I soon forgot my guests, who had left.’

Archi is a strongly ergative language, hence agreement is controlled by the absolutiveargument; in (1) it is χ‰»ele ‘guests’. The verb eχmus ‘forget’ takes two arguments,following the pattern of verbs of perception: the dative (experiencer, the person whoforgets) and the absolutive (perceived entity, here: forgotten entity). Agreement onthis verb is realized by the infixal marker b. We follow the Leipzig glossing rulesin indicating infixes within angle brackets; information inferred from the bare stemis given within square brackets. In (1) every word is an agreement target (except,naturally, χ‰»ele ‘guests’ which is the controller). We consider first the more familiaragreement targets:

• the verb e‹b›χni ‘forget’ agrees with the absolutive χ‰»ele ‘guests’ in gender andnumber;

• the participle o‹b›q»at‰-ib ‘having left’ agrees with its head noun χ‰»ele ‘guests’ intwo places: the infix realizes agreement in gender and number, and the suffix innumber only;

• the genitive pronoun b-is ‘my’ agrees with its head noun χ‰»ele ‘guests’ in genderand number.

Those instances of agreement are unsurprising: verbs often agreewith their arguments,and attributives commonly agreewith their heads.There are two less familiar examplesof agreement in (1):

• the pronoun b-ez (the first singular pronoun in the dative case) agrees with theabsolutive argument χ‰»ele ‘guests’ in gender and number; agreement is realizedby a prefix.

• the adverb dita‰‹b›u ‘soon’ agrees with the absolutive argument χ‰»ele ‘guests’ ingender and number; agreement is realized by an infix.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

Neither of these targets has a direct syntactic connection to the absolutive argumentχ‰»ele ‘guests’, yet both agree with it1.

Example (1) illustrates most of the characteristics of the Archi agreement systemwhich will be important for us: the features involved in agreement, namely gender andnumber, the morphological positions for agreement markers (prefixes, infixes, andsuffixes), and the parts of speech serving as agreement targets. While (1) shows thatverbs, participles, pronouns, and adverbs can have amorphological slot for agreement,other parts of speech such as adjectives, particles, and a postposition can also serve asagreement targets in Archi.

Thus the range of agreement targets in Archi is particularly extensive. Conversely,when we consider the lexicon, the range is more restricted, since there is no part ofspeech where agreement is found in all lexical items. Table 6.1 presents data from theArchi dictionary showing the proportion of lexical items which show agreement.

Table . Agreeing lexical items in theArchi dictionary

total agreeing agreeing

adjectives 446 313 70.2verbs 1248 399 32.0adverbs 383 13 3.4postpositions 34 1 2.9enclitic particles 4 1 (25.0)

Based on Chumakina and Corbett (2008: 188)

From Table 6.1 we see clearly that within each part of speech it is only some itemswhich agree. For adjectives, the majority agree and it is possible to define formalproperties of agreeing items (see 6.3 and 6.4). The three bottom lines of Table 6.1represent parts of speech for which the need to agree must be fully specified in thelexicon, as there are no phonological, morphological, or semantic regularities. Verbsare specially interesting in this respect, and we discuss them in 6.5–6.6.

Personal pronouns were not included in Table 6.1, because it would be difficult togive meaningful figures. As we saw in example (1), some forms of some pronounshave a slot for agreement. To produce the right form, one needs both lexical andmorphological information: we cannot just list (or count) which pronouns agree, wemust also specify which cells of their paradigm have agreement slots. We discuss thisin section 6.3.

1 For more on the challenge of the agreeing adverbs in Archi, see the Wiki site ‘From compet-ing theories to fieldwork’ (http://fahs-wiki.soh.surrey.ac.uk/wiki/projects/fromcompetingtheoriestofield-workarchi/Archi.html), particularly the discussion in the seminar dedicated to topic ‘The domainproblem’.

2 The paper cited gives different numbers for the adverbs ( total, agreeing). These numbers havebeen corrected recently, as we have reanalysed some of the lexical items.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

For the items that agree, agreement is in gender and number.3 In this respect, Archiprovides a picture fairly typical for Daghestanian languages: there are two numbers(singular and plural) and four genders. Genders I and II comprise male and femalehumans respectively. The rest of the noun lexicon is distributed between genders IIIand IV, where III comprises animates, all insects, and some inanimates, and genderIV includes some animates, some inanimates, and abstracts.

We can now examine how all this comes together in paradigms. We will look atsome representative paradigms, to illustrate the points made so far.

(2) Adjective ha˛du ‘real, reliable’

gender

number sg pl

I ha˛du-(w)4

II ha˛du-rIII ha˛du-b

ha˛d-ib

IV ha˛du-t

The paradigm of the adjective in (2) shows a singular–plural distinction, with fourgender values distinguished in the singular but no differentiation of gender in theplural. All the marking is suffixal. In contrast, (3) illustrates one of the possibilitiesfor verbs:

(3) Verb aχas ‘lie down’, perfective stem:

gender

number sg pl

I a‹w›χuII a‹r›χu

a‹b›χu

III a‹b›χuIV aχu

aχu

Again, four gender values are distinguished in the singular, and in the plural we seejust a two-way distinction: genders I/II versus III/IV, which amounts to a distinctionbetween human and non-human. Note the interesting syncretisms: genders I and II inthe plural have the form of gender III singular, while genders III and IV plural have theform of gender IV singular. For more on gender syncretism in the nominal domain,see Milizia (this volume). In the paradigm in (3) the agreement markers are all infixal.The adverb in (4) provides a contrast with (3) in terms of agreement exponents:

3 We have argued elsewhere, most recently in Corbett (: –), that a person feature is requiredin the morphosyntax of Archi. The interesting complications need not concern us here, since the personforms are always syncretic with one of the forms we shall be analysing, and so person does not add to thedata we need to account for in this analysis of morphological forms.

4 The suffix [w] is not always pronounced, but it surfaces if there is a vowel following it.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

(4) Adverb k’ellej‹t’›u ‘entirely’

gender

number sg pl

I k’ellej‹w›uk’ellej‹b›u

II k’ellej‹r›uIII k’ellej‹b›u

k’ellej‹t’›uIV k’ellej‹t’›u

This is one of theminority of adverbs which agree.The number of values distinguishedby agreement and the pattern of syncretism are as we saw with the verb, but themarking is different. Though both the verb and the adverb use infixes, the verb marksgenders III and IV plural and gender IV singular by the bare stem (zero marking),whereas the adverb has a overt marker (t’). Finally, we consider a pronoun:

(5) First person singular pronoun, dative case:

gender

number5 sg pl

I w-ezb-ez

II d-ezIII b-ez

ezIV ez

Here we see another variation on the theme.The pattern of syncretism is as we saw in(3) and (4); the markers are similar to the verbal ones in (3), but here they are prefixes(note the difference in the realization of gender II singular: [r] when infixed, [d] whenprefixed). This pronoun agrees with the absolutive argument, as we saw in (1).

For those items that mark agreement, we have seen two different patterns ofsyncretism in the paradigms in (2)–(5). If there were a specific exponent for eachgender–number combination, we would have eight separate agreement markers. Inreality, there are nomore than five exponents for each part of speech. Abstracting awayfrom the actual forms in our examples, we can summarize the patterns of syncretism,A and B, as shown in Tables 6.2 and 6.3.

We have seen, too, that Archi employs prefixes, suffixes, and infixes to realizeagreement. Their distribution is not straightforward. To some extent it is determinedby the part of speech: adjectives realize agreement solely by suffixes, while adverbs,postpositions, particles, and some pronouns realize agreement by the infixes shown

5 Here, ‘gender’ and ‘number’ refer to the gender and number of the absolutive of the clause with whichthe dative pronoun agrees.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

Table . Syncretism pattern A

sg pl

I 13II 2

III 34IV 4

Table . Syncretism pattern B

sg pl

I 15II 2

III 3IV 4

Table . Archi prefixes

sg pl

I w-b-II d-

III b- IV

Table . Archi infixes, Set I

sg pl

I ‹w›‹b›II ‹r›

III ‹b›‹t’›IV ‹t’›

in (4) and given as Set I6. Some pronouns employ prefixes, and verbs use prefixesand the infixes shown in (3) and given as Set II. Tables 6.4–6.7 show all the possibleagreement affixes.

6 There are some exceptions such as the adverb b-allej‹b›u ‘for free’, where the agreement is realized twice:as a prefix and as an infix. This information is lexically specified. Double exponence can also be observedin the dative case of the first person plural exclusive pronoun (Table .).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

Table . Archi infixes, Set II

sg pl

I ‹w›‹b›II ‹r›

III ‹b›‹›IV ‹›

Table . Archi suffixes

sg pl

I -w

-ibII -rIII -bIV -t

To get to grips with the system of gender–number inflection in Archi, we canseparate out three ‘decisions’ required to establish the right agreement form:

1. determine whether the lexical item is an agreeing one;2. choose the right exponent, whether it is a prefix, suffix, or infix (the syncretism

patterns follow from this choice, only suffixes take syncretism pattern B);3. choose the actual shape of the affix (in those instances where there is a choice).

Here the inventory of the exponents does not show great phonological variability; itis not the richness of material that contributes to the complexity, but the number ofchoices required to produce the appropriate word form.This situation resembles thatin Oneida (Koenig and Michelson, this volume), where the number of choices in asingle position class or rule block produces considerable complexity of the language.

This is the bare bones of the problem. The next sections present the factors whichdetermine these choices.

. Factors regulating the system: an overview

There are several interconnected and overlapping factors which determine the shapeand position of the agreement marker, if any. We sketch them here, and then analysethem in turn. Phonology plays a role: for instance, stems beginning in a vowel arelikely to accept a prefixal agreement marker (as in the verb b-ak‰us ‘see.iii.sg’), whilethose with an initial consonant typically do not (ga‹b›χ‰»as ‘take off.iii.sg’).Then thereare derivational suffixes which bring with them an agreement slot; for instance, theoriginally emphatic suffix -ej‹›u licenses an agreement marker within itself (as in

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

k’ellej‹b›u ‘entirely.iii.sg’). In addition, there are generalizations based on the item’spart of speech: most adjectives and most verbs of a certain morphological type(namely, simple dynamic verbs) agree, while most adverbs do not. Naturally we try toappeal to fewer factors. We might imagine, for instance, that generalizations based onphonological and morphological forms might allow us to account for the agreementmarkers, and that we would then not need to appeal to part of speech classification.It turns out that this move fails, as we see when we consider adjectives. Thus, we cansuggest a phonological generalization, which states that vowel–initial stems are likelyto accept an agreement prefix. However, there are adjectives like o»ro»s ‘Russian’ whichare vowel–initial, and do not inflect for gender and number. This happens becauseagreeing adjectives are formed with the suffix -t‰u / -du / -nnu, which brings a suffixalagreement slot, so the adjectives not formed by suffixation do not take agreementmarkers. There are also adjectives like aburi-t‰u-b ‘tidy.iii.sg’, which are vowel–initialand are formed by suffixation; these do agree, but only by the suffix following thederivational suffix. Thus the phonological generalization works only for a subset ofitems, namely certain parts of speech (verbs) and not for adjectives. Consonant–initialadjectives, unsurprisingly, inflect for gender and number only if they are formed bysuffixation. This shows that adjectives disregard the phonological generalization, andagree suffixally or not at all.

Consider next the adverbs. The majority (ten out of thirteen) adverbs which agree,have the suffix -ej‹›u (or its variant -ij‹›u for younger speakers). This is historicallyan emphatic marker, but in some current adverbs, such as mumat‰ij‹b›u ‘while Iam asking you nicely’ there is no obvious emphatic semantics. In some instances-ej‹›u/-ij‹›u is synchronically a derivational marker in the sense that it is affixed toan independently attested base such as jella ‘this’ from which jellej‹t’›u ‘in this way’ isderived. In others, however, there is no independently attested base in modern Archi,as in allej‹t’›u ‘for free’.

Thus, it is true that every agreeing adverb has a final -u and the agreementmarker is prefixed to it. However, it is not the case that there is a synchronicallyjustifiable derivational rule, which would form adverbs and predict their ability toshow agreement. To add to the picture, we should note that there are also adverbs likeda»šo‰»nu ‘anywhere’ which happen to end in -u but which do not agree.

More generally, it becomes clear that we need reference both to phonological shapeand to part of speech, as well as to morphological and lexical factors at work withinparts of speech. To pull these factors apart, we first look at the (partial) generalizationsbased on the item’s part of speech (6.4).

. Part of speech determining the position of the agreement marker

Establishing the part of speech of an item is relatively straightforward in Archiaccording to syntactic tests. We have seen that items within all parts of speech except

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

nouns can realize agreement in number and gender. In addition, nouns inflect forcase, while verbs inflect for aspect and mood, among other things. Adverbs, particles,and postpositions do not inflect (except for gender and number).7 Given the part ofspeech of a particular lexical item, we can predict the pattern of syncretism, and wehave some knowledge of the exponent shape, though some parts of speech still allowa choice. We can give default predictions for the agreement forms for the following:

• adjectives: these follow pattern B (suffixes)• adverbs: pattern A (infixes, set I), (there is one exception: the adverb b-allej‹b›u‘for free.III.sg’ which has multiple exponence, see footnote 7)

• postpositions: pattern A (infixes, set I)• enclitic particles: pattern A (infixes, set I)8

Adjectives deserve a little more discussion. As already noted, inflected adjectivesare easily recognized by their form, as they all contain the suffix -t‰u/-du/-nnu.Many of them are participles of stative verbs (more on the stative–dynamic division in6.5). For example, na»μ-du-t ‘blue’ is a participle of the verb na»k≠’ ‘be blue’. However,there are also adjectives produced by the same suffix from adverbs: hinc-du-t ‘present,actual’ (now-attr-iv.sg), jak-du-t ‘deep’ (inside-attr-iv.sg) and loanwords: zor-t‰u-t ‘strong’, χas-du-t ‘special’, mašhur-t‰u-t ‘famous’. When used attributively, adjectivesinflect for number and gender only, but when used independently they can alsoinflect for case. The non-agreeing adjectives do not make a formally or semanticallycoherent group.

Pronouns employ prefixes and infixes (Set I) as exponents of agreement, but thisis restricted in two ways: only the paradigm of the first person pronoun is involved,and within it, only a few cells of the paradigm are affected. Table 6.8 shows a partialparadigm of the first person pronoun and, for a comparison, the second personpronoun. All cases where agreement occurs are shown, but the paradigm is partialbecause there are several more cases in Archi which are not given here.

Shaded cells show the agreeing forms. For those cells which agree, the genitivepronoun agrees with the head of its noun phrase or with the absolutive of the clause, ifthe genitive is governed by the verb; the dative and ergative agree with the absolutiveof the clause.We give just the singular forms (following syncretism patternA, the formfor genders I/II plural is as gender III singular, and the plural of genders III/IV pluralis as gender IVsingular). Note the instance of multiple exponence (prefix and infix) inthe dative of the first person plural inclusive.

Finally, verbs use prefixal and infixal exponents. The system is quite complex,with phonological and morphological factors determining the choice. The next three

7 Some adverbs have the possibility of adding different locative case endings, but for these it is not easyto decide whether we are dealing with inflection or derivation.

8 Verbs are not included in this list since they do not allow ready predictions (see .); for this reason,infix Set II, found only with verbs, is not in the list.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

Table . Partial paradigm of first and second person pronouns

un nen

nen‹t’›u žwen

zari

nena‹w›(u)nena‹r›unena‹b›unen‹t’›u etc.

w-is ulud-is d-olo la‹r›u

la‹w›u

b-is wit b-olo la‹b›uis etc. olo etc. la‹t’›u etc. wišw-ez w-el w-ela‹w›(u)d-ez d-el d-ela‹r›ub-ez was b-el b-ela‹b›uez etc. el etc. el‹t’›u etc. wež

≠‰u wa≠‰u la≠‰u žwa≠‰u

waχur

sections provide a more detailed description of verbal agreement and discussion ofthe factors involved.

. Agreement marking in simple dynamic verbs

Archi verbs can be divided into dynamic verbs and stative verbs. To simplify theexposition, we postpone discussion of stative verbs until 6.6. Dynamic verbs arefurther divided into simple and complex.The distinguishing feature of a complex verbis that it consists of an uninflected part9 plus an inflected simple dynamic verb, so thediscussion of agreement inflection involves the simple verbs only. Dynamic verbs havefour aspectual stems (perfective, imperfective, finalis, and potential), and a form of theimperative which also serves as a stem and is often irregular. The verbal stems can beused independently or serve as base for further inflectional forms such as participles,converbs, and various moods. The realization of perfective, imperfective, finalis, andimperative is often irregular and it is not possible to postulate rules to produce onestem from another. Thus, these four forms are the principal parts of the Archi verbal

9 Some complex verbs have a first part derived from the simple verb class; these can take gender–numberagreement.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

paradigm. They all take agreement inflection.10 Example 6 shows the principal partsin all four genders and two numbers.

(6) Simple dynamic verb aχas ‘lie down’: aspectual stems and the imperative, infixSet II

perfective imperfective finalis imperativesg pl sg pl sg pl sg pl

i a‹w›χua‹b›χu

w-a‹r›χa-rb-a‹r›χa-r

a‹w›χa-sa‹b›χa-s

w-aχab-aχa

ii a‹r›χu d-a‹r›χa-r a‹r›χa-s d-aχaiii a‹b›χu

aχub-a‹r›χa-r

a‹r›χa-ra‹b›χa-s

aχa-sb-aχa

aχaiv aχu a‹r›χa-r aχa-s aχa

For the verb aχas ‘lie down’ agreement is realized by prefixes and infixes (we discussother possibilities later). It is important to note that the agreement exponents arenot the only infixes employed in the paradigm: the imperfective here is realized bya combination of an infix ‹r› and a suffix -r (to be further discussed when we considerexample (9)). The presence of the imperfective infix implies the presence of the suffix,but not the other way round: the imperfective can be realized by the suffix only.

We can observe some regularities in the paradigm in (6): the perfective andfinalis have infixes, the imperfective and imperative have prefixes. We will returnto this regularity in 6.5.2. Until then, the discussion will focus on the perfectiveand imperfective stems. The four stems presented in (6) have an uneven frequencydistribution: in terms of types, the number of forms produced from the perfective andthe imperfective is higher than the number of forms produced from the finalis andthe imperative (there are numerous converbs and moods based on the perfectiveand the imperfective). In terms of token frequency, the perfective outnumbers theother stems quite dramatically: for 1,525 tokens found in Kibrik’s texts (Kibrik et al.2007), there are 1,037 forms of the perfective and its derivatives, and 138 forms of theimperfective and its derivatives (Chumakina 2011: 21–2). The perfective is also lesscomplex morphologically: it is expressed by the stem only, whereas the imperfective,besides a distinct stem, can have a phonologically distinct exponent (a suffix), veryoften two (suffix and infix) and sometimes three (reduplication, suffix, and infix).

The plural form of a verb is always syncretic with some form of the singular, so theplural forms will also be excluded from further discussion.When comparing differentverbs, we will present the III and IV gender singular forms as they show the position ofthe agreement marker most clearly. In this way we avoid the complications of gendersI and II: gender I singular can be realized as a labialization of the first consonant ratherthan prefix w-, and the gender II infix ‹r› is homonymous with the imperfective infix(see the discussion of (9)).

10 The potential stem is produced by adding the suffix -qi to the perfective stem.There are no irregular-ities in the formation, and the agreement marking is the same as it is in the perfective.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

In the Archi grammar (Kibrik 1977b) there are 163 simple dynamic verbs. Of these,144 have a morphological slot for agreement.

Of the 144 inflecting verbs, there are two which deserve mentioning, namelybat‰eš‰as ‘fulfil’ and bije≠‰as ‘begin’. Morphologically, they are not simple, but complexverbs: in the imperfective, their second gender singular forms are batderš‰ar andbijder≠‰ar respectively, i.e. the prefixal form of the gender II marker d- is used, andnot the infixal ‹r›. Synchronically, they are unanalysable, and were included in the listof simple verbs in Kibrik (1977b).We consider them complex verbs and do not includethem in the count. Our count is therefore different from Kibrik’s: 161 simple dynamicverbs, of which 142 have a morphological slot for agreement.

Therefore, we have a frequency-based expectation for the dynamic verbs to realizeagreement overtly. In this section, we discuss the non-inflecting verbs, and then weestablish the types of simple dynamic verbs according to the position of the agreementmarker, before addressing the factors which determine the type to which a verbbelongs.

The first logical step in predicting the form of agreement is to establish whetherthe verb agrees at all. There are nineteen non-inflecting simple verbs in Archi, andthe information on whether the verb agrees appears to be lexically specified. All thesimple dynamic non-agreeing verbs are listed in (7), where we provide the form of theperfective stem.

(7) Non-inflecting simple dynamic verbs (perfective stem)abc’u ‘hew’ bo ‘speak’ dorq’u ‘moan’ o≠‰u ‘be silent’babχ‰»u ‘swell’ boq’»o ‘return’ dub≠‰u ‘sew’ sesu ‘roast’ba»k≠ni ‘press’ dab≠u ‘unlock’ dubqu ‘destroy’ χ»eχ»ni ‘ferment’barhu ‘babysit’ da»šni ‘cloud up’ gobχ‰u ‘scratch’ χwet‰u ‘swear’bec‰’u ‘be able’ dat‰i ‘clear up’ ja‰hu ‘winnow’

There is no apparent semantic homogeneity in this group. As for formal regularities,they can only be formulated as expectations which have partial coverage. First, thereis an expectation for vowel–initial verbs to inflect: only one verb, o≠‰u ‘be silent’, outof nineteen non-inflecting verbs, is vowel–initial. Second, there is an expectation forverbs with initial b- to be non-inflecting: there are seven b-initial verbs in (7), andno b-initial simple inflecting verbs. In this list, there are six non-inflecting d-initialverbs, but for them no expectation can be formulated, as there are also four inflectingd-initial verbs. We might assume that the fact that b- initial verbs do not inflect isa consequence of b- being one of the gender–number markers. However, d- is also agender–numbermarker, andwefind both inflecting andnon-inflecting d-initial verbs:there are six non-inflecting ones in (7) and four inflecting d-initial verbs11.

11 It might appear that the non-inflecting verbs end in a high or mid-high vowel. However, this ispart of a more general pattern, irrespective of the inflecting/non-inflecting distinction. Inflecting verbs aredistributed as follows: twenty-five end in -i, seventy-two in -u, twenty in -e, fourteen in -o, and eleven in -a.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

The rest of the simple dynamic verbs have a morphological slot for agreement,which can be prefixal or infixal. Based on this, three types of morphological markingof verb agreement can be distinguished, and we present them in turn.

Prefixal: the agreement marker is a prefix in all forms:12

(8) acu ‘milk’perfective imperfective finalis imperativeiv.sg iii.sg iv.sg iii.sg iv.sg iiisg iv.sg iii.sgacu b-acu a‹r›ca-r b-a‹r›ca-r aca-s b-aca-s aca b-aca

This is the most simple type for two reasons: first, all the principal parts have the sameposition for the agreementmarker. Second, since this position is prefixal, aswe observein the forms for gender III, there is no conflict in the imperfective between the infixalimperfective marker ‹r› and the agreement marker.

Infixal: the agreement marker is an infix in all forms:

(9) caχu ‘throw’perfective imperfective finalis imperative

iv.sg iii.sg iv.sg iii.sg iv.sg iii.sg iv.sg iii.sgcaχu ca‹b›χu ca‹r›χa-r ca‹b›χa-r caχa-s ca‹b›χa-s caχa ca‹b›χa

This type is slightly more complex than the prefixal one: although all forms havethe same position for the agreement marker, this position is infixal, which createscompetitionwith the imperfective. Compare the gender IV singular form carχar in theimperfective with the gender III singular: cabχar. The agreement marker of gender IVis zero; this leaves the infixal position ‘vacant’, and this infixal position is taken by theimperfective exponent ‹r›. In contrast, in gender III singular, all we see is the agreementmarker ‹b›, which blocks the imperfective marker from appearing infixally. However,the imperfective still has an overt realization -r in the suffixal position.

Mixed: the agreement marker is an infix in the perfective and finalis, and a prefixin the imperfective and imperative:

(10) ak≠u ‘put through’

perfective imperfective finalis imperativeiv.sg iii.sg iv.sg iii.sg iv.sg iii.sg iv.sg iii.sgak≠u a‹b›k≠u a‹r›k≠a-r b-a‹r›k≠a-r ak≠a-s a‹b›k≠a-s ak≠a b-ak≠a

12 Here we show the forms of genders III and IV only to indicate the position of the agreement marker.There is no overt IV gender singular exponent. Recall that the potential stem is formed in a fully productiveway and so it is not included here.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

In one respect, this type is less complex than the infixal one in that there is nocompetition between markers in the imperfective. But it is more complex than bothof these types in that there is no single position for the agreement marker in theparadigm.

The mixed type demonstrates clearly that infixation in Archi is indeed ‘frivolous’,that is, it is not phonologically conditioned (Yu 2007: 41–2); see also Anderson (thisvolume) on the non-canonicity of infixation. The gender III form of the imperfectiveb-a‹r›k≠a-r shows that there is no phonological condition preventing gender–numberbeing marked by a prefix. In this form, the gender–number marker takes prefixalposition, the aspectual marker takes the infixal one. Similarly, the imperative b-ak≠ashows prefixal marking. However, in the perfective and finalis both positions are openfor the gender–numbermarker (compare (8) where perfective and finalis have prefixalagreements). Yet all verbs which belong to the mixed type choose the infixal positionin the perfective and finalis for the agreement exponent, in spite of the prefixal positionbeing phonologically available.

The distribution of verbs over these three types (prefixal, infixal, and mixed) ispartly determined by the phonology. We discuss the phonological factors in the nextsection.

.. Phonological factors determining the position of the agreement marker

Phonological shape can be a predictor for the verb’s type of agreement realization(prefixal, infixal, or mixed). We need to take the following into account:

• initial phoneme type (consonant or vowel)• number of syllables• place of stress• individual phonemes

The major division is into consonant–initial and vowel–initial verbs, and we discussthe other factors within this broad division. Verbs which are vowel–initial are allbisyllabic with the stress on the first syllable. Consonant–initial verbs display greatervariety in their phonological shape: they can be mono- or polysyllabic, with stress oneither the initial or the second syllable.

Vowel–initial verbs can belong to the prefixal or mixed type, consonant–initialverbs can be infixal or prefixal. In the following sections (6.5.1.1 through 6.5.1.5)we will discuss each shape in turn and define the type of agreement that it takes(prefixal, infixal or mixed). It is the consonant–initial verbs where the placement ofthe agreement marker is fully phonologically conditioned, so we will start with thediscussion of these.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

... Consonant–initial polysyllabic verbs with the stress on the first syllable: infixesIf the verb is consonant–initial, polysyllabic, and the stress falls on the first syllable,then gender–number agreement is realized by infixes both in the imperfective and theperfective. In the imperfective the agreement marker replaces the imperfective infixeverywhere except where agreement has no realization, as in gender IV singular, andthere the imperfective infix ‹r› surfaces:

(11) perfective imperfectiveiv.sg iii.sg iv.sg iii.sg

throw cáχu cá‹b›χu cá‹r›χa-r cá‹b›χa-rmust kwášu kwá‹b›šu kwá‹r›ša-r kwá‹b›ša-rsearch χ‰wá‰k’u χ‰wá‹b›k’u χ‰wá‹r›k’a-r χ‰wá‹b›k’a-r

Out of 142 inflecting simple dynamic verbs there are twenty-two verbs that meet thesephonological conditions, but only twenty of them realize agreement by an infix.Thereare two verbs which have prefixes:

(12) perfective imperfectiveiv.sg iii.sg iv.sg iii.sg

frown k≠’ók≠’u bo-k≠’ók≠’u k≠’ók≠’u-r bo-k≠’ók≠’u-rget lost q’»ák’a ba-q‰’»á-ba-k’a q’»é-k’e‹r›k’i-r be-q‰’»é-be-k’e‹r›k’i-r

The verb k≠’ók≠’u ‘frown.pfv.iv.sg’ is formed by derivational reduplication (as opposedto the inflectional reduplication used in the production of the imperfective), and thisis exceptional in the Archi verbal system. The verb q’»ák’a ‘get lost.pfv[iv.sg]’ lookslike a complex verb consisting of two parts, q’»á and k’a. Double exponence of gender–number and the usage of a prefix in the second position (as in dá-q‰’»á-da-k’a ‘getlost.pfv.ii.sg’) suggest the independence of the parts, but other characteristics arguein favour of considering this to be one word. First, it has a single stress. Second, if itwere a complex verb, then the fact that both parts take agreement inflection wouldindicate that both parts were verbal. If that were the case, they should both realizethe imperfective, but while the second part k’e‹r›k’i-r does this by familiar means(reduplication, vowel change, a suffix, and an infix), the first part only changes thevowel, which would be a unique way of forming the imperfective in Archi. And finally,the two parts have no separatemeanings.Thus the two verbs k≠’ók≠’u ‘frown’ and q’»ák’a‘get lost’ do not follow the pattern of the main group.

... Consonant–initial monosyllabic verbs: prefixes If a verb consists of one sylla-ble only, the agreement is realized by prefixes with an epenthetic vowel before the firstconsonant of the stem:

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

(13) perfective imperfectiveiv.sg iii.sg iv.sg iii.sg

melt c’o bo-c‰’ó13 c’a-r ba-c‰’á-rlick čo bo-čó ča-r ba-čá-rdie k’a ba-k’á k’a-r ba-k’á-rignite k’u bu-k’ú k’wa-r bu-k’á-rslaughter k≠’u bu-k≠’ú k≠’wa-r bu-k≠’á-rdivide q’»o bo-q‰’»ó q’»a-r ba-q‰’»á-rtouch s‰o bo-s‰ó s‰a-r ba-s‰á-rget up χ‰o bo-χ‰ó χ‰a-r ba-χ‰á-rcarry χ»o bo-χ»ó χ»a-r ba-χ»á-r

Note that except for the verb k’a ‘die’, all of these verbs have [o] or [u] in the perfectivestem.The consonants showmore variety, in terms of place andmanner of articulation:there is no restriction on the initial consonant. The imperfective is realized by vowelchange and the suffix -r.When the agreement prefix attaches, the stress remains on thestem, so the epenthetic vowel is pronounced as schwa, as there is no unstressed [a] or[o] in Archi. We follow the orthographic convention suggested by Kibrik et al. (1977:351) and spell the pretonic vowel with the same character as the stressed vowel. Theprefixal [u] in the third gender forms of ‘ignite’ and ‘slaughter’ is a reflex of a labializedconsonant (seen in k’wa-r and k≠’wa-r respectively).

An interesting variation of this phonological shape is the verbs which are mono-syllabic only in the perfective, and reduplicated in the imperfective. They, too, realizeagreement by prefixes:

(14) perfective imperfectiveiv.sg iii.sg iv.sg iii.sg

rot ša» ba»-šá» šé»‹r›ši-r be»-šé»‹r›ši-r14

win χa ba-χá χé‹r›χi-r be-χé‹r›χi-r

We should ask why there is no agreement infix in the imperfective, i.e. why it isbe»-šé»‹r›šir, not ∗šé»‹b›šir. If the rules were entirely phonological, then šé»‹r›ši-r in(14) should have had the infixal agreement marker, just as cá‹b›χa-r in (9). This isbecause it is a consonant–initial bisyllabic stem with stress on the first syllable, i.e. ithas the phonological properties which favour infixation. Instead of this, it maintains

13 We show the position of the stress only where it is relevant to the discussion.14 The vowel after b- in perfective and imperfective is the same vowel (some kind of schwa); the apparent

difference here is an artefact of the spelling convention.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

the morphological properties of the base lexeme. In fact, we do not find an instancewhere the agreement is infixal in the imperfective and prefixal in the perfectiveanywhere in the verbal system of Archi. The perfective is morphologically simplerthan the imperfective and therefore it is not surprising that more morphologicallycomplex (infixal) form for gender–number occurs in just the perfective, or in bothaspectual stems but never in the imperfective alone.Thus, in order to predict the type(prefixal, infixal, or mixed), it is enough to describe the phonological shape of theperfective. Morphology is therefore playing a role here because these shapes describethe perfective stems, but the result is valid for the whole paradigm, indicating thatthis is a property associated with the lexeme.15 This regularity is complete: out of 144inflected simple dynamic verbs, there are twenty-one consonant–initial verbs whichhave a monosyllabic perfective, and they are all of the prefixal type.

... Consonant–initial polysyllabic verbs with the stress on the second syllable: pre-fixes If the verb is consonant–initial and polysyllabic, and the stress falls on thesecond syllable, the agreement is realized by a prefix with an epenthetic vowel (15).Theverbs of this shape are underlyingly monosyllabic (in the perfective) so the realizationof agreement is unsurprisingly the same as that of the monosyllabic verbs discussedpreviously.

(15) PERFECTIVE IMPERFECTIVEIV.SG III.SG IV.SG III.SG

sift c’enné be-c’né c’émc‰’in be-c‰’émc‰’inpress č’e»nné» be»-č’né» č’a» n ba» -艒Ỡnreconcile q’oc’ó bo-q’c’ó q’ac’á-r ba-q’c’á-reat kunné bu-kné kwan bu-kánpull k≠enné be-k≠né k≠an ba-k≠ántickle ≠oró≠ni bo-≠ró≠ni ≠oró≠in bo-≠ró≠in

The forms with overt agreement (iii.sg here) look very much like the forms we saw in(13): there is an epenthetic vowel after the agreementmarker, and the form is bisyllabicwith the stress on the second syllable. The first vowel in the gender IV singular form,as in c’enné ‘sift’, is an epenthetic vowel, and the gender III singular be-c’né has the samesyllabic structure as the gender III singular of ‘lick’ (bo-čó).

15 Note, however, that the perfective only predicts the verb behaviour in terms of gender–numbermarking. The form of the imperfective itself cannot be predicted by the perfective.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

It is the shape of the perfective which predicts that the verb belongs to the prefixaltype: all the perfectives of this type are bisyllabic with the stress on the secondsyllable (underlyingly monosyllabic). The imperfective can be monosyllabic (as inč’a»n ‘press.ipfv[iv.sg]’), reduplicated (c’émc‰’in ‘sift.ipfv[iv.sg]’), or bisyllabic (q’ac’ár‘reconcile.ipfv[iv.sg]’). But all these verbs realize agreement in the same way, by aprefix. There are twenty-six such verbs.

... Vowel–initial verbs Vowel–initial (inflecting) verbs show a different picture.They number seventy-three and, except for the single verb as ‘do’, they are all bisyllabicwith the stress on the first syllable. The stress does not move, so we do not indicate ithere. Thirty-four vowel–initial verbs belong to the prefixal type, as in (16):

(16) perfective imperfectiveiv.sg iii.sg iv.sg iii.sg

soften a‰q’»u b-a‰q’»u a‹r›q’»u-r b-a‹r›q’»u-rmilk acu b-acu a‹r›ca-r b-a‹r›ca-r

see ak‰u b-ak‰u ak‰u-r b-ak‰u-rleave akdi b-akdi a‹r›k‰i-r b-a‹r›k‰i-rbite eq‰’u b-eq‰’u e‹r›q‰’u-r b-e‹r›q’u-r

pour eχu b-eχu e‹r›χu-r b-e‹r›χu-r

Thirty-nine vowel–initial verbs belong to the mixed type, that is, they have infixes inthe perfective (and finalis), and prefixes in the imperfective (and the imperative) asillustrated in (17):

(17) perfective imperfectiveiv.sg iii.sg iv.sg iii.sg

break aq»u a‹b›q»u a‹r›q»a-r b-arq»a-r

do aču a‹b›ču a‹r›ča-r b-arča-rbreak ak’u a‹b›k’u a‹r›k’u-r b-ark’u-rstay eχ‰u e‹b›χ‰u e‹r›χ‰u-r b-erχ‰u-r

rain eχdi e‹b›χdi e‹r›χi-r b-erχi-r

There seems to be no apparent difference in the phonological shape of verbs whichbelong to the different types: the syllabic structure and the stress placement are thesame, and both prefixal and mixed verbs have affricates, fortis consonants, pharyn-gealized and ejective consonants. Table 6.9 compares verbs of similar phonologicalshape which have different types of placement of the agreement marker.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

Table . Verbs of similar phonology, but different patterns of realizing agree-ment

gloss consonant typeperfective imperfective

iv.sg iii.sg iv.sg iii.sg

soilfortis

prefixal aχ‰u b-aχ‰u a‹r›χ‰u-r b-a‹r›χ‰u-rstay mixed eχ‰u e‹b›χ‰u e‹r›χ‰u-r b-e‹r›χ‰u-rbreak

pharyngealizedprefixal aq»u a‹b›q»u a‹r›q»a-r b-a‹r›q»a-r

switch off mixed aχ»u a‹r›χ»u a‹r›χ»u-r b-a‹r›χ»u-rbreak ejective mixed ak’u a‹b›k’u a‹r›k’u-r b-a‹r›k’u-rwake up affricate mixed ek≠u e‹b›k≠u e‹r›k≠u-r b-e‹r›k≠u-rsweep

ejective affricateprefixal ek≠’u b-ek≠’u e‹r›k≠’u-r b-e‹r›k≠’u-r

sober up mixed o»č’ni o»‹b›č’ni o»‹r›č’in b-o»‹r›č’in

A phonological regularity can be observed for vowel–initial verbs, but it concernsonly a minority of them: verbs which have [r] in the perfective stem in combinationwith some other consonant, are prefixal. Some examples:

(18) perfective imperfectiveiv.sg iii.sg iv.sg iii.sg

hide erq’w»ni b-erq’w»ni erq’w»in b-erq’w»inscoop erχ‰»u b-erχ‰»u erχ‰»u-r b-erχ‰»u-r

cool down o»rču b-o»rču o»rču-r b-o»rču-rdunk ors‰u b-ors‰u ors‰u-r b-ors‰u-r

There are twelve such verbs. For the rest of the vowel–initial verbs, the placement ofthe agreement marker is lexically specified.

... Summary The analysis of the verbs according to their phonological shapeand the number of instances is brought together in Figure 6.1. Recall that the phono-logical characteristics of the verb concern its perfective stem only, but the result (typemembership) is for the whole paradigm, hence there is a morphological factor in playhere as well as a phonological one.

Figure 6.1 should be read as follows: there are sixty-nine consonant–initial simpledynamic verbs out of which there are twenty-one monosyllabic and forty-eight poly-syllabic verbs.16 The polysyllabic verbs are divided according to the stress placement:twenty-two verbs have the stress on the first syllable and they are infixal. Twenty-sixpolysyllabic verbs have the stress on the second syllable and they are all prefixal. And

16 Recall that the vowel–initial verbs are all bisyllabic.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

infixal

prefixal

mixed

(22)

(39)

(20)

(2)

vowel-initial (73)

consonant-initial (69) polysyllabic (48)first-syllable stress (22)

second-syllable stress (26)

stem: [r+C(C)] (12)

monosyllabic (21)

stem: other (61)

Figure . Types of simple dynamic verbs according to the phonological shape of theirperfective stem (total 142)

so on.This accounts for 142 inflecting verbs.We can draw several conclusions. First, inabsolute terms, there is a tendency for Archi simple dynamic verbs to choose prefixalmarking: eighty-three in total out of 142. (To get to this number, one needs to add allnumbers above the arrows pointing to the prefixal type: 2+26+21+ 12+22.) Second,for just over a half of the verbs (eighty-one out of 142) the placement of the agreementmarker is regulated by phonological factors (all arrows except the lower two, shownin bold, define the phonological shape of the verb which predicts the realization of theagreement); these phonological factors predict prefixes more frequently than infixes(sixty-one against twenty). And third, there are sixty-one verbs remaining for whichthe placement of the marker is lexical information (two bold arrows). Of these sixty-one verbs, thirty-nine are maximally complex: besides requiring lexical specificationof the realization of agreementmarkers, they show a split in the paradigm (some stemstake infixes, some take prefixes), and this is turn indicates that they show frivolousinfixation. In this respect, Archi is more complex than some other Daghestanianlanguages. For instance, inTsez phonology is the decisive factor, and only vowel–initialverbs agree (Polinsky and Potsdam 2001: 586). Figure 6.1 shows clearly the status ofthe phonological factors: on the left we have these factors and the number of verbsto which they apply; on the right we have decision points with no factor attached,indicating that we have not been able to find an operative phonological factor here.

.. Morphological factors

So far, we have shown the formation and the agreement position of two stems only,perfective and imperfective. We should consider the rest of the paradigm. As we

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Gender–number marking in Archi: Small is complex

noted earlier, the only other two forms which we need to consider are the finalisand imperative, given that the rest of the very large paradigm can be predicted fromthem. For the placement of the agreement marker, we need to discuss the mixed typeonly. For this, there are the following paradigm regularities: in the finalis, the gender–number markers are placed as they are in the perfective, while the imperative has thesame gender–number marking as the imperfective17. This is shown in (19):

(19) gloss PERFECTIVE IMPERFECTIVEIV. SG III. SG IV. SG III. SG

‘die out’ (of flame)‘wake up’

FINALIS IMPERATIVEIV. SG III. SG IV. SG III. SG

aχ u a‹b›χ u a‹r›χ u-r b-a‹r›χ u-rek u e‹b›k u e‹r›k u-r b-e‹r›k u-r

aχ a-s a‹b›χ a-s aχ a b-aχ aek a-s e‹b›k a-s ek a b-ek a

However, this paradigmatic regularity holds only if the phonological shape of thefinalis stem allows it. For four verbs of themixed type this is not the case: their finalis isirregular and has a phonological shape that requires prefixes (rather than the expectedinfixes):

(20) PERFECTIVE IMPERFECTIVE FINALIS IMPERATIVEgloss IV.SG III.SG IV.SG III.SG IV.SG III.SG IV.SG III.SGbecome édi é‹b›di ke-r be-ké-r ke-s be-ké-s ká ba-kágo óq»a ó‹b›q»a ó‹r›q»i-r b-ó‹r›q»i-r q»e-s be-q»é-s óq»a b-óq»atake away óχ‰a ó‹b›χ‰a ó‹r›χ‰i-r b-ó‹r›χ‰i-r χ‰e-s be-χ‰é-s χ‰éχ‰a be-χ‰éχ‰atake along óka ó‹b›ka ó‹r›ki-r b-ó‹r›ki-r kará-s ba-krá-s karáka ba-kráka

In the first three verbs, édi ‘become’, óq»a ‘go’, and óχ‰a ‘take away’, the finalis stem isconsonant–initial and monosyllabic. As we saw in the discussion of the perfectives,forms shaped this way realize agreement by prefixes with an epenthetic vowel, andthe finalis here follows this rule. The finalis stem of the fourth verb, óka ‘take along’, isbisyllabic with the stress on the second syllable, and such stems, as we saw in 6.5.1.3,also take prefixes with an epenthetic vowel.

Thus the morphological factor regulating the shape of the paradigm holds only ifthe realization of gender–number does not violate the phonological rules.

17 This regularity applies trivially across the rest of the system, since prefixal verbs have prefixesthroughout and infixal verbs have infixes throughout.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Marina Chumakina and Greville G. Corbett

.. Conclusions for simple dynamic verbs

We have exemplified the phonological factors that have an impact on gender–numbermarking in simple dynamic verbs. We have also demonstrated that the phonologicalshape of the perfective determines the behaviour of the rest of paradigm. There isone mixed type possible, and this rests on an implicational relation of stems in theparadigm, which shows that morphological factors have a role. A point of generalinterest is that the Archi system does indeed have frivolous infixing, that is, infixingwhich is not determined by phonological factors.This is clear from the instanceswhereinfixal agreement occurs even though a viable prefixal slot is available. We showedthat there are verbs (sixty in total) where phonological factors allow both infixes andprefixes. These are vowel–initial verbs (there are seventy-five of them altogether, butfifteen of them have [r] in the stem, and infixes are not possible in this environment).Out of sixty verbs which, in principle, could belong to the prefixal or to the mixedtype, forty belong to the mixed type, that is, they have infixes in the perfective andfinalis where there is no apparent reason for them not to have prefixal marking. Suchverbs also demonstrate that phonological andmorphological factors are not the wholestory. For a significant number of verbs, the placement of the agreement marker islexically specified, and the majority of such verbs belong to the more complex mixedtype, which involves a split in the paradigm with frivolous infixation in a part of it.Our discussion in this section has covered the dynamic verbs. We now turn to thecontrasting situation found in the stative verbs.

. Stative verbs

Themost salient and clearest characteristic of stative verbs is that they lack themultiplestems of the dynamic verbs. In fact, they have only one stem, the imperfective, whichcan be used as an independent predicate:

(21) ja-rthis-ii.sg

laha-s‰-ugirl(ii).sg.obl-dat-and

hanwhat(iv)[sg.abs]

sini?know

‘What then does this girl know?’18

Stative verbs also have all the usual forms derived from the imperfective stem, suchas participles, converbs, and adverbial nouns, for example: sin-ši ‘knowing’ (converb),sini-t‰u-t ‘known’ (participle), sini-kul ‘knowledge’ (adverbial noun). The dynamic–stative distinction corresponds largely to a semantic distinction. Stative verbs denotea state rather than an action, for example: do‰»z ‘be big’, aχ ‘be far’, hiba ‘be good’, ja»t’an

18 The example is sentence from text of the online collection of Archi texts (Kibrik et al. ). In context the force is: ‘How would she know she had been lied to?’ The experiencer stands in the dative and the perceived entity in the absolutive.


‘be red’. However, sometimes the semantic difference is less clear; compare the stative verbs k≠’an ‘love, want’, sini ‘know, know how’, and dynamic arhas ‘think, worry’.

We might expect that stative verbs would take agreement morphology like dynamic verbs, to the extent that is possible given their single stem. However, unlike dynamic verbs, stative verbs are mostly non-inflecting: out of 190 simple stative verbs in the current version of the Archi dictionary (Chumakina et al. 2007) only seven inflect. This is the full list of inflecting stative verbs:

(22) gloss         iv.sg      iii.sg
     ache, hurt    ác‰’ar     b-ác‰’ar
     be enough     aχ»        b-aχ»
     be tasty      ic’        b-ic’
     be hungry     íq‰’»a     b-íq‰’»a
     be heavy      iq’w»      b-iq’w»
     be wide       q’wa       b-uq’a
     be better     χáli       b-aχáli

For agreement inflection, stative verbs use prefixes only. In all instances but one this is in accordance with the rules for dynamic verbs. That is, although we do not find as many stative verbs realizing agreement as we would expect if they followed the rules for dynamic verbs, when they do show agreement morphology, it is normally according to the same rules as for dynamic verbs. Five out of seven verbs are vowel-initial and therefore allow prefixes, though in principle they could have infixes as well. The verb q’wa ‘be wide’ is consonant-initial and monosyllabic, and takes the agreement prefix with an epenthetic vowel (realized as [u] due to the root consonant labialization) just like dynamic verbs do.

The verb χáli ‘be better’ behaves differently. It is a bisyllabic verb with initial stress. If it were a dynamic verb, we would expect an infix here, as in cáχas ‘throw’, whose III gender singular is cá‹b›χas. However, χáli takes a prefix.

The agreement behaviour of stative verbs, therefore, confirms their separate status. The fact that a verb belongs to the class of statives, which is morphologically determined by the fact of having only an imperfective stem, provides a default prediction that it will not agree. For seven verbs the fact that they do agree is lexically specified information (compare the dynamic verbs where 142 out of 161 agree, and for most of the non-agreeing dynamic verbs this is lexical information). When stative verbs do show agreement, it is generally in line with the factors operating for dynamic verbs. However, just one verb takes a prefix where an infix would be expected. The effect is that those stative verbs which agree (the minority) all take prefixes.


. Conclusions

We have focused on a small part of the extensive paradigms of Archi, the realization of gender–number agreement. We have shown that the behaviour of a given lexical item can be predicted in part, on the basis of information as to its part of speech: thus, for instance, most adverbs do not agree, but those that do take infixes of Set I and so their syncretisms follow Pattern A. We saw the important role of phonological factors, particularly in our discussion of verbs. These factors had to be complemented by morphological predictions: within the verbs there are opposing predictions for dynamic verbs (by default these agree) and stative verbs (by default these do not). Moreover, there is a hierarchical relation of stems, in that the perfective determines the type of the paradigm, and the mixed type of verb (with prefixes and infixes) involves just two pairings of stems. However, even allowing for all of these different and overlapping predictive factors, our analysis allows for a significant residue. It is important to recall that many verbs belong to a complex type, with frivolous infixation in a part of the paradigm. Thus within this small area of the system, the gender–number paradigm, we have found considerable complexity, since for all the different factors we demonstrated, we still had to appeal to lexical specification for a significant proportion of the lexical items.

Acknowledgements

The support of the AHRC (grant AH/I027193/1 From competing theories to fieldwork: the challenge of an extreme agreement system) and of the ERC (grant ERC-2008-AdG-230268 MORPHOLOGY) is gratefully acknowledged. Versions of the chapter were presented at the Conference on Morphological Complexity in London (January 2012) and at the 15th International Morphology Meeting in Vienna (February 2012); we thank both audiences for their helpful comments. We are also grateful to John Harris and Erich Round for useful advice and comments, and to Lisa Mack for help in preparing the chapter.


Part III

Measuring Complexity


Contrasting modes of representation for inflectional systems: Some implications for computing morphological complexity

GREGORY STUMP AND RAPHAEL A. FINKEL

7.1 Introduction

Recent research has drawn attention to the possibility of measuring an inflection-class system’s complexity by means of computations based on a formal representation of that system (see, for example, Moscoso del Prado Martín et al. 2004, Ackerman et al. 2009, Finkel and Stump 2009, Milin et al. 2009). Proposed computations include both information-theoretic measures (the conditional entropy of cells and of paradigms) and set-theoretic measures (number of principal parts, cell predictiveness, the predictability of a cell’s word form and of a lexeme’s inflection-class membership). Some of this research has neglected to take account of how critically these computations depend on the manner in which inflection-class systems are represented. The same system may be represented in different ways, each giving rise to a different set of results.

In our work, we represent a language’s inflectional system as a plat1: a table with morphosyntactic property sets (MPSs) on the horizontal axis, inflection classes (ICs) on the vertical axis, and the exponence of a particular property set in a particular inflection class in the appropriate column and row (Stump and Finkel 2013). As we show, the construction of a plat for a given language entails a series of choices: What are the MPSs for which lexemes inflect in that language? What are its ICs? And, perhaps most importantly, how are its inflectional exponents to be represented?

1 In American English, a plat is ‘a plan, map, or chart of a piece of land with actual or proposed features (as lots)’ [Merriam-Webster online].
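For readers who wish to experiment with such measures, the following is a minimal sketch of one possible encoding of a plat in Python. It is our own illustration, not the authors’ Principal-Parts Analyser; the MPS labels follow the chapter, but the exponence strings are simplified ASCII stand-ins rather than the transcriptions used in the tables below.

```python
# A plat encoded as a mapping from inflection classes (rows) to their
# exponences, one per morphosyntactic property set (column).
# Illustrative sketch only; exponence strings are ASCII placeholders.

MPSS = ["inf", "3sgPresInd", "past", "presPple", "pastPple"]

PLAT = {
    # IC name: exponences in the order given by MPSS
    "cast": ["aest", "aest-z", "aest",   "aest-ing", "aest"],
    "last": ["aest", "aest-z", "aest-d", "aest-ing", "aest-d"],
    "pass": ["aes",  "aes-z",  "aes-d",  "aes-ing",  "aes-d"],
}

def exponence(ic, mps):
    """Return the exponence realizing the given MPS in the given IC."""
    return PLAT[ic][MPSS.index(mps)]

print(exponence("pass", "past"))   # -> 'aes-d'
```

The further sketches below (distillations, conditional entropy, principal parts, predictability) assume this same toy encoding.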


By way of illustration, consider the forty English verbs in (1). A plat representing the inflectional differences among these verbs minimally includes the MPSs in (2).2 We assume a variety of English in which the verbs in (1) have the forms represented orthographically in Table 7.1; but because the characteristics of English orthography introduce issues that are orthogonal to our present concerns, we focus our attention on phonological and morphophonological representations of a verb’s exponence. The inflectional differences among the verbs in (1) are always situated in or after the stem’s syllable nucleus; for this reason, the inflectional exponents listed in our plats consist of a verb stem’s rime and its inflectional suffix (if there is one). In this mode of representation, each of the verbs in (1) inflects differently; we accordingly use each verb as the name of the inflection class to which it belongs. Thus, the plats that we discuss here have five columns (one for each of the property sets in (2)) and forty rows (one for each verb in (1)).

(1)  band    build    do      flee    hide    load    pass    ride    send    stand
     bite    buy      draw    fly     last    lose    pay     say     shop    sting
     bring   cast     feel    hang    lean    make    peel    see     sing    teach
     budge   choose   fight   have    light   mean    read    seek    slide   write

(2)  {infinitive / subjunctive / default present}   e.g. to sing / that she sing / we sing
     {3sg present indicative}                       he sings
     {past / irrealis}                              she sang / if he sang tomorrow
     {present participle}                           singing
     {past participle}                              sung

We consider two possible plats for the verbs in (1). In the first of these, given in Table 7.2, we represent a verb form’s exponence (its stem’s rime and—if one is present—its inflectional suffix) in phonemic transcription. (Throughout, transcriptions depict a standard American English pronunciation.) In the second, given in Table 7.3, we use morphophonemic transcription augmented by morph boundaries and indices that identify aspects of a word’s morphological structure; these include the indices in Table 7.4.

It is clear that these plats represent different things. The plat in Table 7.2 represents differences among the forms in Table 7.1 that are directly accessible to auditory perception (specifically, those differences that are phonemically contrastive); we therefore call the first plat a hearer-oriented (H-O) plat. In this plat, the past participles of cast and pass are alike (both have the exponence /æst/), as are those of mean and send (with exponence /εnt/). The plat in Table 7.3, by contrast, represents a fluent English speaker’s

2 If be were among these verbs, we would have to assume a somewhat more elaborate system of morphosyntactic contrasts, since be has a first-person singular present indicative form (am); its infinitive/subjunctive form be is distinct from its default present form are; it exhibits a special singular past indicative form (was); and it exhibits a number of negatively inflected forms (aren’t, ain’t, isn’t, wasn’t, weren’t).


Table 7.1 Orthographic forms of the verbs in (1)

Lexeme   {inf}    {3sgPresInd}   {past}    {presPple}   {pastPple}
band     band     bands          banded    banding      banded
bite     bite     bites          bit       biting       bitten
bring    bring    brings         brought   bringing     brought
budge    budge    budges         budged    budging      budged
build    build    builds         built     building     built
buy      buy      buys           bought    buying       bought
cast     cast     casts          cast      casting      cast
choose   choose   chooses        chose     choosing     chosen
do       do       does           did       doing        done
draw     draw     draws          drew      drawing      drawn
feel     feel     feels          felt      feeling      felt
fight    fight    fights         fought    fighting     fought
flee     flee     flees          fled      fleeing      fled
fly      fly      flies          flew      flying       flown
hang     hang     hangs          hung      hanging      hung
have     have     has            had       having       had
hide     hide     hides          hid       hiding       hidden
last     last     lasts          lasted    lasting      lasted
lean     lean     leans          leaned    leaning      leaned
light    light    lights         lit       lighting     lit
load     load     loads          loaded    loading      loaded
lose     lose     loses          lost      losing       lost
make     make     makes          made      making       made
mean     mean     means          meant     meaning      meant
pass     pass     passes         passed    passing      passed
pay      pay      pays           paid      paying       paid
peel     peel     peels          peeled    peeling      peeled
read     read     reads          read      reading      read
ride     ride     rides          rode      riding       ridden
say      say      says           said      saying       said
see      see      sees           saw       seeing       seen
seek     seek     seeks          sought    seeking      sought
send     send     sends          sent      sending      sent
shop     shop     shops          shopped   shopping     shopped
sing     sing     sings          sang      singing      sung
slide    slide    slides         slid      sliding      slid
stand    stand    stands         stood     standing     stood
sting    sting    stings         stung     stinging     stung
teach    teach    teaches        taught    teaching     taught
write    write    writes         wrote     writing      written


Table 7.2 Hearer-oriented plat for the verbs in (1)

Lexeme   {inf}   {3sgPresInd}   {past}   {presPple}   {pastPple}
band     ænd   ændz   ændәd   ændëŋ   ændәd
bite     aët   aëts   ët   aëtëŋ   ëtn
bring    ëŋ   ëŋz   :t   ëŋëŋ   :t
budge    RM   RMәz   RMd   RMëŋ   RMd
build    ëld   ëldz   ëlt   ëldëŋ   ëlt
buy      aë   aëz   :t   aëëŋ   :t
cast     æst   æsts   æst   æstëŋ   æst
choose   uz   uzәz   oFz   uzëŋ   oFzn
do       u   Rz   ëd   uëŋ   Rn
draw     :   :z   u   :ëŋ   :n
feel     il   ilz   εlt   ilëŋ   εlt
fight    aët   aëts   :t   aëtëŋ   :t
flee     i   iz   Fd   iëŋ   εd
fly      aë   aëz   u   aëëŋ   oFn
hang     æŋ   æŋz   Rŋ   æŋëŋ   Rŋ
have     æv   æz   æd   ævëŋ   æd
hide     aëd   aëdz   ëd   aëdëŋ   ëdn
last     æst   æsts   æstәd   æstëŋ   æstәd
lean     in   inz   ind   inëŋ   ind
light    aët   aëts   ët   aëtëŋ   ët
load     oFd   oFdz   oFdәd   oFdëŋ   oFdәd
lose     uz   uzәz   :st   uzëŋ   :st
make     eëk   eëks   eëd   eëkëŋ   eëd
mean     in   inz   εnt   inëŋ   εnt
pass     æs   æsәz   æst   æsëŋ   æst
pay      eë   eëz   eëd   eëëŋ   eëd
peel     il   ilz   ild   ilëŋ   ild
read     id   idz   εd   idëŋ   εd
ride     aëd   aëdz   oFd   aëdëŋ   ëdn
say      eë   εz   :d   eëŋŋ   εd
see      i   iz   :   iëŋ   in
seek     ik   iks   :t   ikëŋ   :t
send     εnd   εndz   εnt   εndëŋ   εnt
shop     &p   &ps   &pt   &pëŋ   &pt
sing     ëŋ   ëŋz   æŋ   ëŋëŋ   Rŋ
slide    aëd   aëdz   ëd   aëdëŋ   ëd
stand    ænd   ændz   Fd   ændëŋ   Fd
sting    ëŋ   ëŋz   Rŋ   ëŋëŋ   Rŋ
teach    i<   i<z   :t   i<ëŋ   :t
write    aët   aëts   oFt   aëtëŋ   ëtn


Table 7.3 Speaker-oriented plat for the verbs in (1)

Lexeme   {inf}   {3sgPresInd}   {past}   {presPple}   {pastPple}
band     ænd   ænd-z   ænd-d   ænd-ëŋ   ænd-d
bite     aët   aët-z   lax(ët)   aët-ëŋ   lax(ët)-n
bring    ëŋ   ëŋ-z   rime(:t)   ëŋ-ëŋ   rime(:t)
budge    RM   RM-z   RM-d   RM-ëŋ   RM-d
build    ëld   ëld-z   ëld-t   ëld-ëŋ   ëld-t
buy      aë   aë-z   rime(:t)   aë-ëŋ   rime(:t)
cast     æst   æst-z   æst   æst-ëŋ   æst
choose   uz   uz-z   ablaut(oFz)   uz-ëŋ   ablaut(oFz)-n
do       u   ablaut(R)-z   ablaut(ë)-d   u-ëŋ   ablaut(R)-n
draw     :   :-z   ablaut(u)   :-ëŋ   :-n
feel     il   il-z   lax(εl)-t   il-ëŋ   lax(εl)-t
fight    aët   aët-z   rime(:t)   aët-ëŋ   rime(:t)
flee     i   i-z   lax(ε)-d   i-ëŋ   lax(ε)-d
fly      aë   aë-z   ablaut(u)   aë-ëŋ   ablaut(oF)-n
hang     æŋ   æŋ-z   ablaut(Rŋ)   æŋ-ëŋ   ablaut(Rŋ)
have     æv   subtractCoda(æ)-z   subtractCoda(æ)-d   æv-ëŋ   subtractCoda(æ)-d
hide     aëd   aëd-z   lax(ëd)   aëd-ëŋ   lax(ëd)-n
last     æst   æst-z   æst-d   æst-ëŋ   æst-d
lean     in   in-z   in-d   in-ëŋ   in-d
light    aët   aët-z   lax(ët)   aët-ëŋ   lax(ët)
load     oFd   oFd-z   oFd-d   oFd-ëŋ   oFd-d
lose     uz   uz-z   lax(:z)-t   uz-ëŋ   lax(εz)-t
make     eëk   eëk-z   subtractCoda(eë)-d   eëk-ëŋ   subtractCoda(eë)-d
mean     in   in-z   lax(εn)-t   in-ëŋ   lax(εn)-t
pass     æs   æs-z   æs-d   æs-ëŋ   æs-d
pay      eë   eë-z   eë-d   eë-ëŋ   eë-d
peel     il   il-z   il-d   il-ëŋ   il-d
read     id   id-z   lax(εd)   id-ëŋ   lax(ëd)
ride     aëd   aëd-z   ablaut(oFd)   aëd-ëŋ   ablaut(ëd)-n
say      eë   lax(ε)-z   lax(ε)-d   eë-ëŋ   lax(ε)-d
see      i   i-z   ablaut(:)   i-ëŋ   i-n
seek     ik   ik-z   rime(:t)   ik-ëŋ   rime(:t)
send     εnd   εnd-z   εnd-t   εnd-ëŋ   εnd-t
shop     &p   &p-z   &p-d   &p-ëŋ   &p-d
sing     ëŋ   ëŋ-z   ablaut(æŋ)   ëŋ-ëŋ   ablaut(Rŋ)
slide    aëd   aëd-z   lax(ëd)   aëd-ëŋ   lax(ëd)
stand    ænd   ænd-z   rime(Fd)   ænd-ëŋ   rime(Fd)
sting    ëŋ   ëŋ-z   ablaut(Rŋ)   ëŋ-ëŋ   ablaut(Rŋ)
teach    i<   i<-z   rime(:t)   i<-ëŋ   rime(:t)
write    aët   aët-z   ablaut(oFt)   aët-ëŋ   ablaut(ët)-n


Table 7.4 Morphological indices employed in the speaker-oriented plat

Notation          Significance
ablaut(x)         the vowel in x is an ablaut alternant of the stem vowel
lax(x)            the vowel in x is the stem vowel’s lax morphophonological alternant
rime(x)           x stands in place of the stem’s rime
subtractCoda(x)   the stem’s coda is absent from x

awareness of each word form’s morphology; we therefore call it a speaker-oriented (S-O) plat. In this plat, neither the past participles of cast and pass nor those of mean and send are alike in their morphology: cast has the past participle exponence /æst/, pass has /æs-d/, mean has lax(εn)-t, and send has /εnd-t/.

A clarification is in order here. Our apparently dichotomous terminology (‘hearer-oriented’ vs ‘speaker-oriented’) might be taken to suggest that only two sorts of plats are possible for a given inflection-class system. Such is certainly not the case. The H-O plat in Table 7.2 and the S-O plat in Table 7.3 belong to a continuum of imaginable plats for the fragment of English verb morphology in Table 7.1. For instance, one can construct a plat in which morph boundaries are represented but in which no information about nonconcatenative morphology is represented; in such a plat, the exponence of ridden might be represented as /ëd-n/ rather than as /ëdn/ (Table 7.2), or as ‘ablaut(ëd)-n’ (Table 7.3). Such a plat would be more speaker-oriented than the plat in Table 7.2, but more hearer-oriented than the plat in Table 7.3. Our S-O plat is rather deeply speaker-oriented, since it encodes both syntagmatic morphological information (morph boundaries) and paradigmatic morphological information (nonconcatenative deviations from a lexeme’s default stem form); exactly how much morphological information is included in a plat is naturally a matter of choice. The same is true of phonological information; in our H-O plat, we have not, for example, represented stress or syllable structure, but we could have, e.g. [trochee [σ. . .æn] [σdәd]]. Our two plats represent affixal alternations and stem alternations together, but one could also profitably construct a plat based purely on the affixal differences among the forms in Table 7.1 and compare its properties with those of a different plat based purely on the patterns of stem alternation exhibited by these forms. In short, we do not mean to imply that we regard the inflection-class system embodied by the verbs in (1) as being reducible to exactly the two plats in Tables 7.2 and 7.3; nor are we proffering either of these two plats as ‘the right analysis’ of the English inflection-class system. Rather, we are using the differences between these two plats to make a methodological point relevant to any analysis of this system. Thus, our H-O and S-O plats are only two of the possible plats that we might have brought forward for discussion. Here, we are less interested in motivating the choice of any particular approach to plat construction than in showing how dramatically such choices affect the measurement


Table 7.5 Two hypothetical plats

        Plat A              Plat B
      ρ   σ   τ           ρ   σ   τ
I     a   b   c     I     a   a   a
II    d   e   f     II    b   a   a
III   g   h   i     III   a   b   a
IV    j   k   l     IV    a   a   b
V     m   n   o     V     b   b   a
VI    p   q   r     VI    b   a   b
VII   s   t   u     VII   a   b   b
VIII  v   w   x     VIII  b   b   b

of morphological complexity. Next, we review a number of information-theoretic and set-theoretic measures that have been proposed for morphological complexity, and as we show, they yield different results when applied to the contrasting representations of the English conjugational system in Tables 7.2 and 7.3.

We regard the complexity of an inflectional system as the extent to which it inhibits motivated inferences about the word forms in a lexeme’s realized paradigm.3 Thus, consider the two hypothetical plats in Table 7.5. Plat A is maximally simple: every one of the exponences a through x in this plat reveals the inflection-class membership of a word form bearing that exponence. By contrast, Plat B is maximally complex: in order to know a word form’s inflection-class membership, one must simply know its ρ-form, its σ-form, and its τ-form; none of these can be motivatedly inferred from the others.4

Motivated inferences about the word forms in a lexeme’s realized paradigm can be inhibited in various ways:

3 A lexeme L’s realized paradigm is a set of cells, each the pairing 〈w, σ〉 of a word form w realizing L with the morphosyntactic property set σ realized by w. The paradigm-based conception of complexity assumed here is not, of course, the only kind of complexity that a language might be claimed to exhibit. Anderson (this volume) argues that allomorphy adds to a morphological system’s complexity. In that sense, one might regard the system represented by Plat A in Table 7.5 as being more complex than the system represented by Plat B: in Plat A, each of the three MPSs ρ, σ, τ has eight allomorphs, while in Plat B, each has only two allomorphs. Thus, an inflectional system whose paradigms are complicated by a poverty of implicative relations among their cells may exhibit relatively little allomorphy, and a system complicated by extensive allomorphy may exhibit paradigms whose cells conform to an orderly and extensive network of implicative relations. The two kinds of complexity cannot be equated; one is complexity at the level of paradigms, the other, complexity at the level of individual paradigm cells.

4 To cite a non-hypothetical example, the inference of the present participial forms moving and proving from the infinitive forms move and prove is a motivated inference in English; by contrast, the inference of the past participle forms proven and *moven from these infinitive forms (alone) is not motivated. Our conception of an inflectional system’s complexity (as the extent to which the system inhibits motivated inferences about the realization of paradigm cells) recalls the typology of pronominal systems proposed by Donohue (this volume: .), in which a maximally transparent system affords many simple inferences of content from form and a maximally opaque system affords few such inferences.


• The word form in cell c of a lexeme’s realized paradigm P may be unpredictable or only predictable from a combination of two or more predictors. In such cases, c exhibits high conditional entropy (a property that we discuss in 7.3) and low cell predictability (7.6) and lowers P’s inflection-class predictability (7.5); in addition, P requires more than one dynamic principal part (7.4).

• Conversely, a cell c’s word form may be unpredictive or only predictive in combination with two or more other predictors. In such cases, c exhibits low cell predictiveness (7.7).

• In a given realized paradigm P, the set of cells predicting one cell’s word form may not suffice to predict other cells’ word forms. In such cases, P requires a large static principal-part set (7.4).

• Only certain subsets of the set of cells constituting a lexeme’s realized paradigm P may be viable as predictors for P.5 In such cases, the density of P’s optimal principal-part sets is low (7.4).

Because of these diverse sources of complexity, the full scope of an inflectional system’s complexity can only be revealed by employing a variety of different measurements; no single measurement gives a complete picture.

In the following sections, we apply several measures to the H-O plat and the S-O plat. As we show, the H-O plat is more complex than the S-O plat by these measures. Because the S-O representation includes grammatical information that the H-O representation lacks (7.2), it yields slightly lower conditional entropy measures (7.3), fewer principal parts (7.4), and higher measures of inflection-class predictability (7.5), cell predictability (7.6), and cell predictiveness (7.7). This result does not imply that S-O representations are ‘better’. But it does show that if one gauges a language’s morphological complexity by the proposed information-theoretic and set-theoretic measures, then one must be precise about what one is measuring: the extent to which a lexeme’s inflection-class membership is deducible from its word forms depends on whether the differences among these word forms are seen as purely phonological or as additionally including differences in morphological structure and membership in particular grammatical classes (or other possible factors as well).

The statistics employed in the following discussion have been generated from the plats in Tables 7.2 and 7.3 by means of the Principal-Parts Analyser, a computer program that performs various calculations based on an input plat. This software is freely accessible at <http://www.cs.uky.edu/∼raphael/linguistics/analyze.html>.

5 Where L is a lexeme with realized paradigm P and c is a cell in P, a viable set of predictors for c is a set of cells in P whose word forms uniquely determine that of c; a viable set of predictors for P is a set of cells in P whose word forms uniquely determine those of all of the remaining cells in P, i.e. a set of cells in P whose word forms uniquely determine L’s inflection-class membership.


7.2 The contents of the two plats

The contents of the two plats are alike in two ways. First, they both represent the forty verbs in (1) as belonging to forty distinct ICs;6 that is, no two rows are alike in either plat.

Second, they group MPSs in the same way. Given two MPSs σ, τ, we say that σ and τ belong to the same distillation if and only if the exponents of σ across all ICs are isomorphic to those of τ; otherwise, σ and τ belong to distinct distillations. Thus, consider the hypothetical system of ICs in Table 7.6, in which ρ, σ, τ, υ, and ϕ represent MPSs; I, II, III, IV, V, and VI represent ICs; and a through i represent exponences. In this plat, the exponences of property sets τ and υ are identical; moreover, the exponences of ϕ are isomorphic to those of sets τ and υ, in the sense that when ϕ has exponence h in a given IC, τ and υ both have f, and when ϕ has exponence i in a given IC, τ and υ both have g. In the same way, the property sets {inf} and {presPple} belong to the same distillation in both the H-O and the S-O plat; that is, each plat has four distinct distillations: that of {inf} (which also includes {presPple}) and those of {3sgPresInd}, {past}, and {pastPple}. In calculating a plat’s viable principal-part sets, its predictabilities, its predictiveness, or its entropy measures, it is sensible to base these calculations on distillations rather than on morphosyntactic property sets per se, since two morphosyntactic property sets belonging to the same distillation are always equally informative. Our practice is to refer to a distillation by one of its members, typically its first-mentioned member in the plat under scrutiny.

Notwithstanding the two similarities just noted, the S-O and H-O plats also present a first statistical difference: the S-O plat has 115 distinct exponences, while the H-O plat has only 103. This difference is reasonable: although the past-tense forms cast and passed are, from a purely phonemic perspective, alike in their exponence (both have

Table 7.6 Plat representing a hypothetical system of ICs

      ρ   σ   τ   υ   ϕ
I     a   c   f   f   h
II    a   c   g   g   i
III   a   d   f   f   h
IV    b   d   g   g   i
V     b   e   f   f   h
VI    b   e   g   g   i

6 In this respect, the plats are more fine-grained than the traditional classification of English verbs: traditionally, band and load are both simply weak verbs, but in these two plats, they belong to distinct conjugations because the rime of band differs from that of load.


Table 7.7 Inflection classes, exponences, and distillations in the two plats

                      H-O                   S-O
Identical ICs         0                     0
Distinct exponences   103                   115
Distillations         4: • {inf}            4: • {inf}
                         • {3sgPresInd}        • {3sgPresInd}
                         • {past}              • {past}
                         • {pastPple}          • {pastPple}

the exponence /æst/), they are different from a morphological point of view: passed, unlike cast, has /æs-d/ as its exponence in the S-O plat. These similarities and differences between the two plats are summarized in Table 7.7.
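Viewed computationally, grouping MPSs into distillations amounts to partitioning the columns of a plat according to the partition of ICs that each column induces. The sketch below is our own illustration under the toy encoding introduced in 7.1 (with the Greek letters spelled out as ASCII names); it is not the authors’ software.

```python
from collections import defaultdict

def distillations(plat, mpss):
    """Group MPSs whose columns of exponences are isomorphic across ICs,
    i.e. whose columns induce the same partition of the inflection classes."""
    groups = defaultdict(list)
    for j, mps in enumerate(mpss):
        column = [row[j] for row in plat.values()]
        # Canonical signature of the induced partition: for each IC, the
        # index of the first IC sharing the same exponence in this column.
        signature = tuple(column.index(e) for e in column)
        groups[signature].append(mps)
    return list(groups.values())

# Toy plat corresponding to Table 7.6.
MPSS = ["rho", "sigma", "tau", "upsilon", "phi"]
PLAT = {
    "I":   ["a", "c", "f", "f", "h"],
    "II":  ["a", "c", "g", "g", "i"],
    "III": ["a", "d", "f", "f", "h"],
    "IV":  ["b", "d", "g", "g", "i"],
    "V":   ["b", "e", "f", "f", "h"],
    "VI":  ["b", "e", "g", "g", "i"],
}

print(distillations(PLAT, MPSS))
# -> [['rho'], ['sigma'], ['tau', 'upsilon', 'phi']]
```

Run on the hypothetical plat of Table 7.6, the sketch groups τ, υ, and ϕ together, matching the discussion above.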

7.3 Conditional entropy

In recent work on morphological complexity, the information-theoretic notion of conditional entropy (Shannon 1951) has been employed as an objective measure of an inflection-class system’s complexity (Moscoso del Prado Martín et al. 2004, Ackerman et al. 2009, Milin et al. 2009). Intuitively, the entropy of a MPS σ conditional on MPS τ is the uncertainty about the exponence eσ of σ when one knows the exponence eτ of τ. Given a set S of MPSs such that M ∈ S, Stump and Finkel (2013) define the n-MPS entropy of M as the average of the entropy of M conditional on members of C_M (= {c | c is a subset of S excluding M with up to n members}). The formula for n-MPS entropy is

$$H_n(M) = \frac{\sum_{c \in C_M} H(M \mid c)}{|C_M|},$$

where H(M|c) is the entropy of M conditional on c.7

If we calculate the 4-MPS entropy of the four distillations in Table 7.7, we get the results in Table 7.8. On average, the H-O and S-O plats are nearly alike in their 4-MPS entropy, but closer examination reveals considerable variation between the two plats with respect to the 4-MPS entropy of certain cells. In both the {inf} and {3sgPresInd} distillations, the two plats have the same number of contrasting exponences (see Table 7.9); yet, the 4-MPS entropy for both distillations is substantially lower in the

7 The entropy H(M|c) of M conditional on c is defined as

$$H(M \mid c) = -\sum_{x \in c} P(x) \sum_{y \in M} P(y \mid x) \log_2 P(y \mid x),$$

where P(x) is the probability of x and P(y|x) is the probability of y given x.
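As a rough sketch of how such quantities can be computed over the toy plat encoding from 7.1 (our own illustration, not the Principal-Parts Analyser), one might write the following. Two simplifying assumptions should be flagged: inflection classes are treated as equiprobable (a type-based rather than token-based estimate), and C_M is taken to contain nonempty predictor sets only, since the prose leaves that choice open.

```python
import itertools
import math
from collections import Counter, defaultdict

def conditional_entropy(plat, mpss, target, predictors):
    """H(target | predictors): uncertainty about the target exponence given
    the joint exponence of the predictor MPSs, with every IC equiprobable."""
    context_count = Counter()
    outcome_count = defaultdict(Counter)
    for row in plat.values():
        context = tuple(row[mpss.index(p)] for p in predictors)
        outcome = row[mpss.index(target)]
        context_count[context] += 1
        outcome_count[context][outcome] += 1
    n = sum(context_count.values())
    h = 0.0
    for context, c_count in context_count.items():
        p_context = c_count / n
        for o_count in outcome_count[context].values():
            p_outcome = o_count / c_count
            h -= p_context * p_outcome * math.log2(p_outcome)
    return h

def n_mps_entropy(plat, mpss, target, n=4):
    """Average of H(target | c) over predictor sets c of size 1..n drawn
    from the remaining MPSs (ideally one representative per distillation)."""
    others = [m for m in mpss if m != target]
    subsets = [c for k in range(1, n + 1)
               for c in itertools.combinations(others, k)]
    return sum(conditional_entropy(plat, mpss, target, c)
               for c in subsets) / len(subsets)
```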


Table 7.8 4-MPS entropy (× 100) of the four distillations in Table 7.7

                   Distillations
Plat   {inf}   {3sgPresInd}   {past}   {pastPple}   Average
H-O    81      82             90       88           85
S-O    72      73             89       95           82

Table 7.9 Distinct exponences in each of the four distillations

                   Distillations
Plat   {inf}   {3sgPresInd}   {past}   {pastPple}
H-O    25      26             26       27
S-O    25      26             31       33

S-O plat than in the H-O plat. In the realization of these distillations, the S-O plat affords greater certainty than the H-O plat, and is in that sense less complex. In the {past} and {pastPple} distillations, by contrast, the S-O plat exhibits a larger number of distinct exponences than the H-O plat; accordingly, there is naturally a higher level of uncertainty associated with the S-O plat—indeed, the 4-MPS entropy of the S-O plat exceeds that of the H-O plat in the {pastPple} distillation.

7.4 Principal-part sets

The principal parts of a lexeme are a small set of cells in its realized paradigm from which all of the remaining cells are deducible. Because they make it possible for language learners to master entire realized paradigms with a minimum of memorization, principal parts have a long tradition of use in language pedagogy. They also have theoretical significance: they concisely embody implicative relations among the cells of a lexeme’s realized paradigm, and these relations define patterns that children clearly employ in learning their first language. As we now show, the precise configuration of an inflectional system’s implicative patterns depends on how that system is represented; thus, the viable principal-part sets afforded by a H-O plat may differ from those afforded by a S-O plat.

Before proceeding, we must draw an important distinction between the principal-part sets employed in instructional contexts and those that we use in analysing a realized paradigm’s implicative patterns. Consider, for example, the principal parts


employed in teaching Latin verbs: traditionally, these are a verb’s first-person singular present indicative active form, its present infinitive form, its first-person singular perfect indicative active form, and its perfect passive participle; the examples in Table 7.10 illustrate. By learning these forms of a given verb, students of Latin learn to deduce all of that verb’s remaining forms. Principal-part sets of this sort have three characteristics that are important for language pedagogy but which are not essential for determining an inflectional system’s implicative patterns. First, the principal-part sets employed in language pedagogy are unique: each lexeme has exactly one such set. But an inflectional system’s implicative patterns do not, in general, boil down to a single principal-part set; typically, a lexeme’s full paradigm of inflected forms may be deduced in more than one way, from any of a number of possible sets of forms. Second, principal-part sets in language pedagogy are uniform: from one realized paradigm to the next, it is always the same forms that serve as principal parts (as in Table 7.10). But nothing about an inflectional system’s implicative patterns necessitates this uniformity; on the contrary, lexemes belonging to distinct inflection classes may participate in very different implicative relations. Finally, principal-part sets in language pedagogy are ordinarily optimal: they are as small as can be (while still adhering to the requirement of uniformity). But there is no logical necessity that the set of cells embodying a realized paradigm’s implicative structure be optimal in this sense.

We assume a simpler conception of principal parts, one according to which a principal-part set may or may not be unique, uniform, or optimal. That is, a set of principal parts for a lexeme L is simply any set of cells in L’s realized paradigm P from whose realization one can reliably deduce the realization of the remaining cells in P. This definition allows us to choose whether or not to require that a given principal-part set possess any or all of the properties of uniqueness, uniformity, and optimality. We call principal-part sets that adhere to the requirement of uniformity static principal-part sets; sets that do not adhere to this requirement are dynamic principal-part sets.

The two plats afford similar principal-part sets under the static scheme: in both cases, there are two optimal static principal-part sets: {inf, past, pastPple} and

Table 7.10 Traditional principal parts of five Latin verbs

Conjugation   1sg present         Infinitive   1sg perfect         Perfect passive   Gloss
              indicative active                indicative active   participle
1st           laudo               laudare      laudavī             laudatum          ‘praise’
2nd           moneo               monere       monuī               monitum           ‘warn’
3rd           duco                ducere       duxī                ductum            ‘lead’
3rd (-io)     capio               capere       cepī                captum            ‘take’
4th           audio               audīre       audīvī              audītum           ‘hear’


Table 7.11 Optimal dynamic principal-part sets of three verbs (H-O plat)
[Italic 1–4 represent the four distillations: {inf}, {3sgPresInd}, {past}, {pastPple}]

        Principal-     Distillations
        part sets      1    2    3    4
band    3              3    3    3    3      dynamic principal-part number: 1
        4              4    4    4    4      cell-predictor number: 1.00
                                             number of optimal sets: 2
                                             density of optimal sets: 50.0 of 4
bite    3, 4           3    3    3    4      dynamic principal-part number: 2
                                             cell-predictor number: 1.00
                                             number of optimal sets: 1
                                             density of optimal sets: 16.7 of 6
bring   1, 3           1    1    3    3      dynamic principal-part number: 2
        1, 4           1    1    4    4      cell-predictor number: 1.00
        2, 3           2    2    3    3      number of optimal sets: 4
        2, 4           2    2    4    4      density of optimal sets: 66.7 of 6

{3sgPresInd, past, pastPple}.8 Notwithstanding this similarity, the two plats differ under the dynamic principal-part scheme.

Dynamic principal-part analysis affords various informative measurements. Consider, for example, the band, bite, and bring conjugations. In the H-O plat, these have the optimal dynamic principal-part sets in Table 7.11. In this table, the italic numerals 1, 2, 3, and 4 represent the cells in a lexeme’s realized paradigm corresponding to the four distillations in the H-O plat: {inf}, {3sgPresInd}, {past}, and {pastPple}, respectively. Viable principal-part sets are given in successive rows. The principal parts that suffice to determine the exponence of distillation σ in set A are listed in σ’s column on A’s row. As the table shows, we can compare band, bite, and bring according to

(i) their dynamic principal-part number, i.e. the number of optimal dynamic principal parts they require (band requires only one, while bite and bring both require two);

(ii) their cell-predictor number, i.e. the average number of dynamic principal parts needed to deduce the exponence of a given distillation (in all three of these conjugations, the cell-predictor number is 1.0);

(iii) the number of viable optimal dynamic principal-part sets they afford (band allows two, bite only one, and bring four); and

(iv) the density of optimal dynamic principal-part sets among sets of n cells, given a dynamic principal-part number of n (two of four single-member principal-

8 Conventionally, the first of these is used in pedagogical contexts: sing, sang, sung.


part sets are viable for band, one of six two-member sets is viable for bite, and four of six two-member sets are viable for bring); a brute-force sketch of how these quantities can be computed is given immediately below.
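The following sketch is our own brute-force illustration of measures (i), (iii), and (iv), again over the toy plat encoding of 7.1 and with one representative cell per distillation; it is not the Principal-Parts Analyser itself.

```python
import itertools

def is_viable(plat, mpss, ic, cells):
    """A cell set is a viable dynamic principal-part set for `ic` if no other
    IC shares all of ic's exponences in those cells: knowing them uniquely
    identifies ic's row, and hence every remaining cell."""
    signature = tuple(plat[ic][mpss.index(c)] for c in cells)
    return all(tuple(row[mpss.index(c)] for c in cells) != signature
               for other, row in plat.items() if other != ic)

def optimal_dynamic_sets(plat, mpss, ic):
    """Smallest viable dynamic principal-part sets for `ic`, with the dynamic
    principal-part number and the density of viable sets among candidates
    of that size.  `mpss` should hold one representative per distillation."""
    for size in range(1, len(mpss) + 1):
        candidates = list(itertools.combinations(mpss, size))
        viable = [c for c in candidates if is_viable(plat, mpss, ic, c)]
        if viable:
            return {"principal-part number": size,
                    "optimal sets": viable,
                    "number of optimal sets": len(viable),
                    "density": (len(viable), len(candidates))}
    return None
```

Run over just the three H-O rows for cast, last, and pass from Table 7.2, this sketch reports a principal-part number of 2 for cast with four viable sets out of six, matching the analysis of Table 7.14 below.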

If we average these four measures across all forty ICs in both plats, we arrive at the figures in Table 7.12. Here, the difference between the two plats becomes more pronounced. By three of the four measures, the S-O plat allows a verbal lexeme’s cells to be deduced more easily than the H-O plat: on average, the forty conjugations require smaller principal-part sets in the S-O plat, which also affords more viable sets—that is, more ways of deducing unknown cells from known cells. By these criteria, the S-O plat is less complex than the H-O plat.

We can localize these differences between the S-O and H-O plats by examining these results more closely. Consider first the dynamic principal-part numbers of the

these results more closely. Consider first the dynamic principal-part numbers of theforty verbs in the two plats. As Table 7.13 shows, the two plats yield different principal-part numbers for only four verbs: bite, cast, hide, and mean. In the S-O plat, eachof these four verbs requires only a single principal part, but each requires two in theH-O plat.Consider, for example, the verb cast. In the H-O plat, cast has the exponences

in Table 7.14. In this plat, a principal-part set with only a single member cannot workfor cast, because each of its exponences is identical to the corresponding exponenceof either last or pass. For this reason, cast requires two principal parts. Given four

Table 7.12 Dynamic principal-part sets in the two plats

                                                                   H-O    S-O
Dynamic principal-part number, averaged across conjugations       1.23   1.12
Cell-predictor number, averaged across conjugations               1.00   1.00
Number of viable dynamic principal-part sets, averaged
across conjugations                                               2.35   2.58
Average ratio of actual to possible dynamic principal-part
sets, averaged across conjugations                                52.9   60.6

Table 7.13 Dynamic principal-part numbers in the two plats

                                                           H-O    S-O
band, budge, build, choose, do, draw, feel, fly,
hang, have, last, lean, light, load, lose, make,
pass, pay, peel, read, ride, say, see, seek, send,
shop, sing, slide, stand, teach, write                      1      1
bring, buy, fight, flee, sting                              2      2
bite, cast, hide, mean                                      2      1
Average                                                     1.23   1.12


Table 7.14 Candidate principal-part sets for cast in the H-O plat (Within
a given row, a pair of distillations marked ‘✓ . . . ✓’ corresponds to a viable
principal-part set, and a pair of distillations marked ‘× . . . ×’ corresponds to
a set that is not viable.)

         Candidate principal-part sets
Lexeme       {inf}   {3sgPresInd}   {past}   {pastPple}
cast         æst     æsts           æst      æst
         1   ×       ×
         2   ✓                      ✓
         3   ✓                               ✓
         4           ✓              ✓
         5           ✓                       ✓
         6                          ×        ×
last         æst     æsts           æstәd    æstәd
pass         æs      æsәz           æst      æst

distillations, there are six candidate sets containing two principal parts; four of these six sets are viable, namely those numbered ‘2’–‘5’ in Table 7.14. (The principal-part set numbered ‘1’ doesn’t distinguish cast from last; the set numbered ‘6’ doesn’t distinguish cast from pass.)

The situation is different in the S-O plat, where cast, last, and pass have the exponences in Table 7.15. In this plat, a single principal part suffices to distinguish cast from last and pass. Given four distillations, there are four candidate single-member principal-part sets; as Table 7.15 shows, two of the four sets—those numbered ‘3’ and ‘4’—are viable. Thus, by the criterion of verbs’ dynamic principal-part numbers, the H-O plat is more complex than the S-O plat.

Consider now the number of viable optimal dynamic principal-part sets available to each verb in the two plats. As Table 7.16 shows, most verbs exhibit the same number of sets in both plats. Among the ten verbs that do not, eight exhibit more viable sets in the S-O plat than in the H-O plat. For instance, the H-O plat affords two optimal

Table 7.15 Candidate principal-part sets for cast (S-O plat)

         Candidate principal-part sets
Lexeme       {inf}   {3sgPresInd}   {past}   {pastPple}
cast         æst     æst-z          æst      æst
         1   ×
         2           ×
         3                          ✓
         4                                   ✓
last         æst     æst-z          æst-d    æst-d
pass         æs      æs-z           æs-d     æs-d


Table 7.16 Number of viable optimal dynamic principal-part sets

                                                               H-O    S-O
Same number in both plats:
bite, fly, hide, light, say, sing, slide                        1      1
band, choose, feel, hang, last, lean, lose, peel, see,
seek, stand, sting, teach                                       2      2
draw                                                            3      3
bring, budge, build, buy, fight, flee, have, load, shop         4      4

More in S-O plat:
ride, write                                                     1      2
pay                                                             1      3
make, pass, read, send                                          2      4
do                                                              3      4

More in H-O plat:
cast, mean                                                      4      2

Average                                                         2.35   2.58

Table 7.17 Candidate principal-part sets for pass (H-O plat)

         Candidate principal-part sets
Lexeme       {inf}   {3sgPresInd}   {past}   {pastPple}
pass         æs      æsәz           æst      æst
         1   ✓
         2           ✓
         3                          ×
         4                                   ×
cast         æst     æsts           æst      æst

dynamic principal-part sets for the verb pass, one containing its {inf} cell and the other, its {3sgPresInd} cell; neither the {past} cell of pass nor its {pastPple} cell is an optimal principal part, since neither distinguishes the conjugation of pass from that of cast in the H-O plat (Table 7.17). By contrast, the additional morphological information in the S-O plat makes each of the four cells an optimal dynamic principal part: in this plat, the exponence of passed (morphophonologically |pæs-d|) distinguishes it from cast (morphophonologically |kæst|), as Table 7.18 shows.


Table 7.18 Candidate principal-part sets for pass (S-O plat)

         Candidate principal-part sets
Lexeme       {inf}   {3sgPresInd}   {past}   {pastPple}
pass         æs      æs-z           æs-d     æs-d
         1   ✓
         2           ✓
         3                          ✓
         4                                   ✓
cast         æst     æst-z          æst      æst

Table 7.19 Density of viable dynamic principal-part sets among all candidate
sets having the same number of members

                                                           H-O          S-O
budge, build, have, load, shop                             100 of 4     100 of 4
draw                                                       75.0 of 4    75.0 of 4
bring, buy, fight, flee                                    66.7 of 6    66.7 of 6
band, choose, feel, hang, last, lean, lose, peel, see,
seek, stand, teach                                         50.0 of 4    50.0 of 4
sting                                                      33.3 of 6    33.3 of 6
fly, light, say, sing, slide                               25.0 of 4    25.0 of 4
do                                                         75.0 of 4    100 of 4
make, pass, read, send                                     50.0 of 4    100 of 4
pay                                                        25.0 of 4    75.0 of 4
ride, write                                                25.0 of 4    50.0 of 4
bite, hide                                                 16.7 of 6    25.0 of 4
cast, mean                                                 66.7 of 6    50.0 of 4
Average                                                    52.92        60.62

The two apparent exceptions to the generalization that the S-O plat affords more viable principal-part sets are cast and mean, each of which has four optimal dynamic principal-part sets in the H-O plat but only two in the S-O plat. This apparent exceptionality is, however, an artefact of a difference between the two plats that we have already noted: the H-O plat requires these two verbs to have two principal parts each, whereas the S-O plat only requires them to have one; see again Table 7.13. That is, cast and mean have more viable principal-part sets in the H-O plat because they have more candidate principal-part sets in that plat than in the S-O plat—six versus four.

Finally, consider the density of viable dynamic principal-part sets among candidate sets of the same size. As Table 7.19 shows, most verbs exhibit the same density in both plats. Of the remaining verbs, a solid majority exhibit more density in the S-O plat than in the H-O plat. Only two exhibit the opposite pattern, and again these are cast


and mean, whose apparent exceptionality again stems from the fact that they require two principal parts in the H-O plat but only one in the S-O plat.

7.5 Inflection-class (IC) predictability

A given principal-part set reveals some of the implicative relations that hold among the cells of a lexeme’s realized paradigm, but it doesn’t reveal all such relations; a clearer picture can be arrived at by comparing a large number of alternative principal-part sets. Building on this idea, we have proposed a measure of IC predictability (Finkel and Stump 2009, 2013). Intuitively, the IC predictability of a lexeme L’s realized paradigm is the fraction of viable (though not necessarily optimal) dynamic principal-part sets among all nonempty subsets of cells in L’s realized paradigm.9 Since large subsets always tend to be viable principal-part sets, we restrict the size of the cell sets used in this computation to some arbitrary number m, usually setting m to 4 (as in the following calculations). We then define ICP_L, the IC predictability of realized paradigm P_L, as follows:

$$\mathrm{ICP}_L = \frac{\left|m[D_L{}']\right|}{\left|m[P(D_L)\setminus\varnothing]\right|}$$

The following abbreviations and notation are employed in this definition:

P_L is the realized paradigm of lexeme L.
M_L is the set of cells in P_L.
D_L is any maximal subset of M_L none of whose members belong to the same distillation.
D_L′ is the set {N : N ⊆ D_L and N is a viable dynamic principal-part set for P_L}.
For any collection C of sets, m[C] represents {s ∈ C : |s| ≤ m}.
For any set S, P(S) is the power set of S (= the set of subsets of S).
For any sets S1, S2, S1 \ S2 is {x : x ∈ S1 and x ∉ S2}.
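Read procedurally, and reusing the is_viable helper from the sketch in 7.4, the definition can be rendered roughly as follows. This is our own illustration over the toy encoding of 7.1; `distillation_cells` is a hypothetical name for a list containing one representative cell per distillation.

```python
import itertools

def ic_predictability(plat, mpss, distillation_cells, ic, m=4):
    """Fraction of the nonempty cell sets of size <= m, drawn from one
    representative cell per distillation, that are viable dynamic
    principal-part sets for inflection class `ic`."""
    candidates = [c for k in range(1, m + 1)
                  for c in itertools.combinations(distillation_cells, k)]
    viable = [c for c in candidates if is_viable(plat, mpss, ic, c)]
    return len(viable) / len(candidates)
```

With four distillations and m = 4 there are fifteen candidate sets, which is why the per-verb values in Table 7.20 below are all multiples of 1/15 (e.g. 0.533 = 8/15, 0.800 = 12/15).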

Intuitively, a realized paradigm with high IC predictability has many inter-cell implicative relations, and one with lower IC predictability has fewer. A realized paradigm with an IC predictability of 1.0 allows the word form in every cell to be deduced from the word form in every other cell; a realized paradigm with an IC predictability of 0 allows no such inferences. Low IC predictability contributes to an inflectional system’s complexity.

Most of the verbs in (1) have the same IC predictability in both the H-O and S-O plats; for twelve verbs, however, the two plats produce distinct measures. As Table 7.20 shows, it is invariably the S-O plat that induces the higher level of IC predictability in these cases; moreover, five of these verbs exhibit full IC predictability (a value of 1.0) in the S-O plat but not in the H-O plat. Thus, the pattern that emerged in the preceding sections is again in evidence: the S-O plat is less complex than the H-O plat.

9 For a demonstration that IC predictability and n-MPS entropy are distinct measures (neither reducible to the other), see Stump and Finkel (: ).


Table 7.20 IC predictability of twelve verbs in the H-O and S-O plats

         H-O            S-O
bite     0.267    <     0.533
cast     0.600    <     0.800
do       0.933    <     1.000
hide     0.267    <     0.533
make     0.800    <     1.000
mean     0.600    <     0.800
pass     0.800    <     1.000
pay      0.733    <     0.933
read     0.800    <     1.000
ride     0.533    <     0.800
send     0.800    <     1.000
write    0.533    <     0.800
Avg      0.727    <     0.790

7.6 Cell predictability

Intuitively, the cell predictability of a cell 〈w, σ〉 in a lexeme L’s realized paradigm P_L is the ratio of (a) to (b), where (a) is the number of nonempty subsets of P_L’s cells whose realization uniquely determines 〈w, σ〉 and (b) is the number of all nonempty subsets of P_L’s cells. More formally, we define the cell predictability CellP_〈w,σ〉 of a cell 〈w, σ〉 in a realized paradigm P_L as follows:

$$\mathrm{CellP}_{\langle w,\sigma\rangle} = \frac{\bigl|\,[m[D_{\langle w,\sigma\rangle}]]^{-\langle w,\sigma\rangle}\bigr|}{\bigl|\,[m[P(D_L)\setminus\varnothing]]^{-\langle w,\sigma\rangle}\bigr|}$$

The following additional abbreviations and notation are employed in this definition:

D_〈w,σ〉 is the set {N : N ⊆ D_L and N uniquely determines the realized cell 〈w, σ〉 in P_L}.
For any collection C of sets, [C]^−〈w,σ〉 represents the largest subset of C such that no member of [C]^−〈w,σ〉 contains 〈w, σ〉.

The word form in a cell with high cell predictability can be inferred in several ways; the word form in a cell with low cell predictability can be inferred in few ways, possibly not at all. Thus, cells with high cell predictability contribute to an inflection-class system’s simplicity, while low cell predictability contributes to its complexity.

When we apply the cell predictability measure to the cells represented in the H-O and S-O plats, we find that most verbs have cells whose cell predictability is the same in both plats. But for the thirteen verbs in Table 7.21, we find differences; these differences


Table 7.21 Cell predictability measures for thirteen verbs in the H-O and S-O plats

        {inf}             {3sgPresInd}      {past}            {pastPple}        Avg
        H-O      S-O      H-O      S-O      H-O      S-O      H-O      S-O      H-O      S-O
bite    0.875    0.875    0.875    0.875    0.000 <  0.500    0.000    0.000    0.438 <  0.562
cast    0.500 <  0.875    0.500 <  0.875    0.500    0.500    0.500    0.500    0.500 <  0.688
do      0.750 <  0.875    0.750 <  0.875    0.875    0.875    0.750 <  0.875    0.781 <  0.875
hide    0.750 <  0.875    0.750 <  0.875    0.000 <  0.500    0.000    0.000    0.375 <  0.562
make    0.500 <  0.875    0.500 <  0.875    0.875    0.875    0.875    0.875    0.688 <  0.875
mean    0.500 <  0.875    0.500 <  0.875    0.500    0.500    0.500    0.500    0.500 <  0.688
pass    0.500 <  0.875    0.500 <  0.875    0.875    0.875    0.875    0.875    0.688 <  0.875
pay     0.500 <  0.875    0.375 <  0.750    0.750    0.750    0.750    0.750    0.594 <  0.781
read    0.500 <  0.875    0.500 <  0.875    0.875    0.875    0.875    0.875    0.688 <  0.875
ride    0.875    0.875    0.875    0.875    0.000 <  0.500    0.500    0.500    0.562 <  0.688
send    0.500 <  0.875    0.500 <  0.875    0.875    0.875    0.875    0.875    0.688 <  0.875
slide   0.750 <  0.875    0.750 <  0.875    0.500    0.500    0.000    0.000    0.500 <  0.562
write   0.875    0.875    0.875    0.875    0.000 <  0.500    0.500    0.500    0.562 <  0.688
Avg     0.706 <  0.781    0.700 <  0.775    0.566 <  0.616    0.584 <  0.588    0.639 <  0.690

are shaded in Table 7.21. Without exception, if a cell has distinct cell predictabilities in the H-O and S-O plats, its cell predictability is lower in the H-O plat than in the S-O plat. Notice, in particular, the dark-outlined values in Table 7.21: these represent instances in which a cell’s word form is unpredictable in the H-O plat but predictable in the S-O plat. By the criterion of cell predictability, we again see that the H-O plat is more complex than the S-O plat.

7.7 Cell predictiveness

We define the predictiveness of a cell K in a realized paradigm as the fraction of the other cells in the paradigm whose word forms are fully determined by that of K. The two plats differ in their overall predictiveness (averaging over both cells and verbs): verbs’ cells are more predictive in the S-O plat than in the H-O plat. This difference stems from the fact that there are two cells in a verb’s realized paradigm that are, on average, more predictive in the S-O plat than in the H-O plat; these are the {past} and {pastPple} cells (Table 7.22). In particular, the ten verbs in Table 7.23 account for this difference. In this table, a verb’s {past} word form (notated as 3) either predicts (1) or fails to predict (0) the word forms in its {inf} (= 1), {3sgPresIndic} (= 2), and {pastPple} (= 4) cells. The difference between the results for the H-O plat and those for the S-O plat is striking: in the S-O plat, the past-tense forms of these ten verbs are highly predictive; in the H-O plat, by contrast, they are at best mildly predictive,


Table 7.22 Predictiveness of a verb’s cells, averaged across verbs

                                                   H-O     S-O
Overall predictiveness of ({inf}):                 0.550   0.550
                       of ({3sgPresInd}):          0.600   0.600
                       of ({past}):                0.592   0.767
                       of ({pastPple}):            0.658   0.808
Average predictiveness across all distillations:   0.600   0.681

Table 7.23 Predictiveness of a verb’s past-tense cell in the H-O and S-O plats

              H-O                                        S-O
        Distillations                              Distillations
        1   2   3   4   Predictiveness             1   2   3   4   Predictiveness
cast    0   0   X   1   0.333              cast    1   1   X   1   1.000
do      0   0   X   0   0.000              do      1   1   X   1   1.000
hide    0   0   X   0   0.000              hide    1   1   X   0   0.667
make    0   0   X   1   0.333              make    1   1   X   1   1.000
mean    0   0   X   1   0.333              mean    1   1   X   1   1.000
pass    0   0   X   1   0.333              pass    1   1   X   1   1.000
pay     0   0   X   1   0.333              pay     1   1   X   1   1.000
read    0   0   X   1   0.333              read    1   1   X   1   1.000
send    0   0   X   1   0.333              send    1   1   X   1   1.000
slide   0   0   X   0   0.000              slide   1   1   X   0   0.667

1 = {inf}   2 = {3sgPresInd}   3 = {past}   4 = {pastPple}   0 = unpredicted   1 = predicted   X = predictor

if they are predictive at all. Predictiveness is therefore a final criterion according to which the S-O representation of English verb inflection is less complex than the H-O representation.
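For completeness, here is one way of computing the predictiveness of a single cell for a given verb, sketched in our own terms under the toy encoding of 7.1: a target cell counts as fully determined if every IC sharing the predictor exponence also shares the target exponence.

```python
def cell_predictiveness(plat, mpss, cells, ic, predictor):
    """Fraction of the other distillation cells whose exponence, for `ic`,
    is fully determined by ic's exponence in the predictor cell."""
    p = mpss.index(predictor)
    sharing = [row for row in plat.values() if row[p] == plat[ic][p]]
    others = [c for c in cells if c != predictor]
    determined = sum(
        all(row[mpss.index(c)] == plat[ic][mpss.index(c)] for row in sharing)
        for c in others)
    return determined / len(others)
```

In the H-O plat, for example, the past-tense exponence /æst/ of cast is shared by pass, which differs from cast in its {inf} and {3sgPresInd} cells but not in its {pastPple} cell, giving the value 0.333 reported in Table 7.23.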

7.8 Conclusion

In investigating the inflectional morphology of the world’s languages, one is inevitably struck by their widely varying degrees of morphological complexity. Unless this impression is to remain nothing more than a vague and subjective observation, it is critical to find objectively measurable correlates of perceived differences in complexity, and indeed, to articulate in objective terms exactly what morphological complexity means. Here and elsewhere, we have proposed that the complexity of an inflectional


system is the extent to which it inhibits motivated inferences about the word forms realizing a paradigm’s cells. The factors that inhibit such inferences are of various kinds, making it desirable to employ a range of approaches to measuring their effects. A number of approaches have recently been proposed, drawing on both information-theoretic and set-theoretic methods.

In order to measure an inflectional system’s complexity, one must employ a precise and explicit representation of that system. We have argued here that the same system is inevitably representable in more than one way, and that the choice of representation has profound effects on the measurement of the system’s properties. We have elaborated by presenting two different representations of the fragment of the English verb system in (1), one ‘hearer-oriented’, the other ‘speaker-oriented’. Our goal has not been to argue that one sort of representation is more suitable than the other (and we don’t think that that is the case in any event). Nor do we wish to suggest that these are the only possible representations of the system of verb inflection in (1). Rather, our objective has been to demonstrate how deeply the choice of representation can affect the results of measuring a system’s complexity. Although we have focused on English here, we have seen this same effect with every inflectional system that we have investigated: the results of calculations based on a H-O plat differ from those based on a S-O plat. Once one begins to investigate the relative complexity of two or more systems, it is important that sames be compared with sames: the mode of representation chosen for one system must be assiduously employed in representing the systems with which it is compared.


Computational complexity of abstractive morphology

VITO PIRRELLI, MARCELLO FERRO, AND CLAUDIA MARZI

8.1 Introduction

In a constructive perspective on morphological theorizing, roots and affixes are the basic building blocks of morphological competence, on the assumption that the lexicon is largely redundancy-free. The speaker, having identified the parts of a word form, proceeds to discard the full form from the lexicon. Fully inflected forms are parsed (in word recognition) or generated (in word production) on line from their building blocks. This is in sharp contrast with a second type of perspective, named abstractive, which treats word forms as basic units and their recurrent parts as abstractions over full forms. According to this perspective, full forms are the founding units of morphological processing, with sub-lexical units resulting from the application of morphological processes to full forms. Learning the morphology of a language thus amounts to learning relations between fully stored word forms, which are concurrently available in the speaker’s mental lexicon and jointly facilitate processing of morphologically related forms.

The essential distinction between constructive and abstractive approaches (due to Blevins 2006) is not coextensive with the opposition between morpheme-based and word-based views of morphology. There is an obvious sense in which constructive approaches are morph-based, as they rely on the combination of minimal sub-lexical units. On the other hand, abstractive approaches are typically word-based, since words are taken to be essential to morphological generalizations. However, one can endorse an abstractive view of morphologically complex words as being internally structured into recognizable constituent parts. According to this view, constituent parts are analysed as ‘emergent’ from independent principles of lexical organization, whereby full lexical forms are redundantly stored and mutually related through entailment


relations (Matthews 1991, Corbett and Fraser 1993, Pirrelli 2000, Burzio 2004, Booij2010). Conversely, a constructive orientation can underlie a word-based view ofmorphological structure. This is the case of Stump’s notion of ‘paradigm function’(Stump 2001), mapping the root of a lexeme and a set of morphosyntactic propertiesonto the paradigm cell occupied by an inflected form of the lexeme.More importantly for our present purposes, the two views are also orthogonal to the

distinction between symbolic (or rule-based) and sub-symbolic (or neurally inspired)word processing. First, we observe that, in spite of their being heralded as championsof the associative view on morphological structure, the way connectionist neuralnetworks have been used to model morphology learning adheres, in fact, to a strictlyderivational view ofmorphological relations, according to which a fully inflected formis always produced (or analysed) on the basis of a unique, underlying lexical form. Bymodelling inflection as a phonological mapping function from a lexical base to itsrange of inflected forms, connectionist architectures are closer to a rule-free variantof the classical constructive view, than to associative models of the mental lexicon.Conversely, according to Albright and Hayes’ Minimal Generalization algorithm(2003), speakers conservatively develop structure-based rules of mapping betweenfully inflected forms. These patterns are based on a cautious inductive generalizationprocedure: speakers are confident in extending amorphological pattern to other formsto the extent that (i) the pattern obtains for many existing word forms, (ii) thereis a context-similarity between words that comply with the pattern, and (iii) thereis a context-difference between those word forms and word forms that take otherpatterns. Accordingly, the speaker’s knowledge of word structure is more akin to onedynamic set of relations among fully inflected forms, in line with an overall abstractiveapproach.1Finally, the separation line between constructive and abstractive views has little

to do with the controversy over the psychological reality of morpheme-like units. Constructivists tend to emphasize the role of morphs in both morphology acquisition and processing, whereas in the abstractive view morphs emerge as abstractions over full forms. In fact, we take the observation that self-organization effects provide an explanatory basis for important aspects of language knowledge to be one of the most important contributions of emergentist approaches to language inquiry. We shall show that self-organization is a determinant of lexical competence and that emergent morphological patterns do play an important role in word processing and acquisition.

1 In fact, the sheer number of leading forms which need to be learned for a speaker to be able to master an entire paradigm is not a defining feature of abstractive approaches. In simple inflectional systems, a single form may well be sufficient to produce all remaining forms of the same paradigm. The computational question of how many different forms are needed to predict the entire range of formal variation expressed by a specific paradigm or by a class of morphologically related paradigms has lately been addressed on either information-theoretic (conditional entropy of cells and paradigms, Moscoso del Prado Martín et al.) or set-theoretic grounds (see Stump and Finkel's contribution to the present volume).

It could be tempting to suggest that the abstractive and constructive views just represent two complementary modes of word processing: (i) a production mode for the constructive perspective, assembling sub-lexical bits into lexical wholes, and (ii) a recognition mode for the abstractive perspective, splitting morphologically complex words into their most recurrent constituent morphs. Blevins (2006) shows that this is not true empirically, as the two approaches lead to very different implications (such as the 'abstractive' possibility for a word form to have multiple bases or 'split-base effect') and allow for a different range of representational statements (e.g. the sharp separation between actual data and abstract patterns arising from data in the constructive approach).

In this chapter, we intend to show that the two views are also computationally
different. We contend that a number of problems arising in connection with a sub-symbolic implementation of the constructive view are tackled effectively, or disappear altogether, in a neurally inspired implementation of the associative framework, resting on key notions such as self-organization and emergence. In particular, we will first go over some algebraic prerequisites of morphological processing and acquisition that, according to Marcus (2001), perceptron-like neural networks are known to meet to a limited extent only (section 8.2). A particular variant of Kohonen's Self-Organizing Maps (2001) is then introduced to explore and assess the implications of a radically abstractive approach in terms of its computational complexity (section 8.3). Experimental data are also shown to provide a quantitative assessment of these implications (section 8.4). Finally, some general conclusions are drawn (section 8.5).

8.2 The algebraic basis of constructive morphology

Marcus (2001) introduces a number of criterial properties defining any descriptively adequate morphological system.

First, the system must have a way to distinguish variables from instances, to be
able to make such a simple statement as 'play is an English verb stem', where play is a constant and stem is a variable which can be assigned any English verb stem. As we shall see, such a distinction is essential to capture the intuitive notion of repetition of the same symbol within a string, as when we say that the string pop contains two instances (tokens) of the same p (type). Second, a proper morphological system must avail itself of the formal means to represent abstract relationships between variables. Even the most straightforward default rule-like statement such as the following

(1) PROGR(Y) = Y + ing,

meaning that a progressive verb form in English is built by affixing ing after any verb stem Y, requires specification of two relations: (i) an identity relation between the
lexical root Y to the left of '=' and the stem Y in the right-hand side of the equation,² and (ii) the ordering relation between the stem and the progressive inflectional ending -ing. Third, there must be a way to bind a particular instance (a constant value) to a given variable. For example, Y in (1) can be assigned a specific value (e.g. the individual stem play to yield the form playing), as the output of the concatenation operation expressed by '+'. Fourth, the system must be able to apply operations to arbitrary instances of a variable. For example, a concatenation operation must be able to combine ing with any input value assigned to Y, unless Y's domain is otherwise constrained. Finally, the system must have a way to extract relations between variables like (1), on the basis of training examples (walking, sleeping, playing, etc.). Marcus goes on to argue that any morphological system must be able to meet these prerequisites, irrespective of whether it is implemented symbolically as a rule-based processor, or sub-symbolically, as a neurally inspired artificial network. In fact, traditional connectionist models of morphological competence are taken by Marcus to fail to meet at least some of these prerequisites. What is called by Rosenblatt (1962: 73) 'a sort of exhaustive rote-learning procedure', in which every single case of a general relation must be learned individually, appears to be an essential feature of perceptron-like artificial neural networks. Such a characteristic behaviour extends to the family of multi-layered perceptrons classically used to simulate morphology learning. The consequences of this limitation are far-reaching, as detailed in the ensuing sections.
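
To make the first four prerequisites concrete, here is a minimal Python sketch of rule (1), not drawn from Marcus but written purely for illustration: the rule is stated once over a variable and can then be bound to arbitrary instances; the function name and the example stems are our own choices.

    # Rule (1) over a universally quantified variable Y: PROGR(Y) = Y + "ing".
    def progressive(stem):
        # 'stem' is the variable; any English verb stem can be bound to it.
        return stem + "ing"

    # Binding the variable to particular instances (constants) and applying the rule:
    for y in ["walk", "sleep", "play"]:
        print(y, "->", progressive(y))    # walk -> walking, sleep -> sleeping, ...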

8.2.1 Binding

In a multi-layered perceptron, a variable (or type)—say the letter p—can be represented by a single node or, alternatively, by a distributed pattern of nodes, on the input layer. A single instantiation of a p in a string like put is encoded on the input layer by activating the corresponding node(s). We say that an instance of p is bound to its type node(s) through node activation. However, for a string like pop to be properly encoded, two instances of p must be distinguished—say p_1 (or p in first position) and p_3 (or p in third position). Activating the same p node twice would not do, as the whole string is encoded as an integrated activation pattern, where all nodes representing input letters are activated concurrently. Logically, the problem amounts to binding the input layer node (the letter type) to a specific individual, a letter token. When it comes to representing words, a token can be identified uniquely by specifying its type together with the token's position in the word string. In this context, the solution to the binding problem amounts to assigning a specific order relationship to a given type. A traditional strategy for doing this with neural networks is known as conjunctive coding.

2 Computationally, this can be implemented by introducing a copy operator, replicating the content of a given memory register into another register, irrespective of the particular nature of such content. As we shall see later on, this is not the only possible computational interpretation of equation (1).

8.2.2 Coding

In conjunctive coding (Coltheart et al. 2001, Harm and Seidenberg 1999, McClelland and Rumelhart 1981, Perry et al. 2007, Plaut et al. 1996), a word form like pop is represented through a set of context-sensitive nodes. Each such node ties a letter to a specific serial position (e.g. {P_1, O_2, P_3}), as in so-called positional coding or, alternatively, to a specific letter cluster (e.g. {_Po, pOp, oP_}), as is customary in so-called Wickelcoding. Positional coding makes it difficult to generalize knowledge about phonemes or letters across positions (Plaut et al. 1996, Whitney 2001) and to align positions across word forms of differing lengths (Davis and Bowers 2004), as with book and handbook. The use of Wickelcoding, on the other hand, while avoiding some problems of positional coding, causes an acquisitional dead-lock. Speakers of different languages are known to exhibit differential sensitivity to symbol patterns. If such patterns are hard-wired in the input layer, the same processing architecture cannot be used to deal with languages exhibiting differential constraints on sounds or letters.
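
The two coding schemes can be sketched in a few lines of Python; the encodings below simply follow the notational examples given above (position-tagged letters and padded letter trigrams) and are meant as an illustration, not as a model of any of the architectures cited.

    def positional_coding(word):
        # Tie each letter to its serial position: pop -> {('p', 1), ('o', 2), ('p', 3)}
        return {(letter, i + 1) for i, letter in enumerate(word)}

    def wickel_coding(word):
        # Tie each letter to its immediate neighbours (letter trigrams), padding
        # word edges with '_': pop -> {'_po', 'pop', 'op_'}
        padded = "_" + word + "_"
        return {padded[i - 1:i + 2] for i in range(1, len(padded) - 1)}

    print(positional_coding("pop"))   # the two tokens of 'p' are kept apart by position
    print(wickel_coding("pop"))       # the two tokens of 'p' are kept apart by context
    # Positional coding finds no overlap at all between 'book' and 'handbook':
    print(positional_coding("book") & positional_coding("handbook"))   # set()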

8.2.3 Relations and the problem of variables

Consider the problem of providing a neurally inspired version of the relation expressed in (1). The equation conveys an identity mapping between the lexical base in PROGR(Y) and the verb stem Y to the right of '='. A universally quantified relation of this kind is common to a variety of default mapping relations holding in the morphologies of virtually any language. Besides, this sort of free generalization seems to be a fundamental feature of human classificatory behaviour. Straightforward linguistic evidence that people freely generalize universally quantified one-to-one mappings comes from morphological reduplication (or immediate repetition), as found in Indonesian pluralization, whereby the plural of buku ('book') is buku-buku. Reduplication patterns like this have quite early roots in language development. Seven-month-old infants are demonstrably able to extract an underlying abstract structure from two minutes' exposure to made-up syllabic sequences cast into an AAB pattern (ga-ga-na, li-li-ti, etc.) and to tell this habituation structure from other test patterns containing repetitions in different positions (e.g. ABB) (Saffran et al. 1996, Marcus et al. 1999, Gerken 2006).

A one-to-one mapping function can easily be implemented in a network that
uses one node to represent each variable (Figure 8.1, left). The model can freely generalize an identity relation by setting the connection weight to 1. However, it is not at all clear how a one-node-per-variable solution can represent the highly non-linear types of morphological exponence abundantly documented in the literature (see Anderson, section 2.2, this volume for a thorough discussion), such as stem-internal vowel alternation in base/past-tense English pairs such as ring–rang, bring–brought, and hang–hung, where different mapping functions are required depending

Figure 8.1 One-to-one and one-to-many relations in one-node-per-variable and many-nodes-per-variable neural networks. (Left: a single input node x connected to a single output node y; centre and right: input and output nodes labelled 8, 4, 2, 1, with hidden units h1 and h2 in the rightmost panel.)

on both the vowel and its surrounding context. A descriptively more adequate solution is provided by a many-nodes-per-variable perceptron, where each node represents a specific value assigned to each variable, and different connections emanating from the same node can either enforce an identity relation (Figure 8.1, centre) or make room for other mapping relations, with hidden layer units accounting for inter-node influences (Figure 8.1, right). However more flexible it may be, a many-nodes-per-variable perceptron, when trained by back-propagation, can learn a universally quantified one-to-one mapping relation only if it sees this relation illustrated with respect to each possible input and output node (Marcus 2001). Back to Figure 8.1 (centre), a perceptron trained on the '8', '4', and '2' nodes only is not in a position to extend an identity relation to an untrained '1' node. An implication of this state of affairs is that a multi-layered perceptron can learn the relation expressed by equation (1) only for strings (in the domain of Y) that fall entirely in the perceptron's training space.

8.2.4 Training independence

Technically, the problem of dealing with relations over variables is a consequence of the back-propagation algorithm classically used to train a multi-layered perceptron. Back-propagation consists in altering the weights of connections emanating from an activated input node, for the level of activation of output nodes to be attuned to the expected output. According to the delta rule in equation (2), connections between the jth input node and the ith output node are in fact changed in proportion to the difference between the target activation value ĥ_i of the ith output node and the actually observed output value h_i:

(2) Δw_{i,j} = γ (ĥ_i − h_i) x_j

where w_{i,j} is the weight of the connection between the jth input node and the ith output node, γ is the learning rate, and x_j is the activation of the jth input node. If x_j is null, the resulting change Δw_{i,j} is null. In other words, back-propagation never alters the weights of connections emanating from an input node in Figure 8.1 (centre), if the node is never activated in training. This is called by Marcus (2001) training independence. Interposition of a layer of hidden units mediating input–output mapping (Figure 8.1, right) does not remedy this. There is no direct way in which a change in the connections feeding the output node '1' can affect the connections feeding a different output node.
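
The point can be checked with a minimal sketch, under our own simplifying assumptions (a single-layer network with localist nodes, an arbitrary learning rate, and the delta rule of (2) applied directly): an identity mapping trained on the '8', '4', and '2' nodes never touches the weights emanating from the untrained '1' node, so the relation does not extend to it.

    # Four localist input nodes ('8', '4', '2', '1') mapped onto four output nodes.
    symbols = ["8", "4", "2", "1"]
    n = len(symbols)
    gamma = 0.5                                    # learning rate
    w = [[0.0] * n for _ in range(n)]              # w[i][j]: input node j -> output node i

    def forward(x):
        return [sum(w[i][j] * x[j] for j in range(n)) for i in range(n)]

    def train_identity(j, epochs=50):
        # Present symbol j alone and teach the identity mapping with the delta rule (2):
        # each weight changes by gamma * (target_i - observed_i) * x_j.
        x = [1.0 if k == j else 0.0 for k in range(n)]
        target = x[:]                              # identity: the output should copy the input
        for _ in range(epochs):
            h = forward(x)
            for i in range(n):
                for k in range(n):
                    w[i][k] += gamma * (target[i] - h[i]) * x[k]

    for j in range(3):                             # train on '8', '4', and '2' only
        train_identity(j)

    print(forward([0.0, 0.0, 0.0, 1.0]))           # all zeros: no generalization to '1'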

8.2.5 Preliminary discussion

In many respects, multilayer perceptrons are taken to provide a basic neurally inspired associationist mechanism for morphology processing and acquisition. Much of their success history is due to the methodological allure of their simplicity. It seems reasonable to investigate the inherent representational and procedural limitations of this class of models to understand more of the task of word learning and processing and its basic computational components. Among the main liabilities of multilayer perceptrons, the following ones strike us as particularly critical in the light of our previous discussion.

Training locality. Due to training independence, training one node does not transfer to another node. There is no way to propagate information about connections between two (input and output) nodes to another pair of such nodes. This is not only detrimental to the development of abstract open-ended relations holding over variables (as opposed to pair-wise memorized relationships between specific instances), but also to the modelling of word paradigms as complex morphological interfaces. Over the last fifteen years, considerable evidence has accrued on the critical role of paradigm-based relations as an order-principle imposing a non-local organizing structure on word forms memorized in the speaker's mental lexicon, facilitating their retention, accessibility, and use, while permitting the spontaneous production and analysis of novel words. A number of theoretical models of the mental lexicon have been put forward to deal with the role of these global constraints in (i) setting an upper bound on the number of possible forms a speaker is ready to produce (Carstairs and Stemberger 1988), (ii) accounting for reaction times in lexical decision and related tasks (Baayen et al. 1997, Orsolini and Marslen-Wilson 1997, and others), (iii) explaining production errors by both adults and children (Bybee and Slobin 1982, Bybee and Moder 1983, Orsolini et al. 1998), and (iv) accounting for human acceptability judgements and generalizations over nonce verb stems (Say and Clahsen 2002, Albright 2002). We believe that computational models of morphology acquisition and processing have something to say about all these issues. The basic challenge is how it is possible for such global paradigm-based constraints to emerge on the basis of local learning steps (Pirrelli et al. 2004).

Coding. Another important related point has to do with the nature of input representations. It has often been suggested that distributed representations may circumvent the connectionist failure to encode universally quantified variables (Elman 1998). If the same pool of nodes is used to encode different input patterns, then at least some of the mapping relations holding for one pattern will also obtain for the other ones, thus enforcing a kind of generalization. For example, if each input node represents not a word but a part of a word, then training one word has an impact on nodes that are part of the representation of other words. However, distributed representations can be of help only if the items to which one must generalize share that particular contrast (pool of nodes) on which the model was trained. If the model was trained on a different contrast than the one encoded in the test item, generalization to the test item will fail. More seriously, distributed representations can raise unnecessary coding ambiguity if each individual is encoded as an activation pattern over the pool of nodes corresponding to its variable. If two individuals are activated simultaneously, the resulting joint activation pattern can make it difficult to disentangle one individual from the other. In fact, the representational capacity of the network to uniquely bind a particular individual decreases with the extent of distribution. One extreme is a purely localist representation, with each input node coding a distinct individual. The other extreme is the completely distributed case, where each of the 2^N binary activation patterns over N nodes represents a distinct individual. In this case, no two patterns can be superimposed without spuriously creating a new pattern.

Alignment. Lack of generalization due to localist encoding is an important issue in acquisition of morphological structure, as it correlates with human perception of word similarity. The problem arises whenever known symbol patterns are presented in novel arrangements, as when we are able to recognize the English word book in handbook, or the shared root mach in German machen and gemacht. Conjunctive coding of letters is closer to the localist extreme, anchoring a letter either to its position in the string, or to its surrounding context. Since languages wildly differ in the way morphological information is sequentially encoded, ranging from suffixation to prefixation, circumfixation, apophony, reduplication, interdigitation, and combinations thereof, alignment of lexical roots in three diverse pairs of paradigmatically related forms like English walk–walked, Arabic kataba–yaktubu ('he wrote'–'he writes') and German machen–gemacht ('make'–'made' past participle), requires substantially different encoding strategies. If we wired any such strategy into lexical representations (e.g. through a fixed templatic structure separating the lexical root from other morphological markers) we would in fact slip morphological structure into the input, making input representations dependent on languages. This is not wrong per se, but it should be the outcome, not a prerequisite of lexical acquisition. A cognitively plausible solution is to let the processing system home in on the right sort of encoding strategy through repeated exposure to a range of language-specific families of morphologically related words. This is what conjunctive coding in classical connectionist architectures cannot do.
For example, in German machen and gemacht the shared substring mach would be indexed to different time positions. In Arabic, the set of Wickelcodes {_Ka, kAt, aTa, tAb, aBa, bA_} encoding the perfective form kataba turns out to have an empty intersection with the set {_Ya, yAk, aKt, kTu, tUb, uBu, bU_} for the imperfective form yaktubu.
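
The mismatch can be verified directly with the trigram encoder sketched in section 8.2.2 above (repeated here so that the snippet is self-contained): the Wickelcode sets of the two Arabic forms share no element, even though both contain the root consonants k-t-b.

    def wickel_coding(word):
        # Padded letter trigrams, as in section 8.2.2.
        padded = "_" + word + "_"
        return {padded[i - 1:i + 2] for i in range(1, len(padded) - 1)}

    kataba = wickel_coding("kataba")      # {'_ka', 'kat', 'ata', 'tab', 'aba', 'ba_'}
    yaktubu = wickel_coding("yaktubu")    # {'_ya', 'yak', 'akt', 'ktu', 'tub', 'ubu', 'bu_'}
    print(kataba & yaktubu)               # set(): no shared code despite the shared root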

Relations between variables and the copy operator. The apparent need for algebraic variables and open-ended relationships between variables in morphological rules such as (1) makes the computational problem of implementing copying mechanisms centre-stage in word processing. This is illustrated by the two-level transducer of Figure 8.2, implementing a classical constructive approach to word parsing and generation. The input tape represents an input memory register containing an abstract representation of the Italian form vengono ('they come'), with ven representing the verb root, 'V3' its conjugation class, and '3p_pres_ind' the feature specification 'third person plural of the present indicative' ('+' is a morpheme boundary). The string resulting from the application of the finite state transducer to the input string is shown on the output tape. The mapping relationship between characters on the two tapes is represented by the colon (':') operator in the centre box, with input characters appearing to the left of ':', and the corresponding output characters to its right. When no colon is specified, the input letter is intended to be left unchanged. Note that 'ε' stands for the empty symbol. Hence the mapping '[+ : ε]' reads 'delete a marker of morpheme boundary'. Most notably, transductive operations of this kind define declarative, parallel, and bi-directional relationships. We can easily swap the two tapes and use, for word parsing, the same transducer designed for word generation.³
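
A drastically simplified Python rendering of this mapping may help to fix ideas. The rule list and function names below are our own approximation of the transducer T in Figure 8.2, restricted to this one form: generation rewrites the lexical string chunk by chunk, and parsing runs the same declarative rules in the opposite direction (a real two-level transducer would also recover the features deleted by ε-rules, which this toy version does not).

    # Each pair maps a chunk of the lexical (input) tape onto the surface (output) tape;
    # '' plays the role of the empty symbol epsilon.
    RULES = [("ven", "ven"), ("+", "g"), ("V3+", ""), ("3p_pres_ind", "ono")]

    def generate(lexical):
        # Rewrite the lexical string left to right, consuming the longest matching chunk.
        surface = ""
        while lexical:
            for src, tgt in sorted(RULES, key=lambda r: -len(r[0])):
                if lexical.startswith(src):
                    surface += tgt
                    lexical = lexical[len(src):]
                    break
            else:
                raise ValueError("no rule applies to " + lexical)
        return surface

    def parse(surface):
        # The same declarative rules, run from the surface tape back to the lexical tape.
        lexical = ""
        while surface:
            for src, tgt in sorted(RULES, key=lambda r: -len(r[1])):
                if tgt and surface.startswith(tgt):
                    lexical += src
                    surface = surface[len(tgt):]
                    break
            else:
                raise ValueError("no rule applies to " + surface)
        return lexical

    print(generate("ven+V3+3p_pres_ind"))   # vengono
    print(parse("vengono"))                 # ven+3p_pres_ind (the epsilon rule is not recovered)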

Multi-layered perceptrons seem to have little to match such a rich weaponry of formal tools and computational operations. Nonetheless, it should be appreciated that multiple copying of unbounded strings is an operation that goes beyond the computational power of finite-state transducers such as the one in Figure 8.2. This is an immediate consequence of the fact that states represent the order of memory of a transductive automaton. For an automaton to be able to replicate an exact copy of any possible string, irrespective of its length, the number of available states needed to precompile all copies is potentially unbounded. As pointed out by Roark and Sproat (2007), if fully reduplicative morphological phenomena require potentially unbounded copying, they are probably among the very few morphological phenomena which are not in the purview of finite-state processing devices.

It seems natural to establish a parallelism between the two-level architecture in
Figure 8.2 and the multi-layered perceptron of Figure 8.1 (right). In both cases, we

3 Due to the hybrid status of lexical representations in two-level automata, where both surface and abstract morphological information is combined on the same level, abstract features (e.g. conjugation class) must be linearly arranged and interspersed with phonological information. This may strike the reader as rather awkward and other neater solutions have been entertained in the literature. However, as argued by Roark and Sproat (2007), they do not add up to the computational power of two-level finite state machines.

Figure 8.2 A two-level finite state transducer for the Italian irregular form vengono 'they come'. (Input tape: ven + V3 + 3p_pres_ind; output tape: vengono; centre box: T = ven [+:g] [V3+:ε] [1s_pres_ind:o | 3p_pres_ind:ono].)

have an input and an output representation layer, which could be likened to digital registers. In both cases, mapping relations enforce the requirement that at least part of the representation in either layer is also found in the other layer. It is also tempting to conceptualize such a relation as a copy operation. However, although this is indeed the case for finite-state transducers, it is not literally true of multi-layered perceptrons, where data copying is implemented as weight adjustment over inter-layer connections. According to Marcus, this is a fundamental liability of multi-layered perceptrons. Since they cannot implement unbounded copying, he argues, they are not suitable computer models of morphology learning.

In fact, we observed that universal copying is not a problem for perceptrons only.
Finite-state transducers are in principle capable of dealing with multiple copying, as long as the number of strings in the domain of the copy operation is bounded. This is somewhat reminiscent of the local constraint on weight adjustments in multi-layered perceptrons, where the domain of identity mapping must be defined with respect to each possible input and output node. Admittedly, the two limitations are of a different nature and are due to different reasons. Nonetheless, it is important to appreciate that universal copying is a thorny computational issue, and its necessity for word processing and learning should be argued for with care.

8.2.6 Interim summary

In this section, we have been through a number of criticisms levelled at classical connectionist networks dealing with issues of word processing with no symbolic manipulation of typed variables. In the remainder of this chapter, we will make the suggestion that many of the liabilities pointed out by Marcus in connection with
multi-layered perceptrons in fact apply to a specifically constructive approach to morphological processing. In particular, they are due to (i) the idea that morphological inflection and derivation require a formal mapping from lexical bases to full forms, based on (ii) the assumption that underlying bases and inflected forms are represented on two distinct layers of connectivity (or, equivalently, in two distinct registers). Such a theoretically loaded conceptualization makes computational models of inflection and derivation unnecessarily complex. In the following sections we propose an alternative neurally inspired view of word processing and learning that is strongly committed to an abstractive perspective on these and related issues. The approach amounts to a re-interpretation of equation (1) in terms of different computational operations than copying and concatenation. It seems likely that an adequate account of morphology acquisition will have a place both for memory mechanisms that are sensitive to frequency and similarity, and for processing operations that apply to all possible strings. A fundamental assumption of the computational framework proposed here is that both aspects, lexical memory and processing operations, are in fact part and parcel of the same underlying computational mechanism. This mechanism is considerably simpler than advocates of algebraic symbol manipulation are ready to acknowledge, and points in the direction of a more parsimonious use of variables and universally quantified one-to-one mappings. This is not to imply that these notions play no role in morphology learning or language learning in general, but that their role in word learning and processing should be somewhat qualified.

8.3 A computational framework for abstractive morphology

Kohonen's Self-Organizing Maps (SOMs) (Kohonen 2001) define a class of unsupervised artificial neural networks that mimics the behaviour of small aggregations of neurons (pools) in the cortical areas involved in the classification of sensory data (brain maps). In such aggregations, processing consists in the activation of specific neurons upon presentation of a particular stimulus. A distinguishing feature of brain maps is their topological organization (Penfield and Roberts 1959): nearby neurons in the map are activated by similar stimuli. There is evidence that at least some aspects of their neural connectivity emerge through self-organization as a function of cumulated sensory experience (Kaas et al. 1983). Functionally, brain maps are thus dynamic memory stores, directly involved in input processing, exhibiting effects of dedicated long-term topological organization.

Topological Temporal Hebbian SOMs (hereafter TSOMs, Ferro et al. 2011, Marzi
et al. 2012c) represent a variant of classical SOMs making room for memorizing time series of symbols as activation chains of map nodes. This is made possible by a level of temporal connectivity (or temporal connection layer, Figure 8.3, left), implemented as a pool of re-entrant connections providing the state of activation of the map at the immediately preceding time tick. Temporal connections encode the

Figure 8.3 Left: Outline architecture of a TSOM. Each node in the map is connected with all nodes of the input layer. Input connections define a communication channel with no time delay, whose synaptic strength is modified through training. Connections on the temporal layer are updated with a fixed one-step time delay, based on activity synchronization between BMU(t-1) and BMU(t). Right: A two-dimensional 20×20 TSOM, trained on German verb forms, showing activation chains for gelacht and gemacht represented as the input strings #,G,E,L,A,C,H,T,$ and #,G,E,M,A,C,H,T,$ respectively. See text for more details.

map's probabilistic expectations of upcoming events on the basis of past experience. In the literature, there have been several proposals of how this idea of re-entrant temporal connections can be implemented (see Voegtlin 2002 and references therein). In this chapter, we will make use of one particular such proposal (Koutnik 2007, Pirrelli et al. 2011).

In language, acoustic symbols are concatenated through time to reach the hearer
as a time-bound input signal. With some qualifications, written words too can be defined as temporal patterns, conceptualized as strings of letters that are produced and input one letter at a time, under the assumption that time ticks are sampled discretely upon the event of letter production/presentation. In our view, this abstract conceptualization highlights some peculiar (and often neglected) aspects of the linguistic input and provides an ecological setting for dealing with issues of language production, recognition, and storage at an appropriate level of abstraction. TSOMs are designed to simulate and test models of language processing targeted at this level of abstraction.

8.3.1 Input protocol: peripheral encoding

In our idealized setting, each word form is represented by a time-series of symbols (be they phonological segments or printed letters) which are administered to a TSOM by encoding one symbol at a time on the input layer (Figure 8.3, left). The input
layer is implemented as a binary vector whose dimension is large enough for the entire set of symbols to be encoded, with one pattern of bits uniquely associated with each symbol. Patterns can partially overlap (distributed representations) or can be mutually orthogonal (localist representations). In all experiments presented here, we adopted orthogonal codes. A whole word is presented to a map starting with a start-of-word symbol ('#') and ending with an end-of-word symbol ('$'). The map's re-entrant connections are initialized upon presentation of a new word. The implication of this move is that the map's activation state upon seeing the currently input word form has no recollection of past word forms. Nonetheless, the map's overall state is affected by previously shown words through long-term learning, as detailed in what follows.
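
As an illustration of this input protocol, the sketch below (our own minimal rendering, with an arbitrary symbol inventory) encodes a word as a sequence of mutually orthogonal binary vectors delimited by '#' and '$', one vector per time tick.

    ALPHABET = ["#", "$"] + [chr(c) for c in range(ord("a"), ord("z") + 1)]

    def one_hot(symbol):
        # Localist (orthogonal) code: exactly one bit on per symbol.
        vec = [0] * len(ALPHABET)
        vec[ALPHABET.index(symbol)] = 1
        return vec

    def encode_word(word):
        # One time-bound input vector per symbol, wrapped in '#' ... '$'.
        return [one_hot(s) for s in ["#"] + list(word) + ["$"]]

    for t, vec in enumerate(encode_word("pop")):
        print(t, vec)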

8.3.2 Parallel activation: recoding

Upon presentation of one symbol on the input layer all map nodes are activated simultaneously through their input (or spatial) connections. Their resulting level of activation at time t, h_i(t), is also determined by the contribution of re-entrant connections, defined over the map's temporal layer, which convey information about the state of the map's activation at time t-1, according to the following equation:

(3) h_i(t) = α · h_{S,i}(t) + β · h_{T,i}(t).

Equation (3) defines the level of activation of the ith node at time t as a weighted summation of the contribution h_{S,i} of the spatial layer and the contribution h_{T,i} flowing through the map's temporal layer. Given (3), we define the map's Best Matching Unit at time t (hereafter BMU(t)) as the node showing the highest activation level at time t.

BMU(t) represents (a) the map's response to input, and (b) the way the current input stimulus is internally recoded by the map. As a result, after a string of letters is presented to the map one character at a time, a temporal chain of BMUs is activated. Figure 8.3 (right) illustrates two such temporal chains, triggered by the German verb forms gelacht ('laughed', past participle) and gemacht ('made', past participle), which are input as the series #,G,E,L,A,C,H,T,$ and #,G,E,M,A,C,H,T,$ respectively. In the figure, each node is labelled with the letter the node gets most sensitive to after training. Pointed arrows represent temporal connections linking two consecutively activated BMUs. They thus depict the temporal sequence of symbol exposure (and node activation), starting from the symbol '#' (anchored in the top left corner of the map) and ending with '$'. Temporal BMU chains of this kind represent how the map recodes input words.
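
The following sketch turns equation (3) into a best-matching-unit chain for a single input word. It rests on strongly simplifying assumptions of our own (a tiny randomly initialized map, negative Euclidean distance as the spatial contribution, and the previous BMU's outgoing connection weights as the temporal contribution), so it illustrates the recoding step rather than reproducing the maps used in the experiments.

    import math, random

    random.seed(0)
    ALPHABET = ["#", "$", "a", "c", "e", "g", "h", "l", "m", "t"]
    N = 25                     # a 5 x 5 toy map
    ALPHA, BETA = 0.7, 0.3     # relative weight of spatial vs temporal contributions

    # Spatial weights: one prototype vector per node over the input alphabet.
    spatial = [[random.random() for _ in ALPHABET] for _ in range(N)]
    # Temporal weights: strength of the connection from the node active at t-1 to each node.
    temporal = [[random.random() for _ in range(N)] for _ in range(N)]

    def one_hot(symbol):
        return [1.0 if s == symbol else 0.0 for s in ALPHABET]

    def bmu_chain(word):
        chain, prev = [], None
        for symbol in ["#"] + list(word) + ["$"]:
            x = one_hot(symbol)
            h = []
            for i in range(N):
                h_s = -math.dist(spatial[i], x)                      # spatial contribution
                h_t = temporal[prev][i] if prev is not None else 0.0 # temporal contribution
                h.append(ALPHA * h_s + BETA * h_t)                   # equation (3)
            prev = max(range(N), key=lambda i: h[i])                 # BMU(t)
            chain.append(prev)
        return chain

    # With untrained (random) weights the chains are arbitrary; training (section 8.3.3)
    # is what makes related forms such as gemacht and gelacht come to share nodes.
    print(bmu_chain("gemacht"))
    print(bmu_chain("gelacht"))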

8.3.3 Long-term dynamic: training

In the learning phase, at each time step t, BMU(t) adjusts its connection weights over the spatial and temporal connection layers (Figure 8.3, left) and propagates adjustment to neighbouring nodes as an inverse function of their distance from BMU(t).
Adjustment makes connection weights closer to values in the input vector. This amounts to a topological specialization of individual nodes, which get increasingly sensitive to specific symbol types. Notably, specialization does not apply across the board. Due to the propagation function, adjustment is stronger for nodes that are closer to BMU(t). Moreover, the topological range of propagation shrinks as learning progresses. This means that, over time, the propagation wave gets shorter and shorter, reaching fewer and fewer nodes.

The temporal layer presents a similar dynamic. Adjustment of Hebbian connections
consists of two steps: (i) potentiate the strength of association from BMU(t-1) to BMU(t) (and its neighbouring nodes), and (ii) depress the strength of association from all other nodes to BMU(t) (and neighbouring nodes) (Figure 8.4, left). The two steps enforce the logical entailment BMU(t) → BMU(t-1) and the emergence of a context-sensitive recoding of symbols. This means that the same symbol type will recruit different BMUs depending on its preceding context in the input string. This is shown in Figure 8.3 (right) where the letter a in gemacht and gelacht activates two different nodes, the map keeping memory of their different left contexts. Interestingly,

Figure 8.4 Left: Topological propagation of long-term potentiation (solid lines) and long-term depression (dotted lines) of temporal re-entrant connections over two successive time steps. B-nodes indicate BMUs. Right: Word-graph representation of German past participles.

the Markovian order of the map memory is not limited to one character behind but can rise as the result of dynamic recoding. If space on the map allows, the different BMUs associated with a in gemacht and gelacht will, in turn, select different BMUs for the ensuing c, as shown in Figure 8.3. Were nodes trained independently, the map would thus exhibit a tendency to dedicate distinct activation chains to distinct input strings (memory resources allowing). This happens if the topological range of the propagation function for temporal connections goes down to 0. In this case, the map can train temporal connections to each single node independently of connections to neighbouring nodes. However, if the topological range of the propagation function for temporal connections does not get to 0, adjustments over temporal connections will transfer from one node to its neighbouring nodes (Figure 8.4, left). In this condition, activation chains triggered by similar words will tend to share nodes, reflecting the extent to which the map perceives them as identical. The topological distance between chains of activated BMUs responding to similar input strings tells us how well the map is aligning the two strings.
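
A schematic update step in the spirit of this description is sketched below. The propagation function, grid geometry, and learning rate are our own simplifying assumptions: the spatial prototypes of the BMU and its neighbours are pulled towards the current input, the temporal connection from the previous BMU is potentiated while connections from all other nodes are depressed, and every adjustment fades with topological distance from the BMU; with a radius of 0 the update is confined to the BMU itself, so nodes are trained independently.

    import math

    SIDE = 5                          # a 5 x 5 toy map; node i sits at (i // SIDE, i % SIDE)
    N = SIDE * SIDE
    RATE, RADIUS = 0.2, 1.5           # learning rate and current propagation radius

    def grid_distance(i, j):
        return math.dist((i // SIDE, i % SIDE), (j // SIDE, j % SIDE))

    def neighbourhood(i, bmu, radius):
        # 1 at the BMU, decaying with distance, 0 beyond the radius (radius 0 = BMU only).
        if radius <= 0:
            return 1.0 if i == bmu else 0.0
        d = grid_distance(i, bmu)
        return math.exp(-(d * d) / (2 * radius * radius)) if d <= radius else 0.0

    def update(spatial, temporal, x, bmu, prev_bmu):
        for i in range(N):
            g = neighbourhood(i, bmu, RADIUS)
            if g == 0.0:
                continue
            # (a) Spatial layer: move node i's prototype towards the current input vector x.
            spatial[i] = [w + RATE * g * (xk - w) for w, xk in zip(spatial[i], x)]
            if prev_bmu is None:
                continue
            # (b) Temporal layer: potentiate the connection from BMU(t-1) to node i ...
            temporal[prev_bmu][i] += RATE * g * (1.0 - temporal[prev_bmu][i])
            # ... and depress the connections reaching node i from all other nodes.
            for k in range(N):
                if k != prev_bmu:
                    temporal[k][i] -= RATE * g * temporal[k][i]

    spatial = [[0.5] * 4 for _ in range(N)]       # toy prototypes over a 4-symbol alphabet
    temporal = [[0.0] * N for _ in range(N)]
    update(spatial, temporal, [1.0, 0.0, 0.0, 0.0], bmu=12, prev_bmu=7)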

8.4 Implications of TSOMs for word processing and storage

TSOMs are dynamic models of string storage. They can be seen as mimicking the self-organizing behaviour of a mental lexicon growing in size as a function of cumulated lexical experience. The more word forms are stored in a TSOM through lexical exposure, the more complex its overall internal structure becomes, with recurrent patterns of morphological structure being internally recoded as shared chains of node activation. This is shown in Figure 8.4 (right) (adapted from Marzi et al. 2012a), where the activation chains for the German past participles gesprochen ('spoken'), gesehen ('seen'), gesagt ('said'), and gefragt ('asked') are unfolded and vertically arranged in a word graph.

It is important to appreciate that this behaviour is the outcome of an internally
generated dynamic, not a function of either direct or indirect supervision. The map is in no way taught what the expected morphological structure of input word forms is. Structure is an emergent property of self-organization and a function of exposure to input representations. In a strict sense, TSOMs offer no output representations. In a broader sense, input and output representations co-exist on the same layer of map circuitry. It is the level of node connectivity that changes through learning. Output morphological structure can thus be defined as an overall patterning of long-term connectivity. Such a radical change of perspective has important implications both theoretically and computationally.

8.4.1 Coding

A fundamental difference between TSOMs and traditional perceptrons is that in TSOMs we distinguish among levels of input coding. In particular, note that the
input vector, peripherally encoded on the input layer in Figure 8.3 (left), is eventually recoded on the map proper. It is the latter level that provides the long-term representation upon which morphological structure is eventually perceived.⁴ Since recoding is the result of learning, different cumulated input patterns may lead to different recoded representations. By neglecting the important distinction between levels of input coding, perceptrons exhibit a principled difficulty in dealing with input representations. Conjunctive coding can naturally deal with issues of input specificity if two conditions are met: (i) there is a topological level of encoding where two nodes that are sensitive to the same symbol type share identical internal representations; and (ii) context-sensitive specialization on symbol tokens is acquired, not wired-in. In TSOMs, both conditions are met: (i) nodes that are selectively sensitive to the same symbol type have overlapping representations on the spatial layer, and (ii) nodes become gradually sensitive to time-bound representations through specialization of their temporal connections. Hence, different input instances of the same symbol receive differently recoded representations, thus effectively dealing with variable binding when the same symbol type occurs more than once in the same input string.

8.4.2 Time vs Space: Differences in recoding strategies

A TSOM recoding strategy may reflect either a bias towards the local context where a symbol is embedded (i.e. its immediately preceding symbols), or a tendency to conjunctively encode the symbol and its position in the string. Choice of either strategy is governed by the relative prominence of parameters α and β in equation (3), with α modulating the map's sensitivity to symbol encoding on the input layer, and β weighing up the map's expectation of upcoming symbols on the basis of past input. The two recoding variants (hereafter dubbed spatio-temporal and temporal respectively) have an impact on the overall topology of the map. In a spatio-temporal map, topologically neighbouring nodes tend to be sensitive to symbol identity, with sub-clusters of nodes being selectively sensitive to context-specific instances of the same symbol. Conversely, in a temporal map neighbouring nodes are sensitive to different symbols taking identical positions in the input string (Marzi et al. 2012a). This is shown in Figure 8.5, plotting the average topological dispersion of BMUs activated by symbols in the 1st, 2nd, ..., 9th position in the input string on a spatio-temporal (upper plot) and a temporal map (lower plot). Dispersion is calculated as the average distance between nodes discharging upon seeing any symbol appearing in a certain position in the input string, and is given as a fraction of 100 over the map's diagonal. Dispersion values thus range between 0 per cent (when all symbols activate the same

4 This is in fact confirmed by what we know about levels of letter representation in the brain, ranging from very concrete location-specific patterns of geometric lines in the occipito-temporal area of the left hemisphere, to more frontal representations abstracting away from physical features of letters such as position in the visual space, case, and font (Dehaene).

Figure 8.5 Topological dispersion of symbols on temporal and spatio-temporal maps, plotted by their position in input words. Dispersion values are given as a fraction of 100 over the map's diagonal.

node, this is true of the start-of-word symbol '#') to 100 per cent, when two nodes lie at the opposite ends of the map's diagonal.

Marzi et al. (2012b) show that the two recoding strategies exhibit a differential
behaviour. Due to their greater sensitivity to temporal ordering, temporal maps are better at recalling an input string by reinstating the corresponding activation chain of BMUs in the appropriate order, when the input string is already familiar to the map. This is due to temporal maps being able to build up precise temporal expectations through learning, and avoid possible confusion arising from repetition of the same symbols at different positions in the input string. On the other hand, spatio-temporal maps are better at identifying substrings that are shared by two or more word forms, even when the substrings occur at different positions in the input string. This seems to be convenient for acquiring word structure in non-templatic morphologies. For a TSOM to be able to perceive the similarity between—say—finden and gefunden, the pieces of morphological structure shared by the two forms must activate identical or topologically neighbouring nodes. Clearly, due to the combinatorial properties of morphological constituents, this is done more easily when recoded symbols are abstracted away from their position in time.

We can measure the sensitivity of a map to a specific morphological constituent by
calculating the topological distance between the chains of BMUs that are activated by the morphological constituent appearing in morphologically related words. For example, we can calculate how close on the map are nodes that discharge upon presentation of a verb stem shared by different forms of the same paradigm. The shorter

Figure 8.6 Alignment plots of the finden paradigm on a temporal (left) and a spatio-temporal (right) map.

the average distance is, the greater the map's sensitivity to the shared morphological structure. If D is such an average distance normalized as a fraction of the map's diagonal (or dispersion value), then 1−D measures the alignment between activation chains discharging on that morphological constituent.

Plots in Figure 8.6 show how well a temporal map (left) and a spatio-temporal map
(right) perceive the alignment between the stem in each inflected form of German finden ('find') and the stems in all other forms of the same verb paradigm. Stem alignment scores (y axis) are plotted against the perceived alignment of each full form with all other forms in the same paradigm (full-form alignment, x axis). Intercept values of the linear interpolation of alignment scores in the two maps show that a spatio-temporal map can align intra-paradigmatic forms better than a temporal map can. In particular, spatio-temporal maps prove to be less sensitive to shifts in stem position, as shown by the different alignment values taken by the form gefunden, which is perceived as an outlier in the temporal map.

This is consistently replicated in the inter-paradigmatic perception of affixes. Box
plots in Figure 8.7 show how scattered are activation chains that discharge upon seeing a few inflectional endings of Italian and German verbs (Marzi et al. 2012b). Activation chains are more widely scattered on temporal maps than they are on spatio-temporal maps. This is true of both German and Italian inflectional endings, confirming a systematic difficulty of temporal maps in perceiving the formal identity of inflectional endings that occur at different positions in word forms.
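
A schematic version of the two measures used here and in Figure 8.6, under our own assumptions about representation (BMU chains as lists of node indices on a square map of side SIDE), is given below: dispersion is the mean pairwise grid distance between the nodes recruited by a given constituent, as a fraction of the map diagonal (multiply by 100 for the percentages of Figure 8.7), and alignment is the 1−D score defined above.

    import math
    from itertools import combinations

    SIDE = 20                                  # a 20 x 20 map, as in Figure 8.3
    DIAGONAL = math.sqrt(2) * (SIDE - 1)

    def grid_distance(i, j):
        return math.dist((i // SIDE, i % SIDE), (j // SIDE, j % SIDE))

    def dispersion(nodes):
        # Mean pairwise distance between the nodes recruited by the same constituent,
        # as a fraction of the map diagonal (0 = all on one node, 1 = opposite corners).
        pairs = list(combinations(nodes, 2))
        if not pairs:
            return 0.0
        return sum(grid_distance(i, j) for i, j in pairs) / (len(pairs) * DIAGONAL)

    def alignment(nodes):
        # 1 - D: higher values mean that the chains discharging on the constituent
        # lie closer together on the map.
        return 1.0 - dispersion(nodes)

    # e.g. the BMUs recruited by one ending across four hypothetical word forms:
    ending_nodes = [381, 382, 399, 360]
    print(dispersion(ending_nodes), alignment(ending_nodes))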

Figure 8.7 Top: (Topological) dispersion values across activation patterns triggered by selected inflectional endings on temporal and spatio-temporal maps for Italian (left: -ARE, -ERE, -IRE, -NO, -I) and German (right: -EN, -EM, -ER, -T, -ST) known word forms. Bottom: (Topological) dispersion values for the same set of inflections calculated on unknown word forms. In all panels, dispersion values are given as a fraction of 100 over the map's diagonal.

It is worth emphasizing, in passing, that a TSOM identifies inflectional endings only on the basis of the formal contrast between fully inflected word forms. In German preterite forms, it is easier for a map to perceive a possible segmentation te-st by aligning glaubte with glaubtest, than the segmentation te-s-t on the basis of the contrast between glaubtest and glaubtet. The latter segmentation requires discontinuous alignment between the final t's in the two forms, which is something the map does more reluctantly, and only under certain distributional conditions. This behaviour appears to be in keeping with Bank and Trommer's 'subanalysis complexity hypothesis' (this volume).

8.4.3 Training independence

Training independence is taken by Marcus to be responsible for perceptrons' inability to generalize connectivity patterns to untrained nodes. As nodes are trained independently, nothing learned in connection with one specific node can be transferred to another node. This is not true of TSOMs, where learning requires topological propagation of adjusted spatial and temporal connections, modulated by the neighbourhood distance between each map's node and the current BMU(t). Topological propagation plays an important role in allowing the map to generalize over unseen activation

Figure 8.8 BMU activation chains for vediamo-vedete-crediamo on a 20×20 map (left) and their word-graph representation (right). In the graph, nodes that are strongly activated by the novel form credete are highlighted in grey. Topological proximity of D-nodes allows the map to bridge the transitional gap between cred- and -ete.

chains. Figure 8.8 (left) shows the activation chains triggered by the Italian verb forms vediamo 'we see', vedete 'you see' (second person plural), and crediamo 'we believe', on a 20×20 map trained on all these forms. The same map is eventually prompted with the unknown credete 'you believe' (second person plural), to test its capacity to anticipate a novel form. The crucial generalization step here involves the circled area including D-nodes that are activated upon seeing stem-final d in either ved- or cred-. When credete is shown to the map, spread of temporal activation over the circled area will raise an expectation for -ete to follow, based on the activation chain of vedete. The word graph in Figure 8.8 (right) illustrates this. Topological propagation of temporal connections over neighbouring map nodes enforces inter-node training dependence, allowing expectations of upcoming symbols to be transferred from one activation chain to another, based on topological contiguity.

An important consequence of this generalization strategy is that inferences are
made cautiously, on the basis of a local, analogy-based, inter-paradigmatic extension. This seems to be a logically necessary step to take for a learner to be able to capture the wide range of stem variation and stem selection phenomena attested in inflectional systems of medium and high complexity. Another interesting implication of the generalization strategy enforced by a TSOM is that inflectional endings are better aligned if they are immediately preceded by the same left context. Hence, not any

Figure 8.9 Correlation coefficients between alignment scores and recall scores for novel words, plotted by letter positions relative to the stem-ending boundary. Position order on the x axis is right to left: 0 corresponds to the first letter of the inflectional ending; negative values correspond to letters coming after the first letter of the inflectional ending, and positive values index letters preceding the inflectional ending. The plot shows that correlation rapidly decreases as an inverse function of the letter distance from the stem-ending boundary on either side.

analogy between two stems will work, but only a shared sub-string in the immediate context of the inflectional ending. This is shown by the graph of Figure 8.9, plotting the correlation between how accurately the map is recalling a novel input word, and how well the novel word's activation chain is aligned with the closest activation chain of a known word. As we move away from the stem-ending boundary of the novel word on either side, correlation of alignment scores with recall accuracy becomes increasingly weaker and statistically less significant.

This is exactly what Albright and Hayes' Minimal Generalization algorithm would
expect: alignment between nodes activated by a novel word and nodes activated by known words matters less as nodes lie further away from the morphemic boundary. Unlike Minimal Generalization, however, which requires considerable a priori knowledge about word structure, mapping rules, and the specific structural environment for their application, the generalization bias of TSOMs follows from general principles of topological memory and self-organization.

8.4.4 Universally quantified variables

According to Marcus, a default morphological relation like (1) (repeated in (4) for the reader's convenience) requires the notion of a copy operation over a universally
quantified variable. An abstractive computational model of morphological competence such as the one embodied by the TSOM of Figure 8.8 does not require a copy operation.

(4) PROGR(Y) = Y + ing,

In fact, equation (4) only imposes the requirement that the base form and progressive form of a regular English verb share the same stem. In TSOMs, sharing the same stem implies that the same morphological structure is used over again. There is no computational implication that this structure is copied from one level of circuitry to another such level. This is well illustrated by the activation chains plotted in Figure 8.8 for the Italian verb forms vediamo and vedete, which appear to include the same word-initial activation chain.

Another interesting related issue is whether a TSOM can entertain the notion of an
abstract morphological process such as prefixation. If a map A, trained on prefixed word forms, is more prone to recognize a word form with an unfamiliar prefix than another map B never trained on prefixed words, we can say that map A developed an expectation for a word to undergo prefixation irrespective of the specific prefix. To ascertain this, we trained two maps on two sets of partially instantiated paradigms: the first map was exposed to fifty Italian verb paradigms, and the second map to fifty German verb paradigms. Each paradigm contained all present indicative forms, all past tense forms, the infinitive, past participle, and gerund/present participle forms, for a total of fifteen paradigm cells. Whereas Italian past participles are mostly suffixed (with occasional stem alternation), German past participles are typically (but not always) prefixed with ge-. After training, we assessed to what extent the two maps were able to align stems in <PREF-X, X> pairs, where X is a known infinitive, and PREF an unknown prefix. Results are shown in Figure 8.10, where stem dispersion is tested on fifty made-up Italian and German pairs, instantiating a <ri-X, X> (e.g. riprendere, prendere) and a <zu-X, X> (e.g. zumachen, machen) pattern respectively. The German map is shown to consistently align prefixed stems with base stems better than the Italian map does, with over 50 per cent of German pairs activating exactly the same BMUs, thus proving to be more tolerant towards variability of stem position in the input string. This evidence cannot simply be interpreted as the result of a more flexible coding strategy (e.g. spatio-temporal recoding as opposed to temporal) since both maps were trained with the same parameter set. Rather, it is the map's acquired familiarity with prefixed past participles in German that makes it easier for similarly (but not identically) prefixed patterns to be recognized.

8.5 General discussion

Human lexical competence is known to require the fundamental ability to retain sequences of linguistic items (e.g. letters, syllables, morphemes, or words) in the

Figure 8.10 Stem dispersion over <PREF-X, X> pairs in Italian and German verb forms. Dispersion values are given as a fraction of 100 over the map's diagonal.

working memory (Gathercole and Baddeley 1989, Papagno et al. 1991). Speakersappear to be sensitive to frequency effects in the presentation of temporal sequences ofverbal stimuli. Items that are frequently sequenced together are stored in the long-termmemory as single chunks, and accessed and executed as though they had no internalstructure. This increases fluency and eases comprehension. Moreover, it also explainsthe possibility to retain longer sequences in short-termmemorywhen familiar chunksare presented. The short-term span is understood to consist of only a limited number(ranging from three to five according to recent estimates, e.g. Cowan 2001) of availablestore units. A memory chunk takes one store unit of the short-term span irrespectiveof length, thus leaving more room for longer sequences to be temporarily retained.Furthermore, chunking produces levels of hierarchical organization of the inputstream:what is perceived as a temporal sequence of items at one levelmay be perceivedas a single unit on a higher level, to become part of more complex sequences (Hay andBaayen 2003). Finally, parts belonging to high-frequency chunks tend to resist beingperceived as autonomous elements in their own right and being used independently.As a further implication of this ‘wholeness’ effect, frequently used chunks do notparticipate in larger word families such as inflectional paradigms (Marzi et al. 2012b).We take this type of evidence to illustrate principles of lexical self-organization and

to shed light on the intimate interplay between processing and storage in languageacquisition. TSOMs provide a general framework for putting algorithmic hypothesesof the processing–storage interaction to the severe empirical test of a computerimplementation. Furthermore, unlike classical perceptron-like neural architectures

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Vito Pirrelli, Marcello Ferro, and Claudia Marzi

trained on back-propagation, they allow scholars to entertain a purely abstractive viewofmorphological competence, based on a number of realistic assumptions concerningacquisition of word structure. We recapitulate them in what follows.

Ecology of training conditions. In modelling word acquisition, it is important to beclear about what is provided in input, and what is expected to be acquired. Choiceof input representation may in fact prejudge morphological generalizations to a con-siderable extent (see also Stump and Finkel’s consideration of representational issues,this volume). Machine learning algorithms tend to make fairly specific assumptionson word representations. Even memory-based approaches to morphology induction(Daelemans and van den Bosch 2005), whichmake ‘lazy’ use of abstraction-free exem-plars, are no exception. To ensure that only features associated with aligned symbols(letters or phonological segments) are matched, Keuleers and Daelemans (2007) alignexemplar representations to the right and cast them into a syllabic template. Indeed,for most European languages, we can construe a fixed-length vector representationby aligning input words to the right, since inflection in those languages typicallyinvolves suffixation and sensitivity to morpheme boundaries. However, this type ofencoding presupposes considerable knowledge of morphology of the target languageand does not possibly work with prefixation, circumfixation, and non-concatenativemorphological processes in general. Likewise, most current unsupervised algorithms(see Hammarström and Borin 2011 for a recent survey) model morphology learningas a segmentation task, assuming a hard-wired linear correspondence between sub-lexical strings and morphological structure. However, both highly fusional and non-concatenative morphologies lend themselves grudgingly to being segmented intolinearly concatenated morphemes. In assuming that word forms simply consist of alinear arrangement of time-bound symbols, TSOMs take aminimalist view onmattersof input representation. Differences in self-organization and recoding are the by-product of acquired sensitivity of map nodes to recurrent morphological patterns inthe training data. In our view, this represents a principled solution to the notoriousdifficulty of multi-layered perceptrons in dealing with issues of input binding andcoding. Finally, it addresses the more general issue of avoiding input representationsthat surreptitiously convey structure or implicitly provide themorphological informa-tion to be acquired. Models of word acquisition should be more valued for adaptingthemselves to themorphological structure of a target language, than for their inductiveor representational bias.

Storage and processing. In models of memory, mnemonic representations areacquired, not given. In turn, they affect acquisition, due to their being stronglyimplicated in memory access, recognition, and recall. TSOMs process words by acti-vating chains of memory nodes. Parts of these chains are not activated by individualwords only, but by classes of morphologically related words: for example, all formssharing the same stem in regular paradigms, or all paradigmatically homologousforms sharing the same suffix. An important implication of this view is that both

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Computational complexity of abstractive morphology

lexical forms and sub-lexical constituents are concurrently represented on the samelevel of circuitry by Hebbian patterns of connectivity linking stored representations.This eliminates the need for a copy operator transferring information from one levelto another, according to a mechanistic interpretation of the variable Y in a defaultgeneralization like PROGR(Y) = Y + ing. Moreover, robust recoding of time-boundinformation effectively minimizes the risk of creating spurious patterns due to thesuperposition of lexical and sub-lexical chains.Another consequence of this view is that morphological productivity is no longer

conceptualized as a mapping function between a (unique) lexical base (stored in thelexicon) and an inflected form (generated on-line). Any inflected form can act as abase for any other inflected form. It is important to appreciate, however, that a base isnot defined in terms of structural factors such as economy of storage or simplicityof mapping operations, but rather in terms of usage-oriented factors such as timeof acquisition, frequency-based entrenchment, probabilistic support of implicativerelations, and gradient capability of discriminating inflectional class among alternativehypotheses. As long as we are ready to accept that lexical representations are notdichotomized from lexical entailments, the reconceptualization of word structureoffered by TSOMs is in line with a genuinely abstractive view of morphology.

Generalization and training. Marcus (2001) correctly emphasizes that no general-ization is possible with training independence. Generalization implies propagation,transferring of information from one node to another node, not just adaptation ofconnectivity between individual input and output nodes. Propagation is also inde-pendently needed to address another apparent paradox in incremental acquisition.Human speakers are very good at inferring generalizations on the basis of local analo-gies, involving the target word and its closest neighbours (Albright 2002). However,analogy is a pre-theoretical notion and any linguistically relevant analogy must takeinto account global information, such as the overall number of forms undergoing aparticular change under structurally governed conditions. How can global analogiesbe inferred on the basis of local processing steps? Topological propagation offers asolution to this problem. Although a BMU is identified on the basis of a pairwisesimilarity between each node and the current input stimulus only, the connectivitypattern of a BMU is propagated topologically on the spatial and temporal layers ofa map. This prompts information transfer but also competition between conflictingpatterns of local analogy. It is an important result of our simulations that the neateffect of competition between local analogies is the emergence of a global, paradigm-sensitive notion of formal analogy (Marzi et al. 2012a).All issues raised here appear to address the potential use of computational models

of language acquisition for theoretical linguists and psycho-linguists focusing on thenature of grammar representations. We argue that there is wide room for cross-disciplinary inquiry and synergy in this area, and that the search for the mostappropriate nature of mental representations for words and sub-word constituents,

OUP CORRECTED PROOF – FINAL, 5/3/2015, SPi

Vito Pirrelli, Marcello Ferro, and Claudia Marzi

and a better understanding of their interaction in a dynamic system, can providesuch a common ground. At our current level of understanding, it is very difficult toestablish a direct correspondence (Clahsen 2006) between language-related categoriesand macro-functions (rules vs exceptions, grammar vs lexicon) on the one hand, andneuro-physiological correlates on the other hand. As an alternative approach to theproblem, in the present chapter we have focused on a bottom-up investigation ofthe computational complexity and dynamic of psycho-cognitivemicro-functions (e.g.perception, storage, alignment, and recall of time series) to assess their involvementin language processing, according to an indirect correspondence hypothesis. Albeitindirect, such a perspective strikes us as conducive to non-trivial theoretical andcomputational findings.To date, progress in cross-disciplinary approaches to language inquiry has been

hindered by the enormity of the gap between our understanding of some low-level properties of the brain, on the one hand, and some very high-level propertiesof the language architecture on the other. The research perspective entertained inthese pages highlights some virtues of modelling, on a computer, testable processingarchitectures as dynamic systems. Arguably, one of the main such virtues lies inthe possibility that by implementing a few basic computational functions and theirdynamic interrelations, we can eventually provide some intermediate-level buildingblocks that would make it easier to relate language to cognitive neuroscience.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigmcomplexity: The case of Old andMiddle Indic declension

PAOLO M I L I Z IA

. The principle of compensation

According to the principle of compensation, formulated in 1940 by Viggo Brøndal(1940: 102), a marked category does not typically allow more subdistinctions than itsunmarked counterparts (cf. also Jakobson 1939: 146, Greenberg 1966: 27 ff., Baermanet al. 2005: 22 f.). The canonical instance of such a principle involves two categories thatmay be seen as hierarchically organized so that, in the presence of marked values of thesuperordinate one, we can expect to have partial or total syncretism affecting the valuesof the other. Massive case syncretism in the dual number of the nominal declensionof ancient Indo-European languages (assuming number to be superordinate to case inthese systems) neatly exemplifies such a tendency (cf. Brøndal 1940: 103).

On the other hand, as the complexity of the considered paradigm increases, and, inparticular, when the interacting grammatical categories are more than two, syncretismpatterns can emerge for which it may be difficult to assess whether, and to what extent,the principle of compensation is complied with. In addition to the theoretical problemrelated to the very notion of markedness (which we shall touch on later), it shouldbe noticed, first, that Brøndal’s formulation makes reference to a category hierarchythat, in fact, is not always easy to define, since in a paradigm the same category mayappear superordinate or subordinate to another according to the different syncretisminstances.1 Moreover, one might wonder which prediction should be made for apossible three-member hierarchy A > B > C. In other words, given a marked valueof A, is the principle of compensation complied with to a greater extent when both B

1 A definition of the concept of ‘domination réciproque’ is already in Hjelmslev (: ).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

and C are morphologically neutralized or when only the values of B are syncretizedwhile those of C are distinguished?

In the present contribution, we shall propose some observations concerning prob-lems of this type using as a starting point the adjectival declension of the -a/a-stems in Old and Middle Indic. Indeed this paradigm represents a case worthy ofattention in that, as we shall see, the position of its syncretism patterns along the threecategorial dimensions of number, gender, and case shows noticeable properties in bothsynchronic and diachronic perspective.2

Among the three categories for which the Old Indic adjective is inflected, numberseems to be the least prone to syncretism, while the marked values of number,i.e. plural and dual, tend to represent the context for syncretism of gender or casevalues. Interestingly, in Pali the inflection of the adjectival stems in -a/a- showsthat a slab of the paradigm, namely the feminine gender, if we take it in isolation,represents a seeming counterexample to the principle of compensation, since theplural distinguishes more oblique cases (instrumental/ablative -ahi, genitive -anam. ,locative -asu) than the singular (where -aya indicates instrumental, ablative, genitive,and, in competition with -ayam. , locative).

In order to explain such state of affairs a promising approach may be to reformu-late the principle of compensation in terms of frequency and information content.According to the formulation of some scholars (cf. Cairns 1986: 18) the principle ofcompensation can be seen as the effect of a general tendency to avoid an excessivenumber of marked values. Along this line of argument, for instance, it might be saidthat in Latin or in Old Indic the ablative plural is syncretic with the dative pluralbecause both its case and number values are marked. This conception entails that therelevant property is the ‘number of marked values’. But in what sense may it be saidthat markedness values are addable items, and where in the morphological systemdoes this operation of markedness-counting have its place?

The position defended here is that marked values are not addable by themselves,because they are heterogeneous items, but their markedness is indeed addable if weinterpret it in terms of frequency. Indeed, this approach (cf. Hawkins 2004: 64 ff.,Haspelmath 2006), also with its corollary that a tendency to syncretism may be relatedto low-frequency values (s. Greenberg 1966: 68–9; cf. also Croft 2003: 113), is not newin the study of morphology. Moreover, in recent years, a line of investigation thatdescribes paradigm relationships in terms of information theory has been proven tobe scientifically fruitful (see, e.g., Baayen et al. 2011; Milin et al. 2009; Kostić and Božić2007; Moscoso del Prado Martín et al. 2004).

2 In fact, the same properties are largely shared by the noun system if we consider inflectional classoppositions instead of gender oppositions. Indeed, in Old and Middle Indic nominal, inflectional class andgender are closely related (cf. also fn. ).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigm complexity

Crucially, relative frequency and information content (henceforth IC)—i.e. thebase-2 cologarithm of the relative frequency (Shannon 1948)—are purely numerical,non-categorical, notions. Thus, if markedness is interpreted as depending on fre-quency or IC, then the operation of summing poses no problem from a theoreticalpoint of view. Moreover, if paradigms are described in information-theoretical terms,then the IC of an exponent will be due to the sum of the ICs associated with eachof the morphosyntactic properties that are cumulatively expressed by that exponent.In other words, the operation of summing is automatically involved by cumulativeexponence.

Along this line of analysis, we may assume that a paradigm cell associated with‘marked’—in the sense of being relatively infrequent—values, such as for instance alocative, feminine, dual, will be likely to share its exponent with other cells, becausea non-syncretic exponent specific to that cell would have a too low frequency, i.e. atoo high IC. Such a formulation, in fact, leaves undetermined what frequency valueshould be considered relevantly low. Yet, it would be misguided to look for a universalcritical value under which syncretism would become likely or more likely: indeed, ifwe compare a paradigm with four cells with a paradigm with seventy-two cells (like,e.g., the Old Indic adjective), then a cell of the latter is expected to be less frequentthan a cell of the former, each within its paradigm; and that, of course, independentlyof markedness relationships.

Therefore, we propose, tentatively, that the relevant point of reference for eachparadigm is the average frequency value of its cells. To make such an assumptionis, in fact, to hypothesize that one of the aims of syncretism is to compensate forthe difference in frequencies shown by the cells of a paradigm by allowing two—or more—rare morphosyntactic property sets to share the same exponent.3 In otherwords, our working hypothesis will be that one of the forces operating within themorphological system strives to make all exponents have the same frequency or, atany rate, to reach an equilibrium point whereby the distribution of the exponents iscloser to the equiprobable distribution than the one that would be generated by a one-to-one mapping between exponents and morphosyntactic property sets.

Given our premises, the question arises of which information-theoretic quantitiesare most suitable to be used as heuristic indicators of the degree of compliance with theprinciple of compensation. In regard to this issue, we can look, in the first instance, atthe Kullback–Leibler divergence between the observed and the uniform distribution(henceforth D) and at the Shannon redundancy (henceforth R), with the caveat that

3 Such a claim may be viewed as complementary to the assumption that systematic inflectionalhomonymy favours memorability in paradigms exhibiting cumulation (Carstairs ; : –).See also Brown et al. () for the observation that, in a rule-based framework distinguishing betweensyncretism due to exponent underspecification and syncretism due to explicit rules of referral, the lattermight also increase memorability.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

we do not rule out that other measures might prove to be able to describe the linguisticdata more exactly.

D and R are related to entropy (H), which represents the average informationcontent of a symbol, i.e. in our case, of an exponent of the inflectional paradigm.

It should be noted that such a use of entropy is different from that described byStump and Finkel in Chapter 7. Importantly, in the present chapter the probabilityvalues relevant to what we call ‘entropy of the exponent set’ are always related to thetext (token) frequency of the items.

The entropy of the exponent set is defined according to the following expression,where X is the set of the exponents {x1, x2 . . . xN}, N is the number of the exponentsbelonging to X, and P(xi) is the probability (i.e. the relative frequency) of an expo-nent xi:

(1) H(x) = −∑i=1

NP(xi)log2P(xi)

D and R represent, respectively, the difference (see expression 2; cf. Rissanen 2007:18–19) and the fractional difference (see expression 3; cf. Shannon 1948) between themaximum possible value of entropy (which is equal to log2 N) and its observed value.

(2) D = log2N − H(X)

(3) R = 1 − H(X)log2N

H is at its maximum when all probability values are equal. Hence, by definition, whenall symbols have the same probability, both D and R are minimized and equal tozero.4 Therefore, it seems to be justified to interpret a decrease in these quantities5 as ashift towards an exponent distribution closer to the equiprobable case and, therefore,towards a paradigm organization more in keeping with the principle of compensation.

4 Unlike other information-theoretic measures proposed for the description of morphologicalparadigms (cf. Kostić et al. ; s. also Milin et al. : ff.), the entropy of the exponent set as we havedefined it does not involve a weighting of the probabilities of the exponents according to their functionsand meanings.

5 While D can be considered as a measure of the distance from the uniform distribution, the use of R asan indicator is compatible with the assumption of a system organization leading to redundancy reduction.Both indexes predict that syncretism is disfavoured between high-frequency cells and favoured betweenlow-frequency ones. However, the remark must be made that there is an intermediate range of frequencyvalues for which appearance of syncretism causes a decrease in D and an increase in R. More precisely, itcan be shown that, in the case of a syncretism involving two cells a and b with probabilities pa and pb , Ddecreases if the following inequality is satisfied:

pa + pb <log2N − log2(N − 1)

H(pa

pa+pb, pb

pa+pb)

On the other hand, the condition for R to decrease is the satisfaction of an inequality identical to the oneabove except for the fact that the right term is multiplied by (-R).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigm complexity

Table . Relative frequencies of inflectional values on the basis ofLanman (1880)

numbersingular 0.703plural 0.250dual 0.047

gendermasculine 0.621neuter 0.208feminine 0.171

casenominative 0.399

non-oblique 0.726vocative 0.084accusative 0.242instrumental 0.082

oblique 0.274dative 0.048ablative 0.011genitive 0.075locative 0.059

. Some observations on the Old Indic adjectives in -a-

The data of the Old Indic adjectival declension (based on Lanman 1880) are effective inshowing how different from each other the relative frequencies of the morphosyntacticproperty sets can be for a paradigm involving three morphological categories (cf.Table 9.16 and Figure 9.17): compare, for instance, a property set combining threefrequent values such as the nom. masc. sg. with another combining three infrequentvalues such as the loc. fem. du. (top and bottom of the graph in Figure 9.1, respectively).

If we look at the paradigm of the Old Indic adjectives in -a/a- (Table 9.2,8 cf.Thumb and Hauschild 1959: 30–50), we can see that the negative correlation betweenfrequency of morphosyntactic property sets and proneness to syncretism is manifest.In addition to the already noticed ‘dual effect’, it is particularly remarkable that theablative, which is the grammatical case of the lowest frequency, is, at the same time,one of the two oblique cases showing syncretism in the plural (with the dative),and one of the two cases exhibiting syncretism in the feminine singular (with thegenitive).

6 As Lanman’s tables (: –) conflate nominative and vocative in the plural, nominative,accusative, and vocative in the dual and in the neuter plural, and nominative and accusative in the neutersingular, the values concerning the three non-oblique cases have been completed so that the proportionbetween the frequencies of two morphosyntactic properties corresponds to the one resulting from thesections of Lanman’s tables where those properties are distinguished.

7 The graph is based on the data in Table ., and represents an idealization since it treats gender, number,and case as if they were mutually completely independent variables. The volume of each parallelepipedal tileis proportional (with the caveat just noted) to the relative frequency of the corresponding morphosyntacticproperty set.

8 In Table . a probability value is given for each exponent of the paradigm on the basis of a calculationof the relative frequency of each morphosyntactic property set according to Lanman’s data (). Forsyncretic exponents, the table records the sum of the relative frequencies of the property sets associatedwith them. The endings -e of the locative masculine/neuter singular, the vocative feminine singular, and thenon-oblique feminine/neuter dual have been considered as different homonymous exponents.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

Fem

Ntr

GENDER

LGAbl

DI

CASE V

A

Masc

NSing

Plur

NUMBER

Du

Figure . A graph representing the relative frequencies of morphosyntactic property com-binations in Old Indic adjectival paradigms

Now, in order to exemplify how the defined indicators may be used to assesssyncretism patterns, we can consider hypothetical variants of the paradigm in Table9.2. To pick a clear example of a ‘compensation-compliant’ pattern, reduction ofsyncretism in the dual by introducing, e.g. an opposition between a masculine-neuteroblique exponent **-aebhyam and a corresponding feminine **-abhyam, would pro-duce an increase in D and R. This means that the presence of this syncretism instancecontributes to keep D and R low (cf. Table 9.3, first variant).9

9 This is an exemplifying calculation based on the data shown in Table ., which leaves out ofconsideration the information content related to the inflectional class and the fact that certain exponents(e.g. gen. -asya) are specific to the -a/a-class, while others are shared by more classes (e.g. voc. zero).Moreover, OI adjectives follow the same inflection shown by substantives with the specific property thatcertain adjectival inflectional classes form a feminine subparadigm by creating a peculiar stem in -a-(for masc./ntr. stems in -a-), in -u- (for masc./ntr. stems in -u-), or in -ı-. Therefore, a comprehensive

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigm complexity

Table . Old Indic -a-/-a- adjective declension (with the omission of someending variants) and relative frequencies of the sets of inflectional value arraysassociated with the different exponents

singular plural dualmasc. neuter fem. neuter masc. fem. m. n. f.

non- obl.

nom. -as-am

-ā-āni -ās -au -evoc. -a -e

acc. -ām -ān

obl.

instr. -ena -ayā -ais/-ebhis -ābhis-ābhyāmdat. -āya -āyai -ebhyas -ābhyasabl. -āt -āyāsgen. -asya -ānām-ayosloc. -e -āyām -es.u -āsu

singular plural dualm. n. f. n. m. f. m. n.| f.

nom.voc.acc.instr.dat.abl.gen.loc.

.186.210

.029.039 .117 .033 .010.055 .004

.024 .019.029 .016 .025 .011

.001.035 .009 .004 .001.008 .007.051 .017 .003.036 .005 .010 .006

Importantly, this kind of comparison allows us to identify those syncretisminstances that cannot be traced back to the principle of compensation and, in fact,are in conflict with it. Thus, the introduction for the feminine nom./acc./voc. plural ofan ending different from the one of the nom. pl. (-as) would make D and R decreaseand not increase (cf. Table 9.3, second variant).10

On the other hand, it is clear that in Old Indic nominal paradigms the principle ofcompensation coexists and interacts with a series of major structuring principles.11

Thus, the cases are organized into a non-oblique and an oblique subset; the masculineand the neuter behave as two subcategories (almost systematically neutralized in theoblique) of a masculine/neuter gender; the masculine is prone to syncretism withthe feminine in the non-a/a- classes, while it generally rejects syncretism with the

information-theoretic description of the whole noun system, which is beyond the purpose of the presentchapter, should also incorporate this sort of data.

10 We can observe that, given the entropy (H) and the number of exponents (N) of the paradigm inTable ., for a syncretism between two cells having equal probabilities pa = pb to cause a decrease in D,the sum pa + pb must be less than about .. For R to decrease the same sum must be less than about .(cf. fn. ).

11 See Koenig and Michelson (this volume), for an example of how a series of structuring principles (thatcan be formalized by means of rules with underspecified antecedent and rules of referrals) may lead to asystem of actual morphosyntactic distinctions that is dramatically different, and poorer, than the set of thepotential morphosyntacic distinctions.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

Table . R and D values for some hypothetical variants of theparadigm in Table 9.2: (1) with distinction between m./ntr. and fem.in the instr./dat./abl. du.; (2) with distinction between m. nom. pl. andfem. non-obl. pl.; (3) with oblique-case syncretism in the fem. sg.; (4)with oblique-case syncretism in the m./ntr. sg.

N R D

Paradigm 29 0.2078 1.0097

N R R variation D D variation

Variant 1 30 0.2156 +0.0078 1.0582 +0.0485Variant 2 30 0.1932 −0.0146 0.9478 −0.0619Variant 3 26 0.1962 −0.0116 0.9222 −0.0875Variant 4 25 0.2450 +0.0372 1.1378 +0.1281

neuter in the nominative and vocative cases; the neuter systematically syncretisesnon-oblique cases. These facts may be thought of as due, at least partially, to seman-tic/functional grounds.12 More generally, the here hypothesized tendency towardsequiprobability is to be viewed as one of several factors that may influence inflectionalstructures.13

. Compensation and language change

It is important to specify that, in our view, the principle of compensation, reformulatedin information-theoretical terms, is to be intended as a factor capable of conditioningthe diachronic development of a language rather than as a rule that operates synchron-ically at the morphological level.

12 Clearly, a crucial fact is that one of the major functions of non-oblique cases is to express syntacticrelations, particularly the subject/non-subject opposition. Thus, to a certain degree, non-oblique casesyncretism is balanced by the possibility of recovering syntactic structure from other elements such asword order or verbal agreement. Significantly, non-oblique syncretism seems to be cross-linguisticallycommoner than syncretism between oblique cases (cf. Baerman et al. : ). As for the OI tendency tostructurate gender distinction in non-oblique cases along an opposition masculine–feminine vs neuter, thismay depend on the fact that, as far as the core of nouns with semantic gender assignment is concerned, suchan opposition coincides with the semantic feature [± human]. Indeed, in interaction with other categoriessuch as topicality and agentivity, humanness is typically associated with the syntactic relation of subject.

13 For more details about principles and tendencies that are or may be in conflict with the preferencefor equiprobable exponents, see Milizia () (especially Chapters and ). A point to be mentionedhere concerns the possible assumption of a general tendency towards the phonological optimization ofexponents, in accordance with which more frequent exponents tend to be phonologically shorter, or simpler,than less frequent ones. Indeed, as I attempted to show, such kind of optimization has the side-effect offavouring the appearance of anti-compensative syncretism.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigm complexity

In other words, our claim is that, other things being equal, i.e. apart from pressuresexerted by the phonological and the syntactic level, a diachronic change that isconsistent with the principle of compensation (i.e. that makes the proposed indicatorsdecrease) will be more likely to occur than a change that goes in the opposite direction.

Now, for a paradigm exhibiting cumulative exponence, the diachronic changes thathave the desired effect are of at least three types: (1) syncretization of low-frequencycells; (2) introduction of formal distinctions eliminating instances of syncretism thatinvolve high-frequency cells; (3) shift from cumulative to separate exponence incorrespondence with low-frequency cells.

Types (1) and (2) directly follow from our previous reasoning. Type (3) is alsoevidently ‘compensation-compliant’ in that it dispenses the system with having inflec-tions limited to ultra-rare morphosyntactic property combinations.

. The Pali development

Interesting observations can be made if we compare the Old Indic paradigm withits Middle Indic descendant in the Pali language (Table 9.4, cf. Oberlies 2001: 141).The morphosyntactic properties are the same as in Old Indic except for the loss ofthe dual number and the near-complete loss of the dative case (with its functionspredominantly subsumed by the genitive).14

The ablative plural, left without its ancient syncretism companion (i.e. the dativeplural), finds a new fellow in the instrumental plural.15 The oblique feminine singularis totally or almost totally syncretic for case (depending on which locative variant is

Table . Pali -a-/-a- adjective declension

singular pluralmasc. neuter fem. neuter masc. fem.

non-obl.nom. -o

-am. -ā

-āni -ā -ā/-āyovoc. -a -eacc. -am. -e/-āni

obl.

instr. -ena-āya -ā-hiabl. -ā/-amhā/-asmā/-ato

gen. -assa -ānam. loc. -e/-amhi /-asmi(-m. ) -āya(-m. ) -e-su

-e-hi

-ā-su

14 As for the remnants of the -aya- dative, see von Hinüber : ff. These forms, having final func-tion and being predominantly associated with action nouns, seem to have moved towards the derivationalend of the inflectional–derivational continuum.

15 In Pali, the ablative continues to be the least frequent case. A text sample containing occurrencesof nominals has given the following frequency values for the non-oblique cases: instr. .; abl. .;gen.: .; loc.: ..

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

Table . Syncretism in the marked gender and in the markednumber in Pali -a-/-a- adjectives

femininesingular plural

instr.-āya

-ā-hiabl.gen. -ānam.loc. -ā-su

pluralmasc. / neuter fem.

instr. -e-hi -ā-hiabl.gen -ānam. loc. -e-su -ā-su

selected) and this development is not phonologically expected.16 These changes clearlycomply with the principle of compensation (type 1 according to the classificationsketched earlier). Indeed, it may be noted that, if hypothetically applied to the OldIndic paradigm, syncretization of all the oblique cases in the feminine singularinvolves a decrease in D and R, while the same is not true if syncretization is applied,e.g., to the masculine/neuter slab (cf. Table 3, third and fourth variants).

Moreover, in the non-oblique plural, Old Indic had an ending -as for nom./voc.masc. and nom./voc./acc. fem, but this syncretism is lost in the Pali paradigm variant17

characterized by the ending -ayo, which is specific to the non-oblique feminine plural.As we have already seen, the loss of the syncretism in question (a type 2 development)also involves a shift of the exponent distribution towards equiprobability. Thus, evenif other systemic forces that might have led to this syncretism loss are easy to imagine(homonymy avoidance; analogical pressure of the endings of the -ı- class, cf. Oberlies2001: 150), our starting hypothesis predicts that it might have been favoured—andcertainly was not hindered—by the principle of compensation.

Now, as we mentioned at the beginning of this chapter, the subparadigm of the femi-nine oblique cases exhibits an interesting phenomenon in that the plural distinguishesmore case forms than the singular.

Is this in contradiction with the principle of compensation? In fact, if we look atthe broader picture, we can see that for the oblique cases of the marked number, i.e.the plural, the scarceness of case syncretism is somehow balanced by the propertiescharacterizing the expression of gender in the same paradigm cells. We observe two

16 In the instr. sing. in -aya the length of the vowel a is unexpected (cf. OI -aya); also unexpected is theloss of the final consonant in the -aya locative.

17 For a literary language with a complex tradition like Pali (cf. Oberlies : –, with furtherreferences), it is difficult to say whether a variation of this kind should be ascribed to the presence of twopossibilities in the system or to a diasystematic variation coming to the surface in our texts. This issue, atany rate, does not affect the core of our argument.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigm complexity

relevant phenomena: the presence of gender syncretism18 in the genitive in -anam andthe tendency to disaggregate cumulative exponence in the other oblique cases. Indeed,the exponents -ehi; -ahi; -esu; -asu can be analysed as sequences of two morphs if weposit 1) a morph -e- peculiar to the masculine/neuter (plural), 2) a morph -a- peculiarto the feminine (plural), 3) a morph -hi marking the instrumental/ablative plural, and4) a morph -su marking the locative plural.

It is true that this segmentability of the oblique plural endings was in part alreadypresent in Old Indic, where, moreover, -bhis and -su were synchronically operatingalso as exponents of the consonantal declensions. Nevertheless, it is significant that thephonological decomposability was increased in Pali by two specific changes: the firstis the elimination of the non-segmentable variant of the instr. pl. -ais (which was theending inherited from Proto-Indo-European) in favour of the segmentable -ebhis (>Middle Indic -ehi), which is a secondary analogical creation (cf. Thumb and Hauschild1959: 36); the second is the introduction of new forms of accusative masculine pluralin -e.19 Indeed, the -e-accusatives may be analysed as containing the morph -e plusa zero case-inflection and may be thus considered to be parallel to the forms in -ehiand -esu.

Now, as noted, these phenomena of partial change from cumulative to separateexponence—or, perhaps better said, to semi-separate exponence since in our formsnumber and case continue to be canonically cumulated—can be seen as a means ofcomplying with the principle of compensation (type 3 development).20 Namely, in thecase of separate exponence we have inflections that appear in more paradigm cells,like in syncretism, and that, therefore, have a frequency higher than that of the fullycumulative non-segmented exponents (see Milizia 2014 for a more comprehensivetreatment of this issue).

. Parallel developments

The phenomena that we have just described have striking parallels in other Indo-European languages (Table 9.6).

In the definite adjectival paradigm of Old Church Slavic (Table 6a, cf. Huntley1993: 147), the feminine singular shows two instances of case syncretism, and, atthe same time, the plural and dual oblique are totally syncretic for gender. In the

18 The appearance of gender syncretism in the context of non-singular number values is a cross-linguistically recurrent phenomenon that complies with the principle of compensation. An interestinginstance of this kind, by which the distribution of different gender syncretism patterns is related to wordclass distinctions, is exhibited by Archi (Chumakina and Corbett, this volume).

19 The masculine accusative plural in -e is in competition with a variant in -ani that is homonymouswith the nom./acc./voc. ntr. and reproduces the gender/case syncretism pattern of the singular ending -am. .

20 A distinction such as, e.g., the one between masculine (endings -va, -ta) and non-masculine (endings-ve, -te) in the dual of the Slovene verb (s. Plank and Schellinger : f.) should be analysed along thisline (compare the numeral masc. dva, non-masc. dve ‘two’).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

Table . Old Church Slavic (a) and Russian (b) definite adjective

a.singular plural dual

masc. neuter fem. neuter masc. fem. m. n. f.non-obl.

nom. -ŭjĭ -oje -aja -aja -iji -yję -aja -ějiacc. =nom./gen. -ojo

obl.instr. -yjimĭ -yjimi -yjimadat. -ujemu -ěji

-yjimŭloc. -ějemĭ -yjixŭ -ujugen. -ajego -yję

b.singular plural

masc. neuter fem. masc./neuter/fem.non-obl.

nom. -yj -oe -aja -yeacc. =nom./gen. -uju =nom/gen.

obl.instr. -ym

-oj-ymi

dat. -omu -ymloc. -om -yxgen. -ogo

corresponding Russian declension (Table 6b; cf. Vaillant 1958: 497f.) these two lines ofdevelopment have been taken to their extreme consequences: the feminine oblique istotally syncretic, as in Pali, and, in the plural, gender distinctions have been completelyneutralized (in this case also at the syntactic level).

Thus Indo-Aryan and Slavic, starting from similar paradigms (not identical becausethe antecedent of the Slavic definite adjective contains a univerbized pronoun), showtwo developments that, though mutually independent, are strikingly reminiscent ofeach other.21

A difference is that in the Slavic plural we find only gender syncretism, while thePali plural exhibits a mix of gender syncretism and gender separate exponence.

Also this type of development is not unparalleled among the IE languages. Thus,in the paradigm of the Latin adjectives of the type bonus, the etymological corre-spondent of the Indo-Aryan -a-/-a- class, the dative/ablative plural shows total gendersyncretism, but the genitive plural exhibits the two different endings -orum and -arum,associated with the masculine/neuter and with the feminine respectively (Table 9.7).Significantly, these endings can be analysed as sequences formed by the morph -o-,

21 Analogous developments are also found in the Germanic group. Thus, in the Old Icelandic weakadjectival declension, the singular has the ending -a for the genitive and the dative (and the accusative) ofthe masculine/neuter and the ending -o for all corresponding feminine cells, while the plural distinguishesthe dative (in -om) from the genitive (in -u, like the nominative) but is totally syncretic for gender (cf.Noreen : ).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigm complexity

Table . Gender syncretism and gender semi-separate exponencein the plural of the Latin -o-/-a and of the Pali -a-/-a- adjectives

Latinmasc./neuter fem.

acc. -ō-s -ā-sdat. -īsabl.gen. -ō-rum -ā-rum

Palimasc./neuter fem.

acc. -e -āinstr. -e-hi -ā-hiabl.gen. -ānam. loc. -e-su -ā-su

for the masculine/neuter, or -a-, for the feminine, plus a morph -rum specific to thegenitive plural; in turn the accusative plural endings -os and -as are interpretable asformed by -o- or -a- plus -s.

Noticeably, this situation is not the inherited one. The ending -orum represents theproduct of morphological analogy (the inherited one was -um, cf. Leumann 1977: 415):the analogical creation of -rum in the Latin genitive plural parallels, somehow, that of-ehi (< -ebhis) in the Indo-Aryan instrumental and ablative plural.

Indeed we might say that the Latin pattern is the ‘mirror image’ of the Pali one:in Latin the ablative is syncretic for gender and the genitive has separate genderexponence, while in Pali the opposite occurs. Again, we observe similar lines ofdevelopment that are carried out in autonomous ways.

. Vertical and horizontal syncretism

The developments mentioned earlier involve two alternative means to avoid theappearance of exponents with low frequency, i.e. either case syncretism (as in Paliin the feminine singular) or elimination of gender distinction from cumulative expo-nence; in turn, such an elimination is achieved either by gender syncretism or by theintroduction of separate gender exponence.

Now, if we think that the parallel development shown by Pali and Slavic is not purelydue to chance, we have to explain why oblique case syncretism seems to be preferredin the feminine singular, while elimination of gender distinction from cumulativeexponence seems to be preferred in the plural.

The recurrence of such a configuration is indeed not banal in that it conflicts withthe hypothesis of a fixed hierarchy among inflectional categories (cf. the Feature Rank-ing Principle in Stump 2001: 239 f.; s. also Baerman et al. 2005: 113 ff.) and, after all, withthe principle of compensation as Brøndal conceived it. If we posit that the categoryof case may have a feature structure involving a node [oblique]22 (cf. Wiese 2004),

22 Cf. Wiese (). For further issues related to underspecification, see also Koenig and Michelson (thisvolume).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

then oblique case syncretism can be seen as an instance of case underspecificationin the context of gender, which suggests that gender is superordinate to case; on thecontrary, gender syncretism with no or limited loss of case distinctions seems to pointto a different hierarchy.

A possible solution might be to assume that the ranking of gender with respectto case is dependent on number because of some semantic/functional grounds.Nevertheless, there is a second, perhaps more interesting possibility, i.e. to assumethat the whole syncretism pattern can be explained as a by-product of the interactionbetween competing structural principles internal to the morphological level.23

We start from the following two assumptions: (1) morphosyntactic property oppo-sitions may differ in strength; (2) the IC that can be carried by a single exponent isconstrained so that a too high IC may result in a critical instability of the correspond-ing exponent.

Table 9.8 shows a simulation describing the IC (in bits) of the exponent of an obliquecase in the feminine singular (left part of the table) and plural (right part) in a hypo-thetical language having four equiprobable oblique cases, a feature-value structureanalogous to that of the Pali adjective, and feminine/non-feminine, plural/singular,and oblique/non-oblique ratios (1:4, 3:7, 3:7, respectively) similar to the ones of Oldand Middle Indic. The rows are arranged according to three types of exponents: non-syncretic exponents (row 1); exponents associated with all the oblique cases, i.e. show-ing ‘vertical syncretism’ (row 2); exponents specific to a particular oblique case andassociated with all the three gender values, i.e. showing ‘horizontal syncretism’ (row 3).

Now, let us posit that the opposition between the feminine and the mascu-line/neuter is stronger than the one between an oblique case and another, i.e. that,other things being equal, vertical syncretism is favoured as against horizontal syn-cretism. Given that premise, it is possible to imagine a situation in which the tendencyto avoid exponents with an IC far higher than the average results not only in adispreference for the non-syncretic patterns (i.e. A1 and B1) but also, in the plural andonly in the plural, in a dispreference for the—elsewhere preferred—vertical syncretism(B2), and this is because of the low relative frequency of plural nominals, which raisesthe overall IC of the plural exponents.

The easiest way to devise a model able to generate the observed patterns wouldbe to posit the existence of an IC threshold and hypothesize that the threshold value

23 On the other hand, it is difficult to assess to what extent the phonological level may have played arole in these developments. One might think that in the singular the OI feminine endings of the preservedoblique cases (-aya, -ayas, -ayam) had in common a significant amount of phonological material with eachother and that this fact might have favoured a phonological collapse (even if the regular sound changeswould not have led to it). Nevertheless, the fact that the corresponding masculine/neuter endings (-ena,-asya, -at, -e) are less similar to each other with respect to what occurs in the feminine slab, is in part due tomorphological grounds: the inherited instr. sg ending was indeed -a (cf. Thumb and Hauschild : ),while -ena is an Indo-Aryan innovation. In other words, the morphological level is able to enhance formaldistinctions when this is in keeping with its preferences.

OU

PC

OR

REC

TEDPRO

OF

–FIN

AL,9/3/2015,SPi

Patternsofsyncretismand

paradigmcom

plexity

Table . Vertical and horizontal syncretism in a hypothetical paradigm inflected for number, gender, and case

A2)bits

singular 0.51feminine 2.32obl. 1.74total 4.57

sing. plur.m n f m n f

non-obl.

case αcase β

obl.

case acase bcase ccase d

B2)bits

plural 1.74feminine 2.32obl. 1.74total 5.80

case αcase βcase acase bcase ccase d

sing. plur.m n f m n f

non-obl.

obl.

A3)bits

singular 0.51– 0case a 3.74total 4.25

sing. plur.m n f m n f

non-obl.

case αcase β

obl.

case acase bcase ccase d

B3)bits

plural 1.74– 0case a 3.74total 5.48

A1)bits

singular 0.51feminine 2.32case a 3.74total 6.57

sing. plur.m n f m n f

non- obl.

case αcase β

obl.

case acase bcase ccase d

B1)bits

plural 1.74feminine 2.32case a 3.74total 7.8

case αcase βcase acase bcase ccase d

sing. plur.m n f m n f

non-obl.

obl.

sing. plur.m n f m n f

non-obl.

obl.

case αcase βcase acase bcase ccase d

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

lies between the IC of B2 and that of B3—let it be, e.g., 5.5 bit. Indeed, given sucha threshold, it is expected that B2 will be avoided and B3 will be preferred. In otherwords, we can posit a configuration of values on the basis of which oblique casesyncretism is not sufficient in the plural, so that the most preferable paradigm willhave horizontal syncretism in the plural (B3) due to the pressure of the IC-restrainingtendency, but vertical syncretism in the singular (A2) in compliance with the system-specific syncretism preference.

Yet it would be going too far to say that such a simple evolutionary algorithm can beconsidered as an exact description of the Pali or Slavic development.24 As regards thedetails of the algorithmic machinery, other hypotheses could be legitimately devised.In particular, it is clear that a different model of the way the IC-restraining tendencyand the resistance to syncretism offered by the different categorial oppositions interactwith each other might lead to partially different predictions.

Our aim here is limited to defining a problem that seems to us of theoreticalimportance—i.e. the question of why different IE groups independently exhibit inflec-tions with vertical syncretism in the singular and horizontal syncretism in the plural—and to trying to show that such a phenomenon, which appears as an increase in theorganizational complexity of the paradigm, may be thought of as the result of aninteraction of preferences by which restrictions related to the information capacityof the exponents must play a central role.

. Horizontal syncretism and the feminine gender

In the previous paragraph we hypothesized that in the Slavic and Pali developmentsthe feminine/non-feminine opposition was stronger than oblique case oppositions (cf.also Hjelmslev 1935: 108). That such a language-specific property may have been sharedby both Indo-Aryan and Slavic poses no problem since the two are genetically related.

On the other hand, cross-linguistic data seem to point to a hierarchy according towhich gender syncretism is more widespread than (and therefore cross-linguisticallypreferred to) case syncretism (see Baerman et al. 2005: 113 f.). However, because of theparticular status of the feminine in Proto-Indo-European, the assumption of a non-typical situation in Pali and Slavic is not particularly demanding. Indeed, a widespreadhypothesis is that the category of the feminine was originally autonomous from the

24 As for Indo-Aryan, a model with a fixed-threshold fits the data of the instrumental, genitive, andlocative (and predicts, therefore, the different distribution of the two patterns along the two number values),but assesses every non-syncretic ablative exponent as critically unstable. This is due to the fact that, becauseof the rareness of this case (cf. also fn. ), a non-syncretic ablative exponent will have an IC greater thanthe threshold imagined. On the other hand, in both Old Indic and Pali, adjectives and nouns do have anon-syncretic ablative exponent—i.e. the masc./ntr. OI -at > Pali -a—only for the least marked values ofthe categories of gender (masculine/neuter), number (singular), and declension class (-a/a-). Significantly,among the Indo-European languages, the commonest development is the loss of the purely ablatival caseby means of a category merger involving at least another oblique case.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Patterns of syncretism and paradigm complexity

Table . Jaina-Maharas.t.rı -a-/-a- adjective declension

singular pluralmasc. neuter fem. neuter masc. fem.

non-obl.nom. -o

-am. -ā

-ān. i/-āi(m. )-ā -ā/-āovoc. -a

-āo

-eacc. -am. -e

obl.

abl. -e-hi(m. ) /-e-him. to -ā-hi(m. ) / -ā-him. toinstr. -en. a(m. )

-āe-e-hi(m. ) -ā-hi(m. )

gen. -assa -ān. a(m. )loc. -e/-ammi -e-su(m. ) -ā-su(m. )

opposition between masculine and neuter, and was only secondarily included into thecategory of gender (cf. Clackson 2007: 104 f.).

Significantly, the absorption of the feminine into the gender-system seems to gofurther in the course of the Middle Indo-Aryan period. In the paradigm in Table 9.9(cf. Jacobi 1886: xxxvi f.) we can see the inflection of the -a/a- adjectives in Jaina-Maharas.t.rı, a Prakrit language that represents a more advanced stage of the MiddleIndo-Aryan development.

Here, the feminine singular distinguishes the ablative case from the other obliquecases, but again the presence of this case distinction is realized at the cost of losinggender distinctions, since the ablative singular ending -ao, is totally syncretic forgender.25 Thus, at this stage, gender distinctions seem to have been weakened andto have increased their proneness to syncretism independently of number. Moreover,the whole development corroborates the idea that a particularly rare paradigm celltends to share its exponent with other cells independently of the syncretism patternthat is realized. From Proto-Indo-European to Middle Indo-Aryan, the ablative case isconstantly involved in some pattern of syncretism, but these patterns are continuouslyreshaped.

. Conclusion

The Indo-Aryan morphological structures previously considered confirm that aninformation-theoretic approach to paradigm description turns out to be a promisingtool for grasping organizational principles underlying the emergence of complexity

25 The -ao forms (cf. also the corresponding form -ado in the Śaurasenı prakrit variety, cf. Pischel :§), which are continuations of denominal ablatival adverbs in -tas, are phonologically expected in thefeminine but not in the masculine/neuter, where the thematic vowel -a- is etymologically short. Noticeably,the adverbial suffix -tas is also the etymological ancestor of the final ◦to which is added to the instrumentalplural to form ablatival ending variants non-syncretic for case (cf. Table .). The distribution of -tas >–to(> -o) is, therefore, the reflex of an instance of semi-separate exponence that has been clouded by the lossof intervocalic t.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Paolo Milizia

in syncretism patterns. More generally, it may be thought that one of the sources ofmorphological complexity lies in the interplay between organizational principles atthe level of the morphosyntactic properties and equilibrium tendencies related to thequantitative distribution of the inflectional items. As we have tried to show, the studyof diachronic developments provides a favourable viewpoint for observing how thisinterplay exerts its moulding force.

Acknowledgements

Most of the issues addressed in the present chapter are treated in more detail inMilizia (2013). The author wishes to express his gratitude to Matthew Baerman andDunstan Brown for their precious observations and suggestions. He is also gratefulto Marco Mancini, Luca Lorenzetti, and Giancarlo Schirru for the support receivedwhile carrying out this research. As usual, he remains entirely responsible for possibleerrors or omissions.

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Learning and the complexityof Ø-marking

SE BAST IA N BA N K A N D JO C H E N T ROM M E R

. Introduction

Research on the automatic learning of morphological segmentation is a history ofmixed success. On the one hand, there are by now well-understood and efficientmethods for stemming, i.e. the identification of lexical stems in inflectional word forms(Lovins 1968, Porter 1980, Goldsmith 2010). On the other hand, identifying affixesin the affixal strings isolated by stemming, a task which we will call in the followingsubanalysis, is a research area still in its infancy.1

Consider, for example, the German verbal agreement paradigm in (1). Standardstemming assigns the segmentation in (1a), which separates the stem glaub ‘believe’from the suffixes expressing tense and subject agreement. However, it is obvious to thelinguistically trained eye, and uncontroversial in German linguistics, that the resultingsuffix strings have further structure: they contain the separable past tense suffix -te,whose segmentation (1b) reveals that most of the agreement markers appearing inpresent tense forms are also used in the corresponding past forms. In fact, recentresearch in theoretical morphology (cf. the papers in Trommer and Müller 2006,and references cited there) often assumes a much more radical subanalysis. ThusMüller (2005) argues for the segmentation in (1c) where the apparent 2sg affix -st isdecomposed into a true 2sg marker -s and the non-first-person affix -t which alsoshows up in the 2pl and the 3sg:

1 We do not address here two further problems for morphological segmentation: First, discontinuous morphology, as in Semitic root-and-pattern morphology (see Pirrelli et al., this volume, for discussion) and infixation (cf. the case of Archi discussed by Chumakina and Corbett, this volume). Second, the possibility that affixes occur in different positions with respect to each other (cf. Rice on variable orders among prefixes and suffixes, and Chumakina and Corbett on cases in Archi, where the same affix occurs as a prefix, suffix, or infix depending on phonological, morphological, and selectional properties of the base).


(1) German present and preterite verbal agreement

a. output of stemming
   (i) present                      (ii) preterite
         sg        pl                     sg          pl
   1     glaub-e   glaub-en         1     glaub-te    glaub-ten
   2     glaub-st  glaub-t          2     glaub-test  glaub-tet
   3     glaub-t   glaub-en         3     glaub-te    glaub-ten

b. minimal subanalysis
   (i) present                      (ii) preterite
         sg        pl                     sg           pl
   1     glaub-e   glaub-(e)n       1     glaub-te     glaub-te-n
   2     glaub-st  glaub-t          2     glaub-te-st  glaub-te-t
   3     glaub-t   glaub-(e)n       3     glaub-te     glaub-te-n

c. elaborate subanalysis (Müller 2005: 10)
   (i) present                      (ii) preterite
         sg        pl                     sg            pl
   1     glaub-e   glaub-(e)n       1     glaub-te      glaub-te-n
   2     glaub-s-t glaub-t          2     glaub-te-s-t  glaub-te-t
   3     glaub-t   glaub-(e)n       3     glaub-te      glaub-te-n

In this chapter, we advance the hypothesis that Ø-affixes are a crucial factor that enables subanalysis. Thus in German verb inflection, present tense is systematically zero (although one might argue that the 1sg suffix -e also expresses present tense), which allows for the identification of the person/number markers without further segmentation.2 Similarly, 1sg is zero in past tense forms, which reveals the bare past suffix -te. Thus, every marker of the segmentation in (1b) occurs on its own in at least one paradigm cell. In analogy to the (in-)dependent occurrence of morphemes in syntax, we will call the occurrence of a morpheme as the only form preceding or following the stem free, and its occurrence as a substring of the string preceding or following the stem bound (hence -st is free in (1a-i) and bound in (1a-ii); see also Koenig and Michelson, this volume, for a discussion of subparts of pronominal portmanteau affixes that may be assigned a consistent meaning, but are ultimately non-separable because they are bound elements in the sense assumed here). We will argue that it is exactly this contrast that makes the segmentation in (1b) (which involves only free forms) uncontroversial and transparent, whereas the further subanalysis in (1c) (which adds the bound affix -s to the inventory) is more debated and opaque (although, as we will see, still plausible). For a learner guided by free forms it is easier

2 Note that the disclosure of the agreement markers' forms is independent of the analytical options of having a present tense marker '-Ø' in the lexicon or just lacking a present tense exponent.


to subanalyse glaub-tet into -te and -t than glaub-st into -s and -t, as -s does not occur on its own.

More generally, we show that subanalysing learning algorithms profit substantially from paradigms where most or all markers have a free occurrence, since this reduces the search space for possible segmentations (established by possible segmentation points): if fewer potential segmentation points are considered, the search for the correct segmentation remains shallow, and the range of possible subanalyses the learner must consider is restricted. Hence learners with different reliance on the free occurrence of a potential marker can provide dramatically different results. In fact, each kind of search-space reduction that we will define classifies the complexity of subanalyses to be (un-)feasible with a particular strategy. Zero marking of inflectional categories, which is often regarded as a major source of complexity in morphological systems (cf. e.g. Anderson 1992, Wunderlich and Fabri 1994, Segel 2008), can hence be seen as a central factor facilitating the learning of subsegmentation. The fact that zero-exponence is a typologically ubiquitous phenomenon (e.g. for third person, present tense, etc.) would then have the effect that the subanalysis complexity observed cross-linguistically is low in the majority of languages, helping the learner to identify affixes. Our major empirical hypothesis is thus that no language has a subanalysis complexity that demands that the learner explore the full range of possible segmentations.

The chapter is structured as follows: In section 10.2, we define a hierarchy of subanalysis complexity classes in terms of their reliance on the occurrences of free markers, and specify the respective amounts of search-space reduction they involve. In section 10.3, we demonstrate the impact on the learning of inflection by introducing an incremental learning algorithm for inflectional affix inventories. Finally, we present the results of a typological pilot study on the distribution of zero-exponence in tense and agreement across a cross-linguistic language sample, which confirms the correlation between complexity classes and cross-linguistic distribution of Ø-marking.

10.2 Subanalysis complexity

Let us assume that a learner of a language has already managed to isolate the lexical material of an inflected word form from affixal material by stemming. The learner is then left with everything that precedes and follows the stem, what we will call affix strings (prefix_string–stem–suffix_string). The German verbal agreement paradigm from (1) is strictly suffixing, so its representation for subanalysis can simply omit the empty prefix strings and also the slot-indicating hyphens for the suffixes:3

3 In the following, we largely abstract away from phonological interference by using underlying representations as input for the learner.


(2) German verbal agreement suffix strings after stem removal
    a. prs   sg    pl          b. pst   sg     pl
       1     e     n              1     te     ten
       2     st    t              2     test   tet
       3     t     n              3     te     ten

A learner faced with (2) might hypothesize about segmenting the affix strings into subaffixes or rather assume that they are undecomposable morphemes expressing tense and agreement in a portmanteau fashion. The crucial problem is that, first, the affix strings may be segmented in different ways (segmentation options) and, second, affix strings with identical forms can be mapped to the same or different markers (lexicon assignment options). The combination of these two interdependent kinds of analytic options leaves the learner with a vast range of possibilities. For the content of each paradigm cell there are length(string)−1 possible segmentation points. Each segmentation point represents a binary decision of (non-)segmentation, so they add up to 2^(length(string)−1) possible segmentations. E.g. the 2sg past suffix string test has 2^(4−1) = 8 possible segmentations, listed exhaustively in (3f). The number of possible segmentations of the whole paradigm is the product of 2^((length(string)−1)·number of occurrences) for each affix string. For the rather small paradigm in (2) this yields 2^(0·1) (e) · 2^(1·1) (st) · 2^(0·2) (t) · 2^(0·2) (n) · 2^(1·2) (te) · 2^(3·1) (test) · 2^(2·2) (ten) · 2^(2·1) (tet) = 2^12 = 4,096 different segmentations.

(3) Suffix strings and their possible segmentations
    a. e    → { e }
    b. st   → { st, s-t }
    c. t    → { t }
    d. n    → { n }
    e. te   → { te, t-e }
    f. test → { test, t-est, te-st, tes-t, t-e-st, t-es-t, te-s-t, t-e-s-t }
    g. ten  → { ten, t-en, te-n, t-e-n }
    h. tet  → { tet, t-et, te-t, t-e-t }
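These counts can be checked mechanically. The following short Python sketch is our own illustration (the function and variable names are not from the chapter); it enumerates the segmentations of a single suffix string and computes the size of the segmentation space for the paradigm in (2):

    from itertools import combinations

    def segmentations(s):
        """All ways to cut s at any subset of its internal boundaries."""
        cuts = range(1, len(s))
        return [[s[i:j] for i, j in zip((0, *chosen), (*chosen, len(s)))]
                for n in range(len(cuts) + 1)
                for chosen in combinations(cuts, n)]

    print(len(segmentations('test')))   # 8, the options listed in (3f)

    # suffix string types of (2) with their number of occurrences
    counts = {'e': 1, 'st': 1, 't': 2, 'n': 2, 'te': 2, 'test': 1, 'ten': 2, 'tet': 1}
    total = 1
    for s, k in counts.items():
        total *= 2 ** ((len(s) - 1) * k)   # 2^((length-1) * occurrences)
    print(total)                           # 4096 segmentations of the whole paradigm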

Regardless of the segmentation, the learner needs to assign a single or multiple meaning(s) to forms that occur in more than one cell, i.e. analyse them as either syncretic instances of the same lexicon entry or unrelated occurrences of (accidentally) homonymous lexicon entries. Segmentation of course affects these choices because subanalysis determines what the forms of affixes are and how they are distributed (what their occurrences are): without subanalysis, the learner faces the eight distinct forms listed in (4a), of which four have two occurrences (indicated by superscripts). Each form with two occurrences can either be mapped to a single (syncretic) or to two different (homonymous) marker(s), hence there are 2 · 2 · 2 · 2 = 16 different


possible lexicon assignments for (4a), i.e. distinct ways to map the form occurrences to meanings.4

(4) Suffix string inventory from different subanalysis depths
    a. unsegmented  { e, st, t^2, n^2, te^2, test, ten^2, tet }
    b. intermediate { e, st^2, t^3, n^4, te^6 }
    c. maximal      { e^7, s^2, t^11, n^4 }

If leading -te is segmented from all suffix strings in the past paradigm, the number of forms drops to five (4b), while the number of lexicon assignments increases dramatically: just for the six occurrences of -te there are 203 different ways to assign them to one to six different markers (203 possible partitions of a six-item set). Maximal subanalysis is reached when every segment is segmented (4c), which demonstrates that the minimal possible number of different forms is four, and has an exploding number of lexicon assignments (678,570 possible partitions just for the eleven occurrences of -t).
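The partition counts cited here are Bell numbers (cf. n. 4). A minimal Python check, our own sketch rather than part of the chapter, reproduces the figures for the six occurrences of -te and the eleven occurrences of -t:

    from functools import lru_cache
    from math import comb

    @lru_cache(maxsize=None)
    def bell(n):
        """Number of partitions of an n-element set: B(n) = sum_k C(n-1, k) * B(k)."""
        return 1 if n == 0 else sum(comb(n - 1, k) * bell(k) for k in range(n))

    print(bell(6))    # 203 lexicon assignments for the six occurrences of -te
    print(bell(11))   # 678570 assignments for the eleven occurrences of -t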

10.2.1 Complexity classes

If segmentation is guided by the free occurrences of affix strings, possible forms for markers differ in how accessible they are from the affix strings that constitute a paradigm. The full affix strings themselves are already implicitly segmented by the stem and the word boundary. Thus they constitute possible affix forms for the last resort of non-subanalysis. In this respect, a learner that never subsegments an affix string is maximally conservative (class 0, (5a)). The most restricted possibility for a subanalysing learner is one that only considers segmentation of an affix string if all of its parts are affix strings on their own in other parts of the paradigm (class 1, (5b)), i.e. all parts need to have a free occurrence. The minimal way to lift this restriction is to allow for the possibility that one part of a subanalysis is a non-affix string (class 2, (5c)), which allows a kind of second-order cranberry affix (bound affix): an affix which never occurs without at least one adjacent other affix. Finally, we establish a class for unrestricted subanalysis (class 3, (5d)):

(5) Subanalysis complexity classes as constraints on subaffixes
    a. Class 0  Affix strings are potential forms (no subaffixes)
    b. Class 1  Every subaffix S of an affix string AS also occurs as an affix string
    c. Class 2  For every binary subanalysis of an affix string AS into S1 + S2, either S1 or S2 occurs as an affix string
    d. Class 3  No restriction on the occurrences of subaffixes

4 The number of possible lexicon assignments for a single segmentation computes as the product of the number of partitions of each form's set of occurrences (the nth Bell number B_n for a form with n occurrences).


Crucially the classes defined in (5) establish an implicational complexity hierarchy:

(6) Hierarchy of subanalysis complexity classes
    Class 0 ⊆ Class 1 ⊂ Class 2 ⊂ Class 3

A learner capable of class 3 subanalysis complexity has to consider all possible segmentation points, a class 2 learner only those which result from removing an affix string from the beginning or end of another affix string. A class 1 learner is limited to subanalysing cells exhaustively composed of affix strings. Finally, class 0 learners are restricted to word and stem boundaries. In the next section, we will discuss the impact of subanalysis complexity on the set of possible forms for affix hypotheses.
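The definitions in (5) can also be stated procedurally. The following Python sketch is our own illustration (not part of the chapter), and it uses a deliberately crude test for class 2 (at least one part of a subanalysed cell occurs freely; the full definition in (5c) constrains every binary cut). It assigns to a proposed segmentation the lowest class that licenses it:

    def subanalysis_class(segmented_cells):
        """Lowest complexity class (0-3) licensing a segmented paradigm,
        where each cell is given as a list of subaffixes."""
        affix_strings = {''.join(parts) for parts in segmented_cells}
        needed = 0
        for parts in segmented_cells:
            if len(parts) == 1:
                continue                                  # no subanalysis: class 0
            if all(p in affix_strings for p in parts):
                needed = max(needed, 1)                   # all parts occur freely
            elif any(p in affix_strings for p in parts):
                needed = max(needed, 2)                   # at least one part is free
            else:
                needed = max(needed, 3)                   # only bound subaffixes
        return needed

    german_1b = [['e'], ['st'], ['t'], ['n'], ['t'], ['n'], ['te'],
                 ['te', 'st'], ['te'], ['te', 'n'], ['te', 't'], ['te', 'n']]
    swahili_8a = [['ni'], ['u'], ['a'], ['tu'], ['m'], ['wa'], ['ni', 'li'],
                  ['u', 'li'], ['a', 'li'], ['tu', 'li'], ['m', 'li'], ['wa', 'li']]
    print(subanalysis_class(german_1b))    # 1: -te and the agreement suffixes are all free
    print(subanalysis_class(swahili_8a))   # 2: bound -li never occurs on its own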

10.2.2 Evaluation

The German data in (2) involve a minimal amount of subanalysis complexity, as there is a reasonable subanalysis which only includes class 1 segmentation points. A class 0 learner yields the single non-subanalysing segmentation and needs to consider eight possible forms of a marker. With its four syncretic/homonymous markers covering two cells each, there are sixteen possible lexicon assignments for (7a).

(7) German present and preterite verbal agreement

a. Class 0 subanalysis
   (i) prs   sg    pl        (ii) pst   sg     pl
       1     e     n              1     te     ten
       2     st    t              2     test   tet
       3     t     n              3     te     ten

b. Class 1 subanalysis
   (i) prs   sg    pl        (ii) pst   sg     pl
       1     e     n              1     te     te-n
       2     st    t              2     te-st  te-t
       3     t     n              3     te     te-n

c. Possible forms to consider
   { e, st, t, n, te, test, ten, tet }

Crucially, to find (7b), no additional possible forms need to be considered: every subanalysed form of a possible marker also occurs freely. With four possible segmentation points there are sixteen possible class 1 subanalyses (including (7a)), and for the maximal class 1 subanalysis shown here, there are 30,450 possibilities which map syncretic/homonymous markers to lexicon entries.5

A higher complexity is needed for the subanalysis of the Swahili paradigm in (8a): it does not contain any class 1 segmentation points and thus can only be subanalysed

5 B_1 (e) · B_2 (st) · B_3 (t) · B_4 (n) · B_6 (te) = 1 · 2 · 5 · 15 · 203 = 30,450 lexicon assignments for the segmentation in (7b).


by a learner of at least class 2. Note that the example shown in (8a) is not the maximal segmentation possible with this class. The segmentation of the trailing cranberry-like li- in the past (8a-ii) is made possible by the fact that every substring that precedes it also occurs in the zero tense–aspect–mood marked subjunctive (8a-i).

(8) Swahili subjunctive and imperfective verbal agreement (Seidel 1900: 10–18)

a. Class 2 subanalysis
   (i) sub   sg    pl        (ii) imp   sg     pl
       1     ni    tu             1     ni-li  tu-li
       2     u     m              2     u-li   m-li
       3     a     wa             3     a-li   wa-li

b. Possible forms to consider by complexity class
   Class 0/1  { ni, u, a, tu, m, wa, nili, uli, ali, tuli, mli, wali }
   Class 2    Class 0/1 ∪ { t, w, li }6
   Class 3    Class 0/1 ∪ Class 2 ∪ { n, i, l, il, nil, ili, ul, al, tul, ml, wal }

The extension of the search space from class 0/1 to class 2, which is in fact necessary to find (8a), is reflected by the additional possible forms that are considered, cf. (8b). Yet still the search space is restricted in comparison to class 3. While for a class 3 learner every segment transition is a possible segmentation point that can be combined with any other segmentation point,7 there are only eight class 2 segmentation points in (8). In sum, class 2 yields 2^8 = 256 possible segmentations of the 2^18 = 262,144 possible with class 3.
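The form inventories in (8b) and the size of the class 3 search space can be recomputed directly from the bare affix strings. The sketch below is our own illustration; the names, and the construction of the class 2 forms as the bound remainders of edge splits whose other half is free, are assumptions based on (5) and n. 6:

    def substrings(s):
        return {s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)}

    strings = {'ni', 'u', 'a', 'tu', 'm', 'wa',                # subjunctive
               'nili', 'uli', 'ali', 'tuli', 'mli', 'wali'}    # imperfective

    class01 = set(strings)                       # free affix strings only
    # class 2 adds the bound remainder of an edge split whose other half is free
    class2 = {rest
              for s in strings for f in strings if f != s
              for rest in ([s[len(f):]] if s.startswith(f) else []) +
                          ([s[:-len(f)]] if s.endswith(f) else [])
              if rest and rest not in class01}
    # class 3 admits every substring as a potential form
    class3 = {sub for s in strings for sub in substrings(s)} - class01 - class2

    print(sorted(class2))   # ['li', 't', 'w']
    print(sorted(class3))   # ['al', 'i', 'il', 'ili', 'l', 'ml', 'n', 'nil', 'tul', 'ul', 'wal']

    points = sum(len(s) - 1 for s in strings)    # class 3 segmentation points
    print(points, 2 ** points)                   # 18 points, 262144 segmentations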

Finally, the full search space of class 3 is needed to subanalyse a paradigm that always has an overt marker for all categories. Recall that our hypothesis is that such a pattern should not exist. Yet, such complexity can occur if the learner does not 'see' all relevant data, i.e. misses the zero-exponent paradigm cells. So if a learner of Swahili has no access to the subjunctive paradigm but only the present and imperfective, which both consistently mark tense–aspect–mood, the subanalysis of this highly regular (sub-)paradigm would be of complexity 3:

(9) Swahili present and imperfective verbal agreement with class 3 subanalysis
    a. prs   sg     pl          b. imp   sg     pl
       1     ni-na  tu-na          1     ni-li  tu-li
       2     u-na   m-na           2     u-li   m-li
       3     a-na   wa-na          3     a-li   wa-li

6 Observe that the possibility of the bound forms t-, w-, and li- in class 2 arises from the possibility to segment one of the cells in which they occur into a free and a bound marker, like t-u-, w-a-, and ni-li-.

7 Note that subanalysis complexity class restricts not only the number of segmentation points but also their possible combinations. Given the free occurrences { t, e, st }, the affix string test can in class 2 be segmented into t-est, te-st, tes-t, and t-e-st, but not into t-es-t, te-s-t, and t-e-s-t.


In (9), all forms regularly consist of an agreement marker (ni-, u-, a-, . . . ) followed by one of the TAM markers (na- and li-). Yet none of these markers has a free occurrence as the only overt marker preceding the stem. Hence a subanalysing learner does not gain anything from comparing the free affix strings for substring search (no free string is contained within another free affix string). As there are no class 1 or class 2 segmentation points, the learner would need to scan through all possible segmentations to get to this subanalysis.

10.3 Learning algorithms

To demonstrate the impact of subanalysis complexity on the learning of inflection, we have integrated the different complexity restrictions of the developed hierarchy into an incremental algorithm that we have implemented. The algorithm performs segmentation and meaning assignment in an integrated way, based on a paradigm of unsegmented affix strings and their morphosyntactic specification. At every cycle of the algorithm, the learner builds a set of possible form–meaning pairs, chooses the best morpheme hypothesis by comparing the accuracy and generality of the candidates, and removes the learned morphemes' occurrences from the paradigm. What counts as a possible affix in this search process is crucially restricted by the complexity class the learner adopts. The selection of the optimal morpheme hypothesis is driven by a preference for maximally general and reliable (accurate) mappings between form and meaning. To evaluate the accuracy and generality of different form–meaning mappings, we employ standard classification measurements used in work on information retrieval and machine learning (Baeza-Yates and Ribeiro-Neto 1999), namely precision and recall, or rather their non-proportional counterparts false positives and negatives, but it is important to keep in mind that these are simply more formal equivalents of the criteria morphologists use for morphological analysis. To exemplify the use of this terminology, we will use the abstract paradigm given in (10), where a and b are strings of segments, and [±x], [±y] morphological features.

(10) Informal and formal paradigm representation

a.        [+y]   [-y]
   [+x]   a      b
   [-x]   a      ab

b. { 〈a, [+x +y]〉, 〈b, [+x -y]〉, 〈a, [-x +y]〉, 〈ab, [-x -y]〉 }

As formally stated in (10b), we represent a paradigm as the set of 〈form, meaning〉 pairs corresponding to its cells. Meanings are represented as (possibly empty) sets of features drawn from the morphosyntactic feature-values needed to establish the cell meanings, e.g. [+1 -pl -past] or [-3 +pl]. (11) lists some of the affix hypotheses corresponding to a and b in (10) and introduces the evaluation criteria we use. A false positive for an affix hypothesis H of the phonological form F is any


paradigm cell for which H is predicted to occur, but which does not contain F. Conversely, a false negative is a cell where F is not predicted to occur by H, but still shows up.

(11) Accuracy criteria for affix hypotheses

    〈form, meaning〉   false positives   false negatives   implication relation   perfect accuracy ratio
a.  〈b, [-y]〉         –                 –                 ↔                      perf. precision, perf. recall
b.  〈a, [+y]〉         –                 yes               ←                      perf. precision
c.  〈a, [-x]〉         –                 yes               ←                      perf. precision
d.  〈a, []〉           yes               –                 →                      perf. recall
e.  〈b, [+x]〉         yes               yes               neither
f.  〈a, [-y]〉         yes               yes (2)           neither

The meaning of (11a) is a completely accurate characterization of the distribution of b in (10): all cells with the meaning [-y] contain the string b and conversely all cells containing b have the meaning [-y] (there is no b not covered by [-y]). Hence, there are no false positives and no false negatives for this hypothesis. Correspondingly, there is an implication both from the meaning to the form (←) and from the form to the meaning (→); it is a one-to-one mapping (↔). With (11b), every cell that matches the meaning [+y] has the form (←, perfect precision), yet the occurrence of a in the [-x -y] cell is not covered by the marker, resulting in a false negative (a false prediction of the non-occurrence of the form). Whereas (11b) doesn't incur any false positives, (11d) does not involve false negatives: it does not occur in any contexts for which it would not be predicted. This is for the trivial reason that, being wholly underspecified, it is compatible with the entire paradigm. On the other hand, it leads to a false positive for the [+x -y] cell.

Precision is the fraction of true positives of an affix hypothesis H of form F (the correctly predicted occurrences of F) out of all paradigm cells matching H, and recall the fraction of true positives out of all occurrences of F in the paradigm. Thus (11a) has both perfect precision and recall (amounting to 1). (11b) has perfect precision, but a recall of 2/3 ([+y] predicts only two of the three occurrences of a in the paradigm). Conversely, (11d) has perfect recall, but a precision of 3/4 (a occurs only in three of the four cells for which it is predicted). Thus optimizing (and hence maximizing) precision correlates with minimizing false positives, whereas optimizing recall correlates with minimizing false negatives.
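For concreteness, the following Python sketch (our own; the encoding of cells and hypotheses is assumed, not taken from the chapter) computes false positives, false negatives, precision, and recall for the hypotheses in (11) against the paradigm in (10):

    PARADIGM = {frozenset({'+x', '+y'}): 'a', frozenset({'+x', '-y'}): 'b',
                frozenset({'-x', '+y'}): 'a', frozenset({'-x', '-y'}): 'ab'}

    def evaluate(form, meaning):
        matched   = [c for c in PARADIGM if meaning <= c]           # cells the hypothesis claims
        occurring = [c for c, s in PARADIGM.items() if form in s]   # cells where the form shows up
        tp = [c for c in matched if c in occurring]
        fp = len(matched) - len(tp)      # predicted but absent
        fn = len(occurring) - len(tp)    # present but not predicted
        return fp, fn, len(tp) / len(matched), len(tp) / len(occurring)

    print(evaluate('b', frozenset({'-y'})))   # (0, 0, 1.0, 1.0)      = (11a)
    print(evaluate('a', frozenset({'+y'})))   # (0, 1, 1.0, 0.666...) = (11b)
    print(evaluate('a', frozenset()))         # (1, 0, 0.75, 1.0)     = (11d)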

It is obvious that these evaluation metrics closely mirror the criteria linguists (and learners of a natural language) employ to determine the correct affix entries for morphological systems. Virtually every morphologist would conclude that (11a) (with perfect precision/recall and zero false positives/negatives) is the correct characterization


for b in the given paradigm, and that (11b) (with one false negative and imperfect recall, but perfect precision) is a better analysis for a than (11f) (which has two false negatives and also a false positive for [+x -y]).

On the other hand, there is no inherent reason in the definition of these criteria to prefer either of the hypotheses with false positives or false negatives in cases where they can't be completely avoided (cf. a in (10)). Again this corresponds closely to informed linguistic judgements: which kind of imperfect distribution is preferable crucially depends on the details of the grammatical formalism assumed by the morphological framework at hand.

The grammar may provide principles or additional machinery that either prevents a marker from occurring although it matches a cell (e.g. blocking, impoverishment) or that makes a marker occur in a cell although it does not match its meaning (e.g. empty cells taking the next best marker, rules of referral; see also Michelson and Koenig, this volume, on Directness).

Thus whether (11b) (no false positives, but false negatives) or (11d) (no false negatives, but false positives) is the better characterization for a might be answered differently by morphologists of specific theoretical persuasions. (11b) would be a viable analysis for proponents of Paradigm Function Morphology (Stump 2001), which could capture the 'aberrant' occurrence of a in the [-x -y] cell by a rule of referral, whereas (11d) would be the option of choice in frameworks which favour underspecification (such as Distributed Morphology, cf. Halle and Marantz 1993, Halle 1997) and might assume blocking by an impoverishment rule.

Hence to end up with a complete analysis that exactly matches the data, the learner we sketch here would have to be adapted to the insertion restrictions the grammar employs and to the use of its additional mechanisms to cope with either false positives or false negatives.

For our algorithm, we assume that every marker matching a cell's meaning is inserted (i.e. no blocking or other dependencies among inserted markers) and that, while the learner tries to avoid homonymy whenever possible, there is no general ban on homonymy. These assumptions are best matched by a learner that optimizes for perfect precision (prefers markers without false positives) and assumes homonymy in case of false negatives. For (10) this would yield two markers for a, as the distribution of this form cannot be captured perfectly by a single marker, cf. (12a).

(12) a. Lexicon optimized for maximal precision > maximal recall
        { 〈b, [-y]〉, 〈a1, [+y]〉, 〈a2, [-x -y]〉 }
     b. Lexicon optimized for maximal recall > maximal precision
        { 〈b, [-y]〉, 〈a, []〉 }

Note that while (12b) of course is preferable in terms of a smaller marker inventory and the avoidance of homonymy, unlike (12a) it is not a complete analysis of the data, as


it does not determine why a is not present in the [+x -y] cell, contrary to its empty meaning, which is compatible with every cell.8

10.3.1 Implementation

To avoid a brute force search through the whole search space resulting from the combination of possible segmentations and lexicon assignments for all affixes, our learner uses a greedy algorithm which searches in every optimization step for only a single affix hypothesis, the one with the most regular paradigmatic distribution in minimizing false positives and negatives. At the end of every optimization step, this affix hypothesis is added as a new entry to the affix lexicon of the language, and the strings corresponding to the affix hypothesis are removed from the paradigm. Optimization steps are repeated until the paradigm is empty, i.e. all affix strings in its cells have been assigned to affix entries in the lexicon. The full algorithm is given in pseudocode in (13).9

Optimization in our algorithm is strictly local in being myopic for interdependencies between markers (blocking), possible partitions of form occurrences into different markers (homonymy), and even possible segmentations of paradigm cells. In fact, segmentation and homonymy are not even genuine notions in the algorithm: they emerge from the removal of learned affixal material.

(13) Greedy algorithm for incremental perfect precision learning
     Input: a paradigm P, i.e. a set of 〈affix string, meaning〉-pairs
            an empty lexicon L
     1  build the set M of all potential markers for P which do not incur false positives
     2  choose the optimal marker O ∈ M according to the metrics α ≫ β ≫ γ
        α  maximize the number of true positives (including subaffixes)
        β  minimize the number of false negatives (excluding subaffixes)
        γ  maximize the number of segments
     3  add O to L and
        remove the affix string of O from all 〈affix string, meaning〉-pairs ∈ P
        that match its meaning

8 Note that this can't follow from blocking in this particular example, because this would imply a blocking where 〈b, [-y]〉 blocks 〈a, []〉 in the [+x -y] cell, but not in the [-x -y] cell. An impoverishment rule can also not serve to remove this false positive and complete the analysis: as the meaning of 〈a, []〉 is the complete default subsuming every meaning, there is no way to prevent its insertion by deleting features from the cell meaning it is matched against.

9 This type of incremental optimization is closely akin to the Harmonic Serialism version of Optimality Theory (McCarthy ) where, in contrast to the standard version of OT, candidates for evaluation may only exhibit a single structural change to the input, but which allows iteration of evaluation cycles, where every cycle takes the output of the preceding cycle as input, up to the point that optimization stagnates, i.e. does not lead to further harmonic improvement. In these terms, the single structural change that defines the optimization cycle for our learner consists of learning a single marker and removing its occurrences from the paradigm.


     4  if any 〈affix string, meaning〉-pair ∈ P has a non-empty affix string:
           go to step 1
        else output L

     Parametrization: Restrict step 1 to class 0, class 1, or class 2 segmentations
                      (checked with P)
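To make the procedure concrete, here is a runnable Python sketch of a class 1 restricted greedy learner in the spirit of (13). It is our own reconstruction, not the authors' implementation: candidate markers are limited to forms that can be stripped from a residual leaving an empty or freely occurring remainder, meanings are obtained by intersecting the feature specifications of covered cells, and the metrics α ≫ β ≫ γ are approximated by a lexicographic score. The feature coding and tie-breaking details are assumptions made for the example.

    from itertools import combinations

    def cell(person, plural, past):
        """Fully specified cell meaning as a frozenset of binary feature values."""
        feats = {('+' if p == person else '-') + str(p) for p in (1, 2, 3)}
        feats |= {'+pl' if plural else '-pl', '-sg' if plural else '+sg',
                  '+past' if past else '-past'}
        return frozenset(feats)

    # German suffix strings from (2)
    GERMAN = {
        cell(1, False, False): 'e',    cell(1, True, False): 'n',
        cell(2, False, False): 'st',   cell(2, True, False): 't',
        cell(3, False, False): 't',    cell(3, True, False): 'n',
        cell(1, False, True): 'te',    cell(1, True, True): 'ten',
        cell(2, False, True): 'test',  cell(2, True, True): 'tet',
        cell(3, False, True): 'te',    cell(3, True, True): 'ten',
    }

    def strip(form, residual, free):
        """Class 1 edge removal: the remainder must be empty or itself free."""
        if residual.startswith(form) and residual[len(form):] in free | {''}:
            return residual[len(form):]
        if residual.endswith(form) and residual[:-len(form)] in free | {''}:
            return residual[:-len(form)]
        return None

    def learn(paradigm):
        residuals, lexicon = dict(paradigm), []
        while any(residuals.values()):
            free = {r for r in residuals.values() if r}
            best = None
            for form in free:
                covered = [c for c, r in residuals.items()
                           if r and strip(form, r, free) is not None]
                # try every subset of the covered cells as the marker's extension
                for n in range(len(covered), 0, -1):
                    for subset in combinations(covered, n):
                        meaning = frozenset.intersection(*subset)
                        matching = [c for c in residuals if meaning <= c]
                        if not all(c in covered for c in matching):
                            continue                            # would incur false positives
                        score = (len(matching),                 # α: true positives
                                 len(matching) - len(covered),  # β: minus false negatives
                                 len(form))                     # γ: number of segments
                        if best is None or score > best[0]:
                            best = (score, form, meaning, matching)
            _, form, meaning, matching = best
            lexicon.append((form, meaning))
            for c in matching:                                  # remove the learned material
                residuals[c] = strip(form, residuals[c], free)
        return lexicon

    for form, meaning in learn(GERMAN):
        print(form, sorted(meaning))

On the German data in (2) this sketch learns te, n, st, and t (in that order) before the two remaining singleton markers, mirroring the class 1 result in (14b), although the learned meanings come out more fully specified than the trimmed feature sets given there.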

In principle, we have the following details of the learner which might be fine-tuned: (i) the potential segmentation points assumed when generating possible forms, (ii) the possible meanings they are combined with to generate affix hypotheses, and (iii) the evaluation metrics for choosing the best among the marker hypotheses. Parametrization for possible segmentation points is explicitly included in the algorithm in (13), and we will discuss the consequences of the different parameters in the next section.

With respect to possible meanings, an aspect not specified explicitly in (13), we remain rather agnostic. However, since we restrict possible markers to affix hypotheses which do not incur false positives, it is crucial that the set of meanings is complete in the sense that it allows one to refer individually to any single paradigm cell; in effect there is always a last resort option to build a one-cell marker with perfect precision.

In its evaluation metrics determining the choice between different affix hypotheses with perfect precision (line 2 of (13)), the algorithm is biased towards markers occurring in more paradigm cells, so that the algorithm preferentially learns markers which cover more paradigmatic space (and learns them first). If this bias results in a tie, the algorithm prefers markers which are more accurate in terms of recall, covering more of the free occurrences of their form in the paradigm.10 If this still gives a tie, it will use the marker with more segments, again maximizing paradigmatic space for the current affix hypothesis. This also provides an inherent upper bound on the subanalysis depth of the results the learner produces: if a paradigm consists of the cells { 〈en, [x y]〉, 〈en, [x]〉, 〈e, [z]〉 }, optimization for length always prefers 〈en, [x]〉 over both 〈e, [x]〉 and 〈n, [x]〉 (which have the very same distribution) and hence prevents the vacuous segmentation of en into two markers with identical meaning.

10.3.2 Results

Examples (14) and (15) show the affix lexica and the concomitant segmentations that the algorithm in (13) produces for the German verb paradigm if restricted to class 1 and class 2 segmentations respectively. Interestingly, the class 1 segmentation in (14) corresponds to the more traditional and conservative segmentation of German we

10 Note that if a learner is meant to always gain more confidence from the free than the bound occurrence of a form, it must prefer free true positives over bound ones and bound false negatives over free ones. Also note that there is no free vs bound distinction for false positives and true negatives, as they refer to the non-occurrence of a string (which is neither free nor bound).

OUP CORRECTED PROOF – FINAL, 9/3/2015, SPi

Learning and the complexity of Ø-marking

have already seen in (1b), whereas class 2 segmentation gives us the more radical subanalysis proposed by Müller (2005) in (1c).11

(14) German verbal agreement with class 1 restricted learner
     a. Segmentation
        (i) prs   sg    pl        (ii) pst   sg     pl
            1     e     n              1     te     te-n
            2     st    t1             2     te-st  te-t1
            3     t2    n              3     te     te-n
     b. Lexicon
        (i)   te, [+past]          (iv)  t1, [+2 +pl]
        (ii)  n,  [-2 +pl]         (v)   e,  [+1 +sg -past]
        (iii) st, [+2 +sg]         (vi)  t2, [+3 +sg -past]

The order in which the markers appear in the lexica in (14b) and (15b) reflects the sequence of optimization steps that has generated them. In both analyses, te and n are learned first since they involve the widest and most regular distributions of inflectional affix strings in the German paradigm. The corresponding affix hypotheses are completely accurate (their string specifications occur in all and only the cells subsumed by their meaning) and cover six and four cells respectively. However, after the removal of te and n from the paradigm, the two learners behave slightly differently. For the setting in (14), the set of possible forms is restricted to the freely occurring affix strings { e, st, t } (of which st and t have two occurrences in the paradigm), hence the algorithm first learns the marker corresponding to st, then t1, and after e, finally, homonymous t2.

(15) German verbal agreement with class 2 restricted learner
     a. Segmentation
        (i) prs   sg     pl       (ii) pst   sg        pl
            1     e      n             1     te        te-n
            2     s-t1   t1            2     te-s-t1   te-t1
            3     t2     n             3     te        te-n
     b. Lexicon
        (i)   te, [+past]          (iv)  s,  [+2 +sg]
        (ii)  n,  [-2 +pl]         (v)   e,  [+1 +sg -past]
        (iii) t1, [+2]             (vi)  t2, [+3 +sg -past]

11 An interesting difference is that Müller assumes only one marker of the phonological shape -t, where our algorithm learns two homonymous affixes of this form in both versions.


For the class 2 learner of (15), e, st, t, and crucially s are possible forms at this point, hence second person t with four occurrences is preferred; as a result st is subanalysed.

From a linguistic point of view, we see it as an open question whether the subanalysis in (15) or the one in (14) is more adequate. See Müller (2005) for persuasive, but in our eyes not compelling, arguments in favour of (15). Still it is a suggestive result that the two intermediate degrees of subanalysis complexity we propose correspond closely to the two major segmentations of German which have been suggested in the literature.

In contrast, Swahili verb inflection (16) provides an example where an adequate subanalysis requires the algorithm to presuppose segmentation points of complexity class 2 (i.e. subanalysis of an affix string into two affix hypotheses only requires that one of the corresponding strings occurs freely in the paradigm). By parameterizing the algorithm to subanalysis complexity 2 we obtain the tense–agreement segmentation in (16a), with which virtually every linguist would concur: the subjunctive exhibits the bare person/number prefixes, and all other subparadigms combine these with tense/aspect affixes.12

(16) Swahili verbal agreement with class 2 restricted learner

a. Segmentation
   (i) sub   sg    pl      (ii) prs   sg     pl        (iii) imp   sg     pl
       1     ni    tu           1     ni-na  tu-na           1     ni-li  tu-li
       2     u     m            2     u-na   m-na            2     u-li   m-li
       3     a     w-a          3     a-na   w-a-na          3     a-li   w-a-li
b. Lexicon
   (i)   ni, [+1 +sg]       (v)    a, [+3]
   (ii)  na, [-past]        (vi)   u, [+2 +sg]
   (iii) tu, [+1 +pl]       (vii)  m, [+2 +pl]
   (iv)  li, [+past]        (viii) w, [+3 +pl]

However, from the perspective of subanalysis complexity, the present/imperfect markers -na/-li are cranberry affixes. Neither of them occurs as a free affix string in any part of the paradigm. Consequently, if these data are analysed with our class 1 learner, it produces the counterintuitive result in (17b), which has one marker for every paradigm cell and is thus identical to the input paradigm, with affix entries sorted by cell length.

12 Swahili has plenty more verb forms than shown in (16); all of them are transparently structured like the present and the subjunctive. See Seidel (1900) for exhaustive discussion.


(17) Swahili verbal agreement with class 1 restricted learner

a. Segmentation
   (i) sub   sg    pl      (ii) prs   sg     pl        (iii) imp   sg     pl
       1     ni    tu           1     nina   tuna            1     nili   tuli
       2     u     m            2     una    mna             2     uli    mli
       3     a     wa           3     ana    wana            3     ali    wali
b. Lexicon
   (i)    nina, [+1 +sg -past]   (x)     uli, [+2 +sg +past]
   (ii)   tuna, [+1 +pl -past]   (xi)    mli, [+2 +pl +past]
   (iii)  wana, [+3 +pl -past]   (xii)   ali, [+3 +sg +past]
   (iv)   nili, [+1 +sg +past]   (xiii)  ni,  [+1 +sg +subj]
   (v)    tuli, [+1 +pl +past]   (xiv)   tu,  [+1 +pl +subj]
   (vi)   wali, [+3 +pl +past]   (xv)    wa,  [+3 +pl +subj]
   (vii)  una,  [+2 +sg -past]   (xvi)   u,   [+2 +sg +subj]
   (viii) mna,  [+2 +pl -past]   (xvii)  m,   [+2 +pl +subj]
   (ix)   ana,  [+3 +sg -past]   (xviii) a,   [+3 +sg +subj]

Thus Swahili provides good evidence that class 1 subanalysis is too weak in general, and that at least some languages require subanalysis of complexity 2. In fact, our impressionistic estimation is that this pattern is even more frequent in other areas of inflection such as adjectival comparison. For example, Persian forms its superlative on top of the comparative (parallel structures are found e.g. in Ubykh, Sanskrit, Gothic; cf. Bobaljik 2007: 12). Thus the superlative suffix -ín is again of the cranberry type, so that the hardly disputable segmentation given in (18) requires class 2 complexity:

(18) Adjectival comparison in Persian (Mace 2003: 53)
     Positive   Comparative   Superlative
     bozorg     bozorg-tár    bozorg-tar-ín    'big'
     mofid      mofid-tár     mofid-tar-ín     'useful'
     moškel     moškel-tár    moškel-tar-ín    'clear'

We turn now to a slightly more complex set of data to illustrate the consequences of analytic ambiguity for the application of our algorithm.

Estonian verb inflection (19) provides another example where an adequate subanalysis requires the algorithm to presuppose segmentation points of complexity class 2. Thus most linguists would concur that the past forms in (19a-ii) comprise an imperfect suffix and the agreement affixes also found in the corresponding present forms. Where the morphologist's intuition potentially breaks down (or, to put it more optimistically, requires more evidence) is the exact identity/segmentation of the imperfect marker: one plausible option is that the 3sg imperfective -s is an independent tense-agreement portmanteau, whereas all other imperfect forms comprise the imperfect suffix -si.


Alternatively, one might argue for a completely general imperfect suffix -s. This captures the obvious parallelism between the imperfect forms, which all start with s, but has the drawback that we get an 'orphanized' affix string i, which we would have to analyse as a second imperfect marker inducing multiple (extended) exponence.

Crucially, both analyses result in cranberry morphs: neither -si nor i occurs as a free affix string in the paradigm and, as with Swahili, our algorithm produces undersegmentation (in fact non-segmentation) if restricted to class 1 complexity:

(19) Estonian verbal agreement (Ehala 2009: 42) with class 1 restricted learner

a. Segmentation
   (i) prs   sg    pl        (ii) imp   sg     pl
       1     n     me             1     sin    sime
       2     d     te             2     sid1   site
       3     b     vad            3     s      sid2
b. Lexicon
   (i)   sime, [+1 +pl +past]    (vii)  n,    [+1 +sg -past]
   (ii)  site, [+2 +pl +past]    (viii) d,    [+2 +sg -past]
   (iii) vad,  [+3 +pl -past]    (ix)   b,    [+3 +sg -past]
   (iv)  sin,  [+1 +sg +past]    (x)    s,    [+3 +sg +past]
   (v)   me,   [+1 +pl -past]    (xi)   sid1, [+2 +sg +past]
   (vi)  te,   [+2 +pl -past]    (xii)  sid2, [+3 +pl +past]

Again, we achieve a plausible segmentation if we parametrize the learner to subanalysis complexity of class 2:

(20) Estonian verbal agreement with class 2 restricted learner

a. Segmentation
   (i) prs   sg    pl        (ii) imp   sg     pl
       1     n     me             1     s-i-n  s-i-me
       2     d     te             2     s-i-d  s-i-te
       3     b     vad            3     s      s-id
b. Lexicon
   (i)   s,  [+past]        (vi)   d,   [+2 +sg]
   (ii)  me, [+1 +pl]       (vii)  vad, [+3 +pl -past]
   (iii) i,  [-3 +past]     (viii) id,  [+3 +pl +past]
   (iv)  te, [+2 +pl]       (ix)   b,   [+3 +sg -past]
   (v)   n,  [+1 +sg]

As (20a) needs eight instead of sixteen markers and does not introduce more homonymy, we can see this as confirmation that the data are an instance of complexity class 2. Although the learner is not restricted to freely occurring affixes (class 1 complexity), the order of the affix entries in (20) still demonstrates that the algorithm


prefers affix hypotheses which occur freely in the paradigm; thus freely occurring 1pl me is learned before the cranberry submorphs.

The Estonian data also nicely illustrate the reasons why a learning algorithm such as ours, which seeks to emulate the intuitions guiding theoretical linguists, necessarily combines computation of affix meaning and subanalysis. An analysis positing imperfect -s and -i would be clearly superior for a morphologist giving maximal importance to the Syncretism Principle (Müller 2003), whereas the assumption that true multiple exponence is universally excluded (Halle and Marantz 1993, Ortmann 1999) would make an analysis assuming imperfect -si the only possible option. Since syncretism and multiple exponence are by definition notions based on the assignment of meaning, approaches to the learning of morphological segmentation which rely exclusively on phonological information (Harris 1955, Langer 1991, Goldsmith 2010, Saffran et al. 1996) cannot capture the fact that they influence the decision whether a specific affix string such as si should be subanalysed or not.13

A further analytic option we haven't considered so far is that -si and 3sg/imperfect -s are instances of the same morphological marker -s, obscured by a (morpho-)phonological process inserting i in consonant clusters.

Under this analysis, the subanalysis complexity of the Estonian paradigm drops to 1, so it can be successfully subanalysed considering nothing but free forms as potential markers, as shown in (21).14 At this point, the potential impact of (morpho-)phonological alternations on morphological subanalysis is simply ignored by our algorithm, because the learning of phonological processes is a highly complex problem of its own.

(21) Estonian assuming i-insertion with class 1 restricted learner

a. Segmentation
   (i) prs   sg    pl        (ii) imp   sg     pl
       1     n     me             1     s-n    s-me
       2     d1    te             2     s-d1   s-te
       3     b     vad            3     s      s-d2
b. Lexicon
   (i)   s,  [+past]        (v)    d1,  [+2 +sg]
   (ii)  me, [+1 +pl]       (vi)   vad, [+3 +pl -past]
   (iii) te, [+2 +pl]       (vii)  d2,  [+3 +pl +past]
   (iv)  n,  [+1 +sg]       (viii) b,   [+3 +sg -past]

13 This does not necessarily mean that purely phonotactic approaches to morphological segmentation which ignore affix meaning are 'wrong' (i.e. do not model the competence of speakers/learners). Theoretical morphology might be wrong, or there might be different phases of learning employing different methods.

14 Applying a class 2 analysis to the data in (21) gives further subanalysis. Especially, -d2 is segmented as a pl suffix.


The clearest case of data which might only be adequately analysed by parameterizing the learner to class 3 complexity is the verb paradigm of the Oceanic language Lenakel (Lynch 1978). Besides additional TAM and subject agreement prefixes (for number) in other positional slots, the core system of verbs according to Lynch (pp. 42, 45, 47) is the obligatory combination of an overt agreement prefix with a right-adjacent overt TAM prefix, as shown in (22):

(22) Lenakel verbal agreement (Lynch 1978)15

          pres    past    stat   seq    neg
   1ex    i-ak-   i-im-   i-n-   i-ep-  i-is-
   1i     k-ak-   k-im-   k-n-   k-ep-  k-is-
   2      n-ak-   n-im-   n-n-   n-ep-  n-is-
   3sg    i-ak-   i-im-   i-n-   i-ep-  i-is-
   3nsg   k-ak-   k-im-   k-n-   k-ep-  k-is-
   3ks    m-ak-   m-im-   m-n-   m-ep-  m-is-

As there is no zero-marking in either the agreement or the TAM marking, the subanalysis of these categories into adjacent markers cannot be done in terms of free affix strings, upon which the class 1 and class 2 learners rest. It demands a full search through all possible segmentations by a class 3 learner.

10.4 A typological pilot study

As we have shown in the last section, lower subanalysis complexity of inflectional paradigms leads to a processing advantage for a linguistically informed segmentation of affix strings by cutting down the search space for potential affixes. Under the plausible assumption that morphological systems are adapted for learnability, this leads to the prediction that inflectional paradigms should avoid the full complexity of class 3 and show a bias for paradigms of class 1 complexity. To test this prediction we have carried out a small typological pilot study on subject agreement and TAM affixes.

Since evaluating complex inflectional paradigms for their subanalysis complexity status would require a complete morphological and phonological analysis taking into account all subparadigms, allomorphs, and morphophonological alternations, we have not tested complexity status, but the closely connected occurrence of Ø-affixes. Crucially, a paradigm which allows one to subanalyse TAM and subject agreement markers with respect to each other under the class 2 restriction must exhibit at least one TAM, or one agreement marker which is Ø. Similarly, a paradigm giving rise to a class 1 analysis must involve at least one Ø agreement affix and one Ø TAM affix.

15 pres = present, habitual, and concurrent mood, stat = stative/perfective, seq = sequential, neg = negative, ks = known subject; abstracts away from phonological alternations.


Table 10.1 Typological pilot study: language sample

Language      Phylum          Macroarea        Ø-Agr   Ø-TAM   Source
Udmurt        Uralic          Eurasia          +       +       Csúcs (1998)
Armenian      Indo-European   Eurasia          +       +       Schmitt (1981)
Nahuatl       Uto-Aztecan     N. America       +       +       Andrews (1975)
Kobon         Trans-N.Gui.    Austr./N.Gui.    +       +       Davies (1989)
Mapudungun    Araucanian      S. America       +       +       Zúñiga (2000)
Azerbaijani   S. Turkic       Eurasia          +       +       Schönig (1998)
Turkana       Nilotic         Africa           +       +       Dimmendaal (1983)
Berber        Afroasiatic     Africa           +       +       Kossmann (2007)
Choctaw       Muskogean       N. America       +       +       Broadwell (2006)
Remo          Munda           Eurasia          +       +       Anderson et al. (2008)
Kalkatungu    Pama-Nyungan    Austr./N.Gui.    +       +       Blake (1979)
Moghol        Mongolian       Eurasia          +       -       Weiers (2011)
Belhare       Kiranti         SE. Asia/Oc.     +       -       Bickel (2003)
Kannada       S. Dravidian    Eurasia          +       -       Steever (1998)
Somali        Cushitic        Africa           +       -       El-Solami-Mewis (1987)
Inuktitut     Eskimo-Aleut    Eurasia          +       -       Mallon (1991)
Swahili       Bantu           Africa           -       +       Seidel (1900)
Pawnee        Caddoan         N. America       -       +       Parks (1976)
Manambu       Sepik           Austr./N.Gui.    -       +       Aikhenvald (2008)
Lenakel       CE. M-Polynes.  SE. Asia/Oc.     -       -       Lynch (1978)

Our sample contains inflectional verbal paradigms of twenty areally and genetically diverse languages on the basis of Ruhlen's (1987) phyla and macroareas. We have considered only languages which have (at least some) subject agreement and TAM inflection on the same side of the stem, disregarding portmanteau expression of subject agreement + TAM, non-finite verb forms, and non-segmental exponence.

Table 10.1 shows the results of our survey, where a '+' for Ø-Agr (Ø-TAM) indicates that the language has at least one Ø-affix for subject agreement (TAM), whereas a '−' indicates that all relevant affixes of the language are non-zero.

Crucially, more than half of the languages (11 of 20) have some Ø-marking for both subject agreement and TAM, and virtually all languages (19 of 20) have some Ø-marking for either subject agreement or TAM. This result strikingly confirms our predictions. In fact, the marginal character of class 3 languages suggests that language learning might generally rely on Ø-affixes and completely avoid class 3 complexity. Even for Lenakel, the only plausible candidate for being of complexity 3 we have encountered so far, Lynch reports that:

In each case, the categories of person, tense, and number are obligatory, except that . . . tense may be omitted in certain . . . circumstances . . . Certain tense prefixes may be omitted under certain


conditions. The markers ak- and im- may be omitted in verbs with third person subjects when the context makes the time of action quite clear. (1978: 43, 52)

More generally, since we have only evaluated verbal paradigms, it is likely that the overall systems of many languages are actually of lower complexity, since the markers employed in verbal inflection may occur as free forms elsewhere, e.g. as independent pronouns or as affixes in nominal or adjectival inflection.


References

Abbott, C. (1984). 'Two feminine genders in Oneida'. Anthropological Linguistics 26, 125–37.
Abrams, P. (2006). Onondaga Pronominal Prefixes. PhD thesis, University of Buffalo.
Ackerman, F., J. Blevins, and R. Malouf (2009). 'Parts and wholes: Patterns of relatedness in complex morphological systems and why they matter'. In J. Blevins and J. Blevins (eds), Analogy in Grammar: Form and Acquisition, pp. 54–82. Oxford: Oxford University Press.
Aikhenvald, Alexandra Y. (2003). A grammar of Tariana. Cambridge: Cambridge University Press.
Aikhenvald, Alexandra Y. (2008). The Manambu language of East Sepik, Papua New Guinea. Oxford: Oxford University Press.
Albright, Adam (2002). 'Islands of reliability for regular morphology: Evidence from Italian'. Language 78, 684–709.
Albright, Adam, and Bruce Hayes (2003). 'Rules vs. Analogy in English past tenses: A computational/experimental study'. Cognition 90, 119–61.
Anderson, Gregory D. S., and K. David Harrison (2008). 'Remo (Bonda)'. In G. D. S. Anderson (ed.), The Munda languages. London: Routledge, pp. 557–632.
Anderson, Stephen R. (1988). 'Morphological change'. In Frederick J. Newmeyer (ed.), Linguistics: The Cambridge survey, vol. I, 324–62. Cambridge: Cambridge University Press.
Anderson, Stephen R. (1992). A-Morphous morphology. Cambridge: Cambridge University Press.
Anderson, Stephen R. (2005). Aspects of the theory of clitics. Oxford: Oxford University Press.
Anderson, Stephen R. (2011). 'Stress-conditioned allomorphy in Surmiran (Rumantsch)'. In Martin Maiden, John Charles Smith, Maria Goldbach, and Marc-Olivier Hinzelin (eds), Morphological autonomy: Perspectives from Romance inflectional morphology, pp. 13–35. Oxford: Oxford University Press.
Anderson, Stephen R. (to appear). 'The morpheme: Its nature and use'. In Matthew Baerman (ed.), Oxford handbook of inflection. Oxford: Oxford University Press.
Andrews, James Richard (1975). Introduction to Classical Nahuatl. Austin: University of Texas Press.
Aronoff, Mark (1994). Morphology by itself. Cambridge, MA: MIT Press.
Baayen, R. Harald, Rochelle Lieber, and Robert Schreuder (1997). 'The morphological complexity of simplex nouns'. Linguistics 35, 861–77.
Baayen, R. Harald, Petar Milin, Dusica Filipović Đurđević, Peter Hendrix, and Marco Marelli (2011). 'An amorphous model for morphological processing in visual comprehension based on naive discriminative learning'. Psychological Review 118, 438–81.
Baerman, Matthew (2004). 'Directionality and (Un)Natural Classes in Syncretism'. Language 80(4), 807–27.
Baerman, Matthew, Dunstan Brown, and Greville G. Corbett (2005). The syntax–morphology interface: A study of syncretism. Cambridge: Cambridge University Press.


Baerman, Matthew, Greville G. Corbett, and Dunstan Brown (eds) (2010). Defective paradigms: Missing forms and what they tell us. Oxford: Oxford University Press.
Baerman, Matthew, Greville G. Corbett, Dunstan Brown, and Andrew Hippisley (eds) (2007). Deponency and morphological mismatches. Oxford: Oxford University Press.
Baeza-Yates, Ricardo, and Berthier Ribeiro-Neto (1999). Modern Information Retrieval. New York, NY: ACM Press, Addison-Wesley.
Baker, Mark C. (1995). The polysynthesis parameter. New York: Oxford University Press.
Bauer, Laurie (2004). Morphological productivity. Cambridge: Cambridge University Press.
Beard, Robert (1995). Lexeme–morpheme base morphology. Albany, NY: SUNY Press.
Bickel, Balthasar (2003). 'Belhare'. In G. Thurgood and R. J. LaPolla (eds), The Sino-Tibetan languages. London: Routledge, pp. 546–70.
Blake, Barry J. (1979). A Kalkatungu grammar. Canberra: Dept. of Linguistics, Research School of Pacific Studies, Australian National University.
Blevins, J. (2004). 'Inflection classes and economy'. In Gereon Müller, Lutz Gunkel, and Gisela Zifonun (eds), Explorations in Nominal Inflection, pp. 375–96. Berlin: Mouton de Gruyter.
Blevins, J. (2006). 'Word-based morphology'. Journal of Linguistics 42, 531–73.
Blevins, J. (2013). 'Word-based morphology from Aristotle to modern WP (word and paradigm models)'. In K. Allan (ed.), The Oxford handbook of the history of linguistics, pp. 41–85. Oxford: Oxford University Press.
Boas, Franz (1947). 'Kwakiutl grammar, with a glossary of the suffixes'. Transactions of the American Philosophical Society 37(3), 201–377.
Bobaljik, Jonathan David (2007). On Comparative Suppletion. Ms., University of Connecticut.
Bobaljik, Jonathan David (2012). Universals in comparative morphology: Suppletion, superlatives, and the structure of words. Cambridge: MIT Press.
Bochner, H. (1993). Simplicity in Generative Morphology. Berlin: Mouton de Gruyter.
Boelaars, J. H. M. C. (1950). The linguistic position of south-western New Guinea. Leiden: E. J. Brill.
Booij, Geert (2010). Construction Morphology. Oxford: Oxford University Press.
Broadwell, George Aaron (2006). A Choctaw reference grammar. Lincoln: University of Nebraska Press.
Brøndal, Viggo (1940). 'Compensation et variation, deux principes de linguistique générale'. Scientia 68, 101–9. Reprinted: Id. 1943. Essais de linguistique générale, pp. 105–16. København: Munksgaard.
Brown, Dunstan, Carole Tiberius, and Greville G. Corbett (2007). 'The alignment of form and function: Corpus-based evidence from Russian'. International Journal of Corpus Linguistics 12, 511–34.
Burzio, Luigi (2004). 'Paradigmatic and syntagmatic relations in Italian verbal inflection'. In J. Auger, J. C. Clements, and B. Vance (eds), Contemporary Approaches to Romance Linguistics. Amsterdam: John Benjamins.
Bybee, Joan L., and Carol L. Moder (1983). 'Morphological classes as natural categories'. Language 59, 251–70.
Bybee, Joan L., and Dan Isaac Slobin (1982). 'Rules and Schemas in the Development and Use of the English Past Tense'. Language 58, 265–89.
Bybee, Joan L., R. D. Perkins, and W. Pagliuca (1994). The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago: University of Chicago Press.


Cairns, Charles E. (1986). 'Word structure, markedness, and applied linguistics'. In Markedness: Proceedings of the twelfth annual linguistics symposium of the University of Wisconsin-Milwaukee, March 11–12, 1983, ed. by Fred R. Eckman, Edith A. Moravcsik, and Jessica R. Wirth, pp. 13–38. New York: Plenum.
Carstairs, Andrew (1983). 'Paradigm economy'. Journal of Linguistics 19, 115–28.
Carstairs, Andrew (1984). 'Outlines of a constraint on syncretism'. Folia linguistica 18, 73–85.
Carstairs, Andrew (1987). Allomorphy in inflection. London: Croom Helm.
Carstairs, Andrew, and Paul Stemberger (1988). 'A processing constraint on inflectional homonymy'. Linguistics 26, 601–18.
Carstairs-McCarthy, Andrew (1994). 'Inflection classes, gender, and the principle of contrast'. Language 70, 737–88.
Carstairs-McCarthy, Andrew (2010). The evolution of morphology. Oxford: Oxford University Press.
Chafe, W. L. (1977). 'The evolution of third person verb agreement in the Iroquoian languages'. In C. Li (ed.), Mechanisms of Syntactic Change, pp. 493–524. Austin, Texas: University of Texas Press.
Clahsen, Harald (2006). 'Linguistic perspectives on morphological processing'. In D. Wunderlich (ed.), Advances in the Theory of the Lexicon, pp. 355–88. Berlin: Mouton de Gruyter.
Chomsky, Noam, and Morris Halle (1968). The sound pattern of English. New York: Harper & Row.
Chumakina, Marina (2011). 'Morphological complexity of Archi verbs'. In Gilles Authier and Timur Maisak (eds), Tense, aspect, modality and finiteness in East Caucasian languages (Diversitas Linguarum 30), pp. 1–24. Bochum: Brockmeyer.
Chumakina, Marina, Dunstan Brown, Harley Quilliam, and Greville G. Corbett (2007). Archi: A Dictionary of the Archi Villages, Southern Daghestan, Caucasus. <http://www.smg.surrey.ac.uk/archi/linguists/>.
Chumakina, Marina, and Greville G. Corbett (2008). 'Archi: The challenge of an extreme agreement system'. In Aleksandr V. Arxipov (ed.), Fonetika i nefonetika. K 70-letiju Sandro V. Kodzasova [Festschrift for S. V. Kodzasov], pp. 184–94. Moskva: Jazyki slavjanskix kul'tur.
Clackson, James (2007). Indo-European linguistics: An Introduction. Cambridge: Cambridge University Press.
Coltheart, Max, Kathleen Rastle, Conrad Perry, Robyn Langdon, and Johannes Ziegler (2001). 'DRC: A dual route cascaded model of visual word recognition and reading aloud'. Psychological Review 108, 204–56.
Corbett, Greville G. (2007). 'Deponency, syncretism and what lies between'. Proceedings of the British Academy 145, 21–43.
Corbett, Greville G. (2012). Features. Cambridge: Cambridge University Press.
Corbett, Greville G., and Norman M. Fraser (1993). 'Network morphology: A DATR account of Russian nominal inflection'. Journal of Linguistics 29, 113–42.
Cowan, Nelson (2001). 'The magical number 4 in short-term memory: A Reconsideration of mental storage capacity'. Behavioral and Brain Sciences 24, 87–114.
Croft, William (2003). Typology and universals. 2nd edition. Cambridge: Cambridge University Press.


Csúcs, Sándor (1998). 'Udmurt'. In D. Abondolo (ed.), The Uralic languages. London and New York: Routledge, pp. 276–304.
Culicover, Peter W. (2013). Grammar and Complexity: Language at the Intersection of Competence and Performance. Oxford: Oxford University Press.
Culy, Christopher (1985). 'The complexity of the vocabulary of Bambara'. Linguistics and Philosophy 8, 345–51.
Daelemans, Walter, and Antal van den Bosch (2005). Memory-based language processing. Cambridge: Cambridge University Press.
Dahl, Östen (2004). The growth and maintenance of linguistic complexity. Amsterdam: John Benjamins.
Davies, John (1989). Kobon. London: Routledge.
Davis, J. Colin, and S. Jeffrey Bowers (2004). 'What do letter migration errors reveal about letter position coding in visual word recognition?' Journal of Experimental Psychology: Human Perception and Performance 30, 923–41.
Dehaene, Stanislas (2009). Reading in the brain. New York: Penguin.
Dimmendaal, Gerrit Jan (1983). The Turkana language. Dordrecht: Foris.
Donohue, Mark (1999). Warembori. Muenchen: Lincom Europa.
Donohue, Mark (2008). 'Complex predicates and bipartite stems in Skou'. Studies in Language 32(2), 279–335.
Donohue, Mark (2011). 'Case and configurationality: Scrambling or mapping?' Morphology 21, 499–513.
Drabbe, P. (1947). Nota's over de Jénggalntjoer-taal. Merauke: Archief.
Drabbe, P. (1950). 'Talen en Dialecten van Zuid-West Nieuw-Guinea'. Anthropos 45, 545–74.
Driem, George van (1987). A grammar of Limbu. Berlin: Mouton de Gruyter.
Ehala, Martin (2009). 'Linguistic strategies and markedness in Estonian morphology'. Sprachtypologie und Universalienforschung 1/2, 29–48.
Elman, Jeff (1998). 'Generalization, simple recurrent networks, and the emergence of structure'. Proceedings of the 20th Annual Conference of the Cognitive Science Society. Mahwah, NJ: Lawrence Erlbaum Associates.
El-Solami-Mewis, Catherine (1987). Lehrbuch des Somali. Leipzig: VEB Verlag Enzyklopädie.
Evans, Nicholas (1995a). A grammar of Kayardild: With historical-comparative notes on Tangkic. Berlin: Mouton de Gruyter.
Evans, Nicholas (1995b). 'Multiple case in Kayardild: Anti-iconic suffix order and the diachronic filter'. In Frans Plank (ed.), Double case: Agreement by suffixaufnahme, pp. 396–428. New York: Oxford University Press.
Ferro, Marcello, Claudia Marzi, and Vito Pirrelli (2011). 'A Self-Organizing Model of Word Storage and Processing: Implications for Morphology Learning'. Lingue e Linguaggio (10)2, 209–26. Bologna: il Mulino.
Finkel, Raphael, and Gregory Stump (2007). 'Principal parts and morphological typology'. Morphology 17, 39–75.
Finkel, Raphael, and Gregory Stump (2009). 'Principal parts and degrees of paradigmatic transparency'. In James P. Blevins and Juliette Blevins (eds), Analogy in grammar: Form and acquisition, pp. 13–53. Oxford: Oxford University Press.

Fortescue, M. (1980). 'Affix ordering in West Greenlandic derivational processes'. International Journal of American Linguistics 46(4), 259–78.
Gathercole, E. Susan, and Alan D. Baddeley (1989). 'Evaluation of the role of phonological STM in the development of vocabulary in children: A longitudinal study'. Journal of Memory and Language 28, 200–13.
Gazdar, G., G. Pullum, B. Carpenter, E. Klein, T. Hukari, and R. Levine (1988). 'Category structures'. Computational Linguistics 14, 1–19.
Gerken, Lou Ann (2006). 'Decisions, decisions: Infant language learning when multiple generalizations are possible'. Cognition 98, B67–B74.
Gershenson, Carlos, and Nelson Fernández (2012). 'Complexity and information: Measuring emergence, self-organization, and homeostasis at multiple scales'. Complexity 18, 29–44.
Givón, Talmy (1971). 'Historical syntax and synchronic morphology: An archaeologist's field trip'. Proceedings of the Chicago Linguistic Society 7, 394–415.
Goldsmith, John A. (2010). 'Segmentation and morphology'. In A. Clark, C. Fox, and S. Lappin (eds), Handbook of Computational Linguistics and Natural Language Processing. Oxford: Blackwell.
Greenberg, Joseph (1966). Language universals, with special reference to the feature hierarchies. The Hague: Mouton. Reprinted 2005, Berlin: Mouton de Gruyter.
Halle, Morris (1997). 'Distributed Morphology: Impoverishment and Fission'. In B. Bruening, Y. Kang, and M. McGinnis (eds), Papers at the Interface. Vol. 30 of MIT Working Papers in Linguistics, pp. 425–49. Cambridge, MA: MITWPL.
Halle, Morris, and Alec Marantz (1993). 'Distributed Morphology and the Pieces of Inflection'. In K. Hale and S. J. Keyser (eds), The View from Building 20, pp. 111–76. Cambridge, MA: MIT Press.
Hammarström, Harald, and Lars Borin (2011). 'Unsupervised learning of morphology'. Computational Linguistics 37(2), 309–50.
Hankamer, J. (1989). 'Morphological parsing and the lexicon'. In W. Marslen-Wilson (ed.), Lexical Representation and Process, pp. 392–408. Cambridge, MA: MIT Press.
Hargus, Sharon (1997). 'The Athabaskan disjunct prefixes: Clitics or affixes?' In Jane Hill, P. J. Mistry, and Lyle Campbell (eds), The life of language: Papers in linguistics in honor of William Bright. Berlin: Mouton de Gruyter.
Harm, W. Michael, and Mark S. Seidenberg (1999). 'Phonology, reading acquisition, and dyslexia: Insights from connectionist models'. Psychological Review 106, 491–528.
Harris, Zellig Sabbatai (1955). 'From phoneme to morpheme'. Language 31, 190–222.
Haspelmath, Martin (2006). 'Against markedness (and what to replace it with)'. Journal of Linguistics 42, 25–70.
Hawkins, John A. (2004). Efficiency and complexity in grammars. Oxford: Oxford University Press.
Hay, Jennifer, and R. Harald Baayen (2003). 'Phonotactics, parsing and productivity'. Italian Journal of Linguistics 15, 99–130.
Hippisley, Andrew (2010). 'Lexical analysis'. In Nitin Indurkhya and Fred J. Damerau (eds), Handbook of natural language processing, 2nd edition, pp. 31–58. Boca Raton, FL: CRC Press, Taylor and Francis Group.

Hjelmslev, Louis (1935). La catégorie des cas. Copenhagen: Munksgaard.
Hockett, Charles F. (1947). 'Problems of morphemic analysis'. Language 23, 321–43.
Hockett, Charles F. (1958). A course in modern linguistics. New York: Macmillan Company.
Huntley, David (1993). 'Old Church Slavonic'. In Bernard Comrie and Greville G. Corbett (eds), The Slavonic languages, pp. 125–87. London: Routledge.
Jacobi, Hermann (1886). Ausgewählte Erzählungen in Mâhârâshtrî. Leipzig: Hirzel.
Jakobson, Roman (1939). 'Signe zéro'. In Mélanges de linguistique offerts à Charles Bally, pp. 143–52. Genève: Georg.
Jurafsky, Daniel, and James H. Martin (2009). Speech and language processing. London: Pearson Education Ltd.
Kaas, Jon H., Michael M. Merzenich, and Herbert P. Killackey (1983). 'The reorganization of somatosensory cortex following peripheral nerve damage in adults and developing mammals'. Annual Review of Neuroscience 6, 325–56.
Keuleers, Emmanuel, and Walter Daelemans (2007). 'Morphology-based learning models of inflectional morphology: A methodological case study'. In V. Pirrelli (ed.), Psycho-computational issues in morphology and processing, Lingue e Linguaggio 2, 151–74.
Kibrik, Aleksandr E. (1977a). Opyt strukturnogo opisanija arčinskogo jazyka. Volume 2: Taksonomičeskaja grammatika. Moskva: Izdatel'stvo Moskovskogo Universiteta.
Kibrik, Aleksandr E. (1977b). Opyt strukturnogo opisanija arčinskogo jazyka. Volume 3: Dinamičeskaja grammatika. Moskva: Izdatel'stvo Moskovskogo Universiteta.
Kibrik, Aleksandr, Alexandre Arkhipov, Mikhail Daniel, and Sandro Kodzasov (2007). Archi text corpus. <http://www.philol.msu.ru/~languedoc/eng/archi/corpus.php>.
Kibrik, Aleksandr E., Sandro Kodzasov, Irina Olovjannikova, and Džalil Samedov (1977). Opyt strukturnogo opisanija arčinskogo jazyka. Volume 1: Leksika, fonetika. Moskva: Izdatel'stvo Moskovskogo Universiteta.
Koenig, Jean-Pierre, and Karin Michelson (2012). 'The (non)universality of syntactic selection and functional application'. In C. Piñon (ed.), Empirical studies in syntax and semantics, Volume 9, pp. 185–205. Paris: Centre National de la Recherche Scientifique.
Koenig, Jean-Pierre, and Karin Michelson (in press). 'Invariance in argument realization: The case of Iroquoian'. Language 90(4).
Kohonen, Teuvo (2001). Self-organizing maps. Berlin and Heidelberg: Springer-Verlag.
Kossmann, Maarten G. (2007). 'Berber morphology'. In A. Kaye (ed.), Morphologies of Asia and Africa, pp. 429–46. Winona Lake, Indiana: Eisenbrauns.
Kostić, Aleksandar, and Milena Božić (2007). 'Constraints on probability distributions of grammatical forms'. Psihologija 40, 5–35.
Kostić, Aleksandar, Tania Markovic, and Aleksandar Baucal (2003). 'Inflectional morphology and word meaning: Orthogonal or co-implicative domains'. In R. H. Baayen and R. Schreuder (eds), Morphological Structure in Language Processing, pp. 1–44. Berlin: Mouton de Gruyter.
Koutnik, Jan (2007). 'Inductive modelling of temporal sequences by means of self-organization'. In Proceedings of the International Workshop on Inductive Modelling (IWIM 2007), pp. 269–77. Prague: Czech Technical University.
Langer, Hagen (1991). Ein automatisches Morphemsegmentierungsverfahren für das Deutsche. PhD thesis, Georg-August-Universität zu Göttingen.

Lanman, Charles R. (1880). 'A statistical account of noun-inflection in the Veda'. Journal of the American Oriental Society 10, 325–601.
Leumann, Manu (1977). Lateinische Laut- und Formenlehre. München: Beck.
Lounsbury, F. (1953). Oneida verb morphology. Yale University Publications in Anthropology 48. New Haven, CT: Yale University Press.
Lovins, Julie Beth (1968). 'Development of a stemming algorithm'. Mechanical Translation and Computational Linguistics 11(2), 22–31.
Lynch, John (1978). A grammar of Lenakel. Vol. 55 of Pacific Linguistics, Series B.
Mace, John (2003). Persian grammar: For reference and revision. London: Routledge.
Mallon, Mick (1991). Introductory Inuktitut: Reference grammar. Montreal: Arctic College–McGill University Inuktitut Text Project.
Maiden, Martin (2005). 'Morphological autonomy and diachrony'. Yearbook of Morphology 2004, 137–75.
Maiden, Martin, and Paul O'Neill (2010). 'On morphomic defectiveness'. Proceedings of the British Academy 163, 103–24.
Marchand, Hans (1969). The categories and types of present-day English word formation. 2nd edition. München: Beck.
Marcus, Gary F. (2001). The algebraic mind: Integrating connectionism and cognitive science. Cambridge, MA: MIT Press.
Marcus, Gary F., S. Vijayan, Shoba Bandi Rao, and Peter M. Vishton (1999). 'Rule learning in 7-month-old infants'. Science 283, 77–80.
Marzi, Claudia, Marcello Ferro, Claudia Caudai, and Vito Pirrelli (2012a). 'Evaluating Hebbian self-organizing memories for lexical representation and access'. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp. 886–93. Istanbul.
Marzi, Claudia, Marcello Ferro, and Vito Pirrelli (2012b). 'Prediction and generalisation in word processing and storage'. In Proceedings of the 8th Mediterranean Morphology Meeting (8th MMM 2011), pp. 114–31. University of Patras, Greece.
Marzi, Claudia, Marcello Ferro, and Vito Pirrelli (2012c). 'Word alignment and paradigm induction'. Lingue e Linguaggio 11(2), 251–74.
Matthews, P. H. (1974). Morphology: An introduction to the theory of word-structure. Cambridge: Cambridge University Press.
Matthews, P. H. (1991). Morphology. 2nd edition. Cambridge: Cambridge University Press.
McCarthy, John (2010). An introduction to Harmonic Serialism. Ms., University of Massachusetts, Amherst.
McClelland, James L., and David E. Rumelhart (1981). 'An interactive activation model of context effects in letter perception: Part 1. An account of basic findings'. Psychological Review 88, 375–407.
Meir, Irit, Wendy Sandler, Carol Padden, and Mark Aronoff (2010). 'Emerging sign languages'. In Marc Marschark and Patricia Elizabeth Spencer (eds), Oxford handbook of deaf studies, language, and education, vol. 2, pp. 267–80. New York: Oxford University Press.
Miestamo, Matti, Kaius Sinnemäki, and Fred Karlsson (2008). Language complexity: Typology, contact, change. Amsterdam: John Benjamins.
Milin, Petar, Victor Kuperman, Aleksandar Kostić, and R. Harald Baayen (2009). 'Words and paradigms bit by bit: An information-theoretic approach to the processing of paradigmatic structure in inflection and derivation'. In James P. Blevins and Juliette Blevins (eds), Analogy in grammar: Form and acquisition, pp. 214–52. Oxford: Oxford University Press.

Milizia, Paolo (2013). L'equilibrio nella codifica morfologica. Roma: Carocci.
Milizia, Paolo (2014). 'Semi-separate exponence in cumulative paradigms: Information-theoretic properties exemplified by Ancient Greek verb endings'. Linguistic Issues in Language Technology 11(4), 95–123.
Mohri, Mehryar, and Richard Sproat (2006). 'On a common fallacy in computational linguistics'. In M. Suominen, A. Arppe, A. Airola, O. Heinämäki, M. Miestamo, U. Määttä, J. Niemi, K. K. Pitkänen, and K. Sinnemäki (eds), A Man of Measure: Festschrift in Honour of Fred Karlsson on his 60th Birthday. SKY Journal of Linguistics 19, 432–9.
Moscoso del Prado Martín, Fermín, Aleksandar Kostić, and R. Harald Baayen (2004). 'Putting the bits together: An information-theoretical perspective on morphological processing'. Cognition 94(1), 1–18.
Müller, Gereon (2003). 'On decomposing inflection class features: Syncretism in Russian noun inflection'. In L. Gunkel, G. Müller, and G. Zifonun (eds), Explorations in nominal inflection, pp. 189–228. Berlin: Mouton de Gruyter.
Müller, Gereon (2005). Subanalyse verbaler Flexionsmarker. Ms., Universität Leipzig.
Noreen, Adolf (1923). Altisländische und altnorwegische Grammatik. Halle: Niemeyer.
Oberlies, Thomas (2001). Pali: A grammar of the language of the Theravāda Tipiṭaka. Berlin: Mouton de Gruyter.
O'Neill, Paul (2011). 'The notion of the morphome'. In M. Goldbach, M. Maiden, and J.-C. Smith (eds), Morphological autonomy: Perspectives from Romance inflectional morphology, pp. 70–94. Oxford: Oxford University Press.
Orsolini, Margherita, Rachele Fanari, and Hugo Bowles (1998). 'Acquiring regular and irregular inflection in a language with verb classes'. Language and Cognitive Processes 13, 425–64.
Orsolini, Margherita, and William Marslen-Wilson (1997). 'Universals in morphological representation: Evidence from Italian'. Language and Cognitive Processes 12, 1–47.
Ortmann, Albert (1999). 'Affix repetition and non-redundancy in inflectional morphology'. Zeitschrift für Sprachwissenschaft 1, 76–120.
Packard, Jerome L. (2000). The morphology of Chinese: A linguistic and cognitive approach. Cambridge: Cambridge University Press.
Papagno, Costanza, Tim Valentine, and Alan Baddeley (1991). 'Phonological short-term memory and foreign-language vocabulary learning'. Journal of Memory and Language 30, 331–47.
Parks, Douglas Richard (1976). A grammar of Pawnee. New York: Garland.
Penfield, Wilder, and Lamar Roberts (1959). Speech and brain mechanisms. Princeton, NJ: Princeton University Press.
Perry, Conrad, Johannes C. Ziegler, and Marco Zorzi (2007). 'Nested incremental modeling in the development of computational theories: The CDP+ model of reading aloud'. Psychological Review 114(2), 273–315.
Pirrelli, Vito (2000). Paradigmi in Morfologia. Un approccio interdisciplinare alla flessione verbale dell'italiano. Pisa-Roma: Istituti Editoriali e Poligrafici Italiani.
Pirrelli, Vito, Basilio Calderone, Ivan Herreros, and Michele Virgilio (2004). 'Non-locality all the way through: Emergent global constraints in the Italian morphological lexicon'. In Proceedings of the 7th Meeting of the ACL SIGPHON.

Pirrelli, Vito, Marcello Ferro, and Basilio Calderone (2011). 'Learning paradigms in time and space: Computational evidence from Romance languages'. In M. Maiden, John C. Smith, Maria Goldbach, and Marc-Olivier Hinzelin (eds), Morphological autonomy: Perspectives from Romance inflectional morphology, pp. 135–57. Oxford: Oxford University Press.
Pischel, Richard (1900). Grammatik der Prakrit-Sprachen. Strassburg: Trübner.
Plank, Frans, and Wolfgang Schellinger (1997). 'The uneven distribution of genders over numbers: Greenberg Nos. 37 and 45'. Linguistic Typology 1, 53–101.
Plaut, David, James McClelland, Mark Seidenberg, and Karalyn Patterson (1996). 'Understanding normal and impaired word reading: Computational principles in quasi-regular domains'. Psychological Review 103, 56–115.
Polinsky, Maria, and Eric Potsdam (2001). 'Long-distance agreement and topic in Tsez'. Natural Language and Linguistic Theory 19, 583–646.
Porter, Martin (1980). 'An algorithm for suffix stripping'. Program 14(3), 130–7.
de Reuse, Willem Joseph (1994). Siberian Yupik Eskimo: The language and its contacts with Chukchi. Salt Lake City: University of Utah Press.
Rice, Keren (2000). Morpheme order and semantic scope: Word formation in the Athabaskan verb. Cambridge: Cambridge University Press.
Rice, Keren (2011). 'Principles of affix ordering: An overview'. Word Structure 4(2), 169–200.
Rice, Keren (2012). 'Morphological complexity in Athabaskan languages: A focus on discontinuities'. Presented at the SSILA workshop on Morphological Complexity in the Languages of the Americas, Annual Meeting of the Linguistic Society of America, Portland, Oregon, January 2012.
Rissanen, Jorma (2007). Information and complexity in statistical modeling. New York: Springer.
Roark, Brian, and Richard Sproat (2007). Computational approaches to morphology and syntax. Oxford: Oxford University Press.
Rosenblatt, Frank (1962). Principles of neurodynamics. New York: Spartan.
Round, Erich R. (2009). Kayardild morphology, phonology and morphosyntax. PhD dissertation, Yale University.
Round, Erich R. (2010). 'Autonomous morphological complexity in Kayardild'. Paper presented at the Workshop on Morphological Complexity, Harvard University, 12 January 2010.
Round, Erich R. (2011). 'Morphomes as a level of representation capture unity of exponence across the inflection–derivation divide'. Linguistica 51, 217–30.
Round, Erich R. (2013). Kayardild morphology and syntax. Oxford: Oxford University Press.
Round, Erich R. (forthcoming). 'Kayardild inflectional morphotactics is morphomic'. In A. Luís and R. Bermúdez-Otero (eds), The morphome debate. Oxford: Oxford University Press.
Round, Erich R. (in prep.). Paradigmatic evidence for morphomic organisation in Kayardild inflection.
Ruhlen, Merritt (1987). A guide to the world's languages: Classification. Vol. 1. Stanford: Stanford University Press.
Russell, S., and P. Norvig (2009). Artificial intelligence: A modern approach. 3rd edition. Upper Saddle River, NJ: Prentice Hall.
Sagot, Benoît (2013). 'Comparing complexity measures'. Paper presented at the workshop 'Computational Approaches to Morphological Complexity', 22 February 2013, Paris.

Sagot, Benoît, and Géraldine Walther (2011). 'Non-canonical inflection: Data, formalisation and complexity measures'. In Actes de SFCM 2011 (2nd Workshop on Systems and Frameworks for Computational Morphology), Zürich, Switzerland.
Saffran, Jenny, Richard N. Aslin, and Elissa L. Newport (1996). 'Statistical learning by 8-month-old infants'. Science 274, 1926–8.
Saffran, Jenny, Elissa L. Newport, and Richard N. Aslin (1996). 'Word segmentation: The role of distributional cues'. Journal of Memory and Language 35, 606–21.
Salanova, Andres Pablo (2012). 'Reduplication and verbal number in Mebengokre'. Presented at the SSILA workshop on Morphological Complexity in the Languages of the Americas, Annual Meeting of the Linguistic Society of America, Portland, Oregon, January 2012.
Sampson, Geoffrey, David Gil, and Peter Trudgill (2009). Language complexity as an evolving variable (Studies in the Evolution of Language 13). Oxford: Oxford University Press.
Sapir, Edward (1921). Language. New York: Harcourt, Brace & World.
Say, Tessa, and Harald Clahsen (2002). 'Words, rules and stems in the Italian mental lexicon'. In S. Nooteboom, Fred Weerman, and Frank Wijnen (eds), Storage and computation in the language faculty, pp. 96–122. Dordrecht: Kluwer Academic Publishers.
Schmitt, Rüdiger (1981). Grammatik des Klassisch-Armenischen. Vol. 32 of Innsbrucker Beiträge zur Sprachwissenschaft. Innsbruck: Institut für Sprachwissenschaft der Universität Innsbruck.
Schönig, Claus (1998). 'Azerbaidjanian'. In L. Johanson and E. A. Csató (eds), The Turkic Languages, pp. 248–60. London: Routledge.
Segel, Esben (2008). 'Re-evaluating zero: When nothing makes sense'. SKASE Journal of Theoretical Linguistics 5(2), 1–20.
Seidel, August (1900). Swahili Konversationsgrammatik. Heidelberg: Julius Groos.
Shannon, Claude E. (1948). 'A mathematical theory of communication'. Bell System Technical Journal 27, 379–423.
Shannon, Claude E. (1951). 'Prediction and entropy of printed English'. Bell System Technical Journal 30, 50–64.
Spencer, Andrew (2007). 'Extending deponency: Implications for morphological mismatches'. Proceedings of the British Academy 145, 45–70.
Steever, Sanford B. (1998). 'Kannada'. In S. B. Steever (ed.), The Dravidian languages, pp. 129–57. London: Routledge.
Stump, Gregory T. (1993). 'On rules of referral'. Language 69, 449–79.
Stump, Gregory T. (2001). Inflectional morphology. Cambridge: Cambridge University Press.
Stump, Gregory T. (2002). 'Morphological and syntactic paradigms: Arguments for a theory of paradigm linkage'. Yearbook of Morphology 2001, 147–80.
Stump, Gregory T. (2012). 'The formal and functional architecture of inflectional morphology'. In A. Ralli, G. Booij, S. Scalise, and A. Karasimos (eds), Morphology and the Architecture of Grammar: Online Proceedings of the Eighth Mediterranean Morphology Meeting, pp. 245–70.
Stump, Gregory, and Raphael A. Finkel (2013). Morphological typology: From word to paradigm. Cambridge: Cambridge University Press.
Thumb, Albert, and Richard Hauschild (1959). Handbuch des Sanskrit: Eine Einführung in das sprachwissenschaftliche Studium des Altindischen. II. Teil. Heidelberg: Winter.
Trommer, Jochen, and Gereon Müller (eds) (2006). Subanalysis of argument encoding in distributed morphology. Vol. 84 of Linguistische Arbeits Berichte. Leipzig: Institut für Linguistik, Universität Leipzig.

Vaillant, André (1958). Grammaire comparée des langues slaves. Tome II: Morphologie. Deuxième partie: Flexion pronominale. Lyon-Paris: IAC.
Vajda, Edward J. (2010). 'A Siberian link with Na-Dene languages'. In James Kari and Ben A. Potter (eds), The Dene–Yeniseian connection (Anthropological Papers of the University of Alaska, New Series, Vol. 5(1–2)), pp. 33–99. Fairbanks: Department of Anthropology, University of Alaska Fairbanks.
Voegtlin, Thomas (2002). 'Recursive self-organizing maps'. Neural Networks 15, 979–91.
Von Hinüber, Oskar (1968). Studien zur Kasussyntax des Pali, besonders des Vinaya-piṭaka. München: Kitzinger.
Weiers, Michael (2011). 'Moghol'. In J. Janhunen (ed.), The Mongolic Languages, pp. 248–64. London: Routledge.
Whitney, Carol (2001). 'Position-specific effects within the SERIOL framework of letter-position coding'. Connection Science 13, 235–55.
Wiese, Bernd (2004). 'Categories and paradigms: On underspecification in Russian declension'. In Gereon Müller, Lutz Gunkel, and Gisela Zifonun (eds), Explorations in nominal inflection, pp. 321–72. Berlin: Mouton de Gruyter.
Wunderlich, Dieter, and Ray Fabri (1994). 'Minimalist morphology: An approach to inflection'. Zeitschrift für Sprachwissenschaft 20, 236–94.
Yu, Alan C. L. (2007). A natural history of infixation. Oxford: Oxford University Press.
Zúñiga, Fernando (2000). Mapudungun. München: Lincom Europa.
Zwicky, A. (1985). 'How to describe inflection'. In M. Niepokuj, M. van Clay, V. Nikiforidou, and D. Feder (eds), Proceedings of the Eleventh Annual Meeting of the Berkeley Linguistics Society, pp. 372–86. Berkeley, CA: Berkeley Linguistics Society.

Languages Index

Arabic 149
Archi 3, 9, 18, 93–116, 177, 185
Armenian 203
Athabaskan 17
Azerbaijani 203
Babine-Witsuwit'en 19
Bambara 4
Belhare 203
Berber 203
Berik 53
Celtic 16, 23
Chinantec 3
Choctaw 21, 203
Church Slavic, Old 177
Dargwa 67
Emplawas, Bobotand 54
English 8, 18, 25, 120, 143, 148
English, American 22, 119
Eskimo, Greenlandic 70
Eskimo-Aleut 18
Estonian 199–201
Finnish 54
French 8
Georgian 25
German 8, 148, 149, 152, 154, 155, 158, 159, 162, 185, 186–90, 196–8
Gothic 199
Greek 3, 90
Icelandic 25
Icelandic, Old 178
Iha 67
Indic
   Middle 167–84
   Old 167–84
Inuktitut 203
Iroquoian 9
Italian 149, 158, 159, 160, 162
Jaina-Māhārāṣṭrī Prakrit 183
Kalkatungu 203
Kannada 203
Kanum 53–68
Kayardild 30–52
Kobon 203
Kwakw'ala 14–17, 18, 19, 24
Latin 3, 22, 38, 49, 79, 90, 130, 168, 178
Lenakel 201–4
Lingala 54
Manambu 203
Mandarin 18, 25
Mapudungun 203
Meryam Mer 54
Moghol 203
Mohawk 17, 25
Nahuatl 203
Nakh-Daghestanian 93
Oneida 9, 17, 69–92, 99
Pali 168, 175–82
Palu'e 67
Pawnee 203
Persian 199
Quechua 7
Remo 203
Russian 178
Salish 18
Sanskrit 199
Sanskrit, Vedic 20
Sign Language, Al Sayyid Bedouin 25
Skou 67
Slavic 179
Slovene 177
Somali 203
Surmiran 23
Swahili 190–1, 198–200, 203

Tariana 67
Tlingit-Athabaskan-Eyak 20
Tsez 112
Turkana 203
Turkish 70
Ubykh 199
Udmurt 203
Vietnamese 18
Wakashan 16, 18
Warlpiri 23
Yeniseian 20
Yupik, Central Alaskan 17
Yupik, Central Siberian 18, 19

Names Index

Abbott 71
Abrams 70
Ackerman 4, 5, 6, 77, 88, 119, 128
Aikhenvald 67, 203
Albright 142, 147, 161, 165
Anderson 4, 14, 35, 67, 69, 187, 203
Andrews 203
Aronoff 23, 49
Baayen 147, 163, 168
Baddeley 163
Baerman 22, 54, 57, 85, 167, 174, 179, 182
Baeza-Yates 192
Baker 11, 12
Bank 9
Bauer 68
Beard 39, 40, 41
Bickel 203
Blake 203
Blevins 69, 79, 141, 143
Boas 18
Bobaljik 4, 199
Boelaars 54
Booij 142
Borin 164
Bowers 145
Božić 168
Broadwell 203
Brøndal 167, 179
Brown 169
Burzio 142
Bybee 20, 147
Cairns 168
Carstairs see Carstairs-McCarthy
Carstairs-McCarthy 4, 13, 69, 79, 147, 169
Chafe 72, 84
Chomsky 24
Chumakina 9
Clackson 183
Clahsen 147, 166
Coltheart 145
Corbett 9, 38, 49, 142
Cowan 163
Croft 168
Csúcs 203
Culicover 7
Culy 4
Daelemans 164
Dammel 7
Davies 203
Davis 145
de Reuse 18
de Saussure 13
Dehaene 156
Dimmendaal 203
Drabbe 54
Ehala 200
El-Solami-Mewis 203
Evans 30, 33, 34, 35, 42
Fabri 187
Fernández 8
Ferro 9
Finkel 6, 9, 69, 77, 119
Fortescue 70
Fraser 49, 142
Gathercole 163
Gazdar 77
Gerken 145
Gershenson 8
Givón 25
Goldsmith 185, 201
Greenberg 167, 168
Halle 24, 194, 201
Hammarström 164
Hankamer 70
Hargus 19
Harm 145
Harris 201
Haspelmath 168
Hauschild 171, 177, 180
Hawkins 168
Hay 163
Hayes 142, 161
Hippisley 5

Hjelmslev 167, 182
Hockett 7, 22
Huntley 177
Jacobi 183
Jakobson 167
Jurafsky 4
Kaas 151
Keuleers 164
Kibrik 93, 94, 103, 114
Koenig 9
Kohonen 143, 151
Kossmann 203
Kostić 168, 170
Koutnik 152
Kürschner 7
Kuster 7
Langer 201
Lanman 171
Leumann 179
Lounsbury 70, 72, 82, 88
Lovins 185
Lynch 201, 202, 203
Mace 199
Maiden 30, 43, 49
Mallon 203
Marantz 194, 201
Marchand 18
Marcus 143, 144, 145, 146, 150, 159, 161, 165
Marslen-Wilson 147
Martin 4
Marzi 9
Matthews 35, 142
McCarthy 195
McClelland 145
Meir 25
Michelson 9
Miestamo 7
Milin 119, 128, 168, 170
Milinet 6
Milizia 9
Moder 147
Mohri 4
Moscoso del Prado Martín 6, 119, 128, 142, 168
Müller 185, 186, 197, 198, 201
Noreen 178
Norvig 77
Oberlies 175, 176
O'Neill 49
Orsolini 147
Ortmann 67, 201
Packard 18
Papagno 163
Parks 203
Penfield 151
Perry 145
Pirrelli 9
Pischel 183
Plank 177
Plaut 145
Polinsky 112
Porter 185
Potsdam 112
Ribeiro-Neto 192
Rice 19, 21, 185
Rissanen 170
Roark 149
Roberts 151
Rosenblatt 144
Ruhlen 202
Rumelhart 145
Russell 77
Saffran 145, 201
Sagot 5, 6
Sampson 7
Sapir 11
Saussure 13
Say 147
Schellinger 177
Schmitt 203
Schönig 203
Segel 187
Seidel 191, 198, 203
Seidenberg 145
Shannon 128, 169, 170
Slobin 147
Spencer 38
Sproat 4, 149
Steever 203
Stemberger 147
Stump 6, 9, 35, 36, 50, 54, 69, 75, 77, 84, 85, 119, 142, 179, 194
Thumb 171, 177, 180
Trommer 9

Vaillant 177
Vajda 20
Van den Bosch 164
Voegtlin 152
Von Hinüber 175
Walther 6
Weiers 203
Whitney 145
Wiese 179
Wunderlich 187
Yu 107
Zúñiga 203
Zwicky 36, 54, 76, 83

Subject Index

ablaut
   apophony 148
   vowel alternation 22, 36, 43, 123–4, 145
acquisition of morphology 25, 70, 87, 89, 86 n. 9, 89, 129–30, 141–8, 150, 151, 153–7, 159, 163–5, 185–204
agreement 9, 20, 54–5, 58, 61–8, 93–116, 102, 174 n. 12, 185–92, 197–203
allomorphy 7–8, 13, 15–16, 23–4, 46–7, 50, 73–4, 76–7, 79 n. 5, 81–2, 86–8, 89, 91, 125 n. 3, 202
architecture of grammar 20, 24, 49, 142, 166
circumfix 21, 148, 164
clitics 20, 24
compounding 18, 25, 65–6
connectionism 142, 144, 148, 150, 169, 175, 177, 179
cumulative exponence 21, 44, 45, 51, 169
defectiveness 22, 65
deponency 22, 38–9, 65
derivation 18, 20, 40, 41, 45, 48–9, 50, 99, 100, 107, 142, 151, 175 n. 14
dynamic systems 142, 151, 155, 166
empty morph 15, 21
entropy 5–6, 77, 119, 126, 127, 128–9, 136 n. 9, 142 n. 1, 170, 173 n. 10
exponence, rule of 35, 36, 37, 40, 75, 77, 79, 81–5, 91, 195 n. 8
faithfulness 13, 16
frequency 88, 103–4, 151, 163, 165, 168–75, 177, 179, 180
grammaticalization 24, 66
homonymy (accidental identity) 84
incorporation 71
infixation 9, 18, 21, 93–116
inflection class 4–6, 8–9, 29, 30, 49, 50, 69, 86, 88 n. 10, 90, 119–20, 124–6, 128, 130, 132, 136–7, 149 n. 3, 168 n. 2, 172 n. 9
information theory 168
learning see acquisition of morphology
learning algorithms 143, 164, 187, 192–202
Lexical Phonology 24
lexical specification 4, 98 n. 6, 111–16
markedness 9, 13, 16, 167–9, 176, 182 n. 24, 191
morphological features 5, 75, 192
morphology
   abstractive 141–66
   constructive 141, 143, 151
   word-based 9, 35, 141
morphome 8, 29–30, 32 n. 1, 39, 43–51
multiple exponence 21, 67, 101, 200, 201
Optimality Theory 13, 195
paradigm function 35, 76, 142
phonology 3, 4, 9, 13, 15–16, 20, 23–4, 29, 30, 35–6, 39–48, 50, 51, 75, 84, 86, 87, 93, 94, 95, 99–100, 101–14, 116, 120, 124, 126, 142, 149 n. 3, 152, 164, 174 n. 13, 175, 176, 177, 180 n. 23, 183 n. 25, 185 n. 1, 187 n. 3, 192, 197 n. 11, 201, 202, 203 n. 15
portmanteau form 44, 54, 57, 62, 65, 71, 188, 199, 203
predictability 5–6, 8, 20, 23, 65–8, 119, 125–7, 136–8
processing 141–3, 145, 147, 149, 150–2, 155, 163–6, 202
rarity 18, 53, 54, 61, 93, 169, 175, 182 n. 24, 183
redundancy 7, 141, 169, 170 n. 4
referral, rule of 8, 35–40, 42, 48, 50, 53–5, 58–65, 68, 81, 84–6, 169 n. 3, 173 n. 11, 194
segmentation 9, 69–71, 76–7, 86 n. 9, 91, 159, 164, 177, 186–92, 195–9, 201–2
self-organisation 142, 143, 151, 155, 161, 164
semantics 3, 8, 9, 14–15, 19–20, 23, 29, 36, 70 n. 2, 71–2, 74, 75–6, 78–9, 83, 89, 91, 114–15, 174 n. 12, 180
structuralism 21, 77
subanalysis 159, 186–9, 191, 192, 196–202
subanalysis complexity classes 189–92, 199–203
suppletion 4, 54, 57, 60–1, 65, 67–8

syncretism 9, 22, 29, 30–7, 39, 42, 57–8, 62–3, 93, 96–9, 116, 167–84, 188, 201
syntax 3–8, 12–16, 24–5, 69, 89, 95, 100, 174 n. 12, 175, 178, 186
takeover 58, 60, 63–5, 67, 68
typology 5, 6, 7, 8, 11, 23, 125 n. 4
umlaut 22, 24
underspecification 70, 81–3, 85, 90 n. 11, 91, 169 n. 3, 173 n. 11, 179 n. 22, 180, 193, 194
zero exponence 21, 71, 97, 105, 172 n. 9, 185–204

