Chhany Sak-Humphry 1996 1
CHAPTER 1
INTRODUCTION
1.1 Background of the Language
Modern Khmer, or Cambodian, is the official language of Kampuchea, or
Cambodia. According to Judith Jacob (1960:351; 1965:143), Modern Khmer is
considered to extend from about AD 1800 to the present. Khmer is a member of the
Mon-Khmer subgroup of the Austroasiatic family of languages. Khmer is spoken by
people who live in Cambodia and by sizable communities who live in the Mekong Delta
area of southern Vietnam and in northern Thailand. In the last twenty years, the majority
of Khmer speakers outside the country have been in America, France and Australia.
The Khmer language has a long literary tradition of 1,500 years. Native Khmer words
are monosyllabic or disyllabic. Words of Indic origin tend to be polysyllabic. The
Cambodian language has been subjected to the influence of Sanskrit, Pali, Thai,
Vietnamese, Chinese, French and English, to name a few. Most colloquial
speech relies on native Mon-Khmer words, but any elevation in style, or any discussion of
political, cultural, economic, environmental or technological topics,
brings in many words borrowed from Sanskrit, Pali, Chinese, French and, more recently,
English. The majority of Cambodians are monolingual; however, in the last twenty
years, many Khmer living along the borders with Thailand, Laos and Vietnam have
become bilingual. Since the 1993 Cambodian election, French and English have been the
dominant foreign languages in the city for the educated, but in local Khmer life, Thai,
Chinese and Vietnamese have an advantage over the other two. The Khmer language,
like its country, is in a state of shock from rapid change.
1.2 Previous Analyses
Very limited linguistic research has been done on Khmer phonology, morphology,
semantics, and grammar, especially in the area of syntax.
I will briefly present a general overview of the previous works on Khmer
grammar.
Maspero's (1915) Grammaire de la langue khmère was one of the earliest.
It is based on the traditional European grammar approach which relies on semantics to
establish word classes. In addition, much of the vocabulary and style are no longer in
common use.
Gorgoniev's work (in Russian) is unavailable, thus I am unable to make any
comparative analysis.
From the point of view of generative grammar, the structuralist analyses by
linguists such as Huffman (1967) An Outline of Cambodian Grammar, Noss (1966)
Cambodian Basic Course, Jacob (1968) Introduction to Cambodian, and Ehrman (1972)
Contemporary Cambodian: Grammatical Sketch are inadequate in the following
respects:
(1) They focus mostly on morphology and are not explicit. Thus they cannot be
objectively tested;
(2) Most are pedagogical materials for language teaching and learning rather
than systematic descriptions of the structures of the language; and
(3) They are in some ways too narrow and specific in the sense that they provide
only prose statements about the individual patterns observed in a particular corpus. They
achieve only language-internal generalizations, and since they are not designed to be
consistent with any general linguistic theory, they have no cross-linguistic implications.
In addition these works on syntax are inadequate in the following ways:
(1) There is no comprehensive detailed analysis of the internal structure of
phrases and clauses;
(2) No language-internal or cross-linguistic predictions were made and none
could therefore be tested; and
(3) Most importantly, none of them offers a comprehensive analysis of
Modern Khmer nouns and noun phrases.
1.3 Goals and Objectives of the Study
The scope of this dissertation will be limited to the subcategorization of nouns and
their dependency relationships to their regents and dependents (attributes). The purpose
of this study is fourfold:
(1) To do a complete Lexicase dependency grammar analysis of the grammatical
characteristics of nouns and noun phrases in Modern Khmer;
(2) To test the theory by determining to what extent the grammatical properties
of Khmer can be described and explained within this formal and explicit theory, and to
identify any areas in which the data prove to be incompatible with the claims made by the
theory, thereby possibly necessitating a modification of the theory itself;
(3) To re-evaluate and revise the definition of pronouns and their
characteristics in relation to their regents and dependents; and
(4) To take and justify positions on controversial questions such as the status of
classifiers, number words, and 'adjectives', the distinction between relator nouns and
prepositions, and the structure of the indirect relative clause and indirect possessive
constructions.
The following is the outline of the dissertation:
Chapter 1, apart from specifying the goals and objectives of this study, presents
information on previous work in Khmer grammar, a brief outline of lexicase
dependency grammar, the source of the data and the Khmer orthographic representation.
Chapter 2 gives an overview of Khmer clause-level structures, including the basic
patterns of verbal and verbless sentences, conjoined sentences, and prepositional phrase
constructions in Khmer.
Chapter 3 focuses on the grammatical classification of Khmer nouns and the
classification of NP structures.
Chapter 4 presents an analysis of the anaphoric noun qaa.
Chapter 5 investigates the syntactic distribution of Khmer pronouns and revises
the general lexicase definition of pronouns.
Chapter 6 presents a justification of the claim that classifiers are nouns and not
adjectives, and describes their relationship with their noun regents.
Chapter 7 presents an investigation of the extension noun class, which includes
the relative noun daael and the non-relative noun kaar.
Chapter 8 discusses the syntactic properties of locational relator nouns and their
distributions, and of the non-locational relator noun rbah2.
Chapter 9 examines the justification for treating number words as nouns and
shows their dependency relationships with other constituents.
Chapter 10 examines the identification of non-number nouns, or independent
nouns, which comprise location nouns and ordinary nouns in Khmer.
Chapter 11 describes multiple dependent constructions and the relationship among
possessive, locative, equative and prepositional phrases when occurring as dependents of
a single head noun.
Chapter 12 presents a summary of conclusions and points out some
generalizations, some contributions of this study to the lexicase model, remaining
problems, and suggestions for further study.
1.4 Theoretical Framework: Lexicase Dependency Grammar
This investigation is formulated within the lexicase dependency grammar
framework. Lexicase dependency grammar was developed by Professor Stanley Starosta
in the early 1970s at the University of Hawaii. Lexicase is a highly constrained grammar
of words, with no deep structure, no transformations, and no phrase structure rules. It is
generative, that is, it is explicit and formalized. A lexicase grammar is a set of
generalizations about the internal compositions, external distributions, and lexical
interrelationships of the words in a language (Starosta 1988:2). This theory attempts to
capture cross-linguistic generalizations, and thus makes claims about human language in
general.
A grammar consists of the lexicon which contains words and a set of
generalizations about the lexicon. Each word in the lexicon is marked for the classes of
words which can be dependent on it. A sentence is any sequence of words such that
every word is linked to one of the other words in accordance with features specified on
the two words. The syntactic information about the sequential and hierarchical
characteristics of words in phrases, clauses and sentences is carried by contextual
features marked on the lexical heads of constructions. Thus, the lexical entry of each word
will include contextual features which specify the subordinate attributive words with
which it may occur (Starosta 1988:56).
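The linking requirement described above can be illustrated with a small Python sketch. It is only a rough model under stated assumptions: contextual features are reduced to bare word classes, the lexicon entries and the function name is_linked are hypothetical, and the word forms are those of example 1 in a simplified romanization. It is not a formalization taken from Starosta (1988).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    word_class: str                    # N, V, Adj, Adv, Det, P, Cnjc or Sprt
    licenses: frozenset = frozenset()  # word classes accepted as dependents


# Hypothetical toy lexicon for the words of example 1 (an assumption for
# illustration; real lexicase entries carry much richer feature matrices).
LEXICON = {
    "khɲom": Entry("N"),
    "riiən": Entry("V", frozenset({"N", "Cnjc", "P"})),
    "phiiəsaa khmeer": Entry("N"),
    "nyŋ": Entry("Cnjc", frozenset({"N"})),
    "phiiəsaa liiəw": Entry("N"),
    "nəw2": Entry("P", frozenset({"N"})),
    "hawaii": Entry("N"),
}


def is_linked(words, regent_of):
    """Check that every word but one is linked to a regent that licenses it.

    regent_of maps a dependent's position to its regent's position (1-based).
    """
    positions = range(1, len(words) + 1)
    # Exactly one word, the head of the sentence, may lack a regent.
    if sum(1 for p in positions if p not in regent_of) != 1:
        return False
    for dep, reg in regent_of.items():
        if dep not in positions or reg not in positions:
            return False
        dep_class = LEXICON[words[dep - 1]].word_class
        if dep_class not in LEXICON[words[reg - 1]].licenses:
            return False
    # Following regents upward from any word must terminate (no cycles).
    for start in positions:
        seen, p = set(), start
        while p in regent_of:
            if p in seen:
                return False
            seen.add(p)
            p = regent_of[p]
    return True


WORDS = ["khɲom", "riiən", "phiiəsaa khmeer", "nyŋ",
         "phiiəsaa liiəw", "nəw2", "hawaii"]
LINKS = {1: 2, 4: 2, 3: 4, 5: 4, 6: 2, 7: 6}
print(is_linked(WORDS, LINKS))  # True
```

Dropping the link for hawaii leaves two unlinked words, so the sequence no longer counts as a sentence in this toy model.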
The lexicon contains only actually occurring words. All terminal nodes are words
and every word in a grammar is a member of one and only one of a restricted set of the
following syntactic word classes: noun (N), verb (V), adjective (Adj), adverb (Adv),
determiner (Det), preposition (P), conjunction (Cnjc) and sentence particle (Sprt)
(Starosta 1988:27).
The following section presents a brief definition of the basic lexicase dependency
concepts. A comprehensive discussion of the theory can be found in The Case For
Lexicase: An Outline of Lexicase Grammatical Theory (Starosta 1988).
1.4.1 Syntactic Dependency Relationship
Dependency is a relationship between two words, a regent and a dependent. A
regent is lexically marked for the kinds of features it allows or requires on its dependents.
A dependent is marked for features that are required or allowed by its regent; see stemma
representation of example 1.
1.4.2 Features
Features represent the properties of each lexical item, and generalizations are
expressed about relationships among sets of features (Starosta 1988:52). There are two
basic types of features: contextual (e.g., 1[+Nom], 1[+AGT]) and non-contextual features
(e.g., [+V], [+N], [+prnn], [+lctn]); see the stemma representation of example 1.
Contextual features specify the grammatical relationships between the regents and
their sister dependents in terms of immediate dependency (the relationship between a
head and its sister dependent) and linear precedence (the linear ordering relationship
between the regent and one or more dependents).
1.4.3 Case Forms and Case Relations
Case markers are grammatical devices which realize the case forms and signal the
presence of case relations.
'Case form' refers to lexical features that are assigned to or appear on the matrices
of nouns and prepositions (e.g., [+Nom] nominative, [+Acc] accusative, [+lctn] location).
Case forms are universal in scope but their realization varies from language to language.
Case relations in lexicase are universally limited to five: Patient (PAT), Agent
(AGT), Correspondent (COR), Locus (LOC), and Means (MNS). Case relations are
syntactic-semantic relations obtaining between non-predicate nouns and their regents.
A predicate clause has a predicate word [+prdc] as its head.
1.4.4 Endocentric Constructions vs. Exocentric Constructions
In lexicase, an endocentric construction has only one obligatory word and zero or
more dependents (e.g., riiən is the regent of khɲom, nyŋ and nəw2).¹ An exocentric
construction has more than one obligatory word (a lexical head and one or more
dependent words). There are two kinds of exocentric constructions: prepositional phrases
and coordinate constructions. In example 1, nəw2 hawaii forms an exocentric
prepositional phrase where nəw2 and hawaii are both heads of their own phrases. The
coordinate construction has the conjunction nyŋ as the head and the nouns phiiəsaa
khmeer and phiiəsaa liiəw as its dependents.
1.4.5 Complements and Adjuncts
A complement is a constituent which is required by its regent verb (e.g., khɲom,
phiiəsaa khmeer and phiiəsaa liiəw are complements of the verb riiən) and must have a
fixed position. An adjunct is optional to the construction (e.g., the prepositional phrase
nəw2 hawaii is allowed but not required by the verb riiən) and may have no fixed
position (e.g., nəw2 hawaii khɲom riiən phiiəsaa khmeer nyŋ phiiəsaa liiəw). Nouns
can have complements and adjuncts too.
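The difference between required complements and optional adjuncts can be sketched as a simple check on the case forms of a regent's dependents. The frame below for the verb 'study' (required [+Nom] and [+Acc], optional [+lctn]) follows the description of example 1; the function name frame_problems and the label "+other" are hypothetical, and the fixed-position requirement on complements is deliberately left out of this sketch.

```python
from collections import Counter


def frame_problems(required, optional, found):
    """List the ways a set of dependents fails a regent's frame.

    required: case forms of complements the regent must have
    optional: case forms of adjuncts the regent allows
    found:    case forms of the dependents actually present, in any order
    """
    counts = Counter(found)
    problems = [f"missing complement {form}"
                for form in required if counts[form] == 0]
    problems += [f"unlicensed dependent {form}"
                 for form in counts
                 if form not in required and form not in optional]
    return problems


# Frame assumed for the transitive verb 'study' in example 1.
REQUIRED = ("+Nom", "+Acc")
OPTIONAL = ("+lctn",)

print(frame_problems(REQUIRED, OPTIONAL, ["+Nom", "+Acc", "+lctn"]))  # []
print(frame_problems(REQUIRED, OPTIONAL, ["+Nom", "+lctn"]))
```

Because the adjunct is optional and the check ignores order, the frame is satisfied whether the locational phrase is present, absent, or fronted.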
¹ In this dissertation there are two homophonous forms, nəw1 and nəw2. The word nəw2
functions syntactically as a preposition and nəw1 as a verb. For simplicity, however,
nəw1 is also referred to as nəw.
1.4.6 Lexicase Dependency Stemma Representation
1.  khɲom  riiən  phiiəsaa khmeer  nyŋ  phiiəsaa liiəw  nəw2    hawaii
    I      study  Khmer language   and  Lao language    locate  Hawaii
    'I study Khmer and Lao in Hawaii.'

In the stemma, riiən is the root, with khɲom, nyŋ and nəw2 as its dependents;
phiiəsaa khmeer and phiiəsaa liiəw depend on nyŋ, and hawaii depends on nəw2.
The feature matrices on the nodes are:

Word              Index   Features
khɲom             1ndex   +N, +prnn
riiən             2ndex   +V, +trns, 1([+N]), 1[+Nom], 1[+AGT], 4([+N]), 4[+Acc], 4[+PAT],
                          6([+P]), 6([+lctn]), 6([+LOC]), 2>1[+Nom], 2<4[+Acc], 2<6[+lctn]
phiiəsaa khmeer   3ndex   +N
nyŋ               4ndex   +Cnjc, 3[+N], 5[+N], 4>3[+N], 4<5[+N]
phiiəsaa liiəw    5ndex   +N
nəw2              6ndex   +P, +lctn, 7[+N], 6<7[+N]
hawaii            7ndex   +N, +prpr

1.4.7 Interpretation of the Diagram
The transitive verb riiən [+V, +trns] is the main verb of the sentence (the
distribution of this word is marked by its indexing), and functions as the head of the
construction. Its contextual features 1([+N]), 1[+Nom], 1[+AGT], 4([+N]), 4[+Acc],
4[+PAT], 6([+P]), 6([+lctn]), and 6([+LOC]) imply that the transitive verb riiən requires
a Nominative Agent (Nom-AGT), an Accusative Patient (Acc-PAT) and an optional or
adjunct locational prepositional phrase. The features 2>1[+Nom], 2<4[+Acc] and 2<6[+lctn] imply
that the verb riiən has the [+Nom] noun khɲom placed before it, the [+Acc] nouns
following it, and the locational preposition nəw2 at the sixth index. The conjunction nyŋ
[4ndex] is the head of the exocentric coordinate construction which has the nouns phiiəsaa
khmeer and phiiəsaa liiəw as its dependents. The word nyŋ is interpreted as bearing the
category features of its immediate dependents. The details of this convention have not
yet been formalized in lexicase dependency grammar. The exocentric prepositional
phrase nəw2 hawaii has the preposition nəw2 as its head and the noun hawaii as its
dependent. The preposition nəw2 [6ndex] bears the location feature [+lctn] and has the
contextual feature [7[+N]] marking hawaii as its obligatory dependent. The prepositional
phrase as a whole carries the Locus (LOC) case relation and the location [+lctn] case form.
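The indexed precedence features discussed above can be read mechanically. The following Python sketch parses strings such as 2>1[+Nom] and restates them in words. It assumes, following the interpretation given in this section, that the first number is the regent's index, the second the dependent's index, and that ">" means the dependent precedes the regent while "<" means it follows. The function names and the simplified romanization of the word forms are hypothetical.

```python
import re

# regent index, direction, dependent index, feature in square brackets
PATTERN = re.compile(r"(\d+)\s*([<>])\s*(\d+)\s*\[([+-]\w+)\]")

WORDS = ["khɲom", "riiən", "phiiəsaa khmeer", "nyŋ",
         "phiiəsaa liiəw", "nəw2", "hawaii"]


def parse_precedence(text):
    """Split e.g. '2>1[+Nom]' into (2, '>', 1, '+Nom')."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a precedence feature: {text!r}")
    regent, op, dependent, feature = match.groups()
    return int(regent), op, int(dependent), feature


def describe(text, words):
    """Return (consistent, gloss) for one precedence feature.

    consistent is True when the two indices really stand in the order
    that the '<' or '>' sign claims.
    """
    regent, op, dependent, feature = parse_precedence(text)
    consistent = regent > dependent if op == ">" else regent < dependent
    side = "precedes" if op == ">" else "follows"
    gloss = f"[{feature}] {words[dependent - 1]} {side} {words[regent - 1]}"
    return consistent, gloss


for item in ["2>1[+Nom]", "2<4[+Acc]", "2<6[+lctn]", "4>3[+N]", "6<7[+N]"]:
    print(item, describe(item, WORDS))
```

Run on the features of example 1, every statement comes out consistent with the word order of the sentence.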
1.5 Data and Orthography
1.5.1 Data
The data used in this investigation, Khmer sentences, have been constructed by
the author, a native speaker of Khmer. These sentences were cross-checked with other
Khmer speakers in order to identify any idiosyncratic properties that do not generalize to
the language as a whole. They are based on my intuition, and reflect colloquial speech
rather than formal or descriptive language. To accommodate a wide audience, the
examples are written in phonemic transcription (Jenner and Pou 1980-81). Each word is
glossed in English and then a free translation is given for the sentence as a whole.
1.5.2 Orthographic Representation
Modern Khmer uses a script which consists of 33 basic consonant symbols and 12
independent vowel symbols. In addition there are 16 vowel symbols and 31 (subscript)
consonant symbols which can be used only in combination with the various basic
symbols. Khmer has 10 diacritical marks which are used to modify the sounds of the
independent and dependent symbols.
Each Khmer consonant symbol carries an inherent vowel sound. The consonants
are either ɑɑ-series, qakkhoosak 'voiceless', or ɔɔ-series, khoosak 'voiced', and form two series
of consonants called the high register and the low register. The 16 vowel symbols may
have different pronunciations depending on whether they are used with ɑɑ-series or ɔɔ-series
consonants. The 'voiceless' consonants are pronounced with the inherent ɑɑ vowel and
the 'voiced' consonants are pronounced with the inherent ɔɔ vowel.
The following table gives the Khmer script forms in their alphabetical order
together with their phonemic transcriptions.
Table 1.1 Khmer Consonants, Monophthongs and Diphthongs
DIPHTHONGS
Complex
Long
Broken
Long and High
Short and mid
aac, aa~, aao
ii~, li~, yY~, yY~, uu~, ilU~
e~, o~
CONSONANT PHONEMES
                        bilabial  dental  palatal  velar  glottal
Voiceless stops         p         t       c        k      q
Voiced implosive stops  b         d
Nasal resonants         m         n       ɲ        ŋ
Voiced liquids                    r, l
Voiceless spirants                s                       h
Semivowels              w                 j
MONOPHTHONGS
Front Nonfront Back RoundedUnrounded Unrounded
long short long short long short
High 11 1 yY yY y y uu u
LowerhighHighermid ee eMeanmid ee e 60 00 6 0
Lower mid cc :):) :)
Low aa a aa a
Table 1.2 Phonemic Transcription of Khmer Consonants
TRANSCRIPTION OF MODERN KHMER
Consonant Symbols
Khmer   Phonemic                       Khmer   Phonemic
        Transcription                          Transcription
ក       k-, -k                         ប       b- [ɓ] + V; p + C; -p
ខ       kh-, -kh                       ផ       ph-, -p
គ       k-, -k                         ព       p-, -p
ឃ       kh-, -kh                       ភ       ph-, -p
ង       ŋ                              ម       m
ច       c-, -c                         យ       j-, -j
ឆ       ch-, -c                        រ       r-, -r
ជ       c-, -c                         ល       l-, -l
ឈ       ch-, -c                        វ       w-, -w
ញ       ɲ                              ស       s-, -h
ដ       d- [ɗ], -t                     ហ       h-, -h
ឋ       th-, -t                        ឡ       l-
ឌ       t-, -t                         អ       q-, -q
ឍ       th-, -t
ណ       n
ត       d-/t- + V; t- + C; -t
ថ       th-, -t
ទ       t-, -t
ធ       th-, -t
ន       n
Table 1.3 Phonemic Transcription of Khmer Vowels
Columns: Khmer vowel symbol (syllabic and conjunct forms); phonemic transcription in the
high register; phonemic transcription in the low register.
[Table body not recoverable from the scanned source.]