Maija Baltiņa, Everita Andronova

Normunds Grūzītis

Institute of Mathematics and Computer Science, University of Latvia

A brief historical background: how the idea of corpus development came up

A general introduction to the Corpus of Early Written Latvian Texts

The place of Glück’s Bible data in our corpus

Some remarks on the corpus exploration

A brief historical background of the idea of corpus development

1930s Jānis Endzelīns: Thesaurus linguae letticae, in which all Latvian words from both spoken language and written texts should be collected (Endzelīns 1933: 818)

K. Mühlenbachs: Lettisch-deutsches Wörterbuch (1923-32; revised and supplemented by J. Endzelīns and E. Hauzenberga, 1934-1946)

V. Rūķe-Draviņa: urged scholars to take stock of the complete material of the Latvian language and proposed developing a dictionary of the early written texts (Rūķe 1961)

1990s: the idea to start work on a historical dictionary of the Latvian language (Baldunčiks 1994)

Collecting early printed texts

End of the 1980s: Maija Baltiņa started her work on the analysis of the Book of Job.

1992-1994: Glück's Bible was keyed in at the Laboratory of Artificial Intelligence.

1995-96: a database of early written Latvian texts was initiated, and several early printed Latvian sources from the 17th century were prepared in electronic form: Luther's ENCHIRIDION (1615); Langewünschte Lettische Postill by G. Mancelius in 3 parts (1654); Lettische geistliche Lieder und Kollekten.. (1685); Lettische geistliche Lieder und Psalmen.. (1685); extract from "Krigz-Artiklar.." (published in Stockholm, 1683) (1694).

"FrakturaSpecial" fonts (Latvian software company Tilde, 1996)

1997: research on Glück's translation was carried out under the supervision of Maija Baltiņa.

Call for dictionary again

2001: Prof. Trevor Fennell invited and encouraged scholars in Latvia to start work on the Dictionary of Old Latvian.

Pilot project on the development of an electronic dictionary of the 17th century (2002, funded by the Latvian Culture Foundation).

Joint project of the Department of Baltic Languages, Faculty of Philology, UL, and the IMCS, UL; the project was led by Prof. P. Vanags.

The project served as a test bed and contributed towards building the first publicly available Latvian corpus.

Pilot Dictionary

To create a dictionary of the 17th century

From June to December 2002, our team of 7 people from the Department of Baltic Languages, the IMCS, and the Institute of the Latvian Language carried out several tasks:

1) checking previously keyboarded 17th century texts

2) scanning texts

3) adding new sources (Evangelia und Episteln (1615); Psalmen und geistliche Lieder.. (1615); Lettisch Vade mecum. (1631); Die Sprüche Salomonis.. (1637); some legal texts (1684; 1696))

4) launching the Corpus of Early Written Latvian Texts (January 2003)

Corpus of Early Written Latvian Texts

Corpus development:

1) selection of texts ("Gesamtkatalog Die Älteren Drucke in Lettischer Sprache (1525-1855)")

2) text preparation

3) software development (Normunds Grūzītis, AILab): word list, extended context, index search, reverse dictionary, concordancer; processing results are also available for download
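To make the concordancer idea concrete, here is a minimal keyword-in-context sketch. It is not the software developed at AILab: the function name `concordance`, the `width` parameter and the whitespace tokenization are assumptions made for illustration. The sample is the wording of Mt 21:2 from the 1654 Postill as quoted in this presentation, with line numbers and slashes omitted.

```python
def concordance(tokens, keyword, width=2):
    """Return keyword-in-context lines with `width` tokens on each side of every hit."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Hypothetical input: a pre-tokenized fragment of the 1654 text.
tokens = ("und tudeļļ attra§śeeta juhs pee§śetu weenu E§eļamaht "
          "unnd weenu Kumeļu py tahß").split()

for line in concordance(tokens, "weenu"):
    print(line)
```

A real concordancer would also need to map each hit back to its source, page and line, which this sketch leaves out.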

Static indices

Dynamic indices

Context (NT)

Frequency lists

Inverse Dictionaries
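As a rough illustration of what a frequency list and an inverse dictionary contain, the sketch below builds both from a short fragment. The function names, the punctuation set and the tokenization are assumptions for illustration only and do not describe how the corpus tools are actually implemented. The fragment is taken from the 1654 wording of Mt 21:2 quoted in this presentation.

```python
from collections import Counter

PUNCT = "/:;,.()"  # assumed punctuation set; the early texts use "/" as a separator


def tokenize(text):
    """Split on whitespace and strip surrounding punctuation; spelling and case are kept."""
    tokens = []
    for raw in text.split():
        word = raw.strip(PUNCT)
        if word:
            tokens.append(word)
    return tokens


def frequency_list(tokens):
    """Word forms sorted by descending frequency, then by code point order."""
    counts = Counter(tokens)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def inverse_dictionary(tokens):
    """Distinct word forms sorted by their reversed spelling, i.e. grouped by ending."""
    return sorted(set(tokens), key=lambda word: word[::-1])


sample = "weenu E§eļamaht / unnd weenu Kumeļu py tahß /"
tokens = tokenize(sample)

print(frequency_list(tokens))
print(inverse_dictionary(tokens))
```

Sorting by reversed spelling places forms with the same ending next to each other, which is what makes an inverse dictionary useful for studying suffixes and case endings. Plain code point order is used here; a proper collation for old Latvian orthography would need its own rules.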

The aim of the corpus

The corpus aims to promote and facilitate the diachronic study of Latvian and to offer new data to those interested in the development and varieties of the language. It will serve as the basis for a dictionary of the 16th-18th centuries.

Our tasks in future

1) To continue work on corpus building.

2) To add more possibilities for corpus exploitation (detailed statistical analysis, sorting of search results, collocation analysis).
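One simple starting point for the planned collocation analysis is counting adjacent word pairs. The sketch below is only an assumption about how such a feature might begin, not a description of planned software; the function name `bigram_counts` is hypothetical, and the two fragments are the 1631 and 1654 wordings of Mt 21:2 quoted in this presentation.

```python
from collections import Counter


def bigram_counts(texts):
    """Count adjacent word pairs, never pairing words across text boundaries."""
    counts = Counter()
    for tokens in texts:
        counts.update(zip(tokens, tokens[1:]))
    return counts


# Hypothetical input: two pre-tokenized fragments.
texts = [
    "vnd weenu Kumeļu py tahß".split(),   # 1631
    "unnd weenu Kumeļu py tahß".split(),  # 1654
]

counts = bigram_counts(texts)
print(counts[("weenu", "Kumeļu")])
```

Raw pair counts are only a first step. A usable collocation tool would add an association measure and some normalization of spelling variants such as vnd and unnd.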

Place of Glück's Bible Data in our Corpus




Sermon Book

Glück’s Bible


Top Frequencies

Use of Corpus



Translation strategy and different translation phases of the Bible

Data in the history of the Latvian language

Use of Corpus

Let’s have a look at 4 Latvian texts:

Evangelia und Episteln (1615);

Lettisch Vade mecum. (1631) (edited and supplemented by G. Mancelius);

Langewünschte Lettische Postill by G. Mancelius (1654);

New Testament by E. Glück (1685).

Sample of Mt 21:2

2. und sprach zu ihnen: Gehet hin in den Flecken, der vor euch liegt, und bald werdet ihr eine Eselin finden angebunden und ein Füllen bei ihr. Löset sie auf und führet sie zu mir!


Vnde ßatczy vs
24: thems / No eyget exkan tho Jelgouwe /
1: kattra prexan yums gir / Vnde tudelyn yuus attra§-
2: §eth pe§ete wene E§elemate / vnd wene kummelle py täs /
3: atrayßet thös / vnde at weeddet thös py man /



vnd śatziya vs teems: No
6: eita tanni Jällgawa` / kattra juh§śo preek§cha gir / vnd tu-
7: deļļ attra§śeeta juhs pee§śeetu (alij, peerißtu ) weenu E§eļa=
8: maht / vnd weenu Kumeļu py tahß / attrai§śiyu§chi /
9: (attri§śu§chi) attweddeta tohß py man /

Manc1631_LVM 24.lpp.

11: und śatziya us teem : No-
12: eita tanni Jällgawa` / kattra juh§śo
13: preek§cha gir / und tudeļļ attra§śeeta juhs pee§śetu
1: weenu E§eļamaht / unnd weenu Kumeļu py tahß /
2: attrai§śiyu§chi attri§śu§§chi / attweddeta tohs py man /

Manc1654_LP1 1.-2.lpp.


Verse 2

Un śazzija us teem: Eite tanni^ Mee§ta^/ kas juhśo preekścha^; un tuhdaļ atraddiśeet juhs peeśeetu weenu Eh§eļa Mahti/ un weenu Kummeļu pee thas: Atraiśijśchi atweddeet tohs pee mannim:

JT 1685 Mt21:2


2 λεγων αυτοις πορευθητε εις την κωμην την απεναντι υμων και ευθεως ευρησετε ονον δεδεμενην και πωλον μετ αυτης λυσαντες αγαγετε μοι

TR1550 MATT21:2 http://bible.gospelcom.net/bible?passage=MATT+21&language=greek&version=TR1550

2 Saying unto them, Go into the village over against you, and straightway ye shall find an ass tied, and a colt with her: loose them, and bring them unto me.

Matthew 21:2 (King James) http://www.biblegateway.com/passage/?book_id=47&chapter=21&version=9


reduction of suffix in participles

attrai§śiyu§chi / (attri§śu§chi) (1631) vs. attrai§śiyu§chi attri§śu§§chi (1654) vs. atraiśijśchi (1685)

Word formation and morphology

a) zero-ending of Accusative (for ē-stem nouns) vs full case form

E§elemate (1615) vs. E§eļa=maht (1631) vs. E§eļamaht (1654) vs. Eh§eļa Mahti (1685)

b) kept vs. lost old Dat.Pl. form with –ms

vs thems (1615) vs. vs teems (1631) vs. us teem (1654; 1685)

c) pronoun katrs vs. kas

kattra (1615; 1631; 1654) vs. kas (1685)


d) archaic relict in future tense of verbs

attra§§eth (1615) vs. attra§śeeta (1631; 1654) vs. atraddiśeet (1685)

e) prefixed verb vs. verb with no prefix

noeyget (1615) vs. noeita (1631; 1654) vs. eite (1685)

f) compound vs. collocation

E§elemate (1615) vs. E§eļa=maht (1631) vs. E§eļamaht (1654) vs. Eh§eļa Mahti (1685)


preposition government:

py man (1615; 1631; 1654) vs. pee mannim (1685)

in corpus:

no katru M. – no kā Gl.
pī töw M. – pie tewis Gl.
pī man M. – pie manim Gl.
priekšan töw M. – priekš tevim, tavā priekšā Gl.

Glück: krājiet sevim mantas debesīs; darīsim trīs būdas / tevim vienu; lai tā manim palīdz.


Jelgouwe (1615), Jällgawa` (1631; 1654) ‘town’ vs. Mee§ta^ (1685)

pils ‘eine feste Stadt’ and Jelgava ‘eine offene Stadt’ (Ph.L. XXXVIII)

miests ‘den kleines Städtchen, ein Flecken’ related to li. miẽstas (from slav. město)

pe§ete wene E§elemate (1615) vs. pee§śeetu (alij, peerißtu ) (1631) vs. pee§śetu (1654) vs. peeśeetu (1685)


Some influence of the German language in these texts

a) construction of preposition ex(k)an vs. Locative form

exkan tho Jelgouwe (1615) vs. tanni Jällgawa` (1631; 1654) vs. tanni^ Mee§ta^ (1685)

b) pronoun viens ‘one’ serves as an indefinite article (influence of German ein)

yuus attra§§eth pe§ete wene E§elemate / vnd wene kummelle py täs (1615)

attra§śeeta juhs pee§śeetu (alij, peerißtu ) weenu E§eļa=maht / vnd weenu Kumeļu py tahß (1631)

attra§śeeta juhs pee§śetu weenu E§eļamaht / unnd weenu Kumeļu py tahß (1654)

atraddiśeet juhs peeśeetu weenu Eh§eļa Mahti/ un weenu Kummeļu pee thas (1685)


Thank You!

Welcome to www.ailab.lv/SENIE

