Post on 21-Apr-2018
:
Abstract
This paper is the first public presentation of the research programme Basic Corpus of Greek Texts, a collaboration between the University of Athens and the University of Cyprus that aims to build a new, extensive and representative corpus of Greek. In particular, the Corpus of Greek Texts (CGT) is envisaged as collecting a substantial amount of data (30 million words) in a short time span (1-2 years), to serve as a basis for linguistic research and a resource for teaching applications. The scope and representativeness of the genres included, together with free access to the corpus, should make CGT one of the most essential tools for the study of Greek. The paper presents the research area, the aims and needs of the programme, the identity and structure of the CGT, and the methodological issues, linguistic implications and applications related to the compilation of the corpus.
-
, ( )
1.
« »,
.
,
.1
( ), ( )
.
, ,
,
.
(computer corpus
linguistics).
,
, , ,
. . ( 1999: 168,
Leech 1992, Sinclair 1991). «
, , »
( 1999: 170, . Aarts 1991)
( . .: 219). ,
,
(Leech 1992: 106).
Sinclair (1996),
,
,
.
, ,
,
.
,
( . ).
Kennedy, «
, » (1998:
291). ,
.
, , , , ,
, , ,
( 1999: 170).
.
2.
,
, ,
,
.
:
-
/
- ,
- ,
- ( . . , . .)
-
.
, (30
) (1-2 )
.
,
.
.
,
( . Goutsos, King & Hatzidaki 1994, ). ,
90,
( . .) 10 .
,
( . . ),
, , ,
.
, ,
. ,
, :
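English:
BNC Corpus: 100 million words (10 million spoken)
Bank of English Corpus: 329 million words (60 million)
Cancode corpus: 5 million words
French:
Ottawa-Hull: 3.5 million words
ELRA Parole Corpus: 20 million words
TLF: 150 million words
German:
Mannheim Corpus: 8 million words
Muenster Textbank: 94 million words
Italian:
Pisa Corpus: 10 million words
Spanish:
Corpus Oral de Referencia del Español: 1.1 million words
Mark Davies Modern Newspapers: 35 million words
Portuguese:
Mark Davies Modern Newspapers: 26 million words
Dutch:
INL 1995: 27 million words
INL 1996: 38 million words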
: 10 . .
( . Goutsos, Hatzidaki & King 1994)
( ) ,
, ,
.
.
( . , . Renouf 1987 ,
). (30 )
, . ,
, Cobuild ,
, 20 (Sinclair
1987).
. ,
,
(Georgakopoulou & Goutsos 1998).
3.
,
,
. ,
.
1990
. , .
:
: 30
:
3 . (10 %)
:
0,5 .
:
1,5 .
, :
1 .
:
27 . (90 %)
: :
5 .
: :
5 .
:
5 .
: :
5 .
: :
5 .
:
2 .
3
(1 2
).
,
.
( , , , . .)
.
. ,
-
.
4.
,
« » , ,
Sinclair (1996), .2
« »
. Barnbrook (1996:
24),
.
, ,
,
BNC ( . )
,
, , , Bank of English
(monitor corpus),
( . Barnbrook 1996: 25).
(Biber, Conrad & Reppen 1998: 248).
(
, , , . .),
( 1999: 56)
.
Kučera (2002)
,
, , , « »
.
(30 ),
.
,
,
( . . , Cobuild ,
).
, ,
, . ,
( ) .
( .
) ( )
.
,
,
,
.3
, ,
,
.
.
,
, , ,
.4
,
,
. (
,
).
,
,
- .
,
( . )
,
,
. ,
,
. ,
.
5.
:
/
( )
( )
,
.
, ,
. ,
, , ,
. , ,
( , , , . .).5
, , ( . .
, , . .) ASCII. ,
, , , . .,
. ,
. ,
, , ,
, . ., .
:
: -
: , , , , , , , ,
: -
: , , - , , , , , , ,
,
- : 01-99
: -
-
,
. ,
- · ,
-
. , ,
( -
)
. - ,
.
,
.
. (2003),
,
,
. (2004)
, ,
.
. ,
. ,
.
12 . ,
:
www.ucy.ac.cy/ sek.
6.
,
. ,
:
) :
,
, ,
, -
( . Chafe, Du Bois & Thompson 1991: 64-66).
,
, Goutsos, Hatzidaki & King (1994), , King
(1995), Georgakopoulou & Goutsos (1998) and Goutsos (1999).
) :
( .
Wichmann, Fligelstone, McEnery & Knowles 1997, , ).
, ,
, CD .
) :
:
1) ,
,
2) ,
(1998),
3) ,
, ( . . Perseus Project
).
) :
.
,
(tagging)
(annotation) .
,
.
, , .
-
50.000
200.000
50.000
200.000
1 1
50.000
1 1
100.000
1 1
50.000
1 2-3
50.000
1
150.000
1
100.000
250.000
1.250.000
1 1
200.000
1 2-3
200.000
1 1
90.000
10.000
2.052.000
1.539.000
564.300
410.400
359.100
102.600
51.300
51.300
270.000
/
1.166.400
1.166.400
.
583.200
/
1.270.080
1.270.080
.
635.400
64.800
1.382.400
/
1.382.400
691.200
259.200
/
259.200
129.600
86.400
/
86.400
43.200
777.600
/
648.000
-
388.800
- -
259.200
129.600
129.600
259.200
259.200
/
216.000
-
129.600
- -
86.400
43.200
43.200
86.400
/
777.600
777.600
-
518.400
- -
518.400
518.400
518.400
518.400
518.400
259.200
259.200
-
756.000
-
756.000
32.400
.
81.000
32.400
.
81.000
32.400
64.800
108.000
216.000
1 « » :
:
( )
( )
:
( , )
(Department of Byzantine and Modern Greek Studies, King's College London)
Philip King (School of English
EISU, Birmingham)
-
-
- ( )
:
« »,
:
,
( )
:
,
, ,
, .
2 .
3 , (10 %), , ,
, (Goutsos, King & Hatzidaki 1994).
4 .
5 , ,
.
Aarts, Jan. 1991. "Intuition-based and observation-based grammars". English Corpus Linguistics, ed. Karin Aijmer & Bengt Altenberg, 44-62. London: Longman.
Barnbrook, Geoff. 1996. Language and Computers. Edinburgh: Edinburgh University Press.
Biber, Douglas, Conrad, Susan & Reppen, Randi. 1998. Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.
Chafe, Wallace L., Du Bois, John W. & Thompson, Sandra A. 1991. "Towards a new corpus of spoken American English". English Corpus Linguistics, ed. Karin Aijmer & Bengt Altenberg, 64-82. London: Longman.
Georgakopoulou, Alexandra & Goutsos, Dionysis. 1998. "Conjunctions versus discourse markers in Greek: The interaction of frequency, positions and functions in context". Linguistics 36 (5). 887-917.
, . 1999. . :
.
Goutsos, Dionysis. 1999. "Translation in bilingual lexicography. Editing a new English-Greek Dictionary". Babel 45 (2). 107-126.
Goutsos, Dionysis, Hatzidaki, Rania & King, Philip. 1994. "A corpus-based approach to Modern Greek language research and teaching". Themes in Greek Linguistics: Papers from the First International Conference on Greek Linguistics. Reading, September 1993, ed. Irene Philippaki-Warburton, Katerina Nicolaidis & Maria Sifianou, 507-513. Amsterdam/Philadelphia: John Benjamins.
Goutsos, Dionysis, King, Philip & Hatzidaki, Rania. 1994. "Towards a Corpus of Spoken Modern Greek". Literary and Linguistic Computing 9 (3). 215-223.
, , King, Philip
, . 1995. corpus
. .
15
, 11-14 1994, 843-854. :
.
Kennedy, Graeme. 1998. An Introduction to Corpus Linguistics. London: Longman.
Kučera, Karel. 2002. "The Czech National Corpus: Principles, design and results". Literary and Linguistic Computing 17 (2). 245-257.
Leech, Geoffrey. 1992. "Corpora and theories of linguistic performance". Directions in Corpus Linguistics, ed. Jan Svartvik, 105-122. Berlin/New York: Mouton de Gruyter.
, . 1998. . :
.
, . 1999. . . :
Gutenberg.
Renouf, Antoinette. 1987. "Corpus development". Looking Up, ed. John Sinclair, 1-40. London and Glasgow: Collins ELT.
Sinclair, John (ed.). 1987. Looking Up. London and Glasgow: Collins ELT.
Sinclair, John. 1991. Corpus, Concordance, Collocation. Oxford: Oxford University Press.
Sinclair, John. 1996. "Preliminary recommendations on corpus typology". EAGLES (http://www.ilc.pi.cnr.it/EAGLES/corpustyp/corpustyp.html).
Wichmann, Anne, Fligelstone, Steven, McEnery, Tony & Knowles, Gerry (eds.). 1997. Teaching and Language Corpora. London: Longman.
, . ( ).
.
(19-22 2001).