Using a parallel corpus in translation
practice and research
Ana Frankenberg-Garcia
The study of human translation
Traditionally not a hard science
Difficult to be systematic
But with the technology of corpus linguistics, things can change …
Advantages of using corpora to study human translation
An enormous amount of translated texts
Systematic analyses
Quantifiable results
A bi-directional parallel corpus of Portuguese and English
COMPARA
Project leaders: Ana Frankenberg-Garcia & Diana Santos
Research assistants: Rosário Silva & Susana Inácio
Initial support (1999-2000): FCT (Portugal), ISLA (Lisboa), Oxford University (Language Centre)
Present funding (2001-2006): Linguateca: FCT/POSI (POSI/PLP/43931/2001)
COMPARA
Source texts -> Translations
Original English -> Translated Portuguese
Original Portuguese -> Translated English
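One way to picture a bi-directional parallel corpus is as a list of aligned source/translation units, each tagged with the language of its source text. The sketch below is only an illustration under that assumption: the class `AlignedUnit`, the function `search`, and the three sample sentences are all invented here and are not taken from COMPARA or its query interface.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedUnit:
    """One aligned pair: a source segment and its published translation."""
    source: str
    translation: str
    source_lang: str  # "EN" or "PT"; the translation is in the other language


# Invented sample units, for illustration only (not COMPARA data).
UNITS = [
    AlignedUnit("He nodded.", "Ele acenou com a cabeça.", "EN"),
    AlignedUnit("Ela rezou em silêncio.", "She prayed in silence.", "PT"),
    AlignedUnit("They ended up at home.", "Acabaram em casa.", "EN"),
]


def search(units, pattern, lang, side="any"):
    """Return the units whose text in `lang` matches the regex `pattern`.

    side = "source" searches only original text, "translation" only
    translated text, and "any" searches both.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for unit in units:
        target_lang = "PT" if unit.source_lang == "EN" else "EN"
        candidates = []
        if side in ("source", "any") and unit.source_lang == lang:
            candidates.append(unit.source)
        if side in ("translation", "any") and target_lang == lang:
            candidates.append(unit.translation)
        if any(regex.search(text) for text in candidates):
            hits.append(unit)
    return hits
```

Keeping the direction on each unit is what lets the same query be restricted to original English, translated English, original Portuguese or translated Portuguese.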
COMPARA 8.0 varieties
PORTUGUESE: Portugal, Brazil, Angola, Mozambique
ENGLISH: UK, US, South Africa
Unbalanced distribution!
COMPARA 8.0 authors
Portuguese writers
Camilo Castelo Branco
Eça de Queirós
José Cardoso Pires
José Saramago
Jorge de Sena
Lídia Jorge
Mário de Carvalho
Sá Carneiro
COMPARA 8.0 authors
Brazilian writers
Aluísio Azevedo
Autran Dourado
Chico Buarque
Jô Soares
José de Alencar
Machado de Assis
Manuel Antônio de Almeida
Marcos Rey
Patrícia Melo
Paulo Coelho
Rubem Fonseca
COMPARA 8.0 authors
British writers
David Lodge
Ian McEwan
Julian Barnes
Joseph Conrad
Joanna Trollope
Kazuo Ishiguro
Lewis Carroll
Mary Shelley
Oscar Wilde
COMPARA 8.0 authors
American writers
Henry James
Edgar Allan Poe
Richard Zimler
South African writers
Nadine Gordimer
Can any text be included in the corpus?
Only published source texts and translations
Only English translated directly from Portuguese, and Portuguese translated directly from English
Only human translations!
COMPARA 8.0 size
1,536,269 words in English
1,423,937 words in Portuguese
Largest edited parallel corpus containing Portuguese
COMPARA users and uses
Language learners - bilingual dictionary with examples
Language teachers - exercises and tests
Translators - language equivalents
Translation lecturers - exercises & problems
Translation theorists - test translation hypotheses
Lexicographers - bilingual dictionaries
Computational linguists - machine translation
Latest statistics: + 6000 queries per month
Studies using COMPARA
1. Observing source texts and translations
2. Contrasting Portuguese and English
3. Comparing translated and untranslated language
4. Examining the characteristics of translated texts
1. Observing source texts & translations
Improving bilingual dictionaries and machine-translation programs
Frankenberg-Garcia (2002): nod
Ribeiro & Dias (2005): grande
Specia et al. (2005): word-sense disambiguation
2. Contrasting English and Portuguese
Contrasting original fiction in English and Portuguese
Frankenberg-Garcia (2005)
[Charts: loan words in PT and EN; loan languages in PT and EN]
3. Comparing translated and untranslated language
                  translations *   source texts *   ratio
diferente(s)      30.7             15.4             2 x
simplesmente      15.6             5.1              3 x
end.* up          13.5             2.8              4 x
lemma "rezar"     5.6              12.4             2 x
* frequency per 100,000 words in COMPARA 7.0.4
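The figures above are frequencies normalised per 100,000 words, which is what makes translated and untranslated text of different sizes comparable. As a rough sketch of that arithmetic, the snippet below counts regex matches in a toy string, normalises a count, and recomputes the ratios from the frequencies reported on the slide. The helper names, the toy sentence and the Python pattern are assumptions made for illustration; the pattern only approximates the corpus query `end.* up` and is not COMPARA's own query syntax.

```python
import re


def per_100k(count, corpus_words):
    """Normalise a raw count to a frequency per 100,000 words."""
    return count * 100_000 / corpus_words


def count_matches(pattern, text):
    """Count non-overlapping, case-insensitive regex matches in text."""
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


# Toy text, invented for illustration (not corpus data).
sample = "We ended up late. It ends up fine. The end of the road goes up."
# "end" plus any word characters, then "up" as the next word.
hits = count_matches(r"\bend\w*\s+up\b", sample)

# Frequencies per 100,000 words as reported on the slide:
# (translations, source texts).
reported = {
    "diferente(s)": (30.7, 15.4),
    "simplesmente": (15.6, 5.1),
    "end.* up": (13.5, 2.8),
    "rezar": (5.6, 12.4),
}

# Translations-to-source-texts ratio; below 1 means the item is
# less frequent in translations than in source texts.
ratios = {item: tt / st for item, (tt, st) in reported.items()}

for item, ratio in ratios.items():
    print(f"{item}: {ratio:.1f}")
```

Recomputed this way, "rezar" is the one item that is rarer in translations than in source texts, so its ratio falls below 1.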
4. Examining the characteristics of translated texts
Are translations longer than source texts?
Frankenberg-Garcia (2004)
Explicitation Hypothesis
Source texts: 8 Pt extracts of 1500 words each + 8 En extracts of 1500 words each
Translations: how many words?
8 PT authors, 8 EN authors
8 PT translators, 8 EN translators
[Chart: length of source texts (ST) vs translations (TT)]
TT vs ST: + 5%
Matched t-test: 95% probability that TT are longer than ST
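A matched (paired) t-test compares each source text with its own translation, so it works on the per-pair differences in length. The sketch below shows the calculation with the standard library only. The translation word counts are invented placeholders, not the study's data; only the design (16 source extracts of 1500 words each) follows the slides. The critical value 2.131 is the two-tailed 5% threshold for 15 degrees of freedom.

```python
import math
import statistics


def paired_t(source_lengths, translation_lengths):
    """Return (mean difference, t statistic, degrees of freedom)."""
    diffs = [tt - st for st, tt in zip(source_lengths, translation_lengths)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Sample standard deviation of the differences (n - 1 denominator).
    sd = statistics.stdev(diffs)
    t = mean_diff / (sd / math.sqrt(n))
    return mean_diff, t, n - 1


# 16 source extracts of 1500 words each, as in the study design.
st_lengths = [1500] * 16
# Hypothetical translation lengths, invented for illustration only.
tt_lengths = [1580, 1545, 1610, 1490, 1575, 1620, 1530, 1565,
              1590, 1505, 1640, 1550, 1570, 1485, 1600, 1560]

mean_diff, t_stat, df = paired_t(st_lengths, tt_lengths)

# Two-tailed critical value of Student's t at the 5% level, df = 15.
T_CRIT_95_DF15 = 2.131
significant = abs(t_stat) > T_CRIT_95_DF15
```

Pairing matters here because texts differ a lot from one another; testing the differences removes that between-text variation from the comparison.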
To conclude...
Studies such as these were unthinkable before corpora
Many other studies are possible!
COMPARA is free and available online
Contact us: [email protected] [email protected]