
Clavier09

Josef Schmied
English Language & Linguistics
Chemnitz University of Technology
http://www.tu-chemnitz.de/phil/english/Schmied

Using corpora as an innovative tool to compare varieties of English around the world: the International Corpus of English (ICE) story

Motivation and structure

1. The (hi)story of ICE
   • beginnings and concepts (ICE0)
   • ICE problems
2. The present-day status of ICE
   • resources
   • work in progress
3. Case studies: modalities
4. The future of ICE (ICE2)
   • changing communication patterns 1989-2009
   • new practical opportunities
   • new theoretical challenges

Motivation ICE History ICE status Case studies ICE2future Conclusion

1. The (hi)story of ICE

a personal account

1.1 Beginnings and concepts

May 1988, ICAME Birmingham:
• Schmied: “Compiling a Corpus of East African English”
• discussion on Brown/LOB categories?
• more sociolinguistic variables (gender, status, age, 1st language, etc.)

Oct. 1988 proposal:
• Greenbaum, Sidney. “A proposal for an international computerised corpus of English”. World Englishes 7, 315

1.1 ICE0 concepts

Greenbaum (1988: 31):
(1) to sample standard varieties from other countries where English is the first language, for example Canada and Australia
(2) to sample national varieties from countries where English is an official additional language, for instance India and Nigeria; and
(3) to include spoken and manuscript English as well as printed English.

1.1 Discussing the corpus design

Schmied (1990): “Corpus-linguistics and the nativization of English”. World Englishes 9, 255-268

“corpus-compilation paradox”: A “national” corpus should contain culture-specific text(type)s, but we can only identify them through corpus analysis

1.2 ICE problems

corpus-compilation:
1) funding (e.g. ICE-US, ICE-Nigeria)
2) adaptations in corpus compilation: technology and culture
3) copyright for distribution
4) corpus processing: annotation and parsing

corpus-application:
1) manuals for restriction (interpretation)
2) query (WordSmith or AntConc) – statistics

individual solutions

1.2.2 Adaptation

representativeness vs. comparability

McEnery/Wilson 1996, Edinburgh U.P.
http://www.lancs.ac.uk/fss/courses/ling/corpus/Corpus2/2FRA1.HTM

4 characteristics of the modern corpus:
• Sampling and representativeness
• Finite size
• Machine-readable form
• A standard reference

Appendix 6: List of written texts from Tanzania (word count)

PRINTED

Category                                      Files                  Words
Informational: Learned
  Humanities                                  W2A001T – W2A010T      20,172
  Social Sciences                             W2A011T – W2A020T      20,151
  Natural Sciences                            W2A021T – W2A027T      20,114
  Technology/Agriculture/Environmental dev.   W2A031T – W2A040T      20,148
  total                                                              80,585
Informational: Popular
  Humanities                                  W2B001T – W2B010T      20,133
  Social Sciences                             W2B011T – W2B020T      20,223
  Natural Sciences                            W2B021T – W2B24T        6,542
  Technology/Agriculture/Small Industry       W2B031T – W2B040T      20,065
  General                                     W2BGEN1T - W2BGEN8T    13,789
  total                                                              80,752
Informational: Reportage
  Splash                                      W2C001T - W2C0010T     20,018
  Reportage/Features                          W2C011T - W2C020T      20,139
  total                                                              40,157
Instructional
  Administrative/Regulatory                   W2D001T - W2D010T      20,120
Persuasive
  Institutional                               W2E001T – W2E010T      20,078
  Personal Column                             W2E011T – W2E020T      20,125
  total                                                              40,203

from ICE-East Africa handbook

1.2.2 Spoken text categories in ICE corpora

SPOKEN (300)
  Dialogues (180)
    Private (100)
      Face-to-face conversations (90)
      Phonecalls (10)
    Public (80)
      Classroom Lessons (20)
      Broadcast Discussions (20)
      Broadcast Interviews (10)
      Parliamentary Debates (10)
      Legal cross-examinations (10)
      Business Transactions (10)
  Monologues (120)
    Unscripted (70)
      Spontaneous commentaries (20)
      Unscripted Speeches (30)
      Demonstrations (10)
      Legal Presentations (10)
    Scripted (50)
      Broadcast News (20)
      Broadcast Talks (20)
      Non-broadcast Talks (10)
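The category counts above form a simple hierarchy in which every subtotal is the sum of the categories below it. A minimal Python sketch of that check follows; the dictionary layout and the function name count_texts are illustrative choices, not part of any ICE tool.

```python
# Spoken text categories of an ICE corpus with the number of texts per
# category, as listed on the slide. The nesting is an illustrative layout.
SPOKEN = {
    "Dialogues": {
        "Private": {
            "Face-to-face conversations": 90,
            "Phonecalls": 10,
        },
        "Public": {
            "Classroom Lessons": 20,
            "Broadcast Discussions": 20,
            "Broadcast Interviews": 10,
            "Parliamentary Debates": 10,
            "Legal cross-examinations": 10,
            "Business Transactions": 10,
        },
    },
    "Monologues": {
        "Unscripted": {
            "Spontaneous commentaries": 20,
            "Unscripted Speeches": 30,
            "Demonstrations": 10,
            "Legal Presentations": 10,
        },
        "Scripted": {
            "Broadcast News": 20,
            "Broadcast Talks": 20,
            "Non-broadcast Talks": 10,
        },
    },
}


def count_texts(node):
    """Return a leaf count, or the sum of all leaf counts below a category."""
    if isinstance(node, int):
        return node
    return sum(count_texts(child) for child in node.values())


if __name__ == "__main__":
    for name, subtree in SPOKEN.items():
        print(name, count_texts(subtree))
    print("SPOKEN", count_texts(SPOKEN))
```

The same structure could hold the adapted ICE-EA counts, which makes it easy to see where a regional corpus departs from the common design.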

1.2.2 ICE categories and ICE-EA/Ke/Tz

                      ICE     Ke  +  Tz
SPOKEN                300       250
DIALOGUE              180       130      (written as spoken 50)
private               100        30
direct conv.           90        30
distanced conv.        10        --
public                 80       100

WRITTEN
press editorials       10     --     --
institutional          --     10     10
personal columns       --     10     10

1.2.4 Textual Markup

In written texts, features of the original layout are marked, including sentence and paragraph boundaries, headings, deletions, and typographic features.

Spoken texts are transcribed orthographically, and are marked for pauses, overlapping strings, discourse phenomena such as false starts and hesitations, and speaker turns.

The markup manual is available here.

1.2.4 Wordclass Tagging

ICE texts are automatically tagged for wordclass by the ICE Tagger, developed by Sean Wallis at the Survey of English Usage, University College London. This assigns wordclass tags to each lexical item in the corpus. The tagset has been developed especially for ICE, and is largely based on Quirk et al. (1985) A Comprehensive Grammar of the English Language, e.g.

Each             PRON(univ,sing)
of               PREP(ge)
these            PRON(dem,plu)
is               V(cop,pres)
the              ART(def)
responsibility   N(com,sing)
of               PREP(ge)
one              NUM(card,sing)
person           N(com,sing)
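For analysis, tagged tokens of the form shown above (a word followed by a wordclass tag with optional features in brackets) can be split into their parts. The following Python sketch assumes exactly that one-token-per-line layout; the names TaggedToken and parse_tagged_line are hypothetical, and the storage format of the actual ICE files may differ.

```python
import re
from typing import NamedTuple


class TaggedToken(NamedTuple):
    word: str
    wordclass: str
    features: tuple


# A word, whitespace, an upper-case wordclass tag, and optional features
# in round brackets, e.g. "Each PRON(univ,sing)".
TOKEN_RE = re.compile(r"^(\S+)\s+([A-Z]+)(?:\(([^)]*)\))?$")


def parse_tagged_line(line):
    """Split one 'word TAG(feature,...)' line into its three parts."""
    match = TOKEN_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a tagged token: {line!r}")
    word, wordclass, feats = match.groups()
    features = tuple(f.strip() for f in feats.split(",")) if feats else ()
    return TaggedToken(word, wordclass, features)


if __name__ == "__main__":
    sample = [
        "Each PRON(univ,sing)",
        "of PREP(ge)",
        "these PRON(dem,plu)",
        "is V(cop,pres)",
    ]
    for line in sample:
        print(parse_tagged_line(line))
```

Once tokens are parsed this way, counting all items with a given wordclass becomes a one-line filter over the token list.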

1.2.4 Syntactic Parsing

Every sentence in the corpus is analysed at phrase, clause, and sentence level, and the analysis is shown in the form of a parse tree:

[Figure: parse tree]

2. The present-day status of ICE

resources:
1) WWW for ICE and ICE-corpora
2) corpora available
3) publications


2.1 ICE webpage: Corpus design

500 files of 2,000 words each, in specific categories

The texts in the corpus date from 1990 or later. The authors and speakers of the texts are aged 18 or above, were educated through the medium of English, and were either born in the country in whose corpus they are included, or moved there at an early age and received their education through the medium of English in the country concerned.

The corpus contains samples of speech and writing by both males and females, and it includes a wide range of age groups. The proportions, however, are not representative of the proportions in the population as a whole: women are not equally represented in professions such as politics and law, and so do not produce equal amounts of discourse in these fields. Similarly, various age groups are not equally represented among students or academic authors.
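The design criteria above amount to a size target and an inclusion test for each contributor. A minimal Python sketch, assuming a simplified record per author or speaker, is given below; the class Contributor, its field names and the function is_eligible are hypothetical, and "moved there at an early age" is reduced to a yes/no flag.

```python
from dataclasses import dataclass

FILES_PER_CORPUS = 500   # from the corpus design
WORDS_PER_FILE = 2000    # from the corpus design
EARLIEST_YEAR = 1990     # texts date from 1990 or later
MINIMUM_AGE = 18         # authors and speakers are aged 18 or above


@dataclass
class Contributor:
    age: int
    educated_in_english: bool
    born_in_country: bool
    moved_at_early_age: bool = False  # simplification of the prose criterion


def is_eligible(person, text_year):
    """Apply the inclusion criteria to one text and its author or speaker."""
    return (
        text_year >= EARLIEST_YEAR
        and person.age >= MINIMUM_AGE
        and person.educated_in_english
        and (person.born_in_country or person.moved_at_early_age)
    )


if __name__ == "__main__":
    print("target size:", FILES_PER_CORPUS * WORDS_PER_FILE, "words")
    print(is_eligible(Contributor(25, True, True), 1995))
```

Gender and age-group balance are deliberately absent from the test, because the design states that these proportions are not meant to mirror the population.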

2.2 Currently available ICE corpora

1st Language (ENL)
• Great Britain
• Ireland*
• Jamaica*
• New Zealand

2nd Language (ESL)
• East Africa* (Ke/Tz)
• Hong Kong*
• India*
• The Philippines*
• Singapore*

* free as download

2.3 Publications

http://www.tu-chemnitz.de/phil/english/Schmied

3. ICE case studies: modalities

hypotheses for comparing ICE corpora:
• English auxiliaries are very unevenly distributed in actual language usage
• epistemic use is more frequent than deontic
• ENL varieties: “American innovativeness, British conservativism and Australian independence from both” (Collins 2009)
• ESL varieties have a smaller number of modal auxiliaries than ENL varieties: ICE-T has smaller frequencies than ICE-K generally and for epistemic usage specifically

Modal auxiliaries in Tanzania and Kenya

Case study: Modal auxiliaries in ICE-GB, ICE-Phil and ICE-EA (-K/-T)

Modal        Total per corpus                        Per million words
auxiliary    ICE-GB  ICE-Phil/10  ICE-K   ICE-T      ICE-GB  ICE-Phil   ICE-K  ICE-T
can            3574      425       2212    1681        3574   3601.69    1580   1201
could          1635      130       1485    1150        1635   1101.69    1061    821
may            1219      120       1143     782        1219   1016.95     816    559
might           693       45        249     208         693    381.36     178    149
must            687       55        652     468         687    466.10     466    334
shall           222       30        159     287         222    254.24     114    205
should         1117      100       1155    1075        1117    847.46     825    768
will           2841      505       2011    1628        2841   4279.66    1436   1163
would          3037      270       1496    1176        3037   2288.14    1069    840
Total         15025     1680      10562    8455       15025  14237.29    7545   6040
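The per-million figures are raw counts divided by corpus size and multiplied by one million, which makes corpora of different sizes comparable. The slide does not state the sizes it used. The denominators in this Python sketch are therefore assumptions, back-calculated so that the table's figures are reproduced (1,000,000 words for ICE-GB, 118,000 for the ICE-Phil sample, 1,400,000 for both ICE-K and ICE-T); they are not verified corpus sizes.

```python
# Raw counts per corpus, copied from the table.
COUNTS = {
    "can":    {"ICE-GB": 3574, "ICE-Phil/10": 425, "ICE-K": 2212, "ICE-T": 1681},
    "could":  {"ICE-GB": 1635, "ICE-Phil/10": 130, "ICE-K": 1485, "ICE-T": 1150},
    "may":    {"ICE-GB": 1219, "ICE-Phil/10": 120, "ICE-K": 1143, "ICE-T": 782},
    "might":  {"ICE-GB": 693,  "ICE-Phil/10": 45,  "ICE-K": 249,  "ICE-T": 208},
    "must":   {"ICE-GB": 687,  "ICE-Phil/10": 55,  "ICE-K": 652,  "ICE-T": 468},
    "shall":  {"ICE-GB": 222,  "ICE-Phil/10": 30,  "ICE-K": 159,  "ICE-T": 287},
    "should": {"ICE-GB": 1117, "ICE-Phil/10": 100, "ICE-K": 1155, "ICE-T": 1075},
    "will":   {"ICE-GB": 2841, "ICE-Phil/10": 505, "ICE-K": 2011, "ICE-T": 1628},
    "would":  {"ICE-GB": 3037, "ICE-Phil/10": 270, "ICE-K": 1496, "ICE-T": 1176},
}

# ASSUMPTION: sizes inferred from the table itself, not stated on the slide.
ASSUMED_WORDS = {
    "ICE-GB": 1_000_000,
    "ICE-Phil/10": 118_000,
    "ICE-K": 1_400_000,
    "ICE-T": 1_400_000,
}


def per_million(count, corpus_words):
    """Normalise a raw frequency to occurrences per million words."""
    return count / corpus_words * 1_000_000


if __name__ == "__main__":
    for modal, by_corpus in COUNTS.items():
        rates = {
            corpus: round(per_million(n, ASSUMED_WORDS[corpus]), 2)
            for corpus, n in by_corpus.items()
        }
        print(modal, rates)
```

The Total row for ICE-K and ICE-T in the table (7545 and 6040) is the sum of the rounded per-million values, so it differs by one from normalising the raw totals directly.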

Core modal aux per million words in ICE-GB, -Phil, -K, -T

[Figure: chart of can, could, may, might, must, shall, should, will and would per million words; series: GB, ICE-Phil, ICE-K, ICE-T; vertical axis from 0 to 4500]

Results: Deontic and epistemic modals compared

Summary modal auxiliaries

hypotheses confirmed:
• English in Kenya has developed further towards a New English variety than English in Tanzania

exceptions (like shall) can be explained:
• English in Tanzania is the more formal variety (informal texts are expressed more often in Kiswahili)

further analyses:
• simple distinction epistemic vs. deontic is not always detailed enough: e.g. lexeme-specific cases like habitual or historic would
• n-grams / collocations, negation, etc.

4. The future of ICE (ICE2)

1) a broader corpus basis:
• diachronic corpus: cf. Brown family
• larger monitor-corpus: web-based

2) more comparative studies:
• applied issues: acknowledgment of new norms; replacing the native speaker
• theoretical issues: dynamic model: reanalysis; New English sub-categories: deleters vs. preservers

4.0 Changes in international communication since 1989

“global” communication through internet, esp. WWW, chats, blogs
• replaces snail-mail / letters, ??
• or additional categories?

English as a “global” language:
• EIL = English as an International Language
• esp. European Union, China
• but “(secondary) education through the medium of English”?

4.1 Diachronic changes to ICE categories

ICE-EA 1990 – 2010:
• replacing categories: email
• full 1 million words each

integrated in a larger monitor corpus

4.1.2 Web monitor corpus: keywords approach

Business letters: Dear, Yours sincerely/faithfully/truly/etc, invoice, memo, fax, bank, account, financial, enquiries, thank you for, manager, secretary, order, PO Box, date, I enclose, enclosed, I look forward to, c.c.

Popular natural science: popular, everyday, environment, diet, disease, plants, animals, reptiles, medicine, health, birds, fish, whales, conservation, zoos, natural history, green issues, rainforest, everything you need to know about...., Guide to..., made easy, global warming, wildlife, botanical, ozone layer.

Administrative writing: policy, regulations, procedures, guide, benefits, grants, entitlements, Guide to..., University calendar, safety, register/registration, code of conduct, license/licensing, health regulations, FAQ
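Keyword lists like these can also serve as a rough first filter on downloaded pages. The Python sketch below counts how many distinct keywords of each category occur in a text and proposes the best-scoring category. This scoring rule is an assumption made for illustration and is not described in the source; the keyword subsets and the names keyword_hits and best_category are likewise illustrative.

```python
import re

# A subset of the keywords listed for each text category, lower-cased.
KEYWORDS = {
    "business letters": [
        "dear", "yours sincerely", "invoice", "memo", "fax",
        "thank you for", "i enclose", "i look forward to",
    ],
    "popular natural science": [
        "environment", "diet", "disease", "conservation", "rainforest",
        "global warming", "wildlife", "ozone layer",
    ],
    "administrative writing": [
        "policy", "regulations", "procedures", "grants", "entitlements",
        "registration", "code of conduct", "faq",
    ],
}


def keyword_hits(text, keywords):
    """Count how many distinct keywords occur as whole words or phrases."""
    lowered = text.lower()
    return sum(
        1 for kw in keywords
        if re.search(r"\b" + re.escape(kw) + r"\b", lowered)
    )


def best_category(text):
    """Return the category with the most keyword hits, or None if no hits."""
    scores = {cat: keyword_hits(text, kws) for cat, kws in KEYWORDS.items()}
    top = max(scores, key=scores.get)
    return top if scores[top] > 0 else None


if __name__ == "__main__":
    letter = "Dear Sir, thank you for your order. I enclose the invoice."
    print(best_category(letter))
```

A candidate selected this way would still need manual checking before it is assigned to a corpus category, since single keywords such as "guide" or "order" occur in many text types.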

4.1.2 Webcrawler

• Customisable, to exclude unwanted files, e.g. images, sounds, movies, .exe. Customised settings can be saved in an “options” file, [icelite.opt]
• Fast: can download entire websites in a relatively short time (depending on the size of the site)
• Stable: it never crashed, even when the download was aborted.
• Can be run ‘in the background’, and won’t interfere with other processes.
• Can be run overnight, and will safely switch off your PC.
• Inserts time & date accessed in each downloaded file.

courtesy Nelson, Gerry 2009 ICE-light
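Two of the features listed above, excluding unwanted file types and recording the access time in each downloaded file, are easy to illustrate. The Python sketch below is a generic illustration only: it does not reproduce the crawler described on the slide or the format of its options file, and the suffix list, the comment-style header and the function names wanted and stamp are all assumptions.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath
from urllib.parse import urlparse

# ASSUMPTION: an example exclusion list for images, sounds, movies and .exe.
EXCLUDED_SUFFIXES = {
    ".jpg", ".jpeg", ".png", ".gif",   # images
    ".mp3", ".wav",                    # sounds
    ".avi", ".mp4",                    # movies
    ".exe",
}


def wanted(url):
    """Return False for URLs whose file suffix is on the exclusion list."""
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix not in EXCLUDED_SUFFIXES


def stamp(text, url, accessed=None):
    """Prefix downloaded text with its source URL and time of access."""
    if accessed is None:
        accessed = datetime.now(timezone.utc)
    when = accessed.isoformat(timespec="seconds")
    return f"<!-- source: {url} | accessed: {when} -->\n{text}"


if __name__ == "__main__":
    for url in ("http://example.org/news/story.html",
                "http://example.org/img/logo.PNG"):
        print(url, wanted(url))
    print(stamp("<html>...</html>", "http://example.org/news/story.html"))
```

Keeping the access date inside each file matters for a monitor corpus, because web pages change and the date is needed for any later diachronic comparison.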

4.1.2 Workflow for Monitor Corpus

1) Use Google Advanced Search to identify major English-language sites in each domain
2) Use HTTrack to download sites
3) Select texts, and record details in a spreadsheet
4) Targeted search to fill gaps, using keywords

courtesy Nelson, Gerry 2009 ICE-light

4.2.1 Applied issues

• Comparative studies and a comparative database help decide questions of usage and norm
• Quantitative comparisons allow more gradient usage decisions than native speakers

4.2.2 Theoretical issues: dynamic model - new categorisation?

evolutionary stages:
• foundation
• exonormative stabilisation
• nativisation
• endonormative stabilisation
• differentiation

ICE-K > ICE-T; ICE-Sgp > ICE-My

4.2.2 Theoretical issues: deleters vs. preservers?

Mesthrie/Bhatt 2008:90
“One such broad dichotomy involves varieties that favour deletion of elements and those that disfavour it. In this regard the differences between Sgp Eng (especially amongst those with Chinese substrates) and African varieties are striking.”

(91):
Come what may (come).
He made me (to) do it.
As you know (that) I am from the Ciskei.

E Asia vs. Africa?
ICE-Phil/-Sgp vs. ICE-EA/-ZA/-Nig

5. Conclusion

ICE-corpora are a good basis for empirical “national” and comparative corpus work

“East-Africanism” (marked <ea>):
• in grammar (modality),
• lexicon (matatu)
• morphology/idiomaticity (grass roots)
• etc.

ICE-EA/ESL idioms are less fixed/more flexible

               Kenya                     Tanzania                  Total
               all   written  spoken     all   written  spoken
grassroots       4      3        1        16     10        6        20
grass roots     12     11        1         9      3        6        21
Total           16                        25                        41

grass root: 1 1 2

Current issues

• Can results of corpus-linguistic analyses of “real usage” help to decide choices of norm and standards on a national and international basis? YES
• Can “objective” corpus-linguistic resources replace “subjective” native speaker intuition as a neutral international standard? YES
• Can corpus analyses add the cognitive dimension to variety formation? YES

References

Greenbaum, S. (1988). “A proposal for an international computerised corpus of English”. World Englishes 7, 315

Mesthrie, R., R.M. Bhatt (2008). World Englishes: The study of new language varieties. Cambridge: CUP

Schmied, J. (1990). “Corpus-linguistics and the nativization of English”. World Englishes 9, 255-268

Schneider, E. (2006). Postcolonial English. Varieties around the world. New York: Cambridge Press.

ICES - International Corpus of English Studies Vol 1, No 1 (2009). Universitätsverlag der Technischen Universität Chemnitz, https://www.bibliothek.tu-chemnitz.de/ojs/

Using corpora as an innovative tool to compare varieties of English around the world: the ICE story

This contribution presents the story of the International Corpus of English (ICE) from the earliest discussions in 1989 to the current project applications and the big issues of the future.

The ICE teams in the past have always worked independently, so that some corpora were finished early, others were processed with the help of taggers and parsers, others were given up and restarted several times. Although modern computing facilities and processing tools (like AntConc) have made simple data analyses relatively easy for everyone, the most challenging innovations are possible in corpus collection. Nowadays, the WWW can facilitate some data collection and offers new options for multimedia corpora. This will make future corpus compilation and exploitation a real challenge.

The big issues in applying ICE corpora in theory and practice are, for instance:
• Can results of corpus-linguistic analyses of “real usage” help to decide choices of norm and standards on a national and international basis?
• Can corpora justify national varieties of English all around the world?
• Can corpus-linguistic resources replace the native speaker as a neutral international standard?
• Can corpus analyses add the cognitive dimension to variety formation?

This contribution will illustrate that the International Corpus of English can, despite some shortcomings, be used as a very innovative tool in exploring geographic varieties for researchers of every level.