Francesca Palma
CLILLAC-ARP research team, Université Paris Diderot – Paris 7
Developing a writing aid tool for scientific communication: a study based on an English corpus for biologists
English: the international language of science
700 million speakers, fewer than 50% of them native speakers (Brumfit 2001)
Science Citation Index:
English (95%), French, German and Russian (4.9%), other languages (0.7%)
A lingua franca (Gnutzmann 2000) or global language (Crystal 1997), but also a "Tyrannosaurus rex" (Swales 1997)
- Loss of specialised genres
- Some authors enjoy a privileged status
"Publish in English or perish" (Volanschi 2008)
Questionnaire
Life Sciences department (UFR Sciences de la Vie), Univ. Paris Diderot (Volanschi 2007)
56 responses (out of about 300 questionnaires sent)
English is genuinely a working language:
96% of documents are written in English
95% of researchers write directly in English
53% of researchers "think" in English
Questionnaire / 2
Difficulties:
grammar (62.5%), idiomatic expressions (69.64%), influence of French (64.28%), specialised terms (14.28%)
Solutions:
collocations in General Scientific Language (Pecman 2004) (87%) and terminological collocations (83%), predicative structures (73%) and concordances (55%)
The ESIDIS-ARTES project (Kübler & Pecman)
Étude des Spécificités et Invariants des DIscours Scientifiques – Aide à la Rédaction de TExtes Scientifiques
(Study of the specific and invariant features of scientific discourse; help with writing scientific texts)
Tools and methods for studying collocations in different scientific fields (Earth sciences, medicine, chemistry, biology, computer science ...)
A combinatory dictionary in the form of a bilingual database of interdisciplinary scientific English phraseology
A bilingual reference corpus for different scientific genres (research article, oral presentation, dissertation, popular science book ...)
Data entry and online query interface
Target audiences:
translators, information retrieval specialists, technical writers, scientists, young researchers, students, linguists, epistemologists
Intended practical applications:
teaching applications, lexicographic applications, writing or translation aids, targeted information retrieval for scientific and technical information
Phraseology in scientific English
Phraseology in specialised language (Volanschi 2009)*
8 years of work by students of the Master PRO ILTS (terminology databases in different fields)
- 20,000 terms and 22,000 collocations
- Problem of exploiting the resources over a long period
* http://ytat2.ijm.univ-paris-diderot.fr/LangYeast/
Phraseology in General Scientific Language (Pecman 2004)
Restricted collocations:
working / reasonable / attractive hypothesis
to support / test / suggest a hypothesis
Formulas grouped by rhetorical function
e.g. "anonymous reference":
il est courant de penser que...
il est communément / généralement / unanimement admis / reconnu que...
on admet que, on a longtemps pensé / cru que, on a souvent dit que..., etc.
it is commonly / generally / universally / widely accepted that...
it is widely / well known that...
it [is / has been] (often) asserted / noted / recognised / believed / claimed / argued that...
First steps: PLoS *
20 million words
Open Access and XML
Standard IMRaD structure => 4 sub-corpora
6 journals:
Genetics
Pathogens
Biology
Computational Biology
Medicine
Neglected Tropical Diseases
(3 new journals: PLoS ONE, PLoS Currents, PLoS Clinical Trials)
* Public Library of Science: www.plos.org
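Splitting XML articles into four IMRaD sub-corpora can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used for the project: the element names `article`, `body`, `sec`, `title` and `p`, the keyword table `IMRAD_KEYWORDS`, the function `split_imrad` and the sample article are all hypothetical, and real section titles vary more than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword -> sub-corpus mapping; real section titles vary.
IMRAD_KEYWORDS = {
    "introduction": "Introduction",
    "method": "Methods",
    "result": "Results",
    "discussion": "Discussion",
}


def split_imrad(xml_text):
    """Return {sub-corpus: text} for the top-level sections of one article."""
    root = ET.fromstring(xml_text)
    parts = {}
    # Assumed layout: <article><body><sec><title>...</title><p>...</p></sec>
    for sec in root.findall("./body/sec"):
        title = (sec.findtext("title") or "").lower()
        for keyword, label in IMRAD_KEYWORDS.items():
            if keyword in title:
                # itertext() flattens inline markup such as <italic>.
                paragraphs = ["".join(p.itertext()) for p in sec.iter("p")]
                parts.setdefault(label, []).extend(paragraphs)
                break
    return {label: " ".join(texts) for label, texts in parts.items()}


# Hypothetical miniature article used only to exercise the function.
sample = """<article><body>
<sec><title>Introduction</title><p>Parasites are diverse.</p></sec>
<sec><title>Materials and Methods</title><p>Mice were infected.</p></sec>
<sec><title>Results</title><p>Worms developed <italic>faster</italic>.</p></sec>
<sec><title>Discussion</title><p>These data support the hypothesis.</p></sec>
</body></article>"""

for label, text in split_imrad(sample).items():
    print(label, "->", text)
```

A section whose title matches several keywords (for example a combined "Results and Discussion") goes to the first matching sub-corpus in this sketch; a real pipeline would need an explicit rule for such cases.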
The PLoS corpus
Journal             Words (millions)   Impact factor   At least one author from USA, UK, etc.   One author from FR, BE, etc.
PLOS BIOLOGY        5.5                13.5            91.34%                                   15.65%
PLOS PATHOGENS      3.8                9.3             98.84%                                   21.35%
PLOS GENETICS       4.95               8.7             97.09%                                   16.59%
PLOS MEDICINE       2.2                12.6            76.52%                                   16.36%
PLOS COMPUT. BIO.   3.5                6.2             94.42%                                   15.82%
PLOS NTD            2.2                4.2             92.56%                                   32.51%
Affiliation statistics in PLoS
USA (54.05%)
UK (10.22%)
France (5.51%)
Germany (4.27%)
Canada (3.4%)
NL (2.29%)
Japan (2.02%)
Switzerland (1.85%)
Australia (1.83%)
OTHER (13.21%)
The research article introduction: the CARS model (Swales 1990)
Move 1: Establishing a research territory
step 1: Claiming centrality, and/or
step 2: Placing your research within the field, and/or
step 3: Reviewing items of previous research
Move 2: Establishing a niche
step 1a: Counter-claiming, or
step 1b: Indicating a gap in current research, or
step 1c: Question raising, or
step 1d: Continuing a tradition
Move 3: Occupying the niche
step 1a: Outlining purposes, or
step 1b: Announcing present research
step 2: Announcing principal findings
step 3: Indicating research article structure
- However, the previously mentioned methods suffer from some limitations...
- The first group... cannot treat... and is limited to...
- The second group... is time consuming, and therefore expensive, is not sufficiently accurate...
Studies of articles following the IMRAD model
Introduction: Swales (1981, 1990), Dudley-Evans (1986), Swales & Najjar (1987), Hughes (1989), Gledhill (1995, 2000), Bhatia (1997), Samraj (2002)
Abstract: Gledhill (2000), Hyland (2000), Swales & Feak (2009), Bordet (2009)
Methods: Weissberg & Buker (1990), Nwogu (1997)
Results: Brett (1994), Thompson (1993), Williams (1999), Yang & Allison (2003)
Discussion: Belanger (1982), McKinlay (1983), Dudley-Evans (1986, 1994), Hopkins & Dudley-Evans (1988), Holmes (1997, 2001), Peacock (2002)
Some models
Kanoksilapatham (2004): 15 moves (3 for the Introduction, 4 for Methods, 4 for Results and 4 for the Discussion section)
Nwogu (1997): 11 moves (3 for the Introduction, 3 for Methods, 2 for Results and 3 for the Discussion section)
Peacock (2002): RA discussion sections
a three-part framework involving a series of move cycles: Introduction, Evaluation, Conclusion
combining 2 or more of 8 moves: information, finding, expected or unexpected outcome, reference to previous research, explanation, claim, limitation, recommendation.
Similar projects
AMADEUS (Amiable Article Development for User Support: Aluisio 1994): selection of "standard" sentences and collocations that are frequent in scientific texts
SCIENTEXT (Univ. Grenoble 3, Lorient, Chambéry): corpus of scientific texts, mainly in French
http://scientext.msh-alpes.fr/scientext-site/spip.php?article1
TYOS (Type Your Own Scripts, Univ. Bordeaux): teaching tool; small corpus (30 texts); annotation of "moves", verb forms, discourse connectors, etc.
http://www.tyos.org
Extraction of formulas
Manually: annotation of moves, steps and linguistic markers on a sample (Flowerdew 2010; Kanoksilapatham 2004, 2007)
• Legend
• Background information
• Domain and present situation
• Gap in knowledge or remaining problem
• Reference to previous research
• Occupying the niche
As a group, parasites are extraordinarily diverse. Even closely related parasites may behave very differently, infecting different host species, causing different pathologies, or infecting different tissues. For example, Escherichia coli bacteria, a typically harmless inhabitant of the human gut, can, in different forms, cause diarrhea, intestinal bleeding, urinary tract infections, kidney bleeding, meningitis, and other diseases. Underlying this diversity is evolution. It is widely appreciated that parasites are prone to rapid evolution, and because of their often short generation times and large population sizes, parasites may evolve far more rapidly than their hosts. Attempts to understand parasite evolution, and the relevance of that evolution to disease, go back at least half a century to the first observations of drug resistance evolution in bacteria. However, the application of evolutionary theory to parasites remains fertile ground for original research. Indeed, evolutionary biology and parasitology have undergone such rapid advances in recent years that it has been difficult to keep abreast of both. Some recent papers, including the study of Babayan et al. in this issue of PloS Biology, apply results from one branch of evolutionary theory—life history theory—to the characteristics of pathogens of medical interest such as parasitic roundworms (nematodes) and malaria. Babayan et al. propose that the life history of parasitic microfilarial worms shows evidence of adaptive “plasticity.” Specifically, they propose that worm development inside a mammalian host changes in response to the host's immunity, and that the parasite's response matches predictions from life history theory. […] Babayan et al. point out further experiments are needed to distinguish between these hypotheses. 
The first hypothesis could be tested by determining whether the outcome of the experiments described above differs when the parasite infects its natural host (the cotton rat) rather than laboratory mice. Transmission experiments (ideally using vectors to transmit the parasite between vertebrate hosts) could be used to discriminate between the second and third hypotheses. If there is less transmission from the IL-5–treated mice than the untreated mice (because the Mf in the IL-5–treated mice are less fit), then we would favor the third hypothesis.
Semi-automatically: Linux commands
138 <s> It is important to note that
 83 <s> It is interesting to note that
 29 <s> It will be interesting to determine
 17 <s> It is worth noting that the
 16 <s> It will be of interest to
 16 <s> It will be interesting to see
 16 <s> It is also important to note
 55 <s> It should be noted that the
 11 <s> It should be noted that our
  9 <s> It should be noted that this
 21 <s> It should also be noted that
 15 <s> It is also worth noting that
 13 <s> It is important to point out
  9 <s> It is also interesting to note
It is important / interesting / of interest to note / determine / see / point out that ...
It should be noted that ...
It is worth noting / mentioning that ...
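Frequency lists of sentence-initial sequences like the ones above can be approximated in a few lines of Python. This is a minimal sketch, not the Linux commands actually used: it assumes the corpus has already been split into sentences (one string per sentence, standing in for the `<s>` marker), and the function name `sentence_initial_ngrams` and the sample sentences are hypothetical.

```python
from collections import Counter


def sentence_initial_ngrams(sentences, n=6, first_word="It"):
    """Count the first n tokens of every sentence that starts with first_word."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        # Keep only sentences that open with the target word and are long enough.
        if len(tokens) >= n and tokens[0] == first_word:
            counts[" ".join(tokens[:n])] += 1
    return counts


# Hypothetical sample; a real run would iterate over the corpus sentences.
sample = [
    "It is important to note that the effect was small.",
    "It is important to note that our data are limited.",
    "It should be noted that this assay is indirect.",
    "We observed no difference between groups.",
]

for ngram, freq in sentence_initial_ngrams(sample).most_common():
    print(freq, "<s>", ngram)
```

Grouping the resulting n-grams into generalised formulas with alternative slots remains a manual step in this sketch.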
Access to the data
● By word
● By rhetorical function
● By move
● By section (for the research article)
What is the best way to access the data for a user with little or no linguistic training?
Case 1: combined access
Introduction
Mat. & Methods
Results
Discussion
  Research outcome
  Consolidating results
  Limitations of the study
    Limitations about the findings
    Limitations about the claims made
    Limitations about the methodology
    Corpus-driven examples:
      As yet, we have not detected...
      ...we are unable to provide any new insight into...
      Many questions still remain as to how...
  Further research
  Contextualizing the study
  ...
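One possible way to store such a drill-down (section, then move, then sub-category, then examples) is a small tree. The sketch below is an assumption about the data layout, not the tool's actual design: the `Node` class, the `browse` helper and the `PHRASEBOOK` name are hypothetical, the moves are placed under Discussion as an assumption, and the three examples are attached to "Limitations of the study" as a whole because the slides do not say which sub-category each one belongs to.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One level of the hierarchy: named sub-levels plus example formulas."""
    children: dict = field(default_factory=dict)
    examples: list = field(default_factory=list)


# Hypothetical content; only the branch shown on the slide is filled in.
PHRASEBOOK = Node(children={
    "Introduction": Node(),
    "Mat. & Methods": Node(),
    "Results": Node(),
    "Discussion": Node(children={
        "Research outcome": Node(),
        "Consolidating results": Node(),
        "Limitations of the study": Node(
            children={
                "Limitations about the findings": Node(),
                "Limitations about the claims made": Node(),
                "Limitations about the methodology": Node(),
            },
            examples=[
                "As yet, we have not detected...",
                "...we are unable to provide any new insight into...",
                "Many questions still remain as to how...",
            ],
        ),
        "Further research": Node(),
        "Contextualizing the study": Node(),
    }),
})


def browse(root, *path):
    """Follow the given labels down the tree and return the node reached."""
    node = root
    for label in path:
        node = node.children[label]
    return node


node = browse(PHRASEBOOK, "Discussion", "Limitations of the study")
print(list(node.children))  # choices offered at the next level
print(node.examples)        # examples attached to this level
```

Each click in the interface would then correspond to appending one label to the path passed to `browse`.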
Case 2: access by rhetorical / communicative function
List: ~30 functions
=> A hierarchy of functions for simpler access
Expressing discovery
Stating results
Referring to a table/figure
Contextualizing the study
Stating limitations of the study
Detailing procedures
Confirming a hypothesis
Proposing a hypothesis
This hypothesis / interpretation is supported by the ... / is consistent with the ...
These / Our data / findings are consistent with / support the hypothesis that ...
...
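A formula written with alternative slots ("These / Our", "data / findings", ...) can be expanded mechanically into its surface variants. The sketch below assumes a list-of-lists encoding of the slots, which is a hypothetical choice made for illustration; the `expand` function is not part of any existing tool.

```python
from itertools import product


def expand(pattern):
    """Expand a list of slots, each a list of alternatives, into full strings."""
    return [" ".join(choice) for choice in product(*pattern)]


# Slots taken from the slide; the encoding itself is an assumption.
pattern = [
    ["These", "Our"],
    ["data", "findings"],
    ["are consistent with", "support"],
    ["the hypothesis that"],
]

for variant in expand(pattern):
    print(variant, "...")
```

Such an expansion over-generates by construction, so each variant would still need to be checked against the corpus before being offered to a user.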
Case 3: access by word
data
Our data do not address the possibility that...
We have insufficient data for...
The limited data available suggest...
A growing body of data shows that...
These data are consistent with...
The data presented here suggest that...
Case 3: access by word
data
Confirming a hypothesis
Stating limitations
Indicating source of data
Indicating criteria for data selection
Our data do not enable us to...
Our data do not address the possibility that...
We have insufficient data for...
The limited data available suggest...
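Word-based access amounts to an inverted index from words to functions to formulas. The following is a minimal sketch under stated assumptions: the pairing of each formula with a function in `FORMULAS` is hypothetical (the slides do not give it explicitly), and `build_word_index` is an illustrative name, not an existing API.

```python
import re
from collections import defaultdict

# (function, formula) pairs; the pairing is an assumption made for illustration.
FORMULAS = [
    ("Stating limitations", "Our data do not enable us to..."),
    ("Stating limitations", "We have insufficient data for..."),
    ("Confirming a hypothesis", "These data are consistent with..."),
]


def build_word_index(formulas):
    """Map each lowercase word to {function: [formulas containing the word]}."""
    index = defaultdict(lambda: defaultdict(list))
    for function, formula in formulas:
        # set() avoids listing a formula twice when a word is repeated in it.
        for word in set(re.findall(r"[a-z]+", formula.lower())):
            index[word][function].append(formula)
    return index


index = build_word_index(FORMULAS)
for function, examples in index["data"].items():
    print(function, "->", examples)
```

Looking up a word first returns the functions it serves, so a user can pick a function before seeing the formulas, as in the slides.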
Research perspectives
● Determine which access method is the most suitable
● Evaluation on a selection of articles from different disciplines
● Submit the different cases to potential users
● Design tutorials for a broad audience and for English for Specific Purposes courses
● Create resources for teachers of English for Specific Purposes
● Multilingual extension
Thank you for your attention
Francesca Palma
CLILLAC-ARP
Université Paris 7 – Paris Diderot
Bibliography / 1
Aluisio, S. and Oliveira, O., A Case-Based Approach for Developing Writing Tools Aimed at Non-native English Users, in Case-based reasoning, Proceedings of ICCBR, Sesimbra, Portugal, 1995.
Ammon, U., The Present Dominance of English in Europe. With an Outlook on Possible Solutions to the European Language Problems, Sociolinguistica, Vol. 8, pp.1-14.
Bhatia, V.K., Genre-mixing in academic introductions, English for Specific Purposes, Vol. 16, Issue 3, 1997, pp. 181-195
Belanger, M., A preliminary analysis of the structure of the discussion sections in ten neuroscience journal articles, 1982, Mimeo, LSU, Aston University Reference Coll.
Bordet, G., A comparative study of PhD abstracts written in English by native and non-native speakers across different disciplines, 2009, Proceedings of Corpus Linguistics Conference, Liverpool
Brett, P. A genre analysis of the results section of sociology articles, English for Specific Purposes, Vol.13, Issue 1, 1994, Pages 47-59
Dudley-Evans, T., Genre-analysis: An investigation of the introduction and discussion sections of M.Sc. Dissertations, 1986, in M. Coulthard (Ed.), Talking About Text, Birmingham, U.K., ELR, Birmingham University
Flowerdew, L. (in press) ESP and Corpus Studies. In D. Belcher, A. Johns & B. Paltridge (eds.) New Directions for ESP Research. Ann Arbor, MI: University of Michigan Press.
Flowerdew, L. (in press) Corpus-based Discourse Analysis. In J.P. Gee & M. Handford (eds.) Routledge Handbook of Discourse Analysis. London: Routledge.
Fontana, N., Caldeira, S., De Oliveira, C., Oliveira Jnr., N., Computer Assisted Writing: Applications to English as a Foreign Language, in CALL, 1993, Vol. 6, No. 2, pp. 145-161.
Gledhill, C., Collocation and Genre Analysis. The discourse function of collocation in cancer research abstracts and articles, 1995, Zeitschrift für Anglistik und Amerikanistik, Vol. 1, pp. 1-26.
Gledhill, C., The discourse function of collocation in research article introductions, 2000, English for Specific Purposes, Vol. 19, pp. 115-135.
Holmes, R., Genre analysis, and the social sciences: An investigation of the structure of research article discussion sections in three disciplines, English for Specific Purposes, Vol. 16, Issue 4, 1997, pp. 321-337
Hopkins, A. and Dudley-Evans, T., A genre-based investigation of the discussion sections in articles and dissertations, English for Specific Purposes, 1988, Vol.7, No.2, pp. 113-121
Hughes, G., Article introductions in computer journals. 1989, M.A. Dissertation, University of Birmingham.
Bibliography / 2
Hyland, K. Disciplinary discourses: writer stance in research articles. 2000, in Texts, Process, and Practices, Longman, London, pp. 99-121.
Kanoksilapatham, B., Rhetorical structure of biochemistry research articles, in English for Specific Purposes, 2005, No. 24, pp.269-292.
McKinlay, J., An analysis of discussion sections in medical journal articles, 1983, M.A. Dissertation, University of Birmingham
Nwogu, K. N., The Medical Research Paper: Structure and Functions, in English for Specific Purposes, 1997, Vol.16, No. 2, pp. 119-138.
Peacock, M., Communicative moves in the discussion section of research articles, in System, 2002, Vol.30, pp.479-497.
Pecman, M., Phraséologie contrastive anglais-français : analyse et traitement en vue de l’aide à la rédaction scientifique, 2004, PhD Dissertation, Université de Nice-Sophia Antipolis
Samraj, B., Introductions in research articles: variations across disciplines, English for Specific Purposes, Vol. 21, Issue 1, 2002, Pages 1-17
Swales, J., Aspects of article introductions, 1981, ESP Research Report, (1), Aston University, Birmingham
Swales, J., Genre Analysis: English in academic and research settings, 1990, Cambridge, Cambridge University Press
Swales, J., English as a Tyrannosaurus rex, World Englishes, 1997, Vol. 16, pp. 373-382.
Swales, J., and Feak, C., English in Today’s Research World: Abstracts and the Writing of Abstracts, 2009, EAPP, Ann Arbor, MI: The University of Michigan Press.
Swales, J. and Najjar, H., The Writing of Research Article Introductions, 1987, Written Communication 4(2), Sage Publications, pp. 175-191.
Thompson, D., Arguing for experimental facts in science, 1993, Written Communication, Vol. 10, No. 1, pp. 106-128.
Volanschi, A., Étude et modélisation des phénomènes collocationnels. Implémentation dans un système d'aide à la rédaction en anglais scientifique, 2008, PhD Dissertation, Université Paris 7
Volanschi, A., Kübler, N., Building an electronic combinatory dictionary as a writing aid tool for researchers in biology. eLexicography in the 21st century, Proceedings of eLex 2009, Louvain-la-Neuve.
Weissberg, R., and Buker, S., Writing up research. 1990, Englewood Cliffs, NJ: Prentice-Hall
Williams, I., Results section of medical research articles, 1999, English for Specific Purposes, Vol. 18, Issue 1, pp. 347-366.
Yang, R., & Allison, D., Research articles in applied linguistics: moving from results to conclusions. 2003, English for Specific Purposes, Vol. 22, pp. 365-385.