
Folkert de Vriend & Martin Snijders 18/11/2011

Date post: 10-Feb-2016
Page 1

Bridging the Gap between First Language Acquisition and Historical Dialectology with the Help of Digital Humanities

Folkert de Vriend & Martin Snijders
18/11/2011

Page 2

Time and team

• Project duration: 1 year (May 2011 - May 2012)
• Multidisciplinary team:
  o Leonie Cornips
  o Wilbert Heeringa
  o Marc Kemps-Snijders
  o Martin Snijders
  o Student assistants: Anke, Gertruud, Yvonne
  o Jos Swanenberg
  o Folkert de Vriend

Page 3

General

• COAVA: COgnition, Acquisition and VAriation Tool
• Aims of COAVA:
  A) Curation of resources from two separate linguistic subdisciplines: first language acquisition and dialect geography.
  B) Development of a demonstrator tool for interdisciplinary research into the lexical characteristics of concepts.

Page 4

A) Curation

Page 5

Resources in COAVA

• Seven corpora from CHILDES
  o The Netherlands and Flanders
  o Children (mostly between 2 and 3.5 years)
• Part III of WBD/WLD
  o (Dutch and Flemish) Brabant and Limburg
  o Adults

Page 6

CLARIN compliance

Dialect data and CHILDES data
• CMDI metadata
• Persistent identifiers
• ISOcat

Dialect data
• Lexical Markup Framework (LMF)

Page 7

B) Demonstrator

Page 8

Lexical characteristics

• First language acquisition: For some concepts the lexical form is typically acquired early (‘dog’, for instance), while for other concepts it is typically acquired later (‘blue titmouse’, for instance).
• Dialect geography: For some concepts there is a lot of lexical variation, while for other concepts there is very little variation.

Page 9

Value of combined interpretation

• For researchers in both disciplines these characteristics are interesting for at least two reasons:
  o Research into the ‘basic level vocabulary’ of a community
  o Research into the relation between age of acquisition and (dialect) variation
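
A combined view of the two characteristics could be sketched as follows. This is only an illustration, not the COAVA implementation: the record layouts, the function names, and all sample values are invented, and taking the earliest attested age in a corpus as the age of acquisition is a simplifying assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def earliest_age(records: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Map each concept to the lowest age (in months) at which it is attested."""
    first: Dict[str, int] = {}
    for concept, age in records:
        first[concept] = min(age, first.get(concept, age))
    return first


def variant_counts(records: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Map each concept to its number of distinct dialect forms."""
    forms = defaultdict(set)
    for concept, form in records:
        forms[concept].add(form)
    return {concept: len(found) for concept, found in forms.items()}


def combine(child, dialect) -> Dict[str, Tuple[int, int]]:
    """Pair both measures for concepts that occur in both resources."""
    ages = earliest_age(child)
    counts = variant_counts(dialect)
    return {c: (ages[c], counts[c]) for c in ages.keys() & counts.keys()}


# Invented placeholder records, not real CHILDES or WBD/WLD data.
child_records = [("dog", 24), ("dog", 30), ("ball", 26), ("cup", 28)]
dialect_records = [("dog", "form_a"), ("dog", "form_b"),
                   ("dog", "form_a"), ("ball", "form_c")]

combined = combine(child_records, dialect_records)
for concept in sorted(combined):
    print(concept, combined[concept])
```

Each concept ends up with a pair (earliest age in months, number of distinct forms), which is the kind of table one would need before looking for a relation between the two measures.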

Page 10

Implementation

• A concept taxonomy is constructed. This taxonomy will only contain concepts for which lexical forms can be found in both resources.
• Since the Dutch CHILDES data mostly contain data for children aged between 2 and 3.5 years, we focus on lexical forms that are nouns.
• To enable linking from this taxonomy to the CHILDES data, these data first need to be lemmatised and tagged for their POS (Lexicon by Gilles).
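
The selection step described on this slide can be pictured as a set intersection over nouns. The sketch below is a minimal illustration under stated assumptions, not the project's code: the `Token` layout, the `NOUN` tag label, the idea that a concept can be matched by its lowercase lemma, and the sample entries are all hypothetical.

```python
from typing import Iterable, NamedTuple, Set


class Token(NamedTuple):
    """A lemmatised, POS-tagged token from child speech (hypothetical layout)."""
    lemma: str
    pos: str


def shared_noun_concepts(child_tokens: Iterable[Token],
                         dialect_concepts: Iterable[str]) -> Set[str]:
    """Return concepts that occur as nouns in the child data and also in the dialect data."""
    child_nouns = {t.lemma.lower() for t in child_tokens if t.pos == "NOUN"}
    return child_nouns & {c.lower() for c in dialect_concepts}


# Illustrative input only; not taken from CHILDES or the WBD/WLD.
tokens = [Token("hond", "NOUN"), Token("lopen", "VERB"), Token("bal", "NOUN")]
concepts = ["hond", "pimpelmees"]

taxonomy_concepts = shared_noun_concepts(tokens, concepts)
print(sorted(taxonomy_concepts))
```

Only concepts present on both sides survive, which mirrors the requirement that the taxonomy contains concepts with lexical forms in both resources.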

Page 11

Demo

Page 12

Technology

• Client-server application
• Search services

• Java/Google Web Toolkit
• Apache/Tomcat
• Solr search server
• Open Source

Page 13

Solr

• Indices, multi-core
• Faceted search
• Fast
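
To make the faceted, multi-core setup concrete, a request to one Solr core could be built roughly like this. The sketch only constructs the URL and sends nothing. The host, the core name `dialect`, and the field names `concept` and `place` are assumptions for illustration; `q`, `facet`, `facet.field`, `rows`, and `wt` are standard Solr query parameters.

```python
from urllib.parse import urlencode


def facet_query_url(base_url: str, core: str, query: str, facet_field: str) -> str:
    """Build a Solr select URL that requests facet counts on one field."""
    params = {
        "q": query,                  # main query
        "facet": "true",             # switch faceting on
        "facet.field": facet_field,  # field to count values for
        "rows": 10,                  # number of documents to return
        "wt": "json",                # response format
    }
    return f"{base_url}/{core}/select?{urlencode(params)}"


# Hypothetical host, core and field names.
url = facet_query_url("http://localhost:8983/solr", "dialect", "concept:hond", "place")
print(url)
```

With one core per resource, the same helper could target a second core by changing the `core` argument.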

Page 14

Demo

Page 15

Thank you

