The new research data centre IANUS – approaches to more data quality
Dr. Felix F. Schäfer, German Archaeological Institute (DAI) Berlin
CAA Netherlands – Germany Joint Chapter Meeting Groningen/NL, 30.11. – 01.12.2012
http://www.ianus-fdz.de
2 CAA Groningen / NL, 01.12.2012
Influenced by a growing amount of (digital) data, which is generally
• produced by different but thematically related (single) disciplines such as archaeology, philology, ancient history, anthropology, geography, geology, surveying, archaeobiology, archaeometry, climatology etc.
• based on different methods such as excavations, surveys, building research, material analysis, dendrochronology, remote sensing, geophysical prospection, etc.
• stored in heterogeneous infrastructures and projects, such as museums, universities, academies, state conservation authorities, research institutes etc.
• stuck in local systems with limited access and exchange opportunities
• left without adequate documentation or concepts for research data management (backup, archiving, re-use, sharing)
Archaeological research today
projects / activities (individuals, groups, institutions) with dynamic data
research data center with static data
web portal with shared data
Data lifecycle within archaeology
Different types of primary and secondary data:
• unstructured texts: reports, diaries, comments, catalogs, editions ...
• structured text: tables, lists, statistics ...
• raster images: photos, aerial photos, satellite images, geophysical prospection ...
• vector data: drawings, reconstructions, plans / profiles ...
• 3D data: 3D models, reconstructions, point clouds, object scans, ...
• coordinates: survey data, ...
• complex data: databases, GIS, ...
• measurement series: chemical & physical analysis, samples, geophysical measurements ...
Heterogeneity of data
Scientific data in archaeology and classical studies are highly
• unique because they describe individual, non-reproducible objects and contexts
• durable because their scientific relevance extends beyond the limits of individual projects
• distributed and disparate because the actors and the uses in administration, tourism, science and education differ widely
• heterogeneous in content and form, as they span different disciplines: humanities, natural sciences and social sciences
• at risk because specialized concepts and infrastructures for the sustainable management of digital data are missing
Characteristics of data
Key challenges to improve the quality of data
• document your methods, your terms, your systems, your questions
• use standards and define working rules
• make your data explicit, not implicit
• implement research data management plans
• structure your data in a comprehensible way
• involve all relevant actors and describe workflows
→ the higher the data quality, the more easily the data can be archived for the future and the better they can be reused by others
Data quality in archaeology
Data-Life-Cycle, UK Data Archive http://data-archive.ac.uk/create-manage/life-cycle
Reasons for responsible data management, qualified archiving and long-term dissemination of data
• Transfer of knowledge to others irrespective of individuals, projects and institutions
• Preservation of primary and secondary data for the future, not only by publications
• Allow reuse of data for new tasks, questions and methods
• Cost reduction in the generation of new data and avoidance of redundant data collection
• More efficient work due to better interoperability and exchange
• Compliance with legal requirements, such as the obligation to keep information
• Increased relevance of one's own work through increased visibility and access
Benefits and value
Need for a central IT infrastructure, which
• publishes technical and semantic standards and supports compliance
• helps to overcome existing barriers between different disciplines, methods, actors and institutions
• improves research by linking and sharing existing data
• archives and distributes the data generated today for posterity
→ Research data management, quality control and long-term archiving of data are not tasks for the end of a project; they must be taken into account before the first data are produced
Conclusion
2007 establishment of an expert group by the DFG to evaluate the current situation of archaeological research in Germany, to name challenges and to promote specific needs
2010 proposal to the DFG for funding the development of a national research center for archaeological data in Germany
2011 April approval of the proposal
2011 Sept. start of the IANUS project with 2 employees (at the DAI in Berlin) and various working groups
2011-2014 first 3-year funding period for the conceptual work (including testbeds)
2014-2017 2nd (planned) 3-year funding period to implement the concept
2018 targeted independent operation of the center
Actions so far
IANUS – Research Data Centre for Archaeology & Ancient Studies
Participating institutions
Steering Committee: DFG working group
Coordination: German Archaeological Institute (DAI)
Basis: members (approx. 100) of the different working groups from various institutions (approx. 40)
Time schedule
Future Services
Long-term preservation:
• archiving of static, largely processed data from projects and institutions (e.g. deposit after the end of project funding)
• preparation of data along defined workflows and standards, e.g. formal verification, completion of metadata, format migration, and documentation
• submission into an offline archive system for bitstream preservation
Data dissemination:
• dissemination of the archived data via an online portal and defined interfaces for re-use in new projects
• granular access according to rules and rights defined by the depositors
• associated metadata made searchable in an “archive catalogue”
• allocation of unique persistent identifiers (PIDs) for citability of digital resources
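One common building block of bitstream preservation is a fixity check: a checksum is recorded for every deposited file and re-computed later to detect silent changes. The following is a minimal sketch of that idea only, not the IANUS workflow; the function names (`sha256_of`, `build_manifest`, `changed_files`), the choice of SHA-256 and the directory layout are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(deposit_dir: Path) -> dict[str, str]:
    """Map each file below deposit_dir (relative POSIX path) to its checksum."""
    return {
        p.relative_to(deposit_dir).as_posix(): sha256_of(p)
        for p in sorted(deposit_dir.rglob("*"))
        if p.is_file()
    }


def changed_files(deposit_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries whose file is missing or no longer matches."""
    changed = []
    for rel_path, expected in manifest.items():
        target = deposit_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            changed.append(rel_path)
    return changed
```

A depositor or archive could store the manifest next to the data at ingest and run `changed_files` at regular intervals; an empty list means every recorded file is still bit-identical.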
Catalogue:
• central web portal to retrieve information about find spots, objects, activities, research projects, and archive collections in German
• records refer both to content within the IANUS archive and to digital and analogue resources of other institutions
• if information is available online, redirects to the original platform will be provided
Guides to good practice:
• central web portal with information about the application of IT in archaeology, addressing all phases of the data lifecycle
• collect, curate and promote existing standards, including practical help in applying them (e.g. tutorials, templates, tools, best-practice examples)
• wiki to enable collaborative development of the standards and guides
Future Services
Project support:
• Support of ongoing projects with highly dynamic, changing data through a trustworthy, secure cloud system to ease the synchronisation and sharing of project data
• By enabling access to complete and up-to-date file directories, the management and backup of data will become easier for projects with distributed partners
Education and qualification:
• Provision of courses, summer schools, training and online materials for students, teachers and researchers
• Transfer of necessary IT knowledge relating to the production, curation, management and archiving of digital research data
• Definition of educational standards and core skills for university courses to reduce the gap between academic training and the technical requirements of working life
Future Services
Services and data will be
• related to files / documents (e.g. pictures, texts, drawings, databases, 3D models, point clouds, ...)
• related to data / metadata in specialised systems (e.g. registries, inventories, online-editions, catalogues, ... )
• free of charge for data consumers, who must agree to usage licences
• subject to non-exclusive agreements, i.e. data can be deposited and published elsewhere
• offered as a supplement to existing in-house solutions or as a substitute for missing infrastructures
Principles
Still in the conceptual phase, but already decided:
• Reference model will be the OAIS standard (Open Archival Information System)
• Depositors define the visibility of their archived files and related metadata
• Different levels of access:
  • free
  • registered
  • restricted to groups
  • access on demand
  • embargo
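The five access levels listed above could be modelled roughly as follows. This is a sketch under stated assumptions, not a description of the IANUS implementation, which was still in its conceptual phase: the class and field names (`AccessLevel`, `AccessRule`, `User`, `may_access`) are hypothetical, "access on demand" is assumed to mean an explicit grant by the depositor, and an embargoed resource is assumed to become freely accessible once its embargo date has passed.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class AccessLevel(Enum):
    FREE = "free"
    REGISTERED = "registered"
    GROUP = "restricted to groups"
    ON_DEMAND = "access on demand"
    EMBARGO = "embargo"


@dataclass
class AccessRule:
    """Rule a depositor attaches to a file or metadata record (hypothetical)."""
    level: AccessLevel
    groups: frozenset = frozenset()        # only used for GROUP
    embargo_until: Optional[date] = None   # only used for EMBARGO


@dataclass
class User:
    registered: bool = False
    groups: frozenset = frozenset()
    granted: bool = False  # assumption: depositor approved an on-demand request


def may_access(rule: AccessRule, user: User, today: date) -> bool:
    """Decide whether a user may retrieve a resource under the given rule."""
    if rule.level is AccessLevel.FREE:
        return True
    if rule.level is AccessLevel.REGISTERED:
        return user.registered
    if rule.level is AccessLevel.GROUP:
        return bool(rule.groups & user.groups)
    if rule.level is AccessLevel.ON_DEMAND:
        return user.granted
    # EMBARGO: closed until the date has been reached, open afterwards (assumption)
    return rule.embargo_until is not None and today >= rule.embargo_until
```

Keeping the rule as data attached to each resource, rather than hard-coding it in the portal, matches the principle that depositors themselves define the visibility of their files and metadata.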
Principles
• Archaeology Data Service – York/UK
• E-depot Nederlandse archeologie – Den Haag/NL
• GESIS Archive of Social Sciences Data – Köln/D
• PANGAEA: Data Publisher for Earth & Environmental Science – Bremen/D
• MPI for Psycholinguistics: The Language Archive – Nijmegen/NL
Existing examples
Guidelines for the application of IT in archaeological research
In cooperation with the DFG
Definition of minimal standards on the basis of existing guides
Guidelines: old platform
AG Archivierung (Archiving Working Group) of the Commission „Archäologie und Informationssysteme“ in the Verband der Landesarchäologen der Bundesrepublik Deutschland
Ratgeber zur Archivierung digitaler Daten: Trittsteine auf dem Weg zum Digitalarchiv (Guide to the archiving of digital data: stepping stones on the way to the digital archive)
Preliminary version 0.07. The AG Archivierung of the Commission „Archäologie und Informationssysteme“ in the Verband der Landesarchäologen aims to review generally available methodological and technical approaches to archiving digital archaeological data and to provide practice-oriented recommendations, suggestions and tips. The results of this work feed into this guide. Although some topics are still in progress, substantial parts are already available and are presented in this preliminary version.
Existing guidelines
Guidelines: new platform
IANUS c/o Deutsches Archäologisches Institut
Podbielskiallee 69-71
14195 Berlin
Tel.: +49 30 - 187711 359
http://www.ianus-fdz.de
Project leader:
Prof. Dr. Ortwin Dally, DAI
Prof. F. Fless, DAI
Coordination:
Dr. Felix Schäfer / Maurice Heinrich
Funding:
DFG – Deutsche Forschungsgemeinschaft
Website & contact