
Idcc kansa-kansa-arbuckle

Date post: 15-Dec-2014
Upload: eric-kansa
Description:
Presentation for the San Francisco #IDCC14 conference (http://www.dcc.ac.uk/events/idcc14/day-two-papers). The presentation covers publishing zooarchaeology data with Open Context (http://opencontext.org) to study the spread of farming from the Near East to Europe through Anatolia. It looks at editorial processes, linked data annotation, and other workflow concerns relating to making raw data more usable for comparative analysis.
Transcript
Page 1: Idcc kansa-kansa-arbuckle

Publishing and Pushing: Mixing Models for Communicating Research Data in Archaeology

Sarah Whitcher Kansa
The Alexandria Archive Institute & Open Context

Benjamin Arbuckle
University of North Carolina, Chapel Hill

Eric C. Kansa (@ekansa)
UC Berkeley D-Lab & Open Context

Unless otherwise indicated, this work is licensed under a Creative Commons Attribution 3.0 License <http://creativecommons.org/licenses/by/3.0/>

Page 2: Idcc kansa-kansa-arbuckle

Introduction

Challenges in Reusing Data
1. Background
2. Data publishing workflow
3. Data curation and dynamism

Page 3: Idcc kansa-kansa-arbuckle
Page 4: Idcc kansa-kansa-arbuckle
Page 5: Idcc kansa-kansa-arbuckle

Need more carrots!
1. Citation, credit, intellectually valued
2. Research outcomes (new insights from data reuse!)

Page 6: Idcc kansa-kansa-arbuckle

EOL Computable Data Challenge
(Ben Arbuckle, Sarah W. Kansa, Eric Kansa)

Page 7: Idcc kansa-kansa-arbuckle
Page 8: Idcc kansa-kansa-arbuckle

Large scale data sharing & integration for exploring the origins of farming. Funded by EOL / NEH

Page 9: Idcc kansa-kansa-arbuckle

1. 300,000 bone specimens
2. Complex: dozens, up to 110 descriptive fields
3. 34 contributors from 15 archaeological sites
4. More than 4 person years of effort to create the data!

Page 10: Idcc kansa-kansa-arbuckle

Relatively collaborative bunch: Ben Arbuckle cultivated relationships and built trust over the years prior to EOL funding.

Page 11: Idcc kansa-kansa-arbuckle

“204: Dynamics of Data Reuse when Aggregating Data through Time and Space: The Case of Archaeology and Zoology”

Elizabeth Yakel; Ixchel Faniel; Rebecca Frank

Page 12: Idcc kansa-kansa-arbuckle

Introduction

Challenges in Reusing Data
1. Background
2. Data publishing workflow
3. Data curation and dynamism

Page 13: Idcc kansa-kansa-arbuckle

1. Referenced by US National Science Foundation and National Endowment for the Humanities for Data Management

2. “Data sharing as publishing” metaphor

Page 14: Idcc kansa-kansa-arbuckle

Raw Data: Idiosyncratic, sometimes highly coded, often inconsistent

Page 15: Idcc kansa-kansa-arbuckle

Raw Data Can Be Unappetizing

Page 16: Idcc kansa-kansa-arbuckle

Publishing Workflow

Improve / Enhance
1. Consistency
2. Context (intelligibility)

Page 17: Idcc kansa-kansa-arbuckle

Sometimes data is better served cooked

Page 18: Idcc kansa-kansa-arbuckle

- Documentation
- Review, editing
- Annotation

Page 19: Idcc kansa-kansa-arbuckle

- Documentation
- Review, editing
- Annotation

Page 20: Idcc kansa-kansa-arbuckle

?

- Documentation
- Review, editing
- Annotation

Page 21: Idcc kansa-kansa-arbuckle

- Documentation
- Review, editing
- Annotation

Page 22: Idcc kansa-kansa-arbuckle

Decoding: Time consuming effort; 10 times (!) longer…

- Documentation
- Review, editing
- Annotation

Page 23: Idcc kansa-kansa-arbuckle

“Ovis orientalis”

Code: 14

Wild sheep

Code: 70

Code: 16

Ovis orientalis

Code: 15

Sheep, wild

O. orientalis

Sheep (wild)

Page 24: Idcc kansa-kansa-arbuckle

- Documentation
- Review, editing
- Annotation

Page 25: Idcc kansa-kansa-arbuckle

“Ovis orientalis”
http://eol.org/pages/311906/

Code: 14

Wild sheep

Code: 70

Code: 16

Ovis orientalis

Code: 15

Sheep, wild

O. orientalis

Sheep (wild)
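
The annotation step on this slide, where many local codes and spellings all point at one shared concept, can be sketched roughly as follows. This is a minimal illustration and not Open Context's actual implementation: the labels and the EOL URL come from the slide, while the dataset names (`site_a`, `site_b`, ...) and the pairing of each code with a dataset are hypothetical.

```python
from typing import Dict, Optional, Tuple

# URL taken from the slide; everything keyed to it below is illustrative.
EOL_WILD_SHEEP = "http://eol.org/pages/311906/"

# Hypothetical lookup table: (dataset, normalized local label) -> concept URI.
# The dataset names are invented; a numeric code only has meaning
# inside the dataset that defined it, so the dataset is part of the key.
LABEL_TO_URI: Dict[Tuple[str, str], str] = {
    ("site_a", "14"): EOL_WILD_SHEEP,
    ("site_b", "70"): EOL_WILD_SHEEP,
    ("site_c", "16"): EOL_WILD_SHEEP,
    ("site_d", "15"): EOL_WILD_SHEEP,
    ("site_e", "wild sheep"): EOL_WILD_SHEEP,
    ("site_f", "ovis orientalis"): EOL_WILD_SHEEP,
    ("site_g", "sheep, wild"): EOL_WILD_SHEEP,
    ("site_h", "o. orientalis"): EOL_WILD_SHEEP,
    ("site_i", "sheep (wild)"): EOL_WILD_SHEEP,
}


def normalize(label: str) -> str:
    """Lowercase the label and collapse runs of whitespace."""
    return " ".join(label.strip().lower().split())


def annotate(dataset: str, label: str) -> Optional[str]:
    """Return the shared concept URI, or None if the label is not yet mapped."""
    return LABEL_TO_URI.get((dataset, normalize(label)))
```

Returning `None` for an unmapped label, rather than guessing, leaves room for the back-and-forth with data authors that decoding requires.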

Page 26: Idcc kansa-kansa-arbuckle

● Controlled vocabulary
● Linked Data applications

Page 27: Idcc kansa-kansa-arbuckle

“Sheep/goat”
http://eol.org/pages/32609438/

1. Needed to mint new concepts like “sheep/goat”

2. Vocabularies need to be responsive for multidisciplinary applications

Page 28: Idcc kansa-kansa-arbuckle
Page 29: Idcc kansa-kansa-arbuckle
Page 30: Idcc kansa-kansa-arbuckle

Linking to UBERON
1. Needed a controlled vocabulary for bone anatomy
2. Better data modeling than common in zooarchaeology, adds quality.

Page 31: Idcc kansa-kansa-arbuckle

Linking to UBERON
1. Models links between anatomy, developmental biology, and genetics
2. Unexpected links between the Humanities and Bioinformatics!

Page 32: Idcc kansa-kansa-arbuckle
Page 33: Idcc kansa-kansa-arbuckle
Page 34: Idcc kansa-kansa-arbuckle

7000 BC (many pigs, cattle)

7500 BC (sheep + goat dominate, few pigs, few cattle)

6500 BC (few pigs, mixing with wild animals?)

8000 BC (cattle, pigs, sheep + goats)

• Not a neat model of progress to adopt a more productive economy. Very different, sometimes piecemeal adoption in different regions.

• Separate coastal and inland routes for the spread of domestic animals, over a 1000-year time period.

Page 35: Idcc kansa-kansa-arbuckle

Easy to Align
1. Animal taxonomy
2. Bone anatomy
3. Sex determinations
4. Side of the animal
5. Fusion (bone growth, up to a point)

Page 36: Idcc kansa-kansa-arbuckle

Hard to Align (poor modeling, recording)
1. Tooth wear (age)
2. Fusion data
3. Measurements

Despite common research methods!!

Page 37: Idcc kansa-kansa-arbuckle

“Under the hood” exposure will lead to better data documentation practices?

Page 38: Idcc kansa-kansa-arbuckle

Nobody expected their data to see wider scrutiny either.

Page 39: Idcc kansa-kansa-arbuckle

Professional expectations for data reuse

1. Need better data modeling (than feasible with, cough, Excel)

2. Data validation, normalization

3. Requires training & incentives for researchers to care more about quality of their data!
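
As a rough illustration of the validation and normalization called for in point 2, the sketch below checks a single specimen record. The field names (`side`, `measurement_mm`) and the allowed vocabulary are assumptions made for this example; they are not taken from the project's actual schema.

```python
from typing import Dict, List

# Assumed controlled vocabulary for the "side" field (hypothetical).
ALLOWED_SIDES = {"left", "right", "unknown"}


def validate_record(record: Dict[str, str]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    problems: List[str] = []

    # Normalize before comparing so that "Left" and " left " both pass.
    side = record.get("side", "").strip().lower()
    if side not in ALLOWED_SIDES:
        problems.append(f"side: unexpected value {record.get('side')!r}")

    # Measurements are optional, but when present they must be positive numbers.
    raw = record.get("measurement_mm", "").strip()
    if raw:
        try:
            value = float(raw)
        except ValueError:
            problems.append(f"measurement_mm: not a number {raw!r}")
        else:
            if value <= 0:
                problems.append(f"measurement_mm: must be positive, got {value}")

    return problems
```

Checks like these are hard to enforce in a spreadsheet, which is the point of the Excel remark above: the record is accepted at entry time and the problem only surfaces during reuse.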

Page 40: Idcc kansa-kansa-arbuckle

Data are challenging!

1. Decoding takes 10x longer

2. Data management plans should also cover data modeling, quality control (esp. validation)

3. More work needed modeling research methods (esp. sampling)

4. Editing, annotation requires lots of back-and-forth with data authors

5. Data needs investment to be useful!

Page 41: Idcc kansa-kansa-arbuckle

Introduction

Challenges in Reusing Data
1. Background
2. Data publishing workflow
3. Data curation and dynamism

Page 42: Idcc kansa-kansa-arbuckle

Investing in Data is a Continual Need

1. Data and code co-evolve. New visualizations, analysis may reveal unseen problems in data.

2. Data and metadata change routinely (revised stratigraphy requires ongoing updates to data in this analysis)

3. Problems, interpretive issues in data (and annotations) keep cropping up.

4. Is publishing a bad metaphor implying a static product?
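
One way to picture data that keeps changing after release is a change log computed between two versions of a dataset, much as software releases carry release notes. The sketch below is only an illustration under assumed structures: each release is a plain dictionary keyed by a hypothetical record identifier, and none of the names reflect how Open Context actually versions data.

```python
from typing import Dict, List

Record = Dict[str, str]


def diff_releases(old: Dict[str, Record], new: Dict[str, Record]) -> List[str]:
    """List added, removed, and changed records between two dataset releases."""
    changes: List[str] = []
    for rid in sorted(old.keys() | new.keys()):
        if rid not in old:
            changes.append(f"added {rid}")
        elif rid not in new:
            changes.append(f"removed {rid}")
        else:
            # Compare field by field so the log says what was revised.
            for field in sorted(old[rid].keys() | new[rid].keys()):
                before = old[rid].get(field)
                after = new[rid].get(field)
                if before != after:
                    changes.append(f"changed {rid}.{field}: {before!r} -> {after!r}")
    return changes
```

A revised stratigraphic assignment would then show up as a `changed` entry for the affected records, so analyses built on an earlier release can tell which of their inputs moved.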

Page 43: Idcc kansa-kansa-arbuckle
Page 44: Idcc kansa-kansa-arbuckle

Data sharing as publication

Data sharing as open source release cycles?

Page 45: Idcc kansa-kansa-arbuckle

Data sharing as publication

Data sharing as open source release cycles?

Page 46: Idcc kansa-kansa-arbuckle

Data sharing as publication

AND

Data sharing as open source release cycles

Page 47: Idcc kansa-kansa-arbuckle

One does not simply walk into Mordor

Academia and share usable data…

Image Credit: Copyright New Line Cinema

Page 48: Idcc kansa-kansa-arbuckle

Final Thoughts

Data require intellectual investment, methodological and theoretical innovation.

Institutional structures are poorly configured to support data-powered research.

New professional roles needed, but who will pay for it?

Page 49: Idcc kansa-kansa-arbuckle

Thank you!

IDCC reviewers (excellent, very helpful comments!)

