
Using OA Content

Date post: 09-May-2015
Upload: philip-bourne
Description:
Joint ICTP-IAEA-UNESCO Workshop on New Trends for Science Dissemination Sep 7 2011.
Using Open Access Content: Ten Simple Observations SciVee & Beyond the PDF Philip E. Bourne University of California San Diego [email protected] www.sdsc.edu/pb http://www.slideshare.net/pebourne/p-lo-s
Transcript
Page 1: Using OA Content

Using Open Access Content: Ten Simple Observations

SciVee & Beyond the PDF

Philip E. Bourne

University of California San Diego

[email protected]

www.sdsc.edu/pb

http://www.slideshare.net/pebourne/p-lo-s

Page 2: Using OA Content

My Two Lectures

1. The promise - Open Access, Open Science with particular reference to PLoS

2. The fulfillment - What Open Access facilitates and examples of how it benefits science

Page 3: Using OA Content

The fulfillment - What Open Access facilitates and examples of how it benefits science

• What you might get from this lecture:

– How others are using open science including open access content

– Ideas for how you might use the content

Page 4: Using OA Content

Today's Exemplars

http://www.mendeley.com/

http://getutopia.com/documents/

http://www.scivee.tv/node/17389

Page 5: Using OA Content

Let me Start with a Few Observations

Observation 1. Scientific culture is causing us to try and write more and read more

Page 6: Using OA Content

You Cannot Possibly Read a Fraction of the Papers You Should

Renear & Palmer 2009 Science 325:828-832

write more and read more

Page 7: Using OA Content

Scanning More Reading Less

Renear & Palmer 2009 Science 325:828-832

write more and read more

Page 8: Using OA Content

And So…

• There has been a paradigm shift which places more emphasis on writing and less on reading – witness blogs, use of literature aggregators (e.g. PubMed), H-factors, etc.

• We need help in assimilating knowledge

write more and read more

Page 9: Using OA Content

Observation 2

In 1993 there were very few electronic journals, by 2003 nearly all were on-line, by 2013 there will be little or no paper

Most traditional publishers have only really achieved an electronic, print-like experience – the power of the medium is there for the taking

Page 10: Using OA Content

Observation 3. The Sociology of Scientific Disciplines is Different

Page 11: Using OA Content

Observation 4:

• The biomedical sciences are progressive:

– Alternative business models have gained ground – Open Access

– Databases are becoming more like journals and journals are becoming more like databases

– New modes of knowledge and data access are gaining some ground e.g.

• Textpresso – ontology-based mining and retrieval system

• iHOP – Information Hyperlinked over Proteins

Page 12: Using OA Content

Observation 5. I Believe Open Access IF Fully Accepted Could Profoundly Change Scholarly Discourse

It remains a big IF

Open Access: Taking Full Advantage of the Content

PLoS Comp. Biol. 2008 4(3) e1000037

Page 13: Using OA Content

It's Happening in the Closed Access Space

• A very clever idea – The App model

• Leverage content

• Provide an open API

• Get the community to do all the work

• Drive folks to buy content

Why Don’t We Have Such Developments in OA?

Page 14: Using OA Content

Growth of PubMed Central

Open access could profoundly change scholarly discourse

Page 15: Using OA Content

Open Access (Creative Commons License)

1. All published materials available on-line free to all (author pays model)

2. Unrestricted access to all published material in various formats, e.g. XML, provided attribution is given to the original author(s)

3. Copyright remains with the author

Open access could profoundly change scholarly discourse

Page 16: Using OA Content

Open Access (Creative Commons License)

1. All published materials available on-line free to all (reader pays model)

2. Unrestricted access to all published material in various formats, e.g. XML, provided attribution is given to the original author(s)

3. Copyright remains with the author

Open Access: Taking Full Advantage of the Content

PLoS Comp. Biol. 2008 4(3) e1000037

Open access could profoundly change scholarly discourse

Page 17: Using OA Content

Observation 6

A biological database is not really that different from a biological journal – this can be exploited

PLoS Comp. Biol. 2005 1(3) e34

Page 18: Using OA Content

The Data Knowledge Cycle

Biocuration

Electronic Supplements

Databases versus journals

Page 19: Using OA Content

Both Are Under Stress

• PubMed contains ~21M entries (May 2011)

• ~100,000 papers indexed per month

• In Feb 2009:

– 67,406,898 interactive searches were done

– 92,216,786 entries were viewed

• 1330 databases reported in NAR 2011

• MetaBase http://biodatabase.org reports 2,651 entries edited 12,587 times

PLoS Comp. Biol. 2005 1(3) e34

Page 20: Using OA Content

Some More Comparisons

• Journals have a pretty standardized interface

• Journals have a business model

• The quality is declining as numbers increase (?)

• Audience believes they are sustainable

• Efforts to make the interfaces different!

• Little attempt at a business model compared to the Web 2.0 world

• Quality is increasing (?)

• Not well sustained

PLoS Comp. Biol. 2008. 4(7): e1000136

Databases versus journals

Page 21: Using OA Content

Some More Comparisons

• New publishing models e.g. open access, self publishing, open review

• Web 2.0 influence e.g. social networks

• Use of rich media

• The review process is failing

• New metrics

• Read and write e.g. Wikis

• New services e.g. RESTful, widgets

• Use of rich media

• Crowd review emerging

Databases versus journals

Page 22: Using OA Content

Duh

• If we need to acquire more knowledge quickly

• If more literature and data are becoming open

• If both are under stress

• Why don’t we merge journals and databases for a new learning experience?

Page 23: Using OA Content

The Test Bed

http://www.wwpdb.org/

http://www.plos.org/ http://www.pubmedcentral.nih.gov/

Merge journals and databases

Page 24: Using OA Content

The World Wide Protein Data Bank

• The single worldwide repository for data on the structure of biological macromolecules

• Vital for drug discovery and the life sciences

• 38 years old

• Free to all

http://www.wwpdb.org

Merge journals and databases

Page 25: Using OA Content

The World Wide Protein Data Bank

• Paper not published unless data are deposited – strong data to literature correspondence

• Highly structured data conforming to an extensive ontology

• DOIs assigned to every structure

http://www.wwpdb.org

Merge journals and databases

Page 26: Using OA Content

The PLoS/PMC Corpus – Under the Hood

• Conforms well/partially to the NLM DTD – little markup of content

• PMC – some PDFs !

• The lack of conformance will come back to haunt us!
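
To make the "little markup of content" point concrete, here is a minimal sketch that reads an article marked up with NLM-style element names using Python's standard library. The sample XML is hand-written for illustration and `summarize_article` is a hypothetical helper; neither is taken from the PLoS/PMC corpus itself. The structure (title, sections, paragraphs) is easy to recover, but nothing in the markup says what the paragraphs are about.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written fragment using NLM DTD-style element names.
SAMPLE = """<article>
  <front><article-meta><title-group>
    <article-title>An Example Paper</article-title>
  </title-group></article-meta></front>
  <body>
    <sec><title>Introduction</title><p>Structure 1TIM is discussed.</p></sec>
    <sec><title>Results</title><p>First result.</p><p>Second result.</p></sec>
  </body>
</article>"""


def summarize_article(xml_text):
    """Return the article title and a list of (section title, paragraph count)."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//article-title", default="")
    sections = []
    # Only top-level sections directly under <body> are counted here.
    for sec in root.iterfind("./body/sec"):
        heading = sec.findtext("title", default="")
        sections.append((heading, len(sec.findall("p"))))
    return title, sections


title, sections = summarize_article(SAMPLE)
print(title)
for heading, n_paragraphs in sections:
    print(heading, n_paragraphs)
```

A real article that conforms only partially to the DTD may nest sections or omit elements, so a parser along these lines would need to tolerate missing pieces.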

Page 27: Using OA Content

Author Submission via the Web | Depositor Submission via the Web

Syntax Checking | Syntax Checking

Review by Scientists & Editors | Review by Annotators

Corrections by Author | Corrections by Depositor

Publish – Web Accessible | Release – Web Accessible

Similar Processes Lead to Similar Resources

Merge journals and databases

Page 28: Using OA Content

So the processes are not that dissimilar; it is the final product that is perceived so differently

Even that might be changing slowly?

PLoS Comp. Biol. 2008 4(12) e1000247

Merge journals and databases

Page 29: Using OA Content

www.rcsb.org/pdb/explore/literature.do?structureId=1TIM

Merged: The Database View

Merge journals and databases

Page 30: Using OA Content

Merged: The Literature View

Nucleic Acids Research 2008 36(S2) W385-389

http://biolit.ucsd.edu

Merge journals and databases

Page 31: Using OA Content

Merge journals and databases

Page 32: Using OA Content

ICTP Trieste, December 10, 2007

Merge journals and databases

Page 33: Using OA Content

The Knowledge and Data Cycle

0. Full text of PLoS papers stored in a database

1. A link brings up figures from the paper

2. Clicking the paper figure retrieves data from the PDB which is analyzed

3. A composite view of journal and database content results

4. The composite view has links to pertinent blocks of literature text and back to the PDB

The Near Future

1. User reads a paper

2. Clicks on a figure. Figure can be manipulated, annotated, interrogated

3. Clicking the figure gives a composite database journal view

4. This takes you to yet more papers or databases

http://biolit.ucsd.edu

Enhanced modes of learning
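
One small piece of such a cycle, linking article text back to the database, can be sketched in a few lines. The following is an illustrative heuristic only, not the BioLit implementation: it assumes PDB identifiers are four characters (a digit 1-9 followed by three letters or digits), and it reuses the literature-view URL shown on the "Merged: The Database View" slide with an `http://` prefix added. The function names are made up for this example, and the URL pattern may no longer resolve.

```python
import re

# Heuristic: a digit 1-9 followed by three letters or digits.
CANDIDATE = re.compile(r"\b[1-9][A-Za-z0-9]{3}\b")

# URL pattern copied from the database-view slide (scheme added here).
LITERATURE_URL = "http://www.rcsb.org/pdb/explore/literature.do?structureId={}"


def find_pdb_ids(text):
    """Return candidate PDB IDs in order of first appearance, upper-cased."""
    seen = []
    for match in CANDIDATE.finditer(text):
        token = match.group(0).upper()
        # Require at least one letter so that years such as 2011 are skipped.
        if any(ch.isalpha() for ch in token) and token not in seen:
            seen.append(token)
    return seen


def literature_links(text):
    """Map each candidate ID to a database literature-view URL."""
    return {pdb_id: LITERATURE_URL.format(pdb_id) for pdb_id in find_pdb_ids(text)}


paragraph = "In 2011 we compared the structure 1TIM with 4hhb and again with 1tim."
for pdb_id, url in literature_links(paragraph).items():
    print(pdb_id, url)
```

A pattern match like this will produce false positives (any four-character token that starts with a digit and contains a letter), which is one reason author-supplied markup is attractive.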

Page 34: Using OA Content

Observation 7: This is Literature Post-processing

Better to Get the Authors Involved

• Authors are the absolute experts on the content

• More effective distribution of labor

• Add metadata before the article enters the publishing process

Merge journals and databases – requires semantic enrichment

Page 35: Using OA Content

Word 2007 Add-in for authors

• Allows authors to add metadata as they write, before they submit the manuscript

• Authors are assisted by automated term recognition

– OBO ontologies

– Database IDs

• Metadata are embedded directly into the manuscript document via XML tags, OOXML format

– Open

– Machine-readable

• Open source, Microsoft Public License

http://www.codeplex.com/ucsdbiolit

Merge journals and databases – requires semantic enrichment
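
The idea of automated term recognition plus embedded XML tags can be illustrated with a toy example. This is a sketch under stated assumptions, not the add-in's code: the vocabulary, its `EXAMPLE:` identifiers, the `<term>` element and the `tag_terms` function are all invented for illustration, and the output is plain inline XML rather than OOXML.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical mini-vocabulary (lower-case term -> identifier). The identifiers
# are placeholders, not real ontology accessions.
VOCABULARY = {
    "protein kinase": "EXAMPLE:0001",
    "apoptosis": "EXAMPLE:0002",
}


def tag_terms(text, vocabulary):
    """Wrap each recognized term in a made-up <term id="..."> element."""
    # Longest terms first so multi-word terms win over shorter sub-terms.
    terms = sorted(vocabulary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b", re.IGNORECASE
    )
    pieces = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text so the result stays well-formed XML.
        pieces.append(escape(text[last:match.start()]))
        identifier = vocabulary[match.group(0).lower()]
        pieces.append(
            "<term id={}>{}</term>".format(quoteattr(identifier), escape(match.group(0)))
        )
        last = match.end()
    pieces.append(escape(text[last:]))
    return "".join(pieces)


print(tag_terms("Apoptosis is regulated by a protein kinase.", VOCABULARY))
```

In an authoring tool the matches would be offered to the author as suggestions rather than applied blindly, since the author is the expert on which sense of a term is meant.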

Page 36: Using OA Content

Challenges

• Author use

– Familiarity with ontologies, terms

– Agreement between co-authors

• End-use of semantically enriched manuscript

– Combine with NLM XML standard

• Article Authoring Add-in

Merge journals and databases – requires semantic enrichment

Page 37: Using OA Content

Challenges: Author Use

IF one or more publishers fast-tracked a paper that had semantic markup, I would argue it would catch on in no time

Merge journals and databases – requires semantic enrichment

Page 38: Using OA Content

Observation 8: There Are Some Simple Things We Can Do to Mine the Corpus

Page 39: Using OA Content

Where We Would Like to Be: Data Clustering via the Literature

Immunology Literature

Cardiac Disease Literature

Shared Function

Enhanced modes of learning

Page 40: Using OA Content

Observation 9: Rich Media is Underutilized

Page 41: Using OA Content

Yes YouTube Can Increase the Rate of Discovery

Page 42: Using OA Content

Pubcast – Video Integrated with the Full Text of the Paper

Page 43: Using OA Content

Proposal - The TeachU Workflow

Platforms: Android, iPhone, Windows Phone 7, Mac, PC

Step 1: presenter starts PowerPoint

Step 2: presenter starts recording on smart phone

Step 3: presenter stops recording and initiates upload

Step 4: slides are uploaded

Step 5: slides and podcast are automatically synchronized

Step 6: listener plays back synchronized presentation

Figure labels: Slides, Podcast, Sync File, Website

Page 44: Using OA Content

Lessons

• It is a form of expression the current YouTubers embrace and may become as ubiquitous as papers and slide presentations in the next few years

• We are reinventing television

• It's only going to work if it is easy to publish and the reward is obvious

Page 45: Using OA Content

Observation 10: Scientific Reproducibility Requires We Publish Workflows

Page 46: Using OA Content

Yes The Workflow is Real

Page 47: Using OA Content

Reproducibility

• My views of reproducibility:

– We all express the importance, but the only time it is tested is when something is truly novel or error is suspected

– Reproducibility covers a spectrum of meaning – by whom and with how much effort

– The longer the time lag the less likely something is reproducible

Page 48: Using OA Content

Workflow Tools Might be the Answer

Taverna

Wings

Page 49: Using OA Content

Consider an Example: Our Own Experience in Capturing the Scientific Process to Make it Open and Reproducible

• It's hard and embarrassing

• We have a working prototype using Wings

• I can feel the potential productivity gains

• My students are more doubtful

• It's been a lot of fun and will enable us to improve our processes regardless of the workflow system itself

Page 50: Using OA Content

Problems with Publishing Workflows

• Workflows are not linear

• Workflow : paper is not 1:1

• Confidentiality

• Peer review

• Infrastructure

• Community acceptance

• Reward system

• No publisher seems willing to touch them

Page 51: Using OA Content

Where Will It All End?

http://richard.cyganiak.de/2007/10/lod/lod-datasets_2010-09-22_colored.html

Page 52: Using OA Content

General References

• What Do I Want from the Publisher of the Future PLoS Comp Biol 6(5): e1000787

• Fourth Paradigm: Data Intensive Scientific Discovery http://research.microsoft.com/en-us/collaboration/fourthparadigm/

Page 53: Using OA Content

References to Exemplars

• Semantic Biochemical Journal - 2010: Using Utopia

• Article of the Future, Cell, 2009:

• Prospect, Royal Society of Chemistry, 2009:

• Adventures in Semantic Publishing, Oxford U, 2009:

• The Structured Digital Abstract, Seringhaus/Gerstein, 2008

• CWA Nanopublications – 2010

• https://sites.google.com/site/beyondthepdf/

• https://sites.google.com/site/futureofresearchcommunications/

Page 54: Using OA Content

Acknowledgements

• BioLit Team

– Lynn Fink

– Parker Williams

– Marco Martinez

– Rahul Chandran

– Greg Quinn

• Microsoft Scholarly Communications

– Pablo Fernicola

– Lee Dirks

– Savas Parastatidis

– Alex Wade

– Tony Hey

• wwPDB Team

– Boki Beran

– Wolfgang Bluhm

– Andreas Prlic

– Greg Quinn

– Peter Rose

– Ben Yutick

– Chunxiao Zhu

http://biolit.ucsd.edu

http://www.codeplex.com/ucsdbiolit

Page 55: Using OA Content

Questions?

[email protected]
