Date post: | 11-Apr-2017 |
Upload: | karen-estlund |
Newspapers, IIIF, & ALTO
ALA 2016
Karen Estlund
Co-chair, IIIF Newspaper Interest Group
Associate Dean for Technology & Digital Strategies
Penn State University
[email protected]
What is IIIF?
IIIF APIs
Image API
{scheme}://{server}{/prefix}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
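As a rough illustration of how the template above is filled in, here is a minimal Python sketch. The helper name `image_api_url`, the `example.org` server, and the default values (`full`, `0`, `default`, `jpg`, which follow Image API 2.x conventions) are assumptions for illustration, not part of the slides.

```python
from urllib.parse import quote


def image_api_url(base, identifier, region="full", size="full",
                  rotation="0", quality="default", fmt="jpg"):
    """Fill in the Image API URI template.

    `base` stands for {scheme}://{server}{/prefix}. The defaults assume
    Image API 2.x naming ("full" size, "default" quality); other versions
    of the API use different keywords.
    """
    # Identifiers must be URL-encoded, including any slashes they contain.
    encoded_id = quote(identifier, safe="")
    return f"{base.rstrip('/')}/{encoded_id}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Whole page at full size (hypothetical server and identifier).
print(image_api_url("https://example.org/iiif", "page-1"))

# A cropped region (x,y,w,h in pixels) scaled to 600 pixels wide.
print(image_api_url("https://example.org/iiif", "page-1",
                    region="2003,4684,554,34", size="600,"))
```

Because every parameter lives in the URL path, a viewer can request just the part of a newspaper page it needs without any server-specific API.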
Presentation API
…..
Search API: Use Cases
● Searching OCR generated text to find words or phrases within a book, newspaper or other primarily textual content.
● Searching transcribed content, provided by crowd-sourcing or transformation of scholarly output.
● Searching multiple streams of content, such as the translation or edition, rather than the raw transcription of the content, to jump to the appropriate part of an object.
● Searching on sections of text, such as defined chapters or articles.
● Searching for user provided commentary about the resource, either as a discovery mechanism for the resource or for the discussion.
● Discovering similar sections of text to compare either the content or the object.
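The first use case, searching OCR text within an issue, comes down to an HTTP GET against a search service with the terms in a `q` parameter. The sketch below only builds the request URL; the service address is hypothetical, and `search_url` is an illustrative helper rather than part of any IIIF library.

```python
from urllib.parse import urlencode


def search_url(service_id, terms, motivation=None):
    """Build a IIIF Content Search request URL.

    `service_id` is the search service's @id (hypothetical here).
    `terms` are space-separated in the `q` parameter; `motivation`
    optionally narrows results, e.g. to painted OCR text or commentary.
    """
    params = {"q": " ".join(terms)}
    if motivation:
        params["motivation"] = motivation
    return f"{service_id}?{urlencode(params)}"


# Search the OCR text of one (hypothetical) newspaper issue.
print(search_url("https://example.org/search/issue-1", ["harbour", "storm"]))

# Restrict the search to painted (transcription/OCR) annotations.
print(search_url("https://example.org/search/issue-1", ["harbour"],
                 motivation="painting"))
```

The response is an annotation list whose targets point at regions of canvases, which is what lets a viewer highlight hits on the page image.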
IIIF Newspaper Interest Group Goals
● To determine development and usage of IIIF for digital Newspapers
● To demonstrate best practice in exploitation of IIIF for Newspapers
● To promote usage of IIIF for Newspapers
● To consider related formats, especially serials
● To explore and exploit possibilities for search, discovery, and annotation of Newspapers
Chairs: Karen Estlund, Penn State & Glen Robson, National Library of Wales
Special Thanks to Glen! I’ll be using examples from National Library of Wales throughout.
IIIF Newspaper Resources
● Newspaper IG Page: http://iiif.io/community/groups/newspapers/
● Newspaper IG Working Documents: goo.gl/jNFfVw
● IIIF Awesome: https://github.com/IIIF/awesome-iiif
● IIIF User Stories: https://github.com/IIIF/iiif-stories/
● Slack Channel: iiif.slack.com #newspapers
○ Email [email protected] to be added
● Code of Conduct: http://iiif.io/event/conduct/
IIIF Mapping
Newspapers    IIIF
Title         Collection
Issue         Manifest
Edition       Manifest
Article       Range
Page          Canvas
Image         Image
ALTO          Annotations
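To make the Issue-to-Manifest and Page-to-Canvas rows concrete, here is a minimal sketch that assembles a Presentation API 2.x manifest for one issue. The URL patterns, the `issue_manifest` helper, and the sample page sizes are assumptions for illustration; a real implementation would also add Ranges for articles and annotation lists for OCR.

```python
import json


def issue_manifest(base, issue_id, label, pages):
    """Build a bare-bones Presentation 2.x manifest for one issue.

    `pages` is a list of (page_id, width, height) tuples. All URL
    patterns under `base` are hypothetical.
    """
    canvases = []
    for page_id, width, height in pages:
        canvas_id = f"{base}/{issue_id}/canvas/{page_id}"
        canvases.append({
            "@id": canvas_id,
            "@type": "sc:Canvas",          # one Canvas per newspaper page
            "label": f"Page {page_id}",
            "width": width,
            "height": height,
            "images": [{
                "@type": "oa:Annotation",
                "motivation": "sc:painting",
                "resource": {
                    "@id": f"{base}/image/{page_id}/full/full/0/default.jpg",
                    "@type": "dctypes:Image",
                },
                "on": canvas_id,           # the image paints this Canvas
            }],
        })
    return {
        "@context": "http://iiif.io/api/presentation/2/context.json",
        "@id": f"{base}/{issue_id}/manifest.json",
        "@type": "sc:Manifest",            # one Manifest per issue
        "label": label,
        "sequences": [{"@type": "sc:Sequence", "canvases": canvases}],
    }


manifest = issue_manifest("https://example.org/iiif", "issue-1",
                          "Example Gazette, sample issue",
                          [("p1", 4000, 6000), ("p2", 4000, 6000)])
print(json.dumps(manifest, indent=2))
```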
IIIF Newspapers Best Practices Document (Draft)
https://goo.gl/yY3T9T
OCR in Annotation List
{
  "@id": "http://dams.llgc.org.uk/iiif/4342443/annotation/list/ART8.json",
  "@type": "sc:AnnotationList",
  "label": "SIEraOROLlOIOAt, OBSERVATIONS,",
  "within": {
    "@id": "http://dams.llgc.org.uk/iiif/4342439/annotation/layer/ART8.json",
    "@type": "sc:Layer",
    "label": "OCR Article Text"
  }
},
OCR and ALTO in Open Annotations
{
  "@type": "oa:Annotation",
  "motivation": "sc:painting",
  "resource": {
    "@type": "cnt:ContentAsText",
    "format": "text/plain",
    "chars": "SIEraOROLlOIOAt, OBSERVATIONS,"
  },
  "on": "http://dams.llgc.org.uk/iiif/4342439/canvas/4342443#xywh=2003,4684,554,34"
},
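The `on` value in the annotation carries both the canvas and the position of the text line, encoded as an `xywh` fragment. A minimal sketch of splitting the two apart follows; `parse_target` is an illustrative helper name, not part of any IIIF library.

```python
from urllib.parse import urldefrag


def parse_target(on):
    """Split an annotation target into (canvas_id, (x, y, w, h)).

    Returns None for the box when the target has no xywh fragment,
    meaning the annotation applies to the whole canvas.
    """
    canvas_id, fragment = urldefrag(on)
    if not fragment.startswith("xywh="):
        return canvas_id, None
    x, y, w, h = (int(v) for v in fragment[len("xywh="):].split(","))
    return canvas_id, (x, y, w, h)


target = "http://dams.llgc.org.uk/iiif/4342439/canvas/4342443#xywh=2003,4684,554,34"
canvas_id, box = parse_target(target)
print(canvas_id)  # the canvas (newspaper page) the text sits on
print(box)        # pixel box of the OCR line on that canvas
```

If the canvas dimensions match the pixel dimensions of the page image (an assumption that does not always hold), the same four numbers can be used as the region parameter of an Image API request to crop out just that line of text.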
Also Recommended: Link to the OCR in the Manifest
"seeAlso": [
  {
    "profile": "https://www.loc.gov/standards/alto/",
    "@id": "http://oni.mith.us/lccn/sn83045396/1911-09-17/ed-1/seq-1/ocr.xml",
    "format": "text/xml"
  }
],
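A client that wants the raw ALTO rather than the annotations can look for `seeAlso` entries carrying the ALTO profile. The sketch below assumes the profile URI shown in the slide is used as the marker; `alto_links` is an illustrative helper, and it tolerates `seeAlso` being a single value instead of a list.

```python
ALTO_PROFILE = "https://www.loc.gov/standards/alto/"


def alto_links(resource):
    """Return the @id of every seeAlso entry that declares the ALTO profile.

    `resource` is a parsed manifest or canvas (a dict). Entries without
    a profile, or given as bare URI strings, are skipped.
    """
    see_also = resource.get("seeAlso", [])
    if isinstance(see_also, (str, dict)):
        see_also = [see_also]  # normalise a single value to a list
    links = []
    for entry in see_also:
        if isinstance(entry, dict) and entry.get("profile") == ALTO_PROFILE:
            if "@id" in entry:
                links.append(entry["@id"])
    return links


canvas = {
    "seeAlso": [
        {
            "profile": "https://www.loc.gov/standards/alto/",
            "@id": "http://oni.mith.us/lccn/sn83045396/1911-09-17/ed-1/seq-1/ocr.xml",
            "format": "text/xml",
        }
    ]
}
print(alto_links(canvas))
```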
Getting Newspapers Into IIIF
● Quick Start Guide for IIIF: http://iiif.io/technical-details/
● Newspaper Best Practices Model: Forthcoming, Draft: https://goo.gl/yY3T9T
● NDNP Data
○ Open ONI (open source fork from Chronicling America software): https://github.com/open-oni/open-oni
○ RAIS image server: https://github.com/uoregon-libraries/rais-image-server
○ Python Library to host static images: https://github.com/umd-mith/ndnp_iiif
IIIF Guidance in Open ONI / How to use APIs
Example: http://[url]/about/api/
Open ONI Newspapers in IIIF Viewer
IIIF Newspapers in IIIF Viewer
http://newspapers.library.wales/view/4342439/4342445/53/
Comparing Newspapers in IIIF Viewer (Mirador)
http://iiif.github.io/mirador/demo/
OCR and ALTO as Annotation Layers
What this Means for Newspapers