British Library Labs Presentation at the Open Cultural Data Symposium - Birkbeck

Date post: 11-Jan-2017
Upload: labsbl
1 #bbkculturaldata @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/ British Library Labs Opening up the British Library’s Cultural Heritage Data Mahendra Mahey 1445 – 1500, 24-25 November 2016, Open Cultural Data Symposium Vasari Research Centre for Art and Technology & Birkbeck Centre for Technology and Publishing Birkbeck University of London, Keynes Library, 43 Gordon Square, London, WC1H 0PD. https://goo.gl/
Transcript
Page 1: British Library Labs Presentation at the Open Cultural Data Symposium - Birkbeck

1 #bbkculturaldata @BL_Labs @BL_DigiSchol #bldigital https://goo.gl/

British Library Labs
Opening up the British Library’s Cultural Heritage Data
Mahendra Mahey

1445 – 1500, 24-25 November 2016, Open Cultural Data Symposium
Vasari Research Centre for Art and Technology & Birkbeck Centre for Technology and Publishing
Birkbeck University of London, Keynes Library, 43 Gordon Square, London, WC1H 0PD.

https://goo.gl/

Page 2

http://www.bl.uk/projects/british-library-labs

Funded by the Andrew W. Mellon Foundation

Page 3

Collections – not just books!

> 180* million items

> 0.8* m serial titles

> 8* m stamps

> 14* m books

> 3* m sound recordings

> 4* m maps

> 1.6* m musical scores

> 0.3* m manuscripts

> 60* m patents

King’s Library

* Estimates

Page 4

#bldigital

1-2 %* digitised

* estimate

Digitisation

Partnerships Commercial & Other Organisations

Amount increasing rapidly

Bias in digitisation

http://goo.gl/bR9UJL Sample Generator

Page 5

Digital data all around us!

Knowledge Quarter London

55 knowledge organisations within a 1 mile radius of Kings Cross, http://www.knowledgequarter.london

https://goo.gl/pGO7QY

Page 6

Stakeholders involved in Labs

[Diagram. Labels: BL Labs; British Library; Digital Scholarship; Digital Research; Digital Content; Curators / Researchers; Developers / Technical Staff; Access & Reuse Group (©); Project Board; Advisory Board; Researchers; Universities & wider; United Kingdom; The World]

Page 7

Wider… not just Researchers

Researchers https://goo.gl/WutNyi

Artists http://goo.gl/nNKhQ2

Librarians, Curators

https://goo.gl/9NWZUW

Software Developers https://goo.gl/7QQ5Tf

Archivists https://goo.gl/x7b4tg

Educators

https://goo.gl/qh01Mi

Page 8

Playbills, Books, Newspapers (includes OCR)

Digital collections and Datasets

British National Bibliography

http://bnb.data.bl.uk

http://sounds.bl.uk

http://dml.city.ac.uk/

Music (Recordings & Sheet) & Sounds

http://goo.gl/frSMJt

Broadcast News (TV and Radio)

http://goo.gl/cwThHw

http://goo.gl/pBkisZ

http://goo.gl/E8aRyQ

Usage data

Images, Manuscripts & Maps

http://www.qdl.qa/ Qatar Digital Library

http://idp.bl.uk/ International Dunhuang Project

Maps http://www.bl.uk/maps/

Hebrew Manuscripts http://goo.gl/4sbCp9

Flickr & Wikimedia Commons

https://goo.gl/LZRmaZ

Page 9

only in Reading Rooms due to ©

only on site due to © or ethical etc

not online / available – various storage devices, personal data

online and open

British Library

online behind paywall

Challenges of Digital access at the Library

Page 10

The Story of the Collection!

Collection

Curator

Who paid for the digitisation?

Who did the digitisation?

Technology used

Born digital?

Published

Unpublished

Where is it?

Can it still be accessed?

Generates income

Reputational Risk

Legalities

Political

Ego

Surprises

Metadata

Old format not supported

What media was the digitisation done from?

Documentation

No Metadata

Messy Metadata

Still there?

Sometimes it’s complicated. Better to know as much as possible if you want to open it up!

Page 11

Finding Open Digital Collections

• Curated? Learn the story behind a collection! Is there a human who knows the ‘story’ about the collection and wants it used? Are there any surprises lurking?

• Where is it, is it accessible?

• Licensing? Internal Access and Reuse and Licensing Group (Risk assessment group – Strategic, Commercial, Copyright, Curatorial, Technical)

• Metadata available? What state is it in and does it need cleaning?

https://goo.gl/Qjeqo1

https://goo.gl/Kfc4qc

Access & Reuse Group

©

Page 12

Open Licensed Digital Content?

15% Openly Licensed

Working through

Breakdown by collection*
Manuscripts 59%
Books 9%
Maps and Views 7%
Newspapers 3%
Archives and Records 3%
Paintings, Prints and Drawings 2%

*Based on digitisation projects

Largest proportion of funding

Public / Private Partnership

Page 13

Cultural Heritage Datasets

Datasets about our collections: Bibliographic datasets relating to our published and archival holdings

Datasets for content mining: Content suitable for use in text and data mining research

Datasets for image analysis: Image collections suitable for large-scale image-analysis-based research

Datasets from UK Web Archive: Data and API services available for accessing UK Web Archive

Digital mapping: Geospatial data, cartographic applications, digital aerial photography and scanned historic map materials

https://data.bl.uk

Launched November 7, 2016

Discussion list: http://www.jiscmail.ac.uk/CULTURAL-HERITAGE-DATASETS

Page 14

Competition: Tell us your ideas of what to do with our digital content

Awards: Show us what you have already done with our digital content in research, artistic, commercial and learning and teaching categories

Projects: Talk to us about working on collaborative projects

Page 15

Labs Engagement 2016

• 18 institutions visited

• 5000 miles travelled

• 50 presentations & 25 workshops

• 900 researchers / artists/ entrepreneurs / educators

• 400 expressions of interest

• 40 researchers, artists, entrepreneurs & educators supported

• 60TB of data via post

• 9TB of data via data.bl.uk (Nov 16)

• Over half a billion views on BL Flickr Commons since launch in Dec 2013

It’s hard work!

Page 16

Why and how are we doing this?

• Working closely with and listening to those who want to use our digital collections and data for their work, and helping to build services, tools and processes to support them

• We can learn how we are and should be supporting them.
– Is the access to digital collections we provide sufficient?
– Do we have the right tools?
– Do we provide the right support?
– Where are the gaps between what they want and what we can give?
– How do we build the bridges to overcome them?
– Many more reasons…

Page 17

only in Reading Rooms due to ©

only on site due to © or ethical

not online / available – various storage devices, personal data

online and open

British Library

online behind paywall

Digital access at the Library

Labs Residency Model

https://goo.gl/tvNVRB

http://goo.gl/ii0XHG

Page 18

Some Lessons Learned and Challenges so far…

• Everything starts from a conversation (external and internal)!

• Need to have several conversations with several stakeholders and tap into their tacit knowledge that isn’t always written down (esp. internal).

• It’s hard work at the beginning!

• Expectations change when researchers actually see the data, systems and experience the ‘culture’ of the organisation.

• We tend to work with researchers who can be ‘flexible’ with their research questions and are willing to embrace challenges.

• Misunderstandings are common because of jargon & different meanings of words.

• Embrace dirty data, it may never be perfect!

Page 19

Some Lessons Learned and Challenges so far… (2)

• Many researchers have the domain knowledge but lack the technical skills to use Digital Research methods. Should they be teamed up with those that have problems that need solving (Computing) or get trained?

• Identifying / bridging gaps for researchers to use data, help them ‘navigate’ through the Library to get the data they want (sometimes).

• Huge appetite to use digital content & data (e.g. Flickr Commons stats).

• Start small and simple, but think big!

• Create and embrace serendipity, stimulate the imagination, work fast, give it energy.

• Letting go of the emotional and psychological connection to “my” collection

• If digitised collections are not used, what is the point of digitising them?

• Fail faster (don’t be afraid), small experiments, reject perfectionism.

Page 20

What did people actually do?

Page 21

Machine Learning / Reading

• Analogies to how humans read

• Machines acquire knowledge

• Use that knowledge to make sense of new situations

• Not well understood area…

Page 22

The smell of soup!

Thanks to Memo Akten (@memotv on twitter) for the inspiration!

https://goo.gl/toq4Bo Nasreddin, 13th Century Turkish Sufi

http://web2.uvcs.uvic.ca/elc/studyzone/330/reading/smell1.htm

Page 23

Finding things in messy data

Mrs Folly

• Clean up manually
• Get ‘ground truth’
• Write code to find things reliably in it automatically
• Try code on messy content
• Tweak if necessary

Mrs Folly
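
The workflow on this slide can be sketched in code. What follows is a minimal illustration under stated assumptions, not the method used in any Labs project: it assumes a small hand-checked sample of OCR lines standing in for the 'ground truth', a fuzzy matcher built on Python's standard `difflib`, and precision and recall as the check. The sample lines, the function names `fuzzy_find` and `evaluate`, and the thresholds are all hypothetical.

```python
from difflib import SequenceMatcher


def fuzzy_find(phrase, text, threshold=0.8):
    """Return True if some window of `text` resembles `phrase` closely enough."""
    phrase = phrase.lower()
    text = text.lower()
    width = len(phrase)
    if len(text) < width:
        return SequenceMatcher(None, phrase, text).ratio() >= threshold
    # Slide a phrase-sized window over the text and compare each window.
    for start in range(len(text) - width + 1):
        window = text[start:start + width]
        if SequenceMatcher(None, phrase, window).ratio() >= threshold:
            return True
    return False


def evaluate(phrase, labelled_lines, threshold=0.8):
    """Compare matcher output with hand labels; return (precision, recall)."""
    tp = fp = fn = 0
    for text, truth in labelled_lines:
        found = fuzzy_find(phrase, text, threshold)
        if found and truth:
            tp += 1
        elif found and not truth:
            fp += 1
        elif truth:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical hand-checked sample: (OCR line, does it really mention Mrs Folly?)
GROUND_TRUTH = [
    ("Enter Mrs Folly, in a great hurry", True),
    ("Enter Mrs Fol1y, in a great hurry", True),  # OCR confused 'l' with '1'
    ("Mr Jones reads the letter aloud", False),
    ("Mis Fo!ly laughs", True),                   # two OCR errors
]
```

On this toy sample a threshold of 0.8 finds two of the three noisy "Mrs Folly" lines, while lowering it to 0.75 finds all three without matching the unrelated line, which corresponds to the "tweak if necessary" step.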

Page 24

http://victorianhumour.tumblr.com

Victorian Meme Machine (2014)

https://goo.gl/HMqDt3

Bob Nicholson

http://victorianhumour.tumblr.com/

Bob Nicholson interviewed on BBC Radio 4 Making History Programme:

http://goo.gl/fmV9ep

And telling jokes to the public: http://goo.gl/xIDRhz

https://www.youtube.com/watch?v=-GRgj7Q5OM0

Rob Walker, Victorian Mother-in-law Jokes

Victorian Comedy Night, 7 Nov 2016

Page 25

Katrina Navickas (2015) Political Meetings Mapper

http://politicalmeetingsmapper.co.uk

https://goo.gl/Qq78Oa

Labs Symposium 2015

https://goo.gl/BSA3be

Interview 2015

The Chartist Newspaper

http://goo.gl/vOLSnH

Chartist Monster Meeting

Chartists Re-enactment London

Page 26

Black Abolitionist Performances & their Presence in Britain (2016) – Hannah-Rose Murray

Frederick Douglass

Ellen Craft

Josiah Henson

Ida B Wells

A Performance by Joe Williams & Martelle Edinborough

http://frederickdouglassinbritain.com/

Page 27

Data-mining verse in 18th Century newspapers

BL Labs Project 16-17, Jennifer Batt

https://goo.gl/5Akthd

Page 28

The Magic of Openness!

• By opening collections up we are creating the possibility to have them used in ways only restricted by human imagination.

• Need to work hard to tell people about our Digital Collections and Data especially if not easy to find, creating serendipity and opportunities for use!

• Give plenty of examples to inspire use!

• Support and celebrate the use!

Page 29

What can 65,000 books tell us?

Image: Artwork by Alicia Martin

Just one open digital collection

Page 30

Worked better for female faces than men’s

Press

http://mechanicalcurator.tumblr.com

Posts image every 30 minutes

http://www.flickr.com/photos/britishlibrary/

1,020,418 images need tagging!

Creative uses of images

Face recognition

Mechanical Curator

http://goo.gl/qPPgxX

Flickr

Snipping out images from 65,000 Digitised Books*

>600,000,000 views

>15,500,000 tags

https://goo.gl/FgZ4HM

Work @ BL by Ben O’Steen, Labs and Digital Research Team

* Matt Prior - http://goo.gl/j29Tnx

Since Dec 2013

Page 31

Tagging a million images

Iterative Crowdsourcing

http://goo.gl/j6fxac

Cardiff University’s Lost Visions Project

http://www.metadatagames.org/

Metadata Games

James Heald

Mario Klingemann

Chico 45

Use computational methods

Human Tagger

Top British Library Flickr Commons Taggers

http://goo.gl/8SkfM1

Machine Learning

Search Engine & Google Image search

Page 32

Special Jury’s Prize (2015)

James Heald – Wikimedia and Map work

https://goo.gl/WYZCB2

http://goo.gl/HNQq5e

https://goo.gl/VPgffL

https://commons.wikimedia.org/

https://goo.gl/djtm1b

Labs Symposium (2015)

Geotagging maps

54,000 Maps

Page 33

Adam Crymble (2015)

Crowdsource Arcade

What if crowd sourcing looked like this?

http://goo.gl/LBfJ4W

http://goo.gl/OH9pOZ

https://goo.gl/7z0j8p

30 mins talk

Labs Symposium (2015)

https://goo.gl/SSRsdd

5 min interview (2015)

http://goo.gl/0APpE8

Game Jam

Page 34

SherlockNet: Competition Winner 2016

Karen Wang, Luda Zhao and Brian Do

Using Convolutional Neural Networks to Automatically Tag and Caption the British Library Flickr Commons 1 million Image Collection

Classify into one of 12 categories

>15 million tags added (total now 15.5 million overall)

>100,000 experimental captions

bit.ly/sherlocknet

Pooled surrounding Optical Character Recognised text on page from similar images

Used training sets from various publicly available sources.

Tags Captions
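
The idea of pooling OCR text from similar images can be illustrated with a small sketch. This is not the SherlockNet team's implementation: it assumes each image already has a feature vector (for example from a neural network) and a list of OCR words from its page, and it swaps in a plain cosine-similarity nearest-neighbour search. The `IMAGES` data, the function names `cosine` and `propose_tags`, and the parameters `k` and `top_n` are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def propose_tags(target, images, k=2, top_n=3):
    """Pool OCR words from the target image and its k most similar images.

    `images` maps an image name to (feature_vector, ocr_words).
    Returns the `top_n` most frequent pooled words as candidate tags.
    """
    target_vec, target_words = images[target]
    neighbours = sorted(
        (name for name in images if name != target),
        key=lambda name: cosine(target_vec, images[name][0]),
        reverse=True,
    )[:k]
    pooled = Counter(target_words)
    for name in neighbours:
        pooled.update(images[name][1])
    return [word for word, _ in pooled.most_common(top_n)]


# Hypothetical toy data: three ship illustrations and one church.
IMAGES = {
    "ship_a": ([0.9, 0.1, 0.0], ["ship", "harbour", "sail"]),
    "ship_b": ([0.8, 0.2, 0.1], ["ship", "sail", "storm"]),
    "ship_c": ([0.85, 0.05, 0.1], ["ship", "mast"]),
    "church": ([0.0, 0.1, 0.9], ["church", "tower"]),
}
```

With this toy data, `propose_tags("ship_a", IMAGES, k=2, top_n=2)` pools words from the two other ship images, so words that recur across visually similar pages rise to the top while one-off OCR words drop out.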

Page 35

Artistic / Creative Works

http://goo.gl/dM8ieA

Mario Klingemann (2015)

http://www.crossroadsofcuriosity.com

David Normal 2014 and 2015

https://www.youtube.com/watch?v=-GRgj7Q5OM0

Rob Walker 2014

http://goo.gl/bNxGZZ

Kris Hoffman (2016)

https://goo.gl/QilqqT

Jiayi Chong 2016

Ling Low 2016

https://www.youtube.com/watch?v=bcOP1E5bRE0

https://www.facebook.com/RealmlandStory/

Paul Rand Pierce 2016

A Hat on the Ground Spells trouble

Tragic Looking Women

44 Men who Look 44

(Notice the direction of the faces)

Page 36

Mario Klingemann 2016

https://www.youtube.com/watch?v=xgnxnmqnR7Y

Google Arts and Culture Lab – Experiments with Machine Learning

https://artsexperiments.withgoogle.com/

Page 37

Imaginary Cities – BL Labs Project 16-17

Michael Takeo Magruder

https://goo.gl/4ARwTy

An artistic exploration seeking to create provocative fictional cityscapes for the Information Age from the British Library’s digital collection of historic urban maps

Page 38

More ideas for inspiration!

http://labs.bl.uk/Ideas+for+Labs http://labs.bl.uk/Other+Uses+of+Collections

Page 39

Contact us

Mahendra Mahey

Manager of BL Labs

[email protected]@bl.uk
