
Andrew Janes UKAD 2016 Forum

Date post: 29-Jan-2018
Upload: the-national-archives
Transcript
Page 1: Andrew Janes   UKAD 2016 Forum

#UKArcDiscovery16

Page 2: Andrew Janes   UKAD 2016 Forum

Andrew Janes

Senior Archivist – Future Catalogues

UKAD forum, 17 March 2016

Twitter: @cartivist #UKArcDiscovery16

State of the Catalogue

Page 3: Andrew Janes   UKAD 2016 Forum

C Madsen and M Hurst, Resource Discovery @ The University of Oxford 2015, p 24

http://discovery.bodleian.ox.ac.uk/wp-content/uploads/sites/5/2016/02/OxfordResourceDiscoveryReport_Final_public.pdf

“To state the obvious, collections cannot be discovered using electronic search tools unless they have some sort of representative electronic description.”

Page 4: Andrew Janes   UKAD 2016 Forum

S van Hooland and R Verborgh, Linked Data for Libraries, Archives and Museums (2014), p 71

“All metadata is dirty, but you can do something about it.”

Page 5: Andrew Janes   UKAD 2016 Forum

The next 15 minutes

• Data-(in)adequacy: the good, the bad and the ugly

• How did we end up like this?

• The State of the Catalogue approach

• “Data-centric” or “neo-Jenkinsonian”?

Page 6: Andrew Janes   UKAD 2016 Forum

TNA’s catalogue within Discovery

[Diagram: Discovery, containing Finding Archives data, ePRO and DRI data, and TNA Catalogue data]

Page 7: Andrew Janes   UKAD 2016 Forum

Inadequate catalogue data

Page 8: Andrew Janes   UKAD 2016 Forum

“It does exactly what it says on the tin”

Reference: OS 33/1812

Description: Map showing the boundaries of the following places: Somerset: Clapton-in-Gordano (parish), Portbury (parish)

Date: 1964

Held by: The National Archives, Kew

Legal status: Public Record

Closure status: Open Document, Open Description

Page 9: Andrew Janes   UKAD 2016 Forum

The bare minimum?

Essential elements according to ISAD(G)2*

• Reference code

• Title

• Date(s)

• Extent

• Level of description

• Name of creator(s)

TNA measures of basic data adequacy

• Description

• Date

• Unique reference code

Non-measurable data adequacy

• Hierarchical structure

• Browsing order

*ISAD(G), 2nd edition (2000), para I.12, www.ica.org/10207/standards/isadg-general-international-standard-archival-description-second-edition.html

Page 10: Andrew Janes   UKAD 2016 Forum

The extent of the problem (or rather, challenge)

File- and item-level catalogue entries that fail the Ronseal test as of January 2016:

Unreferenced items: 477,723

Files and items with no descriptions: 305,429

Files and items with blank dates: 454,345

Total: 1,237,497
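
As a rough illustration of how the three TNA measures listed earlier (description, date, unique reference code) might be counted over an exported set of entries, here is a minimal Python sketch. The field names (`reference`, `description`, `date`), the sample rows and the reference OS 33/1813 are assumptions made for illustration, not TNA's export format or method. In particular, treating a repeated reference as a failure of uniqueness is only one possible reading of "unreferenced items".

```python
from collections import Counter

# Hypothetical sample entries; the field names are assumed, not TNA's schema.
entries = [
    {"reference": "OS 33/1812", "description": "", "date": ""},
    {"reference": "OS 33/1812", "description": "Clapton-in-Gordano", "date": "1964"},
    {"reference": "OS 33/1812", "description": "Portbury", "date": "1964"},
    {"reference": "OS 33/1813", "description": "Example entry", "date": "1965"},
]


def count_problems(rows):
    """Count entries failing each basic adequacy check."""
    ref_counts = Counter(row["reference"].strip() for row in rows)
    problems = Counter()
    for row in rows:
        if not row["description"].strip():
            problems["no_description"] += 1
        if not row["date"].strip():
            problems["blank_date"] += 1
        # Assumption: a reference shared by several entries is not unique.
        if ref_counts[row["reference"].strip()] > 1:
            problems["non_unique_reference"] += 1
    return problems


problems = count_problems(entries)
total = sum(problems.values())
```

In this sketch the total is the sum of the three category counts, as on the slide, so a single entry with two faults contributes twice.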

Page 11: Andrew Janes   UKAD 2016 Forum

Uneven distribution of data problems

30% in worst 10 series

39% in worst 20 series

56% in worst 50 series

69% in worst 100 series

81% in worst 200 series

92% in worst 500 series

97% in worst 1000 series
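
Figures of this kind can be derived by ranking series by their number of data problems and taking a cumulative share. The sketch below shows one way to do that in Python; the series labels and counts are invented for illustration and do not come from the catalogue.

```python
def worst_series_share(problem_counts, n):
    """Return the fraction of all problems found in the n worst series."""
    ranked = sorted(problem_counts.values(), reverse=True)
    total = sum(ranked)
    return sum(ranked[:n]) / total if total else 0.0


# Hypothetical per-series problem counts (labels and numbers are invented).
counts = {
    "SERIES-A": 500,
    "SERIES-B": 300,
    "SERIES-C": 150,
    "SERIES-D": 40,
    "SERIES-E": 10,
}

share_top2 = worst_series_share(counts, 2)  # (500 + 300) / 1000
```

A steeply rising curve like the one on the slide suggests that working series by series, worst first, removes most problems with comparatively few interventions.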

Page 12: Andrew Janes   UKAD 2016 Forum

PRO finding aids system in the mid 1990s

Page 13: Andrew Janes   UKAD 2016 Forum

Creating PROCAT, the original PRO online catalogue

• Series level and above: curated entries

• Subseries level and below: bulk conversion of entries

Page 14: Andrew Janes   UKAD 2016 Forum

Residual issues after retrospective conversion

OS 33/1812

[blank description]

[blank date]

OS 33/1812

Clapton-in-Gordano

1964

OS 33/1812

Portbury

1964

Page 15: Andrew Janes   UKAD 2016 Forum

Subsequent developments

Data enhancement

• More conversion of paper finding aids

• More cataloguing from original records

• Tidying up for data-adequacy

• Correcting errors

• Accessions: more records, more data

Changes in the context of presenting the data

• Users expect to search bottom-up not top-down

• Growth of digital and digitised content

• Discovery: a combined search tool

Page 16: Andrew Janes   UKAD 2016 Forum

Where are we now with our catalogue data?

Grows and improves every day

Multifaceted approach to enhancement

Very uneven richness and granularity

1 million+ major inadequacies

Page 17: Andrew Janes   UKAD 2016 Forum

State of the Catalogue: purpose

• Assess the scope and extent of bad data

• Reduce bad data proactively and efficiently

• Working series by series

• Making records more accessible

• Facilitating research

• Influence wider cataloguing priorities and decisions

Page 18: Andrew Janes   UKAD 2016 Forum

State of the Catalogue: in and out of scope

DO

Understand context, structure and content

Consult relevant colleagues

Check and fix errors and anomalies

Edit data in bulk

DON’T

Check the records systematically

Expand descriptions from paper finding aids

Enforce finer detail of descriptive standards

Proofread every line

Page 19: Andrew Janes   UKAD 2016 Forum

State of the Catalogue: typical outline process

1. Download data to Excel

2. Fix errors and anomalies

3. Make bulk changes

4. Fix errors and anomalies

5. Upload data to PROCAT Editorial

6. Manual editing (e.g. of subseries)

7. Release into Discovery

8. Delete any redundant entries

9. Check and mop up

Page 20: Andrew Janes   UKAD 2016 Forum

Some simple and useful data tools in Excel

• Sorting

• Filtering

• Conditional formatting

• Text to columns

• Functions

Functions include:

• AND, OR, NOT

• IF

• CONCATENATE

• LEFT, MID, RIGHT

• VLOOKUP

• LEN

• TRIM
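
For readers who work outside Excel, the same kinds of operation can be sketched in Python. The helpers below are rough equivalents written for illustration, not part of the State of the Catalogue workflow; `trim` collapses all whitespace, which is slightly broader than Excel's TRIM, and the lookup table is an illustrative stand-in for a VLOOKUP range.

```python
def left(text, n):
    """Like LEFT: the first n characters."""
    return text[:n]


def right(text, n):
    """Like RIGHT: the last n characters (empty string when n is 0)."""
    return text[max(len(text) - n, 0):]


def mid(text, start, n):
    """Like MID: n characters from a 1-based start position."""
    return text[start - 1:start - 1 + n]


def trim(text):
    """Roughly like TRIM: strip the ends and collapse inner whitespace."""
    return " ".join(text.split())


reference = "  OS  33/1812 "
clean = trim(reference)

# Text to columns: split the reference on the first "/".
series, piece = clean.split("/", 1)

# VLOOKUP with exact match, sketched as a dictionary lookup (illustrative table).
department = left(clean, 2)
department_names = {"OS": "Ordnance Survey"}
label = department_names.get(department, "unknown")

# IF combined with LEN: flag a blank description.
description = ""
status = "missing" if len(trim(description)) == 0 else "ok"

# CONCATENATE: rebuild the reference from its parts.
rebuilt = "".join([series, "/", piece])
```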

Page 21: Andrew Janes   UKAD 2016 Forum

Progress in eliminating bad data

• January 2013: 2.85 million data problems

• January 2016: 1.23 million data problems
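
The size of the change can be worked out directly from the two figures above. This is a small arithmetic sketch only; because both inputs are rounded figures from the slide, the result is approximate.

```python
# Rounded figures as given on the slide.
problems_2013 = 2_850_000
problems_2016 = 1_230_000

reduction = problems_2013 - problems_2016
percent_reduction = 100 * reduction / problems_2013
```

On these rounded figures, roughly 1.6 million data problems were eliminated in three years, somewhat more than half of the January 2013 total.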

Page 22: Andrew Janes   UKAD 2016 Forum

Is State of the Catalogue “data-centric”?

• A radically-traditional approach

• Moral defence of the record requires adequate data

• Helps redress huge variation in depth and richness of catalogue data (see MAD’s rule against bias*)

• Prioritises adequate breadth over extra depth

• A record-centric approach

• A data-centric approach?

*M Proctor and M Cook, A Manual of Archival Description, 3rd edition (2000), para 8.6E

