
History Data Service1 Good Design for Historical source based Databases History Data Service Hamish...

Date post: 28-Mar-2015
Page 1: History Data Service1 Good Design for Historical source based Databases History Data Service Hamish James.

History Data Service 1

Good Design for Historical source based Databases

History Data Service

Hamish James


Databases

• A database is a computerised record keeping system.

• A DataBase Management System (DBMS) is a computer application built around a database that provides a flexible way of storing, manipulating, and examining data.

– A DBMS consists of data, hardware, software, and users.

A DBMS on a personal computer will provide facilities for:

– inputting, modifying, retrieving and deleting data

– querying the data (SQL)

– producing reports based on the data

– building ‘front-ends’ for users
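As a rough illustration of these facilities, the sketch below uses Python's built-in `sqlite3` module as a stand-in for a personal-computer DBMS. The `person` table, its columns and the sample rows are assumptions made for the example; they are not part of the original slides.

```python
import sqlite3

# In-memory database; the table, columns and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, surname TEXT, occupation TEXT)"
)

# Inputting data (note the deliberate typing error 'SAWER').
conn.executemany(
    "INSERT INTO person (surname, occupation) VALUES (?, ?)",
    [("SNOOK", "POLICEMAN"), ("DEAN", "SAWER"), ("EDMONDS", "SERVANT")],
)

# Modifying data: correct the typing error.
conn.execute(
    "UPDATE person SET occupation = ? WHERE surname = ?", ("SAWYER", "DEAN")
)

# Deleting data.
conn.execute("DELETE FROM person WHERE surname = ?", ("EDMONDS",))

# Querying the data with SQL.
rows = conn.execute(
    "SELECT surname, occupation FROM person ORDER BY surname"
).fetchall()

# Producing a very simple report from the query result.
report = "\n".join(f"{surname:<10}{occupation}" for surname, occupation in rows)
print(report)
```

A desktop DBMS would normally wrap these operations in forms and report designers (the ‘front-ends’ mentioned above); the SQL underneath is much the same.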


Data Models

• Data models are abstract definitions of structures and relationships used to organise data in a database.

Data models can be characterised by how they organise the connections between different records:

– flat file

– hierarchical

– network

– relational

– object orientated

• Most DBMSs available for personal computers are either flat file or relational.


Entity Relationship Modelling

• A data modelling technique that transforms information into a form that meets the requirements of the relational data model.

• Entities are the things that the database will contain a representation of.

– Entities can be anything: people, places, events, physical objects, or concepts.

– All the entities with the same characteristics can be collectively called an entity type.

• Relationships describe the way entities are connected to each other.


Relationships

• One-to-one relationships connect one entity to one other entity.

• One-to-many relationships connect one entity to one or more other entities.

• Many-to-many relationships connect many entities to many other entities.
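The three kinds of relationship can be sketched with plain Python data structures. All of the data below is hypothetical sample data invented for the example, and the helper functions `occupations_of` and `people_with` are illustrative names only.

```python
# Hypothetical sample data; none of it comes from a real source.

# One-to-one: each household has exactly one address record.
household_address = {"H1": "1 High Street", "H2": "2 High Street"}

# One-to-many: one household has one or more members,
# but each member belongs to a single household.
household_members = {"H1": ["Ann", "George"], "H2": ["James"]}

# Many-to-many: a person can hold several occupations,
# and an occupation can be held by several people.
person_occupation = {
    ("Ann", "servant"),
    ("George", "servant"),
    ("George", "groom"),
}


def occupations_of(person):
    """All occupations linked to one person."""
    return sorted(o for p, o in person_occupation if p == person)


def people_with(occupation):
    """All people linked to one occupation."""
    return sorted(p for p, o in person_occupation if o == occupation)


print(occupations_of("George"))
print(people_with("servant"))
```

The many-to-many case needs a set of pairs rather than a simple lookup, because neither side can be stored as a single value on the other.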


Data

• The field is the basic unit of data in a database. A field stores a single piece of information of a particular data type.

• Fields are combined to form records. A record matches an entity.

• A set of records with the same fields is collected together in a table.


Historical Uses for a Database

• To store and organise large amounts of information automatically.

• To provide easy access to the information contained in the original source.

• To provide an environment for manipulating (changing and adjusting) the source.

• To search/filter/summarise complex information quickly.


Historical Database Example

H'hold_n  ADDRESS
A010      PINNER COMMON
A011      PINNER COMMON
A012      PINNER COMMON
A013      PINNER WOOD
A014      PINNER GREEN TOLL

H'hold_n  SURNAME    FORENAME      OCCUNAME
A010      SNOOK      GEORGE        POLICEMAN
A010      SNOOK      ANN
A010      SNOOK      SARAH HANNAH
A011      DEAN       JAMES         SAWYER
A011      DEAN       MARGARET
A012      ROBERTSON  MARIA         INDEPENDENT LADY
A012      EDMONDS    EMILY         SERVANT
A013      CRAWLEY    GEORGE        AG LAB
A013      CRAWLEY    MARY ANN
A013      CRAWLEY    CAROLINE
A013      CRAWLEY    ELIZABETH

OCCUCODE  OCCUNAME
(blank)   SCHOLAR AT HOME
2PP13     SCHOOL MISTRESS
2PP13     SCHOOLMISTRESS
4DS1      SERVANT
4DS3      SERVANT AND GROOM
3AG1      SHEPHERD
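The shared columns in the three tables above suggest how they could be joined. The sketch below loads a few of the rows into SQLite and links people to their household address and occupation code. The table names (`household`, `person`, `occupation`), the column name `hhold_n` (standing in for H'hold_n) and the join conditions are assumptions made for this example; the slides do not state how the tables were actually related.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE household (hhold_n TEXT PRIMARY KEY, address TEXT);
    CREATE TABLE person (hhold_n TEXT, surname TEXT, forename TEXT, occuname TEXT);
    CREATE TABLE occupation (occucode TEXT, occuname TEXT);
    """
)

# A subset of the rows shown in the example tables.
conn.executemany(
    "INSERT INTO household VALUES (?, ?)",
    [("A010", "PINNER COMMON"), ("A012", "PINNER COMMON"), ("A013", "PINNER WOOD")],
)
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?, ?)",
    [
        ("A010", "SNOOK", "GEORGE", "POLICEMAN"),
        ("A012", "EDMONDS", "EMILY", "SERVANT"),
        ("A013", "CRAWLEY", "GEORGE", "AG LAB"),
    ],
)
conn.executemany(
    "INSERT INTO occupation VALUES (?, ?)",
    [("4DS1", "SERVANT"), ("4DS3", "SERVANT AND GROOM"), ("3AG1", "SHEPHERD")],
)

# LEFT JOIN keeps people whose occupation has no entry in the code list;
# their occucode comes back as None.
query = """
SELECT p.surname, p.forename, h.address, o.occucode
FROM person AS p
JOIN household AS h ON h.hhold_n = p.hhold_n
LEFT JOIN occupation AS o ON o.occuname = p.occuname
ORDER BY p.hhold_n, p.surname
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Only the occupation codes visible in the excerpt are loaded, so occupations outside that excerpt (POLICEMAN, AG LAB) come back uncoded here.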


Historical Databases

• Technical decisions are often the least important.

• Historians work with information they do not control.

– Incomplete, poorly structured information of varying quality.

• A historical source based database is a representation of the primary source, but it is not an exact replica of the primary source.

– Some information may be left out.

– Some extra information may be included.

• A historical source based database mixes elements of a primary source with elements of a secondary source.


The Three Layer Model

Standardisation Layer

• Provides a foundation for analysing the data.

• Codes and standardisation rules are applied.

Source Layer

• An accurate digital representation of the source.

• Defines level of detail captured.

Interpretation Layer

• Incorporates researcher’s knowledge and judgement.

• Links records and forms aggregates.


Three Layer Design Examples

Source                  Standardise             Interpretation
6 mnths                 0.5                     infant
ag. lab.                agricultural labourer   farm occupations
J. Smith                J. ? Smith              John Smith, Baker
John A. Smith, baker    J. A. Smith
J.A. Smith              J. A. Smith
Smith & Son Bakers      ? ? Smith
Mdlborough              Middlesbrough           Middlesbrough, Yorkshire
Mdsbro
Meddlesbro
Medelsbro
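One way to carry all three layers for a single value is to store them side by side. The sketch below does this for the age and occupation examples above. The `LayeredValue` class, the lookup dictionaries, and the rule that anyone under one year old counts as an infant are all assumptions made for illustration, not rules given in the slides.

```python
from dataclasses import dataclass


@dataclass
class LayeredValue:
    source: str           # transcribed exactly as written in the source
    standardised: object  # coded or normalised form
    interpretation: str   # the researcher's judgement


# Hypothetical standardisation and grouping rules.
OCCUPATION_CODES = {"ag. lab.": "agricultural labourer"}
OCCUPATION_GROUPS = {"agricultural labourer": "farm occupations"}


def layer_occupation(source: str) -> LayeredValue:
    """Expand a known abbreviation, then assign it to a broader group."""
    standardised = OCCUPATION_CODES.get(source.strip().lower(), source)
    interpretation = OCCUPATION_GROUPS.get(standardised, "unclassified")
    return LayeredValue(source, standardised, interpretation)


def layer_age(source: str) -> LayeredValue:
    """Convert '<n> mnths' or a plain number of years into years."""
    number, _, unit = source.partition(" ")
    years = float(number) / 12 if unit.startswith("mn") else float(number)
    # Assumed rule: under one year old counts as an infant.
    interpretation = "infant" if years < 1 else "not an infant"
    return LayeredValue(source, years, interpretation)


print(layer_age("6 mnths"))
print(layer_occupation("ag. lab."))
```

Because the source string is kept alongside the derived values, a later change to the coding rules can be re-applied without going back to the original document.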


Simple Design Hints

• Make sure the smallest unit of data matches the smallest unit of analysis.

– If you want to look at people by last name then have separate first and last name fields, not just a name field.

• Don’t mix data types.

– Separate numbers and words.

• Document everything you do, either in the database or with the database.

– Data entry, data standardisation and coding, data transformations, limits of data etc.

– Keep information that tracks the origin and history of the database.

• Add information, don’t delete information.
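The first and last hints can be combined in a small sketch: a name is split into separate forename and surname fields, and the original string is kept rather than overwritten. The function `split_name`, the field names, and the assumption that the surname is always the last word are illustrative only; real sources will need more careful rules.

```python
def split_name(full_name: str) -> dict:
    """Split 'Forename(s) Surname' into fields, keeping the original string.

    Assumes the surname is the last word, which will not hold for every source.
    """
    parts = full_name.split()
    return {
        "name_as_entered": full_name,  # original is kept, not overwritten
        "forename": " ".join(parts[:-1]),
        "surname": parts[-1] if parts else "",
    }


people = [
    split_name(n) for n in ["George Snook", "Sarah Hannah Snook", "James Dean"]
]

# With a separate surname field, sorting or grouping by last name is direct.
by_surname = sorted(people, key=lambda p: (p["surname"], p["forename"]))
print([p["name_as_entered"] for p in by_surname])
```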


Further Information

Starting Out

Michael J. Hernandez, Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design, Addison-Wesley, 1997.

Database Central, http://databasecentral.com/

History Data Service, http://hds.essex.ac.uk/

The ‘Classics’

Charles Harvey & Jon Press, Databases in Historical Research, Macmillan Press, 1996.

C. J. Date, An Introduction to Database Systems, Addison-Wesley, 1999 (7th ed.).


Source Layer

• Acts as the reference version of the original source.

– An accurate representation of the source, including errors, omissions etc.

– Contents determine the highest level of detail available about the source in the database.

– Includes a reference to the non-digital original source.

– Includes a unique identifier for each item.

• Implementation:

– as long text fields containing full text transcriptions.

– as ‘blob’ fields containing scanned images.

– as a regular database table.

– as a pivoted database table.

id  fname  lname  occup  age  sex
45  J      Smith  Baker  45   M
46  Jane   Smith              F

id  field  value
45  fname  J
45  lname  Smith
45  Occup  Baker
45  Age    45
45  Sex    M
46  fname  Mary
46  lname  Smith
46  sex    F
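The step from a regular table to a pivoted one can be sketched in a few lines of Python. The function names `to_pivoted` and `to_regular` are made up for this example, and the input uses the rows of the regular table shown above. Empty cells produce no row in the pivoted form, which is one reason the layout suits sources with many gaps.

```python
# Rows of the regular table; None marks a cell that is empty in the source.
regular = [
    {"id": 45, "fname": "J", "lname": "Smith", "occup": "Baker", "age": 45, "sex": "M"},
    {"id": 46, "fname": "Jane", "lname": "Smith", "occup": None, "age": None, "sex": "F"},
]


def to_pivoted(rows, key="id"):
    """Return one (id, field, value) triple per non-empty cell."""
    return [
        (row[key], field, value)
        for row in rows
        for field, value in row.items()
        if field != key and value is not None
    ]


def to_regular(triples, key="id"):
    """Rebuild one dict per id; fields with no triple are simply absent."""
    records = {}
    for rec_id, field, value in triples:
        records.setdefault(rec_id, {key: rec_id})[field] = value
    return list(records.values())


triples = to_pivoted(regular)
for triple in triples:
    print(triple)
```

The pivoted form also allows new kinds of field to be recorded without changing the table structure, at the cost of more awkward queries.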


Standardisation Layer

• Organises the information into discrete units with fully defined contents.

– Separates information in the source into separate fields according to data type and data content.

– Simplifies the data by standardising and coding it.

– Normalises the data.

– Includes links back to the source layer.

• Implementation:

– Possibly as additional columns in source layer tables.

– Probably as separate tables with, ideally, a one-to-one relationship to records in the source layer.

Normalisation is a series of rules that are applied to data to ensure that it conforms to the relational data model:

1 remove repeating groups (first normal form).

2 remove partial dependencies (second normal form).

3 remove indirect dependencies (third normal form).
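As a sketch of one such step, the code below removes a partial dependency. It assumes a flat table whose key is the pair (household number, person number), in which the address depends on the household number alone. The `person_no` column and the variable names are hypothetical additions for the example; the household and name values are taken from the earlier example tables.

```python
# Flat rows: (hhold_n, person_no, address, surname, forename).
# person_no is a hypothetical column added so that each row has a unique key.
flat = [
    ("A010", 1, "PINNER COMMON", "SNOOK", "GEORGE"),
    ("A010", 2, "PINNER COMMON", "SNOOK", "ANN"),
    ("A013", 1, "PINNER WOOD", "CRAWLEY", "GEORGE"),
]

households = {}  # hhold_n -> address, stored once per household
persons = []     # (hhold_n, person_no, surname, forename)

for hhold_n, person_no, address, surname, forename in flat:
    # The address depends only on hhold_n, so it moves to its own table.
    # setdefault returns the stored address, which exposes any conflict.
    if households.setdefault(hhold_n, address) != address:
        raise ValueError(f"conflicting addresses for household {hhold_n}")
    persons.append((hhold_n, person_no, surname, forename))

print(households)
print(persons)
```

After the split, each address is stored once, so correcting it cannot leave two members of the same household with different addresses.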


Interpretation Layer

• Creates historical entities from the data and the knowledge and expertise of the historian.

– Incorporates interpolations and extrapolations from the data in the standardisation layer.

– Selectively includes and excludes information from the standardisation layer.

– Links separate records to form entities such as ‘individuals’ or ‘households’.

– Many-to-many relationship with records in the standardisation layer.

Many-to-many relationships are usually converted into two one-to-many relationships to remove data redundancy.
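The conversion described above is usually done with a linking table that sits between the two sides. The sketch below links standardised records to interpreted entities in SQLite. The table names (`std_record`, `individual`, `record_link`), the specific links, and the second entity (the firm) are assumptions invented for illustration; only the name forms themselves come from the earlier three layer examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE std_record (rec_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE individual (ind_id INTEGER PRIMARY KEY, label TEXT);
    -- The linking table: each side has a one-to-many relationship with it.
    CREATE TABLE record_link (
        rec_id INTEGER REFERENCES std_record(rec_id),
        ind_id INTEGER REFERENCES individual(ind_id),
        PRIMARY KEY (rec_id, ind_id)
    );
    """
)
conn.executemany(
    "INSERT INTO std_record VALUES (?, ?)",
    [(1, "J. ? Smith"), (2, "J. A. Smith"), (3, "? ? Smith")],
)
conn.executemany(
    "INSERT INTO individual VALUES (?, ?)",
    [(1, "John Smith, Baker"), (2, "Smith & Son Bakers (firm)")],  # firm is hypothetical
)
# Record 3 is linked to both entities; entity 1 is linked to all three records.
conn.executemany(
    "INSERT INTO record_link VALUES (?, ?)", [(1, 1), (2, 1), (3, 1), (3, 2)]
)

records_for_john = [
    name
    for (name,) in conn.execute(
        "SELECT s.name FROM std_record AS s "
        "JOIN record_link AS l ON l.rec_id = s.rec_id "
        "WHERE l.ind_id = 1 ORDER BY s.rec_id"
    )
]
entities_for_record_3 = [
    label
    for (label,) in conn.execute(
        "SELECT i.label FROM individual AS i "
        "JOIN record_link AS l ON l.ind_id = i.ind_id "
        "WHERE l.rec_id = 3 ORDER BY i.ind_id"
    )
]
print(records_for_john)
print(entities_for_record_3)
```

Keeping the links in their own table means a linkage decision can be revised by changing one row, without touching either the standardised records or the interpreted entities.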

