Post on 18-Nov-2014
1. EAD Revision
2. EAC-CPF: an introduction
Timothy Ryan Mendenhall
Leo Baeck Institute
2012 March 28
EAD Revision
Timetable:
Currently: analyzing comments submitted during open comment period
December 2012: draft schema for revision and comment
August 2013: release of new schema
EAD Revision
What to expect:
Migration plan
Interoperability:
• better support for the semantics of relationships (cf. EAC-CPF, RDA)
Interchange:
• data interchange trumps presentation
• promote uniform and predictable use to enable better interchange of data
EAD Revision: the details. . .
Schema only -- DTD will be deprecated
Simplification:
Reduced number of tags
Deprecate presentation-oriented tags like <emph>, <head>, <table>
DTD Schema
EAD Revision: the details. . .
Simplification:
Simplified header
Simplified hierarchical structure
• <c01>, <c02>, etc. merge into undifferentiated <c> tags
• Wrapper and structural tags like <dsc> might be deprecated
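The merge of numbered components into plain <c> tags can be sketched as a small migration step. This is only an illustration, assuming un-namespaced EAD input such as DTD-based files; the function name `flatten_components` and the sample finding aid fragment are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Matches the numbered component tags c01 through c12.
NUMBERED_C = re.compile(r"^c(0[1-9]|1[0-2])$")

def flatten_components(root):
    """Rename every numbered component element to <c> in place.

    Nesting is left untouched, so the hierarchy is still expressed
    by the tree structure rather than by the tag name.
    """
    renamed = 0
    for elem in root.iter():
        if NUMBERED_C.match(elem.tag):
            elem.tag = "c"
            renamed += 1
    return renamed

# Hypothetical fragment of a container list.
sample = """<dsc>
  <c01 level="series"><did><unittitle>Correspondence</unittitle></did>
    <c02 level="file"><did><unittitle>Letters, 1933</unittitle></did></c02>
  </c01>
</dsc>"""

dsc = ET.fromstring(sample)
count = flatten_components(dsc)
flattened = ET.tostring(dsc, encoding="unicode")
```

Attributes such as level are preserved, so no descriptive information is lost by dropping the numbers.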
EAD Revision: the details. . .
Make EAD more database-friendly:
Less mixed content, more tagged data
More specific, granular tags: e.g. forenames and surnames
More flexibility for normalizing dates (multiple dates, ranges of dates, etc. Cf. EAC, RDA)
Geo-tagging
“Profiles” of tag sets for different types of repositories
EAD Revision: the details. . .
Extend potential for language qualifications:
<geogname language="ger">Köln (Deutschland)</geogname>
and/or
<geogname language="eng">Cologne (Germany)</geogname>
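A display layer could use such language qualifications to pick the form of a place name that suits the reader. The sketch below assumes the `language` attribute exactly as written on the slide, which is a proposal and not a confirmed part of the revised schema; the <controlaccess> wrapper, the helper `place_name`, and the fallback rule are illustrative choices.

```python
import xml.etree.ElementTree as ET

# Parallel forms of one place name, qualified by language as on the slide.
sample = """<controlaccess>
  <geogname language="ger">Köln (Deutschland)</geogname>
  <geogname language="eng">Cologne (Germany)</geogname>
</controlaccess>"""

def place_name(root, lang, fallback="eng"):
    """Return the geogname in the requested language, else the fallback form."""
    names = {g.get("language"): g.text for g in root.iter("geogname")}
    return names.get(lang, names.get(fallback))

access = ET.fromstring(sample)
german = place_name(access, "ger")
french = place_name(access, "fre")  # no French form, so the English one is used
```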
Data-centered model: Goals
Improve machine-readability of finding aids
Aid in the sharing of finding aid data across platforms, CMSs, languages, countries, and different aggregators
Move away from the document model: finding aid as a fluid, malleable record, not a fixed document
Effect on CJH
Likely minimal – migration paths will be made available
Conversion from EAD-DTD to EAD-Schema
Creation of task force?
Resources, stylesheets available
Creation of new EAD templates
New possibilities!
EAC: An introduction
Basics
EAC-CPF: Encoded Archival Context – Corporate Bodies, Persons, Families
XML vocabulary
Based on ISAAR(CPF): international standard related to ISAD(G)
Adopted by SAA in 2011: standard for archival authority data
Features
Parallels many RDA changes
Increased granularity of data
• E.g. life dates split into birth and death dates
Emphasis on relations
• With other resources
• With other corporate bodies, persons, families
• With functions
Features
Compatibility with existing authority data (LCSH, etc.)
Wrapper elements allow wholesale inclusion of outside metadata, e.g. authority MARC-XML
Great flexibility for alternate names, variant forms, local implementations
Features
Accommodates 4 different types of “entities”
Single identity
Multiple identity
• Many in one (single EAC-CPF instance)
• One in many (multiple instances)
Alternative sets (i.e. variant records)
Why EAC?
Part of broader move towards semantic web, linked open data (LOD)
Better end-user experience
Improves capacity for faceted searching
More intuitive web interfaces
Standardization of authority data
Sharing of authority data
Eventually – saves time
Examples
Sample records: http://www3.iath.virginia.edu/eac/cpf/examples/list.html
EAC in action: http://socialarchive.iath.virginia.edu/xtf/search
Basic structure
Like EAD (and MARC), divided into control and descriptive sections:
<eac-cpf>
<control> […] </control>
<cpfDescription> [ALTERNATE: <multipleIdentities><cpfDescription> . . .]
</cpfDescription>
</eac-cpf>
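The two-part layout above can be sketched programmatically. This is a minimal illustration, not a complete or valid record: it omits the XML namespace and most of the control section, and the function name `build_skeleton`, the record identifier, and the name are placeholders.

```python
import xml.etree.ElementTree as ET

def build_skeleton(record_id, name, entity_type="person"):
    """Build a bare <eac-cpf> tree with a control and a descriptive section."""
    root = ET.Element("eac-cpf")

    # Control section: data about the record itself.
    control = ET.SubElement(root, "control")
    ET.SubElement(control, "recordId").text = record_id

    # Descriptive section: data about the entity being described.
    description = ET.SubElement(root, "cpfDescription")
    identity = ET.SubElement(description, "identity")
    ET.SubElement(identity, "entityType").text = entity_type
    name_entry = ET.SubElement(identity, "nameEntry")
    ET.SubElement(name_entry, "part").text = name
    return root

# Placeholder values for illustration only.
record = build_skeleton("example-001", "Doe, Jane")
sections = [child.tag for child in record]
```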
Basic structure : Control
Administrative data about the record itself
Required elements:
• recordId
• maintenanceAgency
• maintenanceStatus
• maintenanceHistory
• languageDeclaration
• sources
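A quick completeness check against this list might look like the sketch below. The list of required names is copied from the slide and should be verified against the published schema before being relied on; the sample record is un-namespaced and hypothetical, as is the function name `missing_control_elements`.

```python
import xml.etree.ElementTree as ET

# Element names as listed on the slide (assumed, not checked against the schema).
REQUIRED_CONTROL = (
    "recordId",
    "maintenanceAgency",
    "maintenanceStatus",
    "maintenanceHistory",
    "languageDeclaration",
    "sources",
)

def missing_control_elements(root):
    """Return the listed control elements that the record does not contain."""
    control = root.find("control")
    if control is None:
        return list(REQUIRED_CONTROL)
    present = {child.tag for child in control}
    return [name for name in REQUIRED_CONTROL if name not in present]

# Hypothetical, deliberately incomplete record.
sample = """<eac-cpf>
  <control>
    <recordId>example-001</recordId>
    <maintenanceStatus>new</maintenanceStatus>
  </control>
  <cpfDescription/>
</eac-cpf>"""

missing = missing_control_elements(ET.fromstring(sample))
```

A check like this is no substitute for schema validation, but it is cheap to run over a batch before ingest.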
Basic structure : Control
Optional elements
Allow for local customization
Use of other identifiers for same entity (i.e. from other thesauri, other national libraries, etc.)
Basic structure : cpfDescription
Descriptive section <cpfDescription>
For most records: single <identity>
For complex identities
• many-in-one, corporate and compound entities
• multiple <cpfDescription> elements wrapped in <multipleIdentities> tag
Basic structure : cpfDescription
Required: <identity>
Optional:
<description>
<relations>
<alternateSet> -- alternate records for the same entity imported from a different authority system, such as LCSH, VIAF, or a different national library.
Basic structure : cpfDescription
Descriptive section: required <identity>
Most complex element
Parallels RDA changes:
• Increased functionality for parallel and variant forms of names
• Can distinguish between “authorized” and “preferred” forms of a name
• Increased granularity (parts of names, dates)
• Ability to qualify variant forms of names by “use dates”
Basic structure : cpfDescription
Optional <description>
Very similar to RDA, but encoded in XML
<existDates>
• <date>, <dateRange>, <dateSet>
<places>
• May be qualified by dates and roles
• Place of residence, place of birth, place of death, etc.
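Because <existDates> holds tagged dates instead of a single text string, birth and death dates can be read separately. The sketch below assumes a <dateRange> with <fromDate> and <toDate> children carrying a standardDate attribute, in an un-namespaced sample; the helper `life_dates` and the dates themselves are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical description section with a single date range.
sample = """<description>
  <existDates>
    <dateRange>
      <fromDate standardDate="1900">1900</fromDate>
      <toDate standardDate="1980">1980</toDate>
    </dateRange>
  </existDates>
</description>"""

def life_dates(description):
    """Return (start, end) of the existence dates, or None where absent."""
    date_range = description.find("existDates/dateRange")
    if date_range is None:
        return (None, None)

    def normalized(tag):
        elem = date_range.find(tag)
        if elem is None:
            return None
        # Prefer the machine-readable attribute, fall back to the display text.
        return elem.get("standardDate", elem.text)

    return (normalized("fromDate"), normalized("toDate"))

born, died = life_dates(ET.fromstring(sample))
```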
Basic structure : cpfDescription
Optional <description>
All may be qualified by dates:
<occupations>
<functions>
<legalStatus> (corporate body)
<mandates> (corporate body)
Basic structure : cpfDescription
Optional <description>
“Free text” descriptive sections:
<biogHist> -- same as in EAD
<generalContext> -- “general social and cultural context”
<structureOrGenealogy>
• Structure of corporate bodies
• Genealogy of individuals, families
Basic structure : cpfDescription
Relations section:
<cpfRelation> -- relations to other “entities”
<functionRelation>
<resourceRelation>
<objectXMLWrap> to include other records, portions of other records
Basic structure : cpfDescription
Relations section:
All have “relation type” attributes to help specify the type of relation:
• cpf: family, associative, hierarchical-child, hierarchical-parent, etc.
• Functions: controls, owns, performs, etc.
• Resources: creator, subject, etc.
To include other records, portions of other records:
• <objectXMLWrap>
• <objectBinWrap>
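Typed relations are what make faceted displays and linking possible. As a rough sketch, the relations in a record can be grouped by element and relation type. It assumes the type attributes are named after their element (cpfRelationType, resourceRelationType, functionRelationType) and that each relation holds a <relationEntry>; the sample entries and the function name `relations_by_type` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical relations section with invented entries.
sample = """<relations>
  <cpfRelation cpfRelationType="family">
    <relationEntry>Example, Anna</relationEntry>
  </cpfRelation>
  <cpfRelation cpfRelationType="associative">
    <relationEntry>Example Society</relationEntry>
  </cpfRelation>
  <resourceRelation resourceRelationType="creatorOf">
    <relationEntry>Example Family Collection</relationEntry>
  </resourceRelation>
</relations>"""

def relations_by_type(relations):
    """Group relation entries by (element name, relation type)."""
    grouped = defaultdict(list)
    for rel in relations:
        # The type attribute is the element name plus "Type".
        rel_type = rel.get(rel.tag + "Type")
        grouped[(rel.tag, rel_type)].append(rel.findtext("relationEntry"))
    return dict(grouped)

grouped = relations_by_type(ET.fromstring(sample))
```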
Implementation at CJH?
Via Digitool?
Similar to MARC to EAD
• Wholesale batch conversion
• Issues:
• data cleanup
• Skeletal data
• Resolving differences in existing biographical notes, etc.
• Digitool’s interface – not good for “active” records needing frequent syncing, updating
Implementation at CJH?
Via Digitool?
Steps required:
• Batch export of authority data from Aleph
• LCNAF is also available for download
• MARC to EAC stylesheet
• Google Refine: clean up data
• Ingest to Digitool
• EAC to HTML stylesheet
• Google Refine: resolution with existing EADs
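The MARC to EAC step would normally be done with an XSLT stylesheet, as listed above. Purely to illustrate the mapping, the sketch below uses Python instead and turns the $a and $d subfields of a personal name heading into an <identity> and <existDates>. The input dictionary stands in for a parsed MARC field, and the function name `marc100_to_eac`, the date pattern, and the sample heading are all assumptions.

```python
import re
import xml.etree.ElementTree as ET

def marc100_to_eac(subfields):
    """Map personal-name subfields ($a name, $d dates) to a <cpfDescription>."""
    description = ET.Element("cpfDescription")
    identity = ET.SubElement(description, "identity")
    ET.SubElement(identity, "entityType").text = "person"
    name_entry = ET.SubElement(identity, "nameEntry")
    # Drop the trailing punctuation that precedes the date subfield.
    ET.SubElement(name_entry, "part").text = subfields["a"].rstrip(" ,")

    # Split "1900-1980" (or open-ended "1900-") into separate tagged dates.
    match = re.fullmatch(r"(\d{4})-(\d{4})?\.?", subfields.get("d", "").strip())
    if match:
        desc = ET.SubElement(description, "description")
        exist = ET.SubElement(desc, "existDates")
        date_range = ET.SubElement(exist, "dateRange")
        start = ET.SubElement(date_range, "fromDate", standardDate=match.group(1))
        start.text = match.group(1)
        if match.group(2):
            end = ET.SubElement(date_range, "toDate", standardDate=match.group(2))
            end.text = match.group(2)
    return description

# Invented heading for illustration.
converted = marc100_to_eac({"a": "Example, Person,", "d": "1900-1980"})
```

Headings with approximate or partial dates would not match this pattern and are left without <existDates>, which is the kind of skeletal data that later needs manual cleanup.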
Implementation at CJH?
Via Digitool?
Potential for *labor-intensive* edits
• Roles within collection
• Center-wide agreement on relator terms (RDA?), manually updating EAD “role” attributes
• Expansion of biogHist, structureOrGenealogy, etc.
Custom database outside of Digitool?
Eventually – ArchivesSpace?
Future potential
Crowd-sourcing
• Relationship data
• Function data (“Is correspondent”, “Is subject”, etc.)
• Genealogical data
Harvesting biographical, historical and genealogical data
• DBpedia
• JewishGen
Resources
MARC-XML to EAC stylesheets
Entire LCSH, LCNAF available for download (MADS/RDF): http://id.loc.gov/download/
EAC-Pages: http://eac.staatsbibliothek-berlin.de/
EAC listserv = EAD listserv