M. Baca CDWA Lite Schema CCO Preconference @ ALA 2006 1
Developing a Data Format Standard Informed by CCO: The CDWA Lite/OAI PMH
Project
“Expressing CCO”
Murtha Baca
Getty Research Institute
Origin and Goals
• The Getty wanted to make authoritative, up-to-date information on its collections widely available, in a variety of “venues.”
• ARTstor asked the Getty to contribute to its Image Gallery.
• The Getty and ARTstor worked together to develop a replicable, standards-based way for institutions to contribute data and images relating to cultural heritage collections to union catalogs like ARTstor’s Image Gallery (and OCLC’s WorldCat, RLG’s Cultural Materials, etc.).
Goals cont.
• To develop a data dictionary (specification) & related XML schema suited to representing cultural objects, based on appropriate data structure (CDWA) & data content (CCO) standards
• To reduce overhead for contributing to union catalogs/service providers: do it once, do it right, share it with everybody
• To reduce labor and “delivery” costs
• To ensure a mechanism for updating data
• To include links from contributed metadata back to records in their “home” context
Essential Elements
• An XML schema/data format standard appropriate for expressing information on cultural objects & their visual surrogates: CDWA Lite (not Dublin Core!)
• A replicable, standard technical protocol for delivering, sharing, & disseminating the information expressed in that format: the OAI Protocol for Metadata Harvesting (OAI/PMH)
CDWA Lite
An XML schema/XSD for core records for works of art & material culture based on the Categories for the Description of Works of Art (CDWA) core categories & informed by the Cataloging Cultural Objects (CCO) guidelines.
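The harvesting side of this pairing can be sketched in a few lines. The example below is a minimal illustration, not the project's actual setup: it builds an OAI-PMH ListRecords request URL. The verb and the parameter names (metadataPrefix, set, from) come from the OAI-PMH protocol; the endpoint address, the "cdwalite" metadata prefix, and the "paintings" set name are assumptions made up for the example.

```python
from urllib.parse import urlencode

# Hypothetical values, for illustration only.
BASE_URL = "https://example.org/oai"   # assumed endpoint, not a real Getty address
METADATA_PREFIX = "cdwalite"           # assumed prefix; a repository advertises its own
SET_SPEC = "paintings"                 # assumed name for one harvestable set


def list_records_url(base_url, metadata_prefix, set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec       # restrict the harvest to one set
    if from_date:
        params["from"] = from_date     # only records changed since this date
    return f"{base_url}?{urlencode(params)}"


url = list_records_url(BASE_URL, METADATA_PREFIX, SET_SPEC, from_date="2006-01-01")
print(url)
```

A service provider would issue this request on a schedule; the "from" date is what lets it pick up only records updated since the previous harvest.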
The Players
• XML schema developers (Getty & ARTstor)
• Service provider/content aggregator (in this case, ARTstor)
• Data provider/contributor (in this case, the Getty Museum and Getty Research Institute)
• “Internal service provider” (in this case, Getty Web Group, Getty Research Institute information systems department)
CDWA Lite: Origins
What is CDWA?
Categories for the Description of Works of Art (CDWA) describes the content of art databases by articulating a conceptual framework for describing and accessing information about works of art, architecture, other material culture, groups and collections of works, and related images.
What is CCO?
Cataloging Cultural Objects: A Guide to Describing Cultural Works and Their Images (CCO) provides prescriptive guidelines for selecting, ordering, and formatting data used to populate catalog records. It deals with information related to a subset of the CDWA Categories and the VRA Core Categories.
[Figure: CDWA/CCO Entity Relationship Diagram (Entity Relationship Diagram for CDWA Data Structure). Entities shown: Object/Work Records; Related Visual Documentation; Related Textual Documentation; Person/Corporate Body Authority; Place/Location Authority; Generic Concept Authority (Generic Concept Identification); Subject Authority (Subject Identification).]
http://www.getty.edu/research/conducting_research/standards/intrometadata/index.html
Origin of the CDWA Lite elements
http://www.getty.edu/research/conducting_research/standards/intrometadata/index.html
Mapping data categories/elements
CDWA                          CCO           CDWA Lite
OBJECT/WORK (core)            .             .
Object/Work - Catalog Level   Record Type   .
Object/Work - Type (core)     Work Type     cdwalite:objectWorkType
CDWA/CCO required categories were mapped to an XML schema = CDWA Lite
http://www.getty.edu/research/conducting_research/standards/cdwa/cdwalite/cdwalite.pdf
• Draft data dictionary/spec for XML schema to describe core records for works of art and material culture
• Based on the required fields, structure, guidelines discussed in CDWA
• Informed by the cataloging rules in CCO
CDWA LITE on the Web
CDWA Lite: a closer look
• The tag name goes in angle brackets: <cdwalite:objectWorkType>
• The data follows the opening tag (shown in blue on the slide): painting
• The end of the data is indicated by repeating the tag name with a slash: </cdwalite:objectWorkType>
<cdwalite:objectWorkType>painting</cdwalite:objectWorkType>
• Attributes may be added, e.g., the source of the term (termsource); place the attribute inside the tag to distinguish it from the data.
<cdwalite:objectWorkType termsource="AAT">painting</cdwalite:objectWorkType>
• Tags may be nested inside each other, forming the hierarchical structure of the schema.
<cdwalite:objectWorkTypeWrap>
  <cdwalite:objectWorkType termsource="AAT">painting</cdwalite:objectWorkType>
  <cdwalite:objectWorkType termsource="AAT">altarpiece</cdwalite:objectWorkType>
</cdwalite:objectWorkTypeWrap>
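The nesting shown above can also be produced programmatically. What follows is a rough sketch using Python's standard xml.etree.ElementTree, not an official tool: the helper names (qname, build_work_type_wrap) are hypothetical, and the namespace URI bound to the cdwalite prefix is an assumption that should be checked against the published schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the cdwalite prefix; verify against the published XSD.
CDWALITE_NS = "http://www.getty.edu/CDWA/CDWALite"
ET.register_namespace("cdwalite", CDWALITE_NS)


def qname(tag):
    """Return the namespace-qualified form of a CDWA Lite tag name."""
    return f"{{{CDWALITE_NS}}}{tag}"


def build_work_type_wrap(terms, termsource=None):
    """Wrap one objectWorkType element per term in an objectWorkTypeWrap."""
    wrap = ET.Element(qname("objectWorkTypeWrap"))
    for term in terms:
        element = ET.SubElement(wrap, qname("objectWorkType"))
        if termsource:
            element.set("termsource", termsource)  # attribute sits inside the tag
        element.text = term                        # data sits between the tags
    return wrap


wrap = build_work_type_wrap(["painting", "altarpiece"], termsource="AAT")
xml_text = ET.tostring(wrap, encoding="unicode")
print(xml_text)
```

Because the sub-element is repeatable, the wrapper simply receives one objectWorkType child per term.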
DESCRIPTIVE METADATA
1. Element: Object/Work Type Wrapper
1.1. Sub-element: Object/Work Type
2. Element: Title Wrapper
2.1. Sub-element: Title Set
2.1.1. Sub-element: Title
2.1.2. Sub-element: Source of Title
3. Element: Display Creator
Etc.
CDWA Lite is divided into descriptive metadata for the work, and administrative metadata
CDWA LITE ELEMENTS
• CDWA Lite focuses on descriptive data for Works
• Images are included, but not emphasized
• Authority/controlled vocabulary information is included
CDWA LITE ELEMENTS
ADMINISTRATIVE and RESOURCE METADATA
20. Element: Rights for Work
21. Element: Record Wrapper
21.1. Sub-element: Record ID
21.2. Sub-element: Record Type
21.3. Sub-element: Record Source
etc.
CDWA LITE ELEMENTS
Administrative metadata includes copyright, information about the image, etc.
• CDWA Lite is designed so that it could eventually be a subset of a bigger XML schema for the full CDWA
• Wrappers and Sets organize related elements
CDWA LITE ELEMENTS
1.1. Sub-element: Object/Work Type
Element tag: <cdwalite:objectWorkType>
Description: A term or terms identifying the specific kind of object or work being described. For a collection, include repeatable instances for terms identifying all of or the most important items in the collection.
Attributes: termsource, termsourceID
Repeatable
Required
Data values: Controlled. Recommended: AAT
Tagging examples:
<cdwalite:objectWorkTypeWrap>
  <cdwalite:objectWorkType>rhyton</cdwalite:objectWorkType>
</cdwalite:objectWorkTypeWrap>
<cdwalite:objectWorkTypeWrap><cdwalite:objectWorkType termsource="AAT">painting
The description, the attributes, whether an element is repeatable or required, and how its values are controlled are all based on CDWA and CCO
CDWA LITE ELEMENTS
4.6. Sub-element: Attribution Qualifier Creator
Element tag: <cdwalite:attributionQualifierCreator>
Description: A qualifier used when the attribution is uncertain, is in dispute, when there is more than one creator, when there is a former attribution, or when the attribution otherwise requires explanation.
Repeatable
Not required
Data values: attributed to, studio of, workshop of, atelier of, office of, assistant of, associate of, pupil of, follower of, school of, circle of, style of, after, copyist of, manner of; used according to the recommendations in CCO and CDWA.
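A data provider might want to check a qualifier against the values listed above before exporting a record. The sketch below is one possible way to do that and is not part of the CDWA Lite specification: the function name is hypothetical, and the decision to ignore case and extra spaces is an assumption of this example.

```python
# The fifteen values listed for cdwalite:attributionQualifierCreator.
ATTRIBUTION_QUALIFIERS = frozenset({
    "attributed to", "studio of", "workshop of", "atelier of", "office of",
    "assistant of", "associate of", "pupil of", "follower of", "school of",
    "circle of", "style of", "after", "copyist of", "manner of",
})


def is_listed_qualifier(value):
    """Return True if value matches a listed qualifier.

    Assumption of this sketch: comparison ignores case and repeated whitespace.
    """
    normalized = " ".join(value.split()).lower()
    return normalized in ATTRIBUTION_QUALIFIERS


print(is_listed_qualifier("Attributed to"))  # matches a listed value
print(is_listed_qualifier("painted by"))     # not in the list
```

A check like this only confirms that the term is on the list; choosing the right qualifier for a given work remains a cataloging decision guided by CCO and CDWA.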
Implementors are referred back to CCO and CDWA for cataloging guidance
CDWA LITE ELEMENTS
7. <cdwalite:displayMaterialsTech>lacquered iron and leather, with silk, and copper-gilt; stenciled leather breastplate</cdwalite:displayMaterialsTech>
Display and indexing guidelines in CCO and CDWA are carried through to CDWA Lite
Image from www.metmuseum.org
8.1. <cdwalite:indexingMaterialsTechWrap>
  <cdwalite:indexingMaterialsTechSet>
    <cdwalite:extentMaterialsTech>breastplate</cdwalite:extentMaterialsTech>
    <cdwalite:termMaterialsTech termsource="AAT" termsourceID="aat300011845">leather</cdwalite:termMaterialsTech>
    <cdwalite:termMaterialsTech termsource="AAT" termsourceID="aat300053433">lacquering</cdwalite:termMaterialsTech>
  </cdwalite:indexingMaterialsTechSet>
  <cdwalite:indexingMaterialsTechSet>
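On the receiving end, a service provider could pull the indexing terms out of such a wrapper for its own search index. The following is a minimal sketch with Python's standard library. The slide's fragment is cut off after the first set, so the sample here closes the wrapper after that one set; that completion, the namespace URI, and the function name index_terms are assumptions of this example.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the cdwalite prefix; verify against the published XSD.
NS = {"cdwalite": "http://www.getty.edu/CDWA/CDWALite"}

# First set from the slide; the wrapper is closed here only to make the sample parseable.
SAMPLE = """
<cdwalite:indexingMaterialsTechWrap xmlns:cdwalite="http://www.getty.edu/CDWA/CDWALite">
  <cdwalite:indexingMaterialsTechSet>
    <cdwalite:extentMaterialsTech>breastplate</cdwalite:extentMaterialsTech>
    <cdwalite:termMaterialsTech termsource="AAT" termsourceID="aat300011845">leather</cdwalite:termMaterialsTech>
    <cdwalite:termMaterialsTech termsource="AAT" termsourceID="aat300053433">lacquering</cdwalite:termMaterialsTech>
  </cdwalite:indexingMaterialsTechSet>
</cdwalite:indexingMaterialsTechWrap>
"""


def index_terms(xml_text):
    """Return (extent, term, termsource, termsourceID) tuples, one per indexed term."""
    root = ET.fromstring(xml_text)
    rows = []
    for mset in root.findall("cdwalite:indexingMaterialsTechSet", NS):
        extent = mset.findtext("cdwalite:extentMaterialsTech", default="", namespaces=NS).strip()
        for term in mset.findall("cdwalite:termMaterialsTech", NS):
            rows.append((extent, (term.text or "").strip(),
                         term.get("termsource"), term.get("termsourceID")))
    return rows


for row in index_terms(SAMPLE):
    print(row)
```

Keeping the extent alongside each term preserves the link between a material or technique and the part of the work it applies to.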
The Collections (harvestable sets)
Research Library, Getty Research Institute, Photo Study Collection: Tapestries
The Getty Research Institute’s Photo Study Collection contains more than 2 million study photographs, many of which are historic in nature. Featured here are 4,215 images from the Tapestries Collection, most of which are based on the French & Company dealer archive.
The J. Paul Getty Museum at the Getty Center: Paintings
The J. Paul Getty Museum’s paintings collection includes European paintings from roughly 1300 to 1900. The collection also includes a number of pastels, predominantly from the 18th century.
The Data “at home”
• J. Paul Getty Museum Paintings Collection (461 records and 461 digital still images)
• Data lives in a collection management system (relational database)
• Tapestries - Photo Study Collection of the Research Library, Getty Research Institute (4215 records and 9063 digital still images)
• Data lives in a flat-file system with linking capabilities
Attributed to Hans Holbein the Younger
An Allegory of Passion
German, 1530
Oil on panel
The Data in ARTstor’s Image Gallery
Harvested Getty Museum record in ARTstor
Getty museum object on the Getty Web site, with additional information & images
Harvested Getty Museum record includes link to Getty Web site
Paintings collection on Getty Web site
Harvested Photo Study Collection record
Photo Study Collection record on the Getty Web site
Research Library, Getty Research Institute Photo Study Collection Study Images of Tapestries collection Web page
What are the incentives for using CDWA Lite?
(or another standards-based schema + the OAI harvesting protocol)
ARTstor records for Getty Museum object; records are from a university slide library
Subject headings enhance access, but not all data is up to date or matches the repository’s data
Work Type: sculpture
Title: Cult Statue of a Goddess, perhaps Aphrodite
Creator: Unknown Greek (South Italian)
Measurements: H: 220 x W: 67 cm (86 5/8 x 26 3/8 in.)
Materials: limestone and Parian marble with polychromy
Creation Date: 425-400 B.C.E.
Repository: The J. Paul Getty Museum at the Getty Villa (Malibu, California) 88.AA.76
Rights: © J. Paul Getty Museum; http://www.getty.edu/image_rights
Record Source: J. Paul Getty Museum. Object ID: 115100
Image: http://www.getty.edu/art/gettyguide/artObjectDetails?artobj=15050&handle=li
Core CDWA Lite record exported from museum collection management system, “harvested” along with image from getty.edu Web site: up-to-date, authoritative image and metadata from the institution that owns the object. This could be clustered with other records for the same object as the “master” record in a union environment.
Work Type: Necklace
Class: Jewelry
Creator: Unknown Hawaiian
Culture: Hawaiian, Polynesian, Oceanic
Measurements: 44 cm (length)
Materials: ivory beads; glass beads
Creation Date: 19th century
Repository: Emerson Collection, Bishop Museum, Honolulu, Hawaii, artifact number 01293
Example of a CDWA Lite record (display elements) for an object in the Bishop Museum: data comes from the holding institution
“I would like to show that we can streamline processes and eventually save time that our database manager has to spend pulling and formatting data for different repositories.”
Diana Folsom, Arts & Education Systems Manager, Los Angeles County Museum of Art
Incentives for using CDWA Lite
Next Steps
• Evaluate results of “first harvest” (done at the end of 2005) & make any necessary changes to the schema, mapping, & data
• Offer metadata records & images to other Service Providers (RLG, OCLC)
• Share schema with the cultural heritage community & get input
• Develop into a production-grade procedure
• Advise/assist other early implementors (LACMA, Cleveland Museum of Art, others?)
• Work with vendors (e.g. Gallery Systems; Systems Planning) to embed in collection management systems
Another XML schema informed by CCO: VRA Core 4.0
http://www.vraweb.org/datastandards/VRA_Core4_Welcome.html
CDWA Lite and VRA Core 4.0 are included among the Potential Metadata Formats for Use with the OAI/PMH on the OAI Best Practices pages of the National Science Digital Library (NSDL) and the Digital Library Federation (DLF)
http://oai-best.comm.nsdl.org/cgi-bin/wiki.pl?RecommendedFormats
“... this low barrier [for contributing metadata via the OAI/PMH] does not preclude a much higher ceiling [than simple Dublin Core], and the OAI-PMH specifically allows the use of much richer metadata schemes.”
Roy Tennant, “Bitter Harvest: Problems & Suggested Solutions for OAI-PMH Data & Service Providers,” http://www.cdlib.org/inside/projects/harvesting/bitter_harvest.html
More on CCO & CDWA Lite and Their Potential “Interaction” with Library and Archival Standards
EAD (data structure) & DACS (data content) used at the collection level for an archival collection with a common provenance
Class: Contemporary Art
Work Type: multimedia
Creator: Claes Oldenburg (American sculptor, draftsman, and printmaker, born 1929 in Sweden)
Title: False Food Selection
Creation Date: ca. 1965
Materials: plastic box containing artificial food made of plastic
Measurements: 13.5 x 18 x 5 cm (5 3/8 x 7 x 2 inches)
Style: Fluxus
Subject: box; food; biscuits; petit fours; kaiser roll; eggs; bacon
Current Location: Special Collections, Research Library, Getty Research Institute (Los Angeles, California) (890164 bx.205)
Description: The box of the repository’s copy is blue and contains 3 different biscuits, 3 different petit fours in paper baking cups, a pear, a kaiser roll, and 2 sunny-side up eggs and a strip of bacon glued to the inside of the lid.
Related Work:
Relationship Type: part of
[link to Related Work:] Brown, Jean (American, 1911-1994). Jean Brown Papers, 1916-1995.
CDWA Lite or VRA Core (data structure/data format) & CCO (data content) at the item level for an individual work within the same collection
MARC (data structure/data format) and AACR (data content) used for a “parent” item (18th-century book with engravings) in OPAC.
Class: Prints
Work Type: engraving
Creator: Unknown Spanish
Title: Table Setting for Sixty Covers
Creation Date: ca. 1747
Materials/Techniques: engraving on laid paper
Measurements: plate mark 14.6 x 20 cm (5 3/4 x 7 3/4 inches), on sheet 16 x 21.1 cm (6 3/8 x 8 3/8 inches)
Subject: table setting; food; decoration; centerpieces; confectionery; garnishes; cookery; desserts; tablecloths; tabletop fountains; food presentation; courts; courtiers
Description: Table setting for sixty covers described under the entry “Mesa de sesenta cubiertos, larga, y sus esquinas redondas.” The sculptural decoration represents a rampart and its fortified towers (no. 1). The table with rounded corners is adorned with platters of glass (no. 2), and vessels for holding sweets, sugar, and caramel figures, compotes, cakes, cheese, and fruit.
Current Location: Special Collections, Research Library, Getty Research Institute (Los Angeles, California) (1405.324_pl6)
Related Work:
Relationship Type: part of
[link to Related Work:] Juan de la Mata, (Spanish, 18th century); Arte de reposteria. Madrid: 1747.
CDWA (data structure), CCO (data content), and CDWA Lite (data format) used at the item level for an individual engraving from the “parent” work represented in the preceding MARC record.
Results list of Special Collections search, with AACR “inscribed titles.”
Results list from same search, with CCO “display titles” along with AACR titles.
More AACR “inscribed titles.”