UVA MDST 3703 Thematic Research Collections 2012-09-18

Post on 21-Nov-2014


Thematic Research Collections

Prof. Alvarado

MDST 3703/7703

18 September 2012

[XKCD]

Business

• Anyone having problems connecting to their home directory?
– Come see me if so

• Quiz 1 will be posted on Collab today

Comments

• “It is much more than just the technology, it’s about the conscious decision made by real live humans who design the technology.”

• “I think in order for digital representation to be able to achieve a maximum functionality there needs to be a move away from simply recreating a card catalog online with attachments to the documents.”

• “… nothing can ever truly replace the experience of being in a physical library itself.”

Comments

• “As complex as the hypertext may become, it must remain user friendly to be of any value.”

• “… the first thing that sticks out is the diversity of structure of the collections.”

• “… there [are] still some drawbacks to digital collections, for example, the lack of a great system for annotating documents.”

Comments

• “There is something to be said for walking into a library and pouring over pages, without interruption from technology, for hours and having to forge the way for your own trail of connections from one document to the next, much like the work hyperlinks do for us.”

Review

• So far, we have looked at two big ideas
– The idea of hypertext, and its realization in HTML
– The concept of text markup, and its realization in SGML, XML, TEI, and HTML

• Remember:
– TEI and HTML are specific markup languages
– SGML and XML are specifications for defining markup languages
– XML lets you create the languages on the fly
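As a rough illustration of creating a language "on the fly", here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The LETTER, SENDER, and BODY element names and the `build_letter` helper are hypothetical, made up for this example; they do not come from TEI or any other real markup language.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary invented for this example: LETTER, SENDER, BODY.
# No DTD or schema was declared first; well-formed XML is all that is required.
def build_letter(sender, body):
    """Build a small document in the made-up LETTER vocabulary."""
    letter = ET.Element("LETTER")
    ET.SubElement(letter, "SENDER").text = sender
    ET.SubElement(letter, "BODY").text = body
    return letter

letter = build_letter("A soldier", "Dear family ...")
xml_text = ET.tostring(letter, encoding="unicode")
print(xml_text)

# Any XML parser can read the result back, even though it has never seen these tags.
reparsed = ET.fromstring(xml_text)
print([child.tag for child in reparsed])
```

Nothing here says which tags are allowed or in what order. Well-formed XML only requires that the tags nest properly.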

What mechanism do SGML and XML provide to define specific markup languages?

DTDs

Document Type Definitions

More generally, these are called schemas

[DTD]

<!DOCTYPE NEWSPAPER [

<!ELEMENT NEWSPAPER (ARTICLE+)>
<!ELEMENT ARTICLE (HEADLINE,BYLINE,LEAD,BODY,NOTES)>
<!ELEMENT HEADLINE (#PCDATA)>
<!ELEMENT BYLINE (#PCDATA)>
<!ELEMENT LEAD (#PCDATA)>
<!ELEMENT BODY (#PCDATA)>
<!ELEMENT NOTES (#PCDATA)>

<!ATTLIST ARTICLE AUTHOR CDATA #REQUIRED>
<!ATTLIST ARTICLE EDITOR CDATA #IMPLIED>
<!ATTLIST ARTICLE DATE CDATA #IMPLIED>
<!ATTLIST ARTICLE EDITION CDATA #IMPLIED>

]>

For example, a DTD for a newspaper

No need to remember the syntax for DTDs, just their purpose
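To make that purpose concrete, here is a minimal Python sketch of what the NEWSPAPER DTD above lets software check. Python's standard-library `xml.etree.ElementTree` parser does not validate against DTDs, so the rules are transcribed by hand into the hypothetical `check_newspaper` function; a validating parser would read them from the DTD itself. The two sample documents are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Rules transcribed by hand from the NEWSPAPER DTD (an assumption of this sketch;
# a validating parser would read them from the DTD instead).
ARTICLE_CHILDREN = ["HEADLINE", "BYLINE", "LEAD", "BODY", "NOTES"]
REQUIRED_ATTRS = {"AUTHOR"}
ALLOWED_ATTRS = {"AUTHOR", "EDITOR", "DATE", "EDITION"}

def check_newspaper(xml_text):
    """Return a list of problems; an empty list means the document fits the rules."""
    root = ET.fromstring(xml_text)
    if root.tag != "NEWSPAPER":
        return [f"root is {root.tag}, expected NEWSPAPER"]
    problems = []
    articles = list(root)
    if not articles:
        problems.append("NEWSPAPER needs at least one ARTICLE")
    for i, article in enumerate(articles, start=1):
        if article.tag != "ARTICLE":
            problems.append(f"child {i} is {article.tag}, expected ARTICLE")
            continue
        # The content model requires exactly these children, in this order.
        tags = [child.tag for child in article]
        if tags != ARTICLE_CHILDREN:
            problems.append(f"ARTICLE {i} has children {tags}")
        missing = REQUIRED_ATTRS - set(article.attrib)
        if missing:
            problems.append(f"ARTICLE {i} lacks attributes {sorted(missing)}")
        unknown = set(article.attrib) - ALLOWED_ATTRS
        if unknown:
            problems.append(f"ARTICLE {i} has undeclared attributes {sorted(unknown)}")
    return problems

# Invented sample documents: GOOD follows the rules, BAD drops AUTHOR and BYLINE.
GOOD = """<NEWSPAPER><ARTICLE AUTHOR="A. Writer">
<HEADLINE>h</HEADLINE><BYLINE>b</BYLINE><LEAD>l</LEAD><BODY>t</BODY><NOTES>n</NOTES>
</ARTICLE></NEWSPAPER>"""
BAD = """<NEWSPAPER><ARTICLE>
<HEADLINE>h</HEADLINE><LEAD>l</LEAD><BODY>t</BODY><NOTES>n</NOTES>
</ARTICLE></NEWSPAPER>"""

print(check_newspaper(GOOD))
print(check_newspaper(BAD))
```

The sketch checks only element order and attributes. It does not check that HEADLINE and the other leaf elements contain text only, which the #PCDATA declarations also require.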

DTDs can also be used to define genres, such as essays, poems, novels

The distinction between document type and genre is fuzzy

Genres in the Humanities

• Primary sources
– Tax records, letters, diaries, paintings, oral history, manuscripts, first editions, etc.
• Secondary sources
– Essays and “monographs” (books)
• Tertiary sources
– Encyclopedias, dictionaries, etc.

Primary Sources

Essays and books are the staple secondary sources

But these can become primary sources too …

Tertiary Sources

Is this a genre?

[The Rotunda]

[Library of Babel]

Is this a portable library or a book?

[Talmud]

Are libraries and books distinct?

If not, are there schemas for libraries?

What about this?

• Trivium
– Grammar
– Rhetoric
– Logic
• Quadrivium
– Arithmetic
– Geometry
– Music
– Astronomy

Does this not form the plan of a library?

[Berners-Lee’s diagram]

Hypertext blurs the distinction between documents and libraries

Instead, we have a docuverse

(or a vast intertext)

The library is one big document

Every document is a little library

Overview

• Today, we consider a set of projects that are built on this premise
– Either as attempts to fulfill it or as reactions to it (because hypertext can be scary)

• We look at specific examples of “digital collections”

• Within the framework defined by Palmer and McGann
– The TRC as an emerging genre of digital scholarship

What is a thematic research collection?

How is it different from a traditional library?

TRCs overcome the problem that libraries scatter content

They consolidate content

Features of the TRC

• electronic
• heterogeneous datatypes
• extensive but thematically coherent
• structured but open-ended
• research oriented
• authored or multi-authored
• interdisciplinary
• collections of digital primary resources

Critical Convergences and Effects

• They coincide with the move away from theory and toward historicism

• They produce a renewed focus on the materiality of text

• They achieve “contextual mass”

• They force collaboration and inter-disciplinarity

• They become laboratories for research

McGann on Secondary Sources

• “[W]hen scholarly journals publish their work online … in electronic form, they open their materials to integration within a scholarly network whose range and power outstrip current paper-based publication. Furthermore, electronic publishing permits scholars to present their work in far greater depth and diversity. Essays can present all their documentary evidence as part of their argument (in notes and appendices, or in electronic links to the original documents).”

Contextual Mass

Instead of building large collections, “digital research libraries should be systematically collecting sources and developing tools that work together to provide a supportive context for the research process.”

Let’s look at some examples and see how they stack up

6 Questions

1. What’s in the collection?
2. How is the collection organized? Any guiding metaphors?
3. How easy is it to find things?
4. How effective is it at achieving contextual mass? How connected are things?
5. What tools does it provide for researchers?
6. How much does it involve users in a community?

Backstory: IATH

• Institute for Advanced Technology in the Humanities
– http://www.iath.virginia.edu

• Established in 1992
• Funded by IBM
• VOTS and RA two founding projects
• VOTS was a demonstration project for IBM; pitched as "a research library in a box, enabling students at places without a large archive to do the same kind of research as a professional historian."

Yea, though I walk through the valley of the shadow of death, I will fear no evil: for thou art with me; thy rod and thy staff they comfort me.

(from Psalm 23)

VOTS Intro

What’s in the site?

• Focused on primary source documents relating to the US Civil War
– Thousands of primary source documents
– Newspapers, letters, diaries, maps, images, gov docs
– Augusta Co, VA and Franklin Co, PA
– 1859 to 1870

How is it organized?

The Library Metaphor

How easy is it to find things?

Quick exercise: find out if the Confederate Army ever made it to Carlisle, PA

How connected are the parts? Does it achieve contextual mass?

Not very connected

Items have few connectors to other items

(e.g. no links in the metadata)

What tools does it provide to researchers?

Tools

• Search and browse
• Timelines
• Animations
– http://valley.lib.virginia.edu/VoS/MAPDEMO/Theater/TheTheater.html

• Resources for using the site

Does the site seek to build a community?

Not internally

The Rossetti Archive

What’s in the site?

• Focused on the works of the Pre-Raphaelite poet and painter Dante Gabriel Rossetti (1828–1882)
– Paintings, poems, letters, etc.

• Also some secondary source material
– Art history and literary criticism

How is the collection organized?

The site is organized as a traditional database

Search, List, Display

How easy is it to find things?

Getting to Bocca Baciata

• Find the painting, Bocca Baciata
• Search [image records]
• What do you do when you get there?
• How is the site structured?

Bocca Baciata 1859

Exercise: Find a painting of Bocca Baciata

Easy, if you know what you are looking for

How connected are the parts? Does it achieve contextual mass?

Some connectivity among parts, but not much.

What tools does it provide to researchers?

Does the site seek to build a community?

The Tibetan Himalayan Digital Archive

What’s in the site?

• A vast collection of Tibetan documents

• An interactive collection of maps

• Videos and images

How is the collection organized?

Cross between a database and a library

Hybrid

How easy is it to find things?

Exercise: Find the city of Lhasa

How connected are the parts? Does it achieve contextual mass?

The site is highly connected

It can be confusing knowing where you are

What tools does it provide to researchers?

Tools

• Interactive map
• Place dictionary
• Thesaurus
• Etc.

Does the site seek to build a community?

Yes

Other IATH Examples

• The Blake Project
– http://www.blakearchive.org/blake/
• The World of Dante
– http://www.worldofdante.org/
• The Chaco Archive
– http://www.chacoarchive.org/cra/

Other Examples

• Princeton Dante Project
– http://etcweb.princeton.edu/dante/index.html
• Perseus Project
– http://www.perseus.tufts.edu/hopper/
• A House Divided
– http://hd.housedivided.dickinson.edu/