Perseus’ Archiving Needs
And What They Mean For Librarians
Preserving Perseus
Data and Behaviors
• What does Perseus have to lose?
• Data
  – If lost, we cannot do anything.
  – The primary text is primary.
• Behavior
  – We lose the ability to make associations
Structure of the Talk
• Perseus’ current and future options for archiving/preserving its data and behaviors
• Use these options to motivate the new skills required of librarians and their emerging new roles
Perseus’ Preservation Options…
• Be Open
  – Hard to maintain a black box
• Distribute for Redundancy
  – Library of Alexandria: Don’t put all your eggs in one basket.
• Use Institutions for Reliability/Quality
  – Library of Alexandria: Lots of quality content
Be Open
Be Open: Data
• Data formats
  – Non-binary for text
    • Images are different
  – Application-independent
  – Easily transformable when possible
    • XML
• Licensing
  – Can other people use this data?
  – Are other people able to create derivative works?
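A minimal sketch of what "application-independent" and "easily transformable" can mean in practice: a text stored as plain XML can be read and reshaped with nothing but a standard library. The element names (`text`, `div`, `l`), the `n` attributes, and the sample lines are placeholders invented for illustration, not Perseus' actual markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element and attribute names are assumptions,
# chosen only to illustrate a text with a book/line citation structure.
SAMPLE = """<text>
  <div type="book" n="1">
    <l n="1">First line of the poem</l>
    <l n="2">Second line of the poem</l>
  </div>
</text>"""


def lines_as_plain_text(xml_source: str) -> list:
    """Flatten the XML into 'book.line text' rows, one per verse line."""
    root = ET.fromstring(xml_source)
    rows = []
    for div in root.iter("div"):
        book = div.get("n")
        for line in div.iter("l"):
            rows.append(f"{book}.{line.get('n')} {line.text.strip()}")
    return rows


if __name__ == "__main__":
    for row in lines_as_plain_text(SAMPLE):
        print(row)
```

Because the source is plain text, the same file could equally be processed with XSLT or another language; no particular application is needed to recover the content.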
Be Open: Behaviors
• Protocol Specifications
  – What does Perseus mean? (semantics)
  – Defining behaviors
    • Browsing by logical citation scheme: CTS protocol
• Perseus’ APIs
  – Open source implementations
  – Let people download these implementations
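To make "browsing by logical citation scheme" concrete, here is a simplified sketch that splits a CTS URN into its parts. It assumes the commonly cited layout `urn:cts:<namespace>:<textgroup>.<work>[.<version>]:<passage>` and ignores finer points such as subreferences. The class and function names are hypothetical, not part of any Perseus API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class CtsUrn:
    """Parts of a CTS URN (illustrative structure, not an official model)."""
    namespace: str
    textgroup: str
    work: Optional[str]
    version: Optional[str]
    passage: Optional[str]


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN on its colon- and period-separated components."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    namespace, work_component = parts[2], parts[3]
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    work_parts = work_component.split(".")
    textgroup = work_parts[0]
    work = work_parts[1] if len(work_parts) > 1 else None
    version = work_parts[2] if len(work_parts) > 2 else None
    return CtsUrn(namespace, textgroup, work, version, passage)


def passage_range(passage: str) -> Tuple[str, str]:
    """Return (start, end) citations; a single citation is its own end."""
    start, _, end = passage.partition("-")
    return start, end or start


if __name__ == "__main__":
    urn = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001:1.1-1.10")
    print(urn.textgroup, urn.work, passage_range(urn.passage))
```

The point of the design is that the citation (book 1, lines 1 to 10) is independent of any particular file layout or edition, which is what lets different implementations serve the same request.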
Distribute For Redundancy
Distributing Data
• Leveraging Geographic Distribution
  – SRB/iRODS
    • Desktop/Web-based GUI
• The more copies, the safer our data will be
  – Perseus lets people download raw data
    • Creative Commons
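Redundant copies only protect the data if a damaged copy can be detected. The sketch below is an assumption about how that might be checked, not something the talk describes: it hashes each copy and reports any that disagree with the majority. The function names are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Hash a file in chunks so large archives need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def divergent_copies(paths) -> list:
    """Return the paths whose digest differs from the most common digest."""
    digests = {str(p): sha256_of(p) for p in paths}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(p for p, d in digests.items() if d != majority)
```

With three or more copies held at different sites, a single corrupted copy shows up as the odd one out and can be replaced from the others.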
Distribute Your Behaviors
• Mirror sites
  – Enables distribution of behaviors
• Distributed computing power
  – Performance gain
• For Perseus’ mission: the more copies, the better!
  – Let people download your specs and implementations.
    • GPL license
Use Institutions For Reliability & Quality
Give Institutions Your Data
• Quality
  – Policies for ingest ensure a standard for the data and metadata
• Leverage Expertise
  – Their job is to archive and preserve data
Give Institutions Your Behaviors
• Institutional repositories can preserve behaviors
  – Fedora
• Forces documentation
  – Specification
  – Implementation
• If using a different implementation
  – Is the specification really implementation-independent?
Skills Perseus Needs from Future Librarians
• Data formats
  – XML
• Manipulating the data
  – XSLT
  – Basic Scripting: Perl, Python, Groovy
• Licensing agreements
  – Creative Commons
  – GPL
• Grid/Distributed Computing
• Investigate Institutional Repositories
  – Fedora