Date post: 18-Dec-2015

Digital Libraries: Study into the features of the DSpace Suite

Devika P. Madalli

Documentation Research and Training Centre, Indian Statistical Institute

Bangalore 560059

Introduction

Digital libraries encompass a whole range of work related to information services, such as:
– Organization of digital information
– Information retrieval
– User interface
– Archiving and preservation
– Services and social issues
– Evaluation and applications to particular areas

Desirable Features of DL Software

• Structured

• Accessible

• Searchable

• Extensible

• Massive

• Heterogeneous

• Persistent

DL’s operation should be examined under…

• Architectural design – Modular and Open

• Backend Database – scalable, robust, data formats

• Network capabilities – web-based and seamless operations, persistent IDs, security and authentication

• Metadata and Interoperability – compatible with world standards such as Dublin Core and OAI-PMH

Technical Issues

• Open source software Vs Commercial OS

• Hardware and peripheral requirements

• Network Components

• Standards – data formats, metadata, network, access, interoperability, encoding

Approaches to Building DL

• Digitization – retro-conversion of non-digital resources to digital

• Digitally born resources – involves inter-conversion to standard formats and storage

Why DSpace Digital Library

DSpace is:

• An open source technology platform that can be customized and whose capabilities can be extended

• A service model for open access and/or digital archiving for perpetual access

• A platform for building an Institutional Repository whose collections are searchable and retrievable on the Web

• A means of making institution-based scholarly material available in digital formats; the collection is open and interoperable

Architecture and System Requirement

The DSpace system is organized into three layers

– The Storage Layer: responsible for physical storage of metadata and content

– The Business Layer: deals with managing the content of the archive, users of the archive (e-people), authorization, and workflow

– The Application Layer: contains components that communicate with the networked world outside the individual DSpace installation, for example the Web user interface and the modules for the metadata harvesting service
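The three layers can be pictured with a minimal sketch. This is an illustration of the layering idea only, not DSpace code: the class names, the in-memory dictionaries and the `submitters` set are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StorageLayer:
    """In-memory stand-in for the physical storage of metadata and content."""
    metadata: dict = field(default_factory=dict)
    content: dict = field(default_factory=dict)

    def save(self, item_id: str, metadata: dict, data: bytes) -> None:
        self.metadata[item_id] = metadata
        self.content[item_id] = data


@dataclass
class BusinessLayer:
    """Manages content and checks authorization before touching storage."""
    storage: StorageLayer
    submitters: set = field(default_factory=set)  # hypothetical list of e-people

    def submit(self, eperson: str, item_id: str, metadata: dict, data: bytes) -> None:
        if eperson not in self.submitters:
            raise PermissionError(f"{eperson} may not submit items")
        self.storage.save(item_id, metadata, data)

    def describe(self, item_id: str) -> dict:
        return self.storage.metadata[item_id]


@dataclass
class ApplicationLayer:
    """Faces the outside world, e.g. a web interface or a harvesting service."""
    business: BusinessLayer

    def render_title(self, item_id: str) -> str:
        return f"Title: {self.business.describe(item_id)['title']}"


storage = StorageLayer()
business = BusinessLayer(storage, submitters={"alice"})
app = ApplicationLayer(business)
business.submit("alice", "item-1", {"title": "Annual Report"}, b"sample bytes")
print(app.render_title("item-1"))
```

Each layer only talks to the one directly beneath it, which is what allows a user interface to be replaced without touching storage.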

Features of a near ideal DL

• Low cost, including all hardware and software components

• Technically simple to install and manage

• Robust

• Scalable

• Open and inter-operable

• Modular

• User friendly

• Multi-user (including both searching and maintenance)

• Multimedia digital object enabled

• Platform independent (including both client and server components), interoperable

DSpace is a joint project of MIT Libraries and Hewlett-Packard Labs

What is DSpace?

• Digital Object management system

• Create, search and retrieve digital objects

• Facilitate preservation of digital objects

• An open source software

• Allows open access and digital archiving

• Allows building Institutional Repositories

H/W and S/W requirements

• UNIX recommended (Java-based program should run on anything)

• Open source, built on Apache web server and Tomcat Servlet engine

• Uses PostgreSQL or Oracle relational database

What can DSpace do?

• Captures
– Digital content in any format directly from creators (e.g. researchers, authors)

• Describes
– Descriptive, technical, rights metadata
– Persistent identifiers

• OAI-PMH version 2.0 compliant
– Allows metadata creation

Possible types of Content

• Preprints, articles

• Postprints

• Technical Reports

• Conference Papers

• Theses/Dissertations

• Datasets
– e.g. statistical, geospatial, scientific

• Images
– visual, scientific, etc.

• Audio files

• Video files

• Digitized library collections

Formats of Content

File Formats

Supported: the repository administrator can tell submitters which file formats the organization will continue to support in the future

Known: the system recognizes the format but cannot guarantee full support

Unsupported: the system cannot recognize the format; such files are listed as "application/octet-stream" (Unknown)
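The three support levels amount to a lookup with a fallback. The sketch below is an assumption-laden illustration, not DSpace's format registry: the `FORMAT_REGISTRY` table and the levels assigned to each MIME type are made up for the example, and a real installation sets them by local policy.

```python
# Hypothetical registry: which formats are "supported" or "known"
# is a local policy decision, so these entries are illustrative only.
FORMAT_REGISTRY = {
    "application/pdf": "supported",
    "text/xml": "supported",
    "application/msword": "known",
}


def classify(mime_type: str) -> tuple:
    """Return (recorded MIME type, support level) for a submitted file."""
    level = FORMAT_REGISTRY.get(mime_type)
    if level is None:
        # Unrecognized formats fall back to the generic binary type.
        return ("application/octet-stream", "unsupported")
    return (mime_type, level)


for submitted in ("application/pdf", "application/msword", "application/x-mystery"):
    print(submitted, "->", classify(submitted))
```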

Information Model

• Communities – Departments, Labs, Research Centers, Schools…

• Collections

• Items

• Files (bitstreams)
– Multiple formats of the same content
– Complex objects made up of multiple files
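The hierarchy above can be sketched as nested data classes. The class and field names follow the terms on the slide, but the structure is a simplified assumption and not DSpace's actual schema; the sample community, collection and item are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    name: str
    mime_type: str


@dataclass
class Item:
    title: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    name: str
    collections: List[Collection] = field(default_factory=list)

    def count_bitstreams(self) -> int:
        """Total number of files held anywhere under this community."""
        return sum(
            len(item.bitstreams)
            for collection in self.collections
            for item in collection.items
        )


# One item carrying the same content in two formats (hypothetical data).
thesis = Item(
    "Sample thesis",
    [
        Bitstream("thesis.pdf", "application/pdf"),
        Bitstream("thesis.doc", "application/msword"),
    ],
)
theses = Collection("Theses and Dissertations", [thesis])
department = Community("Sample Department", [theses])
print(department.count_bitstreams())  # 2
```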

Intellectual Property

• Click-through license during submission

• Grants DSpace non-exclusive right to acquire, manage, preserve, distribute the item

• Does not grant DSpace copyright

• Copy of license stored with item

Goodies

• Modular architecture, well-defined APIs

• 100% open source
– Programmed in Java
– RDBMS and SQL for metadata

• CNRI “handles” for persistent identifiers

• OpenURL linking

• OAI-PMH for exposing metadata
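A handle is written as a prefix and a suffix separated by a slash, and can be resolved through the public proxy at hdl.handle.net. The helper below is a small sketch of how such a persistent link is put together; the function names are hypothetical and the prefix and suffix values are placeholders, not identifiers of any real item.

```python
def handle_url(prefix: str, suffix: str, resolver: str = "https://hdl.handle.net") -> str:
    """Build a resolvable URL for the handle 'prefix/suffix'."""
    return f"{resolver}/{prefix}/{suffix}"


def split_handle(handle: str) -> tuple:
    """Split 'prefix/suffix' at the first slash; the suffix may itself contain slashes."""
    prefix, _, suffix = handle.partition("/")
    if not prefix or not suffix:
        raise ValueError(f"not a handle: {handle!r}")
    return prefix, suffix


# Placeholder values for illustration only.
print(handle_url("123456789", "42"))
print(split_handle("123456789/42"))
```

The URL stays stable even if the repository moves to a new server, because only the resolver's record for the handle needs to change.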

Backend Technology

• Apache, Tomcat, OpenSSL/mod_ssl

• Java

• PostgreSQL/Oracle

• CNRI Handle System 5 (persistent ids)

• Lucene Search Engine

Standards

• Dublin Core only
– Descriptive metadata only

• OAI-PMH v 2.0 (Open Archives Initiative Protocol for Metadata Harvesting)

• Unicode compliant
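An OAI-PMH 2.0 request is an ordinary HTTP request whose query string carries one of six verbs plus its arguments. The sketch below only builds such URLs and sends nothing; `BASE_URL` is a hypothetical endpoint, and the actual path depends on how a given repository is deployed.

```python
from urllib.parse import urlencode

# The six verbs defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Return the URL for an OAI-PMH request with the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


BASE_URL = "https://repository.example.org/oai/request"  # hypothetical endpoint

identify_url = oai_request_url(BASE_URL, "Identify")
# oai_dc is the unqualified Dublin Core format every OAI-PMH repository must offer.
list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
print(identify_url)
print(list_url)
```

A harvester would fetch `list_url`, parse the XML response, and repeat the request with the returned resumption token until the list is exhausted.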

Capabilities

• Exports in XML format

• Supports crosswalks through OAI-PMH
– DC (Dublin Core)
– Qualified DC
– METS (Metadata Encoding and Transmission Standard)
– MODS (Metadata Object Description Schema, a sibling of MARCXML)

• Can be extended to any Metadata Schema
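To make the XML export concrete, here is a sketch that serializes a few unqualified Dublin Core fields into the `oai_dc` container used by OAI-PMH. It relies only on the Python standard library and is not DSpace's export code; the `to_oai_dc` function is hypothetical, and the field values are taken from this presentation's own title slide.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Ask ElementTree to use the conventional prefixes when serializing.
ET.register_namespace("dc", DC_NS)
ET.register_namespace("oai_dc", OAI_DC_NS)


def to_oai_dc(fields: dict) -> str:
    """Serialize {element name: [values]} as an oai_dc XML record."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name, values in fields.items():
        for value in values:
            ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


record_xml = to_oai_dc(
    {
        "title": ["Digital Libraries: Study into the features of the DSpace Suite"],
        "creator": ["Devika P. Madalli"],
        "subject": ["Digital libraries"],
    }
)
print(record_xml)
```

A crosswalk to another schema such as MODS would map the same field dictionary onto a different set of element names.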

Customization

• Screens (Manakin)

• E-mails

• Any language interface

• Metadata

• Input-forms

• Display of results

• Fields to be indexed

• Access restrictions

• License (in addition to Creative Commons)

Advanced Features

• Grid compliant (storage)

• LDAP authentication

• Usage statistics generation

• SFX Server integration

• RSS (Really Simple Syndication)

• Item recommendation to a friend

• Use of thesaurus (though not OWL/SKOS/RDF)

• Full-text indexing of PDF and MS Word files

Important Sites

• http://www.dspace.org

• http://www.sourceforge.net/projects/dspace

• http://wiki.dspace.org

• http://mailman.mit.edu/mailman/listinfo/dspace-general

• http://lists.sourceforge.net/lists/listinfo/dspace-tech

• http://lists.sourceforge.net/lists/listinfo/dspace-devel

DRTC Sites

• https://drtc.isibang.ac.in (Librarians' Digital Library)

• http://drtc.isibang.ac.in/dlrg (Discussion Forum)

• http://drtc.isibang.ac.in/sdl (Harvester in LIS)

• http://drtc.isibang.ac.in

• http://drtc.isibang.ac.in/blog

Questions?

Thank You

[email protected]

