23rd Mar 2012 National Seminar on Managing ETDs

Creation and Management of ETD Archive at IISc Bangalore using Open Source Software

By Filbert Minj

National Centre for Science Information, Indian Institute of Science

Bangalore – 560 012 (E-mail: filbert@ncsi.iisc.ernet.in)

Outline of the presentation

Introduction to Indian Institute of Science (IISc) and etd@IISc

Software selection for etd@IISc

Repository implementation steps

Thesis submission and archival

Handle server configuration / OAI compliance

Challenges encountered in implementation

Backup and restore mechanism

Access policies

Conclusion

About IISc

About IISc cont…

Academic and research institution

2500+ active researchers, including 500+ faculty

2000+ research publications per year

200+ M.Sc. (Engineering) and Ph.D. theses per year

About ePrints@IISc

Research publications repository of IISc (http://eprints.iisc.ernet.in)

Pioneering efforts towards the cause of Open Access Initiative in India

Started in 2002; holds 32878+ publications as of now and is growing steadily

Accessed significantly from around the world

About etd@IISc

Digital repository of theses and dissertations of the Indian Institute of Science

Started in June 2004 as a student project; the service was launched towards the end of February 2005

Accessible at http://etd.ncsi.iisc.ernet.in/

The thesis repository holds 1567 theses as of now and is growing slowly

The IISc library has taken the initiative to archive old theses of the Institute

With the formulation of the Institute's policy, submissions are expected to grow steadily

Need for etd@IISc

A centralized system for managing and presenting the research output of the Institute in an organized fashion

Could facilitate easy, fast, and open access to the intellectual output of the Institute

Preservation and long-term access to the scholars' research output

Through OAI compliance and indexing by Google, theses in etd@IISc can be found immediately in global indexing and search services

Why DSpace?

Largest community of users and developers worldwide

Free open source software

Completely customizable to fit your needs

Used by educational, government, private and commercial institutions

Can be installed out of the box

Quick installation on your computer with the DSpace Live CD (http://cadair.aber.ac.uk/dspace/handle/2160/565)

Can manage and preserve all types of digital content

What is DSpace?

DSpace is a platform that allows you to capture items in any format: text, video, audio, and data.

It distributes them over the web.

It indexes your work, so users can search and retrieve your items.

It preserves your digital work over the long term.

Repository structure

The repository is organized as communities and collections

Prerequisite Software (latest)

Latest release of DSpace, version 1.8.2

UNIX-like OS or Microsoft Windows; Linux is recommended

Oracle Java JDK 6 (standard SDK is fine, you don't need J2EE)

Apache Maven 2.2.x or higher (Java build tool)

Apache Ant 1.8 or later (Java build tool)

Relational database: PostgreSQL or Oracle

Servlet engine: Apache Tomcat 5.5 or 6, Jetty, Caucho Resin or equivalent

Repository implementation steps

Prototype repository

Formulate key requirements

Metadata addition for compliance with ETD-MS

Customization to meet the requirements

Creation of community and collections

Thesis submission and archival

Backup and restore mechanism

Prototype Repository

A prototype ETD repository using DSpace 1.2 was set up to understand:

The system

Workflow

Local requirements

Compliance with standards

Value addition to be done

Key requirements

The prototype setup helped us arrive at the following key requirements:

The system should support only post-approval (accepted) online submission of theses

Reflection of IISc divisions and departments as communities and collections

Compliance with the ETD-MS metadata standard

Validation of student registration using the students' record database

Automatic community and collection assignment to students upon registration
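
The registration check in the requirements above could look roughly like the sketch below. It is only an illustration: the record layout, the sample SRNo and the function name are assumptions, not the actual etd@IISc code.

```python
# Hypothetical local copy of the students' record database, keyed by SRNo.
# The record layout and the sample values are assumptions for illustration.
STUDENTS = {
    "SR-0001": {"name": "Shwetha", "email": "shwetha@mcbl.iisc.ernet.in"},
}


def is_valid_registration(sr_no: str, email: str) -> bool:
    """Return True only if the SRNo exists and the e-mail matches the record."""
    record = STUDENTS.get(sr_no.strip())
    if record is None:
        return False
    return record["email"].lower() == email.strip().lower()
```

A request that fails this check would be the one routed to the administrator instead of being completed automatically.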

Key requirements cont..

Automatic metadata assignment and validation during online submission, e.g. author's details (extracted from the students' database)

Support for assigning subject categories

Metadata and full-text quality assessment by library staff

E-mail notifications to concerned parties during submission, approval and archiving processes

etd@IISc workflow

[Workflow diagram. Labels: Registration request; SRNo / e-mail check against the local copy of the students' database; Request valid? (Yes / No); Request to Admin; Registration completed; Login; Student workspace; Reviewer (library staff); Submission OK? (Yes: approve / No: reject); Archive; Academic Section; Advisors (thesis guide)]

Compliance with ETD-MS metadata

thesis.degree.name: Name of the degree (Ph.D, MSc Engg. etc.)

thesis.degree.level: Level of the degree (Master, Doctorate)

thesis.degree.discipline: Discipline of the degree (Science, Engineering)

thesis.degree.grantor: Institution granting the degree (IISc)
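
As a rough illustration of how these four fields attach to one thesis record, the sketch below builds them and reports any that are missing. The function names are hypothetical; only the field names and sample values come from the slide.

```python
REQUIRED = ("thesis.degree.name", "thesis.degree.level",
            "thesis.degree.discipline", "thesis.degree.grantor")


def etdms_fields(degree_name: str, degree_level: str, discipline: str,
                 grantor: str = "IISc") -> dict:
    """Build the four thesis.degree.* entries for one thesis record."""
    return {
        "thesis.degree.name": degree_name,
        "thesis.degree.level": degree_level,
        "thesis.degree.discipline": discipline,
        "thesis.degree.grantor": grantor,
    }


def missing_fields(record: dict) -> list:
    """List the ETD-MS degree fields that are absent or empty in a record."""
    return [key for key in REQUIRED if not record.get(key)]
```

For example, `etdms_fields("Ph.D", "Doctorate", "Engineering")` yields a complete record, so `missing_fields` returns an empty list for it.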

Customization

Look and feel

Registration process

Submission fields added, e.g. advisor, provision for subject classification

E-mail notification at all stages of the submission process

Display of the total number of theses in the repository

Customization: Automatic Association to a collection upon registration

Normally an administrator associates a user (eperson) with a collection for submission

etd@IISc automatically assigns a user to a collection

e.g. e-mail to be registered: shwetha@mcbl.iisc.ernet.in

The department code mcbl (collection) is identified from the e-mail address

shwetha@mcbl.iisc.ernet.in -> Microbiology and Cell Biology (mcbl)
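
The mapping from e-mail address to collection can be sketched as below. This is a minimal illustration, assuming the department code is the first label of the host name; the lookup table holds only the one entry shown on the slide, and the function name is hypothetical.

```python
from typing import Optional

# Department code -> collection name. Only the mcbl entry comes from the
# slide; a real table would list every department.
COLLECTIONS = {"mcbl": "Microbiology and Cell Biology"}

INSTITUTE_DOMAIN = "iisc.ernet.in"


def collection_for_email(email: str) -> Optional[str]:
    """Return the collection name for a departmental address, else None."""
    local, sep, host = email.strip().lower().partition("@")
    if not sep or not local:
        return None
    dept, dot, domain = host.partition(".")
    if not dot or domain != INSTITUTE_DOMAIN:
        return None
    return COLLECTIONS.get(dept)
```

Returning None for unknown departments or outside domains leaves room for the manual association by an administrator mentioned above.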

Communities and Collections of etd@IISc

A division is a community, which has many departments

A department is a collection

A thesis from a department goes to the respective collection

No subcommunities

Create Community

Communities

Create Collection

Create Collection

Thesis submission and Archival step …

Workflow Steps etd@IISc

Accept/Reject/Edit Metadata Step by Library staff

Handle Server Configuration / Open Archives Initiative (OAI) compliance

etd@IISc creates a persistent identifier for every submitted thesis

The handle prefix provided by CNRI is '2005', e.g. http://hdl.handle.net/2005/140 is the URL of a thesis abstract page

The repository is OAI compliant; the base URL is http://etd.ncsi.iisc.ernet.in/oai/request
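
A harvester talks to the base URL by appending standard OAI-PMH query parameters, and handles resolve through hdl.handle.net. The sketch below only builds the URLs and makes no network call; the helper names are hypothetical, while the base URL, the prefix and the sample handle come from the slide.

```python
from urllib.parse import urlencode

BASE_URL = "http://etd.ncsi.iisc.ernet.in/oai/request"  # from the slide
HANDLE_PREFIX = "2005"                                   # from the slide


def oai_request_url(verb: str, **params: str) -> str:
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb, **params}
    return BASE_URL + "?" + urlencode(query)


def handle_url(local_id: int) -> str:
    """Return the resolver URL for a thesis handle, e.g. 2005/140."""
    return f"http://hdl.handle.net/{HANDLE_PREFIX}/{local_id}"
```

For instance, `oai_request_url("ListRecords", metadataPrefix="oai_dc")` gives the request a harvester would use to list records in Dublin Core.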

Challenges encountered in implementation…

Communities and Collections Strengths

A patch was developed by us (NCSI) for displaying communities and collections strengths

This feature was accepted and is now part of the DSpace code base (v1.2.2 onwards)

Displays the total number of theses in etd@IISc
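
Conceptually, a "strength" is the item count of a collection, rolled up to its community and then to the whole repository. The sketch below shows that roll-up on invented data; it is not the NCSI patch, and the division and department names and counts are placeholders.

```python
# Hypothetical structure: community -> {collection: number of archived theses}.
# Names and counts are invented for illustration only.
REPOSITORY = {
    "Division A": {"Department 1": 12, "Department 2": 30},
    "Division B": {"Department 3": 7},
}


def community_strengths(repo: dict) -> dict:
    """Sum the collection counts under each community."""
    return {name: sum(collections.values()) for name, collections in repo.items()}


def total_theses(repo: dict) -> int:
    """Total number of theses across all communities."""
    return sum(community_strengths(repo).values())
```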

Challenges encountered in implementation…

Browse views for subject fields and thesis guide

Code was developed at NCSI

Accepted and is part of the DSpace code base

This feature is now available through configuration

Challenges encountered in implementation…

Creating metadata submission fields

Earlier this meant editing JSP files and making changes in Java servlet files

Pretty simple now (edit input-forms.xml)
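
To give an idea of what one entry in input-forms.xml contains, the sketch below generates a single field definition for an advisor box. The element names follow the DSpace 1.x input-forms.xml layout as best recalled and should be checked against the file shipped with the installed version; the choice of dc.contributor.advisor and the helper name are assumptions.

```python
import xml.etree.ElementTree as ET


def make_field(element: str, qualifier: str, label: str,
               input_type: str = "onebox", repeatable: bool = False) -> ET.Element:
    """Build one <field> element in the style of DSpace's input-forms.xml."""
    field = ET.Element("field")
    ET.SubElement(field, "dc-schema").text = "dc"
    ET.SubElement(field, "dc-element").text = element
    ET.SubElement(field, "dc-qualifier").text = qualifier
    ET.SubElement(field, "repeatable").text = "true" if repeatable else "false"
    ET.SubElement(field, "label").text = label
    ET.SubElement(field, "input-type").text = input_type
    return field


# Assumed example: a repeatable one-line box for the thesis advisor.
advisor = make_field("contributor", "advisor", "Advisor", repeatable=True)
xml_text = ET.tostring(advisor, encoding="unicode")
```

The resulting fragment would be pasted inside a form page in input-forms.xml, which is the kind of change that no longer needs JSP or servlet edits.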

Challenges encountered in implementation…

Subject classification

To enable the submitters to include their thesis under the most appropriate subject headings, etd@IISc provides a classification scheme based on Dissertation Abstracts International (DAI)

Challenges encountered in implementation…

Pre-filled submission text boxes, e.g.

Author text box

Identifier (SRNo of a student)

Thesis degree name (MSc. Engg, Ph.D)

Thesis degree level (Master, Doctoral)

Rights

Thesis grantor (IISc)
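
Pre-filling amounts to deriving these values from the student's record before the form is shown. The sketch below is a minimal illustration with a hypothetical record layout and function name; the rights text is left out because the slides do not give its wording.

```python
# Degree name -> degree level, using the values shown on the slide.
DEGREE_LEVEL = {"MSc. Engg": "Master", "Ph.D": "Doctoral"}


def prefill_submission(student: dict) -> dict:
    """Return the submission fields that can be filled in before the form is shown."""
    degree = student["degree"]
    return {
        "author": student["name"],
        "identifier": student["sr_no"],
        "thesis.degree.name": degree,
        "thesis.degree.level": DEGREE_LEVEL[degree],
        "thesis.degree.grantor": "IISc",
    }
```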

Challenges encountered in implementation…

Upgrading from lower versions to a higher version

Reason: a lot of customization has to be taken care of

Version compatibility of the PostgreSQL database

Backup/restore of the database

Challenges Always

Self archiving

Archiving back volumes of ETD

Proper metadata tagging

Back up and restore mechanism

Database, assetstore, configuration files, Web pages (jspui, xmlui)

Scripts written for backup (rsync tool)

Scripts scheduled with the crontab utility

In case of a system crash, the backup is restored to a mirror site and etd@IISc will be up in seconds
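
A nightly backup along these lines could be assembled as in the sketch below, which only builds the command lines and a crontab entry without running anything. Everything specific in it is an assumption: the paths, the database name, the mirror host, and the use of pg_dump for the database dump (the slides name only rsync and crontab).

```python
from typing import List

# All paths, the database name and the mirror host are hypothetical.
DSPACE_DIR = "/dspace"
DB_NAME = "dspace"
DUMP_FILE = "/backup/dspace.sql"
MIRROR = "backup@mirror.example.org:/backup/etd/"


def backup_commands() -> List[List[str]]:
    """Commands a nightly job could run: dump the database, then rsync everything."""
    return [
        ["pg_dump", "-f", DUMP_FILE, DB_NAME],
        ["rsync", "-a", DUMP_FILE, MIRROR],
        ["rsync", "-a", f"{DSPACE_DIR}/assetstore", MIRROR],
        ["rsync", "-a", f"{DSPACE_DIR}/config", MIRROR],
        ["rsync", "-a", f"{DSPACE_DIR}/webapps", MIRROR],
    ]


def crontab_line(script: str, hour: int = 2, minute: int = 0) -> str:
    """Format a crontab entry that runs the script once a day."""
    return f"{minute} {hour} * * * {script}"
```

Each command list could be passed to `subprocess.run` from the scheduled script; keeping the dump and the file trees on the mirror is what allows a quick switch-over.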

Access Policies

Access by registration

Abstract available to everyone

Registration allowed only for the Institute's users
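
Read as rules, these policies could be expressed as in the sketch below. It assumes that "access by registration" refers to the full text and that Institute users are recognised by an address under iisc.ernet.in; both readings, and the function names, are assumptions.

```python
INSTITUTE_DOMAIN = "iisc.ernet.in"  # taken from the addresses shown in the slides


def may_register(email: str) -> bool:
    """Registration is open only to addresses under the Institute's domain."""
    host = email.strip().lower().rpartition("@")[2]
    return host == INSTITUTE_DOMAIN or host.endswith("." + INSTITUTE_DOMAIN)


def may_view(part: str, registered: bool) -> bool:
    """Abstracts are open to everyone; anything else needs a registered login."""
    if part == "abstract":
        return True
    return registered
```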

Conclusion

etd@IISc is a digital repository of theses and dissertations of IISc

Facilitates better means to capture, store, process, and disseminate the intellectual output of IISc

A prototype repository was used to formulate key requirements

Implemented and customized to meet our requirements

We are observing the various operational implications of the repository and are very keen to incorporate further improvements

Thanks for listening

Any questions?