Hong Kong University of Science & Technology Library
Developing the Digital Institutional Repository at HKUST
Diana Chan, Head of Reference
KT Lam, Head of Systems
HKUST Library
December 9, 2003
Outline of Presentation
1. Why Create the Institutional Repository?
2. Demonstration
3. How Did HKUST Develop It?
4. Challenges
5. IR Software Evaluation
6. DSpace Implementation at HKUST
7. Q&A
1. Why Create the Institutional Repository?
What is an Institutional Repository (IR)?
A “digital collection capturing and preserving the intellectual output of a single or multi-university community”.
- Adapted from “The case for institutional repositories: a SPARC position paper” prepared by Raym Crow.
- <http://www.arl.org/sparc/IR/ir.html>
1. Why Create the IR?
Budapest Open Access Initiative <http://www.soros.org/openaccess/index.shtml>
Recommends 2 Strategies:
1. Self-archiving in Open Electronic Archives
2. Open Access Journals
1. Why Create the IR?
Dual Open-Access Strategy <http://www.ecs.soton.ac.uk/~harnad/Temp/berlin.htm>
BOAI-2 ("gold"): Publish your article in a suitable open-access journal whenever one exists.
BOAI-1 ("green"): Otherwise, publish your article in a suitable toll-access journal and also self-archive it.
1. Why Create the IR?
Why have an IR at HKUST?
To create a permanent record of the scholarly output of HKUST
- No access to some scholarly works published by our own faculty
- Collections of working papers, technical reports, research reports flowing around
- Some of our scholarly works are in the public domain
1. Why Create the IR?
Why have an IR at HKUST?
To help the international Open Access effort, because the mission of disseminating knowledge is only half complete if it is not widely and readily available to society.
- Adapted from the Berlin Declaration
<http://www.zim.mpg.de/openaccess-berlin/berlindeclaration.html>
1. Why Create the IR?
The Contribution Must Satisfy 2 Conditions:
The author…grants to all users a free…right of access to, and a license to copy, use, distribute, transmit and display the work publicly …
A complete version of the work is deposited in…at least one online repository
- From the Berlin Declaration
2. Demonstration
HKUST Institutional Repository
<http://repository.ust.hk/>
- DSpace interface
- Sample record
- Submission form
- Search in OAIster
Collection Type and Size
Communities: 18
Collections: 37
Book chapters: 1
Conference papers: 85
Journal articles: 66
Patents: 62
Presentations: 40
Preprints: 12
Technical reports: 109
Theses: 110
Working papers: 35
Miscellaneous: 6
Total: 520
3. How Did HKUST Develop It?
<http://library.ust.hk/staffman/ref-man/IR/ir.html#stages>
1. Planning & Policies
2. Technical Developments
3. Harvesting and Promotion
4. Work Teams
5. Negotiations with Publishers
3.1 Planning and Policies
Task Force - software, scope, policies, database structure, problems, action plans.
Information Services Committee – guidelines on different types of publications, publishers’ policies, data formats, faculty concerns.
Library Administrative Committee – final approvals.
3.2 Technical Developments
Will be discussed by KT Lam in parts 5 & 6
3.3 Harvesting and Promotion
Within HKUST:
1st Stage: Prototype: 105 Computer Science Technical Reports
2nd Stage: Target Group: Faculty who already posted their publications on the Web
- Emailed 80. 49 agreed.
- Harvested 144 documents.
3.3 Harvesting and Promotion
Within HKUST:
3rd Stage: Target Group: All Faculty
- Emailed all to encourage direct submission.
- 2 documents submitted.
- Notes from the Library
4th Stage: Target Group: All Faculty
- Emailed all telling which publishers allow post-refereed self-archiving (IEEE, ACM, Emerald, SPIE…).
- 3 documents submitted.
3.3 Harvesting and Promotion
In Cyberspace:
- Harvested 53 US Patents
- Harvested 21 journal articles from Emerald
- Harvested 10 articles from DOAJ
- Joined OAIster
3.3 Harvesting and Promotion
Planned:
- Will harvest conference proceedings held at HKUST and published by HKUST
- Will cover PhD theses with signed permissions
- Will contact departments for preprints, working papers, technical reports, etc.
- Will contact faculty whose publications have not been posted
- Departmental visits
3.4 Work Teams
Subject Librarians
Data Entry Team
3.4 Work Teams – Subject Librarians
1. Liaise with Faculty
2. Check Faculty’s Publication Lists
3. Harvest Documents
4. Ascertain Publishers’ Policies
5. Verify Document Versions
6. Do Indexing
3.4 Work Teams – Data Entry Team
1. Verify and Convert PDF Documents
2. Data Entry Using Submission Form
3. Set PDF Document Security & Properties
4. Create Folders & Upload Files
Flowchart on Data Entry
1. Get indexed documents from librarians
2. Screen and Convert Files
3. Input Data
4. Check for Errors
5. Set PDF Document Security & Properties
6. Final Check
7. Group to Different Folders
8. Define Communities & Collections in DSpace & Upload Files
3.5 Negotiations with Publishers
Collection Development Librarian wrote to: INFORM, ProQuest, Wiley, Springer, IEEE, AAAS, Elsevier
Result: No good news yet.
4. Challenges
Faculty:
- Low awareness of Open Access Initiative (OAI)
- Concern over copyright issues
- Apathy in direct submission
- Lack of willingness to negotiate on non-exclusive rights and to publish in open access journals
- Lack of willingness to provide the right versions of documents (pre- or post-refereed)
- Only a small % of scholarly work can be archived
4. Challenges
Institution:
- Needs to mandate the deposit of all research outputs in the Institutional Repository
- Needs to give financial support to faculty who submit papers to open access journals
4. Challenges
Publishers:
- In the RoMEO project, only 34 out of 80 allow some sort of archiving
- Many have no policy (Camford, Genetics Society of America)
- Many have an unclear policy
- Some:
  - Decline to give permissions (Springer, AAAS)
  - Give no response (INFORM)
  - Give a wrong answer (Wiley)
- Need to include self-archiving in license agreements with publishers
4. Challenges
Library continues to:
- Provide support for university research self-archiving
- Promote the IR
- Educate users and faculty about the IR
- Showcase the IR
- Find champions and partners among faculty
- Seek institutional mandate and support
- Harvest documents
5. IR Software Evaluation
1. Background
2. EPrints
3. DSpace
4. Why Did We Choose DSpace?
5. Evaluation Guide
6. Other Software and Selection Criteria
5.1 Background
Institutional repository software - also known as institutional archive-creating software, or digital repository software.
HKUST Library started IR software evaluation in late December 2002.
Two products were evaluated: EPrints and DSpace.
Decided to use DSpace in mid-February 2003.
5.2 EPrints
<http://software.eprints.org/>
Developed by the University of Southampton.
The very first freely available institutional repository software; since 2000.
GNU software, thus, open source.
Has the largest installed base.
Written in Perl, with MySQL and Apache.
5.3 DSpace
<http://www.dspace.org/>
Jointly developed by MIT Libraries and Hewlett-Packard Company.
Open source available since late December 2002, after two years of development.
Written in Java, with PostgreSQL, Lucene, and Apache/Tomcat.
Still under development.
5.4 Why Did We Choose DSpace?
DSpace was developed based on the experience gained by EPrints.
It has a well-defined data model: Community + Collection + Item + Metadata + Bundle + Bitstream
UTF-8 capable.
Well-organized web interface.
Metadata in Dublin Core format.
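As a rough illustration of how the levels of this data model nest, the sketch below uses plain Python dataclasses. It is not DSpace code (DSpace is written in Java); the class fields, the sample community name, and the sample file are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bitstream:
    """A single stored file, e.g. one PDF."""
    name: str
    size_bytes: int


@dataclass
class Bundle:
    """A group of related bitstreams belonging to one item."""
    name: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Item:
    """One archived work: Dublin Core metadata plus its bundles."""
    metadata: Dict[str, List[str]] = field(default_factory=dict)
    bundles: List[Bundle] = field(default_factory=list)


@dataclass
class Collection:
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    name: str
    collections: List[Collection] = field(default_factory=list)

    def item_count(self) -> int:
        """Count the items in all collections of this community."""
        return sum(len(c.items) for c in self.collections)


# Hypothetical sample data, for illustration only.
report = Item(
    metadata={"title": ["Sample technical report"], "type": ["Technical Report"]},
    bundles=[Bundle("ORIGINAL", [Bitstream("report.pdf", 120_000)])],
)
community = Community(
    "Computer Science",
    [Collection("Technical Reports", [report])],
)
```

Counting items per community in this sketch is a one-line traversal (`community.item_count()`), which only illustrates the nesting and says nothing about how DSpace itself stores or counts items.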
5.5 Evaluation Guide
“A Guide to Institutional Repository Software” by Open Society Institute
<http://www.soros.org/openaccess/software/OSI_Guide_to_Institutional_Repository_Software_v1.htm>
5.6 Other Software & Selection Criteria
Other IR Software:
- CDSware – from CERN.
- I-TOR – from Netherlands Institute for Scientific Information Services.
- MyCoRe – from University of Essen.
Selection Criteria:
- Open source.
- Complies with OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting).
- Currently released and publicly available.
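To make the OAI-PMH criterion more concrete: a harvester talks to a compliant repository through plain HTTP requests that carry a `verb` argument. The sketch below only builds such request URLs and does not contact any server. The base URL is a placeholder, not the HKUST endpoint; `ListRecords`, `Identify`, and the `oai_dc` metadata prefix come from the OAI-PMH specification.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, arguments=None):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    params.update(arguments or {})
    return f"{base_url}?{urlencode(params)}"


# Placeholder base URL, assumed for illustration only.
BASE_URL = "http://repository.example.org/oai/request"

identify_url = oai_request_url(BASE_URL, "Identify")
list_records_url = oai_request_url(
    BASE_URL, "ListRecords", {"metadataPrefix": "oai_dc"}
)
```

A service such as OAIster would issue requests of this shape and parse the XML responses, which is why OAI-PMH compliance matters for making the repository's content discoverable.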
6. DSpace Implementation at HKUST
1. DSpace Server
2. Problems
3. Limitations
6.1 DSpace Server
<http://repository.ust.hk/>
PC with Pentium 4, 2.4 GHz, 1 GB RAM
RedHat Linux, with standalone Tomcat, PostgreSQL database, and Lucene search engine.
DSpace Version 1.1.1.
Live since late February 2003.
6.2 Problems
Faculty Submission Form
- DSpace’s built-in submission interface is too complicated.
- We have to develop our own submission form.
- Then use DSpace’s Item Importer to load the data.
CJK Search Failure
- Fixed by modifying the DSpace Java source code.
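For the custom submission form, one plausible bridge to the Item Importer is a small script that writes each submitted record as an import directory. The sketch below assumes the "simple archive format" described in the DSpace documentation (one directory per item holding a `dublin_core.xml` file and a `contents` file listing the files to load). Whether DSpace 1.1.1 expects exactly this layout, and what the HKUST form actually produced, are not stated in the slides. The function name, field values, and file names are hypothetical.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_item_dir(item_dir, metadata, filenames):
    """Write one item directory: dublin_core.xml plus a contents file.

    metadata is a list of (element, qualifier, value) tuples.
    filenames lists the document files that belong to the item.
    """
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)

    root = ET.Element("dublin_core")
    for element, qualifier, value in metadata:
        dcvalue = ET.SubElement(
            root, "dcvalue", {"element": element, "qualifier": qualifier}
        )
        dcvalue.text = value
    ET.ElementTree(root).write(
        str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
    )

    # One file name per line; the files themselves would be copied alongside.
    (item_dir / "contents").write_text("\n".join(filenames) + "\n", encoding="utf-8")


# Hypothetical record, written to a temporary directory for demonstration.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "item_000"
    write_item_dir(
        demo_dir,
        [("title", "none", "Sample technical report"),
         ("contributor", "author", "Chan, Tai Man")],
        ["report.pdf"],
    )
    written_files = sorted(p.name for p in demo_dir.iterdir())
    parsed_root = ET.parse(str(demo_dir / "dublin_core.xml")).getroot()
    first_title = parsed_root.find("dcvalue[@element='title']").text
```

Writing the XML as UTF-8 keeps non-English metadata intact on its way into the importer, which fits the UTF-8 capability noted for DSpace.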
6.2 Problems
CNRI Handle
- Required registration at CNRI for a handle prefix.
- Our prefix is 1783.1.
Custom Authentication
- Added Java code to query HKUST’s LDAP server.
Handling of Non-English Characters
- Uses the approach adopted in our Electronic Theses Database.
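To show what the handle prefix is used for: a handle is the prefix, a slash, and a locally assigned suffix, and the public proxy at hdl.handle.net resolves it to the item's current location. The sketch below builds and splits such identifiers. The prefix 1783.1 is from the slide; the suffix `42` is a made-up example, not a real HKUST item.

```python
HANDLE_PREFIX = "1783.1"               # HKUST's prefix, as stated on the slide
HANDLE_PROXY = "http://hdl.handle.net"  # public Handle System proxy


def make_handle(suffix, prefix=HANDLE_PREFIX):
    """Combine a prefix and a local suffix into a handle."""
    return f"{prefix}/{suffix}"


def handle_url(handle):
    """Return the resolvable proxy URL for a handle."""
    return f"{HANDLE_PROXY}/{handle}"


def split_handle(handle):
    """Split a handle into (prefix, suffix) at the first slash."""
    prefix, _, suffix = handle.partition("/")
    return prefix, suffix


# Hypothetical suffix, for illustration only.
example_handle = make_handle("42")
example_url = handle_url(example_handle)
```

Because citations use the handle rather than the server address, links remain valid if the repository later moves to a different host.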
6.2 Problems
Server Hanging Problem
Other Software Bugs
6.3 Limitations
Flat Community + Collection Structure
- 2 levels only, not deep enough.
Linked Collection
- A collection that belongs to more than one community.
Unable to Cross Search
- Search multiple collections from different communities.
6.3 Limitations
Query Syntax Not Apparent to Users, e.g.
  +water +rapid [for exact word match]
  "vapor generator" [for phrase search]
Limited Capability on Sorting Search Results.
Cannot Display the Number of Items in the Repository, in a Community, and in a Collection.
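One way to hide the query syntax from users would be a simple search form that assembles the query string for them. The sketch below produces strings in the style of the two examples above: a leading `+` on each word that must appear, and double quotes around each phrase. The function and its parameters are hypothetical and not part of DSpace.

```python
def build_query(required_words=(), phrases=()):
    """Assemble a query string from required words and exact phrases."""
    parts = [f"+{word}" for word in required_words]
    parts += [f'"{phrase}"' for phrase in phrases]
    return " ".join(parts)


word_query = build_query(required_words=["water", "rapid"])
phrase_query = build_query(phrases=["vapor generator"])
combined_query = build_query(["water"], ["vapor generator"])
```

A form with separate "all of these words" and "exact phrase" boxes could call a helper like this, so users never type the operators themselves.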
Related Websites
American-Scientist September Forum <http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/index.html>
Open Access Presentation <http://www.ecs.soton.ac.uk/~harnad/Temp/openaccess.ppt>
Self-Archiving FAQs <http://www.eprints.org/self-faq/>
SPARC Institutional Repository Checklist & Resource Guide
<http://www.arl.org/sparc/IR/IR_Guide.html>