
Maximizing the impact of institutional knowledge using DSpace Alan Orth Nairobi, Kenya - July 28, 2015 Webinar for AIMS@FAO

Upload: aims-agricultural-information-management-standards

Post on 12-Aug-2015

Page 1: Maximizing the Impact of Institutional Knowledge Using DSpace

Maximizing the impact of institutional knowledge using DSpace

Alan Orth

Nairobi, Kenya - July 28, 2015
Webinar for AIMS@FAO

Page 2: Maximizing the Impact of Institutional Knowledge Using DSpace


● Why we use DSpace
● How we use DSpace
● Organizational tips for using DSpace
● Technical tips for DSpace deployments

Page 3: Maximizing the Impact of Institutional Knowledge Using DSpace

DSpace helps make information “F.A.I.R”

● Free: no subscriptions, “paywalls”, etc.
● Accessible: is publicly available
● Indexed: can be found in search engines
● Reusable: has a permissive license

Addresses both the moral and legal imperatives… aka the “carrot” and the “stick”.

Page 4: Maximizing the Impact of Institutional Knowledge Using DSpace

History of DSpace at ILRI

● Before: InMagic, physical library
● 2009: ILRI launches Mahider (“repository” in Amharic)
● 2010: Other CGIAR research centers and programs join our platform and share hard / soft costs
● 2011: Rebranded as “CGSpace”
● 2015: 9 CGIAR centers, ~50,000 items, ~200k


Page 5: Maximizing the Impact of Institutional Knowledge Using DSpace

“CGSpace” in July, 2015

Page 6: Maximizing the Impact of Institutional Knowledge Using DSpace

How we use DSpace

● Primary location for institutional outputs!
● (No posting PDFs on corporate website!)
● Content people embedded in each department help capture results (presentations, papers, brochures, etc.)
● Integrate with website and blogs via RSS feeds
● (Direct ALL traffic to DSpace!)
● For data sets, videos, etc. we make a metadata-only accession with a link to e.g. YouTube
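To make the RSS integration concrete, here is a minimal sketch of how a website or blog could pull a "latest outputs" list out of a repository feed. It assumes a plain RSS 2.0 feed; the sample feed, its titles, and its links are made up for illustration, and a real site would fetch the feed over HTTP instead of using an inline string.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed, standing in for what a repository feed might return.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Latest outputs</title>
    <item>
      <title>Example report</title>
      <link>https://repository.example.org/handle/10568/1</link>
    </item>
    <item>
      <title>Example presentation</title>
      <link>https://repository.example.org/handle/10568/2</link>
    </item>
  </channel>
</rss>"""


def latest_items(feed_xml, limit=5):
    """Return up to `limit` (title, link) pairs from an RSS 2.0 document."""
    channel = ET.fromstring(feed_xml).find("channel")
    items = []
    for item in channel.findall("item")[:limit]:
        items.append((item.findtext("title"), item.findtext("link")))
    return items


for title, link in latest_items(SAMPLE_FEED):
    print(f"{title}: {link}")
```

Every link points back to the repository, which is the point of "Direct ALL traffic to DSpace".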

Page 7: Maximizing the Impact of Institutional Knowledge Using DSpace

DSpace hierarchies

● Communities, sub-communities, and collections
● Tempting to model after organization hierarchy!
● (we did)
● … but organization hierarchies change!

Page 8: Maximizing the Impact of Institutional Knowledge Using DSpace

Mostly organized by output type now...

Page 9: Maximizing the Impact of Institutional Knowledge Using DSpace


● Standard Dublin Core is available
● No AGROVOC!
● You can create custom controlled vocabularies in arbitrary namespaces, e.g.: cg.subject.ilri
● Display custom fields selectively in the XMLUI item list and view pages
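A field name such as cg.subject.ilri follows the schema.element.qualifier pattern, with the qualifier being optional (as in dc.title). The sketch below only illustrates that naming structure; `MetadataField` and `parse_field` are hypothetical helpers, not part of DSpace.

```python
from typing import NamedTuple, Optional


class MetadataField(NamedTuple):
    schema: str               # namespace, e.g. "dc" or "cg"
    element: str              # e.g. "subject"
    qualifier: Optional[str]  # e.g. "ilri", or None when absent


def parse_field(name: str) -> MetadataField:
    """Split 'schema.element[.qualifier]' into its parts."""
    parts = name.split(".")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"expected schema.element[.qualifier], got {name!r}")
    qualifier = parts[2] if len(parts) == 3 else None
    return MetadataField(parts[0], parts[1], qualifier)


print(parse_field("cg.subject.ilri"))
print(parse_field("dc.title"))
```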

Page 10: Maximizing the Impact of Institutional Knowledge Using DSpace

Custom metadata displayed on ILRI item page

Page 11: Maximizing the Impact of Institutional Knowledge Using DSpace

“Discovery” facets

● Context-aware metadata summaries

● Great for content people and users alike

● Side effect: helps spot metadata inconsistencies!

● … Open Access, Open access, open Access, etc.

● DSpace 4+, XMLUI
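The same kind of inconsistency that facets expose visually can also be found with a small script over exported metadata values. This is a rough sketch, assuming the values are already available as a list of strings; `find_case_variants` is a made-up helper name.

```python
from collections import defaultdict


def find_case_variants(values):
    """Group values that differ only by case or surrounding whitespace."""
    groups = defaultdict(set)
    for value in values:
        groups[value.strip().casefold()].add(value)
    # Keep only the keys that were spelled more than one way.
    return {
        key: sorted(spellings)
        for key, spellings in groups.items()
        if len(spellings) > 1
    }


sample = ["Open Access", "Open access", "open Access", "Limited Access"]
print(find_case_variants(sample))
# → {'open access': ['Open Access', 'Open access', 'open Access']}
```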

Page 12: Maximizing the Impact of Institutional Knowledge Using DSpace

Search engine optimization (SEO)

Help Google Scholar consume your content...

1. XML sitemaps (see DSpace manual)
2. Submit sitemap to Google Webmaster Tools to control indexing, see stats, etc.
3. Single, consistent domain name, i.e.: cgspace.cgiar.org
4. Persistent links for resources (“Handle”)
5. Website speed and HTTPS both a plus
6. Bing, Yahoo, and Yandex less important
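DSpace produces its sitemaps itself (see the manual), so the following is only a sketch of what an XML sitemap contains, using the sitemaps.org 0.9 format. The `build_sitemap` helper, the item URLs, and the dates are invented for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a sitemap document from (location, last-modified) pairs."""
    # Serialize with a default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in urls:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical item URLs on the single, consistent domain name.
print(build_sitemap([
    ("https://cgspace.cgiar.org/handle/10568/1", "2015-07-01"),
    ("https://cgspace.cgiar.org/handle/10568/2", "2015-07-28"),
]))
```

Each `<url>` entry gives the search engine one item page and the date it last changed, which is what lets it consume the repository in a structured way.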

Page 13: Maximizing the Impact of Institutional Knowledge Using DSpace

SEO: crawling vs consuming

● Traditionally search engines basically “stumble” upon your content

● Using XML sitemaps they can consume it in a structured way

● Google discontinued the use of OAI for discovering site content in 2008!

Drinking from the firehose!

Page 14: Maximizing the Impact of Institutional Knowledge Using DSpace

Sitemap view in Google Webmaster Tools

Page 15: Maximizing the Impact of Institutional Knowledge Using DSpace

Meteoric rise in Google’s indexes

Page 16: Maximizing the Impact of Institutional Knowledge Using DSpace

Importance of persistent links

● Website addresses change…
● ->
● But resources stay the same!

● “Handle” service from
● Everything under prefix 10568 is CGSpace
● Default DSpace handle prefix is 123456789!
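A handle is a prefix plus a suffix, and the persistent link is that pair appended to a resolver address. The sketch below assumes the public hdl.handle.net resolver; the suffix "12345" and both function names are made up for illustration.

```python
DEFAULT_DSPACE_PREFIX = "123456789"  # the out-of-the-box prefix named above


def handle_url(prefix, suffix, resolver="https://hdl.handle.net"):
    """Build a persistent link from a handle prefix and suffix."""
    return f"{resolver}/{prefix}/{suffix}"


def uses_default_prefix(handle):
    """True if a 'prefix/suffix' handle still uses the default prefix."""
    return handle.split("/", 1)[0] == DEFAULT_DSPACE_PREFIX


print(handle_url("10568", "12345"))            # hypothetical CGSpace item
print(uses_default_prefix("123456789/42"))     # True: no real prefix configured
print(uses_default_prefix("10568/12345"))      # False
```

A check like `uses_default_prefix` is a quick way to notice a repository that never registered its own prefix, in which case its "persistent" links will not resolve.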

Page 17: Maximizing the Impact of Institutional Knowledge Using DSpace

dc.identifier.uri: persistent universal resource identifier

Page 18: Maximizing the Impact of Institutional Knowledge Using DSpace

Getting data INTO DSpace

● Day-to-day submission is manual (by a small army of editors)

● One-time batch uploads of items from other systems in CSV format (InMagic!)

● OAI-PMH for metadata only
● OAI-ORE for metadata + bitstreams (e.g., from another DSpace, Sharepoint, etc.)
● SWORD (haven't tried)
● REST API (DSpace 5+, haven't tried)
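For the one-time CSV route, a legacy export usually has to be reshaped into one row per item with metadata fields as column names. The sketch below assumes the column layout used by DSpace batch metadata editing ("+" in the id column for a new item, "||" between repeated values); check that against the manual for your version. The legacy records, the collection handle, and `records_to_csv` are invented for illustration.

```python
import csv
import io

FIELDNAMES = ["id", "collection", "dc.title", "dc.contributor.author", "dc.date.issued"]


def records_to_csv(records, collection_handle):
    """Turn legacy records (dicts) into CSV text, one row per new item."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    for record in records:
        writer.writerow({
            "id": "+",  # assumption: "+" marks a new item
            "collection": collection_handle,
            "dc.title": record["title"],
            # assumption: repeated values are joined with "||"
            "dc.contributor.author": "||".join(record["authors"]),
            "dc.date.issued": record["year"],
        })
    return buffer.getvalue()


# Hypothetical records exported from an older system.
legacy = [
    {"title": "Example report", "authors": ["Doe, J.", "Roe, R."], "year": "2008"},
    {"title": "Example brochure", "authors": ["Doe, J."], "year": "2009"},
]
print(records_to_csv(legacy, "10568/1"))
```

Using the `csv` module rather than string joins matters here, because author names and titles routinely contain commas and need quoting.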

Page 19: Maximizing the Impact of Institutional Knowledge Using DSpace

Getting data OUT OF DSpace

● REST API for structured JSON or XML 👍
● OAI-PMH for metadata
● OAI-ORE for metadata + bitstreams (PDFs, etc.)
● RSS feeds for websites / blogs
● XML sitemaps for search engines
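As an illustration of the OAI-PMH route, the sketch below builds a standard ListRecords request and pulls Dublin Core titles out of a response. The base URL and the sample response are made up; the verb, the oai_dc metadata prefix, and the XML namespaces come from the OAI-PMH 2.0 protocol. A real harvester would also need to follow resumption tokens, which this sketch leaves out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, heavily trimmed ListRecords response.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:10568/1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the URL for a ListRecords request."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def extract_titles(xml_text):
    """Return every dc:title found inside the harvested records."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.findall(".//oai:record//dc:title", NS)]


# The endpoint path below is an assumption; use your repository's OAI address.
print(list_records_url("https://repository.example.org/oai/request"))
print(extract_titles(SAMPLE_RESPONSE))
```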

Page 20: Maximizing the Impact of Institutional Knowledge Using DSpace

CCAFS website, powered by Drupal + DSpace APIs

Page 21: Maximizing the Impact of Institutional Knowledge Using DSpace

“Latest outputs” on ILRI homepage, via DSpace RSS

Page 22: Maximizing the Impact of Institutional Knowledge Using DSpace

“Latest outputs” on project blog, via DSpace RSS

Page 23: Maximizing the Impact of Institutional Knowledge Using DSpace

CGSpace technology stack

- NGINX 1.8 HTTP server
  - TLS termination, SPDY, redirects, virtual hosts
- Tomcat 7 servlet engine
  - runs DSpace, bound to localhost
- Ubuntu 14.04 GNU/Linux OS
  - long-term support release, good mix of stable / new

Page 24: Maximizing the Impact of Institutional Knowledge Using DSpace

Open source workflow on GitHub

Page 25: Maximizing the Impact of Institutional Knowledge Using DSpace

Skills needed in your organization

Besides content people(!)...

● Prioritize: Linux systems administration experience (Tomcat, httpd, PostgreSQL, DNS, SSH, git)

● General: computer science background
● Web developers a diverse bunch...
● Java development experience doesn't hurt

Page 26: Maximizing the Impact of Institutional Knowledge Using DSpace

Extra considerations

● Item mapping
● Maintenance tasks (background batch jobs)
● Backups of assetstore and PostgreSQL!
● Altmetrics tracks social media mentions
● Separate production / development environments
● CGSpace server is $80/month
● ~20GB of PDFs, ~8GB of Solr data

Page 27: Maximizing the Impact of Institutional Knowledge Using DSpace

Getting help

● “DSpace Tech” mailing list
● “dspace” tag on StackOverflow website
● [email protected]

Page 28: Maximizing the Impact of Institutional Knowledge Using DSpace

This presentation has a Creative Commons licence. You are free to re-use or distribute this work for non-commercial purposes, provided credit is given to ILRI.

better lives through livestock

Box 30709, Nairobi 00100, Kenya
Phone +254 20 422 3000
Fax +254 20 422 3001
Email [email protected]

ilri.org

ILRI is a member of the CGIAR consortium

ILRI has offices in: Central America • East Africa • South Asia • Southeast and East Asia • Southern Africa • West Africa