Getting "good" e-theses MARC records from DSpace

Getting “good” e-theses MARC records from DSpace Alison Hitchens Short talk at Code4Lib Midwest July 2014

Upload: alison-hitchens

Post on 08-May-2015

DESCRIPTION

Java program for MARC record creation in DSpace

TRANSCRIPT

Page 1: Getting "good" e-theses MARC records from DSpace

Getting “good” e-theses MARC records from DSpace
Alison Hitchens
Short talk at Code4Lib Midwest
July 2014

Page 2: Getting "good" e-theses MARC records from DSpace

Hitchens-C4LMW2014 - UWSPACE MARC

Acknowledgements
This program is based on the MSpace project by Jonathan Roby at the University of Manitoba and was modified for UW needs.

Page 3: Getting "good" e-theses MARC records from DSpace

Quick history of ETDs at uWaterloo
1997 - Pilot project
1998 - Prototype database
1999 - Proposal for ETD system
2000 - Voluntary submission of e-theses
2006 - Switched from local system to DSpace (aka UWSpace)
2006 - Mandatory submission of e-theses
Visit the project page

Page 4: Getting "good" e-theses MARC records from DSpace

Thesis cataloguing workflow
Print/fiche
    Received physical item and catalogued it
Voluntary ETD
    Received physical item and catalogued it; checked for e-version
Mandatory ETD with no print/fiche
    Receive e-mail alert of new theses from UWSpace
    Receive e-mail with MARC records
    Minor editing/checks by cataloguing staff

Page 5: Getting "good" e-theses MARC records from DSpace

Why not generic DSpace MARC?
Some of the issues:
    Missing essential control fields, e.g. 008, 007
    Many indicators miscoded, e.g. 100 10, 245 00
    Missing statement of responsibility
    No publication data
    No standard thesis notes
    No entry for related department/school
    Local thesis cataloguing policies

Page 6: Getting "good" e-theses MARC records from DSpace

Solution: program to create “good” MARC records
Systems Department co-op student/contract Adam Patterson in collaboration with Cataloguing Department (2007)
Based on MSpace project by Jonathan Roby at University of Manitoba & modified for UW needs
    UWMARC21Export.java
Version 2 revised by Graham Faulkner with migration from DSpace 1.4 to 3.1
Also updated for new cataloguing code, RDA

Page 7: Getting "good" e-theses MARC records from DSpace

E-mailing MARC records
Part of the itemexport package
Creates MARC files for new submissions (if not embargoed) & for items with embargo removed
E-mails the MARC file after UWSpace has sent out the new e-theses alert e-mails
Checks for empty MARC files (i.e. if no new submissions) before e-mailing
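
The empty-file check described on this slide could look roughly like the following. This is a minimal sketch, not the code from the itemexport package: the class name MarcMailGuard and the method hasRecords are hypothetical, and the actual e-mail step is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: decides whether a generated MARC file is worth e-mailing.
final class MarcMailGuard {

    /** Returns true only when the MARC file exists and holds at least one byte. */
    static boolean hasRecords(Path marcFile) {
        try {
            return Files.isRegularFile(marcFile) && Files.size(marcFile) > 0;
        } catch (IOException e) {
            // Assumption: an unreadable file is treated as "nothing to send".
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path marcFile = Files.createTempFile("newSubmissions", ".mrc");
        System.out.println(hasRecords(marcFile)); // prints false: no new submissions

        Files.write(marcFile, new byte[] {0x1d}); // one byte stands in for record data
        System.out.println(hasRecords(marcFile)); // prints true: file would be e-mailed

        Files.delete(marcFile);
    }
}
```

Checking the file size before sending keeps cataloguing staff from receiving empty attachments on days with no new submissions.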

Page 8: Getting "good" e-theses MARC records from DSpace

Creating MARC records
Creates control fields
Processes relevant Dublin Core fields from UWSpace into MARC
Adds new information for local needs
E.g. creator field:
    Creates string for 100 tag
    Sets indicators as 1b (first indicator 1, second blank)
    Adds author as Last name, First name
    Adds comma at end (or changes . to ,)
    Adds $e author.
    Validates that there is only one 100 tag

Page 9: Getting "good" e-theses MARC records from DSpace

Sample section of code: 100 tag

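public static String process100Field(String value) {
    String retstring = "";
    if (!value.equals("")) {
        // \u001e = field terminator, \u001f = subfield delimiter
        retstring = "\u001e1 \u001fa"; // first num is always "1", second is missing
        retstring += value;
        // a name that already ends in '.' is left as-is (no comma appended)
        if (value.charAt(value.length() - 1) != '.') {
            retstring += ",";
        }
        retstring += "\u001fe author.";
        numValidFields += 1; // counter declared elsewhere in the class
    }
    return retstring;
}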
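
Slide 8 describes changing a trailing period to a comma, while the sample code on this slide leaves a trailing period alone. The following is a minimal sketch of the slide 8 wording, not the UWMARC21Export.java code. The names Field100Sketch, build100 and toMnemonic are hypothetical, and the mnemonic output is only for reading the result.

```java
// Hypothetical sketch of the 100 field rules listed on slide 8.
final class Field100Sketch {
    static final char FIELD_TERMINATOR = '\u001e';
    static final char SUBFIELD_DELIMITER = '\u001f';

    /** Builds the 100 field data for an author given as "Last name, First name". */
    static String build100(String author) {
        if (author == null || author.trim().isEmpty()) {
            return "";
        }
        String name = author.trim();
        // Assumption from slide 8: a trailing '.' is changed to ','.
        if (name.endsWith(".")) {
            name = name.substring(0, name.length() - 1);
        }
        if (!name.endsWith(",")) {
            name += ",";
        }
        // Indicators: first is "1", second is blank.
        return FIELD_TERMINATOR + "1 " + SUBFIELD_DELIMITER + "a" + name
                + SUBFIELD_DELIMITER + "e author.";
    }

    /** Makes the raw field readable: drops the terminator, shows delimiters as '$'. */
    static String toMnemonic(String raw) {
        return raw.replace(String.valueOf(FIELD_TERMINATOR), "")
                  .replace(SUBFIELD_DELIMITER, '$');
    }

    public static void main(String[] args) {
        // prints "1 $aSteinmoeller, Derek,$e author."
        System.out.println(toMnemonic(build100("Steinmoeller, Derek")));
    }
}
```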

Page 10: Getting "good" e-theses MARC records from DSpace

Examples of modifications to 245
Normalized title to sentence case
First indicator 1; second indicator based on presence of A, The, An etc. using an LC table
Checks for “:” and creates $b for other title information
Adds “/$c by” + author information, with First name Last name
(used to add $h [electronic resource])
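
The 245 rules above can be sketched as follows. This is an illustration under stated assumptions, not the program's actual code: the names Field245Sketch, nonfilingChars, invertName and build245 are hypothetical, only the English articles A, An and The are handled (the LC table mentioned on the slide is assumed to cover more), sentence-case normalization is omitted, and output is written in the mnemonic style used in the example record.

```java
// Hypothetical sketch of the 245 field rules listed on slide 10.
final class Field245Sketch {
    // Assumption: English articles only, each followed by a space.
    private static final String[] ARTICLES = {"The ", "An ", "A "};

    /** Second indicator: number of leading characters to skip when filing. */
    static int nonfilingChars(String title) {
        for (String article : ARTICLES) {
            if (title.regionMatches(true, 0, article, 0, article.length())) {
                return article.length();
            }
        }
        return 0;
    }

    /** Turns "Last name, First name" into "First name Last name". */
    static String invertName(String lastFirst) {
        int comma = lastFirst.indexOf(',');
        if (comma < 0) {
            return lastFirst.trim();
        }
        return lastFirst.substring(comma + 1).trim() + " "
                + lastFirst.substring(0, comma).trim();
    }

    /** Builds a mnemonic 245 line with $a, optional $b, and $c. */
    static String build245(String title, String author) {
        String t = title.trim();
        StringBuilder field = new StringBuilder("=245 1").append(nonfilingChars(t));
        int colon = t.indexOf(':');
        if (colon >= 0) {
            // Text after the first colon becomes other title information ($b).
            field.append("$a").append(t.substring(0, colon).trim())
                 .append(" :$b").append(t.substring(colon + 1).trim());
        } else {
            field.append("$a").append(t);
        }
        field.append(" /$cby ").append(invertName(author)).append('.');
        return field.toString();
    }

    public static void main(String[] args) {
        System.out.println(build245(
                "High-order numerical methods in lake modelling", "Steinmoeller, Derek"));
    }
}
```

With the title and author from the example record, the sketch produces the same 245 line shown on slide 13. Titles that already end in punctuation would need extra handling that is not shown here.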

Page 11: Getting "good" e-theses MARC records from DSpace

Added/revised by local request
264 (was 260) with UW as publisher
502 with appropriate abbreviations for degree names
Local wording in linking field (856)
Local codes for 040 field, e.g. $a CaOWtU
710 for department/school to match authorized access point
Added a 300 field $a 1 online resource
Added generic thesis note
    500 $a "A thesis presented to the University of Waterloo in fulfilment of the thesis requirement for the degree of
Changed some of the 008 mappings (e.g. literary form)
Remove extra spaces in abstracts
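
Two of the local changes above, the 502 thesis note and the removal of extra spaces in abstracts, could be sketched like this. The names LocalFieldsSketch, normalizeAbstract, build502 and build520 are hypothetical, the degree abbreviation is passed in rather than looked up (the program's own abbreviation table is not shown in the slides), and output uses the mnemonic style of the example record.

```java
// Hypothetical sketch of two local additions listed on slide 11.
final class LocalFieldsSketch {

    /** Collapses runs of whitespace to single spaces and trims both ends. */
    static String normalizeAbstract(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    /** Builds a mnemonic 502 line, e.g. for abbreviation "Ph.D" and year 2014. */
    static String build502(String degreeAbbreviation, int year) {
        return "=502 \\\\$aThesis (" + degreeAbbreviation
                + ")--University of Waterloo, " + year + ".";
    }

    /** Builds a mnemonic 520 line from an abstract with its spacing cleaned up. */
    static String build520(String abstractText) {
        return "=520 \\\\$a" + normalizeAbstract(abstractText);
    }

    public static void main(String[] args) {
        System.out.println(build502("Ph.D", 2014));
        System.out.println(build520("  The physical   processes\n in lakes  "));
    }
}
```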

Page 12: Getting "good" e-theses MARC records from DSpace

Example e-mail with file attached
From: [email protected] [mailto:[email protected]]
Sent: Monday, July 14, 2014 2:30 AM
To: Graham Faulkner; Alison Hitchens
Subject: [UWSpace] Patent Pending removed: patentPendingChanged140714.mrc
The attached file contains MARC records for Etheses submissions which had Patent Pending the last time the MARC generation script was run, but do not have Patent Pending anymore.
These records use the UTF-8 character set so be sure you save them as "mrk8".
Have a nice day :-)

Page 13: Getting "good" e-theses MARC records from DSpace

Example MARC

=LDR 05500nam a2200229 i 4500
=006 m\\\\\\\\d\\\\\\\\
=007 cr\cn\|||||a||
=008 140619s2014\\\\onc\\\\\obm\\\000\0\eng\d
=040 \\$aCaOWtU$beng$erda$cCaOWtU
=100 1\$aSteinmoeller, Derek,$e author.
=245 10$aHigh-order numerical methods in lake modelling /$cby Derek Steinmoeller.
=264 \1$aWaterloo, Ontario, Canada : $b University of Waterloo, $c2014.
=300 \\$a1 online resource ( pages)
=336 \\$atext$2rdacontent

Page 14: Getting "good" e-theses MARC records from DSpace

MARC record continued
=337 \\$acomputer$2rdamedia
=338 \\$aonline resource$2rdacarrier
=500 \\$a"A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of
=502 \\$aThesis (Ph.D)--University of Waterloo, 2014.
=504 \\$aIncludes bibliographical references.
=520 \\$aThe physical processes in lakes remain only partially understood despite…
=710 2\$aUniversity of Waterloo. $bDepartment of Applied Mathematics.
=856 40$uhttp://hdl.handle.net/10012/8534 $zClick here for access

Page 15: Getting "good" e-theses MARC records from DSpace

Questions?
Alison Hitchens
Cataloguing & Metadata Librarian
University of Waterloo
[email protected]
Twitter: @ahitchens
Slideshare: aehitchens

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.