Implementing an Integrated DAMS
FEDORA / VITAL at the National Library of Wales
OR08, 3rd April 2008
Paul Bevan [email protected]
Glen Robson [email protected]
Outline
• Introduction
• Single Point of Access – how and why
• Object Structure
• Content Streams
• Ingest
• Architecture
• Implementation of Single Sign On
• Lessons Learnt
Background
• FEDORA Users since 2005
• Early Pilot Project
  – What do we have?
  – How can we deal with it?
  – What should come first?
• Steep Learning Curve
• Planning for a repository infrastructure
2006: A ‘New System’ For NLW
• Diverse Collections
• No Single Metadata Standard/Format
• Variance in the application of Standards
• Many Legacy Systems
• Improved Efficiency
• Increasing Digital Resources & Assets
VTLS (Virtua & VITAL)
Single Metadata Repository
• Existing Data Formats Migrated to MARC21
• Integrated Information Management System controls accession, circulation, statistics, preservation, etc.
• Facilitates:
  – Combined (& Efficient) Workflows
  – Reduced Support Requirements
  – Single Point of Access…
Single Point of Access
• Why?
  – Varied Collections
  – Compartmentalised Users
• Why Not?
  – Information Overload
  – Data Specific Functionality
• Solution: Many & One
Multiple Resource Providers
• Single Point of Access, Single Metadata Repository, Single Catalogue.
• Resources from many back-end systems:
  – Subscribed Electronic Resources
  – Free Online Resources
  – Digitised/Digital Resources (VITAL)
  – Stacks (Virtua)
[Diagram labels: Catalogue (Full Skin, eResources, Family History); FEDORA/VITAL; Subscribed Resources; IngentaConnect (etc); Digital Exhibitions; OAI-PMH; Search Engines; ATHENS; APIs; Handle Resolution]
DAMS Project
• 18 Month implementation process
• Development requirements from VTLS
• Substantial metadata discussions
• Lots of institutional self-discovery
[Diagram labels: OAIS; METS; Digital Archive; Virtua; VITAL; Disseminators; MARC]
Virtua / Vital Integration
• NLW Spec, VTLS Development
• Primary access through catalogue
  – Less confusion for the user
  – Retain existing skills
  – Initially DAMS will hold digitisation images so catalogue record already exists
• ‘Single Sign On’
  – Existing patron information used for authentication
  – User & Group details passed directly from Catalogue to FEDORA as part of HTTP request
Would we do the same again?
Or…
in the age of cross-searching is it worth integrating?
Object Rules & Content Model
• Every object will have a MARC record (at some level) within the IMS
• METS will be our SIP, AIP and DIP
• METS will be the policy, FEDORA services will enact the policy.
  – e.g. DC Section in METS populates the DC datastream
  – e.g. Structural Map held in METS, structure in RELS-EXT datastream
• Every object in the repository will have a METS document
• Every object will have DC for OAI-PMH
• Object Rules evolve into a content model
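To make the "METS is the policy" rule concrete, here is a minimal Python sketch of how a DC section held in METS could be copied out to populate a DC datastream. The sample METS document, the `extract_dc` helper and the choice to wrap the result in an `oai_dc:dc` element are all assumptions made for illustration; the slides do not show how the FEDORA services at NLW actually do this.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Hypothetical METS fragment: one descriptive section wrapping Dublin Core.
SAMPLE_METS = f"""<mets:mets xmlns:mets="{METS_NS}" xmlns:dc="{DC_NS}">
  <mets:dmdSec ID="DMD1">
    <mets:mdWrap MDTYPE="DC">
      <mets:xmlData>
        <dc:title>Example photograph</dc:title>
        <dc:creator>Example photographer</dc:creator>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def extract_dc(mets_xml: str) -> ET.Element:
    """Copy every dc:* element found in DC-typed mdWrap sections into one record."""
    root = ET.fromstring(mets_xml)
    record = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for wrap in root.iter(f"{{{METS_NS}}}mdWrap"):
        if wrap.get("MDTYPE") != "DC":
            continue  # skip MODS, PREMIS and other metadata types
        for element in wrap.iter():
            if element.tag.startswith(f"{{{DC_NS}}}"):
                copy = ET.SubElement(record, element.tag)
                copy.text = element.text
    return record


dc_record = extract_dc(SAMPLE_METS)
dc_titles = [e.text for e in dc_record.findall(f"{{{DC_NS}}}title")]
```

The point of the sketch is the direction of flow: METS stays the source of truth, and the DC datastream is derived from it rather than edited separately.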
Content Model (AIP): Still Image
Object (PID/Handle)              http://hdl.handle.net/10107/1
DS: METS                         http://hdl.handle.net/10107/1-0
DS: DC (OAI-PMH)                 http://hdl.handle.net/10107/1-1
DS: RELS-EXT                     http://hdl.handle.net/10107/1-2
DS: Object (e.g. TIFF File)      http://hdl.handle.net/10107/1-10
DS: Object (e.g. JPEG File)      http://hdl.handle.net/10107/1-11
DS: Object (e.g. Thumbnail File) http://hdl.handle.net/10107/1-12
DS: Object (e.g. Zoomify File)   http://hdl.handle.net/10107/1-13
• METS Header:
  – History of METS document
• Descriptive Metadata
  – MODS on digital copy
  – link to DC for original
  – PREMIS details on original file name
  – relationship information to other objects in Fedora
• Administrative Metadata
  – MIX technical metadata on images
  – DC legacy technical data
  – PREMIS admin data including checksums, and relationships between datastreams
  – link to MARC record in Virtua
• Digital Provenance
  – List of actions that have occurred to the object
  – List of agents who ran the software / versions of the software used
• File Section
  – List of datastreams linking above sections to particular datastreams
• Structural Map
  – First one allows display via a METS viewer type application
  – Subsequent ones are linked to Disseminators
• Behaviour Section
  – Links objects to their dissemination
Object Structure
[Diagram labels: MARC Catalogue Record; isPartOf]
• Digitisation Projects (Object)
  – Geoff Charles Collection (Object)
    – …
      – Wrexham Young Farmers' Club ploughing competition (Object)
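The isPartOf links in this hierarchy can be sketched as RELS-EXT datastreams. The Python below is a hedged illustration only: it uses Fedora's standard `relations-external` namespace, but the PIDs (`demo:projects`, `demo:geoff-charles`, `demo:photo-1`) and the `rels_ext` helper are hypothetical and do not reflect NLW's real identifiers or tooling.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
REL_NS = "info:fedora/fedora-system:def/relations-external#"

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("rel", REL_NS)


def rels_ext(pid: str, parent_pid: str) -> str:
    """Return RDF/XML stating that object `pid` isPartOf object `parent_pid`."""
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        rdf,
        f"{{{RDF_NS}}}Description",
        {f"{{{RDF_NS}}}about": f"info:fedora/{pid}"},
    )
    ET.SubElement(
        description,
        f"{{{REL_NS}}}isPartOf",
        {f"{{{RDF_NS}}}resource": f"info:fedora/{parent_pid}"},
    )
    return ET.tostring(rdf, encoding="unicode")


# Hypothetical PIDs, ordered from the top of the hierarchy downwards.
chain = ["demo:projects", "demo:geoff-charles", "demo:photo-1"]

# Every object except the top one gets a RELS-EXT pointing at its parent.
documents = {child: rels_ext(child, parent) for parent, child in zip(chain, chain[1:])}
```

Each object only records its immediate parent, so walking from a single photograph up to the top-level object means following isPartOf one step at a time.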
Content Streams
• The library’s mission is to: “collect, preserve and give access to recorded knowledge, in all documentary forms, with an especial emphasis on the intellectual record of Wales, for the benefit of all engaged in research and learning, or with other information needs”
• In reality this means a LOT of stuff:
  – Born digital
  – Digitised
  – Physical Media
  – Digital Deposits
  – Archived websites
  – Floppy disks in archival files
  – And on…
  – And on…
All of which needs prioritising
Digitisation Projects (Still Images)
• Geoff Charles
  “Geoff Charles (1909 – 2002) was a newspaper photographer who dedicated 50 years of his life to portraying Wales through the lens of his camera. The contribution by Geoff Charles to Wales is unique and today his archive of 120,000 photographs is one of the treasures of the National Library of Wales.”
• 2,294 Records
• 14,038 Images
• 16,322 Fedora Objects
Welsh Journals Online
• JISC Funded
• 90 Journals since 1900
• Titles in Welsh and English
• 400,000 scanned pages
• OCR TEI & Coordinates metadata
• ‘Complex’ rights issues.
Physical Digital Deposits
• Variety of unexpected and unknown deposits as part of archives
• Large collection of deposited CDs
• No selection process yet
• CD Accessioning System (CDAS)
  – In-House development
  – Remove data from CD (to ISO) & Checksum
  – Create METS manifest
  – Determine filetypes with DROID
• Selection Interface
  – Navigation of CDs over web interface
  – View files to aid in selection
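The checksum and manifest steps of CDAS could look roughly like the following Python sketch. It is not the in-house system: the SHA-1 algorithm, the file ID, the `sha1_of` and `manifest_for` helpers and the stand-in ISO file are all assumptions, and the DROID file-type step is left out.

```python
import hashlib
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

METS_NS = "http://www.loc.gov/METS/"
ET.register_namespace("mets", METS_NS)


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Checksum a file in chunks so large disc images need not fit in memory."""
    digest = hashlib.sha1()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_for(image: Path) -> ET.Element:
    """Build a skeletal METS manifest with one file entry for the disc image."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    file_sec = ET.SubElement(mets, f"{{{METS_NS}}}fileSec")
    group = ET.SubElement(file_sec, f"{{{METS_NS}}}fileGrp")
    ET.SubElement(
        group,
        f"{{{METS_NS}}}file",
        {
            "ID": "FILE1",  # hypothetical identifier
            "CHECKSUM": sha1_of(image),
            "CHECKSUMTYPE": "SHA-1",
            "SIZE": str(image.stat().st_size),
        },
    )
    return mets


# Demonstration with a small stand-in file instead of a real ISO image.
ISO_BYTES = b"stand-in for an ISO image"
with tempfile.TemporaryDirectory() as tmp:
    iso = Path(tmp) / "disc.iso"
    iso.write_bytes(ISO_BYTES)
    manifest = manifest_for(iso)

file_entry = manifest.find(f".//{{{METS_NS}}}file")
```

Recording the checksum at accession time gives later stages, such as the selection interface, a fixed value against which the ISO can be re-verified.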
Off-air Recording
• NSSAW Archive
• Every Programme (DVB-T & Radio) of Welsh Interest
  – All S4C
  – News & Current Affairs
  – Welsh Productions
• Proprietary BOB system integrated with Virtua & VITAL
• No Networked Access
Wills
• All Welsh probate records
• 182,404 Records
• 816,325 Images
• 998,729 Fedora Objects
• 27 Days ingest
• Concerns over Triplestore
• Upgrade?
Ingest
• All automated
• Wills data is catalogued in MARC 21 with links to an image
• 1 Record points to a number of pages
Process – Step 1: Create SIP
[Diagram labels: Catalogue; MARC Record; MarcToMETS; Default METS; Jhove Output; Parent METS; Image METS (×4)]
Process – Step 2: Ingest
[Diagram labels: METS; METS To FOXML; FOXML; Fedora Web Services; Reserve PID; Ingest XML; Fedora]
Time to Ingest: 16,322 Objects in 10 Hours 40 mins
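As a rough worked check, the ingest figures quoted in this deck can be turned into a per-object rate. The Python sketch below assumes that the "27 Days ingest" on the Wills slide means 27 full days for all 998,729 Fedora objects, which the slides do not state explicitly.

```python
from datetime import timedelta


def seconds_per_object(objects: int, elapsed: timedelta) -> float:
    """Average wall-clock seconds spent ingesting one Fedora object."""
    return elapsed.total_seconds() / objects


# Geoff Charles collection: 16,322 objects in 10 hours 40 minutes.
geoff_charles = seconds_per_object(16_322, timedelta(hours=10, minutes=40))

# Wills: 998,729 objects, assuming the quoted 27 days were full days.
wills = seconds_per_object(998_729, timedelta(days=27))
```

Under those assumptions both collections come out at a little over two seconds per object, which suggests the per-object cost stayed roughly constant as the ingest grew.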
Architecture: Servers
• Using Fedora 2.1.1 (No Journaling)
• Redundancy rather than Performance.
Server 1                 Server 2
Handle Server            Handle Server
Vital Instance           Vital Instance
File System (FOXML)      File System (FOXML)
Oracle Database          Oracle Database
• SAN Connected through iSCSI (Holds Reference Images)
• Tape / Optical library (Stores Archive copies)
DataGuard
• Allows failover of database
• Keeps both databases in sync
• Currently we use rsync to sync Fedora's data files
• Data more important than Database!!
Disseminators
• Disseminator for each object that returns METS
• Use Cocoon to turn METS XML into web page
• With no authentication you can build the view functionality as a disseminator
• Once you turn on authentication it asks for username and password every time
• Solution is to create a guest account and a web application which sits in between the user and Fedora
• Could be in the same Tomcat instance as Fedora
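The guest-account idea can be sketched as follows: the intermediate web application attaches the guest credentials itself, so the end user is never prompted. This Python fragment is illustrative only. It assumes Fedora is protected by HTTP Basic authentication, and the base URL, credentials, PID and `guest_request` helper are hypothetical. It builds the request without sending it.

```python
import base64
import urllib.request

# Hypothetical deployment details, not taken from the slides.
FEDORA_BASE = "http://localhost:8080/fedora"
GUEST_USER = "guest"
GUEST_PASSWORD = "guest-password"


def guest_request(path: str) -> urllib.request.Request:
    """Build a request to Fedora that carries the guest account's credentials."""
    token = base64.b64encode(
        f"{GUEST_USER}:{GUEST_PASSWORD}".encode("utf-8")
    ).decode("ascii")
    request = urllib.request.Request(f"{FEDORA_BASE}/{path.lstrip('/')}")
    request.add_header("Authorization", f"Basic {token}")
    return request


# The web application would forward the user's request like this and then
# pass the METS it receives on for rendering as a web page.
request = guest_request("/get/demo:1/METS")
```

Keeping the credentials inside the intermediate application means they never reach the browser, which is the reason for placing it between the user and Fedora.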
System Architecture
Rights Issues for Wills
• Free access internally
• Paid for external access
• External Group paid access
• Printing services
Implementing ‘Single Sign On’
Issue 1 - Authentication
• Username + Password and group in catalogue
• Web service on catalogue to authenticate
• Need to develop servlet filter for Fedora to authenticate against this web service
Issue 2 - Authorization
• Use Fedora XACML
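The authentication step can be illustrated with a Python WSGI middleware, used here as a stand-in for the Java servlet filter the slide calls for. It is a sketch under stated assumptions: credentials arrive as HTTP Basic, and the catalogue web service is represented by an injected callable (`fake_catalogue` in the demo) that returns the user's group or `None`. None of these names come from the slides.

```python
import base64
from typing import Callable, Optional

# Hypothetical contract for the catalogue web service:
# (username, password) -> group name, or None if the login is rejected.
CatalogueAuth = Callable[[str, str], Optional[str]]


class CatalogueAuthMiddleware:
    """Reject requests the catalogue cannot authenticate; pass the group on otherwise."""

    def __init__(self, app, authenticate: CatalogueAuth):
        self.app = app
        self.authenticate = authenticate

    def __call__(self, environ, start_response):
        group = self._group_for(environ.get("HTTP_AUTHORIZATION", ""))
        if group is None:
            start_response(
                "401 Unauthorized",
                [("WWW-Authenticate", 'Basic realm="catalogue"')],
            )
            return [b"Authentication required"]
        # Left in the environment for a later authorization step.
        environ["catalogue.group"] = group
        return self.app(environ, start_response)

    def _group_for(self, header: str) -> Optional[str]:
        scheme, _, encoded = header.partition(" ")
        if scheme.lower() != "basic" or not encoded:
            return None
        try:
            decoded = base64.b64decode(encoded).decode("utf-8")
        except ValueError:  # covers malformed base64 and bad UTF-8
            return None
        username, _, password = decoded.partition(":")
        return self.authenticate(username, password)


# Demonstration with a fake catalogue service and a trivial protected app.
def fake_catalogue(username: str, password: str) -> Optional[str]:
    return "staff" if (username, password) == ("reader", "secret") else None


def protected_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [environ["catalogue.group"].encode("utf-8")]


def call(app, header: str):
    statuses = []
    body = app(
        {"HTTP_AUTHORIZATION": header},
        lambda status, headers: statuses.append(status),
    )
    return statuses[0], b"".join(body)


app = CatalogueAuthMiddleware(protected_app, fake_catalogue)
allowed = call(app, "Basic " + base64.b64encode(b"reader:secret").decode("ascii"))
denied = call(app, "")
```

The filter only answers "who is this and which group are they in"; deciding what that group may see is left to the XACML policies named under Issue 2.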
Implementing ‘Single Sign On’
Issue 3 - Single Sign On
Use Case 1
• User goes to library catalogue
• Logs on to catalogue
• Finds an item which is in Vital
• Shouldn't have to log in again
Use Case 2
• User in the library goes to catalogue
• Finds an item which is in Vital
• Solution to Use Case 1 - Surrogates
• Solution to Use Case 2 - Surrogates or local username?
• Fedora sees IP address as catalogue
• Can we pass a surrogate IP address?
Lessons Learnt
• Keep your options open
• 80% Soft, 20% Dev
• Managing Expectations is Difficult
• And that’s not even talking about format migration…
Questions?
Paul Bevan – [email protected]
Glen Robson – [email protected]
http://dev.llgc.org.uk