![Page 1: Author(s): Jeremy York, 2010 License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution–Noncommercial–Share](https://reader035.vdocument.in/reader035/viewer/2022062719/56649ee85503460f94bf9acc/html5/thumbnails/1.jpg)
Author(s): Jeremy York, 2010
License: Unless otherwise noted, this material is made available under the terms of the Creative Commons Attribution–Noncommercial–Share Alike 3.0 License: http://creativecommons.org/licenses/by-nc-sa/3.0/
We have reviewed this material in accordance with U.S. Copyright Law and have tried to maximize your ability to use, share, and adapt it. The citation key on the following slide provides information about how you may share and adapt this material.
Copyright holders of content included in this material should contact [email protected] with any questions, corrections, or clarification regarding the use of content.
For more information about how to cite these materials visit http://open.umich.edu/education/about/terms-of-use.
Any medical information in this material is intended to inform and educate and is not a tool for self-diagnosis or a replacement for medical evaluation, advice, diagnosis or treatment by a healthcare professional. Please speak to your physician if you have questions about your medical condition.
Viewer discretion is advised: Some medical content is graphic and may not be suitable for all viewers.
Citation Key
for more information see: http://open.umich.edu/wiki/CitationPolicy
Use + Share + Adapt
Make Your Own Assessment
Creative Commons – Attribution License
Creative Commons – Attribution Share Alike License
Creative Commons – Attribution Noncommercial License
Creative Commons – Attribution Noncommercial Share Alike License
GNU – Free Documentation License
Creative Commons – Zero Waiver
Public Domain – Ineligible: Works that are ineligible for copyright protection in the U.S. (17 USC § 102(b)) *laws in your jurisdiction may differ
Public Domain – Expired: Works that are no longer protected due to an expired copyright term.
Public Domain – Government: Works that are produced by the U.S. Government. (17 USC § 105)
Public Domain – Self Dedicated: Works that a copyright holder has dedicated to the public domain.
Fair Use: Use of works that is determined to be Fair consistent with the U.S. Copyright Act. (17 USC § 107) *laws in your jurisdiction may differ
Our determination DOES NOT mean that all uses of this 3rd-party content are Fair Uses and we DO NOT guarantee that your use of the content is Fair.
To use this content you should do your own independent analysis to determine whether or not your use will be Fair.
{ Content the copyright holder, author, or law permits you to use, share and adapt. }
{ Content Open.Michigan believes can be used, shared, and adapted because it is ineligible for copyright. }
{ Content Open.Michigan has used under a Fair Use determination. }
HATHI TRUST A Shared Digital Repository
Building A Future By Preserving Our Past
The Preservation Infrastructure of HathiTrust Digital Library
Jeremy York
IFLA 2010
August 15, 2010
Current Partners
– Columbia University
– New York Public Library
– University of California system
– CIC (Committee on Institutional Cooperation)
– University of Virginia
– Yale University
CIC members: University of Chicago, University of Illinois, Indiana University, University of Iowa, University of Michigan, Michigan State University, University of Minnesota, Northwestern University, Ohio State University, Pennsylvania State University, Purdue University, University of Wisconsin-Madison
Mission
• To contribute to the common good by collecting, organizing, preserving, communicating, and sharing the record of human knowledge
Universal Library
Common Goal
Single Entity, Many Partners
HathiTrust
Goals
• Comprehensive collection
• Preservation…with Access
• Shared strategies
– Collection management, development
– Preservation
– Copyright
– Efficient user services
• Openness
Content Distribution
6,549,680 – Total volumes
1,300,896 – Public Domain
3,798,116 – Book titles
153,311 – Serial titles
* As of August 13, 2010
Language Distribution (1)
* As of August 13, 2010
Language Distribution (2)
The next 40 languages make up ~13% of total
* As of August 13, 2010
Dates
* As of August 13, 2010
Content Growth
Repository Philosophy/Design
• OAIS/TRAC
• Consistency
• Standardization
• Simplicity (in design, not function)
• Practicality
• Sustainability
Content
• Largely uniform in technical characteristics
• 4 formats
– ITU G4 TIFF
– JPEG2000
– JPEG
– Unicode (with and without coordinates)
Object Package
images
Source METS
text
HT METS
Zip
malachus, Flickr.com
Metadata
• Details and specifications at repository level
– Object specifications / Validation criteria
– Page-tagging
• Variations at object level
– Files missing
– Non-valid files
– Incorrect file checksums
http://www.hathitrust.org/digital_object_specifications
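Two of the object-level variations listed above (missing files and incorrect file checksums) can be detected with a manifest check along the following lines. This is a minimal sketch, not HathiTrust's validation code: the manifest layout (a mapping of file name to expected MD5 hex digest), the function name `check_package`, and the choice of MD5 are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def check_package(package_dir, manifest):
    """Compare files on disk against a manifest of expected MD5 digests.

    `manifest` maps file names to hex digests (a hypothetical layout).
    Returns (missing, mismatched) as two sorted lists of file names.
    """
    missing, mismatched = [], []
    for name, expected in manifest.items():
        path = Path(package_dir) / name
        if not path.is_file():
            # The manifest names a file that is not in the package.
            missing.append(name)
            continue
        actual = hashlib.md5(path.read_bytes()).hexdigest()
        if actual != expected.lower():
            # The file exists but its content does not match the manifest.
            mismatched.append(name)
    return sorted(missing), sorted(mismatched)
```

A package passes this check when both returned lists are empty; anything else would be reported back for correction before the object is accepted.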
Ingest
• Bibliographic Data
– Must be present prior to content ingest
– MARCXML, as complete as possible
• Content
– Pre-ingest
– Ingest
Ingest (2)
Pre-ingest
- Evaluation
- Determination of standards
- Modification / Transformation
SIP
Backend servers
GROOVE
Validation
METS creation
Package creation
Handle creation
- Ensure conformance
- Barcode
- Fixity
- Consistency
- Well-formedness
- Prepare archival package
Bibliographic data
Content
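Of the checks named on this slide, well-formedness is the simplest to illustrate. The sketch below tests only whether an XML document (for example a METS file) parses; it does not validate against any schema, and it is not the GROOVE implementation, which the slides do not show. The function name `is_well_formed` is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True
```

A document with an unclosed element, such as `<mets><fileSec></mets>`, fails this check, while `<mets><fileSec/></mets>` passes.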
Archival Storage
• Reliability – ensure integrity
• Redundancy – in single and multiple sites
• Scalability – including ease of management
• Accessibility – for repository processes and services
• Platform-independence – for data/object management
Media & Architecture
Michigan
Indiana
Tape Backup
Tape Backup
Archival Storage
• Isilon Systems
• Load balancing and failover
• Ingest at Michigan, replicated to Indiana
• Replacement on 3-4 year cycle
Architecture & Management
images
Source METS
text
HT METS
../uc1/pairtree_root/b3/54/34/86/b34543486
b34543486.zip
b34543486.mets.xml
Example ids:
wu.89094366434
mdp.39015037375253
uc2.ark:/1390/t26973133
miua.aaj0523.1950.001
malachus, Flickr.com
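The storage layout on this slide follows the Pairtree convention: the part of the id after the namespace is split into two-character directory segments under `pairtree_root`. A minimal sketch of that mapping follows. It handles only the three single-character substitutions from the Pairtree specification (`/` to `=`, `:` to `+`, `.` to `,`) and omits the hex-encoding step; the function names are hypothetical, and the assumption that the final directory and file names use the cleaned id is mine, not stated on the slide.

```python
# Simplified Pairtree character mapping; the full specification also
# hex-encodes some other characters, which this sketch does not handle.
_CLEAN = str.maketrans({"/": "=", ":": "+", ".": ","})


def object_dir(htid):
    """Map an id such as 'mdp.39015037375253' to its pairtree directory."""
    namespace, local_id = htid.split(".", 1)
    cleaned = local_id.translate(_CLEAN)
    # Split the cleaned id into two-character segments (the last may be shorter).
    segments = [cleaned[i:i + 2] for i in range(0, len(cleaned), 2)]
    return "/".join([namespace, "pairtree_root", *segments, cleaned])


def object_files(htid):
    """Return the (zip, METS) paths, following the naming shown on the slide."""
    directory = object_dir(htid)
    name = directory.rsplit("/", 1)[1]
    return f"{directory}/{name}.zip", f"{directory}/{name}.mets.xml"
```

For the example id `mdp.39015037375253`, `object_dir` gives `mdp/pairtree_root/39/01/50/37/37/52/53/39015037375253`, relative to whatever storage root the repository uses.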
Data Management
Rights Determination
Rights Database
Bibliographic Management System
Copyright Review Management System
- Inventory
- Loading and updating records
- Duplicate detection and collation
- Solr indexes behind VuFind catalog
- Source of information for Access services
- Rights determination (automated and support for manual review)
Rights Database
• System of precedence
• 9 attributes
• 11 reason codes
Bibliographic (automatic)
Manual
1. Conformance with formalities
2. Contractual agreements
3. Access control overrides
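A system of precedence means that when several determinations exist for one volume, the source of each determination decides which one takes effect. The sketch below shows the idea only. The slide lists the sources but does not spell out their ranking, so the ordering used here (manual sources outrank the automatic bibliographic one, and later entries in the manual list outrank earlier ones) is an assumption, as are the source keys, the function name, and the `"pd"` / `"ic"` labels.

```python
# Hypothetical ranking: a higher number wins. The slides list these sources
# but do not spell out the ordering, so this ranking is an assumption.
PRECEDENCE = {
    "bibliographic": 0,            # automatic determination
    "formalities": 1,              # manual: conformance with formalities
    "contract": 2,                 # manual: contractual agreements
    "access_control_override": 3,  # manual: access control overrides
}


def effective_rights(determinations):
    """Pick the determination whose source ranks highest.

    `determinations` is a list of (source, attribute) pairs, where
    `attribute` is an illustrative rights label such as "pd" or "ic".
    """
    if not determinations:
        raise ValueError("no rights determinations recorded")
    return max(determinations, key=lambda d: PRECEDENCE[d[0]])
```

With this ranking, a volume carrying both `("bibliographic", "ic")` and `("formalities", "pd")` resolves to the manual determination.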
Access
Rights Database
Michigan
Indiana
Data Management
Archival Storage
Tab-delimited Metadata files
Collection Builder Index
Rights Determination
Bibliographic Management
Full text Index
VuFind Index
Bibliographic Catalog
Bibliographic API
OAI sets
Full text Search application
PageTurner
Data API
Collection Builder
Content Access
[Diagram: same components as on the Access slide]
Search and Aggregation Access
[Diagram: same components as on the Access slide]
Metadata Access
[Diagram: same components as on the Access slide]
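The OAI sets among the metadata access services are harvested with the OAI-PMH protocol, in which a client requests the records of one set with the `ListRecords` verb. The sketch below only builds such a request URL. The verb and parameter names come from the OAI-PMH specification; the base URL and set name in the usage note are placeholders, not HathiTrust endpoints, and the function name is hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for one set."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": set_spec,
    })
    return f"{base_url}?{query}"
```

For example, `list_records_url("https://example.org/oai", "demo")` produces a URL that a harvester would fetch and then page through using the resumption tokens returned by the server.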
Source Undetermined
Thank you
[email protected]
Additional Source Information
for more information see: http://open.umich.edu/wiki/CitationPolicy
Slide 16, Image 11: malachus, Flickr.com
Slide 22, Image 11: malachus, Flickr.com, http://www.flickr.com/photos/malachus/5152200478/
Slide 29, Image 0: Source Undetermined