
Besser--LITA Dig Imaging Preconference 7/7/00 1

Creating Working Digital Libraries

Howard Besser

UCLA School of Education & Information

http://www.gseis.ucla.edu/~howard


Creating Working Digital Libraries

- Moving from Digital Collections to Digital Libraries
- Interoperability
- Importance of Standards
- Longevity
- Best Practices for Managing Digital Projects
- Some Wild Musings


Moving from Digital Collections to Digital Libraries

- What’s the difference?
- Recent history of Library Automation


Developmental Stages

- Experiment with methods
- Build real operational systems
- Build interoperable operational systems


Traditional Digital Library Model

[Diagram: users reach each of four digital libraries (DL) through that library's own search & presentation layer]


Ideal Digital Library Model

[Diagram: users reach all four digital libraries (DL) through a single shared search & presentation layer]


Developmental Stages

- Experiment with methods
- Build real operational systems
- Build interoperable operational systems

– For DL Initiatives

– For OPACs

– For I & A Services

– For Image Retrieval


Key problems we’re facing

- Discovery
- Interoperability
- Longevity


For Interoperability Digital Libraries Need Standards

- Descriptive Metadata for consistent description
- Discovery Metadata for finding
- Administrative Metadata for viewing and maintaining
- Structural Metadata for navigation
- ...
- Terms & Conditions Metadata for controlling access
- ...


Metadata is not just indexing terms

- CBIR attributes used for retrieval on color, shape, texture, etc.
- Structural attributes used for page-turning
- Administrative attributes used for managing a digital work over time
- IPR attributes to limit unauthorized use
- Identification attributes to determine what application software is needed to view a particular digital work
- Can be located anywhere


Why are Standards and Metadata consensus important?

- Managing digital files over time
- Longevity
- Interoperability
- Veracity
- Recording in a consistent manner
- Will give vendors incentive to create applications that support this


Why Standards?

- Why do we need standards?
  – To make information universally available to users
  – To facilitate sharing and interchange of information
  – To preserve information (make it safe from changes in hardware and software)
- Standards only work if communities widely accept them, but they’re necessary for communities to work together


Serious Longevity Problems

What we know from prior widespread digital file formats

- Images separating from their metadata
- Inaccessibility of software needed to view an image
- Inability to even decode the file format of an image


Journal Archiving

- License, don’t own; may not even be able to obtain right to make archival copy
- Increasingly no paper back-up at all
- Usually we don’t have the important redundancy factor
- Stanford’s LOCKSS Project (Lots of Copies Keeps Stuff Safe) and its problems (http://lockss.stanford.edu)


The Short Life of Digital Info: Digital Longevity Problems

- Disappearing Information
- The Viewing Problem
- The Scrambling Problem
- The Inter-relation Problem
- The Custodial Problem
- The Translation Problem


The Viewing Problem

- Digital Info requires a whole infrastructure to view it
- Each piece of that infrastructure is changing at an incredibly rapid rate
- How can we ever hope to deal with all the permutations and combinations?


The Scrambling Problem

Dangers from:

- Compression to ease storage & delivery
- Container Architecture to enhance digital commerce


The Inter-relation Problem

- Info is increasingly inter-related to other info
- How do we make our own Info persist when it points to and integrates with Info owned by others?
- What is the boundary of a set of information (or even of a digital object)?


The Custodial Problem

- How do we decide what to save?
- Who should save it?
- How should they save it?
  – methods for later access: emulation, migration, etc.
  – issues of authenticity and evidence


The Translation Problem

- Content translated into new delivery devices changes meaning
  – A photo vs. a painting
  – If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
  – Behaviors


Pieces of the Solution (1/2)

- We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats
- We should discourage scrambling
- We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects
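
One existing form of format self-identification is the short "magic number" that many image formats place at the start of a file. The sketch below is only an illustration of that idea, not a method proposed in the talk; the function name `identify_format` and the choice of formats are assumptions, and it reads only the leading bytes rather than validating the whole file.

```python
# Illustrative only: recognize a few image formats from their leading bytes.
# The signature table covers well-known formats; the helper name is hypothetical.
MAGIC_SIGNATURES = [
    (b"II*\x00", "TIFF (little-endian)"),
    (b"MM\x00*", "TIFF (big-endian)"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
]


def identify_format(header: bytes) -> str:
    """Return a format name based on the first bytes of a file, or 'unknown'."""
    for signature, name in MAGIC_SIGNATURES:
        if header.startswith(signature):
            return name
    return "unknown"


# Example: the first bytes of a little-endian TIFF and of an unrecognized file.
tiff_guess = identify_format(b"II*\x00\x08\x00\x00\x00")
other_guess = identify_format(b"not an image")
```

A file whose leading bytes match no entry comes back as "unknown", which is the situation the longevity slides warn about: without a readable self-identification, even decoding the format becomes guesswork.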


Pieces of the Solution (2/2)

- People and organizations wishing to make information persist need guidelines on how to go about doing it
- We need to better understand how translating from one storage or display format to another affects the meaning of a work
- We need to save the “behaviors” of a digital object, not just its “contents”


Metadata can be the first line of defense

Can tell you

  – where the file is (if you can’t find the file)
  – where more info about the file is (if you have the file but most other metadata has become separated)
  – what the file format is
  – what the compression scheme is
  – what application program and version is needed for the file
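
The items listed above can be pictured as a small record kept alongside each file. This is a minimal sketch under assumptions: the class name `FileMetadata`, its field names, the JSON serialization, and every sample value (paths, viewer name, version) are hypothetical and do not come from any standard named in the talk.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class FileMetadata:
    """One field per question the slide says metadata can answer."""
    file_location: str        # where the file is
    more_info_location: str   # where more information about the file is
    file_format: str          # what the file format is
    compression: str          # what the compression scheme is
    application: str          # what application program is needed
    application_version: str  # which version of that program


# Hypothetical sample values, for illustration only.
record = FileMetadata(
    file_location="masters/photo_0001.tif",
    more_info_location="catalog/photo_0001.xml",
    file_format="TIFF",
    compression="none",
    application="ExampleViewer",
    application_version="1.0",
)

# Write the record out as plain text so it stays readable without special software,
# then read it back to confirm nothing was lost in the round trip.
sidecar_text = json.dumps(asdict(record), indent=2, sort_keys=True)
restored = FileMetadata(**json.loads(sidecar_text))
```

Keeping such a record in a plain, widely readable encoding is one way to make it useful even when the file and its other metadata have become separated.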


Groups Working on the Big Longevity Problem

http://sunsite.Berkeley.EDU/Imaging/Databases/Longevity/

- CPA Task Force
- Getty “Time & Bits” Conference & follow-up
- NEDLIB, CURL, Michigan
- Internet Archive
- Long Now


Migration/Refreshing

Impact on evidential value
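
One way to see why the two operations differ in their effect on evidence is a checksum comparison. The sketch assumes the usual preservation reading of the terms, which the slide itself does not spell out: refreshing copies the same bits to new media, while migration rewrites the content in a new format. The byte strings are stand-ins, not real image data.

```python
import hashlib


def fixity(data: bytes) -> str:
    """Return a SHA-256 digest that changes if any bit of the data changes."""
    return hashlib.sha256(data).hexdigest()


original = b"example image bytes"

# Refreshing (assumed meaning): a bit-for-bit copy onto new media.
refreshed = bytes(original)

# Migration (assumed meaning): the content is re-encoded; a simple byte change stands in for that.
migrated = original.replace(b"image", b"IMAGE")

refresh_preserves_bits = fixity(refreshed) == fixity(original)
migration_preserves_bits = fixity(migrated) == fixity(original)
```

After a refresh the digest still matches, so the copy can be shown to be the same bitstream. After a migration it cannot match, so authenticity has to be argued some other way, for example by documenting the conversion.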


Best Practices for Managing Digital Projects

- Who will your users be?
- Best Practices Guidelines
- Workflow and Management Issues


Why are you Managing this Information?

- Organizational mission & type
- Users
- Uses


Scanning Best Practices

- Think about users (and potential users), uses, and type of material/collection
- Scan at the highest quality that does not exceed what the likely potential users, uses, and material call for
- Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery
- Many documents which appear to be bitonal actually are better represented with greyscale scans
- Include color bar and ruler in the scan
- Use objective measurements to determine scanner settings (do NOT attempt to make the image good on your particular monitor or use image processing to color correct)
- Don’t use lossy compression
- Store in a common (standardized) file format
- Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)


Why Scale is important


Digital Object Behaviors

- Book example


Metadata Standards (from MOA2)

- Administrative Metadata
  – for enhancing resource management
- Structural Metadata
  – for reflecting internal hierarchies and relationships btwn parts
- Raw/Seared/Cooked


Workflow and Management Issues

- Managing multiple image files
- Persistent Identification
- Making your works accessible throughout the Net


The number of variant forms of a work can be enormous

- different views of the same object
- different scans of the same photo
- different resolutions
- different compression schemes
- different compression ratios
- different file storage formats
- different details of the same image
- ...


Image Families


Identification/Provenance

- How to deal with different versions (browse, hi-res, medium res) derived from the same scan, or different encoding schemes (TIFF, PICT, JFIF)
- Vocabulary Standards to express this
  – VRA Surrogate Categories
  – CIMI's “Image Elements”
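
The versioning problem can be sketched as a small "image family" structure in which every derivative records the file it was made from. This is an illustration only: the class names, field names, and identifiers are hypothetical and are not taken from the VRA or CIMI vocabularies named above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ImageFile:
    identifier: str
    role: str                           # e.g. "master", "hi-res", "browse"
    file_format: str                    # e.g. "TIFF", "JFIF"
    derived_from: Optional[str] = None  # identifier of the file this one was made from


@dataclass
class ImageFamily:
    work_id: str
    files: Dict[str, ImageFile] = field(default_factory=dict)

    def add(self, image: ImageFile) -> None:
        self.files[image.identifier] = image

    def provenance(self, identifier: str) -> List[str]:
        """Follow derived_from links from a file back to the original scan."""
        chain: List[str] = []
        current = self.files.get(identifier)
        # Stop at the master, at a missing link, or if a loop is detected.
        while current is not None and current.identifier not in chain:
            chain.append(current.identifier)
            current = self.files.get(current.derived_from) if current.derived_from else None
        return chain


# Hypothetical family: one master scan and two derivatives.
family = ImageFamily(work_id="photo-0001")
family.add(ImageFile("photo-0001-master", "master", "TIFF"))
family.add(ImageFile("photo-0001-hires", "hi-res", "JFIF", derived_from="photo-0001-master"))
family.add(ImageFile("photo-0001-browse", "browse", "JFIF", derived_from="photo-0001-hires"))

lineage = family.provenance("photo-0001-browse")
```

Recording the link at creation time is the design choice that matters here; reconstructing which browse image came from which scan after the fact is much harder.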


Persistent IDs--the Problem

- Need to separate work ID from work location
- URNs probably won’t be ready until 2003
- Becomes a business process issue when one organization maintains the resource and another organization references it (i.e., licensed from vendors or managed by separate administrative structures)


More Persistent IDs--the Approach for today

- PURLs
- Handles
- HTTP redirects
- And worry about costs now and conversion costs when URNs become feasible
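
What PURLs, Handles, and HTTP redirects share is a level of indirection: citations carry a stable identifier, and a lookup table maps it to the current location. The sketch below shows only that general idea; the `Resolver` class, the identifier, and the example.org URLs are hypothetical and do not reproduce the actual PURL or Handle System interfaces.

```python
from typing import Dict, Optional


class Resolver:
    """Toy lookup table separating a work's identifier from its location."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, persistent_id: str, url: str) -> None:
        # Registering an existing identifier again simply updates its location.
        self._locations[persistent_id] = url

    def resolve(self, persistent_id: str) -> Optional[str]:
        return self._locations.get(persistent_id)


resolver = Resolver()
cited_id = "example/photo-0001"  # what other organizations put in their references

resolver.register(cited_id, "http://old-server.example.org/images/0001.tif")
location_before_move = resolver.resolve(cited_id)

# The file moves; only the resolver entry changes, not the citations.
resolver.register(cited_id, "http://new-server.example.org/archive/0001.tif")
location_after_move = resolver.resolve(cited_id)
```

The table still has to be maintained by someone, which is why the previous slide calls this a business process issue when the maintaining and referencing organizations differ.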


Data Set Management
More issues with referencing IDs

- References for mirror sites
- References for back-up sites when main site is down or bottle-necked
- References for off-site copies and archival copies


Making your works accessible throughout the Net

- The DLF/Mellon meeting
- An administrative and political issue as much as a technical one


Some Wild Musings

- Movement towards packages and away from MARC
- The disappearance of OPACs


Containers and Packages of Metadata

Warwick, not MARC

- modular
- overlapping
- extensible
- community-based
- designed for a networked world to aid commonality btwn communities while still providing full functionality within each community


DC Qualifiers

- allow one community to express important nuances and qualifications, while still making the basic importance available to communities with simple needs
- our community can reflect alternate title, transliterated title, and main title, yet they will all be found under a simple Web search under “title”
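
The title example can be sketched as a "dumb-down" step: a qualified element is indexed under its unqualified base name, so a simple search on "title" finds every variant. The dotted naming (`title.alternative` and so on), the function names, and the sample record are assumptions made for illustration and are not the official Dublin Core encoding.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def dumb_down(qualified_name: str) -> str:
    """Drop the qualifier: 'title.alternative' becomes 'title'."""
    return qualified_name.split(".", 1)[0]


def build_simple_index(record: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Index every value under its unqualified element name."""
    index: Dict[str, List[str]] = defaultdict(list)
    for name, value in record:
        index[dumb_down(name)].append(value)
    return dict(index)


# Sample record with hypothetical qualifier names.
record = [
    ("title.main", "War and Peace"),
    ("title.transliterated", "Voina i mir"),
    ("title.alternative", "War & Peace"),
    ("creator", "Tolstoy, Leo"),
]

simple_index = build_simple_index(record)
titles_found = simple_index["title"]
```

A community that needs the distinctions keeps the full qualified names; a simple searcher sees only "title" and still retrieves all three values.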


Crosswalks

- mapping btwn differing metadata structures
- eliminate the need for monolithic, universally adopted standards
- focus on flexibility and interoperability
- RDF-based metadata registries


Crosswalk Example

| CDWA | Object ID | CIMI Schema | FDA | VRA Core Categories | USMARC | Dublin Core |
|---|---|---|---|---|---|---|
| OBJECT/WORK (core) | | | Document Classification-Catalog Level (core); Document Classification-Group Type | | | |
| Object/Work-Type (core) | Type of Object | objectNAME | Document Classification-Document Type (core); Purpose-Purpose (Broad) (core); Purpose-Purpose (Narrow) | W1. Work Type | 655 Genre-Form | Type |
| Object/Work-Components | | quantity | Document Classification-Extent | | 300a Physical Description-Extent | |
| ORIENTATION/ARRANGEMENT | | | | | | Description |
| TITLES OR NAMES (core) | Title | objectTitle; bibliographicTitle | Group/Item Identification-Repository Title; Group/Item Identification-Descriptive Title (core); Group/Item Identification-Inscribed Title | W2. Title | 24Xa Title and Title-Related Information | Title |
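
In code, a crosswalk is a lookup table keyed by a shared concept, with one entry per metadata scheme. The sketch below uses only the Work Type and Title rows from the example on this slide; the `translate` function, the dictionary layout, and the sample record (including the made-up field "W99. Local Note") are assumptions for illustration, not part of any published crosswalk tool.

```python
from typing import Dict, Tuple

# Two rows of the crosswalk example, restricted to three of its columns.
CROSSWALK: Dict[str, Dict[str, str]] = {
    "Object/Work-Type": {
        "VRA Core Categories": "W1. Work Type",
        "USMARC": "655 Genre-Form",
        "Dublin Core": "Type",
    },
    "Titles or Names": {
        "VRA Core Categories": "W2. Title",
        "USMARC": "24Xa Title and Title-Related Information",
        "Dublin Core": "Title",
    },
}


def translate(record: Dict[str, str], source: str, target: str) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Rename a record's fields from one scheme to another.

    Returns the translated fields and, separately, the fields with no mapping.
    """
    field_map = {
        row[source]: row[target]
        for row in CROSSWALK.values()
        if source in row and target in row
    }
    translated: Dict[str, str] = {}
    unmapped: Dict[str, str] = {}
    for field_name, value in record.items():
        if field_name in field_map:
            translated[field_map[field_name]] = value
        else:
            unmapped[field_name] = value
    return translated, unmapped


# Hypothetical VRA-style record; "W99. Local Note" is invented to show an unmapped field.
vra_record = {
    "W1. Work Type": "photograph",
    "W2. Title": "Untitled",
    "W99. Local Note": "shelf 12",
}
dc_record, leftovers = translate(vra_record, "VRA Core Categories", "Dublin Core")
```

Returning the unmapped fields separately keeps the loss visible: mapping to a simpler scheme usually drops detail, and the caller can decide what to do with it.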


Do we still need OPACs?

- Why repeat almost identical bibliographic descriptions in each local system?
- Why not store only local information locally, and link to bibliographic descriptions stored in the major utilities?
- Could our acquisition systems for monographs begin to use the acquisition systems imposed on us by our parent organizations (like those for supplies)?


Creating Working Digital Libraries

- Moving from Digital Collections to Digital Libraries
- Interoperability
- Importance of Standards
- Longevity
- Best Practices for Managing Digital Projects
- Some Wild Musings


Creating Working Digital Libraries

Howard Besser

UCLA School of Education & Information

http://www.getty.edu/gri/standard/intrometadata/

http://www.ifla.org/II/metadata.htm

http://sunsite.Berkeley.EDU/Imaging/Databases/#standards

http://sunsite.Berkeley.EDU/moa2/

http://sunsite.Berkeley.EDU/Longevity/

http://purl.oclc.org/metadata/dublin_core/

http://www.gseis.ucla.edu/~howard/image-meta.html

http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/

http://sunsite.berkeley.edu/Metadata/sp2000.html

http://www.gseis.ucla.edu/~howard/