Besser--Seybold--Digital Asset Mgmt 8/29/00 1
Digital Asset Management: An Academic View
Howard Besser
UCLA School of Education & Information
http://www.gseis.ucla.edu/~howard

Digital Asset Management: An Academic View
- Interoperability
- Importance of Standards
- Best Practices for Managing Digital Projects
- Implications of Digital Projects
- Longevity

Traditional Digital Collection Model
[Diagram: users reach four DLs, each through its own search & presentation layer]

Ideal Digital Collection Model
[Diagram: users reach four DLs through a single search & presentation layer]

For Interoperability Digital Collections Need Standards
- Descriptive Metadata for consistent description
- Discovery Metadata for finding
- Administrative Metadata for viewing and maintaining
- Structural Metadata for navigation
- ...
- Terms & Conditions Metadata for controlling access
- ...

Metadata is not just indexing terms
- CBIR attributes used for retrieval on color, shape, texture, etc.
- Structural attributes used for page-turning
- Administrative attributes used for managing a digital work over time
- IPR attributes to limit unauthorized use
- Identification attributes to determine what application software is needed to view a particular digital work
- Can be located anywhere

Why are Standards and Metadata consensus important?
- Managing digital files over time
- Longevity
- Interoperability
- Veracity
- Recording in a consistent manner
- Will give vendors incentive to create applications that support this

Why Standards?
- Why do we need standards?
  – To make information universally available to users
  – To facilitate sharing and interchange of information
  – To preserve information (make it safe from changes in hardware and software)
- Standards only work if communities widely accept them, but they’re necessary for communities to work together

Best Practices for Managing Digital Projects
- Who will your users be?
- Best Practices Guidelines
- Workflow and Management Issues

Why are you Managing this Information?
- Organizational mission & type
- Users
- Uses

Collaborative Metadata Projects
- Dublin Core
- NSF/ERCIM Digital Collaboratory
- OCLC CORC Project
- Visual Resources Association (VRA) Core
- Encoded Archival Description (EAD)
- Consortium for the Computer Interchange of Museum Information (CIMI)
- Records Export for Art and Cultural Heritage (REACH)

CORC--Cooperative Online Resource Catalog
- both bib records & webliographies (pathfinders)
- supports both AACR2/MARC and DC
- began 1/99, scheduled availability 7/00
- 100-200 participants
  – Academic libraries
  – OCLC networks, special libraries, public libraries, state & national libraries, consortia

Making of America II
- Background of the DLF Project
- Administrative Metadata
- Structural Metadata

MOA2 Access
[Diagram: OPAC; Finding Aids List; Collection-level record; Finding Aid; Digital Objects]

DLF Metadata for Interoperability Testbed: the MOA II Project
- R & D
- Distributed Repositories
- Transportation, 1869-1900
- Testbed Project
- Best Practices
- Structural and administrative metadata

Previous Projects/Background
- Library Standards Background
- UC Berkeley Background
- Finding Aids
- EAD
- SGML EAD
- “Digital Archives”

MOA II Classes of Objects
- Continuous Tone Photos
- Photo Albums
- Diaries, journals, letterpress books
- Ledgers
- Correspondence

MOA II Metadata
- Administrative Metadata
  – for enhancing resource management
- Structural Metadata
  – for reflecting internal hierarchies and relationships btwn parts
- Raw/Seared/Cooked

MOA II Best practices
- Use/Users/Collection: Benchmarking
- Masters vs. Derivatives
- Scanning
- Administrative Metadata
- Structural Metadata

Scanning Best Practices
- Think about users (and potential users), uses, and type of material/collection
- Scan at the highest quality that does not exceed the needs of the likely potential users/uses/material
- Do not let today’s delivery limitations influence your scanning file sizes; understand the difference between digital masters and derivative files used for delivery
- Many documents which appear to be bitonal actually are better represented with greyscale scans
- Include color bar and ruler in the scan
- Use objective measurements to determine scanner settings (do NOT attempt to make the image good on your particular monitor or use image processing to color correct)
- Don’t use lossy compression
- Store in a common (standardized) file format
- Capture as much metadata as is reasonably possible (including metadata about the scanning process itself)

Administrative Metadata
to uniquely identify a digital resource and manage it over time
- Information about where the various pieces/versions of the object reside
- Information to view the digital object
- Information about the scanning process

Structural Metadata:
that which is relevant to presentation of the digital object to the user
- metadata defining the “object”: a book, a diary, a photo album
- metadata defining the “sub-objects”: pages (physical) or chapters and subheads (intellectual)
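
A minimal sketch of how structural metadata of this kind might drive page-turning and chapter navigation. The class names, field names, and file names are hypothetical and are not taken from the MOA II specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SubObject:
    """One physical page, tagged with the intellectual unit it belongs to."""
    sequence: int     # physical order within the object
    image_file: str   # hypothetical file name
    chapter: str      # intellectual unit (chapter or subhead)


@dataclass
class DigitalObject:
    """The "object": a book, a diary, a photo album."""
    title: str
    pages: List[SubObject] = field(default_factory=list)

    def next_page(self, sequence: int) -> Optional[SubObject]:
        """Page-turning: return the page that follows `sequence`, if any."""
        later = [p for p in self.pages if p.sequence > sequence]
        return min(later, key=lambda p: p.sequence) if later else None

    def chapter_pages(self, chapter: str) -> List[SubObject]:
        """Intellectual navigation: all pages of one chapter, in order."""
        return sorted((p for p in self.pages if p.chapter == chapter),
                      key=lambda p: p.sequence)


# Hypothetical three-page diary.
diary = DigitalObject("Diary", [
    SubObject(1, "p001.tif", "January"),
    SubObject(2, "p002.tif", "January"),
    SubObject(3, "p003.tif", "February"),
])
```

The same page list serves both views: physical order for page-turning and the chapter label for intellectual navigation.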

SGML, XML, HTML
- TEI for structured humanities text
- EAD for Finding Aids

Museum Online Archive of California -- Goals
- Access to Museum Collections
- Use of Encoded Archival Description (EAD) for Museums
- Integration of primary source access across institution types
- Scalable production methods

MOAC Participants
- 8 Museums
  – Berkeley Art Museum/Pacific Film Archive
  – Phoebe A. Hearst Museum of Anthropology
  – Oakland Museum of California
  – UCLA Grunwald Center for the Graphic Arts
  – UCR/California Museum of Photography
  – Bancroft Library
  – UCLA Fowler Museum of Cultural History
  – Stanford University Iris & B. Gerald Cantor Center for Visual Arts
  – Japanese American National Museum

Collection goals
- 29 Collections
- 73,000 images
  – Paintings & drawings
  – Sculpture & ceramics
  – Masks, textiles, cultural objects
  – Artists books
  – Photographs & stereographs
  – Audio & video

Unique Outcomes
- Finding Aids for museum collections
- Integration of item & collection level information
- Presentation & navigation of multi-media

Methods
- Standards based (EAD, SGML, XML, REACH, LCSH, AAT)
- Digital Asset Mgmt. Database
  – Automated markup
  – Connections to collection mgmt. DBs
  – Image workflow
  – Export to EAD

Progress
- 6 finding aids
- 5200 images
- 6 month cycles: digitization & encoding followed by mounting and access (4 iterations to gain knowledge)
- Alternative MOAC portal (with fielded searching)

MOAC Contact information
- http://www.bampfa.berkeley.edu/moac/
- MOAC Project Manager Rick Rinehart ([email protected])
- OAC Manager Robin Chandler [email protected]

Further OAC Discussion
- More pieces of OAC
  – CDL Best Practices
  – Access to online Finding Aids
- Broader Implications of OAC and similar projects
  – utility of image browsing
  – Digitization means new audiences
  – New users’ lack of familiarity with Finding Aids
  – searching across finding aids
- More general issues of digital projects

NISO/DLF Image Metadata Workshop
Possible Goals
- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents (headers)

Image Metadata
Focus on Metadata that may prove helpful for
- management
- use
- preservation
- ...

Image Metadata
Break-out Groups: Work Done
- Characteristics and Features of Images
- Image Production and Reformatting Features
- Image Identification and Integrity

NISO/DLF Image Metadata Workshop (4/99)
Image Technical Information: Possible Goals
- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents (headers)

Image Metadata Elements for Data Dictionary
Data Dictionary Entries
- Element Name
- Definition (short) of the element name
- Is the element required? (Identified as: Mandatory, Mandatory if Applicable, Recommended, Optional)
- How is the value of the element represented?
- Examples
- When is this data collected?
- What is the purpose of this data?
- Who would the identified users be?
- How is the metadata used?
- What other metadata standards reference it?
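
The entry template above could be held as a small record type. The sketch below is illustrative only: the four requirement levels come from the slide, while the Python field names, the two sample elements, and the `missing_mandatory` helper are assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Requirement(Enum):
    """The four requirement levels named on the slide."""
    MANDATORY = "Mandatory"
    MANDATORY_IF_APPLICABLE = "Mandatory if Applicable"
    RECOMMENDED = "Recommended"
    OPTIONAL = "Optional"


@dataclass
class DictionaryEntry:
    """One data dictionary entry; field names are hypothetical."""
    name: str
    definition: str
    requirement: Requirement
    representation: str                      # how the value is represented
    examples: List[str] = field(default_factory=list)
    collected_when: str = ""
    purpose: str = ""
    users: str = ""
    usage: str = ""
    referenced_by: List[str] = field(default_factory=list)


def missing_mandatory(entries: List[DictionaryEntry],
                      record: Dict[str, str]) -> List[str]:
    """Return names of mandatory elements absent from a metadata record."""
    return [e.name for e in entries
            if e.requirement is Requirement.MANDATORY and e.name not in record]


# Hypothetical entries; the workshop's actual element names are not given here.
entries = [
    DictionaryEntry("compression", "Compression scheme applied to the file",
                    Requirement.MANDATORY, "string", examples=["none"]),
    DictionaryEntry("scanner_software", "Software used for capture",
                    Requirement.OPTIONAL, "string"),
]
```

A record lacking `compression` would be flagged, while a missing `scanner_software` would not.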

Image Metadata Elements for Data Dictionary
Characteristics and Features Element List
- Format Issues:
- Resolution Issues:
- Encoding:
- Compression:
- Others:

Image Metadata Elements for Data Dictionary
Image Production Element List (Pertaining to the Image)
- In-image target(s):
- System target(s), associated with the object:
- Responsible agent
- Rationale:
- Hardware:
- Software:

Image Metadata Elements for Data Dictionary
Image Production Element List (Pertaining to the Process)
- Format of the image
- Intrinsic characteristics of the image
- Identification
- Provides a means for defining methodology including documentation and rationale
- Who is involved with the file?
- Who created the image file?
- Who commissioned the creation of the image file (i.e., the chartering entity), as opposed to: Who is the responsible agency? Who is the owner?
- Where
- What
- When: necessary dates including: capture date/time, modification
- Checksum
- Navigational aid
- Encoding tools
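
The "Checksum" element in the list above supports integrity checking over time. A minimal sketch follows, assuming SHA-256 as the digest (the slide names no algorithm) and using stand-in bytes in place of a real image file.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the image bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, recorded: str) -> bool:
    """True if the bytes still match the checksum stored in the metadata."""
    return checksum(data) == recorded


original = b"stand-in for image file bytes"  # hypothetical content
recorded = checksum(original)                 # value stored at capture time
```

Re-running `verify` after each migration or copy shows whether the file has changed since the checksum was recorded.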

Image Metadata
NISO/DLF Image Metadata: In Progress
- Data Dictionary for both “Characteristics & Features” and for “Image Production Elements” due end of 6/00

Identification/Provenance (Images)
- The number of variant forms of a work can be enormous
- Image Families
- A digital image frequently has many layers of parentage
- Information about the parentage that can indicate the quality and veracity of the image (Dublin Core "Source" and "Relation")
- how to deal with different versions derived from the same scan or different encoding schemes
- Vocabulary Standards to express this

The number of variant forms of a work can be enormous
- different views of the same object
- different lighting of the same object
- different scans of the same photo
- different resolutions
- different compression schemes
- different compression ratios
- different file storage formats
- different details of the same image
- ...

Identification/Provenance
- how to deal with different versions (browse, hi-res, medium res) derived from the same scan or different encoding schemes (TIFF, PICT, JFIF)
- Vocabulary Standards to express this
  – VRA Surrogate Categories
  – CIMI's "Image Elements"
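
One way to picture the parentage of versions derived from a single scan is a chain of "source" links, loosely in the spirit of Dublin Core "Source". This is a sketch under assumptions: the identifiers, roles, and the `parentage` helper are hypothetical and do not reproduce the VRA or CIMI vocabularies.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ImageVersion:
    identifier: str
    role: str                # e.g. "master", "hi-res", "browse"
    file_format: str         # e.g. "TIFF", "JFIF"
    source: Optional[str]    # identifier of the parent version, if any


def parentage(versions: Dict[str, ImageVersion], identifier: str) -> List[str]:
    """Walk the `source` links from a version back to its earliest ancestor."""
    chain = []
    current = versions[identifier].source
    while current is not None:
        chain.append(current)
        current = versions[current].source
    return chain


# Hypothetical image family derived from one scan.
family = {v.identifier: v for v in [
    ImageVersion("scan-001", "master", "TIFF", None),
    ImageVersion("scan-001-hi", "hi-res", "JFIF", "scan-001"),
    ImageVersion("scan-001-browse", "browse", "JFIF", "scan-001-hi"),
]}
```

Calling `parentage(family, "scan-001-browse")` lists each ancestor in turn, which is the information a user would need to judge the quality and veracity of the browse image.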

Other Metadata
- Description of depiction/surrogate (What VRA calls its "Surrogate Categories")
- Description of original object
- Rights and Reproduction Information
- Location Information

<Indecs>
- formal structure for describing and uniquely identifying intellectual property itself, the people and businesses involved in its trading, and the agreements which they make about it (primarily for publishing, music, and visual arts)
- will develop high-level specifications for the services that will be required to implement a global IP trading system based on this <indecs> generic data model
- focus is on encoding rights at a high level, not on resource discovery
- likely to involve metadata schema registration and directory to allow interoperation of personal identifiers for rightsholders and users
- supported by EEC DG-13
- First meeting July 1999
- http://www.indecs.org/

Serious Longevity Problems
- What we know from prior widespread digital file formats
- Images separating from their metadata
- Inaccessibility of software needed to view an image
- Inability to even decode the file format of an image

The Translation Problem
- Content translated into new delivery devices changes meaning
  – A photo vs. a painting
  – If Info is produced originally in digital form in one encoded format, will it be the same when translated into another format?
  – Behaviors

Pieces of the Solution (1/2)
- We need to insist upon clearly readable standardized ways for digital objects to self-identify their formats
- We should discourage scrambling
- We need to better understand how information inter-relates to other Info, and what constitutes “boundaries” of Info objects
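
Some formats already self-identify through a fixed byte signature at the start of the file. The sketch below checks for the TIFF and JPEG/JFIF signatures; the function name and the choice of formats are illustrative, and a format without such a signature would simply go unrecognized.

```python
from typing import Optional

# Leading byte signatures for two common image formats.
SIGNATURES = {
    b"II*\x00": "TIFF (little-endian)",
    b"MM\x00*": "TIFF (big-endian)",
    b"\xff\xd8\xff": "JPEG/JFIF",
}


def identify_format(data: bytes) -> Optional[str]:
    """Return a format name if the data starts with a known signature."""
    for signature, name in SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None  # the file does not self-identify in a way we recognize
```

A `None` result is exactly the longevity risk described here: without a readable signature, the format must be recorded in external metadata or it may be lost.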

Pieces of the Solution (2/2)
- People and organizations wishing to make information persist need guidelines of how to go about doing it
- We need to better understand how translating from one storage or display format to another affects the meaning of a work
- We need to save the “behaviors” of a digital object, not just its “contents”

Metadata can be the first line of defense
- Can tell you
  – where the file is (if you can’t find the file)
  – where more info about the file is (if you have the file but most other metadata has become separated)
  – what the file format is
  – what the compression scheme is
  – what application program and version is needed for the file

Groups Working on the Big Longevity Problem
http://sunsite.Berkeley.EDU/Longevity/
- CPA Task Force
- Getty “Time & Bits” Conference & follow-up
- NEDLIB, CURL, Michigan
- Internet Archive
- Long Now

UCLA/Getty Summer Institute for Knowledge Sharing
http://www.getty.edu/gri/standard/intrometadata/
http://www.ifla.org/II/metadata.htm
http://sunsite.Berkeley.EDU/Imaging/Databases/
http://sunsite.Berkeley.EDU/moa2/
http://sunsite.Berkeley.EDU/Longevity/
http://www.oac.cdlib.org/
http://lcweb.loc.gov/ead/
http://purl.oclc.org/metadata/dublin_core/
http://www.gseis.ucla.edu/~howard/image-meta.html
http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/
http://sunsite.berkeley.edu/Metadata/sp2000.html
http://www.gseis.ucla.edu/~howard/
http://is.gseis.ucla.edu/impact/f95/special-collectns.html
http://is.gseis.ucla.edu/impact/f95/Papers-projects/Projects/Trowbridge/

Workflow and Management Issues
- Managing multiple image files
- Persistent Identification
- Making your works accessible throughout the Net

Persistent IDs--the Problem
- Need to separate work ID from work location
- URNs probably won’t be ready until 2003
- Becomes a business process issue when one organization maintains the resource and another organization references it (i.e., licensed from vendors or managed by separate administrative structures)
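
Separating the work ID from the work location amounts to adding a lookup step between a citation and a file. A minimal sketch, assuming an in-memory table; the `Resolver` class, the `work:0001` identifier, and the example.org URLs are all hypothetical.

```python
class Resolver:
    """Maps a persistent work ID to its current location."""

    def __init__(self) -> None:
        self._locations = {}

    def register(self, work_id: str, location: str) -> None:
        """Record (or update) where the work currently lives."""
        self._locations[work_id] = location

    def resolve(self, work_id: str) -> str:
        """Return the current location for a work ID."""
        return self._locations[work_id]


resolver = Resolver()
resolver.register("work:0001", "http://old.example.org/images/0001.tif")

# A referencing organization stores only the ID, never the location.
citation = "work:0001"

# The maintaining organization moves the file and updates one table entry.
resolver.register("work:0001", "http://new.example.org/archive/0001.tif")
```

The citation stays valid after the move, but only if the maintaining organization keeps the table current, which is why this becomes a business process issue as much as a technical one.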

Making your works accessible throughout the Net
- The DLF/Mellon Metadata Harvesting meeting
- An administrative and political issue as much as a technical one

Major Issues Facing Digital Projects
- Dangerous Changes in Intellectual Property Law
- Intellectual Access
- Storage & Delivery
- Usefulness to User
  – Integration with other tools
  – Interoperability

Containers and Packages of Metadata
Warwick, not MARC
- modular
- overlapping
- extensible
- community-based
- designed for a networked world to aid commonality btwn communities while still providing full functionality within each community

DC Qualifiers
- allow one community to express important nuances and qualifications, while still making the basic importance available to communities with simple needs
- our community can reflect alternate title, transliterated title, and main title, yet they will all be found under a simple Web search under “title”
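
The behaviour described above, where qualified titles are all still found under plain "title", can be sketched by discarding the qualifier at search time. The tuple layout, qualifier labels, and sample values below are hypothetical and do not reproduce the official DC qualifier vocabulary.

```python
from typing import Dict, List, Tuple

# (element, qualifier, value); qualifier labels here are illustrative only.
Statement = Tuple[str, str, str]

record: List[Statement] = [
    ("title", "main", "Example Main Title"),
    ("title", "alternate", "Example Alternate Title"),
    ("title", "transliterated", "Example Transliterated Title"),
    ("creator", "", "Example Creator"),
]


def dumb_down(statements: List[Statement]) -> Dict[str, List[str]]:
    """Drop qualifiers so each value is filed under its base element."""
    simple: Dict[str, List[str]] = {}
    for element, _qualifier, value in statements:
        simple.setdefault(element, []).append(value)
    return simple


def search(statements: List[Statement], element: str, text: str) -> bool:
    """Simple search that ignores qualifiers, as a generic Web search would."""
    values = dumb_down(statements).get(element, [])
    return any(text.lower() in value.lower() for value in values)
```

A community that needs the nuance can keep reading the qualifier; a simple client ignores it and still finds every title.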

Crosswalks
- mapping btwn differing metadata structures
- eliminate the need for monolithic, universally adopted standards
- focus on flexibility and interoperability
- RDF-based metadata registries

Crosswalk Example
Columns: CDWA; Object ID; CIMI Schema; FDA; VRA Core Categories; USMARC; Dublin Core

CDWA: OBJECT/WORK (core)
  FDA: Document Classification-Catalog Level (core); Document Classification-Group Type

CDWA: Object/Work-Type (core)
  Object ID: Type of Object
  CIMI Schema: objectNAME
  FDA: Document Classification-Document Type (core); Purpose-Purpose (Broad) (core); Purpose-Purpose (Narrow)
  VRA Core Categories: W1. Work Type
  USMARC: 655 Genre-Form
  Dublin Core: Type

CDWA: Object/Work-Components quantity
  FDA: Document Classification-Extent
  USMARC: 300a Physical Description-Extent

CDWA: ORIENTATION/ARRANGEMENT Description

CDWA: TITLES OR NAMES (core)
  Object ID: Title
  CIMI Schema: objectTitle; bibliographicTitle
  FDA: Group/Item Identification-Repository Title; Group/Item Identification-Descriptive Title (core); Group/Item Identification-Inscribed Title
  VRA Core Categories: W2. Title
  USMARC: 24Xa Title and Title-Related Information
  Dublin Core: Title
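
A crosswalk of this kind can be applied mechanically as a lookup table. The sketch below uses only the Type and Title rows as read from the example above; the function name, the sample record, and the unmapped field label are hypothetical.

```python
from typing import Dict, List

# Source-field to Dublin Core mappings, per scheme, taken from the
# Type and Title rows of the crosswalk example.
CROSSWALKS: Dict[str, Dict[str, str]] = {
    "VRA Core": {"W1. Work Type": "Type", "W2. Title": "Title"},
    "USMARC": {"655": "Type", "24Xa": "Title"},
}


def to_dublin_core(scheme: str, record: Dict[str, str]) -> Dict[str, List[str]]:
    """Translate a record from `scheme` into Dublin Core elements.

    Fields with no mapping are dropped, which is the usual cost of
    crossing from a richer scheme to a simpler one.
    """
    mapping = CROSSWALKS[scheme]
    result: Dict[str, List[str]] = {}
    for field_name, value in record.items():
        target = mapping.get(field_name)
        if target is not None:
            result.setdefault(target, []).append(value)
    return result


# Hypothetical VRA-style record; "W9. Unmapped" stands in for any field
# that has no Dublin Core equivalent in the table.
vra_record = {
    "W1. Work Type": "photograph",
    "W2. Title": "Untitled",
    "W9. Unmapped": "x",
}
```

Records from either scheme end up under the same Dublin Core element names, so a single search can run across both without either community abandoning its own standard.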

Utility of Image Browsing
- http://sunsite.berkeley.edu/CalHeritage/
  – http://www.oac.cdlib.org:28008/dynaweb/ead/calher/bully/@Generic__BookTextView/146;hf=0#X
- implications of Image Browsing for Cataloging