Besser--Metadata (Brazil) 1/6/01 1

Introduction to Metadata for Digital Asset Management

Howard Besser

UCLA School of Education & Information

http://www.gseis.ucla.edu/~howard


Metadata: A fancy word for something familiar

- Cataloging
- Indexing
- Description
- …
- But also new elements of technical description (file format, compression schemes, file names, …)


Metadata for Digital Asset Management

- Importance of Metadata Standards
- Types and Uses of Metadata
- Discovery Metadata: The Dublin Core
- Administrative and Structural Metadata: The Making of America II Project
- Longevity Metadata
- Identification/Provenance
- The 4/99 NISO/DLF Image Metadata Workshop
- Various other Metadata


What is Metadata

- Structured data describing other data, used to find or help manage information resources
- Aids in interoperability
- Titles, dates, captions, cataloging and indexing data, file headers, rights info, provenance, code books, transaction logs, ...
- One person’s metadata is another’s data


Sorting through the Standards Morass

- Data Structures (DC, CDWA, MARC, VRA Core, TEI, EAD, MESL data dict)
- Data Interchange (Z39.50)
- Data Values/vocabularies (LCSH, AAT, ULAN, TGN)
- Data Content/syntax (AACR2)


Semantics/Syntax/Structure

- Semantics
  - meaning, as defined by a community to meet their particular needs (DC)
- Syntax
  - a systematic arrangement of data elements for machine processing
  - facilitates the exchange and use of metadata among various applications (HTML, XML, RDF)
- Structure
  - a formal arrangement of the syntax with the goal of consistent representation of the semantics (rules defining field contents like 1/11/99)
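A field-content rule matters because a value like 1/11/99 can be read two ways. The sketch below is only an illustration: it assumes the two candidate conventions are month-first and day-first, relies on Python's own two-digit-year handling, and uses a hypothetical helper name, normalize_date.

```python
from datetime import datetime

def normalize_date(value: str, day_first: bool) -> str:
    """Return an ISO 8601 date for a d/m/yy or m/d/yy string (hypothetical helper)."""
    fmt = "%d/%m/%y" if day_first else "%m/%d/%y"
    return datetime.strptime(value, fmt).date().isoformat()

raw = "1/11/99"
# The same characters yield two different dates depending on the rule applied.
print(normalize_date(raw, day_first=False))  # month-first reading
print(normalize_date(raw, day_first=True))   # day-first reading
```

Without an agreed rule for field contents, two repositories exchanging this value could silently disagree on the date.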


What is Metadata: Types & Uses

lots of different ways of dividing the clusters


Uses of Metadata

- Discovery & Retrieval
- Identification/Provenance
- Rights Management
- Viewing
- Integrity
- Longevity
- Content rating


Containers and Packages of Metadata

Warwick, not MARC

- modular
- overlapping
- extensible
- community-based
- designed for a networked world to aid commonality btwn communities while still providing full functionality within each community


Some different schemes where Metadata is kept

- embedded within the object (HTML tags)
- in a separate related DB maintained by same organization (OPAC, MOA II)
- in a separate DB maintained by a separate organization (Books in Print, ratings systems)
- derived on-the-fly from a different scheme (MARC-to-DC)


Collaborative Metadata Projects

- Dublin Core
- NSF/ERCIM Digital Collaboratory
- OCLC CORC Project
- Visual Resources Association (VRA) Core
- Encoded Archival Description (EAD)
- Computerized Interchange of Museum Information (CIMI)
- Records Export for Art and Cultural Heritage (REACH)


CORC--Cooperative Online Resource Catalog

- both bib records & webliographies (pathfinders)
- supports both AACR2/MARC and DC
- began 1/99, scheduled availability 7/00
- 100-200 participants
  - Academic libraries
  - OCLC networks, special libraries, public libraries, state & national libraries, consortia


Dublin Core (3/95)

- improve resource discovery
- anticipate precision problems of Web Crawler-based searching tools
- existing metadata could be “dumbed down”
- elements should be simple to understand and use, so that any individual should be able to assign terms him/herself
- software might eventually automatically generate very base-level metadata


Dublin Core

- Title
- Creator
- Subject
- Description
- Publisher
- Contributor
- Date
- Type
- Format
- Identifier
- Source
- Language
- Relation
- Coverage
- Rights


Dublin Core

- every element is both optional and repeatable
- elements are cross-disciplinary
- elements are extensible by organized communities
- can employ a syntax such as HTML’s <META> tagset for use by Spiders and Harvesters
- May 2000 DLF Metadata Harvesting Project
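As a rough illustration of embedding Dublin Core in HTML's META tagset, the sketch below writes one tag per value. The "DC." name prefix follows a commonly used convention for DC in HTML; the function name dc_meta_tags and the Subject values are assumptions made for this example, not part of the slides.

```python
from html import escape

def dc_meta_tags(record: dict[str, list[str]]) -> str:
    """Render a simple DC record as HTML meta tags (hypothetical helper)."""
    lines = []
    for element, values in record.items():
        # Every element is optional and repeatable, so each maps to a list.
        for value in values:
            content = escape(value, quote=True)
            lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)

record = {
    "Title": ["Introduction to Metadata for Digital Asset Management"],
    "Creator": ["Howard Besser"],
    "Subject": ["Metadata", "Digital asset management"],  # illustrative values
}
print(dc_meta_tags(record))
```

A spider or harvester that understands the prefix can pick these tags out of the page head without needing a separate database.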


DC Qualifiers

- allows one community to express important nuances and qualifications, while still making the basic importance available to communities with simple needs
- our community can reflect alternate title, transliterated title, and main title, yet they will all be found under a simple Web search under “title”
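One way to picture this is "dumbing down" a qualified record to its base elements. This is a minimal sketch under stated assumptions: the dotted "Element.Qualifier" naming, the qualifier names, the placeholder values, and the function dumb_down are all hypothetical.

```python
def dumb_down(qualified: dict[str, str]) -> dict[str, list[str]]:
    """Collapse qualified names like 'Title.Alternate' to the base element."""
    simple: dict[str, list[str]] = {}
    for name, value in qualified.items():
        element = name.split(".", 1)[0]  # drop the qualifier, keep the element
        simple.setdefault(element, []).append(value)
    return simple

record = {
    "Title.Main": "Main title of the work",
    "Title.Alternate": "Alternate title of the work",
    "Title.Transliterated": "Transliterated title of the work",
    "Creator": "Howard Besser",
}
print(dumb_down(record)["Title"])
```

A simple search on "Title" would then match all three variants, while the originating community keeps the finer distinctions in its own records.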


Discovery Metadata: Recent History

- Dublin Core (3/95)
- Warwick Framework (4/96)
- Image Metadata Workshop (9/96)
- Canberra, Helsinki, ... DC (98)
- Digital Library Collaboratory (97-)
- DC-8, Frankfurt 10/99


Dublin Core--further work

- Warwick Framework
  - metadata packages for extensible functions
  - laid groundwork for RDF
- Canberra Qualifiers
  - refining the semantics of the element set to provide more precise info
  - SUBELEMENT, SCHEME, LANG
- Granularity
  - no hierarchical relationships w/i a given DC record; only one record per discrete object (collection or item-level), and relationship field plus qualifier links them
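The granularity rule can be sketched as flat records tied together through a qualified Relation field. The identifiers, titles, the "Relation.IsPartOf" key, and the helper items_in are assumptions made purely for illustration.

```python
# One flat record per discrete object; no record nests inside another.
records = [
    {"Identifier": "coll-001", "Type": "collection", "Title": "Photo album"},
    {"Identifier": "item-001", "Type": "image", "Title": "Album page 1",
     "Relation.IsPartOf": "coll-001"},
    {"Identifier": "item-002", "Type": "image", "Title": "Album page 2",
     "Relation.IsPartOf": "coll-001"},
]

def items_in(collection_id: str, records: list[dict[str, str]]) -> list[str]:
    """Follow the qualified Relation field to find a collection's items."""
    return [r["Identifier"] for r in records
            if r.get("Relation.IsPartOf") == collection_id]

print(items_in("coll-001", records))
```

The hierarchy lives in the links between records rather than inside any single record.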


The Research Process and Functional Categories of Metadata

- Discovery
- Retrieval
- Collation
- Analysis
- Re-presentation


Making of America II

- Background of the DLF Project
- Administrative Metadata
- Structural Metadata


MOA2 Goal is Interoperability

Book example


DLF Metadata for Interoperability Testbed: the MOA II Project

- R & D
- Distributed Repositories
- Transportation, 1869-1900
- Testbed Project
- Best Practices
- Structural and administrative metadata


Previous Projects/Background

- Library Standards Background
- UC Berkeley Background
- Finding Aids
- EAD
- SGML EAD
- “Digital Archives”


MOA II Classes of Objects

- Continuous Tone Photos
- Photo Albums
- Diaries, journals, letterpress books
- Ledgers
- Correspondence


MOA II Metadata

- Administrative Metadata
  - for enhancing resource management
- Structural Metadata
  - for reflecting internal hierarchies and relationships btwn parts
- Raw/Seared/Cooked


Administrative Metadata

to uniquely identify a digital resource and manage it over time

- Information about where the various pieces/versions of the object reside
- Information to view the digital object
- Information about the scanning process


Structural Metadata:

that which is relevant to presentation of the digital object to the user

- metadata defining the “object”: a book, a diary, a photo album
- metadata defining the “sub-objects”: pages (physical) or chapters and subheads (intellectual)
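The object and sub-object distinction can be sketched as a small data model with a physical view (pages) and an intellectual view (chapters) over the same object. This is not the MOA II encoding; the class names, fields, and file names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Page:                      # physical sub-object
    number: int
    image_file: str

@dataclass
class Chapter:                   # intellectual sub-object
    title: str
    first_page: int
    last_page: int

@dataclass
class DigitalObject:             # the "object": a book, a diary, a photo album
    kind: str
    pages: list[Page] = field(default_factory=list)
    chapters: list[Chapter] = field(default_factory=list)

    def pages_in(self, chapter: Chapter) -> list[Page]:
        """Map an intellectual unit onto the physical pages it spans."""
        return [p for p in self.pages
                if chapter.first_page <= p.number <= chapter.last_page]

diary = DigitalObject(
    kind="diary",
    pages=[Page(n, f"page{n:03d}.tif") for n in range(1, 7)],
    chapters=[Chapter("January", 1, 4), Chapter("February", 5, 6)],
)
print([p.image_file for p in diary.pages_in(diary.chapters[1])])
```

A viewer can use such a structure to let the user page through the scans in order or jump straight to a chapter.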


SGML, XML, HTML

- TEI for structured humanities text
- EAD for Finding Aids


Other Types of Metadata

- Longevity
- Identification/Provenance
- Rights Management


NISO/DLF Image Metadata Workshop: Possible Goals

- Metadata fields
- Rules for Field Contents (authority control)
- Core set of necessary fields
- Syntax for expressing fields and contents (headers)


Image Metadata

Focus on Metadata that may prove helpful for

- management
- use
- preservation
- ...


Image Metadata

Break-out Groups: Work Done

- Characteristics and Features of Images
- Image Production and Reformatting Features
- Image Identification and Integrity


Other Metadata

- Description of depiction/surrogate (What VRA calls its "Surrogate Categories")
- Description of original object
- Rights and Reproduction Information
- Location Information


Data Structures: The VRA Core

- 28 elements specifically for visual resource collections
- Work Description Categories
- Visual Document Description Categories
- http://www.oberlin.edu/~art/vra/dsc.html


VRA Core: Work Description Categories

- Work type
- Title
- Measurements
- Material
- Technique
- Creator
- Role
- Date
- Repository name
- Repository place
- Repository number
- Current site
- Original site
- Style/period/group/movement
- Nationality/culture
- Subject
- Related work
- Relationship type
- Notes


VRA Core: Visual Document Description Categories

- Visual document type
- Visual document format
- Visual document measurements
- Visual document date
- Visual document owner
- Visual document owner number
- Visual document view description
- Visual document subject
- Visual document source


Data Value Metadata (vocabularies)

- LCSH
- TGM
- AAT
- ULAN
- TGN
- VRA Core


LCSH

very general


Thesaurus for Graphic Materials

- designed for subject indexing of pictorial materials, particularly large general collections of historical images
- for cataloging and retrieval
- good for general audiences and broad approaches to the material
- TGM-I: Subject Terms & TGM-II: Genre and Physical Characteristic Terms
- http://lcweb.loc.gov/rr/print/tgm/toc.html


AAT

- 120,000 terms for describing objects, textual materials, images, architecture, and material culture from antiquity to present
- large and complex
- http://www.getty.edu/gri/vocabularies/


ULAN

- name authority
- http://www.getty.edu/gri/vocabularies/


Thesaurus of Geographic Names

- over 1 million records
- hierarchical and global throughout history
- most records include coordinates and descriptive notes


Metadata for Digital Commerce

- DOI
- <indecs>


<Indecs>

- formal structure for describing and uniquely identifying intellectual property itself, the people and businesses involved in its trading, and the agreements which they make about it (primarily for publishing, music, and visual arts)
- will develop high-level specifications for the services that will be required to implement a global IP trading system based on this <indecs> generic data model
- focus is on encoding rights at a high level, not on resource discovery
- likely to involve metadata schema registration and directory to allow interoperation of personal identifiers for rightsholders and users
- supported by EEC DG-13
- First meeting July 1999
- http://www.indecs.org/


Metadata Mapping

- Crosswalks
- Resource Description Framework (RDF)


Crosswalks

- mapping btwn differing metadata structures
- eliminate the need for monolithic, universally adopted standards
- focus on flexibility and interoperability
- RDF-based metadata registries


Crosswalk Example

CDWA | Object ID | CIMI Schema | FDA | VRA Core Categories | USMARC | Dublin Core
OBJECT/WORK (core) | - | - | Document Classification-Catalog Level (core); Document Classification-Group Type | - | - | -
Object/Work-Type (core) | Type of Object | objectNAME | Document Classification-Document Type (core); Purpose-Purpose (Broad) (core); Purpose-Purpose (Narrow) | W1. Work Type | 655 Genre-Form | Type
Object/Work-Components | - | quantity | Document Classification-Extent | - | 300a Physical Description-Extent | -
ORIENTATION/ARRANGEMENT | - | - | - | - | - | Description
TITLES OR NAMES (core) | Title | objectTitle; bibliographicTitle | Group/Item Identification-Repository Title; Group/Item Identification-Descriptive Title (core); Group/Item Identification-Inscribed Title | W2. Title | 24Xa Title and Title-Related Information | Title
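A crosswalk like the one in this example can be treated as a lookup table. The sketch below uses only the USMARC and VRA Core pairings to Dublin Core that appear in the example; the function name to_dublin_core, the scheme labels, and the sample field values are assumptions for illustration.

```python
# (source scheme, source field) -> Dublin Core element, taken from the example rows.
CROSSWALK_TO_DC = {
    ("USMARC", "655"): "Type",
    ("USMARC", "24Xa"): "Title",
    ("VRA", "W1. Work Type"): "Type",
    ("VRA", "W2. Title"): "Title",
}

def to_dublin_core(scheme: str, record: dict[str, str]):
    """Map a source record to DC; return the DC record and the unmapped fields."""
    dc: dict[str, list[str]] = {}
    unmapped: list[str] = []
    for field_name, value in record.items():
        target = CROSSWALK_TO_DC.get((scheme, field_name))
        if target is None:
            unmapped.append(field_name)   # no DC equivalent in the example
        else:
            dc.setdefault(target, []).append(value)
    return dc, unmapped

# Sample values are invented; 300a has no Dublin Core counterpart in the example.
marc_record = {"655": "Photograph albums", "24Xa": "Sample album title", "300a": "1 album"}
dc, unmapped = to_dublin_core("USMARC", marc_record)
print(dc, unmapped)
```

Tracking the unmapped fields makes the information loss of a crosswalk visible instead of silent.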


Resource Description Framework (RDF, spec released 2/99)

- W3C Metadata activity
- designed to move the Web beyond simple links to semantically-rich relationships btwn resources
- metadata application using XML as a common syntax for exchange and processing
- flexible architecture for managing diverse application-specific metadata packets that can be processed by machines
- associates resources, property types, and corresponding values
- http://www.w3.org/RDF/


RDF

- Resources (character strings, names, digital objects)
- Property (“is the author of”)
- Value
- resources+properties=relationships
- many different relationships can be reflected
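The resource, property, and value model can be sketched as a list of three-part statements. The Statement type, the helper values_of, the property names, and the choice of the author's home page as the resource are assumptions for illustration, not RDF syntax.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    resource: str   # the thing being described
    prop: str       # the property type
    value: str      # the corresponding value

# Illustrative statements; the resource URL is the one given on the title slide.
statements = [
    Statement("http://www.gseis.ucla.edu/~howard", "Creator", "Howard Besser"),
    Statement("http://www.gseis.ucla.edu/~howard", "Publisher",
              "UCLA School of Education & Information"),
]

def values_of(resource: str, prop: str, statements: list[Statement]) -> list[str]:
    """Return every value asserted for a resource/property pair."""
    return [s.value for s in statements
            if s.resource == resource and s.prop == prop]

print(values_of("http://www.gseis.ucla.edu/~howard", "Creator", statements))
```

Because each statement stands alone, statements from different communities about the same resource can sit side by side in one collection.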


XML-encoded RDF

<?xml:namespace ns="http://www.w3.org/RDF/RDF" prefix="RDF" ?>
<?xml:namespace ns="http://purl.oclc.org/DC/" prefix="DC" ?>
<RDF:RDF>
  <RDF:Description>
    <DC:Creator>Howard Besser</DC:Creator>
  </RDF:Description>
</RDF:RDF>
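For comparison, a similar description can be produced with Python's standard library. This sketch uses the namespace URIs of the later published RDF and Dublin Core specifications, which differ from the draft-era processing instructions on the slide; the function name describe and the choice of the author's home page as the described resource are assumptions.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

def describe(resource_uri: str, creator: str) -> str:
    """Serialize one 'resource has creator' statement as RDF/XML."""
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    ET.SubElement(description, f"{{{DC_NS}}}creator").text = creator
    return ET.tostring(root, encoding="unicode")

# The resource URI is an illustrative choice taken from the title slide.
xml_text = describe("http://www.gseis.ucla.edu/~howard", "Howard Besser")
print(xml_text)
```

The shape is the same as on the slide: an RDF wrapper, a Description of one resource, and a Dublin Core property carrying the value.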


Should you start building with RDF today?

- Tools are primitive
- Standard still likely to evolve


Metadata for Digital Asset Mgmt

Howard Besser

UCLA School of Education & Information

Baca, Murtha (ed). Introduction to Metadata, Los Angeles: Getty Information Institute, 1998

http://www.getty.edu/gri/standard/intrometadata/

http://sunsite.berkeley.edu/Imaging/Databases/#standards

http://sunsite.berkeley.edu/moa2/

http://sunsite.berkeley.edu/Longevity/

http://www.ifla.org/II/metadata.htm

http://purl.oclc.org/metadata/dublin_core/

http://purl.oclc.org/corc/

http://lcweb.loc.gov/ead/

http://www.gseis.ucla.edu/~howard/image-meta.html

http://www.gseis.ucla.edu/~howard/Metadata/UC-May00/

http://sunsite.berkeley.edu/Metadata/sp2000.html