September, 1999 Grace Agnew

Metadata Overview

Metadata:

Data that describes data

Structured data about data

“Pure” metadata has meaning only in relation to the primary data that is being described.

Metadata may be either:

Extrinsic: Existing independently of the primary data being described, usually in an indexable metadatabase

or

Intrinsic: Existing as a part of the primary data being described

Design Criteria for a Metadata System:

Durable - independent of changes to hardware, software and network infrastructure

Interoperable - Can be seamlessly shared across the web with disparate hardware, software, network infrastructure and search engines

Precise - Enables the creation of customized “virtual collections”--pulling objects together seamlessly from any digital space to meet exact information requirements.

Flexible - Supports any search engine, search strategy, transport or display option.

Efficient - Provides immediate access to the most appropriate asset for the searcher.

Controlled - Ensures that digital assets pass from a trusted source to an authorized end user.

Granular - Able to search the top page, subsequent pages, or drill down to an underlying database of objects.

“Break through the web skin”

[Diagram labels: Query, Metadatabase, Search Engine, Underlying Object Database]

Key Concepts:

Semantics: Meaning ascribed by a community to a metadata element or to the values for that element. Organized into a “vocabulary.”

Structure: Imposes order for the unambiguous expression of the semantics--consistent coding, exchange and display of metadata elements, providing consistent interpretation by the end user.

Syntax: Provides a means to represent one or more structures in a flexible, extensible manner. Provides the underlying mechanism for encoding, exchange, display and machine processing of metadata. Example: XML

Schema: Identifies, defines, organizes and constrains the elements in a set, their characteristics and descriptions. Involves both semantics and structure. Examples: Dublin Core, RDF

Types of Metadata:

Structural

Describes the physical and logical attributes of the object, related to creation, transport, storage and display;

Describes the hardware and software used to create the object; (Some place this in Administrative metadata)

Describes the hardware, software and bandwidth needed to transport and display the object.

May be machine-readable, human-readable or both. May be part of the digital object header (ex: TWAIN)

Provenance/Ingest Metadata:

“Admission ticket” to the Archive or Data Repository. Acknowledges the rules of entry and identifies the object for positioning within the Archive. Best if intrinsic in the object, e.g. in the Header.

Identifies the owner/creator of the metadata.

Identifies the owner/creator of the digital asset.

Provides date created, permanence of asset; updates and modifications to asset. May “push” asset to users when content changes.

Rights & Access:

Provides requirements for access, display and download/storage of asset.

Should integrate with the organization’s access and authorization system, e.g. a reference/hyperlink to a digital certificate authority.

Should indicate user restrictions (may reference an attribute on the certificate authority’s user attribute server).

Should support multilayered access:

download only vs. store; free vs. fee; asset versions (high res. vs. low res.)

Descriptive

Should uniquely identify an asset through:

Physical description (overlap with structural metadata)

Publication/Creation information (overlap with ingest metadata)

Should describe the information content in subject and free-text fields to identify and select the asset in response to a query from a search engine.

Linking Metadata

Persistent Links:

Metadata record and the described asset.

All physical instantiations of the asset.

Registries for metadata schemas used to provide a “meta-schema” to describe the object.

Security system for access and authorization and/or link to intermediary access page

Considerable overlap with other metadata types

Mining Web Assets: Current Practice

A query is sent to a proprietary search engine, or a metasearch engine which queries many engines.

Benefits:

Ubiquitous and free; competition results in better precision and coverage

Drawbacks:

Access for assets only, not long-term management; “Ephemeral” metadata; Asset creator has no control over description and access.

Standards are Developed to:

Create durable, persistent metadata records that precisely define the asset so that exactly relevant assets are identified and retrieved in response to a query.

Create metadata that is flexible, extensible, and scalable to support the needs of any organization, any type of asset, and varying skill and interest levels of metadata creators.

Allow the metadata records from many schemas with differing levels of complexity to interoperate for data discovery.

Enable machine-intervention for automatic interpretation of metadata and data discovery, particularly among disparate search and retrieval platforms

ISO 11179

Joint Standard of the ISO (International Organization for Standardization) and IEC (International Electrotechnical Commission) to provide a robust framework for defining data elements in an unambiguous and persistent manner within user committees.

Also provides a framework for creating and maintaining metadata registries to store and maintain data element definitions.

NCITS L8 Draft Standards available at the following websites: http://www.jtc1.org/

http://pueblo.lbl.gov/~olken/X3L8/drafts/draft.docs.html

Relevant Metadata Standards:

Dublin Core Element Set V. 1.1 (IETF Recommendation)

- Flexible “lowest common denominator” standard with 15 optional, repeatable fields;

- XML and HTML based - integrates completely with assets that live on the web or are accessed via the web and live in an attached database. May be intrinsic or separate from the asset described;

- Automated tools for generating/validating Dublin Core are freely available, e.g. DC.dot: http://www.ukoln.ac.uk/metadata/dcdot/

Content        Intellectual Property   Instantiation
Title          Creator                 Date
Subject        Publisher               Type
Description    Contributor             Format
Source         Rights                  Identifier
Language
Relation
Coverage

From “Description of Dublin Core Elements”: http://purl.oclc.org/metadata/dublin_core_elements
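
A minimal sketch of how this grouping might be held in code, assuming the three-column layout of the slide (Content, Intellectual Property, Instantiation). The names `DC_ELEMENTS` and `unknown_elements` and the sample record are illustrative assumptions, not part of the Dublin Core specification.

```python
# Grouping of the 15 Dublin Core elements, following the slide's three columns.
DC_ELEMENTS = {
    "Content": ["Title", "Subject", "Description", "Source",
                "Language", "Relation", "Coverage"],
    "Intellectual Property": ["Creator", "Publisher", "Contributor", "Rights"],
    "Instantiation": ["Date", "Type", "Format", "Identifier"],
}
ALL_ELEMENTS = {name for group in DC_ELEMENTS.values() for name in group}


def unknown_elements(record):
    """Return the keys of `record` that are not Dublin Core elements.

    `record` maps an element name to a list of values: every element is
    optional (it may be absent) and repeatable (the list may hold several).
    """
    return sorted(key for key in record if key not in ALL_ELEMENTS)


# Hypothetical record, for illustration only.
record = {
    "Title": ["Metadata Overview"],
    "Creator": ["Agnew, Grace"],
    "Subject": ["Metadata", "Dublin Core"],  # a repeated element
}

print(len(ALL_ELEMENTS))         # 15
print(unknown_elements(record))  # []
```

Holding each element as a list keeps "optional" and "repeatable" in one structure: an absent key means the element was not used, and a longer list means it was repeated.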

Dublin Core Drawbacks:

Too flexible and simple for complex, sophisticated collections;

Elements lack standardized use and precision. Different communities are developing extensions to specify and categorize the elements. Approved extensions are available but slow to appear.

Some elements (rights, coverage) are ambiguous in their application

Dublin Core Drawbacks:

Intended for web objects that are textual or primarily textual. Does not provide for:

Media asset components (video sequences, scenes, shots, frames, objects);

sequential media (audio and video, slide shows);

synchronized media (video, audio, caption file or transcription; slide shows).

Result: Every Community Creates Their Own Metadata

Archives: EAD (Encoded Archival Description)

Government: GILS (Government or Global Information Locator Service)

IMS: Instructional Management Systems metadata

TEI: Text Encoding Initiative - books and humanities; TEIH (TEI Header) used for metadata description

Dublin Core “Flavors”:

EdNA: http://www.edna.edu.au/edna/owa/info.getpage?sp=auto&pagecode=5210

CIMI Guide to Best Practice: Dublin Core. Available as PDF from http://www.cimi.org/

MARC: Machine-Readable Cataloging - most library catalogs worldwide.

MPEG-7: Digital audio, video and still image files. (In development. Committee draft due October 2000)

MPEG-7:

Intended to describe audiovisual information regardless of storage, coding, display, medium or technology--will include analog and digital media and combinations of media formats

Will Standardize:

* Core set of Descriptors (D)

* Description Schemes (codified structures of Descriptors-- definition, constraints, relationships among Descriptors) (DS)

* Language defining Description Schemes and Descriptors

MPEG-7 Structural Model

[Figure not reproduced. Source: Jane Hunter, “MPEG-7: Behind the Scenes,” D-Lib Magazine, September 1999 (v. 5, no. 9): 6]

Possible MPEG-7 schema incorporating DC

<DC:Type>Image.Moving.TV.News.sequence.scene</DC:Type>
<DC:Description.text>"Footage of Grenade Attack"</DC:Description.text>
<DC:Description.transcript>"Sam Rainsy knows the violence of political life in Cambodia. Four months ago, 16 of his supporters were killed in a grenade attack in Phnom Penh."</DC:Description.transcript>
<DC:Format.Length>10seconds</DC:Format.Length>
<DC:Coverage.t.min DC.Scheme="SMPTE">19:31:57;1</DC:Coverage.t.min>
<DC:Coverage.t.max DC.Scheme="SMPTE">19:32:07;1</DC:Coverage.t.max>

From: Jane Hunter and Renato Iannella. “The Application of Metadata Standards to Video Indexing.” In Research and Advanced Technology for Digital Libraries: Second European Conference, ECDL '98, Heraklion, Crete, Greece, September 21-23, 1998: Proceedings. Berlin: Springer, 1998 (Lecture Notes in Computer Science 1513): 135-156.
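
The two SMPTE coverage values in the example bracket the scene, and their difference should agree with the stated Format.Length. A rough sketch of that arithmetic follows, assuming timecodes of the form HH:MM:SS;FF and a nominal 30 frames per second. The function names are hypothetical, and the drop-frame correction that a semicolon normally signals is deliberately ignored, so the result is the nominal length, not the exact elapsed time.

```python
import re


def to_frames(timecode, fps=30):
    """Convert 'HH:MM:SS;FF' (or 'HH:MM:SS:FF') to a nominal frame count."""
    hours, minutes, seconds, frames = (int(p) for p in re.split(r"[:;]", timecode))
    return (hours * 3600 + minutes * 60 + seconds) * fps + frames


def duration_seconds(t_min, t_max, fps=30):
    """Nominal length in seconds between two timecodes (no drop-frame correction)."""
    return (to_frames(t_max, fps) - to_frames(t_min, fps)) / fps


# Values taken from the Coverage.t.min / Coverage.t.max elements above.
print(duration_seconds("19:31:57;1", "19:32:07;1"))  # 10.0
```

Working in whole frames and dividing once at the end avoids floating-point drift in the subtraction.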

Beyond the Metadata Schema:

Access to Information:

Information stored and managed within your organization (possibly under different metadata schema)

Information stored and managed by outside organizations

Query: Books and web sites written by Grace Agnew

Author: Agnew, Grace
Parameter mapping: DC.Creator, DC.Contributor

Metadatabase - Dublin Core

Record 1: DC.Creator Grace Agnew
Record 70: DC.Contributor Grace Agnew

Result Set:

AGNEW, GRACE … 1999 …
AGNEW, GRACE … 1994 …

Books and web sites written by Grace Agnew

SEARCH ENGINE 1
Author: Agnew, Grace
Parameter mapping: DC.Creator, DC.Contributor

SEARCH ENGINE 2
Author: Agnew, Grace
Parameter mapping: 100, 700
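
The parameter mapping on these two slides can be sketched as a lookup table that translates one user-facing search parameter into the fields each engine actually indexes. This is only an illustration: the table, the `search` function and the sample records are assumptions, and 100 and 700 are read here as the MARC personal-name main entry and added entry fields.

```python
# Hypothetical mapping tables: one per search engine / metadata schema.
PARAMETER_MAPPINGS = {
    "dublin_core": {"Author": ["DC.Creator", "DC.Contributor"]},
    "marc": {"Author": ["100", "700"]},
}


def search(records, schema, parameter, value):
    """Return records whose mapped fields contain `value` for this schema."""
    fields = PARAMETER_MAPPINGS[schema][parameter]
    return [rec for rec in records
            if any(value in rec.get(field, []) for field in fields)]


# Illustrative records only; each field holds a list because fields repeat.
dc_records = [
    {"id": 1, "DC.Creator": ["Agnew, Grace"]},
    {"id": 70, "DC.Contributor": ["Agnew, Grace"]},
    {"id": 71, "DC.Creator": ["Example, Author"]},
]
marc_records = [
    {"id": "m1", "100": ["Agnew, Grace"]},
    {"id": "m2", "700": ["Agnew, Grace"]},
]

dc_hits = search(dc_records, "dublin_core", "Author", "Agnew, Grace")
marc_hits = search(marc_records, "marc", "Author", "Agnew, Grace")
print([rec["id"] for rec in dc_hits])    # [1, 70]
print([rec["id"] for rec in marc_hits])  # ['m1', 'm2']
```

The same "Author" query reaches both record sets because only the mapping table changes between engines, not the query itself.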

Z39.50

Information Retrieval (Z39.50): Application Service Definition and Protocol Specification

Enables a client to interact with multiple servers, employing different search engines and different data element formats and definitions, to search databases and retrieve the records that result from the search

Z39.50

Initiates a session between client and server

Executes a query from the client against one or more databases on the server

Creates a result set consisting of records that match the query on one or more query attributes (access points)

Z39.50

Returns a report on the number of records matching the search

Returns records--individual records selected by the client--in a format selected by the client

Primary formats returned: MARC, SUTRS, extending to SQL, Dublin Core, other schema
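
The session steps on the last three slides (initiate, search, build a result set, report the hit count, return selected records in a chosen format) can be walked through with a toy in-memory model. This is not a Z39.50 implementation and does not speak the wire protocol. The class, its method signatures, the "DC" syntax label and the sample records are all assumptions made for illustration.

```python
class ToyServer:
    """In-memory stand-in for a Z39.50 target (hypothetical, no networking)."""

    def __init__(self, databases):
        self.databases = databases  # database name -> list of record dicts
        self.result_sets = {}       # result set name -> matching records
        self.session_open = False

    def init(self):
        """Init: open a session between client and server."""
        self.session_open = True
        return True

    def search(self, database, access_point, term, result_set="default"):
        """Search: run the query, keep the result set, report the hit count."""
        if not self.session_open:
            raise RuntimeError("no session: call init() first")
        hits = [rec for rec in self.databases[database]
                if term.lower() in rec.get(access_point, "").lower()]
        self.result_sets[result_set] = hits
        return len(hits)

    def present(self, positions, record_syntax="SUTRS", result_set="default"):
        """Present: return the selected records in the syntax the client chose."""
        records = self.result_sets[result_set]
        chosen = [records[p - 1] for p in positions]  # positions are 1-based
        if record_syntax == "SUTRS":  # simple unstructured text
            return ["; ".join(f"{k}: {v}" for k, v in rec.items()) for rec in chosen]
        if record_syntax == "DC":     # simplified Dublin Core view
            return [{"DC.Creator": rec["author"], "DC.Title": rec["title"]}
                    for rec in chosen]
        raise ValueError(f"unsupported record syntax: {record_syntax}")


# Illustrative data only.
server = ToyServer({"catalog": [
    {"author": "Agnew, Grace", "title": "Metadata Overview"},
    {"author": "Example, Author", "title": "Another Record"},
]})
server.init()
hit_count = server.search("catalog", "author", "agnew")
print(hit_count)            # 1
print(server.present([1]))  # ['author: Agnew, Grace; title: Metadata Overview']
```

Keeping the result set on the server and fetching records by position mirrors the split between searching and retrieving that the slides describe.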

Z39.50 Version 3

Extends the capabilities of the standard to include:

• Boolean and proximity searching

• Extended services, including saved queries to be periodically re-executed (“SDI”)

• “Explain” facility to allow the client to solicit information about the server and dynamically reconfigure itself.

Z39.50

Profiles for User Groups:

LOC: Access to Digital Collections

LOC: Access to Digital Library Objects

CIMI: Companion profile for museum digital collections and objects

GEO: Geospatial Datasets

Z+SQL: extension to the SQL query language

Z39.50 - Limitations

Requires client software and Z39.50-enabled server software (which requires a Z39.50-aware search engine)

Most commercial C/S Products have not implemented the “explain” feature in version 3

Requires human collaboration for implementation, particularly at the profile level

Limited primarily to features provided by commercial servers and clients

Z39.50 - Limitations

Indexing parameters proprietary to the server database are not shared with the client, so the client cannot override or extend the proprietary search parameters

Databases that are not on a Z39.50 server are invisible

Metadata Registries:

Dynamic specification, maintenance and description of metadatabase structures:

unambiguous definition of data structures

unambiguous definition and description of relationships between data structures, behaviors of data structures, integrity constraints on the contents of data structures.

semantics (meaning in context) and structure definition

Metadata Registries

Links/Hooks into subordinate registries used to define data content within a metadata element

Mapping of data structures between registries

Should be both eye-readable and able to be interpreted by computer programs for seamless, unambiguous discovery, query and display across disparate database and search engine structures and to enable intelligent query agents, advanced data mining, etc.

Metadata Registries

Collaborative Effort of the Joint Technical Committee 1 (JTC1) of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC)

Open Forum on Metadata Registries:

http://www.sdct.itl.nist.gov/~ftp/l8/sc32wg2/2000/events/openforum/index.htm

Metadata Registries

REGGIE - Java Applet that dynamically creates metadata according to available online registries;

Allows you to enter your own registry, describing, characterizing and constraining all the elements in the set.

http://metadata.net

UK/Australia joint effort

Anything by Grace Agnew?

Metadatabase: Scheme = DC <URL of Registry>

Dublin Core: Author defined as Creator, Contributor

Resource Description Framework

W3C Resource Description Framework (RDF) Model and Syntax Specification (22 February 1999): http://www.w3.org/TR/REC-rdf-syntax/

Provide robust application of metadata in the web environment:

Model for unambiguous, schema-independent description of resources.

Key Concepts:

Resource: Any object uniquely identifiable by a URI (uniform resource identifier)

Property-type: Property associated with a resource.

Value: Associated with a property type--may be atomic (a string) or another resource (creating a new hierarchy)

RDF

Property types express the relationships of values associated with resources:

“Famous Example”

The Author of “Metadata Overview” is Grace Agnew

Resource: Metadata Overview (http://www…….edu/mo)

Property Type: Author

Value: “Grace Agnew”
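
The resource / property type / value model can be sketched as a three-part statement. The `Statement` type and `values_of` helper below are hypothetical names. Because the slide's URL is elided, a placeholder URI under example.org stands in for it.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    resource: str       # URI identifying the thing being described
    property_type: str  # the relationship being asserted
    value: str          # a literal here; could also be another resource's URI


# Placeholder URI: the real address is elided on the slide.
MO = "http://example.org/mo"

statements = [
    Statement(MO, "Author", "Grace Agnew"),
    Statement(MO, "Title", "Metadata Overview"),
]


def values_of(statements, resource, property_type):
    """Return every value asserted for one property type of one resource."""
    return [s.value for s in statements
            if s.resource == resource and s.property_type == property_type]


print(values_of(statements, MO, "Author"))  # ['Grace Agnew']
```

When a value is itself a resource URI, further statements can be made about it, which is how the nested hierarchy mentioned under Key Concepts arises.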

RDF

Enables interoperability among metadata schemes, including the modular use of multiple schemes within a metadata record utilizing the XML namespace facility;

Adds machine-interpretable semantics to the encoding, exchange and reuse of structured metadata;

Enables automatic negotiation between search engine, metadata record, and metadata registry for powerful, flexible search and retrieval independent of server and client search and retrieval infrastructures (or, at least, it will!)

Application of Dublin Core and RDF for resource description:

Dublin Core in HTML - Resides in the Header Element

<html>
<head>
<title>A Thousand Wheels are Set in Motion - Georgia Tech Library and Information Center</title>
<link rel="schema.DC" href="http://purl.org/dc">
<meta name="DC.Title" content="A Thousand Wheels are Set in Motion">
<meta name="DC.Title.Alternative" content="The Building of Georgia Tech at the Turn of the 20th Century, 1888-1908">
<meta name="DC.Creator.CorporateName" scheme="LCNAF" content="Georgia Institute of Technology Library and Information Center">
<meta name="DC.Subject" scheme="LCSH" content="Georgia Institute of Technology--Buildings">
<meta name="DC.Description" content="This Web site provides photographs, engravings and sketches of the first buildings on the Georgia Tech Campus, from 1888-1908. As of 9/20/1999, 88 images are provided but more will be added. Cataloged in EAD Single Item Metadata format.">
<meta name="DC.Publisher.CorporateName" scheme="LCNAF" content="Georgia Institute of Technology Library and Information Center">
<meta name="DC.Contributor.PersonalName" scheme="LCNAF" content="Chritton, Heather">
<meta name="DC.Contributor.PersonalName" scheme="LCNAF" content="Crafts, Laurel">

Full Metadata record: http://www.library.gatech.edu/gtbuildings
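
Because the Dublin Core elements sit in ordinary meta tags, a harvester can pull them out with a generic HTML parser. The sketch below uses Python's standard `html.parser` module on a shortened copy of the header above. The `DublinCoreExtractor` class is a hypothetical name, and treating every meta name that starts with "DC." as a Dublin Core element is an assumption.

```python
from html.parser import HTMLParser


class DublinCoreExtractor(HTMLParser):
    """Collect (name, scheme, content) for every <meta name="DC...."> tag."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            self.elements.append(
                (name, attributes.get("scheme"), attributes.get("content")))


# Shortened copy of the header shown above.
page = """<html><head>
<title>A Thousand Wheels are Set in Motion</title>
<meta name="DC.Title" content="A Thousand Wheels are Set in Motion">
<meta name="DC.Subject" scheme="LCSH" content="Georgia Institute of Technology--Buildings">
<meta name="DC.Contributor.PersonalName" scheme="LCNAF" content="Chritton, Heather">
</head><body></body></html>"""

extractor = DublinCoreExtractor()
extractor.feed(page)
for name, scheme, content in extractor.elements:
    print(name, scheme, content)
```

Elements without a scheme attribute come back with `None` in the second position, so the caller can tell controlled values from free text.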

RDF / Dublin Core in XML

<?xml:namespace href="http://www.w3c.org/RDF/" as="RDF"?>
<?xml:namespace href="http://purl.org/RDF/DC" as="DC"?>
<?xml:namespace href="http://loc.gov/LCNAF" as="LCNAF"?>
<?xml:namespace href="http://loc.gov/LCSH" as="LCSH"?>
<RDF:RDF>
<RDF:Description RDF:HREF="http://purl.org/metadata/dublin_core_elements">
<DC:Title>A Thousand Wheels are Set in Motion</DC:Title>
<DC:Title.Alternative>The Building of Georgia Tech at the Turn of the 20th Century, 1888-1908</DC:Title.Alternative>
<DC:Creator.CorporateName>
<RDF:Description>
<LCNAF:CorporateName>Georgia Tech Library and Information Center</LCNAF:CorporateName>
</RDF:Description>
</DC:Creator.CorporateName>
<DC:Subject>
<RDF:Description>
<LCSH:CorporateName>Georgia Institute of Technology--Buildings</LCSH:CorporateName>
</RDF:Description>
</DC:Subject>
<DC:Description>This Web site provides photographs, engravings and sketches of the first buildings on the Georgia Tech Campus, from 1888-1908. As of 9/20/1999, 88 images are provided but more will be added. Cataloged in EAD Single Item Metadata (SIM) format.</DC:Description>
<RDF:Seq>
<RDF:Description>
<RDF:LI><LCNAF:PersonalName>Chritton, Heather</LCNAF:PersonalName></RDF:LI>
<RDF:LI><LCNAF:PersonalName>Crafts, Laurel</LCNAF:PersonalName></RDF:LI>
</RDF:Description>
</RDF:Seq>

Notes:

1. RDF shows three types of relationships among collected resources:

Sequence (specified ordering of elements)

Bag (all members of equal importance)

Alternatives (choice between members)

In this example, I am specifying among contributors that Heather Chritton, the web page developer, appears first among contributors and Laurel Crafts, the digital image creator, appears second. Other contributors follow (text creation, metadata creation, indexing, etc.) in specified order in the complete record. I use the RDF Sequence list to establish this fixed contributor order.

2. LCSH (Library of Congress Subject Headings) and LCNAF (Library of Congress Name Authority File) do not currently reside on web pages at a URL. The URLs provided are for illustration only.
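
The three container types in note 1 differ only in how their members are compared or chosen, which a few lines of code can make concrete. This is a loose analogy in plain Python rather than an RDF implementation. The helper names are hypothetical, and the "high res." / "low res." alternatives are borrowed from the Rights & Access slide purely as sample values.

```python
from collections import Counter


def same_sequence(a, b):
    """Sequence: members and their order must both match."""
    return list(a) == list(b)


def same_bag(a, b):
    """Bag: same members, order irrelevant."""
    return Counter(a) == Counter(b)


def choose_alternative(alternatives, acceptable=()):
    """Alternatives: pick the first acceptable member, else the first listed."""
    for candidate in alternatives:
        if candidate in acceptable:
            return candidate
    return alternatives[0]


contributors = ["Chritton, Heather", "Crafts, Laurel"]
swapped = list(reversed(contributors))

print(same_sequence(contributors, swapped))  # False
print(same_bag(contributors, swapped))       # True
print(choose_alternative(["high res.", "low res."], acceptable={"low res."}))  # low res.
```

Swapping the two contributors changes the sequence but not the bag, which is why the fixed contributor order in the example calls for a Sequence.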

XML

Extensible Markup Language, a subset of SGML (Standard Generalized Markup Language), provides the ability to define elements within a web document. XML documents have a logical and a physical structure. Each unit of an XML document is an entity. Entities are defined within the document in relation to each other. The logical and physical structures of the document include declarations, elements, comments, character references and processing instructions. Structural relationship is provided through nesting.
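
Nesting as the carrier of structural relationships can be seen by parsing a small document and printing its element tree. The sketch uses Python's standard `xml.etree.ElementTree`. The element names in the sample document and the `outline` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the element names are illustrative only.
document = """<record>
  <title>Metadata Overview</title>
  <creator>
    <name>Agnew, Grace</name>
  </creator>
</record>"""


def outline(element, depth=0):
    """Return one indented line per element, showing parent/child nesting."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(document))))
```

The indentation of each printed tag reflects how deeply it is nested: `name` belongs to `creator`, which in turn belongs to `record`.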

XML

XML display is governed by an attached style document, formulated in CSS (Cascading Style Sheets) or XSL (Extensible Stylesheet Language) to provide rules for display. Styles can be applied to single elements as well as to the entire document. More than one style sheet or style document can be provided for a document or element, with precedence rules governing the given display.

DTD: The Document Type Definition provides a formally defined structure, vocabulary and syntax for an XML document type. Documents are validated against a DTD to ensure that nesting structure and semantic constraints are followed, giving consistent meaning across documents.

DCD (Document Content Description): A semantic superset of XML DTDs--intended to be conformant with the RDF Model and Syntax Specification. Describes an XML vocabulary for schemas--for specifying object classes. Based on elements (RDF property types) and attributes. Supports RDF vocabulary and constructs.

SOX: Schema for Object-Oriented XML

Alternative to DTD for validating XML documents. Supports scalar (numeric) datatypes, enumerated datatypes (values enumeration) and format datatypes. An expanded namespace facility allows objects from any identifiable namespace to be used to build the document.

Role of the Database:

A database that can be parsed and reported to a validated XML metadata format, as well as other metadata syntaxes, provides a robust space for metadata development. Also reports to any XML document type and hooks into applications via APIs, to support unique user needs.

[Diagram labels: ORACLE DATABASE; MARC-BASED CATALOG; COLLABORATIVE RESEARCH SPACE; WEB-BASED COURSEWARE APPLICATION; SUBJECT-SPECIFIC WEB RESEARCH TOOL; PERSONAL RESEARCH SPACE]

Last Step: Data Retrieval

Data storage, access and delivery architecture should be open, standards-based, hardware and software independent, providing users across platforms with common, consistent interface and underlying storage structure for efficient retrieval, display, storage and use of digital information

Data architecture should support a well-defined, widely available security system to validate authenticity of users and provide data for a variety of uses according to a scalable authorization hierarchy

Last Step: Data Retrieval

Data architecture should support data as objects for scalable, extensible access, with sophisticated and flexible support for object relationships, particularly to support different physical instantiations of identical data, e.g. digital video object as D1, MPEG1, Quicktime, etc.

CORBA: Common Object Request Broker Architecture - emerging architecture for open distributed object computing. Intended to provide transparent access to applications and databases, regardless of the hardware and software infrastructure at each end of the transaction

Putting It All Together:

A Digital Archive Architecture

Reference Model for Open Archival Information Systems (OAIS),

Developed by a US ISO archiving group under ISO TC20/SC13 and the Consultative Committee for Space Data Systems (CCSDS). This model has recently been released for formal ISO and CCSDS review. An electronic version of the OAIS Reference Model can be found at

http://www.ccsds.org/RP9905/RP9905.html

Reference Model for Open Archival Information Systems (OAIS)

EXTERNAL DATA FLOW DIAGRAM