EE 6850, F'02, Chang, Columbia U 1

Lecture 3: Multimedia Metadata Standards

Prof. Shih-Fu Chang

EE 6850, Fall 2002

Sept. 18, 2002

Course URL: http://www.ee.columbia.edu/~sfchang/course/vis/


References

• Digital Still Camera Image File Format Standard (Exchangeable image file format for Digital Still Cameras: Exif), Version 2.1. http://www.exif.org/
• Introduction to MPEG-7 (v2), Document ISO/IEC JTC1/SC29/WG11 N3751, Oct. 2000.
• DIG35 Image Metadata Standard. http://www.i3a.org/i_dig35.html
• S.-F. Chang, T. Sikora and A. Puri, "Overview of the MPEG-7 Standard," IEEE Transactions on Circuits and Systems for Video Technology, special issue on MPEG-7, June 2001.


Why Metadata Standard?

• Content Exchange
  • Content owners
  • Consumers
• Interoperable Client Applications
  • Cross-operator information access
  • Meta search engines


DIG 35 - image metadata


DIG35

• Participants: Canon, Kodak, Fuji, HP, Microsoft, Polaroid, Seattle Film Works, etc.
• Time frame: started 1999, WD 1.0 March ’00, V 1.0 Aug. 2000.
• Use-case scenarios: albuming, content searching, linking, information/copyright preservation


DIG 35 metadata interchange model


DIG 35 metadata subblocks


EXIF: Exchangeable image file format

• Version history: Version 1.0 (October 1996), Version 1.1 (May 1997), Version 2.1 (June 1998)
• Supported by most digital camera manufacturers
• Consists of both image and audio file specifications
• Image file spec includes:
  • Structure of image data files
  • Tags used by this standard
  • Definition and management of format versions


EXIF Image File Spec

• Compressed files are recorded as JPEG.
• Uncompressed files are recorded in TIFF Rev. 6.0.
• A feature of Exif image files is their compatibility with standard formats in wide use today.
• Related attribute information for both compressed and uncompressed files is stored in the tag information format defined in TIFF Rev. 6.0.
• New Exif-specific attributes are stored as private tags in TIFF.
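To make the TIFF tag format concrete, here is a minimal sketch of reading the first image file directory (IFD) of a TIFF-structured buffer. It assumes the standard TIFF layout (a 2-byte byte-order mark, the magic number 42, a 4-byte offset to the first IFD, then 12-byte entries of tag, type, count, and value/offset). The function name `parse_ifd` and the hand-built `sample` buffer are hypothetical and exist only for illustration; tag 0x8769 is the private tag that points to the Exif-specific IFD.

```python
import struct

def parse_ifd(data):
    """Return (tag, type, count, value_or_offset) tuples from the first IFD."""
    byte_order = data[:2]
    if byte_order == b"II":        # little-endian ("Intel")
        endian = "<"
    elif byte_order == b"MM":      # big-endian ("Motorola")
        endian = ">"
    else:
        raise ValueError("not a TIFF header")
    magic, first_ifd = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    (count,) = struct.unpack_from(endian + "H", data, first_ifd)
    entries = []
    for i in range(count):
        # Each IFD entry is 12 bytes: tag, type, count, value or offset.
        entries.append(struct.unpack_from(endian + "HHII", data, first_ifd + 2 + 12 * i))
    return entries

# Hypothetical buffer: header, two entries, and a zero "next IFD" offset.
sample = (
    b"II" + struct.pack("<HI", 42, 8)
    + struct.pack("<H", 2)
    + struct.pack("<HHII", 0x0112, 3, 1, 1)    # Orientation, type SHORT, value 1
    + struct.pack("<HHII", 0x8769, 4, 1, 38)   # Exif IFD pointer, type LONG
    + struct.pack("<I", 0)
)

for tag, field_type, count, value in parse_ifd(sample):
    print(hex(tag), field_type, count, value)
```

A real reader would follow the 0x8769 offset to a second IFD holding the Exif-specific tags; the sketch stops at the first directory.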

MPEG-7 Standard

• Flexible, extensible, multi-level, and standard framework for describing multimedia
  • Systems, DDL, Video, Audio, MDS, Software
• Scope
  [Figure: Feature Extraction → MPEG-7 Description → Search/Filtering Application]
• Schedule
  • Call For Proposals: 10/98
  • Working Draft: 12/99
  • Committee Draft: 10/00
  • International Standard: 9/01


MPEG-7 Segment Types


MPEG-7 Framework

• Description Definition Language (DDL)
  • Language to create new Ds/DSs or extend existing ones
  • Extends XML-Schema
• Description Schemes (DSs)
  • Structure and semantics of relations among Ds/DSs
• Descriptors (Ds)
  • Representation of a feature of AV data

[Figure: diagram relating Description Definition Language, Description Scheme, Descriptor, AV Content Item, Data, Feature, and User or System; edge labels: defines, describes, signifies; multiplicities 0..* and 1..*]

MPEG-7 Application Chain

[Figure: block diagram with MM Content, Description Generation, MPEG-7 Description, Encoder, MPEG-7 Coded Description, Decoder, Search/Query Engine, Filter Agents, and User or data processing system, together with the Description Definition Language (DDL), Description Schemes (DS), and Descriptors (D)]

Parts of MPEG-7 (ISO/IEC 15938)

• Systems
  • Binary encoding, Dynamic update, Transport, Synchronization, and IPMP tools
• Description Definition Language (DDL)
  • Language for defining new, extending existing DSs and Ds
• Visual
  • Visual Ds and DSs
• Audio
  • Audio Ds and DSs
• Multimedia Description Schemes (MDS)
  • Generic Ds and DSs; neither purely visual nor purely audio
• Reference Software
• Conformance

XML

• eXtensible Markup Language (XML)
  • Derived from SGML (Standard Generalized Markup Language)
  • Description of structure and semantics of documents
  • Human- and machine-readable
  • Author-defined elements and attributes: DTD or XML-Schema

<customer id="Ana2000">
  <name> Ana Benitez </name>
  <address country="US">
    <street>500 W 120</street>
    <city> New York </city>
    <state> New York </state>
    <postal> 94571 </postal>
  </address>
</customer>

XML / DTD / XML-Schema

XML Description

<customer id="Ana2000">
  <name> Ana Benitez </name>
  <address country="US">
    <street>500 W 120</street>
    <city> New York </city>
    <state> New York </state>
    <postal> 94571 </postal>
  </address>
</customer>

DTD Definition

<!ELEMENT customer (name, email?, address+)>
<!ATTLIST customer id ID #REQUIRED>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT address (street, city, state, postal)>
<!ATTLIST address country CDATA #REQUIRED>
<!ELEMENT street (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT postal (#PCDATA)>
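The following sketch illustrates what the DTD content models constrain. Python's standard library parses XML but does not validate against a DTD, so the content models are transcribed by hand into regular expressions over child tag names. The names `CONTENT_MODELS`, `REQUIRED_ATTRS`, and `check` are hypothetical; this is a stand-in for a validating parser, not a replacement for one.

```python
import re
import xml.etree.ElementTree as ET

CUSTOMER_XML = """<customer id="Ana2000">
  <name> Ana Benitez </name>
  <address country="US">
    <street>500 W 120</street>
    <city> New York </city>
    <state> New York </state>
    <postal> 94571 </postal>
  </address>
</customer>"""

# DTD content models rewritten as regexes over space-terminated child names:
# (name, email?, address+) and (street, city, state, postal).
CONTENT_MODELS = {
    "customer": r"name (email )?(address )+",
    "address": r"street city state postal ",
}
REQUIRED_ATTRS = {"customer": ["id"], "address": ["country"]}

def check(element):
    """Return a list of violations for element and all of its descendants."""
    errors = []
    model = CONTENT_MODELS.get(element.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in element)
        if not re.fullmatch(model, children):
            errors.append(f"{element.tag}: unexpected children [{children.strip()}]")
    for attr in REQUIRED_ATTRS.get(element.tag, []):
        if attr not in element.attrib:
            errors.append(f"{element.tag}: missing attribute {attr}")
    for child in element:
        errors.extend(check(child))
    return errors

root = ET.fromstring(CUSTOMER_XML)
print(check(root))
```

The description above satisfies both content models, so no violations are reported. Dropping the address element or the id attribute would produce one message each.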

XML / DTD / XML-Schema (cont)

XML-Schema Definition (for the same customer description)

<complexType name="customer">
  <element name="name" type="string"/>
  <element name="email" type="string" minOccurs="0"/>
  <element name="address" type="addressType" maxOccurs="unbounded"/>
  <attribute name="id" type="ID" use="required"/>
</complexType>
<complexType name="addressType">
  <element name="street" type="string"/>
  <element name="city" type="string"/>
  <element name="state" type="string"/>
  <element name="postal" type="positiveInteger"/>
  <attribute name="country" type="string"/>
</complexType>
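One thing the schema adds over the DTD is datatypes: postal is declared as positiveInteger rather than free text. The sketch below hand-checks only that one constraint, assuming the usual lexical form of positiveInteger (an optional plus sign followed by decimal digits, with a value greater than zero). The helper names are hypothetical, and a real application would use a schema-validating parser instead.

```python
import xml.etree.ElementTree as ET

def is_positive_integer(text):
    """True if text looks like an XML Schema positiveInteger."""
    s = text.strip()
    if s.startswith("+"):
        s = s[1:]
    return s.isascii() and s.isdigit() and int(s) > 0

def invalid_postals(customer_xml):
    """Return the postal values under customer/address that fail the type check."""
    root = ET.fromstring(customer_xml)
    bad = []
    for address in root.findall("address"):
        postal = address.findtext("postal", default="")
        if not is_positive_integer(postal):
            bad.append(postal.strip())
    return bad

good = '<customer id="a"><address country="US"><postal> 94571 </postal></address></customer>'
print(invalid_postals(good))
```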


Some Useful Sites

• W3C: http://www.w3.org/xml
• XML Cover Pages: http://www.oasis-open.org/cover/
• Web Developer’s Virtual Library: http://wdvl.com/
• XML Industry Portal: http://www.xml.org/
• XML Schemas Endgame: http://www.xml.com/pub
• Apache XML Project: http://xml.apache.org/
• IBM alphaWorks: http://www.alphaWorks.ibm.com/


Video Descriptors

• Color: Dominant Color, Scalable Color, Color Layout, Color Structure, GoF/GoP Color
• Texture: Homogeneous Texture, Texture Browsing, Edge Histogram
• Shape: Region Shape, Contour Shape, 3D Shape
• Motion: Camera Motion, Motion Trajectory, Parametric Motion, Motion Activity
• Localization: Region Locator, Spatio-Temporal Locator
• Other: Face Recognition


Example: Color Histogram

<GoFGoPHistogram HistogramTypeInfo="Average">
  <ColorHistogram>
    <ColorSpace> <HSV/> </ColorSpace>
    <ColorQuantization ColorQuantizationType="uniform">
      <bin_number> 4 </bin_number>
      <bin_number> 4 </bin_number>
      <bin_number> 4 </bin_number>
    </ColorQuantization>
    <Histogram HistogramNormFactor="1" NumberHistogramBins="64">
      <HistogramValue> 444 </HistogramValue>
      <HistogramValue> 34 </HistogramValue>
      <HistogramValue> 58 </HistogramValue>
      <HistogramValue> 564 </HistogramValue>
      <HistogramValue> 16 </HistogramValue>
      <!-- Other HistogramValue elements -->
    </Histogram>
  </ColorHistogram>
</GoFGoPHistogram>
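As a rough illustration of where the values in such a description could come from, the sketch below builds a uniformly quantized 4 x 4 x 4 HSV histogram (64 bins) per frame and averages the histograms over a group of frames, as the "Average" histogram type suggests. The function names, the bin index layout, and the pixel format (RGB tuples in the range 0 to 255) are assumptions made for illustration; this is not the normative MPEG-7 extraction procedure.

```python
import colorsys

def hsv_histogram(pixels, bins=(4, 4, 4)):
    """Count RGB pixels (0-255 per channel) into a uniform HSV histogram."""
    hb, sb, vb = bins
    hist = [0] * (hb * sb * vb)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Clamp so that a component equal to 1.0 falls into the last bin.
        hi = min(int(h * hb), hb - 1)
        si = min(int(s * sb), sb - 1)
        vi = min(int(v * vb), vb - 1)
        hist[(hi * sb + si) * vb + vi] += 1
    return hist

def average_histogram(histograms):
    """Bin-wise mean over the histograms of a group of frames."""
    n = len(histograms)
    return [sum(column) / n for column in zip(*histograms)]

# Two tiny hypothetical "frames" of four pixels each.
frame1 = [(255, 0, 0)] * 3 + [(0, 0, 255)]
frame2 = [(0, 0, 255)] * 4
gof = average_histogram([hsv_histogram(frame1), hsv_histogram(frame2)])
print(len(gof), sum(gof))
```

Averaging keeps the total count equal to the per-frame pixel count, so histograms from groups of different lengths stay comparable.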


Structure Description Tools

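[Figure: class diagram of the structure description tools. Segment DS describes Multimedia Content; related tools: SegmentRelation DS and SegmentDecomposition DS; segment types: VideoSegment DS, MovingRegion DS, StillRegion DS, ...; attached descriptors: TextAnnotation D, SpatialMask D, ...]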

Structure Description (I)

[Figure: a video segment broken down through segment decompositions into further video segments and moving regions, with a relation ("above") between segments. Descriptor lists shown in the figure:
• MediaTime, Mosaic, GoFGoPColor, TextAnnotation
• MediaTime, ScalableColor, ParametricMotion, TextureBrowsing, ContourShape, TextAnnotation]

Structure Description (II)

<StillRegion id="SR1">
  <TextAnnotation>
    <FreeTextAnnotation> Alex shakes hands with Ana </FreeTextAnnotation>
  </TextAnnotation>
  <SpatialDecomposition overlap="false" gap="true">
    <StillRegion id="SR2">
      <TextAnnotation> <FreeTextAnnotation> Alex </FreeTextAnnotation> </TextAnnotation>
      <VisualDescriptor xsi:type="ColorStructureType"> ... </VisualDescriptor>
    </StillRegion>
    <StillRegion id="SR3">
      <TextAnnotation> <FreeTextAnnotation> Ana </FreeTextAnnotation> </TextAnnotation>
      <MatchingHint>
        <Hint value="0.455" xpath="../../VisualDescriptor"/>
      </MatchingHint>
      <Relation xsi:type="DirectionalSpatialSegmentRelationType" name="left" target="#SR2"/>
      <VisualDescriptor xsi:type="ColorStructureType"> ... </VisualDescriptor>
    </StillRegion>
  </SpatialDecomposition>
</StillRegion>

Figure annotations:
• Still region SR1: Creation information, Text annotation
• Still region SR2: Text annotation, Color structure
• Still region SR3: Text annotation, Matching hint, Color structure
• Spatial segment decomposition: No overlap, gap
• Directional spatial segment relation: left
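A description like the one on this slide can be walked with any XML parser. The sketch below uses Python's standard library to collect each still region's free-text annotation and the directional relations between regions. It is a trimmed copy of the slide's example. The xsi namespace declaration on the root element is an assumption added here so that the xsi:type attribute parses, and the function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

DESCRIPTION = """
<StillRegion id="SR1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <TextAnnotation>
    <FreeTextAnnotation> Alex shakes hands with Ana </FreeTextAnnotation>
  </TextAnnotation>
  <SpatialDecomposition overlap="false" gap="true">
    <StillRegion id="SR2">
      <TextAnnotation> <FreeTextAnnotation> Alex </FreeTextAnnotation> </TextAnnotation>
    </StillRegion>
    <StillRegion id="SR3">
      <TextAnnotation> <FreeTextAnnotation> Ana </FreeTextAnnotation> </TextAnnotation>
      <Relation xsi:type="DirectionalSpatialSegmentRelationType" name="left" target="#SR2"/>
    </StillRegion>
  </SpatialDecomposition>
</StillRegion>
"""

def summarize(xml_text):
    """Return ({region id: annotation}, [(source id, relation name, target id)])."""
    root = ET.fromstring(xml_text)
    regions = {}
    relations = []
    # iter() visits the root StillRegion as well as the nested ones.
    for region in root.iter("StillRegion"):
        label = region.findtext("TextAnnotation/FreeTextAnnotation", default="")
        regions[region.get("id")] = label.strip()
        for rel in region.findall("Relation"):
            relations.append((region.get("id"), rel.get("name"), rel.get("target").lstrip("#")))
    return regions, relations

regions, relations = summarize(DESCRIPTION)
print(regions)
print(relations)
```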


Semantic Description Tools

[Figure: class diagram of the semantic description tools. Shown: Semantic DS, SemanticBag DS, SemanticBase DS, Object DS, AgentObject DS, Event DS, Concept DS, SemanticState DS, SemanticPlace DS, SemanticTime DS, SemanticRelation DS, AnalyticModel DS, Segment DS, AbstractionLevel, and Label; edge labels: captures, describes; other labels: Narrative World, Multimedia Content]

Semantic Description

Figure annotations:
• Agent objects AO1 and AO2 (Alex, Ana): Label, Person
• Event EV1 (Shake hands): Label
• Concept C1 (Comradeship): Label, Property, Property
• SemanticPlace SP1 (New York): Label, Place
• SemanticTime ST1 (9 September): Label, Time
• Object-event relations: hasAgentOf, hasAccompanierOf
• Concept-semantic base relation: hasPropertyOf
• Segment-semantic base relations: hasMediaPerceptionOf, hasMediaSymbolOf
• Semantic time-semantic base relation: hasTimeOf
• Semantic place-semantic base relation: hasLocationOf

An MPEG-7 Description

[Figure: the example image annotated with the structure callouts (still regions SR1 to SR3, spatial segment decomposition, directional relation "left") and the semantic callouts (agent objects AO1 and AO2, event EV1 with semantic time and semantic place, concept C1, and their relations), plus the information below.]

• Creation information: Creation, Creator, Creation coordinates, Creation location, Creation date
  • Photographer: Seungyup; Place: Columbia University; Time: 19 September 1998
• Media information: Media profile, Media format, Media instance
  • 704x480 pixels; True color RGB; http://www.ee.columbia.edu/~ana/alex&ana.jpg
• Usage information: Rights
  • Columbia University, All rights reserved

Multimedia Description Schemes

[Figure: overview of the MDS tools, grouped as follows]
• Content organization: Collections, Models
• Navigation & Access: Summaries, Variations, Views
• User interaction: User Preferences, User History
• Content management: Creation & Production, Media, Usage
• Content description: Structure, Semantics
• Basic elements: Schema Tools, Basic Datatypes, Links & Media Localization, Basic Tools

Other MDS

• Creation & Production
  • Description of content creation and production (e.g. title and creator), mostly author-generated
• Usage
  • Description of usage of the content (e.g. rights holders and publication)
• Media
  • Description of instances of storage media (e.g. storage format) for AV content

Other MDS Categories

• Navigation & Access
  • Description of summaries (hierarchical and sequential) and views for efficient browsing
  • Description of variations for personalized access
    • Translation, transcription, reduction, etc.
• Content Organization
  • Description of collections, classifications, and models
• User Interaction
  • Description of user’s preferences pertaining to consumption of multimedia material

MPEG-7 Terminal Architecture (March ‘01)

[Figure: layered terminal diagram. Transmission/Storage Medium (IP, MP4, MPEG-2, ATM, ...); Delivery Layer (demultiplexing of multiplexed streams); Elementary Streams (schema streams, description streams, multimedia streams, upstream data); Compression Layer (BiM/textual parsing, BiM/textual decoding, Schema Decoder, Description Decoder, reconstruction; labels: Defines, Describe); Application (APIs)]

Related Course Projects

• Survey of GPS attributes and their potential use in image organization
• Visualization tools for metadata streams