M4 – WP3

Multimodal integration

Progress report

Viper group

Computer Vision and Multimedia Lab

University of Geneva

30-01-03

S. Marchand-Maillet http://viper.unige.ch/ M4 meeting #3, Sheffield, UK, January 2003

Progress report

UniGE
  Information retrieval setup / extension
  Video data processing
  Information management framework

WP3
  Issues
  Status – deliverable

Information retrieval setup (initial)

[Diagram. Components shown: A/V/text input; Segmentation; Event definition; URLisation; SQL DB; Time codes; URLs; Characterisation; Feature definition; Feature files; Keyframes; GIFT indexing; Index file; GIFT; Text; QBE query; Text query; Interface; MRML; Query client.]
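The "SQL DB" box in the setup above receives time codes and URLs from segmentation and URLisation. A minimal sketch of such a store is given below. The table and column names (`segments`, `media_url`, `start_s`, `end_s`, `keyframe_url`) and the example URL are assumptions for illustration only; the slides do not give the actual schema.

```python
import sqlite3


def create_segment_db():
    """Create an in-memory database holding one row per segmented event."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE segments (
            id           INTEGER PRIMARY KEY,
            media_url    TEXT NOT NULL,   -- URL of the source A/V document
            start_s      REAL NOT NULL,   -- start time code, in seconds
            end_s        REAL NOT NULL,   -- end time code, in seconds
            keyframe_url TEXT,            -- URL of a representative keyframe
            CHECK (end_s > start_s)
        )
        """
    )
    return conn


def add_segment(conn, media_url, start_s, end_s, keyframe_url=None):
    """Insert one segment and return its row id."""
    cur = conn.execute(
        "INSERT INTO segments (media_url, start_s, end_s, keyframe_url) "
        "VALUES (?, ?, ?, ?)",
        (media_url, start_s, end_s, keyframe_url),
    )
    return cur.lastrowid


def segments_at(conn, media_url, t):
    """Return (id, start_s, end_s) for segments of a document covering time t."""
    rows = conn.execute(
        "SELECT id, start_s, end_s FROM segments "
        "WHERE media_url = ? AND start_s <= ? AND ? < end_s ORDER BY start_s",
        (media_url, t, t),
    )
    return rows.fetchall()
```

Intervals are treated as half-open, so a time code that falls exactly on a boundary belongs to the segment that starts there.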

Information retrieval setup (planned)

[Diagram. Components shown: A/V/text input; Segmentation; Event definition; URLisation; SQL DB; Time codes; URLs; Characterisation; Feature definition; Feature files; Keyframes; GIFT indexing; Index file; GIFT; Text (shown twice); QBE query; Text query; Audio query; Interface; MRML; Query client.]

Video processing (1)

OVAL: Video Access Library
  C++ Video Object Model
  Accepts plugins for specific formats
    MPEG-1: Dali from Cornell
    LibDV, « XML » video plugin
  Provides a generic API
    Open, Close, GetProp stream
    GetFrame(s)
    Specific (MPEG: getMV, getDCT)
  Does not accommodate image processing functionalities
    Use of Matlab Mex with persistent memory
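The plugin idea on this slide (one generic API, one plugin per format) can be sketched as follows. This is not OVAL's actual interface, which is C++ and not reproduced in the slides: the class and function names (`VideoPlugin`, `register_plugin`, `open_video`, `FakeVideoPlugin`) and the `.fake` extension are hypothetical, chosen only to mirror the Open / Close / GetProp / GetFrame operations listed above.

```python
from abc import ABC, abstractmethod


class VideoPlugin(ABC):
    """Hypothetical format plugin: one subclass per container or codec."""

    @abstractmethod
    def open(self, path): ...

    @abstractmethod
    def close(self): ...

    @abstractmethod
    def get_prop(self, name): ...

    @abstractmethod
    def get_frame(self, index): ...


_PLUGINS = {}  # file extension -> plugin class


def register_plugin(extension, plugin_cls):
    """Associate a file extension with the plugin that handles it."""
    _PLUGINS[extension.lower()] = plugin_cls


def open_video(path):
    """Pick a plugin from the file extension, open the file, return the plugin."""
    ext = path.rsplit(".", 1)[-1].lower()
    try:
        plugin = _PLUGINS[ext]()
    except KeyError:
        raise ValueError(f"no plugin registered for .{ext}") from None
    plugin.open(path)
    return plugin


class FakeVideoPlugin(VideoPlugin):
    """Stand-in plugin that synthesises frames instead of decoding a file."""

    def open(self, path):
        self._props = {"path": path, "frame_count": 3, "width": 2, "height": 2}

    def close(self):
        self._props = None

    def get_prop(self, name):
        return self._props[name]

    def get_frame(self, index):
        if not 0 <= index < self._props["frame_count"]:
            raise IndexError(index)
        # A frame is a height x width grid filled with the frame index.
        return [[index] * self._props["width"] for _ in range(self._props["height"])]


register_plugin("fake", FakeVideoPlugin)
```

Format-specific calls such as the MPEG motion-vector and DCT accessors would live on the concrete plugin rather than on the generic base class.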

Video Processing (2)

Video segmentation
  Classical techniques
  Based on spatio-temporal features (ongoing)
    Mixed colour/motion information
  Needs to be extended to event-based segmentation
    Integration of M4 features
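One commonly used classical segmentation technique is to compare intensity histograms of consecutive frames and declare a cut when they differ strongly. The sketch below illustrates that idea only; the slides do not say which classical techniques were used, and the frame representation (a flat list of 8-bit grey values), the bin count and the threshold are all assumptions.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalised intensity histogram of a frame given as a flat list of pixels."""
    counts = [0] * bins
    for p in frame:
        counts[p * bins // max_value] += 1
    n = len(frame)
    return [c / n for c in counts]


def detect_cuts(frames, threshold=0.5):
    """Return indices i where a cut is declared between frames[i-1] and frames[i]."""
    if not frames:
        return []
    cuts = []
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        # L1 distance between consecutive histograms, always within [0, 2].
        diff = sum(abs(a - b) for a, b in zip(prev, cur))
        if diff > threshold:
            cuts.append(i)
        prev = cur
    return cuts
```

A spatio-temporal or event-based variant would replace the histogram with richer features, but the thresholded frame-to-frame comparison stays the same.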

Video characterisation
  Estimation on feature pattern model (motion)
  Support Vector Regression
    Non-linear Prediction of Chaotic Time Series using SVM, NNSP'97 (Mukherjee, Osuna, Girosi)
    Predicting Time Series with SVM, ICANN'97 (Müller, Smola, Schölkopf, Vapnik)

Video Similarity Measure

S(V1 , V2) = 1 − E_{V1}(V2)

Problems:
  S(V1 , V1) ≠ 0
  S(V1 , V2) ≠ S(V2 , V1)

Artificial symmetrisation
  D(V1 , V2) = 0.5 * [S(V1 , V2) + S(V2 , V1)]
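The measure on this slide can be sketched in code, reading the definition as S(V1, V2) = 1 − E_V1(V2), where E_V1(V2) is the error made on V2 by a predictor trained on V1. Several assumptions are made here: each video is reduced to a one-dimensional feature series, the error is a mean squared one-step prediction error normalised by the variance of the target and clipped to [0, 1], and a linear AR(1) predictor is swapped in for Support Vector Regression to keep the example dependency-free.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] ~ a * x[t] + b; returns (a, b)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


def prediction_error(model, series):
    """Normalised one-step prediction error of `model` on `series`, in [0, 1]."""
    a, b = model
    xs, ys = series[:-1], series[1:]
    mse = sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    my = sum(ys) / len(ys)
    var = sum((y - my) ** 2 for y in ys) / len(ys)
    if var == 0.0:
        return 0.0 if mse == 0.0 else 1.0
    return min(1.0, mse / var)


def similarity(v1, v2):
    """S(V1, V2) = 1 - E_V1(V2): predictor trained on v1, evaluated on v2."""
    return 1.0 - prediction_error(fit_ar1(v1), v2)


def symmetrised(v1, v2):
    """D(V1, V2) = 0.5 * [S(V1, V2) + S(V2, V1)]."""
    return 0.5 * (similarity(v1, v2) + similarity(v2, v1))
```

Because the predictor is trained on one series and tested on the other, swapping the arguments generally changes the value, which is the asymmetry the slide lists as a problem; averaging the two directions removes it.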

Video Classification

Distance matrix computed with prediction error D(Vi , Vj )
  For all pairs of videos <i,j> in the given database
  Di,j = D(Vi , Vj )

Curvilinear Component Analysis is applied on D
  ⇒ gives a 2-dimensional mapping of the feature space
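The two steps on this slide (fill a pairwise matrix, then map it to two dimensions) can be sketched as follows. Curvilinear Component Analysis is a non-linear method and is not implemented here; classical multidimensional scaling is swapped in as a simpler stand-in that also turns a distance matrix into 2-D coordinates. The helper names and the power-iteration eigen-solver are assumptions made to keep the example free of dependencies.

```python
import math


def distance_matrix(items, dist):
    """D[i][j] = dist(items[i], items[j]) for all pairs <i, j>."""
    return [[dist(a, b) for b in items] for a in items]


def _power_iteration(mat, iters=500):
    """Dominant eigenpair of a symmetric matrix by repeated multiplication."""
    n = len(mat)
    # Fixed start vector; assumed not orthogonal to the dominant eigenvector.
    vec = [1.0 / (i + 1) for i in range(n)]
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0, [0.0] * n
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n))
    return value, vec


def classical_mds(d, dims=2):
    """Embed a symmetric distance matrix into `dims` coordinates per item."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    # Double centring: B = -1/2 * J * D^2 * J
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        value, vec = _power_iteration(b)
        scale = math.sqrt(max(value, 0.0))  # negative eigenvalues are clipped
        for i in range(n):
            coords[i][k] = scale * vec[i]
        # Deflate so the next pass finds the next eigenvector.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords
```

For a matrix of symmetrised video measures, `dist` would be the function computing D(Vi, Vj); the sketch assumes the resulting matrix is symmetric and close to Euclidean.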

Preliminary experiment

29 video shots containing mainly TV news and sports activities

Ongoing…

Text retrieval
  Inclusion within GIFT
  Multimodal embedding (visual+text query)
  Query expansion (e.g. using WordNet)

Event characterisation
  High-level model
  Feature-based inference
    ⇒ Characterisation of well-known events
    ⇒ Suitable for restricted contexts (M4)
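Query expansion, listed above under text retrieval, can be illustrated with a small sketch: each query term is kept at full weight and its synonyms are added at a lower weight. The synonym table below is a tiny hand-made stand-in for WordNet, and the function name and weights are assumptions; the slides do not describe the planned expansion scheme.

```python
# Tiny hand-made synonym table standing in for WordNet (illustrative entries only).
SYNONYMS = {
    "meeting": ["conference", "gathering"],
    "talk": ["presentation", "speech"],
}


def expand_query(query, synonyms=SYNONYMS, original_weight=1.0, synonym_weight=0.5):
    """Return {term: weight}; original terms keep full weight, synonyms get less."""
    weights = {}
    for term in query.lower().split():
        weights[term] = max(weights.get(term, 0.0), original_weight)
        for syn in synonyms.get(term, []):
            # Never lower the weight of a term the user typed explicitly.
            weights[syn] = max(weights.get(syn, 0.0), synonym_weight)
    return weights
```

Down-weighting the added terms keeps the original query dominant while still letting documents that use a synonym match.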

Information management

MRML: Going toward version 2.0
  More multimedia
  More like an XML protocol (as defined by W3C - XMLP)
  Truly multimedia / multimodal
  ⇒ Spec proposal release mid-Feb
  ⇒ Expected validation software: this summer

DEVA (Annotation model)
  Based on RDF and Dublin Core (XML)
  DAML+OIL (OWL) compatible
  Makes existing software available (Xerces, Jena,…)
  Allows multiple extensions (WordNet,…)
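To give a feel for an annotation based on RDF and Dublin Core, the sketch below builds a plain RDF/XML description with a few Dublin Core properties. It does not reproduce the DEVA model, whose schema is not given in the slides; the resource URL and the property values are made-up placeholders. Only the standard RDF and Dublin Core element namespaces are used.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_url, **dc_fields):
    """Build an rdf:RDF element with one rdf:Description of the resource."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_url}
    )
    for name, value in dc_fields.items():
        # Each keyword becomes a Dublin Core property, e.g. title -> dc:title.
        ET.SubElement(desc, f"{{{DC_NS}}}{name}").text = value
    return root


annotation = describe(
    "http://example.org/meetings/m001.mpg",  # hypothetical resource URL
    title="Project meeting, segment 1",      # placeholder values
    creator="annotator-1",
)
xml_text = ET.tostring(annotation, encoding="unicode")
```

Because the output is ordinary RDF/XML, generic XML and RDF tooling of the kind named on the slide can read it without knowing anything project-specific.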

WP3: Initial work plan

WP3: Deliverables

D3.1: Report on baseline information access methods, m12 (Feb 2003)
  Technical doc of the working system in place

D3.2: Report on methods for multimodal integration and NLP, m24 (Feb 2004)
  Define intuitive way for meeting data querying and retrieval

D3.3: Final report on multimodal information access, m36 (Feb 2005)
  Technical doc of the meeting manager

D3.1

Gathered basic information
  Group-based

Template sent by next week
  Activity-based
  Description of what you can contribute in one field

Response by Feb 20th
  Fill in where you feel is relevant

Edited by end of Feb
  Smoothed out gaps…

Sent to Steve by mid-March

WP3: Issues

Visual data is not usable alone
  Need for text transcripts
  Use of « external » data

Need for common format for data exchange
  Annotation (explicit)
  Processing results

Increase collaboration
  Integration

WP3 breakdown

Year 1 (-> 03/2003)
  Emphasis on multimedia information processing and retrieval
    Image, Video: Visual + Motion
    Audio (speech), Text
  Framework: Architecture, integration

Year 2 (-> 03/2004)
  Emphasis on multimodal interaction (query processing)
    Information from text, speech (text?), gesture,...
    Natural language processing

Year 3 (-> 03/2005)
  Emphasis on data summarisation
    Video, dialogs, documents

????

The framework

[Architecture diagram. Components shown: CBIR server; GIFT; plugins (PluginX, PluginY); MRML; open socket; MRML logging; multimedia feature storage; feature extraction (online / offline); multimedia data; http server; URL abstraction (temporary local copy); TCP/IP; client. Three client types, each with a socket and an MRML layer: QBE query formulator (e.g. PHP interface); existing tool with tool plugin (e.g. GIMP plugin); assessor (e.g. Viper evaluation script). Messages exchanged: queries, responses, relevance feedback.]
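In this framework, clients send queries and relevance feedback to the CBIR server as MRML messages over a socket. The sketch below only builds such a message as XML text. The element and attribute names (`mrml`, `query-step`, `user-relevance-element-list`, `user-relevance-element`, `session-id`, `result-size`, `image-location`, `user-relevance`) follow MRML 1.0 as recalled and are assumptions to be checked against the specification; the function name and the example URLs are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_query_step(session_id, relevance, result_size=20):
    """Serialise a query-by-example request.

    `relevance` maps an image URL to +1 (relevant) or -1 (not relevant).
    """
    root = ET.Element("mrml", {"session-id": session_id})
    step = ET.SubElement(
        root,
        "query-step",
        {"session-id": session_id, "result-size": str(result_size)},
    )
    items = ET.SubElement(step, "user-relevance-element-list")
    for url, score in relevance.items():
        if score not in (1, -1):
            raise ValueError(f"relevance must be +1 or -1, got {score!r}")
        ET.SubElement(
            items,
            "user-relevance-element",
            {"image-location": url, "user-relevance": str(score)},
        )
    return ET.tostring(root, encoding="unicode")
```

A relevance-feedback round would reuse the same message shape, with the user's positive and negative judgements on the previous results as the entries of `relevance`.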
