Multimedia Computing and Its Applications

Xiao-Yong Wei (魏骁勇)

Machine Intelligence Laboratory

Sichuan University, China

Multimedia Computing

http://jaeger.earthsci.unimelb.edu.au/Images/Topographic/Whole_Earth/Earth_50.jpg

12,756.32 kilometers

YouTube (83M videos, 10 hrs/min)

Web (10B videos watched per month)

All broadcast (70,000TB/year)

Issues Related to Explosive Multimedia Data Growth

• Management and Retrieval

• Illegal Information

• Copyrights

• Communications

• …

Semantic Gap

Semantic Gap

Figure: queries are posed at the user level, while text, image, motion, and audio data are held at the multimedia level as low-level representations (low-level features). The semantic gap separates the two levels.

Case Study: Semantics-based Video Search

Text Search on Multimedia

Figure: the user-level query "Find a car" is matched against the text tags attached to items at the multimedia level (woman, man, AD, car).

Untrustworthy Tags

Find a car

Semantics in Multimedia

Sky

Mountain

Desert

Road

Car

Bridging Semantic Gap

Figure: a layer of high-level semantic concepts (face, car, animal, ...) sits between the user-level query and the low-level representations of text, image, motion, and audio, linking low-level features to high-level concepts.

We can only develop a limited number of concept detectors.

Bridging Semantic Gap

Figure: general vocabularies are added above the high-level semantic concepts (face, car, animal, ...). Query-to-concept reasoning links the user-level query to the high-level concepts, which sit above the low-level features of text, image, motion, and audio.

Modeling the Reasoning

Find me images including military vehicles

How do WE search images?

Find images including military vehicles (human reasoning process)

armored car, tank ARE military vehicles

armored car

tank

Find images including military vehicles (human reasoning process)

Find images including military vehicles (human reasoning process)

Figure: example images labeled soldier, soldiers, explosion, and military vehicle.

Find images including military vehicles (human reasoning process)

This is a tank on a lawn

tank

lawn

This might be what they want!!!

Human Reasoning Skills: Find images including military vehicles

• Semantic Reasoning (e.g., IS-A relation): armored car, tank

• Contextual Reasoning: explosion, soldiers

• Visual Mapping: tank, lawn

Find images including military vehicles

Semantic Reasoning

• Ontology or Semantic Web - a conventional way

Figure: an ontology tree with the nodes weapon, gun, vehicle, tank, and armored car.
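The IS-A reasoning step can be sketched as a lookup in a small concept hierarchy. This is a minimal illustration, not the method used in the talk: the dictionary layout and the function name `expand_query_concept` are hypothetical, and only the "armored car, tank ARE military vehicles" relation comes from the slides.

```python
# Toy IS-A hierarchy: parent concept -> direct children.
# Only the "military vehicle" entry is taken from the slides; the layout of
# the rest of the tree is an assumption made for illustration.
IS_A_CHILDREN = {
    "military vehicle": ["armored car", "tank"],
    "weapon": ["gun"],
}


def expand_query_concept(concept, hierarchy):
    """Return the concept followed by all of its descendants (breadth-first)."""
    expanded = [concept]
    queue = list(hierarchy.get(concept, []))
    while queue:
        child = queue.pop(0)
        if child not in expanded:
            expanded.append(child)
            queue.extend(hierarchy.get(child, []))
    return expanded


# A query about military vehicles is widened to the concepts that ARE military vehicles.
military_concepts = expand_query_concept("military vehicle", IS_A_CHILDREN)
```

Expanding the query this way lets detectors for "tank" or "armored car" answer a query that never mentions them by name.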

Semantic Reasoning

• Semantic space – a linear vector space making the reasoning computable

Figure: the ontology nodes (weapon, gun, vehicle, tank, armored car) are placed in a Semantic Space with axes labeled B1 and B2.
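One way to read "making the reasoning computable" is that relatedness between concepts becomes a similarity between vectors. The sketch below assumes cosine similarity and uses made-up two-dimensional coordinates on the B1 and B2 axes; neither the coordinates nor the choice of cosine similarity comes from the slides.

```python
import math

# Hypothetical coordinates of each concept on the two axes (B1, B2).
SEMANTIC_SPACE = {
    "weapon": (0.9, 0.3),
    "gun": (0.95, 0.1),
    "vehicle": (0.3, 0.9),
    "tank": (0.7, 0.7),
    "armored car": (0.5, 0.85),
}


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_concepts(query_vector, space):
    """Sort concept names from most to least similar to the query vector."""
    return sorted(space, key=lambda name: cosine(query_vector, space[name]), reverse=True)


# Concepts ranked by closeness to "vehicle" in the toy space.
ranking = rank_concepts(SEMANTIC_SPACE["vehicle"], SEMANTIC_SPACE)
```

Unlike a tree lookup, the vector form gives a graded score for every concept pair, including pairs with no direct edge in the ontology.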

Figure: scatter plot of concept labels in a two-dimensional space (horizontal axis from -1.2 to 0.8, vertical axis from -0.5 to 0.4), with a magnified view of the region from 0.45 to 0.7 horizontally and -0.42 to -0.28 vertically. Labels include walking, marching, military, tank, aircraft, car, truck, vehicle, weapon, explosion, sky, cloud, and road.

Figure: scatter plot of concept labels in a two-dimensional space (horizontal axis from -1.2 to 0.8, vertical axis from -0.5 to 0.4), with a magnified view of the region from 0.5 to 0.75 horizontally and -0.28 to -0.14 vertically. Labels include walking, marching, military, tank, aircraft, car, truck, vehicle, weapon, explosion, sky, cloud, and road.

Figure: scatter plot of concept labels in a two-dimensional space (horizontal axis from -1.2 to 0.8, vertical axis from -0.5 to 0.4). Labels include walking, marching, military, tank, aircraft, car, truck, vehicle, weapon, explosion, sky, cloud, and road.

Contextual Reasoning

• How concepts occur together

Weakness of Conventional Context Reasoning Approaches

Pair-wise correlation measurements are locally determined

Human annotators are forgetful

Example user annotations of keyframes:

car, road, people, building…

car, road…

car, road, vehicle, trees…

Concepts: vehicle, car, road, water, …

Vehicle is easily overlooked by annotators when they annotate a keyframe in which a car is present. [J.R. Kender, ICME07]

Contextual Reasoning

• Context Space – a computable space for contextual reasoning

Figure: concept labels (e.g., LSCOM) such as boat, car, vehicle, road, and water are placed in a Context Space with axes labeled B1 and B2.
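A rough sketch of why a space built from all annotations can help when pair-wise counts are unreliable: each concept is described by its whole co-occurrence profile, and two concepts are compared through those profiles. The construction below (raw counts plus cosine similarity) is an assumption for illustration, not the Context Space construction from the talk, and the last two keyframes are hypothetical.

```python
import math
from itertools import combinations

# Toy keyframe annotations. The first three follow the example label lists on
# the slides; the last two are hypothetical additions.
KEYFRAME_LABELS = [
    {"car", "road", "people", "building"},
    {"car", "road"},
    {"car", "road", "vehicle", "trees"},
    {"vehicle", "road"},
    {"boat", "water"},
]


def cooccurrence_profiles(keyframes):
    """Map each concept to its vector of co-occurrence counts with every concept."""
    concepts = sorted(set().union(*keyframes))
    index = {concept: i for i, concept in enumerate(concepts)}
    profiles = {concept: [0] * len(concepts) for concept in concepts}
    for labels in keyframes:
        for a, b in combinations(sorted(labels), 2):
            profiles[a][index[b]] += 1
            profiles[b][index[a]] += 1
    return profiles


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


profiles = cooccurrence_profiles(KEYFRAME_LABELS)
car_vehicle = cosine(profiles["car"], profiles["vehicle"])
car_boat = cosine(profiles["car"], profiles["boat"])
```

Here "car" and "vehicle" are labeled together only once, yet their profiles overlap strongly because both co-occur with "road", while "car" and "boat" share no context at all.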

Using SS and CS for Query-to-Concept Mapping

Query: find vehicles on the way

SS: vehicle, road

OS: vehicle, road, car, car_on_road, water, boat

Using SS and CS for Query-to-Concept Mapping

Find shots of a person walking or riding a bicycle

Anchor concepts

Visual Mapping

Detector   Detection Scores
Car        0.9   0.8   0.75   0.1   0.1
Road       0.8   0.9   0.65   0.2   0.15
Person     0.1   0.2   0.13   0.9   0.8

Signed Fisher Ratio

Query images as Positive Samples
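The slide names a signed Fisher ratio without giving its formula. The sketch below assumes a common form, sign(μ+ − μ−) · (μ+ − μ−)² / (σ+² + σ−²), and further assumes that the first three score columns belong to the query images (positive samples) and the last two to other images. Both assumptions, the constant `NUM_POSITIVE`, and the small `eps` term are illustrative only.

```python
from statistics import mean, pvariance

# Detection scores from the slide's table, one row per concept detector.
DETECTION_SCORES = {
    "Car": [0.9, 0.8, 0.75, 0.1, 0.1],
    "Road": [0.8, 0.9, 0.65, 0.2, 0.15],
    "Person": [0.1, 0.2, 0.13, 0.9, 0.8],
}

# Assumption: the first three columns are query images (positive samples).
NUM_POSITIVE = 3


def signed_fisher_ratio(positive, negative, eps=1e-9):
    """Squared mean gap over summed variances, carrying the sign of the gap."""
    gap = mean(positive) - mean(negative)
    spread = pvariance(positive) + pvariance(negative) + eps  # eps avoids division by zero
    ratio = gap * gap / spread
    return ratio if gap >= 0 else -ratio


ratios = {
    name: signed_fisher_ratio(scores[:NUM_POSITIVE], scores[NUM_POSITIVE:])
    for name, scores in DETECTION_SCORES.items()
}
```

Under these assumptions, detectors that fire more strongly on the query images than on the rest (Car, Road) get a positive ratio, and a detector that fires mostly elsewhere (Person) gets a negative one.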

Fusing Multiple Clues

• Semantic reasoning

• Contextual reasoning

• Visual mapping

Framework of our method

Random Walk

Context Graph

A partial view of the Context Graph. Edge width indicates the strength of the contextual relationship.

Figure: three panels, (a), (b), and (c), of concept nodes. Labels in extraction order: Logos_Full_Screen, Network_Logo, Text_On_Artificial_Background, Ties, Male_Reporter, Interview_Sequences, Speaking_To_Camera, Male_Anchor, Computer_TV-screen, Studio, News_Studio, Computer_Or_Television_Screens, Female_Anchor, Commentator_Or_Studio_Expert, Studio_With_Anchorperson, Armed_Person, Ground_Combat, Military_Personnel, Shooting, Street_Battle, Weapons, Armored_Vehicles, Tanks, Machine_Guns, Military, Rifles, Soldiers, Exploding_Ordinance, Explosion_FireSmoke, Windy, Ground_Vehicles, Road, Ground Transportation, Factory, Smoke_Stack, Observation_Tower, Television_Tower, Tower, Oil_Drilling_Site, Oil_Field, Pipes, Coal_Powerplants, Power_Plant, Processing_Plant, Urban_Scenes, Building, Urban Life, Outdoor, Face, Group, Crowd.

Context Graph

A middle-sized swarm in the Graph. Edge width indicates the strength of the contextual relationship.

Figure: concept nodes Beach, Oceans, Boat_Ship, Harbors, Lakes, River, Rowboat, Waterscape_Waterfront, Waterways, Canoe, Radar, Raft, River_Bank, Freighter, Ship.

Random Walk

Walker -> the information from the query (semantic & visual)

Transition probability -> contextual similarity (whose strength is indicated by edge width)

Figure: panels (a) and (b) show scores on the concept nodes Birds, Flying_Objects, Airplane, Sky, Airplane_Flying, Vehicle, River, Car, Cloud, Animal, and Tower. Nodes are marked as concepts by semantic mapping or concepts by visual mapping, and scores spread between nodes by propagation.
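The propagation idea can be sketched as a random walk with restart over a small context graph. This is an illustrative sketch, not the framework from the talk: the edge weights, the `damping` value, the number of steps, and the restart formulation are all assumptions; only the node names come from the slide.

```python
# Hypothetical contextual similarities between a few of the slide's concept nodes.
EDGES = [
    ("Airplane", "Sky", 0.9),
    ("Airplane", "Airplane_Flying", 0.9),
    ("Airplane", "Vehicle", 0.6),
    ("Sky", "Cloud", 0.8),
    ("Vehicle", "Car", 0.7),
]
NODES = ["Airplane", "Sky", "Airplane_Flying", "Vehicle", "Cloud", "Car", "River"]


def transition_probabilities(nodes, edges):
    """Row-normalize symmetric edge weights into per-node transition probabilities."""
    weights = {node: {} for node in nodes}
    for a, b, w in edges:
        weights[a][b] = w
        weights[b][a] = w
    transitions = {}
    for node, neighbors in weights.items():
        total = sum(neighbors.values())
        transitions[node] = {n: w / total for n, w in neighbors.items()} if total else {}
    return transitions


def random_walk(initial, transitions, damping=0.5, steps=50):
    """Repeatedly mix the initial query scores with scores arriving from neighbors."""
    scores = dict(initial)
    for _ in range(steps):
        propagated = {node: 0.0 for node in initial}
        for node, neighbors in transitions.items():
            for neighbor, prob in neighbors.items():
                propagated[neighbor] += scores[node] * prob
        scores = {
            node: (1 - damping) * initial[node] + damping * propagated[node]
            for node in initial
        }
    return scores


# The walker starts from the concept selected by the query (here: Airplane only).
initial = {node: 0.0 for node in NODES}
initial["Airplane"] = 1.0
scores = random_walk(initial, transition_probabilities(NODES, EDGES))
```

Concepts that are contextually close to the starting concept (Sky, Airplane_Flying) end up with higher scores than distant ones (Car), and a concept with no contextual link (River) stays at zero.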

Demo

Ongoing Research

• Multi-modality fusion

• Social network analysis

– Community detection

Twitter Topics

Related Applications

• Patch space model

– Near duplicate detection

– AD/Geo/Historical info delivery

• Object tracking

– Vehicle detection and tracking

– Surveillance video analysis

• Story tracker

• Tobacco sorter

Thanks!