mpeg araf tutorial @ ismar 2014

MPEG for Augmented Reality ISMAR, September 9, 2014, Munich AR Standards Community Meeting September 12, 2014 Marius Preda, MPEG 3DG Chair Institut Mines TELECOM http://www.slideshare.net/MariusPreda/mpeg-augmented-reality-tutorial


DESCRIPTION

A set of slides introducing ARAF - An MPEG standard for Mixed Reality

TRANSCRIPT

Page 1: Mpeg ARAF tutorial @ ISMAR 2014

MPEG for Augmented Reality

ISMAR, September 9, 2014, Munich

AR Standards Community Meeting September 12, 2014

Marius Preda, MPEG 3DG Chair

Institut Mines TELECOM

http://www.slideshare.net/MariusPreda/mpeg-augmented-reality-tutorial

Page 2: Mpeg ARAF tutorial @ ISMAR 2014

What you will learn today

• Who is MPEG and why MPEG is doing AR

• MPEG ARAF design principles and the main features

• Create ARAF experiences: two exercises

Page 3: Mpeg ARAF tutorial @ ISMAR 2014

Tidy City

Page 4: Mpeg ARAF tutorial @ ISMAR 2014

Portal Hunt

Page 5: Mpeg ARAF tutorial @ ISMAR 2014

Elements

Page 6: Mpeg ARAF tutorial @ ISMAR 2014

ARQuiz

Page 7: Mpeg ARAF tutorial @ ISMAR 2014

Augmented Books

Page 8: Mpeg ARAF tutorial @ ISMAR 2014

Event LOOV

Available on AppStore, Android stores and MyMultimediaWorld.com

• Collecting virtual money in the real world to buy real services and products

Page 9: Mpeg ARAF tutorial @ ISMAR 2014

Summer School (1 week) Games

Page 10: Mpeg ARAF tutorial @ ISMAR 2014

What do these "games" have in common?

Based on MPEG ARAF

Augmented Reality Application Format

Page 11: Mpeg ARAF tutorial @ ISMAR 2014

Why MPEG AR?

MPEG Augmented Reality

Page 12: Mpeg ARAF tutorial @ ISMAR 2014

Answers to (some of) Christine’s (non-technical) questions

• Who is MPEG?

• What does MPEG do successfully?

• Who are the members?

• IPR policy

Page 13: Mpeg ARAF tutorial @ ISMAR 2014

What is MPEG?

A suite of ~130 ISO/IEC standards for:

• Coding/compression of elementary media: Audio (MPEG-1, 2 and 4), Video (MPEG-1, 2 and 4), 2D/3D graphics (MPEG-4)

• Transport: MPEG-2 Transport, File Format, Dynamic Adaptive Streaming over HTTP (DASH)

• Hybrid (natural & synthetic) scene description, user interaction (MPEG-4)

• Metadata (MPEG-7)

• Media management and protection (MPEG-21)

• Sensors and actuators, Virtual Worlds (MPEG-V)

• Advanced User interaction (MPEG-U)

• Media-oriented middleware (MPEG-M)

More ISO/IEC standards under development for:

• Coding and Delivery in Heterogeneous Environments (incl.)

• 3DVideo

• …

Page 14: Mpeg ARAF tutorial @ ISMAR 2014

• A standardization activity continuing for 25 years

– Supported by several hundred companies/organisations from ~25 countries

– ~500 experts participating in quarterly meetings

– More than 2300 active contributors

– Many thousands of experts working in companies

• A proven manner to organize the work to deliver useful and used standards

– Developing standards by integrating individual technologies

– Well defined procedures

– Subgroups with clear objectives

– Ad hoc groups continuing coordinated work between meetings

• MPEG standards are widely referenced by industry

– 3GPP, ARIB, ATSC, DVB, DVD-Forum, BDA, ETSI, SCTE, TIA, DLNA, DECE, OIPF…

• Billions of software and hardware devices built on MPEG technologies

– MP3 players, cameras, mobile handsets, PCs, DVD/Blu-ray players, STBs, TVs, …

• Business friendly IPR policy established at ISO level

What is MPEG?

Page 15: Mpeg ARAF tutorial @ ISMAR 2014

MPEG technologies related to AR: 1st pillar

First form of broadcast signal augmentation

• 1992/4: MPEG-1/2 (AV content)

• 1997: VRML

• 1998: MPEG-4 v.1

– Part 11 - BIFS: Binarisation of VRML; Extensions for streaming; Extensions for server command; Extensions for 2D graphics; Real-time augmentation with audio & video

– Part 2 - Visual: 3D Mesh compression; Face animation

• 1999: MPEG-4 v.2

– Part 2 - Visual: Body animation

Page 16: Mpeg ARAF tutorial @ ISMAR 2014

A rich set of Scene and Graphics representation and compression tools

• 2003: MPEG-4 Part 16 - AFX: a rich set of 3D graphics tools; compression of geometry, appearance, animation

• 2005: MPEG-4 AFX 2nd Edition: animation by morphing; multi-texturing

• 2007: MPEG-4 AFX 3rd Edition: WSS for terrain and cities; frame-based animation

• 2011: MPEG-4 AFX 4th Edition: scalable complexity mesh coding

MPEG technologies related to AR: 1st pillar

Page 17: Mpeg ARAF tutorial @ ISMAR 2014

MPEG technologies related to AR: 2nd pillar

A rich set of Sensors and Actuators

• MPEG-V – Media Context and Control

– 2011: 1st Edition: Sensors and actuators; Interoperability between Virtual Worlds

– 2013: 2nd Edition: GPS; Biosensors; 3D Camera

• MPEG-U – Advanced User Interface (2012)

• CDVS (201x): Feature-point based descriptors for image recognition

• MPEG-H: 3D Video (compression of video + depth, 2014); 3D Audio

Page 18: Mpeg ARAF tutorial @ ISMAR 2014

MPEG technologies related to AR: 2nd pillar

MPEG-V – Media Context and Control

Page 19: Mpeg ARAF tutorial @ ISMAR 2014

Actuators

Light

Flash

Heating

Cooling

Wind

Vibration

Sprayer

Scent

Fog

Color correction

Initialize color correction parameter

Rigid body motion

Tactile

Kinesthetic

Global position command

Sensors

Light

Ambient noise

Temperature

Humidity

Distance

Atmospheric pressure

Position

Velocity

Acceleration

Orientation

Angular velocity

Angular acceleration

Force

Torque

Pressure

Motion

Intelligent camera type

Multi Interaction point

Gaze tracking

Wind

Dust

Body height

Body weight

Body temperature

Body fat

Blood type

Blood pressure

Blood sugar

Blood oxygen

Heart rate

Electrograph: EEG, ECG, EMG, EOG, GSR

Weather

Facial expression

Facial morphology

Facial expression characteristics

Geomagnetic

Global position

Altitude

Bend

Gas

MPEG technologies related to AR: 2nd pillar

MPEG-V – Media Context and Control

Page 20: Mpeg ARAF tutorial @ ISMAR 2014

• All AR-related data is available from MPEG standards

• Real time composition of synthetic and natural objects

• Access to

– Remotely/locally stored scene/compressed 2D/3D mesh objects

– Streamed real-time scene/compressed 2D/3D mesh objects

• Inherent object scalability (e.g. for streaming)

• User interaction & server generated scene changes

• Physical context

– Captured by a broad range of standard sensors

– Affected by a broad range of standard actuators

Main features of MPEG AR technologies

Page 21: Mpeg ARAF tutorial @ ISMAR 2014

MPEG vision on AR

MPEG-4/MPEG-7/MPEG-21/MPEG-U/MPEG-V

MPEG Player

Compression

Authoring Tool

Produce

Download

ARAF

Page 22: Mpeg ARAF tutorial @ ISMAR 2014

MPEG vision on AR

MPEG-4/MPEG-7/MPEG-21/MPEG-U/MPEG-V

ARAF Browser

Compression

Authoring Tool

Produce

Download

ARAF

Page 23: Mpeg ARAF tutorial @ ISMAR 2014

End to end chain

ARAF Browser

Media Servers

Service Servers

User

Local Sensors & Actuators

Remote Sensors & Actuators

MPEG ARAF

Local Real World Environment

Remote Real World Environment

Authoring Tools

Page 24: Mpeg ARAF tutorial @ ISMAR 2014

• A set of scene graph nodes/protos as defined in MPEG-4 Part 11

– Existing nodes : Audio, image, video, graphics, programming, communication, user interactivity, animation

– New standard PROTOs : Map, MapMarker, Overlay, Local & Remote Recognition, Local & Remote Registration, CameraCalibration, AugmentedRegion, Point of Interest

• Connection to sensors and actuators as defined in MPEG-V

– Orientation, Position, Angular Velocity, Acceleration, GPS, Geomagnetic, Altitude

– Local and/or remote camera sensor

– Flash, Heating, Cooling, Wind, Sprayer, Scent, Fog, RigidBodyMotion, Kinesthetic

• Compressed media

Three main components: scene, sensors/actuators, media

MPEG-A Part 13 ARAF
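As a rough illustration of the three components named on this slide (scene, sensors/actuators, media), the sketch below groups them in one Python structure. The class name, field names and the file name are hypothetical and are not taken from the ARAF specification; only the node, sensor and actuator names in the example come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ArafContentSketch:
    """Hypothetical grouping of the three ARAF components (not a spec type)."""
    scene_nodes: List[str] = field(default_factory=list)  # scene graph nodes / PROTOs
    sensors: List[str] = field(default_factory=list)      # MPEG-V sensor connections
    actuators: List[str] = field(default_factory=list)    # MPEG-V actuator connections
    media: List[str] = field(default_factory=list)        # references to compressed media

    def components(self) -> Dict[str, List[str]]:
        """Return the three main components as one mapping."""
        return {
            "scene": self.scene_nodes,
            "sensors/actuators": self.sensors + self.actuators,
            "media": self.media,
        }


quiz = ArafContentSketch(
    scene_nodes=["Map", "MapMarker", "Overlay"],
    sensors=["Position", "Orientation"],
    actuators=["Flash"],
    media=["question1.png"],  # hypothetical file name
)
```

Calling `quiz.components()` returns the same three groups the slide lists, which may be a convenient mental model when reading the XML in the exercises.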

Page 25: Mpeg ARAF tutorial @ ISMAR 2014

Scene: 73 XML Elements

MPEG-A Part 13 ARAF

Documentation available online:

http://wg11.sc29.org/augmentedReality/

Page 26: Mpeg ARAF tutorial @ ISMAR 2014

Event LOOV: what does it look like?

Page 27: Mpeg ARAF tutorial @ ISMAR 2014

Exercises

MPEG-A Part 13 ARAF

AR Quiz Augmented Book

Page 28: Mpeg ARAF tutorial @ ISMAR 2014

Exercises

MPEG-A Part 13 ARAF

AR Quiz Augmented Book

http://youtu.be/LXZUbAFPP-Y

http://youtu.be/la-Oez0aaHE

Page 29: Mpeg ARAF tutorial @ ISMAR 2014

AR Quiz setting: preparing the media

MPEG-A Part 13 ARAF

images, video, audio, 2D/3D assets

GPS location

Page 30: Mpeg ARAF tutorial @ ISMAR 2014

AR Quiz XML inspection

MPEG-A Part 13 ARAF

http://tiny.cc/MPEGARQuiz

Page 31: Mpeg ARAF tutorial @ ISMAR 2014

AR Quiz Authoring Tool

MPEG-A Part 13 ARAF

www.MyMultimediaWorld.com: go to Create / Augmented Reality

Page 32: Mpeg ARAF tutorial @ ISMAR 2014

Augmented Book setting

MPEG-A Part 13 ARAF

images, audio

Page 33: Mpeg ARAF tutorial @ ISMAR 2014

Augmented Book XML inspection

MPEG-A Part 13 ARAF

http://tiny.cc/MPEGAugBook

Page 34: Mpeg ARAF tutorial @ ISMAR 2014

Augmented Book Authoring Tool

MPEG-A Part 13 ARAF

www.MyMultimediaWorld.com: go to Create / Augmented Books

Page 35: Mpeg ARAF tutorial @ ISMAR 2014

• ARAF Browser is Open Source

– iOS, Android, WS, Linux

– distributed at www.MyMultimediaWorld.com

• ARAF V1 published early 2014

• ARAF V2 in progress

– Visual Search (client side and server side)

– 3D Video, 3D Audio

– Connection to Social Networks

– Connection to POI servers

Conclusions

Page 36: Mpeg ARAF tutorial @ ISMAR 2014

• Other slides that may help

Page 37: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF 2nd Edition

Page 38: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF 2nd Edition, items under discussion

1. Local vs Remote recognition and tracking

2. Social Networks

3. 3D video

4. 3D audio

Page 39: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

Server side object recognition: a real system*

[Diagram: client/server recognition flow; steps are numbered (1) to (10) in the original figure]

Client: Query image -> [Detection] Key points -> [Extraction] Descriptors -> HTTP POST (binary descriptor + key points)

Server: Decode -> Query descriptors -> Matching against DB descriptors -> ID -> Corresponding Information, or Error/no message -> HTTP Response

[DB]: Descriptors, images and information

Client: Decode -> Parse and display the answer

Payload labels in the figure: Binary Data; Data as String

* Wine recognizer: GooT and IMT
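The request/response exchange on this slide can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the GooT/IMT implementation: the byte layout, the 32-byte descriptor size, the Hamming-distance voting and all function names are hypothetical, and the HTTP transport itself is left out so that only the payload handling is shown.

```python
import struct
from typing import Dict, List, Optional, Tuple

KeyPoint = Tuple[float, float]
DESC_BYTES = 32  # assumed size of one binary descriptor (256 bits)


def encode_query(keypoints: List[KeyPoint], descriptors: List[bytes]) -> bytes:
    """Client side: pack key points + binary descriptors into a POST body."""
    if len(keypoints) != len(descriptors):
        raise ValueError("one descriptor per key point expected")
    body = struct.pack("<I", len(keypoints))
    for (x, y), desc in zip(keypoints, descriptors):
        if len(desc) != DESC_BYTES:
            raise ValueError("unexpected descriptor size")
        body += struct.pack("<ff", x, y) + desc
    return body


def decode_query(body: bytes) -> Tuple[List[KeyPoint], List[bytes]]:
    """Server side: recover key points and descriptors from the POST body."""
    (count,) = struct.unpack_from("<I", body, 0)
    offset = 4
    keypoints: List[KeyPoint] = []
    descriptors: List[bytes] = []
    for _ in range(count):
        x, y = struct.unpack_from("<ff", body, offset)
        offset += 8
        descriptors.append(body[offset:offset + DESC_BYTES])
        offset += DESC_BYTES
        keypoints.append((x, y))
    return keypoints, descriptors


def hamming(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length descriptors."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match(query: List[bytes], db: Dict[str, List[bytes]],
          max_dist: int = 40, min_votes: int = 2) -> Optional[str]:
    """Return the ID whose DB descriptors match the most query descriptors."""
    best_id, best_votes = None, 0
    for object_id, db_descs in db.items():
        if not db_descs:
            continue
        votes = sum(1 for q in query
                    if min(hamming(q, d) for d in db_descs) <= max_dist)
        if votes > best_votes:
            best_id, best_votes = object_id, votes
    return best_id if best_votes >= min_votes else None


def handle_post(body: bytes, db: Dict[str, List[bytes]],
                info: Dict[str, str]) -> str:
    """Server side: decode, match, and answer with data as a string."""
    _, descriptors = decode_query(body)
    object_id = match(descriptors, db)
    if object_id is None:
        return "ERROR: no match"
    return f"{object_id};{info[object_id]}"
```

On the client, the returned string would then be parsed and displayed; the `ID;information` format used here is only one possible convention.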

Page 40: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

Server side object recognition: ARAF version

[Diagram: server-side recognition with ARAF]

Actors and components: MAR Experience Creator + Content Creator; MAR Scene; End-user Device; ARAF Browser; Video source; Processing Servers; Detection Libraries / Image Recognition Libraries; Large Image DB; Corresponding media DB

MAR Scene fields: Source (video URL); optional: recognition region; Processing Server URLs

Labelled data flows: Video stream; Media data; Binary (base64) key points + descriptors

Other label in the figure: ORB

Page 41: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

Server side object recognition: ARAF version

Discussions on:

- Does the content creator specify the form of the request (full image or descriptors), or does the browser make the best decision?

- Is the server’s answer formalized in ARAF?

Page 42: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF – Social Network Data in ARAF scene

Scenario: display posts from SN in a geo-localized manner

ARAF can do this directly by programming the access to the SN service at the scene level
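To make the geo-localized scenario concrete, the sketch below keeps only the posts whose reported location lies within a given radius of the user, using the haversine great-circle distance. It is plain Python rather than ARAF scene-level scripting, and the post dictionary layout (`lat`, `lon`, `text`) is a hypothetical stand-in for whatever the SN service returns.

```python
import math
from typing import Dict, List

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def posts_near(posts: List[Dict], lat: float, lon: float,
               radius_m: float) -> List[Dict]:
    """Return posts within radius_m of (lat, lon), nearest first."""
    with_dist = [(haversine_m(lat, lon, p["lat"], p["lon"]), p) for p in posts]
    with_dist.sort(key=lambda item: item[0])  # sort by distance only
    return [p for dist, p in with_dist if dist <= radius_m]


# Hypothetical posts: one about 111 m north of the user, one far away.
posts = [
    {"text": "far away", "lat": 48.8566, "lon": 2.3522},
    {"text": "around the corner", "lat": 48.1380, "lon": 11.5750},
]
nearby = posts_near(posts, lat=48.1370, lon=11.5750, radius_m=500.0)
```

A scene could then place one overlay per entry in `nearby`; how that overlay is created in ARAF is outside the scope of this sketch.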

Page 43: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF – Social Network Data in ARAF scene

At minimum: user login to the SN; at maximum: the MPEG UD

Page 44: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF – Social Network Data in ARAF scene

Connect to a UD server to get all the necessary data

Page 45: Mpeg ARAF tutorial @ ISMAR 2014

Two categories of “SNS Data”

– Static data

• Name, photo, email, phone number, address, sex, interest, …

– Social Network related activity

• Reported location, SNS post title, SNS text, SNS media

MPEG 3DG Report

ARAF – Social Network scenario

Obtained from the UD server

Page 46: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF 2nd Edition – introducing 3D Video

Modeling of 3 AR classes for 3D video:

1. Pre-created 3D model of the environment, using visual search and other sensors to obtain camera position and orientation; 3D video used to handle occlusions

2. No a priori 3D model of the scene; depth captured in real time and used to handle occlusions at the rendering step

3. No a priori model of the scene, but one is created during the AR experience (SLAM – Simultaneous Localization and Mapping)
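For the second class, occlusion handling at the rendering step amounts to a per-pixel depth test between the captured depth map and the depth of the virtual content. The sketch below shows that test on plain nested lists; the function name and the data layout (rows of RGB tuples, `None` where no virtual content is drawn, depths in metres) are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]


def composite_with_depth(
    camera_rgb: List[List[Pixel]],
    camera_depth: List[List[float]],
    virtual_rgb: List[List[Optional[Pixel]]],
    virtual_depth: List[List[float]],
) -> List[List[Pixel]]:
    """Overlay virtual pixels only where they are closer than the real scene."""
    output: List[List[Pixel]] = []
    for y, row in enumerate(camera_rgb):
        out_row: List[Pixel] = []
        for x, real_pixel in enumerate(row):
            virtual_pixel = virtual_rgb[y][x]
            # Draw the virtual pixel only if it exists and is in front of
            # the real surface seen by the depth sensor at this pixel.
            if (virtual_pixel is not None
                    and virtual_depth[y][x] < camera_depth[y][x]):
                out_row.append(virtual_pixel)
            else:
                out_row.append(real_pixel)
        output.append(out_row)
    return output
```

In a real renderer this comparison is normally done on the GPU through the depth buffer; the loop form is only meant to make the rule explicit.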

Page 48: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF – 3DAudio : local spatialisation

[Diagram: local spatialisation]

Actors and components: MAR Experience Creator + Content Creator; ARAF file; Scene; ARAF Browser; Mobile device; Camera; Microphone; Position & orientation sensor; Coordination mapping; 3D Audio Engine; Mixer

Labelled data flows: Video/audio stream; Sensed data; User location & direction + sound location; Relative sound location + (Acoustic scene) + audio source; Spatialized audio source; Synthesized audio stream
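The step that turns "user location & direction + sound location" into a "relative sound location" can be sketched as a change of coordinates into the listener's frame. The example below is a simplified 2D version under assumed conventions (heading in radians, measured counterclockwise from the +x axis; positive azimuth means the sound is to the listener's left); the function name is hypothetical and nothing here is taken from the ARAF or 3D Audio specifications.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def relative_sound_location(user_pos: Point, user_heading_rad: float,
                            sound_pos: Point) -> Tuple[float, float]:
    """Return (distance, azimuth) of a sound source in the listener's frame."""
    dx = sound_pos[0] - user_pos[0]
    dy = sound_pos[1] - user_pos[1]
    cos_h = math.cos(user_heading_rad)
    sin_h = math.sin(user_heading_rad)
    # Rotate the world-space offset by -heading to express it relative
    # to the direction the user is facing.
    forward = dx * cos_h + dy * sin_h
    left = -dx * sin_h + dy * cos_h
    distance = math.hypot(dx, dy)
    azimuth = math.atan2(left, forward)  # 0 = straight ahead
    return distance, azimuth
```

For a user at the origin facing +x, a source at (0, 2) comes out 2 units away at an azimuth of pi/2, that is, directly to the left. A 3D audio engine would typically also need elevation, which this sketch omits.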

Page 49: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF – 3DAudio : remote spatialisation

[Diagram: remote spatialisation]

Actors and components: MAR Experience Creator + Content Creator; ARAF file; Scene; ARAF Browser; Mobile device; Camera; Microphone; Position & orientation sensor; Coordination mapping; Mixer; Proxy Server; Detection Libraries; 3D Audio Engine

Scene field: Processing Server URL

Labelled data flows: Video/audio stream; Sensed data; User location & direction + sound location; Relative sound location + Audio source + (Acoustic scene); Spatialized audio source; Synthesized audio stream

Page 50: Mpeg ARAF tutorial @ ISMAR 2014

[Diagram: local audio recognition]

Actors and components: MAR Experience Creator + Content Creator; Scene; ARAF Browser; Mobile device; Audio source; Detection Libraries / Audio Detection Library

Scene fields: Source (microphone/audio URL); Target Resources or descriptors; optional: detection window, sampling rate, detection delay

Labelled data flows: Microphone/audio stream; Target Resources; ID Mask

MPEG 3DG Report

ARAF – Audio recognition: local

Page 51: Mpeg ARAF tutorial @ ISMAR 2014

[Diagram: audio recognition through a proxy server]

Actors and components: MAR Experience Creator + Content Creator; Scene; ARAF Browser; Mobile device; Audio source; Proxy Server; Detection Libraries / Audio Detection Library

Scene fields: Source (microphone/audio URL); Target Resources or descriptors; URL of Processing Server; optional: detection window, sampling rate, detection delay

Labelled data flows: Microphone/audio stream; Target Resources or descriptors + IDs + optional detection window, sampling rate, detection delay; ID Mask

MPEG 3DG Report

ARAF – Audio recognition: local

Page 52: Mpeg ARAF tutorial @ ISMAR 2014

[Diagram: audio recognition through a processing server, with descriptor extraction]

Actors and components: MAR Experience Creator + Content Creator; Scene; ARAF Browser; Mobile device; Audio source; Descriptor Extraction; Processing Server; Detection Libraries / Audio Detection Library

Scene fields: Source (microphone/audio URL); Target Resources or descriptors; URL of Processing Server; optional: detection window, sampling rate, detection delay

Labelled data flows: Microphone/audio stream; Descriptors; Target Resources or descriptors + IDs + optional detection window, sampling rate, detection delay; ID Mask

MPEG 3DG Report

ARAF – Audio recognition: local

Page 53: Mpeg ARAF tutorial @ ISMAR 2014

MPEG 3DG Report

ARAF – joint meeting with 3DAudio

Spatialisation

• The 3D audio renderer needs an API to get the user position and orientation

• It may be more complex to update in real time the position and orientation of all the acoustic objects

Recognition

• MPEG-7 has several tools for audio fingerprint

• Investigate the ongoing work on “Audio synchronisation” and check if it is suitable for AR
for AR