TRANSCRIPT
Towards real-time camera based logos detection
Mathieu Delalandre
Laboratory of Computer Science, RFAI group,
Tours city, France
Osaka Prefecture Partnership meeting
Tours city, France
Friday 9th of September 2011
Towards real-time camera based logos detection
1. Introduction
2. Devices synchronization for 3D frame tagging
3. Frame partitioning and selection
“Introduction” (1)
Logo detection from video capture using handheld interactions, to display context-based information (tourist checkpoints, bus stops, meals, etc.).
This constitutes a hard computer vision application, due to the complexity of the recognition task and the real time constraints.
To support the real time, two basic paths could be considered:
1. To reduce the complexity of the algorithms
2. To reduce the amount of data
[Figure: processing pipeline, Camera → frames → Selection → frames → Pattern Recognition]
“Introduction” (2)
Static object: without motion and appearance modification
Dynamic object: with motion, hence with appearance modification
“With static objects, one capture (in time and space) could be enough for recognition, if the recognition is perspective, scale and rotation invariant, and if no occlusions appear”
“The capture instant can be detected if the embedded system can track its own positioning, and if the objects are static”
“Then, a self-tracking embedded system can be set for a single capture of static objects. It can support real-time recognition by reducing the amount of data to process, without missed cases (i.e. at least one capture is obtained)”
[Figure: a static object and the camera positions at times t0, t1, t2]
“Devices synchronization for 3D frame tagging” (1)
Camera device, to capture images
Accelerometer device, to measure proper acceleration
Gyroscope device, to measure or maintain orientation
The combination of these devices makes it possible to tag frames in 3D space.
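As an illustration, such a 3D frame tag can be modeled as a small record combining the three modalities. This is a hedged sketch: the names FrameTag and tag_frame and the field layout are assumptions, not from the talk.

```python
# Hypothetical sketch of 3D frame tagging: the camera frame is stamped
# with the positioning integrated from the accelerometer and the
# orientation from the gyroscope.
from dataclasses import dataclass

@dataclass
class FrameTag:
    x: float          # embedded system positioning (from root)
    y: float
    z: float
    yaw: float        # embedded system orientation
    pitch: float
    roll: float
    d: float          # frame distance along the optical axis
    t: float          # capture timestamp

def tag_frame(position, orientation, d, t):
    """Combine the three modalities into one 3D frame tag."""
    (x, y, z), (yaw, pitch, roll) = position, orientation
    return FrameTag(x, y, z, yaw, pitch, roll, d, t)

tag = tag_frame((1.0, 2.0, 0.5), (0.0, 0.1, 0.0), d=3.0, t=0.04)
```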
[Figure: a captured frame tagged in 3D space, with frame coordinates (x, y, z), the embedded system orientation, the embedded system positioning (from root), and the frame distance d]
“Devices synchronization for 3D frame tagging” (2)
Most of the commercial wearable systems (e.g. smartphones) can support frame tagging, but the multimodality is designed in a separate way, not as a combination of these modalities. The device synchronization is not done at the hardware level, and must be achieved at the operating system level. How to do it?
[Figure: polling exchange with a device (accelerometer, gyroscope): the CPU drives the device controller and the memory directly through data and control lines]
[Figure: DMA exchange with a device (camera): the DMA controller transfers the data to memory and signals the CPU with a control interrupt]
A real-life event occurring at time tE is written to memory at time tw; each device is thus characterized by the pair (tE, tw), with a delay δ = tw − tE.
The delay value depends on the device, considering:
- the acquisition delay of the device
- the data transfer time on the bus
- the execution time of the control instructions
- the interrupt execution time, etc.
The value is an estimation; it also depends on:
- the mean bus access rate
- operating system scheduling and interrupt queuing, etc.
“Devices synchronization for 3D frame tagging” (3)
D0: the “root”, interrupt-based device (characterized by tE0, tw0); every device will synchronize itself with it
D1: the device to be synchronized with the root device (characterized by tE1, tw1)
Ti0: the “coarse” timer, in charge of the “root” device, at level 0
T0: period of timer 0
Ti1: the “finer” timer, in charge of the device to synchronize, at level 1
T1: period of timer 1, with T1 << T0
L1: the frame length for T1, with T1 = L1 / N, where N is the wished synchronization precision and L1 = max(δi) is the bounded parameter
I0: the first interrupt time
e.g. at I0, run Ti0; every T0, run Ti1
Synchronization will be done using a two-timer framework:
- The “coarse” timer will be scheduled on the root device
- The “finer” timer will be used within an “upstream” frame, opened prior to the next “coarse” timer period; it will allow catching the events of the device to be synchronized
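The two-timer scheduling can be sketched as a small simulation. This is illustrative only (the function name and the values are assumptions), not a real-time implementation.

```python
# Minimal simulation of the two-timer framework: the coarse timer Ti0
# fires every T0; before each coarse tick, an "upstream" frame of
# length L1 is opened, in which the finer timer Ti1 ticks every
# T1 = L1 / N.
def fine_ticks(I0, T0, L1, N, coarse_periods):
    T1 = L1 / N  # finer period, T1 << T0
    schedule = []
    for p in range(1, coarse_periods + 1):
        coarse_tick = I0 + p * T0       # next Ti0 interrupt
        frame_start = coarse_tick - L1  # upstream frame opens before it
        ticks = [frame_start + k * T1 for k in range(1, N + 1)]
        schedule.append((coarse_tick, ticks))
    return schedule

sched = fine_ticks(I0=0.0, T0=1.0, L1=0.1, N=4, coarse_periods=2)
# first coarse tick at t = 1.0, with the fine ticks inside [0.9, 1.0]
```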
[Figure: timeline with the Ti0 interrupts at I0, I0 + T0, I0 + 2T0; before each interrupt, the Ti1 timer runs over an upstream frame of length L1, catching the events tE0 and tE1]
“Devices synchronization for 3D frame tagging” (4)
[Figure: timeline of the Ti1 ticks at s + kT1, k = 1, 2, 3, ..., with s = I0 + T0; the interrupt yields tE0 = tw0 − δ0 on the root device and tE1(k) = tw1(k) − δ1 on the device to synchronize]

General synchronization algorithm of the Ti1 timer:
- k = 0; every T1 period, k = k + 1
- when the interrupt Ii occurs:
  tE0 = tw0 − δ0
  tE1(k) = tw1(k) − δ1
  and the matching tick is the k minimizing |tE1(k) − tE0|
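The matching step can be sketched as follows, assuming the per-device delays δ0, δ1 are known constants (function and variable names are illustrative, not from the talk).

```python
# Sketch of the Ti1 matching step: event times are recovered from the
# memory-writing times by subtracting the per-device delay
# (tE = tw - delta), then the fine tick k whose event time is closest
# to the root event is selected.
def match_tick(tw0, delta0, tw1_ticks, delta1):
    tE0 = tw0 - delta0                        # root device event time
    tE1 = [tw - delta1 for tw in tw1_ticks]   # device event time per tick k
    k = min(range(len(tE1)), key=lambda i: abs(tE1[i] - tE0))
    return k, tE1[k]

k, t = match_tick(tw0=1.00, delta0=0.02,
                  tw1_ticks=[0.93, 0.96, 0.99, 1.02], delta1=0.01)
# root event at 0.98; the closest device tick is 0.99 - 0.01 = 0.98, k = 2
```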
“Frame partitioning and selection” (1)
Device synchronization can support 3D image tagging
The open problems now are how to detect the overlapping between frames, how to achieve the frame selection in case of overlapping, and how to access the obtained partition.
Fk: the set of frames
Pi: an intersection polygon and its set of regions Rj(Fk), obtained next to an overlapping; e.g. P2 = {R1(F1), R2(F2)}
s: the selection method, Ri = s({Rj(Fk)})

[Figure: four frames F1 to F4 overlapping, producing the intersection polygons P1 to P7]
“Frame partitioning and selection” (2)
To detect the overlapping, frames can be projected onto a plane D, computed with line intersection and closed polygon detection algorithms at complexity kO(n log(n)).

To do it, it is necessary to fix the position of D in the 3D space and to define an updating protocol.

D can be obtained by averaging the positioning of the frames Fi(xi, yi, zi, di):

D = (1/n) Σ (i = 1 to n) Fi

[Figure: two frames F1, F2 projected onto the plane D with normal n]
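The intersection step can be illustrated with a standard Sutherland-Hodgman clipping of convex polygons. This is a generic sketch, not the talk's implementation; for many frames, the O(n log(n)) sweep-line approach mentioned above would replace pairwise clipping.

```python
# Illustrative overlap detection between two projected frames: clip one
# convex polygon against the other; the intersection polygon is
# non-empty iff the frames overlap.
def line_intersect(p, q, a, b):
    """Intersection point of segment pq with the infinite line ab."""
    den = (p[0]-q[0])*(a[1]-b[1]) - (p[1]-q[1])*(a[0]-b[0])
    t = ((p[0]-a[0])*(a[1]-b[1]) - (p[1]-a[1])*(a[0]-b[0])) / den
    return (p[0] + t*(q[0]-p[0]), p[1] + t*(q[1]-p[1]))

def intersection_polygon(subject, clipper):
    """Sutherland-Hodgman: clip convex polygon `subject` against
    convex polygon `clipper` (both with counter-clockwise vertices)."""
    def inside(pt, a, b):  # pt on the left of the directed edge a->b?
        return (b[0]-a[0])*(pt[1]-a[1]) - (b[1]-a[1])*(pt[0]-a[0]) >= 0
    out = list(subject)
    for i in range(len(clipper)):
        a, b = clipper[i], clipper[(i+1) % len(clipper)]
        inp, out = out, []
        for j in range(len(inp)):
            p, q = inp[j], inp[(j+1) % len(inp)]
            if inside(q, a, b):
                if not inside(p, a, b):
                    out.append(line_intersect(p, q, a, b))
                out.append(q)
            elif inside(p, a, b):
                out.append(line_intersect(p, q, a, b))
        if not out:
            return []  # frames do not overlap
    return out

F1 = [(0, 0), (1, 0), (1, 1), (0, 1)]
F2 = [(0.5, 0.5), (1.5, 0.5), (1.5, 1.5), (0.5, 1.5)]
P = intersection_polygon(F1, F2)   # the 0.5 x 0.5 overlap square
```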
Updating the positioning is not necessary at every frame capture, only when important differences start to appear between the current plane and the recent frame captures.
[Figure: planes D1, D2, D3 on a timeline at t1 and t2]
At t1, D1 is computed from the current frames.
At t2, the difference between D1 and D2 (corresponding to recent frame captures) is too important; D1 is shifted to D3.
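The updating protocol can be sketched by reducing each plane to its unit normal and recomputing only when the deviation exceeds a threshold. Both the representation and the 0.9 cosine threshold are illustrative assumptions.

```python
# Sketch of the plane-updating protocol: the projection plane is only
# shifted when recent frame captures deviate too much from it.
import math

def mean_normal(normals):
    """Normalized mean of a list of (x, y, z) unit normals."""
    sx = sum(n[0] for n in normals)
    sy = sum(n[1] for n in normals)
    sz = sum(n[2] for n in normals)
    norm = math.sqrt(sx*sx + sy*sy + sz*sz)
    return (sx/norm, sy/norm, sz/norm)

def update_plane(current, recent_normals, threshold=0.9):
    """Shift the plane to the mean of the recent frames when the cosine
    between the current normal and their mean drops below threshold."""
    candidate = mean_normal(recent_normals)
    cos = sum(a*b for a, b in zip(current, candidate))
    return candidate if cos < threshold else current

D1 = (0.0, 0.0, 1.0)
same = update_plane(D1, [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)])   # kept
shifted = update_plane(D1, [(1.0, 0.0, 0.0), (1.0, 0.0, 0.0)])  # shifted
```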
“Frame partitioning and selection” (3)
[Figure: frames F1 and F2 overlapping, producing the polygons P1, P2, P3; e.g. P2 = R1(F1) or R2(F2)]

Once overlappings are detected, at every overlap a region (coming from the overlapping frames) must be selected, using a selection method Ri = s({Rj(Fk)}).

This selection can be done using a spatial criterion: with c1, c2 the projected gravity centers of the frames and d1, d2 the distances from the polygon to these centers, e.g. P2 = R1(F1) if d1 < d2.
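A minimal sketch of this spatial criterion (the function and region names are illustrative):

```python
# Sketch of the spatial selection criterion s: for an intersection
# polygon, keep the region coming from the frame whose projected
# gravity center is closest to the polygon's own center.
def select_region(polygon_center, candidates):
    """candidates: list of (region_id, frame_center) pairs."""
    def dist2(a, b):
        return (a[0]-b[0])**2 + (a[1]-b[1])**2
    region, _ = min(candidates, key=lambda rc: dist2(polygon_center, rc[1]))
    return region

# P2 overlaps F1 and F2; d1 < d2, so R1(F1) is selected
r = select_region((0.2, 0.2), [("R1(F1)", (0.0, 0.0)),
                               ("R2(F2)", (1.0, 1.0))])
```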
Video frame processing is a producer/consumer synchronization problem, where the producer (i.e. the frame capture) is blocked on the memory constraint, and the consumer (i.e. the image processing) is blocked when the frame stack is empty.
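This scheme can be sketched with a bounded queue, a generic producer/consumer pattern (not the talk's implementation):

```python
# Bounded producer/consumer sketch of the capture pipeline: the
# producer (frame capture) blocks when the queue is full (memory
# constraint), the consumer (image processing) blocks when it is empty.
import queue
import threading

frames = queue.Queue(maxsize=4)   # bounded frame stack
processed = []

def producer(n):
    for i in range(n):
        frames.put(f"frame-{i}")  # blocks when the queue is full
    frames.put(None)              # sentinel: capture finished

def consumer():
    while True:
        frame = frames.get()      # blocks when the queue is empty
        if frame is None:
            break
        processed.append(frame)

t = threading.Thread(target=consumer)
t.start()
producer(8)
t.join()
```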
Here, we are working “up” to the frame, with a partition object. Intelligent access must be driven with a RAG (Region Adjacency Graph) structure and graph coloring techniques.
[Figure: RAG example with the regions R1(F1), R2(F1), R3(F1) of frame F1 and R4(F2), R5(F2) of frame F2; regions with adjacent sides are grouped to be processed together]
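Since the talk names graph coloring techniques, RAG-driven access can be sketched with a greedy coloring, where regions sharing a color have no mutual adjacency and can therefore be scheduled together. The adjacency below is an illustrative reading of the example, not the talk's data.

```python
# Sketch of RAG-driven access: regions of the partition are nodes,
# adjacencies between regions are edges, and a greedy coloring groups
# non-adjacent regions into batches.
def greedy_coloring(adjacency):
    """adjacency: dict region -> set of adjacent regions."""
    color = {}
    for node in sorted(adjacency):
        used = {color[n] for n in adjacency[node] if n in color}
        c = 0
        while c in used:
            c += 1
        color[node] = c
    return color

# Illustrative RAG for the regions of the example
rag = {"R1F1": {"R2F1"},
       "R2F1": {"R1F1", "R3F1", "R4F2", "R5F2"},
       "R3F1": {"R2F1"},
       "R4F2": {"R2F1", "R5F2"},
       "R5F2": {"R2F1", "R4F2"}}
colors = greedy_coloring(rag)
```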