Post on 21-Dec-2015

He Wang, Xuan Bao, Romit Roy Choudhury, Srihari Nelakuditi

Visually Fingerprinting Humans without Face Recognition

Can a camera identify a person in its view?

Face Recognition is a possible solution

But …
• Face not always visible
• Face is a permanent visual identifier … privacy?

This paper

Human motion (sequence of walk, stops, turns) could also be unique identifiers

Clothing colors are also partial identifiers

{color + motion} = a “spatiotemporal descriptor” of a person

Face may not be the only identifier for a human

Joe Bob Kim

Joe: {my color + my motion}

Bob: {my color + my motion}

Kim: {my color + my motion}

???: {his color + his motion}

from video

Joe Bob Kim

He is Bob

Many applications if humans can be visually fingerprinted

1. Augmented Reality

Looking for a cofounder

Am skilled in UI design

Aah! I see people’s messages

Bob

James

Kevin Jason

Paul

David

John

Please do not record me.

00:00:10

2. Privacy Preserving Pictures/Videos

A B

C

3. Communicating to Shoppers

To: To: To:

Our System: InSight

{my visual address}

{his visual address}

InSight Server

{my visual address}

==

System Design: Extracting Motion Fingerprint

motion from sensors

motion from video

Difficult to match raw data from sensors and videos

Need to map raw data to common alphabet

String of Motion Alphabet

walk east pause

walk south

walk east

motion string = {E, E, P, S, …}
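A minimal sketch of mapping a trajectory to such a motion string. The per-window displacement input, the pause threshold, and the four-letter compass alphabet are assumptions made for illustration; the slides do not specify them.

```python
import math

def motion_string(displacements, pause_threshold=0.2):
    """Map per-window (east, north) displacements in metres to motion symbols.

    pause_threshold is a hypothetical value, not taken from the slides.
    """
    symbols = []
    for east, north in displacements:
        if math.hypot(east, north) < pause_threshold:
            symbols.append("P")  # barely moved in this window: pause
        elif abs(east) >= abs(north):
            symbols.append("E" if east > 0 else "W")
        else:
            symbols.append("N" if north > 0 else "S")
    return symbols

# walk east, walk east, pause, walk south
walk = [(1.0, 0.1), (0.9, -0.1), (0.05, 0.0), (0.1, -1.1)]
print(motion_string(walk))  # ['E', 'E', 'P', 'S']
```

The same symbols can be produced from phone sensors or from video, which is what makes the two sources comparable.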

Comparing Motion Strings

my motion string = {E, E, P, N, …}

my motion string = {E, E, P, S, …}

Bob

Joe

He is Bob

his motion string = {E, E, P, S, …}

from video

Motion Alphabet

IsRotating

IsWalking

IsPausing

Step duration

Step phase

Step direction

extracting motion from sensor

extracting motion from video

IsRotating IsWalking Step duration Step phase Step direction

$g$

$\omega$

$\omega_g = \frac{\omega \cdot g}{|g|}$

IsRotating
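A hedged sketch of the rotation test, assuming the slide's formula is the projection of the gyroscope angular velocity $\omega$ onto the gravity direction, $\omega_g = \omega \cdot g / |g|$. The threshold value and the function names are hypothetical.

```python
import math

def rotation_rate_about_gravity(omega, g):
    """Project angular velocity omega (rad/s) onto the gravity direction g."""
    norm = math.sqrt(sum(c * c for c in g))
    if norm == 0:
        raise ValueError("gravity vector must be non-zero")
    return sum(w * c for w, c in zip(omega, g)) / norm

def is_rotating(omega, g, threshold=0.5):
    """Flag a turn when the rate about the vertical axis exceeds a threshold.

    threshold is an assumed value in rad/s, not taken from the slides.
    """
    return abs(rotation_rate_about_gravity(omega, g)) > threshold

gravity = (0.0, 0.0, 9.81)
print(is_rotating((0.0, 0.0, 1.0), gravity))  # spinning about the vertical axis
print(is_rotating((1.0, 0.0, 0.0), gravity))  # tilting the phone, not turning
```

Projecting onto gravity makes the test independent of how the phone is held, since only rotation about the vertical axis counts as a turn.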

IsRotating IsWalking Step duration Step phase Step direction

std

rotation

bagged decision tree

IsWalking

[Figure: raw magnitude reading, magnitude after filtering, primary footsteps, secondary footsteps]

$\Delta T$

$T_{step} = \frac{\Delta T}{2}$
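A small sketch of the step-duration rule $T_{step} = \Delta T / 2$, assuming $\Delta T$ is the gap between consecutive primary footsteps and that footstep timestamps have already been detected. Taking the median over several gaps is an illustrative choice, not something the slides state.

```python
from statistics import median

def step_duration(primary_times):
    """Estimate T_step as half the typical gap between primary footsteps.

    primary_times: timestamps (seconds) of detected primary footsteps.
    """
    gaps = [b - a for a, b in zip(primary_times, primary_times[1:])]
    if not gaps:
        raise ValueError("need at least two primary footsteps")
    return median(gaps) / 2.0

print(step_duration([0.0, 1.0, 2.0, 3.0]))  # 0.5
```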

IsRotating IsWalking Step duration Step phase Step direction

[Figure: raw magnitude reading, magnitude after filtering, primary and secondary footsteps, step phase markers]

IsRotating IsWalking Step duration Step phase Step direction

$g$

$H = R_{axis} \times g$

$R_{axis}$

magnetic field

8 directions

IsRotating IsWalking Step duration Step phase Step direction

extracting motion from sensor

extracting motion from video

error occlusion

Kalman filter: $s_k = [x_k, y_k, v_{x_k}, v_{y_k}]$

box association

position, speed, size

Detection and Tracking
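A rough sketch of how a track with state $s_k = [x_k, y_k, v_{x_k}, v_{y_k}]$ can bridge a missed detection. It shows only the constant-velocity prediction step of such a filter (no covariance or measurement update) and associates boxes by position alone; the slides also list speed and size as cues. The function names and `max_distance` are hypothetical.

```python
import math

def predict(state, dt):
    """Constant-velocity prediction for a state (x, y, vx, vy)."""
    x, y, vx, vy = state
    return (x + vx * dt, y + vy * dt, vx, vy)

def associate(predicted, detections, max_distance=50.0):
    """Return the index of the nearest detection centre, or None if too far."""
    best_index, best_dist = None, max_distance
    for i, (dx, dy) in enumerate(detections):
        d = math.hypot(dx - predicted[0], dy - predicted[1])
        if d <= best_dist:
            best_index, best_dist = i, d
    return best_index

track = (100.0, 200.0, 10.0, 0.0)          # pixels and pixels per frame
pred = predict(track, dt=1.0)              # (110.0, 200.0, 10.0, 0.0)
print(associate(pred, [(300.0, 200.0), (112.0, 201.0)]))  # 1
```

When no detection falls within range (for example during an occlusion), `associate` returns `None` and the track can keep coasting on its prediction.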

IsWalking

IsWalking Step direction Step duration Step phase IsRotating

$\{h_k\}$

$\{v_{x_k}, v_{y_k}\}$

8 directions
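A minimal sketch of quantizing a tracked velocity $(v_x, v_y)$ into 8 directions. It assumes a coordinate frame with x pointing east and y pointing north; raw image coordinates would first need to be mapped to such a frame. The direction labels are illustrative.

```python
import math

# Sector k covers angles around k * 45 degrees, counter-clockwise from east.
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def step_direction(vx, vy):
    """Quantize a velocity vector into one of 8 compass sectors."""
    angle = math.atan2(vy, vx)                 # radians in (-pi, pi]
    sector = round(angle / (math.pi / 4)) % 8  # nearest 45-degree sector
    return DIRECTIONS[sector]

print(step_direction(1.0, 0.0))    # E
print(step_direction(0.0, -1.0))   # S
print(step_direction(-1.0, -1.0))  # SW
```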

IsWalking Step direction Step duration Step phase IsRotating

$\{h_k\}$

Space-Time Interest Points

$R = (I * g * h_{ev})^2 + (I * g * h_{od})^2$

$g$: 2D Gaussian smoothing kernel (space)

$h_{ev}$, $h_{od}$: 1D Gabor filters (time)

$T_{step}$

IsWalking Step direction Step duration Step phase IsRotating
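A simplified sketch of the temporal part of the response $R = (I * g * h_{ev})^2 + (I * g * h_{od})^2$ at a single pixel. The spatial Gaussian smoothing $g$ is omitted, the Gabor pair is taken as a cosine and a sine under a Gaussian envelope, and the values of `tau`, `omega`, and `radius` are assumptions chosen for illustration.

```python
import math

def gabor_pair(tau, omega, radius):
    """Even (cosine) and odd (sine) 1D Gabor filters over 2 * radius + 1 frames."""
    ts = range(-radius, radius + 1)
    envelope = [math.exp(-(t * t) / (tau * tau)) for t in ts]
    h_ev = [math.cos(2 * math.pi * omega * t) * e for t, e in zip(ts, envelope)]
    h_od = [math.sin(2 * math.pi * omega * t) * e for t, e in zip(ts, envelope)]
    return h_ev, h_od

def temporal_response(intensity, center, tau=4.0, omega=0.25, radius=12):
    """Response R at frame `center` for one pixel's intensity over time."""
    if center - radius < 0 or center + radius >= len(intensity):
        raise ValueError("window does not fit inside the signal")
    h_ev, h_od = gabor_pair(tau, omega, radius)
    window = intensity[center - radius:center + radius + 1]
    even = sum(i * h for i, h in zip(window, h_ev))
    odd = sum(i * h for i, h in zip(window, h_od))
    return even ** 2 + odd ** 2

# A pixel flickering at the filter frequency versus a static pixel.
periodic = [math.cos(2 * math.pi * 0.25 * t) for t in range(40)]
static = [1.0] * 40
print(temporal_response(periodic, 20) > temporal_response(static, 20))  # True
```

Periodic intensity changes, such as swinging legs, excite both filters, while static regions produce a response near zero.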

Step phase markers

not rotating rotating

IsWalking Step direction Step duration Step phase IsRotating

bagged decision tree

IsRotating

distribution features

Space-Time Interest Points

Matching Motion Alphabets

IsPausing

IsWalking

IsRotating

Step duration

Step phase

Step direction

identical or adjacent

a threshold ratio

range limit

Matching Motion String

$Similarity_{motion} = \frac{\text{matched pairs}}{\text{total pairs}}$
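A minimal sketch of the ratio $Similarity_{motion} = \text{matched pairs} / \text{total pairs}$ over aligned symbols. It assumes that the "identical or adjacent" rule applies to step directions, and it ignores step duration and phase. The candidate ids `phone_1` and `phone_2` are hypothetical.

```python
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def symbols_match(a, b):
    """True if symbols are identical, or are directions in adjacent sectors."""
    if a == b:
        return True
    if a in DIRECTIONS and b in DIRECTIONS:
        gap = abs(DIRECTIONS.index(a) - DIRECTIONS.index(b))
        return min(gap, 8 - gap) == 1
    return False

def motion_similarity(sensor_string, video_string):
    """Fraction of aligned symbol pairs that match."""
    pairs = list(zip(sensor_string, video_string))
    if not pairs:
        return 0.0
    matched = sum(1 for s, v in pairs if symbols_match(s, v))
    return matched / len(pairs)

candidates = {
    "phone_1": ["E", "E", "P", "N"],
    "phone_2": ["E", "E", "P", "S"],
}
video = ["E", "E", "P", "S"]
scores = {name: motion_similarity(s, video) for name, s in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores[best])  # phone_2 1.0
```

Longer observation windows give more pairs, so strings from different people are less likely to agree by chance.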

System Design: Extracting Color Fingerprint

Extracting Color Fingerprint

pose estimation

clothing area

color conversion (RGB, HSV)

Spatiograms: color histogram, spatial distribution

$s_1 = \{n, \mu, \Sigma\}$

$s_2 = \{n', \mu', \Sigma'\}$

$Similarity_{spatiograms} = \sum_{b=1}^{B} \sqrt{n_b n_b'} \; 8\pi \, |\Sigma_b \Sigma_b'|^{1/4} \; N(\mu_b; \mu_b', 2(\Sigma_b + \Sigma_b'))$

color histograms, spatial distributions
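A sketch of the spatiogram similarity sum, assuming each bin stores a normalized count $n_b$, a 2D mean $\mu_b$, and a 2×2 covariance $\Sigma_b$, and that $N$ is the 2D Gaussian density. The list-of-tuples data layout is an assumption made for illustration.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def gaussian2(x, mean, cov):
    """2D Gaussian density N(x; mean, cov)."""
    d = det2(cov)
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    inv = [[cov[1][1] / d, -cov[0][1] / d], [-cov[1][0] / d, cov[0][0] / d]]
    q = dx * (inv[0][0] * dx + inv[0][1] * dy) + dy * (inv[1][0] * dx + inv[1][1] * dy)
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(d))

def spatiogram_similarity(s1, s2):
    """Sum over bins of sqrt(n n') * 8 pi |S S'|^(1/4) * N(mu; mu', 2 (S + S'))."""
    total = 0.0
    for (n1, mu1, c1), (n2, mu2, c2) in zip(s1, s2):
        if n1 == 0 or n2 == 0:
            continue  # an empty bin contributes nothing
        summed = [[2 * (c1[i][j] + c2[i][j]) for j in range(2)] for i in range(2)]
        weight = 8 * math.pi * (det2(c1) * det2(c2)) ** 0.25
        total += math.sqrt(n1 * n2) * weight * gaussian2(mu1, mu2, summed)
    return total

shirt = [(0.6, (10.0, 20.0), [[4.0, 0.0], [0.0, 9.0]]),
         (0.4, (30.0, 5.0), [[2.0, 0.5], [0.5, 3.0]])]
moved = [(0.6, (14.0, 20.0), [[4.0, 0.0], [0.0, 9.0]]),
         (0.4, (30.0, 9.0), [[2.0, 0.5], [0.5, 3.0]])]
print(spatiogram_similarity(shirt, shirt))  # close to 1.0 for identical spatiograms
print(spatiogram_similarity(shirt, moved))  # lower: same colors, different layout
```

The square-root term compares the color histograms, and the Gaussian term penalizes bins whose pixels sit in different places.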

Matching Color Fingerprint

$Similarity_{color} = Similarity_{spatiograms} > T_{color}$

$Similarity = \frac{Similarity_{motion} + Similarity_{color}}{2}$

Evaluation

Scenario with real users

Video simulation with more users and different environments

Scenario with Real Users

Experiment design
• 12 volunteers
• Samsung Android phone
• Not instructed about clothing
• Naturally moved around or paused, and did as they pleased
• An area of 20 m × 15 m
• Random request 100 times

Scenario with Real Users

Motion: (10s, 72%)

Motion and Color: (6s, 90%)

Video Simulation (for Scale)

Experiment design
• Record videos of people in public places: outside student union, university cafe
• Label ground truth
• Extract motion fingerprint from video
• Simulate motion fingerprint from sensor

Student Union Scenario

40 people, summer, outdoor

Color Motion 50% 40%

Motion Color 90%

(5s, 88%)

(8s, 90%)

Cafe Scenario

15 people, winter, indoor

Color Motion 0% 80%

Motion Color 93%

(6s, 80%)

(8s, 93%)

Conclusion

Faces are permanent visual identifiers for humans

This paper observes that human clothing and motion patterns can also serve as visual, temporary identifiers.

Given rich diversity in human behavior, these spatio-temporal identifiers can be sensed and expressed with a few bits.

In other words, two humans are similar only for short segments in space-time, enabling techniques complementary to face recognition.

Human recognition, sans faces, enables various new applications

Questions, Comments?

Thank You
