m1gp shimizu - darwin phone

DARWIN PHONES: THE EVOLUTION OF SENSING AND INFERENCE ON MOBILE PHONES

EMILIANO MILUZZO, CORY T. CORNELIUS, ASHWIN RAMASWAMY, TANZEEM CHOUDHURY, ZHIGANG LIU, ANDREW T. CAMPBELL

MOBISYS 2010

PRESENTER: KAZUTO SHIMIZU, SEZAKI LAB, DEPT. OF IST, UNIV. OF TOKYO

INTRODUCTION

Fortunately, the presentation the author used at MobiSys 2010 is available on the web site:

http://www.cs.dartmouth.edu/~miluzzo/publications.html

So, experience top-conference quality from now!




Darwin Phones: the Evolution of Sensing and Inference on Mobile Phones

Emiliano Miluzzo*, Cory T. Cornelius*, Ashwin Ramaswamy*, Tanzeem Choudhury*, Zhigang Liu**, Andrew T. Campbell*

* CS Department – Dartmouth College
** Nokia Research Center – Palo Alto

ok… so what ??



density


- accelerometer
- digital compass
- microphone
- WiFi/Bluetooth
- GPS
- light sensor/camera
- gyroscope
- air quality/pollution sensor
- ….

sensing


- free SDK

- multitasking

programmability


- 600 MHz CPU

- up to 1GB application memory

hardware

computation capability is increasing


application distribution

collect huge amounts of data for research purposes

deploy apps onto millions of phones in the blink of an eye


cloud infrastructure

cloud - backend support




we want to push intelligence to the phone




preserve the phone user experience (battery lifetime, ability to make calls, etc.)




phone:

- sensing
- run machine learning algorithms locally (feature extraction + inference)

backend:

- run machine learning algorithms (learning)
- store and crunch big data (fusion)




sensing

programmability

??


societal scale sensing

global mobile sensor network

reality mining using mobile phones will play a big role in the future

end of PR – now darwin


a small building block towards the big vision



microphone

camera

GPS/WiFi/cellular

air quality / pollution

sensing apps

social context

audio / pollution / RF fingerprinting

image / video manipulation

darwin applies distributed computing and collaborative inference concepts to mobile sensing systems

darwin

- classification model evolution

- classification model pooling

- collaborative inference

why darwin?

mobile phone sensing today

- train classification model X in the lab, deploy classifier X
- train classification model X’ in the lab, deploy classifier X’

a fully supervised approach doesn’t scale!

why darwin?

the same classifier does not scale to multiple environments (e.g., quiet and noisy environments)

darwin creates new classification models transparently to the user (classification model evolution)


why darwin?

ability for an application to rapidly scale to many devices

darwin re-uses classification models when possible (classification model pooling)


why darwin?

leverage the large ensemble of in-situ resources

darwin exploits spatial diversity and lets phones co-operate to alleviate the “sensing context” problem (collaborative inference)


darwin design


speaker recognition (subject to audio noise, sensing context, etc.)


darwin phases

- initial training (derive model seed): supervised
- classification model evolution: unsupervised
- classification model pooling
- collaborative inference


classification model training

sensed event → filtering (silence suppression + voicing) → feature extraction (MFCC) → model training (GMM) → model + baseline

phone: send MFCC to backend to train the model

backend: send model + baseline back to phone
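The filtering step in the pipeline above can be illustrated with a minimal sketch. The slides do not say how silence suppression is done, so the energy threshold used here is an assumption, and `frame_energy`, `suppress_silence`, `frame_len`, and `threshold` are hypothetical names chosen for illustration. Voicing detection and MFCC extraction are not shown.

```python
from typing import List, Sequence


def frame_energy(frame: Sequence[float]) -> float:
    """Mean squared amplitude of one audio frame."""
    return sum(s * s for s in frame) / len(frame)


def suppress_silence(samples: Sequence[float], frame_len: int,
                     threshold: float) -> List[List[float]]:
    """Split samples into fixed-size frames and keep only the energetic ones.

    frame_len and threshold are illustrative knobs, not values from the talk.
    A trailing partial frame is dropped.
    """
    kept = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = list(samples[start:start + frame_len])
        if frame_energy(frame) >= threshold:
            kept.append(frame)
    return kept


# One silent frame, one loud frame, one near-silent frame.
audio = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.01] * 4
voiced = suppress_silence(audio, frame_len=4, threshold=0.01)
```

Only the frames that survive this filter would go on to feature extraction, which keeps the amount of data sent to the backend small.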



classification model evolution

phone: determines when to evolve

compare the sampled audio against the training data: match?

- YES: do not evolve
- NO: evolve (train new model using backend as before), producing a new speaker voice model
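The match test can be sketched as a likelihood comparison. This is a sketch under stated assumptions, not the authors' algorithm: the baseline is taken to be the mean and standard deviation of the log-likelihood of the training data under the model, features are reduced to one dimension for brevity (real MFCC vectors are multi-dimensional), and `should_evolve` and its margin `k` are hypothetical.

```python
import math
from typing import Sequence, Tuple

# One mixture component: (weight, mean, variance) over a 1-D feature.
Component = Tuple[float, float, float]


def gmm_log_likelihood(x: float, model: Sequence[Component]) -> float:
    """Log-likelihood of a single 1-D feature value under a GMM."""
    density = 0.0
    for weight, mean, var in model:
        norm = 1.0 / math.sqrt(2.0 * math.pi * var)
        density += weight * norm * math.exp(-((x - mean) ** 2) / (2.0 * var))
    return math.log(max(density, 1e-300))  # guard against log(0)


def mean_log_likelihood(features: Sequence[float],
                        model: Sequence[Component]) -> float:
    """Average log-likelihood of a batch of features."""
    return sum(gmm_log_likelihood(x, model) for x in features) / len(features)


def should_evolve(sampled: Sequence[float], model: Sequence[Component],
                  baseline_mean: float, baseline_std: float,
                  k: float = 2.0) -> bool:
    """True when the sampled audio no longer matches the training conditions.

    k (number of standard deviations tolerated) is an assumed parameter.
    """
    score = mean_log_likelihood(sampled, model)
    return score < baseline_mean - k * baseline_std


model = [(1.0, 0.0, 1.0)]  # toy single-component model
baseline = mean_log_likelihood([-0.5, 0.0, 0.5], model)
same_env = should_evolve([0.1, -0.2, 0.3], model, baseline, 0.1)
new_env = should_evolve([4.0, 5.0, 4.5], model, baseline, 0.1)
```

When `should_evolve` returns True, the phone would send fresh features to the backend and receive a new model and baseline, as in the initial training step.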



classification model pooling

before pooling, each phone holds only its own speaker’s model:

- Phone A: Speaker A’s model
- Phone B: Speaker B’s model
- Phone C: Speaker C’s model

we have two options

1. train a new classifier for each speaker (costly for power, inference delay)

2. re-use already available classifiers
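Option 2 can be sketched as a lookup that tries the local pool, then nearby phones, then the backend. The function name `pool_model`, the dictionary-based model placeholder, and the lookup order are assumptions made for illustration; the talk only states that models are transferred from the server and between phones.

```python
from typing import Callable, Dict, Iterable, Optional

Model = Dict[str, float]  # placeholder for a trained speaker model


def pool_model(speaker: str,
               local_pool: Dict[str, Model],
               neighbors: Iterable[Dict[str, Model]],
               fetch_from_backend: Callable[[str], Optional[Model]]
               ) -> Optional[Model]:
    """Obtain a speaker's model by re-use instead of retraining."""
    # 1. Already pooled on this phone.
    if speaker in local_pool:
        return local_pool[speaker]
    # 2. Borrow from a nearby phone that already has it.
    for neighbor_pool in neighbors:
        if speaker in neighbor_pool:
            local_pool[speaker] = neighbor_pool[speaker]
            return local_pool[speaker]
    # 3. Fall back to the backend (returns None if unknown there too).
    model = fetch_from_backend(speaker)
    if model is not None:
        local_pool[speaker] = model
    return model


phone_a = {"A": {"id": 1.0}}
phone_b = {"B": {"id": 2.0}}
phone_c = {"C": {"id": 3.0}}
backend = {"D": {"id": 4.0}}

from_neighbor = pool_model("B", phone_a, [phone_b, phone_c], backend.get)
from_backend = pool_model("D", phone_a, [phone_b, phone_c], backend.get)
unknown = pool_model("Z", phone_a, [phone_b, phone_c], backend.get)
```

Re-use avoids the power cost and inference delay that option 1 would incur for every new speaker.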



after pooling, every phone (A, B, and C) holds the models of Speakers A, B, and C

ready to run the collaborative inference algorithm

- local inference first
- final inference later



two phases

1. local inference (running independently in parallel on each mobile phone)
2. final inference (after collecting local inference results, to get better confidence about the final classification result)

local inference (LI)

speaker A speaking!!!

- A’s LI results: Prob(A speaking) = 0.65, Prob(B speaking) = 0.25, Prob(C speaking) = 0.10
- B’s LI results: Prob(A speaking) = 0.79, Prob(B speaking) = 0.11, Prob(C speaking) = 0.10
- C’s LI results: Prob(A speaking) = 0.30, Prob(B speaking) = 0.67, Prob(C speaking) = 0.03

individual classification can be misleading!

final inference (FI)

each phone gathers LI results (A’s, B’s, and C’s)

collaborative inference on each phone

A’s LI results × B’s LI results × C’s LI results = FI results (normalized):

- Confidence (A speaking) = 1
- Confidence (B speaking) = 0.12
- Confidence (C speaking) = 0.002

collaborative inference compensates for the inaccuracies of individual inferences
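The FI numbers on the slide are consistent with multiplying each speaker's LI probabilities across the three phones and dividing by the largest product. The sketch below reproduces that arithmetic. The function name `final_inference` is hypothetical, and the paper's full formulation may include details not shown on the slides.

```python
from typing import Dict, List


def final_inference(li_results: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine per-phone LI probabilities into normalized FI confidences."""
    products = {}
    for speaker in li_results[0]:
        p = 1.0
        for li in li_results:
            p *= li[speaker]  # product over phones for this speaker
        products[speaker] = p
    top = max(products.values())
    if top == 0.0:
        return products  # every candidate was ruled out by some phone
    # Normalize so the most likely speaker has confidence 1.
    return {speaker: p / top for speaker, p in products.items()}


li_a = {"A": 0.65, "B": 0.25, "C": 0.10}
li_b = {"A": 0.79, "B": 0.11, "C": 0.10}
li_c = {"A": 0.30, "B": 0.67, "C": 0.03}

fi = final_inference([li_a, li_b, li_c])
rounded = {s: round(c, 3) for s, c in fi.items()}
```

Although Phone C's local inference favors speaker B, the product is dominated by the two phones that agree on speaker A, which is how the combined result corrects the individual error.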




evaluation

C/C++ &

- implemented on Nokia N97 and iPhone in support of a speaker recognition app
- unix server: lightweight reliable protocol to transfer models from the server and between phones
- UDP multicast protocol to distribute local inference results between phones
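Distributing local inference results over UDP multicast could look roughly like the sketch below. The JSON message layout, the group address, the port, and the function names are all assumptions made for illustration; the talk does not describe the wire format.

```python
import json
import socket
from typing import Dict, Tuple

MCAST_GROUP = "239.1.2.3"  # hypothetical administratively scoped group
MCAST_PORT = 50000         # hypothetical port


def encode_li(phone_id: str, li: Dict[str, float]) -> bytes:
    """Serialize one phone's LI results (assumed JSON layout)."""
    return json.dumps({"phone": phone_id, "li": li}).encode("utf-8")


def decode_li(payload: bytes) -> Tuple[str, Dict[str, float]]:
    """Inverse of encode_li."""
    msg = json.loads(payload.decode("utf-8"))
    return msg["phone"], msg["li"]


def send_li(phone_id: str, li: Dict[str, float],
            group: str = MCAST_GROUP, port: int = MCAST_PORT) -> None:
    """Multicast the LI results to nearby phones on the local network."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        # TTL of 1 keeps the datagram on the local network segment.
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 1)
        sock.sendto(encode_li(phone_id, li), (group, port))
    finally:
        sock.close()


payload = encode_li("A", {"A": 0.65, "B": 0.25, "C": 0.10})
sender, results = decode_li(payload)
```

UDP is unreliable, so a receiver would have to run final inference with whichever LI results arrive in time. That trade-off is a design guess on our part, not something stated in the talk.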


experimental scenarios

up to eight people in conversation in three different scenarios (quiet indoor, down the street, in a restaurant)


some numerical results


need for evolution

- train indoor, evaluate outdoor
- accuracy improvement after evolution


indoor quiet scenario

8 people talking around a table

collaborative inference + classification model evolution boost the performance of a mobile sensing app


impact of the number of mobile phones

the larger the number of mobile phones collaborating, the better the final inference result


battery lifetime vs. inference responsiveness

smart duty-cycling techniques and machine learning algorithms with better performance in terms of energy usage on mobile phones need to be identified

PERSONAL OPINION

Contribution

- Implemented unsupervised labeling
- Built and implemented the concept of collaborative sensing

Merit

- Drastic improvement in accuracy
- Shorter learning time

Future work

- Energy management
- Machine resources

THANK YOU

REFERENCE

Emiliano Miluzzo, Cory T. Cornelius, Ashwin Ramaswamy, Tanzeem Choudhury, Zhigang Liu, Andrew T. Campbell. “Darwin Phones: the Evolution of Sensing and Inference on Mobile Phones,” MobiSys 2010.

Talk (ppt), pdf, video, press available



APPENDIX

AUTHOR BACKGROUND

Emiliano Miluzzo (Ph.D)

Andrew T. Campbell (Professor) etc…

Mobile Sensing Group, Dartmouth College, Hanover, NH, USA

http://sensorlab.cs.dartmouth.edu/index.html

RELATED WORK

Sensor node co-operation (paper # 24, 33, 36-38, 45, 49, 52)

- Existing: static sensor network
- Darwin: mobile sensor network

Semi-supervised machine learning (paper # 28, 41, 50)

- Existing: on PC
- Darwin: applied to mobile phone

Heterogeneous sensing device collaboration (paper # 15, 25)

- Existing: only borrow data
- Darwin: execute and share individual inference

Sensing applications on mobile phone (paper # 8-10, 13, 19, 21, 27, 29, 31, 35)

- Existing: individual device
- Darwin: multi-device collaboration

MACHINE PERFORMANCE

SAMPLE APPLICATION

Speaker Model Computation

→ MFCC feature extraction (Mel-Frequency Cepstral Coefficients)

• Leading approach for speech feature extraction [16,17,42]
• Emphasizes the part of the spectrum that human hearing uses

Machine learning algorithm

→ GMM (Gaussian Mixture Model)

• Common in unsupervised machine learning

PRIVACY & SECURITY

- Stores and shares models and features, not raw data (protected, of course)

- Users can opt in and out at any time

- Darwin meets the following:

1. Runs on a trusted device

2. Subscribes to a trusted system

3. Runs as a trusted application, i.e. pre-installed or downloaded from a trusted third party.

COLLABORATIVE INFERENCE

Individual Inference

LI = {Speaker1, Speaker2, .. ,Speaker_k}

Final Inference

EVALUATION SETTING

Situation

• 5 phones
• 8 people
• Several hours a day
• 2 weeks

Voice chunk

• Manually labeled for comparison
