guest lecture: sensec - mobile security through behaviometrics

Jiang Zhu [email protected] September 30th, 2013. Collaborators: Ben Draffin, Sean Wang, Pang Wu, Joy Ying Zhang* (* in alphabetical order)

Upload: jiang-zhu

Post on 01-Nov-2014


DESCRIPTION

Guest Lecture 14-829: Mobile Security. Course website: http://wnss.sv.cmu.edu/courses/14829/f13/

TRANSCRIPT

Page 1: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Jiang Zhu [email protected]

September 30th, 2013

Collaborators: Ben Draffin, Sean Wang, Pang Wu, Joy Ying Zhang*

* In alphabetical order

Page 2: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

[Figure: two bar charts comparing North America and Global, with series labeled 2011 and 2011-2016. Tablets panel (axis 0 to 300), data labels: 15.5, 58.3, 33.6, 256.7. Smartphones panel (axis 0 to 2500), data labels: 138, 253, 698, 2027.]

•  Cisco Visual Networking Index VNI 2012, Cisco Systems Inc. 2012

Smartphone Tablets 3-82-4 2016

Page 3: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

[Figure: bar chart, "Mobile Device Loss or Theft" (axis 0% to 60%)]

Strategy One survey conducted among a U.S. sample of 3,017 adults aged 18 years and older, September 21-28, 2010, with an oversample in the top 20 cities (based on population).

•  “The 329 organizations polled had collectively lost more than 86,000 devices … with average cost of lost data at $49,246 per device, worth $2.1 billion or $6.4 million per organization.”

"The Billion Dollar Lost-Laptop Study," conducted by Intel Corporation and the Ponemon Institute, analyzed the scope and circumstances of missing laptop PCs.

Page 4: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Password: a major source of security vulnerabilities. Easy to guess, reused, forgotten, shared.

•  Application: different applications may have different sensitivities.

•  Usability: authentication is often too frequent, or sometimes too loose.

Page 5: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

Passwords

•  Normal passwords are not strong enough: usually meaningful words that can be remembered

•  Stringent strong-password rules can be annoying

•  Most users do not use password-aid tools (Hong et al. 2009)

Fingerprint? Iris recognition? Face recognition? Voice recognition?

Password for the DHS E-file:

•  Contain from 8 to 16 characters
•  Contain at least 2 of the following 3 character types: uppercase alphabetic, lowercase alphabetic, numeric
•  Contain at least 1 special character (e.g., @, #, $, %, &, *, +, =)
•  Begin and end with an alphabetic character
•  Not contain spaces
•  Not contain all or part of your UserID
•  Not use 2 identical characters consecutively
•  Not be a recently used password

Page 6: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Derived from
    •  Behavioral: the way a human subject behaves
    •  Biometrics: technologies and methods that measure and analyze biological characteristics of the human body
        •  Fingerprints, eye retina, voice patterns

•  BehavioMetrics: measurable behavior used to recognize or to verify
    •  Identity of a human subject, or
    •  Certain behaviors of the subject

Behavioral + Biometrics = BehavioMetrics

Page 7: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Mobile devices come with embedded sensors
    •  Accelerometers, gyroscope, magnetometer
    •  GPS receiver
    •  WiFi, Bluetooth, NFC
    •  Microphone, camera
    •  Temperature, light sensor
    •  “Clock” and “Calendar”

•  Connect with other sensors
    •  EEG, EMG, GSR

•  Mobile devices are connected with the Internet
    •  Upload sensor data to the cloud
    •  Viewing information, computing on the server side

•  Users carry the device almost all the time.
    •  My phone “knows” where I am, what I am doing and my future activities.

Page 8: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Network Factors

•  Personal Factors

•  Behavioral Factors

•  Application Factors

•  Accelerometer
    •  activity, motion, hand trembling, driving style
    •  sleeping pattern
    •  inferred activity level, steps made per day, estimated calories burned

•  Motion sensors, WiFi, Bluetooth
    •  accurate indoor position and trace

•  GPS
    •  outdoor location, geo-trace, commuting pattern

•  Microphone, camera
    •  From background noise: activity, type of location
    •  From voice: stress level, emotion
    •  Video/audio: additional contexts

•  Keyboard, touches, slides
    •  Specific tasks, user interactions, …

Page 9: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


•  Monitor and track user behavior on smartphones using various on-device sensors

•  Convert sensory traces and other context information to Personal Behavior Features

•  Build continuous n-gram model with these features and use it for calculation of Sureness Scores

•  Trigger various Authentication Schemes when a certain application is launched.

Page 10: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Human behavior/activities share some common properties with natural languages
    •  Meanings are composed from the meanings of building blocks
    •  There exists an underlying structure (grammar)
    •  Expressed as a sequence (time series)

•  Apply the rich set of statistical NLP methods to mobile sensory data

[Figure: Zipf’s Law. log(freq) versus rank of words by frequency (ranks 0 to 200)]
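To make the rank-versus-frequency view concrete, here is a minimal Python sketch that builds a rank-frequency table for a stream of behavior labels. The label stream and the function name `rank_frequency` are hypothetical illustrations, not data or code from the lecture.

```python
from collections import Counter

def rank_frequency(tokens):
    """Return (rank, token, count) tuples sorted by descending count."""
    counts = Counter(tokens)
    # Ties are broken alphabetically so the ordering is deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, tok, cnt) for rank, (tok, cnt) in enumerate(ordered, start=1)]

# Hypothetical behavior-text labels (assumption, for illustration only).
behavior_text = "A1 A1 A1 A1 A2 A2 A3 A1 A2 A4".split()
table = rank_frequency(behavior_text)
for rank, token, count in table:
    print(rank, token, count)
```

Plotting log(count) against rank for a real trace would show whether the labels follow a Zipf-like curve, which is the property the slide relies on when borrowing NLP methods.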

Page 11: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Quantization Clustering

Page 12: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Generative language model: P(English sentence) given a model
    •  P(“President Obama has signed the Bill of … ” | Politics) >> P(“President Obama has signed the Bill of … ” | Sports)
    •  The LM reflects the n-gram distribution of the training data: domain, genre, topics.

•  With labeled behavior text data, we can train an LM for each activity type (“walking”-LM, “running”-LM) and classify the activity as the one whose LM assigns the highest probability to the observed behavior text.

Page 13: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


•  User activity at time t depends only on the last n-1 locations

•  Sequence of activities can be predicted by n consecutive activities in the past

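•  Maximum Likelihood Estimation from training data by counting:

$$P(l_i \mid l_{i-n+1}^{i-1}) = \frac{C(l_{i-n+1}^{i})}{C(l_{i-n+1}^{i-1})}$$

where $C(\cdot)$ is the number of times the label sequence occurs in the training data.

•  MLE assigns zero probability to unseen n-grams

•  Incorporate a smoothing function (Katz)
    •  Discount probability for observed n-grams
    •  Reserve probability for unseen n-grams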
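The counting estimate and the zero-probability problem can be sketched in a few lines of Python. This is an illustrative sketch only: the slide names Katz smoothing, while the code below swaps in the much simpler add-alpha (Laplace) smoothing to show the idea of reserving probability mass for unseen n-grams. The function names and the label sequence are assumptions.

```python
from collections import Counter

def train_ngram(labels, n):
    """Count n-grams and their (n-1)-label histories in a label sequence."""
    span = range(len(labels) - n + 1)
    grams = Counter(tuple(labels[i:i + n]) for i in span)
    hist = Counter(tuple(labels[i:i + n - 1]) for i in span)
    return grams, hist

def prob(gram, grams, hist, vocab_size, alpha=0.0):
    """P(last label | history). alpha=0 is plain MLE; alpha=1 is add-one
    smoothing (a stand-in for Katz smoothing, not the same method)."""
    denom = hist[gram[:-1]] + alpha * vocab_size
    return (grams[gram] + alpha) / denom if denom else 0.0

# Hypothetical activity labels (assumption, for illustration only).
labels = "walk walk run walk walk sit".split()
grams, hist = train_ngram(labels, n=2)
vocab_size = len(set(labels))

p_seen = prob(("walk", "walk"), grams, hist, vocab_size)             # MLE, seen
p_unseen = prob(("run", "sit"), grams, hist, vocab_size)             # MLE, unseen
p_smoothed = prob(("run", "sit"), grams, hist, vocab_size, alpha=1)  # smoothed
print(p_seen, p_unseen, p_smoothed)
```

With plain MLE the unseen bigram gets probability zero, which would zero out the probability of any sequence containing it; smoothing gives it a small nonzero value instead.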

Page 14: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Long-distance dependency of words in sentences
    •  Tri-grams for “I hit the tennis ball”: “I hit the”, “hit the tennis”, “the tennis ball”
    •  “I hit ball” is not captured

•  Future activities depend on activities far in the past. Intermediate behavior has little relevance or influence.

•  Noise in the data sets: “ping-pong” effects in time series, interference, sampling errors, etc.

•  Model size
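The tri-gram limitation above is easy to check directly. The short Python sketch below enumerates the contiguous tri-grams of the example sentence and confirms that the long-distance combination is absent. The helper name `ngrams` is an assumption for illustration.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

sentence = "I hit the tennis ball".split()
trigrams = ngrams(sentence, 3)

# The dependency between "hit" and "ball" spans intermediate words,
# so no contiguous tri-gram contains it.
captured = ("I", "hit", "ball") in trigrams
print(trigrams, captured)
```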

Page 15: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Build BehavioMetrics models for M classes P_0, P_1, P_2, …, P_{M-1}
    •  Genders, age groups, occupations
    •  Behaviors, activities, actions
    •  Health and mental status

•  For a new behavioral text string L, we calculate the probability that L is generated by model m

•  Classification problem formulated as

$$P(L, m) = P(l_1, l_2, \ldots, l_N, m) = \prod_{i=1}^{N} P_m(l_i \mid l_{i-n+1}^{i-1})$$

$$u = \arg\max_m P(L, m) = \arg\max_m \sum_{i=1}^{N} \log P_m(l_i \mid l_{i-n+1}^{i-1})$$
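The argmax over summed log-probabilities can be sketched as follows. This is a minimal illustration under stated assumptions: it uses bigram models (n = 2) with add-one smoothing, and the class `BigramModel`, the function `classify`, and the training strings are all hypothetical, not the lecture's implementation.

```python
import math
from collections import Counter

class BigramModel:
    """Bigram model P_m(l_i | l_{i-1}) with add-one smoothing (an assumption)."""

    def __init__(self, labels, vocab):
        self.vocab = vocab
        self.grams = Counter(zip(labels, labels[1:]))
        self.hist = Counter(labels[:-1])

    def log_prob(self, seq):
        """Sum of log P_m(l_i | l_{i-1}) over the sequence."""
        v = len(self.vocab)
        return sum(
            math.log((self.grams[(a, b)] + 1) / (self.hist[a] + v))
            for a, b in zip(seq, seq[1:])
        )

def classify(seq, models):
    """Return the class whose model maximizes the summed log-probability."""
    return max(models, key=lambda name: models[name].log_prob(seq))

# Hypothetical training strings per class (assumption, for illustration only).
vocab = {"A1", "A2", "A3"}
models = {
    "walking": BigramModel("A1 A2 A1 A2 A1 A2 A1 A2".split(), vocab),
    "running": BigramModel("A3 A3 A2 A3 A3 A2 A3 A3".split(), vocab),
}
predicted = classify("A1 A2 A1 A2".split(), models)
print(predicted)
```

Working in log space avoids numerical underflow when N is large, which is why the second form of the equation is the one typically computed.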

Page 16: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Is this play Shakespeare’s work?

•  Compare the play to Shakespeare’s known library of works

•  Track word and phrase patterns in the data

•  Calculate the probability of the unknown play U given all of Shakespeare’s known works {S}

•  Compare with a threshold θ
    •  Authentic work (a=1)
    •  Fake, forgery or plagiarism (a=0)

$$a = \mathrm{sign}[P(U \mid \{S\}) > \theta]$$

Page 17: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  A special binary classification problem

•  Given a normal BehavioMetrics model P_n, a new behavior text sequence L, and a threshold θ, calculate the likelihood that L is generated by P_n and compare it with θ

•  If the outcome is -1, flag an anomaly alert

•  Variation caused by noise can be smoothed out statistically

•  Feedback is needed to handle false positives, which are usually caused by unseen behaviors or a sub-optimal threshold.

$$a(L \mid n, \theta) = \mathrm{sign}[P(L, n) > \theta]$$
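A minimal Python sketch of the threshold test follows. It assumes the score is an average of per-label log-probabilities over a sliding window, which is one plausible way to smooth out noise; the window length, threshold value, scores, and function names are all hypothetical.

```python
def anomaly_flag(avg_log_prob, theta):
    """Return +1 if the score clears the threshold, -1 (anomaly) otherwise."""
    return 1 if avg_log_prob > theta else -1

def sliding_flags(log_probs, window, theta):
    """Average per-label log-probabilities over each window and flag it."""
    flags = []
    for start in range(len(log_probs) - window + 1):
        avg = sum(log_probs[start:start + window]) / window
        flags.append(anomaly_flag(avg, theta))
    return flags

# Hypothetical per-label log-probabilities under the normal model:
# the first half looks like the owner, the second half does not.
scores = [-0.5, -0.6, -0.4, -3.0, -3.2, -2.9]
flags = sliding_flags(scores, window=3, theta=-1.5)
print(flags)
```

Averaging over a window means a single noisy label does not trigger an alert on its own, at the cost of a short delay before a real change is flagged.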

Page 18: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

[Figure: average log probability versus sliding window position (tick values 0 to 0.8), showing the log probability curve with a low threshold and a high threshold; regions labeled A, B, C, D]

Page 19: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


•  Convert feature vector series to label streams – dimension reduction

•  Step window with assigned length

A1 A2 A1 A4

G2 G5 G2 G2

W2 W1 W2

P1 P3 P6 P1

A2 G2G5 W1 P1P3 A1A4 G2 W1W2 P1
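Converting feature vectors into a label stream can be sketched as nearest-centroid vector quantization. In the Python sketch below the centroids are given by hand; in practice they would come from a clustering step. The centroid values, the feature windows, and the "A" prefix convention are assumptions for illustration.

```python
def quantize(vectors, centroids, prefix):
    """Map each feature vector to the label of its nearest centroid."""
    labels = []
    for v in vectors:
        # Squared Euclidean distance to every centroid.
        dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
        labels.append(f"{prefix}{dists.index(min(dists)) + 1}")
    return labels

# Hypothetical accelerometer centroids and per-window feature vectors.
accel_centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
accel_windows = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.0), (4.8, 5.1)]
stream = quantize(accel_windows, accel_centroids, "A")
print(" ".join(stream))
```

Each multi-dimensional window collapses to a single symbol, which is the dimension reduction the slide describes and what makes n-gram modeling applicable.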

Page 20: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Page 21: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

[Diagram labels: Sensor Fusion and Segmentation; Quantization; Clustering; Activity Recognition; Risk Analysis Tree; Certainty of Risk; a “<” comparison; Application Sensitivity; Application Access Control]

Page 22: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

[Diagram: pipeline stages Sensing, Preprocessing, Modeling, Inference. Components: Feature Construction, Behavior Text Generation, N-gram Model, Classifier, Binary Classifier, Threshold, User Classification, User Authentication]

•  SenSec collects sensor data
    •  Motion sensors
    •  GPS and WiFi scanning
    •  In-use applications and their traffic patterns

•  SenSec module builds user behavior models
    •  Unsupervised activity segmentation; model the sequence using a language model
    •  Build a Risk Analysis Tree (DT) to detect anomalies
    •  Combine the above to estimate risk (online): certainty score

•  Application Access Control module activates authentication based on the score and a customizable threshold.
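The access-control decision can be sketched as a comparison between the current score and a per-application threshold. This Python sketch is hypothetical: the score range of 0 to 1, the application names, the threshold values, and the function `needs_authentication` are assumptions and not the SenSec implementation.

```python
# Hypothetical per-application thresholds: more sensitive apps demand
# a higher sureness score before launching without a challenge.
APP_THRESHOLDS = {"email": 0.8, "browser": 0.5, "clock": 0.0}

def needs_authentication(app, sureness, thresholds=APP_THRESHOLDS, default=0.6):
    """Return True if the user should be challenged before the app launches."""
    required = thresholds.get(app, default)  # unknown apps use the default
    return sureness < required

print(needs_authentication("email", 0.7))    # sensitive app, moderate score
print(needs_authentication("browser", 0.7))  # less sensitive app, same score
```

Keeping the threshold per application reflects the earlier point that different applications have different sensitivities, so one score can unlock a browser while still challenging the user for email.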

Page 23: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Accelerometer
    •  Used to summarize acceleration stream
    •  Calculated separately for each dimension [x, y, z, m]
    •  Meta features: Total Time, Window Size

•  GPS: location string from Google Map API and mobility path

•  WiFi: SSIDs, RSSIs and path

•  Applications: Bitmap of well-known applications

•  Application Traffic Pattern: TCP/UDP traffic pattern vectors: [remote host, port, rate]
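As an illustration of per-dimension accelerometer features, the Python sketch below summarizes one window of samples. The transcript does not list the actual statistics, so mean and standard deviation are assumed here, and "m" is assumed to be the magnitude of the (x, y, z) vector. The function name and sample values are hypothetical.

```python
import math
import statistics

def window_features(samples):
    """Summarize one window of (x, y, z) samples per dimension, plus m.

    Assumptions: m is the vector magnitude; mean and population standard
    deviation stand in for whatever statistics the real system uses.
    """
    xs, ys, zs = zip(*samples)
    ms = [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]
    features = {}
    for name, values in zip("xyzm", (xs, ys, zs, ms)):
        features[f"{name}_mean"] = statistics.fmean(values)
        features[f"{name}_std"] = statistics.pstdev(values)
    return features

# Hypothetical window of four accelerometer samples.
window = [(0.0, 0.0, 9.8), (0.0, 0.0, 9.8), (3.0, 4.0, 0.0), (3.0, 4.0, 0.0)]
feats = window_features(window)
print(feats)
```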

Page 24: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Page 25: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Offline data collection (for training and testing)
    •  Pick up the device from a desk
    •  Unlock the device using the right slide pattern
    •  Invoke Email app from the “Home Screen”
    •  Some typing on the soft keyboard
    •  Lock the device by pressing the “Power” button
    •  Put the device back on the desk

Page 26: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Page 27: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Page 28: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  71.3% True-Positive Rate with 13.1% False-Positive Rate

Page 29: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Alpha test in Jun 2012, first Google Play Store release in Oct 2012

•  False positives: a 13% FPR still annoys users at times

Possible Solutions

•  Use an adaptive model
    •  Add the trace data shortly before a false positive to the training data and update the model

•  Change passcode validation to sliding pattern

•  A false positive will grant a “free ride” for a configurable duration
    •  Assumption: a just-authenticated user should control the device for a given period of time
    •  The “free ride” period will end immediately if an abrupt context change is detected.
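The “free ride” rule can be sketched as a small state holder. This Python sketch is an assumption-laden illustration: the class name `FreeRide`, the use of plain numeric timestamps in seconds, and the method names are hypothetical, not taken from SenSec.

```python
class FreeRide:
    """Track a grace period that starts after a successful authentication."""

    def __init__(self, duration_s):
        self.duration_s = duration_s  # configurable duration, in seconds
        self.expires_at = None        # None means no free ride is active

    def on_authenticated(self, now):
        """Start (or restart) the free ride after the user passes a challenge."""
        self.expires_at = now + self.duration_s

    def on_context_change(self):
        """End the free ride immediately on an abrupt context change."""
        self.expires_at = None

    def active(self, now):
        """Return True while further challenges should be suppressed."""
        return self.expires_at is not None and now < self.expires_at

ride = FreeRide(duration_s=60)
ride.on_authenticated(now=1000)
print(ride.active(now=1030))  # within the grace period
ride.on_context_change()
print(ride.active(now=1031))  # cancelled by the context change
```

Passing the timestamp in explicitly keeps the logic easy to test; a real application would supply a monotonic clock reading.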

Page 30: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Page 31: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Hypothesis: the micro-behavior with which a user interacts with the soft keyboard reflects his/her cognitive and physical characteristics.
    •  Cognitive fingerprints: typing rhythms, correction rate, delay between keys, duration at each key, …
    •  Physical characteristics: area of pressure, amount of pressure, position of contact, shift, …

Page 32: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Page 33: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


•  When pressing a key, the lifting-up position drifts away from the touch-down position.
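The drift between touch-down and lift-up can be turned into a simple feature. The Python sketch below computes the offset and its length for one key press; the coordinate values, pixel units, and the function name `drift` are assumptions for illustration.

```python
import math

def drift(down, up):
    """Return (dx, dy, distance) from the touch-down to the lift-up position."""
    dx, dy = up[0] - down[0], up[1] - down[1]
    return dx, dy, math.hypot(dx, dy)

# Hypothetical screen coordinates (in pixels) for one key press.
dx, dy, dist = drift(down=(100.0, 200.0), up=(103.0, 204.0))
print(dx, dy, dist)
```

Both the direction (dx, dy) and the magnitude may carry user-specific signal, so a feature vector could reasonably keep all three values.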

Page 34: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


Page 35: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


•  Discriminative model can identify a user at 99% accuracy with just one keypress:

•  When all users’ behavior is known.

•  Models trained over 4000 keys each from 4 users.

•  Generative model to detect unauthorized use from an unknown user

•  Only the authorized user’s behavior is known

•  After 15 key presses: detection rate is 86%, with a False Acceptance Rate (FAR) of 14% and a False Rejection Rate (FRR) of only 2.2%.
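The relationship between these rates can be made explicit with a short Python sketch. The trial counts below are hypothetical, chosen only so that the resulting percentages match the figures on the slide; they are not the experiment's actual counts. The detection rate for unauthorized use is taken to be 1 - FAR, which is consistent with the reported 86% and 14%.

```python
def rates(impostor_accepted, impostor_total, genuine_rejected, genuine_total):
    """Compute FAR, FRR and the impostor detection rate from trial counts."""
    far = impostor_accepted / impostor_total  # impostors wrongly accepted
    frr = genuine_rejected / genuine_total    # owner wrongly rejected
    return {"FAR": far, "FRR": frr, "detection_rate": 1.0 - far}

# Hypothetical counts (assumption): 14 of 100 impostor trials accepted,
# 11 of 500 genuine trials rejected.
r = rates(impostor_accepted=14, impostor_total=100,
          genuine_rejected=11, genuine_total=500)
print(r)
```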

Page 36: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Experiments discover anomalous usage with ~80% accuracy with only days of training data

[Diagram labels: Sensor Fusion and Segmentation; Quantization; Clustering; Activity Recognition; Risk Analysis Tree; Certainty of Risk; a “<” comparison; Application Sensitivity; Application Access Control]

Page 37: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

•  Extended data set for feature construction
    •  TCP, UDP traffic; sound; ambient lighting; battery status, etc.

•  Data and Modeling
    •  Gain more insights into the data, features and factorized relationships among various sensors
    •  Try other classification methods and compare results: LR, SVM, Random Forest, etc.

•  Enhanced security of SenSec components
    •  Integration with Android security framework and other applications

•  Privacy as expectation (Liu et al., 2012)
    •  Users need to know where the data resides and how the data is going to be used and shared. Whom to trust the data with?

•  Energy efficiency

Page 38: Guest Lecture: SenSec - Mobile Security through BehavioMetrics


•  Participate in MobiSens and StressSens Data Collection Experiments: http://mlt.sv.cmu.edu:3000/

•  Sign up as a beta tester for SenSec 2.0 and KeySens 1.0

Page 39: Guest Lecture: SenSec - Mobile Security through BehavioMetrics

Thank you.