passwords - cs.cornell.edushmat/courses/cs6431/passwords1.pdf“in a collection of 3,289 passwords...

Passwords Tom Ristenpart CS 6431


Post on 14-Jun-2018


Page 1

Passwords

Tom Ristenpart, CS 6431

Page 2

The game plan

• Basics, history
  – Morris and Thompson 1979
• The research landscape
• Current practices in industry
• Measuring password distributions:
  – Florencio & Herley (client-side measurement)
  – Bonneau (server-side measurement)
  – Understanding password strength metrics

Page 3

Password use cases

• OS login
• Website / service login
• PINs
• Encryption

Authentication

Confidentiality

Page 4

Morris-Thompson paper

Register: tom, pw

Store tom, pw in some form.

login: tom, pw’

Authentication service

Check that pw’ = pw

What are security threats?

Page 5

Brute-force attacks

• Offline brute-force attacks
  – Compromise database
  – E.g.: “cracking” via dictionary attacks
  – Countermeasures: hash passwords with purposefully slow-to-compute cryptographic hash function (was: MD5, SHA-1; now: argon2, scrypt)
• Online brute-force attacks
  – E.g.: Submit guesses to website
  – Countermeasures: Rate limit, account lockout
• Shoulder surfing, compelled password disclosure, malware, side-channels, ...

Page 7

Offline brute-force attacks

Dictionary: List of probable passwords or way of generating them

h1 = H(pw1)
h2 = H(pw2)
…
hm = H(pwm)

H(guess1), H(guess2), …

Check if any guesses equal any of h1, …, hm

Nowadays: Use password leaks to inform dictionary
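The attack on this slide can be sketched in a few lines. This is a minimal illustration, assuming unsalted SHA-256 as a stand-in for H; the leaked passwords and the dictionary are made-up values, not data from the lecture.

```python
import hashlib

def H(pw: str) -> str:
    """Unsalted, fast hash; SHA-256 is an illustrative stand-in for H."""
    return hashlib.sha256(pw.encode("utf-8")).hexdigest()

def dictionary_attack(leaked_hashes, dictionary):
    """Return {hash: guess} for every leaked hash matched by some guess."""
    targets = set(leaked_hashes)
    cracked = {}
    for guess in dictionary:
        h = H(guess)          # one hash computation per guess...
        if h in targets:      # ...is checked against all m accounts at once
            cracked[h] = guess
    return cracked

# Hypothetical leaked database h1..hm and a small dictionary.
leaked = [H("123456"), H("iloveyou"), H("correct horse battery staple")]
guesses = ["password", "123456", "iloveyou", "monkey"]
cracked = dictionary_attack(leaked, guesses)
```

Because H is unsalted here, each hashed guess is tested against every account for free, which is part of what salting is meant to prevent.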

Page 8

“Human beings being what they are, there is a strong tendency for people to choose relatively short and simple passwords that they can remember.”

[Morris, Thompson 1979]

Page 9

“In a collection of 3,289 passwords gathered from many users over a long period of time,

15 were a single ASCII character;
72 were strings of two ASCII characters;
464 were strings of three ASCII characters;
477 were strings of four alphamerics;
706 were five letters, all upper-case or all lower-case;
605 were six letters, all lower-case.”

[Morris, Thompson 1979]

“The results were disappointing, except to the bad guy.”

86% crackable
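As a quick check on the quoted numbers, the six itemized classes can be summed directly. The counts below are the ones on the slide. The reading that the 86% figure also includes guessable passwords beyond these six classes is an assumption, since the slide does not itemize the remainder.

```python
# Counts quoted on the slide from Morris and Thompson (1979).
total = 3289
itemized = {
    "one ASCII character": 15,
    "two ASCII characters": 72,
    "three ASCII characters": 464,
    "four alphamerics": 477,
    "five letters, single case": 706,
    "six letters, lower-case": 605,
}

listed = sum(itemized.values())
fraction_listed = listed / total   # share of the sample in the six classes
gap_to_86_percent = 0.86 - fraction_listed
```

The six classes alone cover roughly 71% of the sample, so the slide's 86% presumably counts further easily guessed passwords as well.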

Page 10

Morris, Thompson suggest:
– Slow hashing (they called it encryption)
– Less predictable passwords (pw requirements)
– Salt before hashing
– Use custom version of DES to avoid hardware
– Avoid timing attacks to distinguish bad login

“We did not attempt to hide the security aspects of the operating system, thereby playing the customary make-believe game in which weaknesses of the system are not discussed no matter how apparent.”

Page 11

The research landscape since 1979…

• Understanding user password selection
  – Measuring password strength [see citations in Bonneau paper], [Li, Han ’14], [CMU papers]
  – Measuring password reuse
• Usability
  – Strength meters, requirements, etc. [Komanduri et al. ’11] [Dell’Amico, Filippone ’15] [Wheeler ’16] [Melicher et al. ’16]
  – Password expiration [Zhang et al. ’12]
  – Typo-tolerance [Chatterjee et al. ’16]
• Password transmission, login logic
  – Single sign-on (SSO) technologies
  – Password-based authenticated key exchange [Bellovin, Merritt ’92]
• Password hashing
  – New algorithms [PKCS standards], [Percival ’09], [Biryukov, Khovratovich ’15]
  – Proofs [Wagner, Goldberg ’00] [Bellare, Ristenpart, Tessaro ’12]
• Improving offline brute-force attacks
  – Time-space trade-offs (rainbow tables) [Hellman ’80], [Oechslin ’03], [Narayanan, Shmatikov ’05]
  – Better dictionaries [John the Ripper], [Weir et al. ’09], [Ma et al. ’14]
• Password managers
  – Decoy-based [Bojinov et al. ’10], [Chatterjee et al. ’15]
  – Breaking password managers [Li et al. ’14] [Silver et al. ’15]
  – Stateless password managers [Ross et al. ’05]

Page 12

Password hashing

• Recall Morris, Thompson goal: slow down brute-force attacks
• PKCS#5 approach:

pw || salt → H → H → … → H → h = H^c(pw || salt)    (H applied c times)

H : {0,1}* -> {0,1}^n is cryptographic hash function (e.g., SHA-256)

salt should be random bit string large enough to be unpredictable
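The chained construction in the diagram can be sketched as follows. This mirrors the slide's picture with SHA-256 as H. It is not an implementation of PBKDF2 itself, and the password, salt length, and iteration count are illustrative assumptions.

```python
import hashlib
import os

def iterated_hash(pw: bytes, salt: bytes, c: int) -> bytes:
    """Compute H^c(pw || salt): hash pw || salt, then re-hash c - 1 times."""
    if c < 1:
        raise ValueError("c must be at least 1")
    digest = hashlib.sha256(pw + salt).digest()
    for _ in range(c - 1):
        digest = hashlib.sha256(digest).digest()
    return digest

salt = os.urandom(16)                      # random, unpredictable salt
h = iterated_hash(b"hunter2", salt, c=10_000)
```

Each guess now costs the attacker c hash evaluations instead of one. In practice one would likely reach for a vetted routine such as the standard library's `hashlib.pbkdf2_hmac` rather than a hand-rolled loop.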

Page 13

Password hashing

• Recall Morris, Thompson goal: slow down brute-force attacks
• The role of salts:
  – Prevents use of time-memory trade-offs (rainbow tables)
  – Cracking m accounts requires m times the work

h1 = H^c(pw1, salt1)
h2 = H^c(pw2, salt2)
…
hm = H^c(pwm, saltm)

Proofs: See [Bellare, Ristenpart, Tessaro ’12]
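A small experiment makes the "m times the work" point concrete. In this sketch `hashlib.pbkdf2_hmac` stands in for H^c (an assumption, not necessarily what the lecture had in mind), and the users, passwords, and dictionary are hypothetical.

```python
import hashlib
import os

def Hc(pw: bytes, salt: bytes, c: int = 1_000) -> bytes:
    # PBKDF2-HMAC-SHA256 stands in for the slide's H^c (an assumption).
    return hashlib.pbkdf2_hmac("sha256", pw, salt, c)

def crack_salted(records, dictionary):
    """records: list of (salt, hash). Returns (cracked, number of Hc calls)."""
    cracked, calls = {}, 0
    for idx, (salt, target) in enumerate(records):
        for guess in dictionary:   # the whole dictionary is rehashed per salt
            calls += 1
            if Hc(guess, salt) == target:
                cracked[idx] = guess
    return cracked, calls

# Three hypothetical users who all chose the same password.
salts = [os.urandom(16) for _ in range(3)]
records = [(s, Hc(b"123456", s)) for s in salts]
dictionary = [b"password", b"123456", b"iloveyou", b"monkey"]
cracked, calls = crack_salted(records, dictionary)
```

Although all three users share a password, the stored hashes differ, and the attacker performs one H^c evaluation per (account, guess) pair rather than one per guess.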

Page 14

LinkedIn circa 2012 stored passwords as which of:
• pw
• MD5(pw)
• H(salt, pw)
• H^c(salt, pw)

2012: 6.5 million hashes leaked onto Internet; 90% cracked in 2 weeks

2016: 177.5 million more hashes leaked; 98% cracked in 1 week

http://arstechnica.com/security/2016/06/how-linkedins-password-sloppiness-hurts-us-all/

Page 15

Facebook password “onion”

$cur = ‘password’
$cur = md5($cur)
$salt = randbytes(20)
$cur = hmac_sha1($cur, $salt)
$cur = remote_hmac_sha256($cur, $secret)
$cur = scrypt($cur, $salt)
$cur = hmac_sha256($cur, $salt)

Split-trust model: must compromise both servers to mount offline brute-force
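The layering can be approximated in Python as a sketch. Several details are assumptions rather than anything the slide specifies: the salt is used as the HMAC key, the scrypt cost parameters are arbitrary, and the remote service is simulated by a local function holding a hypothetical `REMOTE_SECRET`.

```python
import hashlib
import hmac
import os

# Hypothetical: in the split-trust model this key lives only on a second server.
REMOTE_SECRET = os.urandom(32)

def remote_hmac_sha256(data: bytes) -> bytes:
    """Local stand-in for a network call to the separate key-holding service."""
    return hmac.new(REMOTE_SECRET, data, hashlib.sha256).digest()

def onion(password: str, salt: bytes) -> bytes:
    cur = password.encode("utf-8")
    cur = hashlib.md5(cur).digest()                    # innermost legacy layer
    cur = hmac.new(salt, cur, hashlib.sha1).digest()   # salt as key: assumption
    cur = remote_hmac_sha256(cur)                      # needs the remote secret
    cur = hashlib.scrypt(cur, salt=salt, n=2**14, r=8, p=1, dklen=32)  # slow step
    cur = hmac.new(salt, cur, hashlib.sha256).digest()
    return cur

salt = os.urandom(20)
stored = onion("password", salt)
```

An attacker who steals only the password database cannot evaluate the `remote_hmac_sha256` layer offline, which is the point of the split-trust model.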

Page 16

Understanding password strength

(1) Empirical studies of user passwords
    Leaks
    Instrumentation of large web systems (Bonneau paper)
    Instrumentation of clients (Florencio & Herley)

(2) Develop probabilistic model of passwords
    pw1, pw2, …, pwN
    p(pwi) = pi = probability user selects password pwi
    $\sum_i p_i = 1$

(3) Use p to educate brute-force crackers, strength meters

Page 17

Rockyou data breach: 32 million social gaming accounts

Count     Password
290729    123456
79076     12345
76789     123456789
59462     password
49952     iloveyou
33291     princess
21725     1234567
20901     rockyou
20553     12345678
16648     abc123
16227     nicole
15308     daniel
15163     babygirl
14726     monkey
14331     lovely
14103     jessica

[Bonneau 2012] 69 million Yahoo! passwords: 1.1% of users pick same password

Most common password used by almost 1%
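The counts above turn into empirical probabilities by dividing by the number of accounts. The sketch below uses the slide's rounded figure of 32 million as the denominator, so the results are approximate.

```python
TOTAL_ACCOUNTS = 32_000_000   # slide's rounded figure; an approximation

top_counts = {
    "123456": 290729, "12345": 79076, "123456789": 76789, "password": 59462,
    "iloveyou": 49952, "princess": 33291, "1234567": 21725, "rockyou": 20901,
    "12345678": 20553, "abc123": 16648, "nicole": 16227, "daniel": 15308,
    "babygirl": 15163, "monkey": 14726, "lovely": 14331, "jessica": 14103,
}

# Empirical probability p_i of each listed password.
p = {pw: n / TOTAL_ACCOUNTS for pw, n in top_counts.items()}

top1 = p["123456"]            # mass of the single most common password
top16_mass = sum(p.values())  # success rate of guessing all 16 listed passwords
```

Under this approximation the most common password alone accounts for a bit over 0.9% of accounts, consistent with the "almost 1%" remark.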

Page 18

Rockyou empirical probability mass function

[Plot: probability mass (axis from 0 to 0.008) for passwords in rank order, starting at rank 1. Only first 5,000 points shown.]

Page 19

Florencio & Herley study

• Instrument Windows Live toolbar
  – 544,960 clients opted in to study
• Captured passwords typed into browser
  – Hashed and stored locally
  – Sent report to server about (quantized) password strength, associated URL, etc.

Page 20

Florencio & Herley 2007

• Avg user:
  – Has 6.5 passwords, each used at 3.9 different sites
  – Has 25 accounts requiring passwords
  – Types 8 passwords per day
  – Selects a password with 40.54 bits of “bitstrength”
• ~1.5% of Yahoo users forget their passwords each month (!)

Page 22

Bonneau Yahoo password study

• Instrument login infrastructure
  – 69 million accounts monitored
• Hash passwords with key H(K, pw) and store result in histogram
• Throw away K
  – Can’t do brute-force attacks later on
  – Only learn empirical distribution of passwords
• Also stored some demographic information
• How do we measure strength of password distribution?

Page 23

Password strength metrics

• Florencio and Herley approach?
  – Alphasize(pw) = sum of the sizes of character classes observed in password
    • Hello12! has alphabet size = 26+26+10+22 = 84
  – Bitstrength(pw) = Alphasize(pw)^len(pw), reported in bits as log2 of this quantity, i.e. len(pw) · log2 Alphasize(pw)
• Simpler than classical NIST entropy estimate
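The metric is easy to compute, which the following sketch illustrates. The class sizes are those implied by the slide's example (26 + 26 + 10 + 22). Treating every character that is not an ASCII letter or digit as "special" is a simplifying assumption of this sketch.

```python
import math
import string

# Class sizes from the slide's example; 22 is the slide's figure for specials.
CLASS_SIZES = {"lower": 26, "upper": 26, "digit": 10, "special": 22}

def alphasize(pw: str) -> int:
    """Sum the sizes of the character classes that appear in pw."""
    classes = set()
    for ch in pw:
        if ch in string.ascii_lowercase:
            classes.add("lower")
        elif ch in string.ascii_uppercase:
            classes.add("upper")
        elif ch in string.digits:
            classes.add("digit")
        else:
            classes.add("special")   # assumption: anything else is "special"
    return sum(CLASS_SIZES[c] for c in classes)

def bitstrength(pw: str) -> float:
    """log2(alphasize ** len) = len * log2(alphasize); 0 for the empty string."""
    if not pw:
        return 0.0
    return len(pw) * math.log2(alphasize(pw))

example_bits = bitstrength("Hello12!")   # 8 characters over an 84-symbol alphabet
```

Note that the score depends only on length and character classes, so "Password1!" rates far higher than its actual guessability warrants.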

Page 24

Password strength metrics

Shannon entropy:

Let X be password distribution. Passwords are drawn iid from X.
N is size of support of X.
p1, p2, …, pN are probabilities of passwords in decreasing order.

$H_1(X) = \sum_{i=1}^{N} -p_i \log p_i$

Page 25

Shannon entropy is poor measure (for password unpredictability)

N = 1,000,000
p1 = 1/100
p2 = (1 – 1/100)/999,999 ≈ 1/2^20
…
pN = (1 – 1/100)/999,999 ≈ 1/2^20

[Plot: one password with probability .01, all others with probability about 2^-20]

What is probability of success if attacker makes one guess?

H1(X) ≈ 19

19 bits of “unpredictability” would suggest a probability of success of about 1/2^19, yet a single guess at the most likely password succeeds with probability p1 = 1/100.

Shannon entropy is almost never useful measure for security

H∞(X) = -log p1 ≈ 6.6
The min-entropy of X
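The example distribution can be evaluated numerically to compare the two measures. This sketch uses base-2 logarithms so that both results are in bits.

```python
import math

N = 1_000_000
p1 = 1 / 100                    # one very popular password
p_rest = (1 - p1) / (N - 1)     # the other N - 1 passwords, equally likely

# Shannon entropy H1(X) = sum over all passwords of -p_i * log2(p_i)
shannon = -(p1 * math.log2(p1) + (N - 1) * p_rest * math.log2(p_rest))

# Min-entropy depends only on the most likely password.
min_entropy = -math.log2(p1)

# Success probability of an attacker who makes exactly one (optimal) guess.
one_guess_success = p1
```

The Shannon entropy comes out between 19 and 20 bits while the min-entropy is about 6.6 bits, and only the latter reflects the 1-in-100 chance that a single guess succeeds.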

Page 26

Password strength metrics

Beta-success-rate:

$\lambda_\beta(X) = \sum_{i=1}^{\beta} p_i$

Alpha-work-factor:

$\mu_\alpha(X) = \min \left\{ j \;\middle|\; \sum_{i=1}^{j} p_i \ge \alpha \right\}$

$\tilde{\lambda}_\beta(X) = \log(\beta / \lambda_\beta(X))$

$\tilde{\mu}_\alpha(X) = \log(\mu_\alpha(X) / \lambda_{\mu_\alpha}(X))$
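These four quantities translate directly into code. The sketch below assumes base-2 logarithms (so results are in bits) and takes the distribution as a plain list of probabilities. The function names are illustrative, not from the paper.

```python
import math

def beta_success_rate(probs, beta):
    """lambda_beta: total mass of the beta most likely passwords."""
    ordered = sorted(probs, reverse=True)
    return sum(ordered[:beta])

def alpha_work_factor(probs, alpha):
    """mu_alpha: fewest guesses whose cumulative mass reaches alpha."""
    total = 0.0
    for j, p in enumerate(sorted(probs, reverse=True), start=1):
        total += p
        if total >= alpha:
            return j
    raise ValueError("alpha exceeds the total probability mass")

def lambda_bits(probs, beta):
    """Tilde-lambda: log2(beta / lambda_beta)."""
    return math.log2(beta / beta_success_rate(probs, beta))

def mu_bits(probs, alpha):
    """Tilde-mu: log2(mu_alpha / lambda_{mu_alpha})."""
    mu = alpha_work_factor(probs, alpha)
    return math.log2(mu / beta_success_rate(probs, mu))

uniform = [1 / 1024] * 1024          # uniform over 2^10 passwords
skewed = [0.5, 0.25, 0.125, 0.125]   # toy skewed distribution
```

For a uniform distribution over 2^10 passwords both converted metrics give 10 bits, which is the sanity check one would want from a measure expressed as an effective key length.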

Page 27

the size or composition of leaked data sets. Thus far, for example, no leaked sources have included demographic data.

We addressed both problems with a novel experimental setup and explicit cooperation from Yahoo!, which maintains a single password system to authenticate users for its diverse suite of online services. Our experimental data collection was performed by a proxy server situated in front of live login servers. This is required as long-term password storage should include account-specific salting and iterated hashing which prevent constructing a histogram of common choices, just as they mitigate pre-computed dictionary attacks [39].

Our proxy server sees a stream of pairs (u, password_u) for each user u logging in to any Yahoo! service. Our goal is to approximate distinct password distributions X_{f_i} for a series of demographic predicates f_i. Each predicate, such as “does this user have a webmail account?”, will typically require a database query based on u. A simplistic solution would be for the proxy to emit a stream of tuples (H(password_u), f_1(u), f_2(u), . . . ), removing user identifiers u to prevent trivial access to real accounts and using a cryptographic hash function H to mask the values of individual passwords.^8 There are two major problems to address:

A. Preventing password cracking

If a user u can be re-identified by the uniqueness of his or her demographic predicates [40], then the value H(password_u) could be used as an oracle to perform an offline dictionary attack. Such a re-identification attack was demonstrated on a data set of movie reviews superficially anonymized for research purposes [41] and would almost certainly be possible for most users given the number and detail of predicates we would like to study.

This risk can be effectively mitigated by prepending the same cryptographically random nonce r to each password prior to hashing. The proxy server must generate r at the beginning of the study and destroy it prior to making data available to researchers. By choosing r sufficiently long to prevent brute-force (128 bits is a conservative choice) and ensuring it is destroyed, H(r || password_u) is useless for an attacker attempting to recover password_u but the distribution of hash values will remain exactly isomorphic to the underlying distribution of passwords seen.

B. Preventing cross-account compromise

While including a nonce prevents offline search, an attacker performing large-scale re-identification can still identify sets of users which have a password in common. This decreases security for all users in a group which share a password, as an attacker may then gain access to all accounts in the group by recovering just one user’s password by auxiliary means such as phishing, malware, or compromise of an external website for which the password was re-used.

^8 Note that H cannot incorporate any user-specific salt: doing so would occlude the frequency of repeated passwords.

[Figure 3 plot: metric value (bits) against lg M, with curves for the estimates of H0, G, H1, µ0.25, λ10 and H∞.]

Figure 3. Changing estimates of guessing metrics with increasing sample size M. Estimates for H∞ and λ10 converge very quickly; estimates for µ0.25 converge around M = 2^22 (marked ×) as predicted in Section V-A. Estimates for H0, H1, and G are not close to converging.

Solving this problem requires preventing re-identification by not emitting vectors of predicates for each user.

Instead, the proxy server maintains a histogram H_i of observed hash values for each predicate f_i. For each pair (u, password_u) observed, the proxy server adds H(r || password_u) to each histogram H_i for which f_i(u) is true. An additional list is stored of all previously seen hashed usernames H(r || u) to prevent double-counting users.

C. Deployment details

The collection code, consisting of a few dozens lines of Perl, was audited and r generated using a seed provided by a Yahoo! manager and machine-generated entropy. The experiment was approved by Yahoo!’s legal team as well as the responsible ethics committee at the University of Cambridge. We deployed our experiment on a random subset of Yahoo! servers for a 48 hour period from May 23–25, 2011, observing 69,301,337 unique users and constructing separate histograms for 328 different predicate functions. Of these, many did not achieve a sufficient sample size to be useful and were discarded.

V. EFFECTS OF SAMPLE SIZE

In our mathematical treatment of guessing difficulty, we assumed complete information is available about the underlying probability distribution of passwords X. In practice, we will need to approximate X with empirical data.^9 We assume that we have M independent samples $X_1, \ldots, X_M \stackrel{R}{\leftarrow} \mathcal{X}$ and we wish to calculate properties of X.

The simplest approach is to compute metrics using the distribution of samples directly, which we denote $\hat{\mathcal{X}}$.^10 As

^9 It possible that an attacker knows the precise distribution of passwords in a given database, but typically in this case she or he would also know per-user passwords and would not be guessing statistically.

^10 We use the hat symbol ˆ for any metric estimated from sampled data.

From [Bonneau ’12]

Page 28

[Figure 6 plot: α-work-factor µα (bits, 0 to 30) against success rate α (0.0 to 0.5). Data sets and cracking evaluations shown: WebCo [09], RockYou [09], Surnames [10], PINs [11], Morris et al. [79], Klein [90], Spafford [92], Wu [99], Kuo [04], Schneier [06], Dell’Amico (en) [10], Dell’Amico (fi) [10], Dell’Amico (it) [10], PassPoints [05], Faces [04].]

Figure 6. Guessing curve for Yahoo! passwords compared with previously published data sets and cracking evaluations.

split effect, with male-chosen passwords being slightly more vulnerable to online attack and slightly stronger against offline attack. There is a general trend towards better password selection with users’ age, particularly against online attacks, where password strength increases smoothly across different age groups by about a bit between the youngest users and the oldest users. Far more substantial were the effects of language: passwords chosen by Indonesian-speaking users were amongst the weakest subpopulations identified with H∞ = 5.5. In contrast, German and Korean-speaking users provided relatively strong passwords.

Users’ account history also illustrates several interesting trends. There is a clear trend towards stronger passwords amongst users who actively change their password, with users who have changed passwords 5 or more times being one of the strongest groups.^18 There is a weaker trend towards stronger passwords amongst users who have completed an email-based password recovery. However, users who have had their password reset manually after reporting their account compromised do not choose better passwords than average users.^19 Users who log in infrequently, judging by the time of previous login before observation in our experiment, choose slightly better passwords. A much stronger trend is that users who have recently logged in from multiple locations choose relatively strong passwords.^20

There is a weak trend towards improvement over time, with more recent accounts having slightly stronger passwords. Of particular interest to the security usability research community, however, a change in the default login form at Yahoo! appears to have had little effect. While Yahoo! has employed many slightly different login forms across its different services, we can compare users who initially enrolled using each of two standard forms: one of which has no minimum length requirement and no guidance on password selection, and the other with a 6 character minimum and a graphical indicator of password strength. This change made almost no difference in security against online guessing, and increased the offline metrics by only 1 bit.

Finally, we can observe variation between users who have

^18 As these password changes were voluntary, this trend doesn’t relate mandatory password change policies, particularly as many users choose predictably related passwords when forced [49].

^19 A tempting interpretation is that user choice in passwords does not play a significant role in the risk of account compromise, though this is not clearly supported since we can only observe the post-compromise strength.

^20 Yahoo! maintains a list of recent login locations for each user for abuse detection purposes.

From [Bonneau ’12]

Page 29

Bonneau takeaways

• Use appropriate strength measures for password distributions
• Yahoo study: people pick lousy passwords
• What does Bonneau paper not give us?