acquisti faces blackhat draft

Upload: anargratos

Post on 06-Apr-2018


TRANSCRIPT

  • 8/2/2019 Acquisti Faces BLACKHAT Draft

    1/48

    Alessandro Acquisti, Ralph Gross, Fred Stutzman
    Heinz College & CyLab, Carnegie Mellon University
    PLEASE NOTE: DRAFT VERSION. Final version to be presented at Black Hat USA on August 4, 2011
    Black Hat 2011
    Faces of Facebook: Privacy in the Age of Augmented Reality


    Computer face recognition has been around for a long time (e.g., Bledsoe, 1964; Kanade, 1973)
    Computers still perform much worse than humans when recognizing faces
    However, automatic face recognition has kept improving, and has started being used in actual applications
    Especially in security, and more recently Web 2.0


    Face recognition in Web 2.0
    Google has acquired Neven Vision, Riya, and PittPatt, and deployed face recognition into Picasa
    Apple has acquired Polar Rose, and deployed face recognition into iPhoto
    Facebook has licensed Face.com to enable automated tagging
    So, what is different about this research?


    Increasing public self-disclosures through online social networks; especially, photos
      In 2010, 2.5 billion photos uploaded by Facebook users alone per month
    Identified profiles in online social networks
      Individuals using their real first and last names on Facebook, LinkedIn, Google+, etc.
    Continuing improvements in face recognition accuracy
      In 1997, the best face recognizer in the FERET program achieved a false reject rate of 0.54 (at a false accept rate of 0.001)
      By 2006, the false reject rate was down to 0.01
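
To make the two error rates above concrete, here is a minimal illustrative sketch of how a false accept rate and a false reject rate are computed from similarity scores at a chosen threshold. The scores, the threshold, and the function names are made-up assumptions for illustration; they are not FERET data or any recognizer's actual API.

```python
def false_accept_rate(impostor_scores, threshold):
    """Fraction of different-person pairs scored at or above the threshold."""
    return sum(s >= threshold for s in impostor_scores) / len(impostor_scores)


def false_reject_rate(genuine_scores, threshold):
    """Fraction of same-person pairs scored below the threshold."""
    return sum(s < threshold for s in genuine_scores) / len(genuine_scores)


# Made-up similarity scores (assumption, for illustration only).
genuine = [0.91, 0.85, 0.40, 0.77, 0.95]   # pairs of photos of the same person
impostor = [0.10, 0.35, 0.20, 0.55, 0.05]  # pairs of photos of different people
threshold = 0.6

print(false_accept_rate(impostor, threshold))  # 0.0
print(false_reject_rate(genuine, threshold))   # 0.2
```

Raising the threshold lowers the false accept rate at the cost of a higher false reject rate, which is why the slide quotes the false reject rate at a fixed false accept rate.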


    Statistical re-identification: data mining allows surprising, sensitive inferences from public data
      US citizens identifiable from ZIP, DOB, gender (Sweeney, 1997); Netflix Prize de-anonymization (Narayanan and Shmatikov, 2006); SSN predictions from Facebook profiles (Acquisti and Gross, 2009)
    Cloud computing
      Makes it feasible and economical to run millions of face comparisons in seconds
    Ubiquitous computing
      Combined with cloud computing, makes it possible to run face recognition through mobile devices, e.g., smartphones


    The convergence of these technologies is democratizing surveillance
    Not just Web 2.0 face recognition apps limited and constrained to consenting/opt-in users, but ...
    a world where anyone may run face recognition on anyone else, online and offline


    Your face is the veritable link between your offline identity and your online identit(ies)
    Data about your face and your name is, most likely, already publicly available online
    Hence, face recognition creates the potential for your face in the street (or online) to be linked to your online identit(ies), as well as to the sensitive inferences that can be made about you after blending together offline and online data


    This seamless merging of online and offline data raises the issue of what privacy will mean in such an augmented reality world
    Through social networks, have we created a de facto, unregulated "Real ID" infrastructure?


    Our research investigates the feasibility of combining publicly available online social network data with off-the-shelf face recognition technology for the purpose of large-scale, automated, peer-based
      1. individual re-identification, online and offline
      2. accretion and linkage of online, potentially sensitive, data to someone's face in the offline world


    Democratization of surveillance
    Faces as conduits between online and offline data
    The emergence of PPI: personally predictable information
    The rise of visual, facial searches
    The future of privacy in a world of augmented reality


    Experiment 1: Online-to-online Re-Identification
    Experiment 2: Online-to-offline Re-Identification
    Experiment 3: Online-to-offline Sensitive Inferences


    Unidentified DB: personal profiles on Match.com, Prosper.com, etc.; photo repositories (e.g., Flickr); open webcams; CCTVs; your face on the street; [...]
    Identified DB: personal profiles on Facebook.com, LinkedIn, etc.; govt or corporate databases; [...]
    Additional sensitive inferences (e.g., sexual orientation, SSN, etc.)
    Face recognition [1] allows one to match a subject in an unidentified DB from data in an identified DB [2]
    Once that is done, sensitive data inferred from the unidentified DB [3] can be linked to the identity of the subject in the identified DB [4]


    Online-to-online
    We mined publicly available images from online social network profiles to re-identify profiles on one of the most popular dating sites in the US
    We used the PittPatt face recognizer (Nechyba, Brandy, and Schneiderman, 2007) for:
      Face detection: automatically locating human faces in digital images
      Face recognition: measuring similarity between any pair of faces to determine if they are of the same person
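
The "measuring similarity between any pair of faces" step can be illustrated with a generic sketch. The slides do not describe PittPatt's internals, so everything here is an assumption: each face is taken to be already reduced to a numeric feature vector, similarity is plain cosine similarity, and the 0.8 decision threshold is arbitrary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_person(a, b, threshold=0.8):
    """Hypothetical decision rule: declare a match when similarity clears the threshold."""
    return cosine_similarity(a, b) >= threshold


# Toy two-dimensional "face vectors" (made up for illustration).
print(same_person([1.0, 0.0], [0.9, 0.1]))  # True
print(same_person([1.0, 0.0], [0.0, 1.0]))  # False
```

A real recognizer works with far richer features, but the output has the same shape: a score per pair of faces, turned into a yes/no decision by a threshold.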


    Facebook profiles
    We downloaded primary profile photos for Facebook profiles from a North American city using a search engine's API (i.e., without even logging on to Facebook itself)
    "Noisy" profile search pattern: combination of search strategies (current location, member of local networks, fan of local companies/teams, etc.)


    Dating site profiles
    Profiles were members of one of the most popular dating sites in the US
    Members use pseudonyms to protect their identities
    However, facial images may make members recognizable not just by friends, but by strangers
      Unfeasible if done manually (hundreds of millions of potential matches to verify), but quite feasible using face recognition + cloud computing
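
The "hundreds of millions of potential matches" figure follows from simple multiplication: every photo in one set is compared against every photo in the other. The sketch below shows the arithmetic. The set sizes and the five seconds per manual check are hypothetical placeholders, since the slides do not report the exact numbers.

```python
def pair_count(n_probes, n_gallery):
    """Number of face comparisons in an exhaustive probe-vs-gallery search."""
    return n_probes * n_gallery


def years_of_manual_review(pairs, seconds_per_pair):
    """Time one person would need to check every pair by eye, in years."""
    seconds_per_year = 365 * 24 * 3600
    return pairs * seconds_per_pair / seconds_per_year


# Hypothetical sizes: 5,000 dating-site photos vs. 100,000 profile photos.
pairs = pair_count(5_000, 100_000)
print(pairs)                                    # 500000000
print(round(years_of_manual_review(pairs, 5)))  # 79
```

At these assumed sizes a single human reviewer would need decades, while the slides put automated comparison at millions of pairs in seconds.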


    Overlap between our dating site data and Facebook data is inherently noisy (geographical search vs. keywords search)
    We ran two surveys to estimate Facebook/dating site members' overlap
    Then, multiple human coders graded matched pairs to evaluate the face recognizer's accuracy


    One out of 10 of the dating site's pseudonymous members was identified
    Note: In Experiment 1, we constrained ourselves to using only a single Facebook (primary profile) photo, and only considering the top match returned by the recognizer
    However: Because an attacker can use more photos, and test more matches, the ratio of re-identifiable individuals will dramatically increase
      See, in fact, Experiment 2
    Also: as face recognizers' accuracy increases, so does the ratio of re-identifiable individuals
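
Why considering more than the top match raises the re-identification ratio can be shown with a small sketch. The ranked candidate lists and true identities below are made up for illustration; they are not data from the experiments.

```python
def top_k_hit_rate(ranked_candidates, true_ids, k):
    """Share of probes whose true identity appears among the first k candidates."""
    hits = sum(
        true_id in ranked[:k]
        for ranked, true_id in zip(ranked_candidates, true_ids)
    )
    return hits / len(true_ids)


# Toy recognizer output: for each probe photo, candidates sorted best-first.
ranked = [
    ["a", "b", "c"],
    ["x", "d", "y"],
    ["e", "f", "g"],
    ["h", "i", "j"],
]
truth = ["a", "d", "g", "z"]  # "z" is not in the gallery at all

print(top_k_hit_rate(ranked, truth, 1))  # 0.25
print(top_k_hit_rate(ranked, truth, 3))  # 0.75
```

The top-1 rate is a lower bound: any correct identity ranked second or lower is missed when only the top match is checked, but found when more candidates are tested.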


    Offline-to-online
    We used publicly available images from a Facebook college network to identify students strolling on campus


    College photos
    We used a webcam to take 3 photos per participant
    Photos gathered over two days in November


    We asked students walking by to stop and have their picture taken
    Then, we asked participants to answer an online survey about Facebook usage
    In the meanwhile, face matching was taking place on a cloud computing service
    The last page of the survey was populated dynamically with the best matching pictures found by the recognizer
    Participants were asked to select photos in which they recognized themselves


    Roughly one out of three subjects was identified
    Average computation time per subject: less than three seconds


    In Experiment 2 we found the Facebook profiles containing images that matched the facial features of students working on campus
    But: in 2009, we used Facebook profile information to predict individuals' Social Security numbers
      Acquisti and Gross, "Predicting Social Security Numbers from Public Data," Proceedings of the National Academy of Sciences, 2009

    [Slide graphic, built up over three slides: image + image = SSN]
    I.e., predicting SSNs from faces


    Experiment 3 was about predicting personal and sensitive information from a face
    We trained an algorithm to automatically identify the most likely Facebook profile owner given a match between the Experiment 2 subjects' photos and a database of Facebook images
    From the predicted profiles, we inferred names, DOBs, other demographic information, as well as interests/activities of the subjects
    With that information, we predicted the participants' SSNs
    We then asked participants in Experiment 2 whom we had thusly identified to participate in a follow-up study


    In the follow-up study, we asked participants to verify our predictions about their:
      Interests/Activities (from Facebook profiles)
      SSN's first five digits (predicted using Acquisti and Gross, 2009's algorithm)
    Note: last 4 digits are predictable too (see Acquisti and Gross, 2009). Prediction accuracy varies greatly, as a function of state and year of birth, and can be correctly estimated only with larger sample sizes than what was available in Experiment 3


    Source: http://www.director-thailand.com/blog/what-is-augmented-reality


    Our demo smartphone app combines and extends the previous experiments to allow:
      Personal and sensitive inferences
      From someone's face
      In real time
      On a mobile device
    Overlaying information (obtained online) over the image of the individual (obtained offline) on the mobile device's screen


    Sources of online data can be Facebook (to identify someone's name), Spokeo (once someone's name has been identified), and then, the sensitive inferences one can make based on that data (e.g., SSNs, but also sexual orientation, credit scores, etc.)
    That is: the emergence of personally predictable information from a person's face


    Availability of facial images
      Legal and technical implications of mining identified images from online sources
    Cooperative subjects
      Face recognizers perform worse in absence of clean frontal photos
      On the street, clean and frontal photos of uncooperative strangers are unlikely
    Geographical restrictions
      Experiment 1 focused on City area (~330k individuals). Experiment 2 focused on College community (~25k individuals)
      As the set of potential targets gets larger (e.g., nationwide), computations needed for face recognition get less accurate (i.e., more false positives), and take more time
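
The link between a larger target set and more false positives can be sketched with basic probability. The sketch assumes, for illustration only, that every comparison is independent and has the same per-comparison false accept rate; 0.001 is the operating point quoted earlier in the slides for FERET, and the gallery sizes are the approximate figures given above. Real recognizers will not behave this cleanly.

```python
def expected_false_matches(far, gallery_size):
    """Expected number of wrong faces accepted when one probe meets the whole gallery."""
    return far * gallery_size


def prob_any_false_match(far, gallery_size):
    """Chance of at least one false accept: 1 - (1 - FAR)^N, assuming independence."""
    return 1.0 - (1.0 - far) ** gallery_size


FAR = 0.001  # per-comparison false accept rate (assumed constant)

for n in (10, 25_000, 330_000):
    print(n, expected_false_matches(FAR, n), prob_any_false_match(FAR, n))
```

Under these assumptions the expected number of false matches grows linearly with the gallery (about 25 for a 25k gallery and about 330 for a 330k gallery), so a fixed threshold that works on a campus becomes far noisier at city or national scale.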


    Face recognition of everyone/everywhere/all the time is not yet feasible
    However: Current technological trends suggest that most current limitations will keep fading over time


    There exist legal and technical constraints to mining identified images from online sources
    However:
      Many sources are publicly available (e.g., do not require login, such as LinkedIn profile photos; or can be searched through search engines, such as Facebook primary profile photos: see Experiment 1)
      Face recognition companies are already collaborating with social network sites to tag billions of images (e.g., see Face.com recent announcement)
      Tagging self, and others, in photos has become socially acceptable; in fact, widespread (thus providing a growing source of identified images)


    As search engines enter the face recognition space, facial visual searches may become as common as today's text-based searches
      Text-based searches of someone's name across the WWW, which are common now, were unimaginable 15 years ago (before search engines)
      From spidered & indexed HTML pages, to spidered & indexed photos
      Google has already announced searches based on image (although not facial image) pattern matching
      The number of Silicon Valley players entering this space in recent months demonstrates the commercial interest in face recognition


    What we did on the street with mobile devices today (requiring "point and shoot" and cooperative subjects) will be accomplished in less intrusive ways tomorrow
      Glasses (already happening: Brazilian police preparing for 2014 World Cup)
      How long before it can be done on ... contact lenses?
    Face recognizers will keep getting better at matching faces based on non-frontal images (compare PittPatt version 5.2 vs. version 4.2)


    As the set of potential targets gets larger (e.g., nationwide DB of individuals), the computations needed for face recognition get less accurate (more false positives) and take more time
    However:
      Databases of identified images are getting larger, with more individuals in them (see previous slides)
      Accuracy (number of false positives, number of false negatives) of face recognizers steadily increases over time, especially so in the last few years
      Cloud computing clusters will keep getting faster, larger (more memory available == larger target DBs feasible to analyze), and cheaper, making massive face comparisons economical


    Web 2.0 profiles (e.g., Facebook) are becoming de facto unregulated "Real IDs"
      See recent FTC's approval of Social Intelligence Corporation's social media background checks
    Great potential for commerce and e-commerce
      Imagine Minority Report-style advertising; however, happening much earlier than 2054


    Opt-in is ineffective as protection, since most data is already publicly available
      E.g., Facebook sets primary profile photos to be visible to all by default, and requires members to sign up to the network with their real identities


    What will privacy mean in a world where a stranger on the street could guess your name, interests, SSNs, or credit scores?
    The coming age of augmented reality, in which online and offline data are blended in real time, may force us to reconsider our notions of privacy


    In fact, augmented reality may also carry deep-reaching behavioral implications
      Through natural evolution, human beings have evolved mechanisms to assign and manage trust in face-to-face interactions
      Will we rely on our instincts, or on our devices, when mobile devices make their own predictions about hidden traits of a person we are looking at?


    Democratization of surveillance
    Faces as conduits between online and offline data
    The emergence of PPI: personally predictable information
    The rise of visual, facial searches
    The future of privacy in a world of augmented reality


    We gratefully acknowledge research support from
      National Science Foundation under Grant 0713361
      U.S. Army Research Office under Contract DAAD190210389
      Heinz College
      Carnegie Mellon CyLab
      Carnegie Mellon Berkman Fund


    Main RAs: Ganesh Raj Manicka Raju, Markus Huber, Nithin Betegeri, Nithin Reddy, Varun Gandhi, Aaron Jaech, Venkata Tumuluri
    Additional RAs: Aravind Bharadwaj, Laura Brandimarte, Samita Dhanasobhon, Hazel Diana Mary, Nitin Grewal, Anuj Gupta, Snigdha Nayak, Rahul Pandey, Soumya Srivastava, Thejas Varier, Narayana Venkatesh


    Google: economics privacy
    Visit: http://www.heinz.cmu.edu/~acquisti/economics-privacy.htm
    Email: [email protected]