EPL682 - PAPERS
Re: CAPTCHAs - Understanding CAPTCHA-Solving Services in an Economic Context
I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs
Antreas Dionysiou - Department of Computer Science, University of Cyprus
February 2019
BACKGROUND
What are CAPTCHAs?
• Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA).
• Proposed in 2003 by von Ahn et al.
• Also referred to as Reverse Turing Tests.
• CAPTCHAs tell whether a user is human or not.
• Different versions of CAPTCHA exist.
• Block attacks by automated bot systems.
• Must resist automated solving.
• Must be painless for humans.
CAPTCHA Versions
Text-based CAPTCHAs
• Most widely used CAPTCHA scheme.
• CAPTCHA design reflects a trade-off between protection and usability.
Paper: “Re: CAPTCHAs - Understanding CAPTCHA-Solving Services in an Economic Context.”
What is it all about? (Summary)
• Brief explanation of CAPTCHAs.
• A CAPTCHA-solving ecosystem has emerged, with 2 major categories:
1. Automated CAPTCHA solvers (software).
2. Real-time human labor.
• Evaluation of CAPTCHAs in economic terms.
• CAPTCHA’s underlying cost structure benefits the defender.
• Plenty of CAPTCHA-solving services with very low prices (e.g. $1 per 1,000 CAPTCHAs).
• CAPTCHAs should be viewed as an economic impediment to an attacker (not only as a technological one).
Motoyama, Marti, et al. "Re: CAPTCHAs - Understanding CAPTCHA-Solving Services in an Economic Context." USENIX Security Symposium. Vol. 10. 2010.
What is it all about? (Cont.)
• The overall shape of the market is poorly understood.
• Big evolution of automated solving tools…
• …but eclipsed by the emergence of the human-based solving market.
• Economic examination of the human-based solving market.
[Figure: human-based solvers, automated (software) solvers, hybrid solvers]
Related work
• The authors claim to be the first to identify the growth of human-labor-based CAPTCHA-solving services.
• The closest related work is the study by Bursztein et al. [1], but it focuses on CAPTCHA difficulty rather than the underlying business models.
• No other related work (at that time).
[1] E. Bursztein, S. Bethard, J. C. Mitchell, D. Jurafsky, and C. Fabry. How good are humans at solving CAPTCHAs? A large scale evaluation. In IEEE S&P ’10, 2010.
Authors Tried to Answer Key Questions Like
Which CAPTCHAs are most heavily targeted?
Rough solving capacity?
Quality of service?
Pricing of services?
Workforce demographics?
Services’ adaptability to changes in CAPTCHA schemes?
Overall, this research provides reasoning about the net value of CAPTCHAs under existing threats.
CAPTCHA Economics, but why?
• A purely technical perspective on CAPTCHAs doesn’t capture the business realities of the CAPTCHA-solving ecosystem.
• The profitability of any scam is a function of 3 factors:
1. The cost of CAPTCHA solving.
2. The effectiveness of any secondary defenses.
3. The efficiency of the attacker’s business model.
• CAPTCHAs add friction to the attacker’s business model.
• CAPTCHAs minimize the cost and legitimate-user impact of heavier-weight secondary defenses (e.g. SMS, etc.).
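The three profitability factors above can be combined into a toy model. This is only an illustrative sketch, not the formula used in the paper: the function name, the parameters, and the way the factors are multiplied together are all assumptions made for this example.

```python
def profit_per_attempt(revenue_per_success, captcha_price_per_1000,
                       solver_accuracy, secondary_defense_block_rate):
    """Expected attacker profit for one attempted action (hypothetical model)."""
    # Factor 1: cost of one CAPTCHA solution, assuming every attempt is billed.
    captcha_cost = captcha_price_per_1000 / 1000.0
    # Factor 2: the attempt only pays off if the CAPTCHA is solved correctly
    # and the secondary defenses (e.g. SMS checks) do not block it.
    p_success = solver_accuracy * (1.0 - secondary_defense_block_rate)
    # Factor 3: revenue_per_success stands in for the efficiency of the
    # attacker's business model.
    return p_success * revenue_per_success - captcha_cost


if __name__ == "__main__":
    # Illustrative numbers only; none of them come from the paper.
    print(profit_per_attempt(0.01, 1.0, 0.9, 0.5))
    print(profit_per_attempt(0.002, 1.0, 0.9, 0.5))
```

In this toy model, a stronger secondary defense or a less efficient business model can push the expected profit below zero even when CAPTCHA solving is cheap.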
Economics of CAPTCHA-solving Market
• The market for CAPTCHA-solving services has expanded…
• …but the wages of workers have been declining, for these reasons:
1. CAPTCHA solving is an unskilled job.
2. It can easily be outsourced via the internet to the lowest-cost labor.
3. Competition on the retail side has increased.
• Mr. E said that 50% of revenue is profit, roughly 10% is for servers and bandwidth, and the remainder goes toward solving labor.
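The revenue split reported by Mr. E can be worked through as simple arithmetic. The sketch below assumes the 50% and 10% shares are exact (the slide says "roughly" for the second one); the function name and the dictionary keys are made up for this example.

```python
def split_revenue(revenue):
    """Split revenue using the shares quoted on the slide (treated as exact)."""
    profit = 0.50 * revenue          # 50% of revenue is profit
    infrastructure = 0.10 * revenue  # roughly 10% for servers and bandwidth
    # The slide attributes whatever is left to solving labor.
    remainder = revenue - profit - infrastructure
    return {
        "profit": profit,
        "servers_and_bandwidth": infrastructure,
        "remainder_for_labor": remainder,
    }


if __name__ == "__main__":
    # The $1,000 revenue figure is a hypothetical input.
    for item, amount in split_revenue(1000.0).items():
        print(f"{item}: ${amount:.2f}")
```

Under these assumptions, about 40% of revenue is left for the solving side.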
CAPTCHA-Solving Market Workflow
CAPTCHA-Solving Services Analysis
• Evaluated services that were well advertised at the time.
• Evaluated 8 CAPTCHA-solving services for 5 months, collecting CAPTCHAs from the most popular websites.
• Evaluated several aspects, such as:
1. Customer interface.
2. Solution accuracy.
3. Response time.
4. Availability.
5. Capacity.
Quality of Service Assessment
Quality of Service Assessment (cont.)
• Median error rate and response time (in seconds) for all services. Services are ranked top-to-bottom in order of increasing error rate.
Services Analysis Results
• Antigate and ImageToText provided the fastest service.
• Accuracy and response time varied with the type of CAPTCHA.
• The value of a particular solver depends on 3 factors, namely:
1. Accuracy.
2. Response time.
3. Price.
• DeCaptcher and CaptchaBot had the largest solving capacity, as they could solve 14–15 CAPTCHAs per second.
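Two of the quantities on this slide lend themselves to quick arithmetic: how accuracy and price combine into an effective cost, and what 14–15 CAPTCHAs per second means per day. The sketch below is a rough illustration; the function names are invented here, the price and error rate are hypothetical, and it assumes wrong answers are billed like correct ones and that the per-second rate is sustained around the clock.

```python
def cost_per_correct_solution(price_per_1000, error_rate):
    """Effective price of one correct answer when wrong answers are also billed."""
    return (price_per_1000 / 1000.0) / (1.0 - error_rate)


def daily_capacity(solves_per_second):
    """CAPTCHAs solved in 24 hours at a constant per-second rate."""
    return solves_per_second * 24 * 60 * 60


if __name__ == "__main__":
    # $2 per 1,000 and a 10% error rate are hypothetical inputs.
    print(cost_per_correct_solution(2.0, 0.10))
    # 14 and 15 per second are the capacities quoted on the slide.
    print(daily_capacity(14), daily_capacity(15))
```

At a sustained 14–15 solves per second, a single service would handle roughly 1.2 to 1.3 million CAPTCHAs per day.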
Worker Wages
• They focused on two services, namely Kolotibablo and PixProfit.
• Kolotibablo pays workers at a variable rate (from $0.50/1,000 up to over $0.75/1,000 CAPTCHAs) depending on how many CAPTCHAs they have solved.
• PixProfit offers a somewhat higher rate of $1/1,000.
• A minimum amount of money must be earned before payout.
• Most services provide payment via an online e-currency system.
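To get a feel for these rates, the sketch below converts a per-1,000 rate into earnings. The rates are the ones quoted above; the 10-second solving time is a purely hypothetical assumption (the slide gives no per-worker speed), and the function names are invented for this example.

```python
# Rates in dollars per 1,000 solved CAPTCHAs, as quoted on the slide.
RATES_PER_1000 = {
    "Kolotibablo (lowest tier)": 0.50,
    "Kolotibablo (higher tier)": 0.75,
    "PixProfit": 1.00,
}


def earnings(solved, rate_per_1000):
    """Dollars earned for a given number of solved CAPTCHAs."""
    return solved * rate_per_1000 / 1000.0


def hourly_wage(rate_per_1000, seconds_per_captcha):
    """Dollars per hour for a worker solving at a constant (assumed) speed."""
    return earnings(3600.0 / seconds_per_captcha, rate_per_1000)


if __name__ == "__main__":
    for service, rate in RATES_PER_1000.items():
        # 10 seconds per CAPTCHA is an assumption, not a measured value.
        print(f"{service}: ${hourly_wage(rate, 10.0):.2f} per hour")
```

Even at the highest quoted rate, a worker solving one CAPTCHA every 10 seconds would earn well under a dollar per hour.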
Geographic Demographics
• All services include a sizeable workforce fluent in Chinese, likely from mainland China.
• Antigate has appreciable accuracies for Russian and Hindi, presumably drawing on workforces in Russia and India.
• Similarly for CaptchaBypass and Russian.
• BeatCaptcha and Tamil, Portuguese, and Spanish.
• DeCaptcher and Tamil.
• ImageToText has appreciable accuracy across a remarkable range of languages.
Adaptability of CAPTCHA Services
• Again focused on the Kolotibablo and PixProfit services.
• Tested them on the Asirra CAPTCHA.
• ImageToText displayed remarkable adaptability, solving the Asirra CAPTCHA on average 39.9% of the time.
Figure 5: ImageToText error rate for the custom Asirra CAPTCHA over time.
Most Popular Targeted CAPTCHAs
Conclusions
• CAPTCHAs’ low-impact quality makes them attractive to site operators,
• …but, at the same time, easy to outsource to the global unskilled labor market.
• The CAPTCHA-solving business is a well-developed, highly competitive industry with large capacity.
• Wholesale and retail prices for CAPTCHA solving will continue to decline.
• CAPTCHAs don’t prevent large-scale automated site access,
• …but they effectively limit automated site access.
Conclusions (Cont.)
• As the cost of CAPTCHA solving decreases, a site operator must employ secondary defenses more aggressively.
• CAPTCHAs should be regarded as an economic impediment (not only a technological one).
• CAPTCHAs are low-impact mechanisms that add friction to the attacker’s business model.
• CAPTCHAs minimize the cost and legitimate-user impact of heavier-weight secondary defenses.
Paper: “I Am Robot: (Deep) Learning to Break Semantic Image CAPTCHAs.”
What is it all about? (Summary)
• A study of the latest version of Google’s reCaptcha.
• The authors influence reCaptcha’s risk analysis process.
• Identify reCaptcha’s flaws, bypass restrictions, and deploy large-scale attacks.
• Proposal of an effective and low-cost deep-learning-based attack for the semantic annotation of images.
• Proposal of a series of safeguards and modifications for resisting their attacks.
Sivakorn, S., Polakis, I., & Keromytis, A. D. (2016, March). I am robot: (deep) learning to break semantic image captchas. In 2016 IEEE European Symposium on Security and Privacy (EuroS&P) (pp. 388-403). IEEE.
Related work
• Yan et al., “A low-cost attack on a Microsoft CAPTCHA,” in CCS ’08.
• Yan et al., “Breaking visual CAPTCHAs with naive pattern recognition algorithms,” in ACSAC ’07.
• Li et al., “Breaking e-banking CAPTCHAs,” in ACSAC ’10.
• Perez et al., “Breaking reCAPTCHAs with unpredictable collapse: Heuristic character segmentation and recognition,” in MCPR 2012.
• Many, many other papers related to automated CAPTCHA solving...
Google’s reCaptcha
• The goal of Google’s latest version of reCaptcha is to:
1. Minimize the effort for legitimate users.
2. Require tasks that are more challenging to computers than “simple” text recognition.
• reCaptcha is driven by an “advanced risk analysis system”.
• The reCaptcha widget also performs a series of browser checks.
• Most widely used CAPTCHA service.
• Leverages information about users’ activities through cookies.
How does reCaptcha work?
1. User clicks on a checkbox.
2. A request is sent containing all the information collected about the user.
3. The request is analyzed by the advanced risk analysis system, which decides the type of CAPTCHA challenge to be presented to the user.
4. If the user requests multiple challenges or provides several wrong answers, the system will return increasingly harder challenges.
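The escalation behavior in steps 3 and 4 can be pictured with a small toy function. Google's actual risk analysis is not public, so everything below is hypothetical: the numeric risk score, the thresholds, the three difficulty levels, and the rule that each reload or wrong answer raises the level by one are assumptions made only to illustrate the idea.

```python
# Hypothetical difficulty levels, ordered from easiest to hardest.
LEVELS = ["easy", "medium", "hard"]


def initial_level(risk_score):
    """Map a made-up risk score in [0, 1] to a starting difficulty index."""
    if risk_score < 0.3:
        return 0
    if risk_score < 0.7:
        return 1
    return 2


def next_challenge(risk_score, reloads, wrong_answers):
    """Pick the next challenge difficulty for a user (illustrative only)."""
    # Each requested reload or wrong answer raises the difficulty one step,
    # capped at the hardest level.
    escalation = reloads + wrong_answers
    level = min(initial_level(risk_score) + escalation, len(LEVELS) - 1)
    return LEVELS[level]


if __name__ == "__main__":
    print(next_challenge(0.1, reloads=0, wrong_answers=0))
    print(next_challenge(0.1, reloads=1, wrong_answers=3))
```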
CAPTCHA Versions
Contributions
• Deployed an automation tool without being detected by the reCaptcha widget.
• Identified design flaws that allow attackers to “influence” the advanced risk analysis process.
• ML-based system for solving image-based CAPTCHAs that extracts semantic information from images.
• Highly effective and efficient system, achieving 70.78% accuracy and solving challenges in ≈19 seconds.
• Demonstrated their attack’s generic applicability.
• Evaluated their tool in terms of cost-effectiveness (offline mode).
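The accuracy and timing figures above can be turned into a rough throughput estimate. The sketch assumes a single solver instance working through challenges one after another with no idle time, which is an assumption of this example rather than something stated on the slide; the function name is also invented here.

```python
def correct_solutions_per_hour(accuracy, seconds_per_challenge):
    """Expected correctly solved challenges per hour for one sequential solver."""
    attempts_per_hour = 3600.0 / seconds_per_challenge
    return attempts_per_hour * accuracy


if __name__ == "__main__":
    # 70.78% accuracy and about 19 seconds per challenge are the slide's figures.
    print(round(correct_solutions_per_hour(0.7078, 19.0), 1))
```

Under these assumptions, one instance would attempt about 189 challenges per hour and solve roughly 134 of them correctly.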
Their CAPTCHA-solving system
• Their system is built on Selenium and Mozilla Firefox (v. 36).
• Their system is based on 2 components:
1. The 1st is responsible for creating tracking cookies that influence the risk analysis process.
2. The 2nd processes the challenges, following different techniques based on the type of challenge.
Computer Vision Algorithms and Image Annotation Services Used
• Google’s reverse image search (GRIS) for conducting an image-based search.
• Different image annotation services for assigning tags (keywords) or free-form descriptions to images.
• An ML-based classifier that can guess the content of an image based on a subset of the tags.
• A manually created labeled dataset with images and their tags from challenges they have collected (History Module).
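The way tags and a history of labeled images might feed into the final image selection can be sketched as follows. This is a simplified stand-in, not the authors' pipeline: a plain keyword overlap replaces the ML-based classifier, the tags are supplied by hand instead of coming from GRIS or an annotation service, and the names select_images, history, and related_terms are hypothetical.

```python
def select_images(hint, candidates, history, related_terms):
    """Return ids of candidate images that appear to match the challenge hint.

    candidates:    dict mapping image id -> list of tags for that image
    history:       dict mapping image id -> label seen in earlier challenges
    related_terms: dict mapping a hint -> extra keywords treated as matches
    """
    keywords = {hint.lower()} | {t.lower() for t in related_terms.get(hint, ())}
    selected = []
    for image_id, tags in candidates.items():
        # History lookup first: an image labeled before needs no tag matching.
        if history.get(image_id) == hint:
            selected.append(image_id)
            continue
        # Otherwise fall back to a simple overlap between tags and keywords.
        if keywords & {tag.lower() for tag in tags}:
            selected.append(image_id)
    return selected


if __name__ == "__main__":
    # All data below is made up for illustration.
    candidates = {
        "img1": ["Red wine", "glass"],
        "img2": ["dog"],
        "img3": ["bottle", "WINE"],
        "img4": [],
    }
    history = {"img4": "wine"}
    related = {"wine": ["red wine"]}
    print(select_images("wine", candidates, history, related))
```

A keyword overlap misses synonyms that a learned classifier could catch, which is one reason a real system would need something stronger than this sketch.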
Findings
• Google’s advanced risk analysis can be neutralized by using a 9-day-old cookie (with or without web surfing).
• Being logged in to a Google account, with or without phone verification, does not influence the risk analysis system.
• No restriction based on the country in which a cookie is created.
• The webdriver variable does not have an effect.
• The user-agent’s browser and engine versions, as well as the actual environment of the experiment, play a critical role.
Findings (Cont.)
• A user-agent that does not contain complete information, or is misformatted, receives a hard (fallback) CAPTCHA.
• The widget does not detect the underlying operating system.
• A mismatch between user-agents, during a cookie’s creation and when requesting a CAPTCHA with that cookie, does not have an effect.
• Screen resolution and mouse behavior do not affect the outcome of the risk analysis.
• Cookies are not assigned a reputation score (according to history).
• No mechanism prohibits the creation of a large number of cookies from a single IP address.
Findings (Cont.)
• Capacity per day:
1. During weekdays, they could solve between 52,000 and 55,000.
2. During weekends, they could solve 59,000.
• The reCaptcha version suffers from significant flaws and omissions.
• In most cases (74%) the number of correct candidate images is 2; the rest contain 3, and they also found two challenges with 4.
• Challenges are not created “on-the-fly” but selected from a relatively small pool of challenges.
Findings (Cont.)
• 1,368 redundant images that belonged to 358 sets of identical images.
• Highly efficient attack, solving challenges in ≈19 seconds; the most time-consuming phase is GRIS.
• A limited variety of image categories has been detected.
• Adversaries can deploy accurate and efficient attacks against the image reCaptcha without relying on external services.
• Evaluated their CAPTCHA-breaking system’s economic viability.
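Redundant images such as the ones counted above can be found by grouping files with identical content. The sketch below uses a SHA-256 digest of the raw bytes as the grouping key; that choice, the function names, and the in-memory input format are assumptions for illustration, and the slide does not say how the authors detected duplicates.

```python
import hashlib
from collections import defaultdict


def group_identical(images):
    """Group image names whose raw bytes are identical.

    images: dict mapping image name -> raw bytes
    Returns a list of sets (as sorted lists) with at least two members each.
    """
    groups = defaultdict(list)
    for name, data in images.items():
        groups[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


def count_redundant(duplicate_sets):
    """Total number of images that belong to some set of identical images."""
    return sum(len(names) for names in duplicate_sets)


if __name__ == "__main__":
    # Tiny made-up byte strings stand in for real image files.
    images = {"a": b"x", "b": b"x", "c": b"y", "d": b"z", "e": b"z", "f": b"z"}
    sets_found = group_identical(images)
    print(len(sets_found), count_redundant(sets_found))
```

Byte-level hashing only catches exact copies; resized or re-encoded versions of the same picture would need a perceptual comparison instead.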
Findings (Cont.)
• Discuss countermeasures for defending against their attacks, and their potential impact on usability.
• reCaptcha has been updated after the authors informed Google and Facebook.
Conclusions - Future work
• Further improvement of their attack’s accuracy can be explored.
• Reassessment of reverse Turing tests (CAPTCHAs) and their design is considered critical.
• Demonstrated the feasibility of large-scale CAPTCHA-solving attacks.
• reCaptcha’s advanced risk analysis system and widget possess valuable functionality that can be incorporated into future CAPTCHA schemes for mitigating attacks.
Thanks for your attention!
Any questions?