How Big Data and Deep Learning Are Revolutionizing AML and Financial Crime Detection
TRANSCRIPT
Big Data and Predictive Analytics for AML and Financial Crime Detection
Sanjay Kumar, GM Industry Solutions – Telecom & FS
© Hortonworks Inc. 2011–2016. All Rights Reserved
Agenda
• Introduction
• What is Financial Crime, AML and what we are seeing in the AML Space
• Brief Discussion of Customer Activity in AML
• Illustrative Use Cases
• Where Current Implementations Fall Short
• Reference Architecture for AML and Predictive Analytics
• Q&A
FSI Industry Market Segments
FSI Industry
– Capital Markets: Investment Banks, Hedge Funds, Wealth Mgmt
– Retail Lines: Consumer Lines, Corporate
– Payments: Acquirer & Issuer Banks, Schemes
– Market Exchanges
• There are 4 primary market segments/sectors comprising the global FSI industry: Capital Markets; Retail Banking; Payments; Market Exchanges.
• Each geography, country and state may have its own regulation and compliance requirements for products, distribution and rating. Banking is the most regulated industry!
• It is key to understand the market segment of the banking company, as the business process and data/information needs and challenges are very different across the 4. Additionally, challenges vary by Premium/Revenue tier.
• There are many global FS companies which may define standards globally and deploy locally.
Impact of Big Data in 5 Major Areas
• Predictive Analytics and ML/DL: Analytics enabling both defensive and offensive use cases
• Digital Banking: Enabling the digital bank, providing a seamless customer experience
• Capital Markets: Enhancing capabilities across investment banking, trading etc.
• Wealth Management: Improving wealth management capabilities, thereby providing enhanced customer service
• Cybersecurity: Helping defend institutions against cyber threats
Why Big Data for Financial Crimes and Controls
• Firms, large and small, need to navigate a set of increasingly complex compliance rules and regulations as regulatory bodies clamp down on loopholes in the financial regulatory framework. With tighter regulation comes the need to seek out more advanced and cost effective compliance solutions.
• It is estimated by the Financial Action Task Force that over one trillion dollars is laundered annually.
• Regulators increasingly require greater oversight from institutions, including closer monitoring for anti-money laundering (AML) and know your customer (KYC) compliance.
• The methods and tactics used to launder money are constantly evolving. From loan-back schemes and front companies to trusts and black market currency exchanges, there is no “typical” money laundering case.
What Is AML, Financial Crime and What We Are Seeing in AML
What is AML and Financial Crimes
• Financial crime is commonly considered as covering the following offences:
– Fraud
– Electronic Crime (credit card, stolen information etc.)
– Money Laundering
– Terrorist financing
– Bribery and Corruption (KYC)
– Market abuse and insider dealing (Trade Surveillance)
– Information security (Cyber Security)
• Anti-money laundering (AML) is a term mainly used in the financial and legal industries to describe the legal controls that require financial institutions and other regulated entities to prevent or report money laundering activities.
Financial Crime Is On the Rise!
A poll of 500 executives and owners of small and medium businesses showed:
• 58% of businesses were victims of fraud
• 80% of banks failed to catch fraud before funds were transferred out
• In 87% of fraud attacks, the bank was unable to fully recover assets
• 40% of businesses said they have moved their banking activities elsewhere
Only 20% of banks were able to identify fraud before money was transferred.
“The ROI of investing in fraud prevention is clear.”
Source: Ponemon Institute / Guardian Analytics study, March 2010
Key AML Use Cases
Case 1: Understand Customer Profile (KYC)
• Case Description: Mr Alex is a Compliance officer at ABC bank. While scrutinizing a number of customer profiles and their account activity, he noted some suspicious activity in one customer's account. The customer profile and account activity have the following information.
• Customer Profile:
– Individual customer account, Risk Type Classification – Sensitive Client, Senior Public Figure. Customers carrying out large transactions
– A number of transactions in the range of $10,000 to $5,000,000 carried out by the same customer within a short space of time
– A number of customers sending payments to the same individual
• Uniqueness of Use Case: Multi-Channel Linked Accounts involving multiple geographies
• Data elements involved
– Customer Data
– Transaction Data over a 5 year period
• Challenges with current technology
– Multiple Linked Accounts and Past History beyond 6 months Data retrieval
– Real-time visualization
• Supporting Data required to simulate the use case
– Cross Currency, Cross Geography Locations
– Multiple Channels Transactions
– Multiple Cross Currency transactions from USD, SGD, GBP and EUR
– Nearly x Accounts
– Across Geography in 50 countries
– Between 500-600 CR/DB transactions every month
• Results/Objective of Use Case: To demonstrate Multi Channel transactions with historic dataset
• Visualization to show results of use case: To be identified
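One way to express the Case 1 pattern (several transactions between $10,000 and $5,000,000 by the same customer within a short space of time) is a rolling-window count. The sketch below is illustrative only: the 7-day window, the minimum of 3 transactions, the tuple layout and the function name are assumptions, not values taken from the deck.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

def flag_rapid_large_transactions(transactions, low=10_000, high=5_000_000,
                                  window=timedelta(days=7), min_count=3):
    """Return customer ids with at least `min_count` in-range transactions
    inside any rolling `window`. Thresholds are illustrative assumptions."""
    recent = defaultdict(deque)  # customer id -> timestamps still inside the window
    flagged = set()
    for customer, ts, amount in sorted(transactions, key=lambda t: t[1]):
        if not (low <= amount <= high):
            continue
        q = recent[customer]
        q.append(ts)
        # Drop timestamps that fell out of the window ending at `ts`.
        while ts - q[0] > window:
            q.popleft()
        if len(q) >= min_count:
            flagged.add(customer)
    return flagged

# Hypothetical sample data: (customer id, timestamp, amount in USD).
transactions = [
    ("C1", datetime(2016, 3, 1), 12_000),
    ("C1", datetime(2016, 3, 2), 250_000),
    ("C1", datetime(2016, 3, 4), 40_000),
    ("C2", datetime(2016, 3, 1), 15_000),
    ("C2", datetime(2016, 5, 1), 20_000),
    ("C2", datetime(2016, 7, 1), 30_000),
    ("C3", datetime(2016, 3, 1), 500),
]
flagged = flag_rapid_large_transactions(transactions)
```

Here only C1 is flagged: C2's transactions are months apart and C3's amount is below the range. A rule like this is the kind of fixed threshold the later "Crawl Phase" slide describes, so it inherits the same need for constant tuning.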
Case 2: Multi Product Linked Accounts (KYC)
• Case Description: A customer profile with a business profile, with linked accounts and transactions across products and investments. There are many funneled transactions into the account and investments across geographical locations of high risk countries.
• Customer Profile:
– Business customer account, Risk Type Classification – High Risk Client, Customers carrying out large transactions
– Complex and large cash transactions in the range of $50,000 and above
– Multiple exchanges of cash in one currency for foreign currency
– High cash businesses such as restaurants, pubs, casinos, taxi firms, beauty salons and amusement arcades
– A number of customers sending payments to the same individual
• Uniqueness of Use Case: Multi-Product Linked Accounts
• Data elements involved
– Customer Master Profile
– Product Master
– Transactions over x year dataset
• Challenges with current technology
– Multiple Linked Accounts with Multi products
– Real-time link visualization and tracking
• Supporting Data required to simulate the use case
– Cross Currency, Cross Geography Locations
– Multiple Product Transactions and wired transactions
– Multiple Cross Currency transactions from USD, SGD, GBP and EUR
– Nearly x Linked Accounts
– Across Geography in 50 countries
– Between 2000 CR/DB transactions every month
• Results/Objective of Use Case: To demonstrate Product transaction links with historic dataset
• Visualization to show results of use case: To be identified
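The "funneled transactions" and "a number of customers sending payments to the same individual" signals in Case 2 can be approximated by counting distinct senders per beneficiary. This is a minimal sketch under stated assumptions: the tuple layout, the `min_senders` threshold and the placeholder country codes are hypothetical and do not come from the deck.

```python
from collections import defaultdict

# Hypothetical placeholder codes; a real deployment would load a maintained list.
HIGH_RISK_COUNTRIES = {"XX", "YY"}

def find_funnel_accounts(payments, min_senders=3):
    """payments: iterable of (sender, beneficiary, amount, sender_country).
    Returns {beneficiary: summary} for accounts fed by many distinct senders."""
    senders = defaultdict(set)
    totals = defaultdict(float)
    risky = defaultdict(int)
    for sender, beneficiary, amount, country in payments:
        senders[beneficiary].add(sender)
        totals[beneficiary] += amount
        if country in HIGH_RISK_COUNTRIES:
            risky[beneficiary] += 1
    return {
        b: {"senders": len(s), "total": totals[b], "high_risk_payments": risky[b]}
        for b, s in senders.items()
        if len(s) >= min_senders
    }

# Hypothetical sample data.
payments = [
    ("A1", "B9", 60_000, "XX"),
    ("A2", "B9", 75_000, "GB"),
    ("A3", "B9", 52_000, "YY"),
    ("A1", "B7", 1_000, "GB"),
]
funnels = find_funnel_accounts(payments)
```

In this sample only B9 is reported, with three distinct senders and two payments from the placeholder high-risk codes. Linking across products would need the product master joined in first, which is left out here.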
Case 3: $200 Million Credit Card Fraud
• Case Description: On Feb. 5, federal authorities arrested 13 individuals allegedly connected to one of the biggest payment card schemes ever uncovered by the Department of Justice. The defendants' alleged criminal enterprise, built on synthetic, or fake, identities and fraudulent credit histories, crossed numerous state and international borders, investigators say.
• Customer Profile:
– 169 Bank Accounts
– 25,000 Fraudulent Credit cards
– 7,000 false identities
– Wired Transactions across geographies
• Uniqueness of Use Case: Multiple customer profiles tracking
• Data elements involved
– Customer Master Profile
• Challenges with current technology
– Multi Customer Profile tracking and verification
– Accurate profile verification by cross-verification of public records with utility bills and bank accounts around the world
– Create a single entity view (SEV) of similar entities
– Detect aliases whether they are created intentionally or through human error
– Identify irregularities in user input
– Reduce false positives through data enrichment
• Supporting Data required to simulate the use case
– Cross Geography Locations Profiles
– x Linked Accounts across different banks and products
• Results/Objective of Use Case: To demonstrate de-duplication of customer profiles and verification of identity
• Visualization to show results of use case: To be identified
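The "single entity view (SEV)" and de-duplication goals in Case 3 amount to grouping profiles that share identifying attributes. One common way to do that is a union-find (disjoint set) pass over shared identifiers, sketched below. The identifier format and the rule that any single shared identifier is enough to merge are assumptions made for illustration; the deck does not specify a matching rule.

```python
from collections import defaultdict

def build_single_entity_view(profiles):
    """profiles: {profile_id: set of identifier strings (phone, address, ...)}.
    Profiles sharing any identifier are merged into one entity (union-find)."""
    parent = {pid: pid for pid in profiles}

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # identifier -> first profile that used it
    for pid, identifiers in profiles.items():
        for ident in identifiers:
            if ident in first_seen:
                parent[find(pid)] = find(first_seen[ident])
            else:
                first_seen[ident] = pid

    groups = defaultdict(set)
    for pid in profiles:
        groups[find(pid)].add(pid)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical sample data.
profiles = {
    "P1": {"phone:555-0100", "addr:1 main st"},
    "P2": {"phone:555-0100", "email:j.doe@example.com"},
    "P3": {"email:j.doe@example.com"},
    "P4": {"addr:9 elm rd"},
}
entities = build_single_entity_view(profiles)
```

P1, P2 and P3 collapse into one entity through a chain of shared phone and email, while P4 stays separate. Merging on a single shared attribute is aggressive and would raise false positives on shared addresses, so a real rule set would likely weight identifiers differently.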
Case 4: Social Network Analysis
• Case Description: Analysis of social network sites to establish links with fraudulent customers
• Customer Profile:
– Customer Profiles with over 5 million records
– Across Geography in 50 countries
– Search, match and link with Telephone, Mobile Number, Email, Social Network IDs
– Identify irregularities in user input
– Protect individual privacy concerns through anonymous resolution, displaying either the full matching records
– Reduce false positives through data enrichment
• Uniqueness of Use Case: Social Network Analysis of Customer Profiles
• Data elements involved
– Customer Master Profile
• Challenges with current technology
– Ability to link to social network sites and Text Analysis
• Supporting Data required to simulate the use case
– Customer Profiles gleaned from social network sites like Facebook, LinkedIn, Myspace and other social networks/communities
• Results/Objective of Use Case: To demonstrate Social Network identity links with customer profiles to establish fraudulent customer profiles and to reduce false identity
• Visualization to show results of use case: To be identified
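Case 4 asks for matching on telephone, email and social network IDs while protecting privacy "through anonymous resolution". The deck does not say how that resolution works; one plausible reading is to compare keyed hashes of normalized identifiers instead of the raw values. The sketch below assumes that reading. The key, the normalization rules and all function names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-change-me"  # hypothetical key; real keys need proper management

def normalize(kind, value):
    """Canonicalize an identifier so trivial formatting differences still match."""
    value = value.strip().lower()
    if kind == "phone":
        value = "".join(ch for ch in value if ch.isdigit())
    return f"{kind}:{value}"

def tokenize(kind, value):
    """Keyed hash (HMAC-SHA256) of the normalized identifier."""
    message = normalize(kind, value).encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()

def match_profiles(bank_profiles, social_profiles):
    """Return (bank_id, social_id) pairs that share at least one hashed identifier."""
    index = {}
    for bank_id, idents in bank_profiles.items():
        for kind, value in idents:
            index.setdefault(tokenize(kind, value), set()).add(bank_id)
    matches = set()
    for social_id, idents in social_profiles.items():
        for kind, value in idents:
            for bank_id in index.get(tokenize(kind, value), ()):
                matches.add((bank_id, social_id))
    return sorted(matches)

# Hypothetical sample data.
bank = {"CUST-1": [("email", "Jane.Doe@Example.com"), ("phone", "+1 (555) 010-0199")]}
social = {
    "user42": [("phone", "1-555-010-0199")],
    "user77": [("email", "other@example.com")],
}
links = match_profiles(bank, social)
```

The two phone numbers differ in formatting but normalize to the same digits, so CUST-1 links to user42 without either side exchanging the raw number. Exact-match hashing cannot catch typos, which is where the fuzzy matching of Case 5 would come in.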
Case 5: Watch List Filtering and Text Mining
• Case Description: The primary requirement of watch list filtering is to routinely scan current and prospective clients against a database (watch list) consisting of names, aka and address entries.
• Customer Profile:
– Compare and scrutinize 1,000,000 names on the global PEP list
– Nearly 120 sanctions lists that collectively have more than 20,000 profiles
– Watch list screening means creating an effective screening process that minimizes false positives and false negatives
– Search, match and link with names and provide comparison with actual and original records
• Uniqueness of Use Case: Text Mining of Unstructured Data
• Data elements involved
– Customer Master Profile
• Challenges with current technology
– Unstructured data results in false positives
– Number of Matching Rules and ease of incorporating Match Matrix changes
– Customer Data Integrity
– Foreign names, multi-part names, hyphenated names, names which “sound” similar but are spelled differently (e.g. Muhammed vs. Mohamad)
• Supporting Data required to simulate the use case
– OFAC's SDN list, Bank of England List, Denied Persons List
• Results/Objective of Use Case: To demonstrate reliable and scalable watch-list filtering
• Visualization to show results of use case: To be identified
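The name-matching challenge in Case 5 (hyphenated names, and names that sound alike but are spelled differently, such as Muhammed vs. Mohamad) can be illustrated with Python's standard `difflib`. This is a rough sketch, not a production screening algorithm: the consonant "skeleton" key is a crude stand-in for a real phonetic encoding, and the 0.85 threshold and the sample watch list entries are invented for the example.

```python
import difflib
import unicodedata

VOWELS = set("aeiou")

def normalize_name(name):
    """Lowercase, strip accents, turn punctuation into spaces, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())

def skeleton(name):
    """Crude sound-alike key: drop vowels, then collapse runs of the same consonant."""
    out = []
    for ch in normalize_name(name).replace(" ", ""):
        if ch in VOWELS or (out and out[-1] == ch):
            continue
        out.append(ch)
    return "".join(out)

def similarity(a, b):
    return difflib.SequenceMatcher(None, a, b).ratio()

def screen(name, watchlist, threshold=0.85):
    """Return (entry, score) pairs whose best similarity reaches `threshold`."""
    hits = []
    for entry in watchlist:
        spelled = similarity(normalize_name(name), normalize_name(entry))
        sounded = similarity(skeleton(name), skeleton(entry))
        score = max(spelled, sounded)
        if score >= threshold:
            hits.append((entry, round(score, 2)))
    return sorted(hits, key=lambda h: -h[1])

# Hypothetical watch list entries, not taken from any real list.
watchlist = ["Mohamad Al-Example", "John Smith", "Maria Garcia"]
hits = screen("Muhammed Al Example", watchlist)
```

Both spellings reduce to the same skeleton, so the variant is caught even though a plain character comparison scores it lower. The trade-off is the one the slide names: a looser key catches more aliases but also produces more false positives, so the threshold has to be tuned against both.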
Data Analysis Trends in AML
• Need for highly interactive and visually appealing UIs for investigation
• Need for advanced analytics for deeper insight into trends in customer behavior
• Higher degree of depth of analysis in AML program
• Guard against aging technology and manual approaches
• Automated Risk Classification approaches
• Need to reduce the volume of false positives
• The need for structured and unstructured data analysis
What we are seeing in AML
• Higher degree of technology sophistication among criminals
• AML programs need to move from running detection processes on similar data sets to operating across diverse data
• Fraud patterns demand a 360 view of risk as well as an ability to work across more complex and larger data sets
• Most illicit activities span geographies, products and accounts
• Lack of efficiency in investigation tools and processes
• Expert Systems or Rules Engine based approaches becoming ineffective
• Predictive approach to detecting fraud is emerging as a key trend
• Move to increased automation
• The amount of data that is needed to feed the predictive approaches is growing exponentially
Where current solutions fall short
What We Have Seen at Banks
• Fragmented Book of Record Transaction systems
– Lending systems along geographic and business lines
– Trading systems along desk and geographic lines
• Fragmented enterprise systems
– Multiple general ledgers
– Multiple Enterprise Risk Systems
– Multiple compliance systems by business line: AML for Retail, AML for Commercial Lending, AML for Capital Markets…
• Lack of real time data processing, transaction monitoring and historical analytics
• Typically proprietary vendor and in-house built solutions that have been acquired over the years, building up a significant technological debt
• Unable to keep pace with the progress of technology
• Move to combine Fraud (AML, Credit Card Fraud & InfoSec) into one platform
• Issues with flexibility, cost and scalability
High Level Solution - Architecture Predictive Analytics
Some essential data elements for AML: Structured and Unstructured
• Inflow and outflow
• Links between entities and accounts
• Account activity: speed, volume, anonymity, etc.
• Reactivation of dormant accounts
• Signer relationship
• Deposit mix
• Transactions in areas of concern
• Use of multiple accounts and account types
• Social Media Behavior
• Etc.
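Several of the data elements listed above reduce to simple derived signals. As one example, "reactivation of dormant accounts" can be sketched as a gap check over each account's transaction dates. The 365-day dormancy period and the input layout are assumptions chosen for illustration; institutions define dormancy differently.

```python
from datetime import date, timedelta

def find_reactivations(activity, dormancy=timedelta(days=365)):
    """activity: {account_id: list of transaction dates}.
    Returns (account_id, last_active, reactivated_on) for each gap >= dormancy."""
    events = []
    for account, dates in activity.items():
        ordered = sorted(dates)
        # Compare each transaction date with the one immediately before it.
        for previous, current in zip(ordered, ordered[1:]):
            if current - previous >= dormancy:
                events.append((account, previous, current))
    return events

# Hypothetical sample data.
activity = {
    "ACC-1": [date(2013, 1, 10), date(2013, 2, 1), date(2015, 6, 30)],
    "ACC-2": [date(2015, 1, 5), date(2015, 2, 5), date(2015, 3, 5)],
}
reactivated = find_reactivations(activity)
```

ACC-1 is reported because it was silent from February 2013 until June 2015, while ACC-2 transacts monthly and is not. The check needs the full history of an account, which ties back to the earlier complaint about retrieving data older than 6 months.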
Big Data for Financial Crimes and Controls - Solution
• The unique nature of money laundering requires a new generation of solutions based on
– Vast variety of Historical Data
– Business rules
– Fuzzy logic
– Data Mining
– Supervised and unsupervised learning and other machine learning technologies to increase detection and reduce false positives
• To implement a next generation solution for BSA/AML, firms must look towards updated machine learning tools that allow finer grain resolution at the scale needed to detect AML.
• Phased Approach
– Rule Based Model (Crawl Phase)
– Feature Based Model (Walk Phase)
– Data Driven Model (Run Phase)
AML Solution: Rule Based Solution (Crawl Phase)
Key Highlights and Challenges
• Manual analysis by an investigator
• Subjective and inconsistent
• Time consuming
• High false positives
• Constant update to rules
• Not able to catch new modes of fraud
[Diagram labels: Transaction Data; LexisNexis; Accounts Database; Payment Data; Card Data; Rule Based AML Solution; Alerts from Rule Based System; Dashboard to Match Data; Suspicious / NOT Suspicious]
AML Solution: Feature Based Solution (Walk Phase)
Rule base & Supervised & Unsupervised Learning for AML
Key highlights
• Features are metadata (extracted from the data), e.g. average balance of last 7 days
• Features help algorithms capture information from the data
• Feature engineering is a form of language translation: between raw data and the algorithm
• Uses supervised and/or unsupervised Machine Learning
• Quick classification
• Low false positive rate, tweaked based on risk appetite
[Diagram labels: Transaction Data; LexisNexis; Accounts Database; Payment Data; Card Data; Historical Alerts; Machine Learning Algorithms; Alerts from ML Based System; Dashboard to Match Data; Suspicious / NOT Suspicious]
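The feature named on this slide, average balance of the last 7 days, is easy to make concrete. The sketch below computes it from end-of-day balances and derives a second, hypothetical feature from it (how far the latest balance sits from that average). The input layout, the choice to skip missing days and the ratio feature are assumptions, not part of the deck.

```python
from datetime import date, timedelta

def average_balance_last_7_days(balances, as_of):
    """balances: {date: end-of-day balance}. Mean over the 7 calendar days
    ending at `as_of`; days with no record are skipped rather than imputed."""
    window = [as_of - timedelta(days=i) for i in range(7)]
    values = [balances[d] for d in window if d in balances]
    return sum(values) / len(values) if values else None

# Hypothetical sample data: a flat balance with a sudden inflow on the last day.
balances = {date(2016, 3, d): 1_000.0 for d in range(1, 8)}
balances[date(2016, 3, 7)] = 8_000.0

as_of = date(2016, 3, 7)
avg_7d = average_balance_last_7_days(balances, as_of)
# Hypothetical derived feature: latest balance relative to the 7-day average.
balance_ratio = balances[as_of] / avg_7d
```

With six days at 1,000 and one at 8,000 the average is 2,000 and the ratio is 4. Values like these, rather than raw transactions, are what a supervised or unsupervised model in the Walk Phase would consume.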
Type of Machine Learning and Potential Usage
Next Gen AML Solution: Data Driven Based Solution (Deep Learning)
Key highlights
• The algorithm understands malicious behavior through data
• Algorithm is smart enough to work without features (metadata)
• Does not need alerts for training
• Helps in identifying any kind of anomalous behavior
• Deeper insights about customer
[Diagram labels: Transaction Data; LexisNexis; Accounts Database; Payment Data; Card Data; Deep Learning Algorithms; Data Driven Solution; Suspicious / NOT Suspicious]
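The deck does not describe the deep learning model itself, so the sketch below does not attempt one. It only illustrates the underlying idea that anomalous behavior can be scored without labeled alerts, using a much simpler statistical stand-in: distance from the median measured in units of the median absolute deviation (MAD). The function name, the threshold of 10 and the sample values are assumptions.

```python
import statistics

def robust_anomaly_scores(values):
    """Score each value by its distance from the median in MAD units.
    Unsupervised: no labels or historical alerts are needed."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        # All deviations are (almost) identical; nothing can be ranked.
        return [0.0 for _ in values]
    return [abs(v - median) / mad for v in values]

# Hypothetical daily transaction totals for one customer.
daily_totals = [100, 110, 95, 105, 98, 102, 5000]
scores = robust_anomaly_scores(daily_totals)

THRESHOLD = 10  # hypothetical cut-off, to be tuned to risk appetite
anomalies = [i for i, s in enumerate(scores) if s > THRESHOLD]
```

Only the last day stands out. A learned model would replace the single hand-picked statistic with representations derived from the raw data, which is the step the slide attributes to the Run Phase.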
High Level System Architecture: MAX ROI & Future Proof Solution, Not Just for AML/Fraud
Source Data (examples): Data.gov, Accounts, Transactions, LexisNexis, Social
External/3rd party Data Sources
Real-Time Event Streaming Engine
Dynamic Customer Profile / Risk Appetite Model
Central Data Lake
Real-time Intelligent Action
• Risk Similarity / Risk Profiling
• Related Entity Analysis (graph database)
• Fraud / Social Network Analysis
• Multi-line “profitable” class code
• Geospatial data
• Updated risk appetite
Risk Scoring Engine (examples)
• Credit score (if allowed by regulatory agencies)
• Rating attributes (demographics, geographic, social, property attributes)
• Likelihood of fraud/risk (frequency/severity)
Enrich Events with Customer/Risk info and Scoring Models
Update Profiles and Scoring Models
Update Data Lake
Native API, REST API, ODBC/JDBC
Visualization / Analytical Views
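The "Enrich Events with Customer/Risk info and Scoring Models" step in this architecture can be pictured as a small function pair: one joins a raw event with the customer's profile, the other turns the enriched event into a score. The sketch below is purely illustrative. The field names, the point weights and the signals are hypothetical, and the $50,000 cut-off is borrowed from Case 2 rather than from this slide.

```python
def enrich(event, profiles):
    """Attach the customer's profile fields to a raw transaction event."""
    profile = profiles.get(event["customer_id"], {})
    return {
        **event,
        "risk_tier": profile.get("risk_tier", "unknown"),
        "home_country": profile.get("home_country"),
    }

def risk_score(enriched, weights):
    """Additive score from 0 to 100; each triggered signal adds its points."""
    score = 0
    if enriched["risk_tier"] == "high":
        score += weights["high_risk_tier"]
    if enriched["home_country"] and enriched["country"] != enriched["home_country"]:
        score += weights["cross_border"]
    if enriched["amount"] >= 50_000:
        score += weights["large_amount"]
    return min(score, 100)

# Hypothetical weights and sample data.
WEIGHTS = {"high_risk_tier": 50, "cross_border": 20, "large_amount": 30}
profiles = {
    "C1": {"risk_tier": "high", "home_country": "SG"},
    "C2": {"risk_tier": "low", "home_country": "GB"},
}
events = [
    {"customer_id": "C1", "amount": 60_000, "country": "GB"},
    {"customer_id": "C2", "amount": 200, "country": "GB"},
]
scored = [risk_score(enrich(e, profiles), WEIGHTS) for e in events]
```

In the architecture shown, the profile lookup would come from the dynamic customer profile store and the weights from a scoring model that is updated over time; here both are plain dictionaries so the flow stays visible.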
Key Deliverables to Build a Big Data Solution
• Automating Due Diligence around KYC data
– Simple information collected during customer onboarding
– More complex information for certain entities
– Applying sophisticated analysis to such entities
– Automating research across news feeds (LexisNexis, DB, TR, DJ, Google etc.)
• Efficient Case Management
• Capture all data sets at one place
• Applying Advanced Analytics (two sub Use Cases)
– Exploratory Data Science
– Advanced Transaction Intelligence
– Machine Learning / Deep Learning
Business Analytics Must Evolve To Deal With Data Tipping Point
PROVIDE INSIGHT INTO THE PAST via data aggregation, data mining, business reporting, OLAP, visualization, dashboards, etc.
UNDERSTAND THE FUTURE via statistical models, forecasting techniques, machine learning, etc.
ADVISE ON POSSIBLE OUTCOMES via rules, optimization and simulation algorithms
The Data Tipping Point
Drivers of a Connected Data Architecture
Why Will This Work Now?
• A free, open source, linearly scalable platform has only become available within the last few years
• Due to the amount of regulation over the last 15 years, all bank enterprise compliance, risk and finance systems now function essentially the same way
• Banks partnering with an open source partner is very different from partnering with a vendor who develops proprietary software
• Proprietary software vendors will adopt the new standards since it is in their self interest to do so
• Regulators can now streamline their regulatory practices by adopting a Big Data based approach
• Having a standards based Open Source platform means that regulators can use the same platform as the banks
Digital Banking Solution Architecture
[Architecture diagram; recoverable labels follow]
Banking Sources: Customer Journey; Social; RDBMS; Mainframe; Document Mgmt Systems; Data Silos; Core Banking; Industry Ref.; Web Logs
Retail Banking Enterprise Data & Compute Lake
Storage: Distributed File System (Staging, Database, Structured, Unstructured, Archival, Document)
Data Operating System: Multi-purpose platform enablement
Processing: Batch; Search; In-Memory; Real-Time; Pivotal HAWQ SQL; Predictive
Applications & Workloads: Retail Banking Apps; Marketing Apps; SVC; NBA
Governance & Integration; Business Workflow; Enterprise Security; Business Logic Layer
Business Analytics: Data Science; BI & Reporting; SAS; Other…
Cloud Computing Stack (Public or Private): Public Cloud, Private Cloud, Hybrid Cloud supporting a full stack of VMs and Docker
Q&A