credit fraud prevention on hwx stack

Credit Fraud Prevention on a Connected Data Platform Kirk Haslbeck, Sr. Solution Engineer HWX

Upload: kirk-haslbeck

Post on 19-Feb-2017


TRANSCRIPT

Page 1: Credit fraud prevention on hwx stack

Credit Fraud Prevention on a Connected Data Platform

Kirk Haslbeck, Sr. Solution Engineer HWX

Page 2: Credit fraud prevention on hwx stack

2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved

Building a Model

Show of hands, how many have built a “Model”? What are some limitations?

– Conditional-based logic: if/else binary decisions

If you need a lot of data to build a good model, what tools can you use?

– Data volumes can eliminate the possibility of desktop tools

Sampling?

– Well… we had better get an even distribution of true and false positives in each sample, but wait, that requires data munging, which brings us back to what tools we can use.

Security concerns?

– Extracting data from its secure resting place and pushing it into other environments, oftentimes unsecured files or desktops where Matlab or R can be installed.

Collaboration

– Push processing to the data using modern distributed tooling.
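The sampling concern above can be illustrated with a small sketch. It draws the same number of rows per label so that rare fraud cases are not lost in a random sample. The field name `is_fraud`, the helper `stratified_sample` and the toy data are assumptions made for illustration; they are not taken from the talk.

```python
import random
from collections import defaultdict

def stratified_sample(rows, label_key, per_class, seed=0):
    """Draw up to `per_class` rows for each distinct label value."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)
    sample = []
    for group in by_label.values():
        k = min(per_class, len(group))  # never ask for more rows than exist
        sample.extend(rng.sample(group, k))
    return sample

# Hypothetical, heavily imbalanced data: 5 fraud rows among 1,000.
transactions = [{"id": i, "is_fraud": i % 200 == 0} for i in range(1000)]
balanced = stratified_sample(transactions, "is_fraud", per_class=5)
```

On a single desktop this only works while the data fits in memory, which is the limitation the slide points at.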

Page 3: Credit fraud prevention on hwx stack

“All models are wrong, some are useful”

George E. P. Box

The most limiting factor is the data. With modern systems we are now able to capture more data and, hopefully, produce better insights.

Page 4: Credit fraud prevention on hwx stack

Credit Card Fraud

Requirement: Detect fraudulent transactions.

Goal: Save the card company money and build trust among card users. Cut down on fraudulent crime.

Functional Requirement: Detect fraud in under 2 seconds at point of sale. Learn, adapt and make smarter decisions over time.

Design

– Distance: How far can one travel over a period of time before it is fraudulent?
– Category: How can we detect a purchase that a customer wouldn’t likely make?
– Frequency: How can we detect purchasing patterns that do not resemble the card holder’s?

Ideas?

– Whiteboard some conditional logic, egregiousness vs binary
– Back test the data
– Build a model per card holder?
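One way to make the distance question concrete is to compute the speed a card holder would have needed to travel between two consecutive purchases. The sketch below uses the haversine great-circle formula. The 900 km/h ceiling, the tuple layout and the function names are assumptions for illustration, not values from the talk.

```python
import math

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_KMH = 900.0  # assumption: roughly airliner cruise speed

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def implied_speed_kmh(prev, curr):
    """prev and curr are (lat, lon, epoch_seconds) for consecutive purchases."""
    dist = haversine_km(prev[0], prev[1], curr[0], curr[1])
    hours = (curr[2] - prev[2]) / 3600.0
    if hours <= 0:
        # Same instant or out-of-order timestamps: any distance is impossible.
        return math.inf if dist > 0 else 0.0
    return dist / hours

def is_travel_suspicious(prev, curr, max_kmh=MAX_PLAUSIBLE_KMH):
    """Flag the purchase if the implied travel speed exceeds the ceiling."""
    return implied_speed_kmh(prev, curr) > max_kmh

# Hypothetical purchases: New York, then Los Angeles one hour later.
new_york = (40.7128, -74.0060, 0)
los_angeles = (34.0522, -118.2437, 3600)
flagged = is_travel_suspicious(new_york, los_angeles)
```

A hard threshold like this gives a binary answer. Returning the implied speed itself would support the "egregiousness vs binary" idea, since a purchase that is only slightly too fast can be scored differently from one that is wildly impossible.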

Page 5: Credit fraud prevention on hwx stack

Rules, Statistics, Machine Learning

Rule Based Logic

– Great for checking conditions that can prove to be 100% accurate. Easy to build and no reason to over-engineer.
– Example: Spending Limit. Card holder limit = $2,000
• If (currentPurchaseAmount + balance > 2,000) then deny transaction
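The spending-limit rule above translates almost directly into code. This is a minimal sketch: the function name `approve` and the use of `Decimal` for money are choices made here, not part of the slide.

```python
from decimal import Decimal

CARD_LIMIT = Decimal("2000")  # card holder limit from the slide's example

def approve(current_purchase_amount, balance, limit=CARD_LIMIT):
    """Return False (deny) when the purchase would push the balance over the limit."""
    return current_purchase_amount + balance <= limit

ok = approve(Decimal("500.00"), Decimal("1500.00"))      # lands exactly on the limit
denied = approve(Decimal("500.01"), Decimal("1500.00"))  # one cent over
```

A rule like this is exact, which is why no statistical model is needed for it.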

Statistics

– Mean, median, mode, variance, deviation
– Anomaly detection. Outliers. (e.g. women’s retail example)
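As a rough illustration of the statistics bullet, a z-score compares a new amount with the mean and standard deviation of earlier amounts. The threshold of 3 standard deviations and the sample history are assumptions chosen for the sketch.

```python
import math
import statistics

def zscore(value, history):
    """Number of sample standard deviations between value and the history mean."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # sample standard deviation, needs >= 2 points
    if stdev == 0:
        # All past values identical: any different value is infinitely unusual.
        return 0.0 if value == mean else math.copysign(math.inf, value - mean)
    return (value - mean) / stdev

def is_outlier(value, history, threshold=3.0):
    """Flag values more than `threshold` standard deviations from the mean."""
    return abs(zscore(value, history)) > threshold

# Hypothetical past amounts for a single card holder and category.
past_amounts = [42.0, 45.0, 47.5, 44.0, 46.0, 43.5]
typical = is_outlier(45.5, past_amounts)
unusual = is_outlier(400.0, past_amounts)
```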

Machine Learning

– Supervised
– Unsupervised
– Trainable
– Adapt over time

Page 6: Credit fraud prevention on hwx stack

Discovery

Gathered all Credit Card Transactions

– Problem is they didn’t make sense
– No identifiable patterns, no log-normal curves
– Gas $45, Chipotle $8.50, Steak dinner $88, Amazon shoes $55

Classification
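A small sketch of why classification helps: taken together, the amounts above spread widely, but grouped by purchase category each group is much tighter. The category names and the extra history rows are hypothetical; only the $45, $8.50 and $88 amounts echo the slide.

```python
import statistics
from collections import defaultdict

# Hypothetical history for one card holder as (category, amount) pairs.
purchases = [
    ("gas", 45.00), ("gas", 43.10), ("gas", 47.25),
    ("fast_food", 8.50), ("fast_food", 9.75), ("fast_food", 7.90),
    ("restaurant", 88.00), ("restaurant", 81.40), ("restaurant", 92.30),
]

def profile_by_category(rows):
    """Return {category: (mean, sample stdev)} for the given purchases."""
    grouped = defaultdict(list)
    for category, amount in rows:
        grouped[category].append(amount)
    return {
        category: (statistics.mean(amounts), statistics.stdev(amounts))
        for category, amounts in grouped.items()
    }

overall_stdev = statistics.stdev([amount for _, amount in purchases])
profiles = profile_by_category(purchases)
```

With these numbers every per-category standard deviation is far smaller than the overall one, so an unusual amount stands out once it is compared against its own category.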

Page 7: Credit fraud prevention on hwx stack

Outlier Detection: identify abnormal patterns

Example: identify anomalies

Features:
- Time frequency
- Category
- Amount
- Distance
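The time-frequency feature could be derived by counting how many earlier transactions on a card fall inside a recent window. The sketch below assumes a sorted list of epoch-second timestamps per card; the function name and the window sizes are illustrative only.

```python
from bisect import bisect_left

def count_in_window(timestamps, now, window_seconds):
    """Count earlier transactions with now - window_seconds <= t < now.

    `timestamps` must be sorted ascending (epoch seconds for one card).
    """
    start = bisect_left(timestamps, now - window_seconds)
    end = bisect_left(timestamps, now)  # excludes anything at or after `now`
    return end - start

# Hypothetical timestamps: two old purchases, then a burst of three.
card_history = [0, 100, 3500, 3590, 3599]
last_hour = count_in_window(card_history, now=3600, window_seconds=3600)
last_two_minutes = count_in_window(card_history, now=3600, window_seconds=120)
```

A count like this can sit next to category, amount and distance as one column of the feature vector handed to an outlier detector.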

Page 8: Credit fraud prevention on hwx stack

Demo

Show me the Code!

Page 9: Credit fraud prevention on hwx stack

Next Steps

Limitations of current model

– In an airport, ready to fly out
– Changes to behavior, like just got a new girlfriend
– What else?

Dependent on the quality of the analyst’s feedback

Tech Overview

– Slider, NiFi, Kafka, Storm, Zeppelin, Spark, HBase

Page 10: Credit fraud prevention on hwx stack

The Future of Data: Modern Data Application

[Diagram: data in motion and data at rest, with storage and Groups 1 to 4, fed by the Internet of Anything]

Hortonworks’ unique approach to data-in-motion and data-at-rest powers Actionable Intelligence

Page 11: Credit fraud prevention on hwx stack

DATA AT REST

DATA IN MOTION

ACTIONABLE INTELLIGENCE

MODERN DATA APPLICATIONS

Actionable Intelligence from Connected Data Platforms

Capturing perishable insights from data in motion

Ensuring rich, historical insights on data at rest

Necessary for modern data applications

Hortonworks DataFlow

Hortonworks Data Platform

Page 12: Credit fraud prevention on hwx stack

Journey to Fraud Detection

Proactively identify potential fraudulent transactions to protect the customer and improve customer experience

• Proactively monitor every credit card transaction using machine learning to catch potential fraud
• Customer Service Analyst reviews flagged transactions in real time via a next generation application running on the connected platform
• HDF controls real time flow of data in and out of the connected platform to the various source and destination points

[Diagram labels: Renovate; Innovate; Years of Customer Transaction Data; Real time ingest of transactions; Complete Customer Profile; Purchase Behavior Insight; Fraud Detection; Immediate Customer Feedback; Improved Experience / Reduced Cost]

Page 13: Credit fraud prevention on hwx stack

Fraud Detection Demo Architecture

[Diagram labels, data in motion: Data Acquisition; Data Routing; Queuing; Simple/Complex Real-time Processing; Real Time Decisions; Elastic Compute; Machine Learning; Online Data; Interactive Query; Visualization]

Page 14: Credit fraud prevention on hwx stack

Fraud Detection Demo Architecture

Data in motion

Distributed Storage: HDFS

Many Workloads: YARN

Real-time Serving (HBase)

Machine Learning (Spark)

UI and HTTP PubSub (Jetty and Tomcat)

Real-Time Data Movement (Apache NiFi)

Data Science (Zeppelin)

Resource Allocation (Slider)

Interactive Query (Hive on Tez)

Configuration Management (Ambari)

Authorization (Ranger)

Real Time Processing (Storm)

Inbound Messaging (Kafka)

Governance (Atlas)

Page 15: Credit fraud prevention on hwx stack

Machine Learning: Enterprise Data Science at Scale

Use the flow of data and the computing power of the connected platform to enable autonomous machine learning

• Real-time data flows combined with massively parallel computing allow AI to continuously improve
• Enables AI to make decisions in the “Grey Areas”

Build and train AI on full-volume data, not a sample

• Time, effort, accuracy, scale
• Visualize data as it is being manipulated

Deploy the AI model without re-implementing

• Spark models can be plugged into a modern connected platform.

Page 16: Credit fraud prevention on hwx stack

Credit Fraud Analyst Inbox

Page 17: Credit fraud prevention on hwx stack

Hortonworks Data Flow

Page 18: Credit fraud prevention on hwx stack

Hortonworks Data Flow

Page 19: Credit fraud prevention on hwx stack

Hortonworks Data Flow

Page 20: Credit fraud prevention on hwx stack
