Gotcha! Network Analytics to Augment Fraud Detection

Post on 05-Oct-2020


Copyright © SAS Institute Inc. All rights reserved.

Gotcha! Network Analytics to augment Fraud Detection

Big Data in the Food Chain: the un(der)explored goldmine?

Author: Véronique Van Vlasselaer

SAS Pre-Sales Analytical Consultant

December 4th, 2018


Introduction

Fraud Analytics Using Descriptive, Predictive and Social Network Techniques: A Guide to Data Science for Fraud Detection


Network Analytics! Say what?!

• Main analytical question in fraud:

Given the current network, who will be the next to commit fraud?


• Traditional approach in a fraud context:

• Finding descriptive patterns (e.g. multivariate outliers) or predictive patterns (e.g. predictive analytics) in massive amounts of structured data

[Figure panels: Multivariate Outlier Detection; Predictive Analytics]


• State-of-the-art insights grounded in social sciences:

• Fraud is “socially” contagious.

- If Bart and Peter are both fraudsters, and Véronique is a friend of both Bart and Peter, what would you expect of Véronique’s behavior?

• Extension of traditional detection approaches by including social interactions among fraudsters (and other people).

• Data issue: networked data is unstructured.

[Figure: example network, Credit Card Transaction Fraud]

[Figure: example network, Social Security Fraud]

• Networked data? Where to find?

• Much more than data on social media channels.

• Call behavior data

• Review data

• Transactional data

• Employee data

• Financial data

• Sales data (e.g. eBay)

• ...


• Networked data? Where to find?

• International agro-food trade network

- Network of food suppliers and nations

- Detection of faulty food production

- Impact of food contamination

- How to quickly contain a potential safety breach?

• Food supply network

- Network of raw material suppliers, food processors and retail

• Chemical networks

- Network of OTUs


• Main analytical question in fraud:

Given the current network, who will be the next to commit fraud?

• Main analytical solutions

• Featurization of the network

• Collective inference algorithms (incl. behavioral propagation)


Featurization


• Featurization is the process by which the unstructured network is transformed into a structured form

[Diagram: unstructured data → featurization → structured data → predictive model]

Network Analytics: Network Representation

• Sociogram:

• Matrix representation:
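The matrix representation named above can be illustrated with a minimal sketch. It assumes an undirected, unweighted network given as an edge list; the names reuse the Bart/Peter/Véronique example from earlier in the talk, and "Tom" is a hypothetical extra node added only for illustration.

```python
# Hypothetical toy network; "Tom" is an invented node for illustration.
edges = [
    ("Bart", "Véronique"),
    ("Peter", "Véronique"),
    ("Bart", "Peter"),
    ("Peter", "Tom"),
]


def adjacency_matrix(edges):
    """Return (sorted node list, symmetric 0/1 matrix) for an undirected edge list."""
    nodes = sorted({name for edge in edges for name in edge})
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for a, b in edges:
        # Undirected link: mark both directions.
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return nodes, matrix


nodes, matrix = adjacency_matrix(edges)
for name, row in zip(nodes, matrix):
    print(f"{name:<10} {row}")
```

Each row lists the direct connections of one entity, so the row sum equals that entity's degree.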


Network Analytics: Feature Engineering

• Network featurization process based on

• the first-order neighborhood or egonet of each entity

- How many churners/fraudsters/adopters are connected to the node (i.e., degree)?

- Density of the egonet?

- Number of suppliers/addresses/customers from a blacklist in the egonet?

- Velocity of the network (time-based network analysis)?

- …

• the n-order neighborhood

- Betweenness, closeness, community detection…
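A few of the egonet questions listed here can be turned into one structured row per entity. The sketch below is an illustration under stated assumptions, not the speaker's implementation: the graph is stored as adjacency sets, the fraud labels are given, and the feature names (`fraud_degree`, `fraud_ratio`, `egonet_density`) are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: undirected network as adjacency sets, plus known fraud labels.
graph = {
    "Bart": {"Peter", "Véronique"},
    "Peter": {"Bart", "Véronique", "Tom"},
    "Tom": {"Peter"},
    "Véronique": {"Bart", "Peter"},
}
known_fraud = {"Bart", "Peter"}


def egonet_features(graph, node, known_fraud):
    """Flatten the first-order neighborhood of `node` into a feature row."""
    neighbors = graph[node]
    degree = len(neighbors)
    fraud_degree = len(neighbors & known_fraud)
    # Egonet = the node itself, its neighbors, and every edge among them.
    members = neighbors | {node}
    edge_count = sum(1 for a, b in combinations(sorted(members), 2) if b in graph[a])
    possible = len(members) * (len(members) - 1) / 2
    return {
        "degree": degree,
        "fraud_degree": fraud_degree,
        "fraud_ratio": fraud_degree / degree if degree else 0.0,
        "egonet_density": edge_count / possible if possible else 0.0,
    }


features = {node: egonet_features(graph, node, known_fraud) for node in graph}
for node, row in features.items():
    print(node, row)
```

The resulting rows can be joined to the usual customer or transaction attributes before a predictive model is trained.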

• Network featurization process examples:

• the n-order neighborhood

- Betweenness

- Closeness
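Both measures look beyond the direct neighbors of a node. As a minimal sketch, assuming an unweighted, undirected graph stored as adjacency sets, closeness can be computed with a breadth-first search and betweenness with Brandes' algorithm. The toy graph and the exact normalization (closeness as reachable nodes divided by total distance, betweenness left unnormalized) are choices made for this illustration.

```python
from collections import deque

# Hypothetical toy network as adjacency sets.
graph = {
    "Bart": {"Peter", "Véronique"},
    "Peter": {"Bart", "Véronique", "Tom"},
    "Tom": {"Peter"},
    "Véronique": {"Bart", "Peter"},
}


def closeness(graph, node):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def betweenness(graph):
    """Brandes' algorithm: count of shortest paths passing through each node."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):  # accumulate dependencies, farthest first
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both endpoints.
    return {v: value / 2 for v, value in score.items()}


close = {v: closeness(graph, v) for v in graph}
between = betweenness(graph)
```

In this toy network "Peter" is the only node that lies on shortest paths between other nodes, so it carries all of the betweenness.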


Collective Inference Algorithms


• Collective inference algorithms

• The label of a node is said to depend on the labels of the neighboring nodes.

• Chicken-and-egg problem:

- The label of node A depends on the label of node B, and

- The label of node B depends on the label of node A.

• In general: iterative procedure with random ordering

RULE: IF MORE THAN HALF OF A NODE'S NEIGHBORS ARE FRAUDULENT, THE NODE IS FRAUDULENT
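The iterative procedure with random ordering, combined with the majority rule, can be sketched as follows. This is an illustration under assumptions not stated in the slides: nodes with a known label stay fixed, unknown nodes start as non-fraudulent, and the loop stops when a full pass changes nothing or after `max_rounds` passes (a hypothetical parameter).

```python
import random

# Hypothetical toy network: A and B are known fraudsters, C is known to be legitimate.
graph = {
    "A": {"D", "F"},
    "B": {"D"},
    "C": {"E"},
    "D": {"A", "B", "F"},
    "E": {"C", "F"},
    "F": {"A", "D", "E"},
}
seed_labels = {"A": True, "B": True, "C": False}


def collective_inference(graph, seed_labels, max_rounds=10, seed=0):
    """Relabel unknown nodes until stable: fraud if more than half of neighbors are fraud."""
    rng = random.Random(seed)
    # Assumption: unknown nodes start out as non-fraudulent.
    labels = {node: seed_labels.get(node, False) for node in graph}
    unknown = sorted(node for node in graph if node not in seed_labels)
    for _ in range(max_rounds):
        rng.shuffle(unknown)  # random visiting order in every pass
        changed = False
        for node in unknown:
            neighbors = graph[node]
            fraud_count = sum(labels[n] for n in neighbors)
            new_label = fraud_count * 2 > len(neighbors)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


labels = collective_inference(graph, seed_labels)
print(labels)
```

Here D turns fraudulent because two of its three neighbors are known fraudsters, which in turn tips F over the threshold in a later pass; E stays legitimate because only one of its two neighbors is fraudulent.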


• Influence propagation through the network

• E.g. Gotcha!, based on Google’s famous PageRank algorithm
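To give a feel for PageRank-style influence propagation, here is a generic personalized PageRank sketch in which the random walk restarts at known fraud nodes. It is not the Gotcha! implementation; the damping factor of 0.85, the fixed number of iterations, and the toy chain graph are assumptions made for this illustration.

```python
# Hypothetical chain: Fraud - Near - Mid - Far
graph = {
    "Fraud": {"Near"},
    "Near": {"Fraud", "Mid"},
    "Mid": {"Near", "Far"},
    "Far": {"Mid"},
}
known_fraud = {"Fraud"}


def propagate_fraud(graph, known_fraud, damping=0.85, iterations=50):
    """Personalized PageRank by power iteration; restarts go to known fraud nodes."""
    restart = {v: (1.0 / len(known_fraud) if v in known_fraud else 0.0) for v in graph}
    score = dict(restart)
    for _ in range(iterations):
        nxt = {v: (1.0 - damping) * restart[v] for v in graph}
        for v, neighbors in graph.items():
            if not neighbors:
                # Dangling node: return its mass to the restart distribution.
                for u in graph:
                    nxt[u] += damping * score[v] * restart[u]
                continue
            share = damping * score[v] / len(neighbors)
            for u in neighbors:
                nxt[u] += share
        score = nxt
    return score


exposure = propagate_fraud(graph, known_fraud)
for node, value in sorted(exposure.items(), key=lambda item: -item[1]):
    print(f"{node:<6} {value:.3f}")
```

Among the unlabeled nodes the score falls with distance from the known fraudster, so it can serve as an exposure feature alongside the egonet features.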


Conclusion: Fraud Detection

A Hybrid Approach


Hybrid Approach for Detection

[Figure: Hybrid Analytical Methods. Axes: Capability, Value. Methods shown: Manual Detection, Rules, Predictive Models, Fraud Network Analysis, Anomaly Detection]


Questions? Feedback? Comments?

veronique.van.vlasselaer@sas.com
