Copyright © SAS Institute Inc. All rights reserved.
Gotcha! Network Analytics to augment Fraud Detection
Big Data in the Food Chain: the un(der)explored goldmine?
Author: Véronique Van Vlasselaer
SAS Pre-Sales Analytical Consultant
December 4th, 2018
Introduction
Fraud Analytics Using Descriptive, Predictive and Social Network Techniques:
A Guide to Data Science for Fraud Detection
Network Analytics! Say what?!
• Main analytical question in fraud:
Given the current network, who will be the next to commit fraud?
Network Analytics! Say what?!
• Traditional approach in a fraud context:
• Finding descriptive patterns (e.g. multivariate outliers) or predictive patterns (e.g. predictive analytics) in massive amounts of structured data
[Figures: Multivariate Outlier Detection; Predictive Analytics]
Network Analytics! Say what?!
• State-of-the-art insights grounded in social sciences:
• Fraud is “socially” contagious.
- If Bart and Peter are both fraudsters, and Véronique is friends with both Bart and Peter, what would you expect of Véronique’s behavior?
• Extension of traditional detection approaches by including social interactions among fraudsters (and other people).
• Data issue: networked data is unstructured.
Network Analytics! Say what?!
Credit Card Transaction Fraud
Network Analytics! Say what?!
Social Security Fraud
Network Analytics! Say what?!
• Networked data? Where to find?
• Much more than data on social media channels.
• Call behavior data
• Review data
• Transactional data
• Employee data
• Financial data
• Sales data (e.g. eBay)
• ...
Network Analytics! Say what?!
• Networked data? Where to find?
• International agro-food trade network
- Network of food suppliers and nations
- Detection of faulty food production
- Impact of food contamination
- How to quickly shortcut a potential safety breach?
• Food supply network
- Network of raw material suppliers, food processors and retail
• Chemical networks
- Network of OTUs
Network Analytics! Say what?!
• Main analytical question in fraud:
Given the current network, who will be the next to commit fraud?
• Main analytical solutions
• Featurization of the network
• Collective inference algorithms (incl. behavioral propagation)
Featurization
Network Analysis: Featurization
• Featurization is the process in which the unstructured network is transformed to a structured form
unstructured data → (FEATURIZATION) → structured data → predictive model
Network Analytics: Network Representation
• Sociogram:
• Matrix representation:
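The original figures for the two representations are not available in this transcript. As a minimal sketch, assuming an undirected and unweighted network, the snippet below builds a matrix representation from an edge list (the list form of a sociogram). The names and edges are hypothetical and only serve as illustration.

```python
# Hypothetical toy network: each pair is one undirected link in the sociogram.
edges = [
    ("Bart", "Peter"),
    ("Bart", "Véronique"),
    ("Peter", "Véronique"),
    ("Véronique", "Tom"),
]

# Fix an ordering of the nodes so that each one maps to a row/column index.
nodes = sorted({name for edge in edges for name in edge})
index = {name: i for i, name in enumerate(nodes)}

# Symmetric adjacency matrix: entry [i][j] is 1 if nodes i and j are linked.
adjacency = [[0] * len(nodes) for _ in nodes]
for a, b in edges:
    adjacency[index[a]][index[b]] = 1
    adjacency[index[b]][index[a]] = 1

for name, row in zip(nodes, adjacency):
    print(f"{name:<10}", row)
```

Row sums of this matrix give the degree of each node, which is one of the simplest network features.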
Network Analytics: Feature Engineering
• Network featurization process based on
• the first-order neighborhood or egonet of each entity
- How many churners/fraudsters/adopters are connected to the node (i.e., degree)?
- Density of the egonet?
- Number of suppliers/addresses/customers from a black list in the egonet?
- Velocity of the network (time-based network analysis)?
- …
• the n-order neighborhood
- Betweenness, closeness, community detection…
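The egonet questions above can be sketched as one structured row of features per node. This is a minimal illustration, not the speaker's implementation: the toy edge list, the `known_fraud` set, and the feature names are all assumptions, and egonet density is taken here as the share of possible links that exist among the node and its direct neighbors.

```python
from itertools import combinations

# Hypothetical toy data: an undirected network and a set of known fraudsters.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")]
known_fraud = {"B", "D"}

neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)


def egonet_features(node):
    """Return one structured row of first-order network features for `node`."""
    nbrs = neighbors[node]
    ego = nbrs | {node}
    # Count the links that exist among all members of the egonet.
    ego_edges = sum(
        1 for u, v in combinations(sorted(ego), 2) if v in neighbors[u]
    )
    possible = len(ego) * (len(ego) - 1) / 2
    fraud_nbrs = len(nbrs & known_fraud)
    return {
        "node": node,
        "degree": len(nbrs),
        "fraud_neighbors": fraud_nbrs,
        "fraud_ratio": fraud_nbrs / len(nbrs) if nbrs else 0.0,
        "egonet_density": ego_edges / possible if possible else 0.0,
    }


# The resulting table is structured data that a predictive model can consume.
table = [egonet_features(n) for n in sorted(neighbors)]
for row in table:
    print(row)
```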
Network Analytics: Feature Engineering
• Network featurization process examples:
• the n-order neighborhood
[Figures: Betweenness; Closeness]
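The slides only name betweenness and closeness, so the following is a minimal sketch using common textbook definitions, which may differ from the variants the speaker had in mind. It assumes an undirected, unweighted, connected network: closeness is the number of other reachable nodes divided by the sum of shortest-path distances to them, and betweenness is the unnormalized count of shortest paths passing through a node, computed with Brandes' algorithm. The toy graph is hypothetical.

```python
from collections import deque

# Hypothetical toy network (a small tree with C as the hub).
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("C", "E")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)


def closeness(node):
    """Reachable nodes divided by the sum of shortest-path distances to them."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def betweenness():
    """Unnormalized shortest-path betweenness for every node (Brandes)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                       # nodes in order of discovery
        preds = {v: [] for v in graph}   # predecessors on shortest paths
        sigma = {v: 0 for v in graph}    # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        # Accumulate dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in graph}
        for v in reversed(order):
            for u in preds[v]:
                delta[u] += sigma[u] / sigma[v] * (1 + delta[v])
            if v != s:
                score[v] += delta[v]
    # Each undirected pair was counted once from each endpoint.
    return {v: value / 2 for v, value in score.items()}


print(betweenness())
print({v: round(closeness(v), 3) for v in sorted(graph)})
```

In this toy graph the hub C scores highest on both measures, which is the kind of signal these features add to a fraud model.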
Collective Inference Algorithms
Collective Inference Algorithms
• Collective inference algorithms
• The label of a node is said to depend on the labels of the neighboring nodes.
• Chicken-egg problem:
- The label of node A depends on the label of node B, and
- The label of node B depends on the label of node A.
• In general: iterative procedure with random ordering
Collective Inference Algorithms
RULE: IF MORE THAN HALF OF THE NEIGHBORS ARE FRAUDULENT, THE NODE IS FRAUDULENT
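The iterative procedure with random ordering, combined with the majority rule above, can be sketched as follows. This is an illustrative assumption about how the pieces fit together, not the speaker's code: the toy network, the `known` labels, the choice to start unlabeled nodes as non-fraudulent, and the stopping criterion (no label changed in a full pass) are all hypothetical.

```python
import random

# Hypothetical toy network and investigated cases (True = fraudulent).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
known = {"A": True, "B": True, "E": False}

neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)


def collective_inference(neighbors, known, max_iter=20, seed=0):
    """Relabel unknown nodes in random order until the labels stop changing."""
    rng = random.Random(seed)
    # Assumption: unknown nodes start as non-fraudulent; known labels are fixed.
    labels = {node: known.get(node, False) for node in neighbors}
    unknown = sorted(node for node in neighbors if node not in known)
    for _ in range(max_iter):
        rng.shuffle(unknown)  # random ordering in every pass
        changed = False
        for node in unknown:
            nbrs = neighbors[node]
            fraud_count = sum(labels[v] for v in nbrs)
            # Rule: more than half of the neighbors fraudulent -> fraudulent.
            new_label = fraud_count > len(nbrs) / 2
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    return labels


labels = collective_inference(neighbors, known)
print(labels)
```

Here C has two fraudulent neighbors out of three and is flagged, while D has at most one out of two and is not. In larger networks the outcome can depend on the visiting order, which is why the ordering is randomized.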
Collective Inference Algorithms
• Influence propagation through the network
• E.g. Gotcha!, based on Google’s famous PageRank algorithm
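The slides do not describe how Gotcha! adapts PageRank, so the snippet below is not the Gotcha! algorithm. It is a generic personalized PageRank sketch that illustrates influence propagation: scores restart at known fraud nodes and spread along links. The function name, the damping factor of 0.85, the fixed iteration count, and the toy network are all assumptions, and the network is assumed to be undirected with no isolated nodes.

```python
# Hypothetical toy network: a chain F - X - Y - Z where F is a known fraudster.
edges = [("F", "X"), ("X", "Y"), ("Y", "Z")]
neighbors = {}
for a, b in edges:
    neighbors.setdefault(a, set()).add(b)
    neighbors.setdefault(b, set()).add(a)


def propagate_fraud(neighbors, fraud_seeds, damping=0.85, iterations=100):
    """Personalized PageRank with the restart mass placed on known fraud nodes."""
    nodes = sorted(neighbors)
    restart = {
        n: (1 / len(fraud_seeds) if n in fraud_seeds else 0.0) for n in nodes
    }
    score = dict(restart)
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            # Each neighbor passes on its score, split evenly over its links.
            incoming = sum(score[m] / len(neighbors[m]) for m in neighbors[n])
            updated[n] = damping * incoming + (1 - damping) * restart[n]
        score = updated
    return score


exposure = propagate_fraud(neighbors, fraud_seeds={"F"})
for node, value in sorted(exposure.items(), key=lambda item: -item[1]):
    print(node, round(value, 3))
```

The resulting exposure score decreases with distance from the fraud seed along the chain, and it can be used as an additional feature next to the egonet features.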
Conclusion: Fraud Detection
A Hybrid Approach
Hybrid Approach for Detection
[Figure: HYBRID ANALYTICAL METHODS. Axes: Capability and Value. Methods shown: Manual Detection, Rules, Predictive Models, Fraud Network Analysis, Anomaly Detection]
Questions? Feedback? [email protected]