machine intelligence for fraud prediction

MACHINE INTELLIGENCE FOR FRAUD PREDICTION #paymentsecurity Dmitry Petukhov, ML/DS Preacher, Machine Intelligence Researcher @ OpenWay && Coffee Addicted

Upload: dmitry-petukhov

Post on 21-Jan-2018


Category:

Data & Analytics



TRANSCRIPT

Page 1: Machine Intelligence for Fraud Prediction

MACHINE INTELLIGENCE FOR FRAUD PREDICTION

#paymentsecurity

Dmitry Petukhov, ML/DS Preacher, Machine Intelligence Researcher @ OpenWay && Coffee Addicted

Page 2: Machine Intelligence for Fraud Prediction

Terminology

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.

T.M. Mitchell. Machine Learning, 1997.

Machine learning is the process by which a machine (computer) becomes able to exhibit behavior that was not explicitly programmed into it.

A.L. Samuel. Some Studies in Machine Learning Using the Game of Checkers, 1959.

Page 3: Machine Intelligence for Fraud Prediction

Thesis #1: Machine Learning is the Future

Page 4: Machine Intelligence for Fraud Prediction

Machine Intelligence Cases for Retail Banking

[Chart. Axis and legend labels: Processing Speed (Batch Processing, Real-time); Log(Volume) (Gbytes, Tbytes, Pbytes); Structured data, Semi-structured, Unstructured]

Cases shown on the chart:

Personalized Product Offering

Customer Loyalty

Operational Efficiencies

Fraud Detection

Compliance and Regulatory Reporting

Voice Identity, Chat-bots

Customer Segmentation

Credit Scoring

Credit Card Fraud

Web-/Mobile Bank Fraud

Insider Threats

Information Attacks

Page 5: Machine Intelligence for Fraud Prediction

Thesis #2: Data are everywhere

Page 6: Machine Intelligence for Fraud Prediction

Card-not-present Fraud Volume == Big Data case

Volume

Variety

Velocity

Page 7: Machine Intelligence for Fraud Prediction

Machine Intelligence + Big Data New Paradigm

Page 8: Machine Intelligence for Fraud Prediction

Old School vs Big Data Paradigm

Page 9: Machine Intelligence for Fraud Prediction

Old School vs AI Paradigm

Old School: static* threshold

AI: dynamic threshold

* Δt attack ≪ Δt reaction
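The contrast between a static and a dynamic threshold can be sketched as follows. This is a minimal illustration, not the method used in the talk: the fixed limit of 1000, the rolling window of 5 transactions, and the "mean plus k standard deviations" rule are all assumptions chosen for the example, and `static_flag` and `DynamicThreshold` are hypothetical names.

```python
from collections import deque
from statistics import mean, pstdev

STATIC_LIMIT = 1000.0  # hypothetical fixed rule: flag anything above this amount


def static_flag(amount, limit=STATIC_LIMIT):
    """Old-school rule: one limit for every customer, changed only by hand."""
    return amount > limit


class DynamicThreshold:
    """Per-customer limit recomputed from the customer's recent amounts.

    Assumption for this sketch: limit = mean + k * std over a rolling window.
    """

    def __init__(self, window=5, k=3.0):
        self.history = deque(maxlen=window)
        self.k = k

    def flag(self, amount):
        if len(self.history) < 2:
            suspicious = False  # not enough history to judge yet
        else:
            mu = mean(self.history)
            sigma = max(pstdev(self.history), 1.0)  # floor avoids a zero-width band
            suspicious = amount > mu + self.k * sigma
        if not suspicious:
            # Only normal-looking amounts update the baseline.
            self.history.append(amount)
        return suspicious


detector = DynamicThreshold(window=5, k=3.0)
usual = [detector.flag(a) for a in [20.0, 25.0, 22.0, 24.0, 21.0]]
spike_dynamic = detector.flag(400.0)
spike_static = static_flag(400.0)
print(usual, spike_dynamic, spike_static)
```

In this toy run the 400.0 payment stays under the static limit but is far outside the customer's usual range, so only the dynamic rule flags it.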

Page 10: Machine Intelligence for Fraud Prediction

Evolution or and Revolution

1. FALSE

2. FALSE

3. TRUE

Page 11: Machine Intelligence for Fraud Prediction

Data

Infrastructure

Intelligence

Machine Intelligence Stack

Page 12: Machine Intelligence for Fraud Prediction

Machine, Human

Private cloud, Hybrid cloud, Public cloud

Forget or Secure Store and share

Machine Intelligence Stack

Cost

Law? Ethics?

Black box?

Page 13: Machine Intelligence for Fraud Prediction

Architecture: Data Flow

Online (real-time processing): Transactions stream → Risk score

Internal data: Transactions Log (WAY4), customers/merchants CRMs, black/white lists

External data: NBKI (National Bureau of Credit Histories), FNS (Federal Tax Service), PFR (Pension Fund of Russia), FSSP (Federal Bailiff Service), location and device identification, social graph, mobile provider score

0. Retrieve data

1. Preprocess data

2. Calculate statistics

3. Train model

4. Evaluate model

Details: Raw, Aggregates, Model

Private data (Russian Federal Law 152-FZ on personal data)

Payment data (PCI DSS)
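Steps 0 to 4 of the data flow can be sketched end to end on toy data. Everything below is an illustrative assumption rather than the architecture from the talk: the records are made up, the only feature is the ratio of an amount to the card's median amount, and the "model" is a single cutoff on that ratio. All function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, median


def retrieve_data():
    # Step 0: toy (card_id, amount, is_fraud) records; made-up values.
    return [
        ("c1", 20.0, 0), ("c1", 25.0, 0), ("c1", 300.0, 1),
        ("c2", 500.0, 0), ("c2", 450.0, 0), ("c2", 480.0, 0),
        ("c2", 0.0, 0),  # invalid record, dropped in step 1
    ]


def preprocess(rows):
    # Step 1: keep only records with a positive amount.
    return [(card, amount, label) for card, amount, label in rows if amount > 0]


def calculate_statistics(rows):
    # Step 2: aggregate per card, then express each amount relative to
    # the card's median amount.
    by_card = defaultdict(list)
    for card, amount, _ in rows:
        by_card[card].append(amount)
    medians = {card: median(values) for card, values in by_card.items()}
    return [(amount / medians[card], label) for card, amount, label in rows]


def train_model(features):
    # Step 3: a deliberately trivial "model": the midpoint between the
    # average legitimate ratio and the average fraudulent ratio.
    legit = [x for x, y in features if y == 0]
    fraud = [x for x, y in features if y == 1]
    return (mean(legit) + mean(fraud)) / 2


def evaluate_model(cutoff, features):
    # Step 4: recall = TP / (TP + FN). Measured on the training data here
    # only to keep the sketch short; a held-out set would be used in practice.
    tp = sum(1 for x, y in features if x > cutoff and y == 1)
    fn = sum(1 for x, y in features if x <= cutoff and y == 1)
    return tp / (tp + fn)


features = calculate_statistics(preprocess(retrieve_data()))
cutoff = train_model(features)
recall = evaluate_model(cutoff, features)
print(len(features), round(cutoff, 2), recall)
```

The raw records, the per-card aggregates, and the fitted cutoff correspond loosely to the Raw, Aggregates, and Model levels listed on the slide.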

Page 14: Machine Intelligence for Fraud Prediction

Step 1: Preprocessing Data

Transaction Amount Challenge

[Figure panels 1, 2, 3]

Page 15: Machine Intelligence for Fraud Prediction

Step 1: Preprocessing Data

Customer Clustering Challenge

[Figure panels 1, 2]

Page 16: Machine Intelligence for Fraud Prediction

Step 2: Calculate Statistics

Page 17: Machine Intelligence for Fraud Prediction

1% of women aged 40 who take part in regular screening have breast cancer. 80% of women with breast cancer get a positive mammogram. 9.6% of healthy women also get a positive result (mammography, like any measurement, is not 100% accurate).

A woman from this age group received a positive result at a regular screening.

What is the probability that she actually has breast cancer?

Step 2: Calculate Statistics
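The question on this slide can be answered with Bayes' theorem: P(cancer | positive) = P(positive | cancer) · P(cancer) / P(positive). A small sketch using the slide's numbers follows; the function and parameter names are chosen here for illustration.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) by Bayes' theorem."""
    true_positives = prior * sensitivity                  # P(positive and sick)
    false_positives = (1 - prior) * false_positive_rate   # P(positive and healthy)
    return true_positives / (true_positives + false_positives)


# Numbers from the slide: 1% prevalence, 80% sensitivity,
# 9.6% false positive rate among healthy women.
p = posterior(prior=0.01, sensitivity=0.80, false_positive_rate=0.096)
print(f"{p:.1%}")  # 7.8%
```

The answer is about 7.8%, far below the intuitive guess of roughly 80%, because healthy women vastly outnumber sick ones. The same base-rate effect applies to any rare event, which is why an alert from a fraud detector on a mostly legitimate transaction stream should be read with the class prior in mind.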

Page 18: Machine Intelligence for Fraud Prediction

Step 3: Train Model

Algorithm Selection Challenge

Algorithm | Accuracy | Speed | Specifics

1. Logistic regression | low | fast | linearly separable

2. Decision Tree | low | medium | human-readable

3. Boosted Decision Tree | high | medium | generalization ability

4. Neural Networks | medium-high | low | pattern recognition

5. Deep Learning | high | very low | magic AI

Page 19: Machine Intelligence for Fraud Prediction

Step 4: Evaluate Model

$$\text{Accuracy} = \frac{TP + TN}{P + N}$$

$$\text{Recall}^{*} = \frac{TP}{TP + FN}$$

Challenges:

Imbalanced classes;

False Positive Penalty != False Negative Penalty;

Calculate business-metrics:

Direct and indirect losses;

Bonus:

if you change Threshold, you will change everything…

$$\text{Precision} = \frac{TP}{TP + FP}$$

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

Wikipedia
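The metrics and challenges on this slide can be tried out on a toy example. The sketch below is illustrative only: the ten scores and labels are made up, the F-measure is computed as the harmonic mean of precision and recall, and the penalties (1 per false positive, 50 per false negative) are arbitrary assumptions standing in for real business losses.

```python
def confusion(scores, labels, threshold):
    """Count (TP, FP, TN, FN) when score >= threshold means 'fraud'."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f_measure


def loss(fp, fn, fp_penalty=1.0, fn_penalty=50.0):
    # Hypothetical business metric: a missed fraud costs far more than a false alarm.
    return fp * fp_penalty + fn * fn_penalty


# Made-up model scores; 2 of 10 transactions are fraud (imbalanced classes).
scores = [0.05, 0.10, 0.15, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
labels = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]

results = {}
for threshold in (0.35, 0.80, 1.10):
    tp, fp, tn, fn = confusion(scores, labels, threshold)
    results[threshold] = (metrics(tp, fp, tn, fn), loss(fp, fn))
    print(threshold, (tp, fp, tn, fn), results[threshold])
```

With a threshold of 1.10 nothing is flagged, yet accuracy is still 0.8 while recall is 0, which is the imbalanced-class trap. Lowering the threshold from 0.80 to 0.35 reduces accuracy from 0.9 to 0.7 but raises recall from 0.5 to 1.0 and cuts the assumed loss from 50 to 3, so the choice of threshold depends on the penalties, not on accuracy alone.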

Page 20: Machine Intelligence for Fraud Prediction

Rule-based or AI-based?

Page 21: Machine Intelligence for Fraud Prediction

References

1. Bansal, M. Credit Card Fraud Detection Using Self Organised Map (2014) International Journal of Information & Computation Technology, Volume 4, Number 13.

2. Chan, P.K., Fan, W., Prodromidis, A.L., Stolfo, S.J. Distributed data mining in credit card fraud detection (1999) IEEE Intelligent Systems and Their Applications, 14 (6).

3. Grolinger, K., Hayes, M., Higashino, W.A., L'Heureux, A., Allison, D.S., Capretz, M.A.M. Challenges for MapReduce in Big Data (2014) Proceedings of the 2014 IEEE World Congress on Services.

4. Khan, A., Akhtar, N., Qureshi, M. Real-Time Credit-Card Fraud Detection using Artificial Neural Network Tuned by Simulated Annealing Algorithm (2014) ACEEE, Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC 2014, Chandigarh, India.

5. Lu, Q., Ju, C. Research on credit card fraud detection model based on class weighted support vector machine (2011) Journal of Convergence Information Technology, 6 (1).

6. Mardani, S., Akbari, M.K., Sharifian, S. Fraud detection in Process Aware Information systems using MapReduce (2014) 2014 6th Conference on Information and Knowledge Technology, IKT 2014.

7. Petukhov, D., Tselykh, A. Web service for detecting credit card fraud in near real-time (2015) Proceedings of the 8th International Conference on Security of Information and Networks.

Advanced References

1. Maxim Fedotenko. How banks are protected: examining the design and principles of bank anti-fraud (in Russian). Хакер (Xakep) magazine, 2017.

2. Dmitry Petukhov. Article series: Antifraud as a Service (in Russian). Web resource 0xCode.in, 2016.

Page 22: Machine Intelligence for Fraud Prediction

© 2017, Dmitry Petukhov. CC BY-SA 4.0 license. OpenWay and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.

Thank you!

Page 23: Machine Intelligence for Fraud Prediction

Q&A

Now or later (see contacts below)

Stay connected

Facebook: @code.zombi

Habr: @codezombie

All contacts: http://0xCode.in/author

Download presentation from

http://0xCode.in/2017/paymentsecurity or