How to use graphs to identify credit card thieves?
DESCRIPTION
Graph technologies like Neo4j and Linkurious can help spot credit card thieves. How? Read this use case to learn!
TRANSCRIPT
How to use graphs to identify credit card thieves
SAS founded in 2013 in Paris | http://linkurio.us | @linkurious
WHAT IS A GRAPH?
[Figure: a small family graph with two "Father Of" relationships and one "Siblings" relationship]
This is a graph
WHAT IS A GRAPH : NODES AND RELATIONSHIPS
[Figure: the same family graph, with two "Father Of" relationships and one "Siblings" relationship]
A graph is a set of nodes linked by relationships
This is a node
This is a relationship
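As a minimal sketch of this definition, a graph can be held in plain Python as a set of nodes plus a list of (source, type, target) triples. The names Adam, Bob and Carol are placeholders invented for this illustration; the slide only shows the relationship labels.

```python
# Hypothetical family graph; the people's names are illustrative only.
nodes = {"Adam", "Bob", "Carol"}

# Each relationship is a (source, type, target) triple.
relationships = [
    ("Adam", "FATHER_OF", "Bob"),
    ("Adam", "FATHER_OF", "Carol"),
    ("Bob", "SIBLINGS", "Carol"),
]


def neighbours(node):
    """Return every node linked to `node`, whatever the direction."""
    linked = set()
    for source, _type, target in relationships:
        if source == node:
            linked.add(target)
        elif target == node:
            linked.add(source)
    return sorted(linked)


print(neighbours("Bob"))
```

Storing relationships as first-class records, rather than deriving them from joins, is the idea the rest of this deck builds on.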
DIFFERENT DOMAINS WHERE GRAPHS ARE IMPORTANT
Graphs can be used to model many domains
Social networks: people, objects, movies, restaurants, music
Communications: antennas, servers, phones, people
Supply chains: suppliers, roads, warehouses, products
But why can graphs help identify credit card thieves?
GRAPH AND FRAUD DETECTION
HOW CREDIT CARD THIEVES OPERATE
Get access to the numbers... and turn them into cash
Steal the credit card: The criminal is an employee in a store. During check-out he copies the credit card information of certain customers.
Make online purchases: Later on, the criminal uses the credit card numbers to make purchases online. He chooses items that can be resold.
Turn goods into cash: The criminal intercepts the goods at the shipping address. He resells the goods: now he has cash!
The first step to detect card thieves is to turn transaction history into a graph
A GRAPH DATA MODEL TO IDENTIFY CARD THIEVES
[Figure: Paul (Person) and Nicole (Person) linked to five Merchant nodes by HAS_BOUGHT_AT relationships. Transactions shown: 29$ (05/05/2014), 8$ (05/05/2014), 19.5$ (05/05/2014), 8$ (06/05/2014), 10.5$ (05/05/2014), 199$ (08/05/2014), 78.9$ (08/05/2014)]
The edges are transactions. The two fraudulent transactions are shown in red.
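To make the modelling step concrete, here is a rough Python sketch that turns a flat transaction history into a person-to-merchant graph. The amounts and dates are taken from the slide, but the merchant names, the assignment of each purchase to Paul or Nicole, and the choice of which transactions are disputed are assumptions made only for this example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    person: str
    merchant: str   # hypothetical merchant name
    amount: float   # in dollars
    date: str       # kept as written on the slide
    status: str     # "Disputed" or "Undisputed"


# Illustrative history: who bought where is assumed, not taken from the slide.
history = [
    Transaction("Paul", "Coffee shop", 8.0, "05/05/2014", "Undisputed"),
    Transaction("Paul", "Bookstore", 29.0, "05/05/2014", "Undisputed"),
    Transaction("Paul", "Online store A", 199.0, "08/05/2014", "Disputed"),
    Transaction("Nicole", "Coffee shop", 8.0, "06/05/2014", "Undisputed"),
    Transaction("Nicole", "Online store B", 78.9, "08/05/2014", "Disputed"),
]


def build_graph(transactions):
    """Map each person node to its outgoing HAS_BOUGHT_AT edges."""
    graph = defaultdict(list)
    for tx in transactions:
        # The edge keeps the transaction itself as its properties.
        graph[tx.person].append((tx.merchant, tx))
    return dict(graph)


graph = build_graph(history)
print(sorted(merchant for merchant, _ in graph["Paul"]))
```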
WHERE IS THE THIEF?
We are looking for the common connection between the 2 victims...
The only place the theft could have happened is at the coffee shop...
LOOKING AT THE COMMON CONNECTION
[Figure: the transaction graph between Paul, Nicole and the five merchants, repeated for this slide]
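On a graph this small, looking for the common connection amounts to a set intersection. The sketch below assumes we already know which merchants each victim visited before the fraud; apart from the coffee shop, the merchant names are invented for the example.

```python
# Merchants each victim bought from before the fraudulent transactions.
# Names other than the coffee shop are hypothetical.
purchases = {
    "Paul": {"Coffee shop", "Bookstore", "Supermarket"},
    "Nicole": {"Coffee shop", "Gas station"},
}

# A merchant is a candidate only if every victim has been there.
common = set.intersection(*purchases.values())
print(common)
```

With only two victims the intersection is enough. With many victims and noisy data, a strict intersection becomes too brittle, which is why the query later in this deck ranks merchants by count instead.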
WHAT IF WE NEED TO ANALYSE >100M TRANSACTIONS?
Doing it in real life involves querying a large number of transactions to find connections
THE PAINS OF WORKING ON CONNECTED DATA WITH RELATIONAL TECHNOLOGIES
Relational databases are not good at handling... relationships
Depth | RDBMS execution time (s) | Neo4j execution time (s) | Records returned
2     | 0.016                    | 0.01                     | ~2500
3     | 30.267                   | 0.168                    | ~110 000
4     | 1543.505                 | 1.359                    | ~600 000
5     | Unfinished               | 2.132                    | ~800 000
Finding extended friends in a 1M people social network (from the book Graph Databases)
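Here "depth" can be read as the number of friendship hops followed from a starting person. The following Python sketch expands a tiny, made-up network breadth-first up to a given depth; it only illustrates what the benchmark query asks for and says nothing about how either database executes it.

```python
def friends_within(network, start, depth):
    """Return everyone reachable from `start` in at most `depth` hops."""
    seen = {start}
    frontier = {start}
    for _ in range(depth):
        # Expand one hop, skipping people we have already reached.
        frontier = {
            friend
            for person in frontier
            for friend in network.get(person, ())
        } - seen
        seen |= frontier
    return seen - {start}


# Hypothetical four-person chain: a - b - c - d
network = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}

print(sorted(friends_within(network, "a", 2)))
```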
GRAPH DATABASES MAKE IT POSSIBLE TO QUERY LARGE GRAPHS
Graph databases make it possible to identify fraud patterns in real time
An event triggers security checks
Customer complaint
Suspicious transaction
Merchant alert
A Neo4j Cypher query runs to detect patterns
Identification of the fraudsters
EXAMPLE : A GRAPH QUERY TO IDENTIFY CREDIT CARD THIEVES
MATCH (victim:person)-[r:HAS_BOUGHT_AT]->(merchant)
WHERE r.status = "Disputed"
MATCH (victim)-[t:HAS_BOUGHT_AT]->(othermerchants)
WHERE t.status = "Undisputed" AND t.time < r.time
WITH victim, othermerchants, t ORDER BY t.time DESC
RETURN DISTINCT othermerchants.name as suspicious_store, count(DISTINCT t) as count, collect(DISTINCT victim.name) as victims
ORDER BY count DESC
EXAMPLE : A GRAPH QUERY TO IDENTIFY CREDIT CARD THIEVES
MATCH (victim:person)-[r:HAS_BOUGHT_AT]->(merchant)
WHERE r.status = "Disputed"
We select the victims, people involved in “disputed” transactions
MATCH (victim)-[t:HAS_BOUGHT_AT]->(othermerchants)
WHERE t.status = "Undisputed" AND t.time < r.time
We look at the transactions that happened before the fraudulent transactions
WITH victim, othermerchants, t ORDER BY t.time DESC
RETURN DISTINCT othermerchants.name as suspicious_store, count(DISTINCT t) as count, collect(DISTINCT victim.name) as victims
ORDER BY count DESC
We return the list of suspicious merchants, ordered by the number of transactions they are involved in
Complete explanation and dataset here!
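For readers who want to trace the logic without a database, the same three steps can be approximated in plain Python. This is a sketch under assumptions: the transactions, merchant names and integer timestamps below are invented, and the function name `suspicious_stores` is hypothetical.

```python
from collections import defaultdict

# (person, merchant, time, status); all values are hypothetical.
transactions = [
    ("Paul", "Coffee shop", 1, "Undisputed"),
    ("Paul", "Bookstore", 2, "Undisputed"),
    ("Paul", "Online store A", 3, "Disputed"),
    ("Nicole", "Coffee shop", 1, "Undisputed"),
    ("Nicole", "Gas station", 2, "Undisputed"),
    ("Nicole", "Online store B", 4, "Disputed"),
    ("Nicole", "Supermarket", 5, "Undisputed"),
]


def suspicious_stores(transactions):
    # Step 1: victims are people with at least one disputed transaction.
    # Keeping the latest disputed time is enough: a purchase precedes some
    # disputed transaction exactly when it precedes the latest one.
    latest_dispute = {}
    for person, _merchant, time, status in transactions:
        if status == "Disputed":
            latest_dispute[person] = max(time, latest_dispute.get(person, time))

    # Step 2: keep the victims' undisputed transactions made before the fraud.
    counts = defaultdict(int)
    victims = defaultdict(set)
    for person, merchant, time, status in transactions:
        if (status == "Undisputed"
                and person in latest_dispute
                and time < latest_dispute[person]):
            counts[merchant] += 1
            victims[merchant].add(person)

    # Step 3: rank merchants by number of matching transactions.
    return sorted(
        ((m, counts[m], sorted(victims[m])) for m in counts),
        key=lambda row: (-row[1], row[0]),
    )


for store, count, people in suspicious_stores(transactions):
    print(store, count, people)
```

In this made-up data the supermarket purchase happens after the fraud, so it is excluded, and the coffee shop ranks first because both victims bought there beforehand.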
WHAT IS THE IMPACT OF LINKURIOUS
If something suspicious comes up, the analysts can use Linkurious to quickly assess the situation
Treat false positives: Linkurious allows you to control the alerts and make sure your customers are not treated like criminals.
Investigate serious cases: Linkurious allows the fraud teams to go deep in the data and build cases against fraud rings.
Save money: The fraud teams act faster and more fraud cases can be avoided.
TECHNOLOGY
Cloud ready and open-source based
OTHER USE CASES
Graphs are everywhere, learn to leverage them
SOME ADDITIONAL RESOURCES TO CONSIDER
Article on credit card thieves identification:
- the article: http://linkurio.us/stolen-credit-cards-and-fraud-detection-with-neo4j/
- the dataset: https://www.dropbox.com/s/4uij4gs2iyva5bd/credit%20card%20fraud.zip
GraphGist on credit card fraud:
- the GraphGist: http://gist.neo4j.org/?3ad4cb2e3187ab21416b