110524 iapa social networks

Upload: rohan-baxter

Post on 05-Apr-2018


  • 8/2/2019 110524 IAPA Social Networks

    1/16

    Social Network Analysis for Intelligence and Analytics

    Rohan Baxter

    Corporate Analytics

    Office of the Chief Knowledge Officer

    Australian Taxation Office

    24 May 2011


    Overview

    What is a Social Network?

    Analytics vs Intelligence/Visualisation

    A scalable network-finding algorithm

    Four case studies (abstracted, de-identified)

    A method for representing networks

    Future work/lessons learnt


    What is a Social Network?

    A network consists of nodes and links

    Real social network: nodes are people, and links mean "is a friend of"

    Node type examples: forms, companies, individuals, various asset classes

    Link type examples: "shares an attribute", "distributes to", "is owner of"
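The node and link types above could be held in a structure like the following. This is a minimal sketch only: the class names, field names, and the sample entities A, B and C are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str  # e.g. "individual", "company", "form" (assumed labels)


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    link_type: str  # e.g. "is_owner_of", "distributes_to" (assumed labels)


# Hypothetical sample data, for illustration only.
nodes = [Node("A", "individual"), Node("B", "company"), Node("C", "individual")]
links = [Link("A", "B", "is_owner_of"), Link("B", "C", "distributes_to")]


def neighbours(node_id, links):
    """Return the ids directly linked to node_id, ignoring link direction."""
    found = set()
    for link in links:
        if link.source == node_id:
            found.add(link.target)
        elif link.target == node_id:
            found.add(link.source)
    return found
```

Keeping the type on both nodes and links lets the same structure hold mixed networks, such as individuals linked to companies by ownership and to forms by a shared attribute.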


    Social Network Analysis Uses

    Issues around network data collection

    Data analysis:

    Low volume, manual: specific targets, complex analysis (for Intelligence or Audit work)

    High volume, automated: risk assessment of large populations of networks (for Analytics for larger-scale treatments)

    Visualisation (usually low volume, but see later)


    Social Network Risk Assessment


    SQL Implementation

    Consolidation of internal implementations in a variety of systems and languages (e.g. Visual Basic, Python, SAS, R, Netmap)

    SQL implementation is about 45 lines and follows the Union-Find algorithm.

    Demonstrated advantages: scalable, correct, concise
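The talk's SQL is not reproduced in the slides. As an illustration of the Union-Find idea it refers to, here is a minimal Python sketch that groups nodes into networks (connected components). The function names and the choice of the smallest node id as the network label are assumptions, not the author's implementation.

```python
def find(parent, x):
    """Return the representative of x's component, compressing the path."""
    root = x
    while parent[root] != root:
        root = parent[root]
    # Second pass: point every node on the path straight at the root.
    while parent[x] != root:
        parent[x], x = root, parent[x]
    return root


def connected_components(node_ids, links):
    """Map each node id to the smallest node id in its network."""
    parent = {n: n for n in node_ids}
    for a, b in links:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            # Union: keep the smaller id as the representative.
            parent[max(ra, rb)] = min(ra, rb)
    return {n: find(parent, n) for n in node_ids}


labels = connected_components([1, 2, 3, 4, 5, 6], [(1, 2), (2, 3), (4, 5)])
```

Here `labels` assigns nodes 1 to 3 to one network, nodes 4 and 5 to another, and leaves node 6 on its own.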


    Algorithm has Fast Convergence

    [Figure: Network Detection Algorithm Convergence. x-axis: Number of Iterations (1 to 9); y-axis: Log(Number of Components to Merge), ticks from 1 to 10,000,000]
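One way to observe convergence of an iterative, set-based network finder is to count how much work remains after each pass. The sketch below uses simple minimum-label propagation, a stand-in technique that is not confirmed to be the algorithm behind the chart, and records the number of nodes whose label changed in each iteration. All names are hypothetical.

```python
def propagate_labels(node_ids, links):
    """Iteratively give each node the smallest label seen across its links.

    Returns the final labels and the number of label changes per iteration.
    """
    label = {n: n for n in node_ids}
    changes_per_iteration = []
    while True:
        new_label = dict(label)
        for a, b in links:
            # Both ends of a link adopt the smaller of their current labels.
            low = min(label[a], label[b])
            if low < new_label[a]:
                new_label[a] = low
            if low < new_label[b]:
                new_label[b] = low
        changed = sum(1 for n in node_ids if new_label[n] != label[n])
        if changed == 0:
            return label, changes_per_iteration
        changes_per_iteration.append(changed)
        label = new_label


final_labels, changes = propagate_labels([1, 2, 3, 4], [(1, 2), (2, 3), (3, 4)])
```

Plotting `changes` on a log scale against the iteration number gives a curve of the same kind as the one on this slide. Plain label propagation needs as many passes as the longest chain in a network, so a production version would likely use a faster merging strategy.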


    Case Studies

    Purpose | No. of Nodes | No. of Links
    Filter low-risk entities from a list of high-risk entities | 2,000 | 8,000
    Use a starting list of identified high-risk entities to find related unknown high-risk entities | 300-2,000 | 2,000-16,000
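The second case study starts from a list of known high-risk entities and looks for related ones. A minimal sketch of that idea is a breadth-first expansion from the seed list, limited to a fixed number of hops. The function name, the `max_hops` parameter, and the sample entity ids are assumptions for illustration and are not taken from the case study.

```python
from collections import deque


def expand_from_seeds(seeds, links, max_hops=2):
    """Return {entity: hop distance} for non-seed entities within max_hops of a seed."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    hops = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adjacency.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    # Seeds themselves are already known, so report only the newly found entities.
    return {n: h for n, h in hops.items() if h > 0}


related = expand_from_seeds(["s"], [("s", "a"), ("a", "b"), ("b", "c")], max_hops=2)
```

With the sample links, `related` contains "a" at one hop and "b" at two hops, while "c" falls outside the limit. The hop distance can then feed into whatever risk scoring is applied to the candidates.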


    Case Studies

    Purpose | No. of Nodes | No. of Links
    Find non-agent tax returns that appear to have a common guiding mind | 1.2m (networks of interest: 6,000) | 20m
    Find high-risk entity networks involving company structures | 2m | 18m


    A High Risk Social Network


    Network Representation in a database field

    (coy -> ind)

    (ptr -> ind)

    (trt -> ind unk)

    (coy -> (coy -> ind))
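Strings in this bracketed form can be read back into a nested structure. The sketch below assumes a grammar of `( source -> target ... )`, where a target is either a word or another bracketed group. That grammar is inferred from the four examples, and the slides do not define the abbreviations, so the parser treats them as opaque tokens. The function names are hypothetical.

```python
import re

# Tokens: brackets, the arrow, and alphabetic type codes such as "coy" or "ind".
TOKEN = re.compile(r"\(|\)|->|[A-Za-z]+")


def parse(text):
    """Parse a bracketed network string into (source, [targets]) tuples."""
    tokens = TOKEN.findall(text)
    tree, pos = _parse_group(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


def _parse_group(tokens, pos):
    """Parse one '( source -> target ... )' group starting at tokens[pos]."""
    if tokens[pos] != "(":
        raise ValueError("expected '('")
    source = tokens[pos + 1]
    if tokens[pos + 2] != "->":
        raise ValueError("expected '->'")
    pos += 3
    targets = []
    while tokens[pos] != ")":
        if tokens[pos] == "(":
            child, pos = _parse_group(tokens, pos)
            targets.append(child)
        else:
            targets.append(tokens[pos])
            pos += 1
    return (source, targets), pos + 1


def depth(tree):
    """Number of nested layers in a parsed network."""
    _, targets = tree
    return 1 + max((depth(t) for t in targets if isinstance(t, tuple)), default=0)
```

For example, `parse("(coy -> (coy -> ind))")` gives a two-layer structure, so `depth` returns 2. Storing the network as a string in a single field keeps it queryable with ordinary text matching, while a parser like this recovers the structure when needed.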


    Network Representation in a Database Field


    Power laws: No. of Networks vs Network Size

    [Figure: x-axis: Network Size (Trusts-Beneficiaries), 0 to 100; y-axis: Number of Networks, ticks from 1 to 10,000 in powers of ten]
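The data behind a plot like this is a count of networks at each size. A minimal sketch, assuming each node has already been assigned a network label (the `component_of` mapping and its sample values are hypothetical):

```python
from collections import Counter


def size_distribution(component_of):
    """Return {network size: number of networks of that size}."""
    sizes = Counter(component_of.values())  # network label -> node count
    return Counter(sizes.values())          # node count -> number of networks


# Hypothetical labelling: one network of 3 nodes, one of 2, and two singletons.
component_of = {1: 1, 2: 1, 3: 1, 4: 4, 5: 4, 6: 6, 7: 7}
distribution = size_distribution(component_of)
```

Plotting the counts in `distribution` on a log scale against network size gives the kind of view shown on this slide, where many small networks and a few large ones are typical.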


    Risk Differentiation for Networks


    Related Work

    PWC Research Centre, San Jose, CA: used a network detection algorithm to assist with financial accounts audit, by highlighting high-risk entries in a general ledger

    Internal Revenue Service (IRS), US: built a database of company networks and a graphical tool to query the networks with known scheme structures

    Detecting securities fraud based on the network of relationships between brokers


    Lessons Learnt

    Continued awareness of network-level risk assessments in the forward plan of risk assessment work

    Using links has helped reduce the false positive rate for discovering non-compliant entities

    Network discovery is intuitive and has been reinvented at least 6 times across the organisation, but there are advantages in a corporate approach to get it scalable and correct