
  • Anomalous Node Detection in Homophilic Networks with Communities of Varying Size

    Juan Camilo Campos

    Dpt. Electrical Engineering and Computer Science

    Pontificia Universidad Javeriana

    Santiago de Cali, Colombia

    May 2017

    J. Campos (PUJ) Anomalous Node Detection May 2017 1 / 38

  • Motivation

    - Large interaction platforms

    - Hundreds of thousands of transactions

    - Size → easy target for fraudsters (anomalous nodes)

    - Anomalous node: someone who imitates regular user behavior in order to deceive

  • Scenario

    [Figure: three registration forms (fields: Name, Email, Password, Confirm Password) filled in for the accounts "Fake 1", "Fake 2", and "Fake 3"; later frames of the same slide are labeled "Regular users".]

  • Overview

    1 Previous concepts

    2 Problem definition

    3 The model

    4 Properties

    5 Approach A

    6 Approach B - proposed

    7 Empirical network

    8 Conclusions


  • Previous concepts

    Homophily

    [Video: networkFormation.mov]

  • Previous concepts

    Random Link Attacks (RLAs)

    [Video: networkFormationWithFrausters.mov]

  • Problem Definition

    Given:

    - A network that is divided into two communities (with strong homophilic relationships); and

    - Some anomalous nodes that perform RLAs.

    We want to:

    - Characterize the expected cohesion indices for varying community sizes, and

    - Find the anomalous nodes, i.e., users who are performing RLAs.

  • Related work

    - K. Guerrero and J. Finke, “On the formation of community structures from homophilic relationships,” IEEE Proceedings of the American Control Conference, (Montreal, Canada), pp. 5318-5323, June 2012.

    - X. Ying, X. Wu, and D. Barbará, “Spectrum based fraud detection in social networks,” Proceedings of the IEEE International Conference on Data Engineering, (Hannover, Germany), pp. 912-923, April 2011.

  • Characterize dynamics


  • The model

    $G(t) = (N, A(t))$: network at time $t$

    $N = \{1, \ldots, n\}$: set of nodes

    $A(t)$: set of edges, with $\{i, j\} \in A(t)$ if node $i$ links to node $j$

    $M(t) = [m_{i,j}(t)]$, $m_{i,j}(t) \in \{0, 1\}$: adjacency matrix

    $N_1$, $N_2$: sets of regular nodes

    $N_0$: set of anomalous nodes

    $g : N \to \{0, 1, 2\}$, $g_i$: type of node $i$

  • The model

    $A_i(t) = \{\{j', j\} \in A(t) : j' = i\}$: neighborhood of node $i$

    $R_i(t) \subseteq A_i(t)$: subset of edges established by node $i$

    $r = |R_i(t)|$: number of edges that each node establishes

    $k_i^{\delta}(t)$: number of links from node $i$ to nodes of the same type

    $k_i(t) = |A_i(t)|$: degree of node $i$
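The degree and the same-type degree defined above can be read directly off the adjacency matrix. The following is a minimal Python sketch, assuming an undirected network stored as a symmetric 0/1 matrix `M` and a list `g` of node types. The toy network and the helper names `degree` and `same_type_degree` are illustrative and do not come from the slides.

```python
# Toy undirected network (hypothetical data): 5 nodes, types g_i in {0, 1, 2}.
M = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
]
g = [1, 1, 1, 2, 2]  # types 1 and 2 are regular communities; 0 would mark an anomalous node


def degree(M, i):
    """k_i: number of edges incident to node i."""
    return sum(M[i])


def same_type_degree(M, g, i):
    """k_i^delta: number of neighbors of node i that share its type."""
    return sum(1 for j, m in enumerate(M[i]) if m == 1 and g[j] == g[i])


for i in range(len(M)):
    print(i, degree(M, i), same_type_degree(M, g, i))
```

Storing the network as a dense matrix keeps the sketch close to the notation $m_{i,j}(t)$; for large networks an adjacency list would be the more practical choice.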

  • Regular node

    $$\pi_c(t) = \frac{w_c \, k_c(t)}{\sum_{\{i,j\} \in A_i^c(t)} w_j \, k_j(t)}$$

    $$w_c = \begin{cases} w & \text{if } g_i = g_c, \\ 1 - w & \text{if } g_i \neq g_c. \end{cases}$$
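The attachment rule for regular nodes weights each candidate $c$ by its degree $k_c(t)$ and by the preference $w_c$, which equals $w$ for candidates of the same type and $1 - w$ otherwise. A minimal Python sketch of this rule follows. It assumes that $\pi_c(t)$ is the probability with which regular node $i$ picks $c$ as a link target and that the normalizing sum runs over an explicitly supplied candidate set (our reading of $A_i^c(t)$, which the slides do not spell out). The function name, the toy degrees, and the value $w = 0.8$ are hypothetical.

```python
import random


def regular_link_probabilities(i, candidates, g, k, w):
    """Return {c: pi_c} for regular node i over the given candidate nodes.

    Each candidate c gets weight w_c * k_c, where w_c = w if c has the same
    type as i and 1 - w otherwise; the weights are normalized to sum to 1.
    Assumes at least one candidate has positive degree.
    """
    weights = {c: (w if g[c] == g[i] else 1.0 - w) * k[c] for c in candidates}
    total = sum(weights.values())
    return {c: weights[c] / total for c in candidates}


# Hypothetical example: node 0 (type 1) chooses among four candidates.
g = {0: 1, 1: 1, 2: 1, 3: 2, 4: 2}   # node types
k = {1: 2, 2: 4, 3: 2, 4: 4}         # current degrees of the candidates
pi = regular_link_probabilities(0, [1, 2, 3, 4], g, k, w=0.8)

# Draw one link target according to pi.
rng = random.Random(0)
target = rng.choices(list(pi), weights=list(pi.values()), k=1)[0]
print(pi, target)
```

With $w > 0.5$, a same-type candidate is always preferred over a different-type candidate of equal degree, which is the homophilic bias the model relies on.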

  • Anomalous node

    $$\pi_c = \begin{cases} \dfrac{1 - w_a}{n_0} & \text{if } g_c = 0, \\[4pt] \dfrac{w_a}{n_1 + n_2} & \text{if } g_c \in \{1, 2\}. \end{cases}$$
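For anomalous nodes the target distribution is uniform within each group: total mass $1 - w_a$ is spread over the $n_0$ anomalous nodes and total mass $w_a$ over the $n_1 + n_2$ regular nodes. The Python sketch below illustrates this under the assumption that $\pi_c$ is the probability of choosing node $c$ as a link target and that both groups are non-empty. The function name and the example values are hypothetical.

```python
def anomalous_link_probabilities(g, w_a):
    """Return {c: pi_c} for an anomalous node choosing a link target.

    g maps each node to its type (0 = anomalous, 1 or 2 = regular).
    Assumes there is at least one anomalous and one regular node.
    """
    n0 = sum(1 for t in g.values() if t == 0)
    n_regular = sum(1 for t in g.values() if t in (1, 2))
    return {
        c: (1.0 - w_a) / n0 if t == 0 else w_a / n_regular
        for c, t in g.items()
    }


# Hypothetical example: 2 anomalous nodes, 6 regular nodes, w_a = 0.9.
g = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 2}
pi = anomalous_link_probabilities(g, w_a=0.9)
print(pi)
```

Unlike the rule for regular nodes, this distribution ignores degrees and community labels, which is what makes the resulting links random.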

  • Topological Measures

    Cohesion index

    $$h^{\delta}(t) = \frac{1}{n_{\delta}} \sum_{i \in N_{\delta}} \frac{k_i^{\delta}(t)}{k_i(t)}$$

    The average proportion of neighbors of the same type
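As an illustration of the cohesion index, the following Python sketch averages, over all nodes of one type, the fraction of each node's neighbors that share its type. It assumes a symmetric 0/1 adjacency matrix and that every node of the given type has at least one edge. The toy network and the function name are hypothetical.

```python
def cohesion_index(M, g, delta):
    """h^delta: mean of k_i^delta / k_i over all nodes i of type delta.

    Assumes every node of type delta has degree at least 1.
    """
    members = [i for i in range(len(M)) if g[i] == delta]
    ratios = []
    for i in members:
        k_i = sum(M[i])                                                  # degree of node i
        k_same = sum(M[i][j] for j in range(len(M)) if g[j] == delta)    # same-type neighbors
        ratios.append(k_same / k_i)
    return sum(ratios) / len(members)


# Toy network (hypothetical): a triangle of type 1 attached to a pair of type 2.
M = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 1, 0, 1, 0],
]
g = [1, 1, 1, 2, 2]
h1 = cohesion_index(M, g, 1)
h2 = cohesion_index(M, g, 2)
print(h1, h2)
```

In this toy network the larger group ends up with the higher index, since each of its nodes has more same-type neighbors available.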

  • Topological Measures

    Community modularity

    $$q(t) = \sum_{\delta=1}^{2} \left( \frac{|\{\{i, j\} \in A(t) : g_i = g_j = \delta\}|}{|A(t)|} - \frac{|\{\{i, j\} \in A(t) : g_i = \delta \text{ or } g_j = \delta\}|^2}{|A(t)|^2} \right)$$

    Modularity is based on the number of edges within communities compared to the number of edges between them
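The modularity expression can be evaluated by counting edges. The Python sketch below is a literal transcription of the expression as it appears on the slide, assuming that each undirected edge is listed once: for each community it takes the fraction of edges with both endpoints in the community and subtracts the squared fraction of edges with at least one endpoint in it. The function name and the toy edge list are hypothetical.

```python
def community_modularity(edges, g, types=(1, 2)):
    """Evaluate q as written on the slide, with each undirected edge listed once."""
    m = len(edges)
    q = 0.0
    for delta in types:
        # Edges with both endpoints of type delta.
        inside = sum(1 for i, j in edges if g[i] == delta and g[j] == delta)
        # Edges with at least one endpoint of type delta.
        touching = sum(1 for i, j in edges if g[i] == delta or g[j] == delta)
        q += inside / m - (touching / m) ** 2
    return q


# Toy network (hypothetical): a triangle of type 1 attached to a pair of type 2.
edges = [(0, 1), (0, 2), (1, 2), (1, 4), (2, 3), (3, 4)]
g = [1, 1, 1, 2, 2]
q = community_modularity(edges, g)
print(q)
```

Other definitions of modularity normalize the second term by degree sums rather than by edge counts, so values from this sketch are not directly comparable with those of standard library implementations.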

  • Topological properties

    Expected cohesion index

    [Figure: expected cohesion index; panels: minority group, majority group]

  • Topological properties

    Average community modularity

    [Figure: contour plot of the average community modularity (contour levels 0.05 to 0.45); horizontal axis: $n_1/n_2$ (0.2 to 1.0); vertical axis: preference $w$ (0.5 to 1.0)]

  • Spectral properties


  • Detection (Approach B)

    Edge-non-randomness

    $$f(i, j) = \|\alpha_i\|_2 \, \|\alpha_j\|_2 \cos(\alpha_i, \alpha_j)$$

    Cases illustrated: $\cos(\alpha_i, \alpha_j) \approx 0$ and $\cos(\alpha_i, \alpha_j) \approx 1$
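The slides do not define the vectors $\alpha_i$. The sketch below assumes, following the spectral approach of the Ying, Wu, and Barbará paper listed under related work, that $\alpha_i$ collects node $i$'s entries in the leading eigenvectors of the adjacency matrix. Under that assumption, edge-non-randomness reduces to the dot product $\alpha_i \cdot \alpha_j$. The toy network and all function names are hypothetical, and NumPy is used for the eigendecomposition.

```python
import numpy as np


def spectral_coordinates(M, k=2):
    """Return (eigenvalues, alpha) for the k largest eigenvalues of symmetric M.

    Row alpha[i] holds node i's entries in the k leading eigenvectors.
    """
    eigvals, eigvecs = np.linalg.eigh(M)     # eigh returns eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:k]      # indices of the k largest eigenvalues
    return eigvals[top], eigvecs[:, top]


def cosine(alpha, i, j):
    """Cosine of the angle between alpha_i and alpha_j."""
    norms = np.linalg.norm(alpha[i]) * np.linalg.norm(alpha[j])
    return float(alpha[i] @ alpha[j] / norms)


def edge_non_randomness(alpha, i, j):
    """f(i, j) = ||alpha_i||_2 * ||alpha_j||_2 * cos(alpha_i, alpha_j)."""
    norms = np.linalg.norm(alpha[i]) * np.linalg.norm(alpha[j])
    return float(norms * cosine(alpha, i, j))


# Hypothetical toy network: two 4-node cliques joined by the single edge {3, 4}.
M = np.zeros((8, 8))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                M[i, j] = 1.0
M[3, 4] = M[4, 3] = 1.0

_, alpha = spectral_coordinates(M)
print(cosine(alpha, 0, 1))               # two nodes in the same clique
print(cosine(alpha, 0, 5))               # two nodes in different cliques
print(edge_non_randomness(alpha, 3, 4))  # the bridging edge
```

In this toy network, two nodes of the same clique have a cosine of essentially 1, while nodes from different cliques have a cosine noticeably closer to 0, which matches the two cases shown on the slide.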

  • Detection (Approach A)

    Identifying Suspects

    - Degree of membership to well-defined communities based on node-non-randomness

    $$f_i(t) = \sum_{j \in A_i(t)} f(i, j) = \sum_{j=1}^{2} \lambda_j(t) \, z_{ji}^2(t)$$

    - Suspect if

    $$f_i \le B_i^E + \beta \, (B_i^V)^{1/2}$$

    $B_i^E$ and $B_i^V$: upper bounds of the expected value and variance
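The two expressions for node-non-randomness can be checked numerically. The Python sketch below assumes that $\lambda_j$ are the two largest eigenvalues of the adjacency matrix, that $z_{ji}$ is the entry of the $j$-th eigenvector at node $i$, and that $f(i, j)$ is the dot product of the corresponding spectral coordinates. The slides do not give formulas for the bounds $B_i^E$ and $B_i^V$, so the `suspects` helper takes them as inputs, and the values passed to it here are placeholders. All names and the toy network are hypothetical.

```python
import numpy as np


def node_non_randomness(M, k=2):
    """Return f_i computed two ways: summed over edges, and from the spectrum."""
    eigvals, eigvecs = np.linalg.eigh(M)         # symmetric M, ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:k]
    lam, Z = eigvals[top], eigvecs[:, top]       # Z[i, j] plays the role of z_ji
    edge_scores = Z @ Z.T                        # entry (i, j) is alpha_i . alpha_j
    f_edges = (M * edge_scores).sum(axis=1)      # sum f(i, j) over the neighbors j of i
    f_spectral = (Z ** 2) @ lam                  # sum_j lambda_j * z_ji^2
    return f_edges, f_spectral


def suspects(f, bound_mean, bound_var, beta):
    """Indices i with f_i <= B_i^E + beta * sqrt(B_i^V); the bounds are inputs."""
    threshold = np.asarray(bound_mean) + beta * np.sqrt(np.asarray(bound_var))
    return np.flatnonzero(np.asarray(f) <= threshold)


# Hypothetical toy network: two 4-node cliques plus node 8 linked into both.
M = np.zeros((9, 9))
for block in (range(0, 4), range(4, 8)):
    for i in block:
        for j in block:
            if i != j:
                M[i, j] = 1.0
for a, b in [(3, 4), (8, 1), (8, 6)]:
    M[a, b] = M[b, a] = 1.0

f_edges, f_spectral = node_non_randomness(M)
print(np.round(f_spectral, 4))

# Placeholder bounds, only to show how the threshold test is applied.
flagged = suspects([0.5, 0.1, 0.25], [0.1, 0.1, 0.1], [0.01, 0.01, 0.01], beta=2)
print(flagged)
```

The agreement between `f_edges` and `f_spectral` follows from the eigenvector equation: multiplying the adjacency matrix by an eigenvector rescales it by its eigenvalue, so the sum over neighbors collapses to the spectral form.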

  • Detection (Approach A)

    Detecting anomalous nodes

    Densest subgraph (density D = number of edges / number of nodes)

    Densities in the successive frames of the example: D = 12/8 = 1.5, D = 11/7 ≈ 1.57, D = 9/6 = 1.5, D = 8/5 = 1.6
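The slides do not name the procedure used to find the densest subgraph. As a stand-in, the Python sketch below uses greedy peeling, a standard heuristic that removes a minimum-degree node at each step and keeps the densest node set seen along the way. It should be read as an illustration of the density criterion, not as the method used in the talk. It assumes a non-empty edge list without duplicates or self-loops. The function name and the toy graph are hypothetical.

```python
def densest_subgraph_greedy(edges):
    """Greedy peeling: repeatedly drop a minimum-degree node and keep the
    node set with the highest density (edges / nodes) seen along the way."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    n_edges = len(edges)
    best_nodes, best_density = set(adj), n_edges / len(adj)
    while len(adj) > 1:
        u = min(adj, key=lambda x: len(adj[x]))   # a node of minimum degree
        n_edges -= len(adj[u])                    # its edges leave the subgraph
        for v in adj.pop(u):
            adj[v].discard(u)
        density = n_edges / len(adj)
        if density > best_density:
            best_nodes, best_density = set(adj), density
    return best_nodes, best_density


# Hypothetical toy graph: a 4-node clique {0, 1, 2, 3} with a tail 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
nodes, density = densest_subgraph_greedy(edges)
print(nodes, density)
```

Greedy peeling is not guaranteed to find the optimum in general, but it is simple and fast, which makes it a common first choice for this problem.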

  • Algorithm performance

    Performance Measures

    [Figure: sets of accused nodes and anomalous nodes; in the example, e1 = 2/3 and e2 = 1/3]

    - False positive error rate (e1): number of regular nodes accused of being anomalous over the total number of accused nodes

    - True positive rate (e2): number of anomalous nodes detected over the total number of anomalous nodes
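Both measures follow from simple set operations on the accused and the anomalous nodes. The Python sketch below reproduces the values e1 = 2/3 and e2 = 1/3 from the slide with a made-up example, assuming that both sets are non-empty. The node labels and the function name are hypothetical.

```python
def error_rates(accused, anomalous):
    """Return (e1, e2) for a set of accused nodes and the true anomalous nodes.

    e1: regular nodes among the accused, over all accused nodes.
    e2: anomalous nodes among the accused, over all anomalous nodes.
    """
    accused, anomalous = set(accused), set(anomalous)
    e1 = len(accused - anomalous) / len(accused)
    e2 = len(accused & anomalous) / len(anomalous)
    return e1, e2


# Hypothetical example: three accused nodes, only one of which is anomalous.
e1, e2 = error_rates(accused={"a", "b", "c"}, anomalous={"c", "d", "e"})
print(e1, e2)
```

A detector that accuses every node would reach e2 = 1 but at the cost of a large e1, which is why the two measures are reported together.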

  • Algorithm performance

    Performance Measures

    Area of acceptable performance

    e1 ≤ 0.05 and e2 ≥ 0.95

  • Algorithm performance (Approach A)

    Identification of suspects

  • Algorithm performance

    Detection of anomalous nodes

  • Algorithm performance

    Area of acceptable performance

    [Figure: regions labeled "bad performance" and "acceptable performance"]

  • Performance of Approach A for all generated networks

  • Node-non-randomness

    [Figure: scatter plot of node-non-randomness $f_i$ (vertical axis, 0.000 to 0.035) against degree $k_i$ (horizontal axis, 0 to 100), with two marker types ("x" and "-")]

    Suspects distinguishable from regular nodes

    Design parameter β = 2

  • Node-non-randomness