Fraud system based on big data and machine learning

Post on 16-Apr-2017

Category: Data & Analytics

TRANSCRIPT

  • 1

  • 2

  • 3

    2013

  • 4

  • 5

  • 6

    ATM, POS, INTERNET, BRANCH

  • 7

  • 8

  • 9

  • 10

  • 11

  • 12

    2014 Deloitte The Netherlands

  • ( 1

    13

    . . .

    : . .

  • 14

    ( 2

    POS

    ATM

    10% 20%

    70%

    12 10 10% 20% 65% 5%

    1% 4%

    95%

    107 12 60% 0% 5% 35%

    21 100

    .

  • ( 2

    15

    ....

  • ( 2

    16

    : . . .

    : (Concept Drift) . .

    Concept Drift

    POS

    ATM

    10% 20%

    70%

    12 50 10% 20% 65% 5% -

    1% 39%

    60%

    2 500 0% 50% 40% 10% -
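The two profile rows on this slide can be compared numerically to illustrate concept drift. The sketch below is only an illustration: it assumes the four percentages in each row are channel shares in the order ATM, POS, INTERNET, BRANCH (the order listed earlier in the deck, which the transcript does not confirm), and it uses total variation distance with a hypothetical, untuned threshold. Neither the metric nor the threshold comes from the slides.

```python
# Channel shares copied from the two profile rows on this slide.
# ASSUMPTION: the column order is ATM, POS, INTERNET, BRANCH.
CHANNELS = ("ATM", "POS", "INTERNET", "BRANCH")
old_profile = (0.10, 0.20, 0.65, 0.05)
new_profile = (0.00, 0.50, 0.40, 0.10)


def total_variation(p, q):
    """Half the L1 distance between two discrete distributions.

    0 means identical distributions, 1 means no overlap at all.
    """
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def has_drifted(p, q, threshold=0.25):
    """Flag drift when the distance exceeds a hypothetical threshold."""
    return total_variation(p, q) > threshold


if __name__ == "__main__":
    distance = total_variation(old_profile, new_profile)
    print(f"total variation distance: {distance:.2f}")
    print("drift detected:", has_drifted(old_profile, new_profile))
```

A profile-based detector would typically refresh the stored profile once drift is confirmed as legitimate behaviour, so that the new pattern stops raising alerts.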

  • ( 3

    17

    .

    .

  • ( 3

    18

    OC-SVM, SVDD

    Neural Network, SVM, Decision tree

    .
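The one-class methods named on this slide (OC-SVM, SVDD) learn a boundary around normal behaviour only. The sketch below illustrates that idea with a deliberately simplified stand-in: a sphere centred on the mean of the normal points, with a radius taken from the empirical distance distribution. It is not an SVDD or OC-SVM solver (no kernel, no optimisation), and the feature values and the `quantile` parameter are hypothetical.

```python
import math


def fit_sphere(points, quantile=0.95):
    """Return (center, radius) of a sphere around the normal points.

    The center is the centroid; the radius is the distance that covers
    roughly `quantile` of the training points.
    """
    dims = len(points[0])
    center = tuple(sum(p[d] for p in points) / len(points) for d in range(dims))
    distances = sorted(math.dist(p, center) for p in points)
    index = min(len(distances) - 1, int(quantile * len(distances)))
    return center, distances[index]


def is_outlier(point, center, radius):
    """A point outside the sphere is treated as anomalous."""
    return math.dist(point, center) > radius


if __name__ == "__main__":
    # Hypothetical, already-scaled features of legitimate transactions.
    normal = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (0.9, 0.8)]
    center, radius = fit_sphere(normal)
    print("outlier (5.0, 5.0):", is_outlier((5.0, 5.0), center, radius))
    print("outlier (1.0, 1.0):", is_outlier((1.0, 1.0), center, radius))
```

The supervised methods on the same slide (neural network, SVM, decision tree) need labelled fraud cases instead, which is the main practical difference between the two groups.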

  • (3

    19

    : .

    : . (. ) .

  • ( 4

    20

    [Figure: graph with nodes 1, 2, 3, 4 and edge labels 30, 5, 22, 2, 37]

    1 2 30

    1 3 5

    2 3 22

    2 4 2

    3 4 37
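The five rows on this slide read like a weighted edge list. A minimal sketch of loading them into an adjacency map follows; it assumes the columns are source node, target node and weight, and that edges are directed, neither of which the transcript states. The function names are illustrative.

```python
# (source, target, weight) rows copied from the slide.
# ASSUMPTION: column meaning and edge direction are not given in the source.
EDGES = [(1, 2, 30), (1, 3, 5), (2, 3, 22), (2, 4, 2), (3, 4, 37)]


def build_graph(edges):
    """Adjacency map: node -> {successor: weight}."""
    graph = {}
    for source, target, weight in edges:
        graph.setdefault(source, {})[target] = weight
        graph.setdefault(target, {})  # keep nodes that have no outgoing edge
    return graph


def weighted_out_degree(graph):
    """Total weight leaving each node."""
    return {node: sum(out.values()) for node, out in graph.items()}


def weighted_in_degree(graph):
    """Total weight arriving at each node."""
    totals = {node: 0 for node in graph}
    for out in graph.values():
        for target, weight in out.items():
            totals[target] += weight
    return totals


if __name__ == "__main__":
    graph = build_graph(EDGES)
    print("out:", weighted_out_degree(graph))
    print("in: ", weighted_in_degree(graph))
```

Per-node totals like these are a common starting point for graph features, for example spotting accounts that receive far more than they send.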

  • ( 4

    21

    : ( )

    .

    . - --.....

    . ( edge) ( node)

    .

  • ( 4

    22

    1 10 30

    2 10 5

    3 10 22

    4 10 2

    5 10 37

  • ( 4

    23

  • ( 4

    24

    Strongly Connected Component :

    Label Propagation :
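The slide only names the two graph algorithms, so the following is a plain-Python sketch of each rather than the implementation used in the deck. The example graphs are hypothetical. Strongly connected components use Kosaraju's algorithm; label propagation is the asynchronous community-detection variant with a deterministic tie-break, which is one of several possible choices.

```python
def strongly_connected_components(graph):
    """Kosaraju's algorithm; graph maps node -> list of successors."""
    order, seen = [], set()

    def visit(node):
        seen.add(node)
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                visit(nxt)
        order.append(node)  # record finishing order

    for node in graph:
        if node not in seen:
            visit(node)

    reverse = {node: [] for node in graph}
    for node, successors in graph.items():
        for nxt in successors:
            reverse.setdefault(nxt, []).append(node)

    components, assigned = [], set()
    for node in reversed(order):
        if node in assigned:
            continue
        stack, component = [node], set()
        assigned.add(node)
        while stack:
            current = stack.pop()
            component.add(current)
            for prev in reverse.get(current, ()):
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


def label_propagation(neighbours, max_rounds=20):
    """Each node repeatedly adopts the most common label among its neighbours.

    Ties keep the current label when possible, otherwise take the largest
    candidate, so the run is deterministic.
    """
    labels = {node: node for node in neighbours}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbours):
            counts = {}
            for other in neighbours[node]:
                counts[labels[other]] = counts.get(labels[other], 0) + 1
            if not counts:
                continue
            best = max(counts.values())
            candidates = [lab for lab, n in counts.items() if n == best]
            if labels[node] in candidates:
                continue
            labels[node] = max(candidates)
            changed = True
        if not changed:
            break
    return labels


# Hypothetical directed transfer graph: a cycle a-b-c and a cycle d-e.
DIRECTED = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"], "e": ["d"]}
# Hypothetical undirected graph: two triangles joined by the edge 3-4.
UNDIRECTED = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3, 5, 6], 5: [4, 6], 6: [4, 5]}

if __name__ == "__main__":
    print(strongly_connected_components(DIRECTED))
    print(label_propagation(UNDIRECTED))
```

In a transaction graph, a strongly connected component groups accounts that can all reach each other through transfers, while label propagation groups densely linked accounts without using edge direction.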

  • 25

    So what's the problem?

  • 26

  • 27

  • 28

  • 29

  • (Big Data)

    30

  • 31

  • 32

    - Ad-hoc querying and reporting
    - Data mining techniques
    - Structured data, typical sources
    - Small to mid-size datasets

    - Optimizations and predictive analytics
    - Complex statistical analysis
    - All types of data, and many sources
    - Very large datasets
    - More of a real-time

  • 34

  • 35

  • 36

  • 37

    Logistic regression in Hadoop and Spark
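Logistic regression is iterative: every pass reads the full dataset to compute a gradient. The sketch below shows that structure in plain Python, with a per-partition "map" step and a summing "reduce" step, to suggest why keeping partitions in memory between passes matters. It calls no Hadoop or Spark API, and the data, learning rate and iteration count are all hypothetical.

```python
import math
from functools import reduce


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, features):
    """Estimated probability that a row belongs to class 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def partial_gradient(partition, weights):
    """Map step: log-loss gradient summed over one partition."""
    grad = [0.0] * len(weights)
    for features, label in partition:
        error = predict(weights, features) - label
        for i, x in enumerate(features):
            grad[i] += error * x
    return grad


def add_vectors(a, b):
    """Reduce step: element-wise sum of two partial gradients."""
    return [x + y for x, y in zip(a, b)]


def train(partitions, n_features, learning_rate=0.5, iterations=200):
    """Batch gradient descent; each iteration scans every partition once."""
    weights = [0.0] * n_features
    n_rows = sum(len(part) for part in partitions)
    for _ in range(iterations):
        grads = [partial_gradient(part, weights) for part in partitions]
        total = reduce(add_vectors, grads)
        weights = [w - learning_rate * g / n_rows for w, g in zip(weights, total)]
    return weights


# Hypothetical rows: (bias term, centred and scaled amount), label 1 = fraud.
PARTITIONS = [
    [((1.0, -0.8), 0), ((1.0, -0.6), 0)],
    [((1.0, 0.6), 1), ((1.0, 0.8), 1)],
]

if __name__ == "__main__":
    weights = train(PARTITIONS, n_features=2)
    print("P(fraud | amount=+0.8):", round(predict(weights, (1.0, 0.8)), 3))
    print("P(fraud | amount=-0.8):", round(predict(weights, (1.0, -0.8)), 3))
```

A disk-based engine would reload the partitions on each of the 200 passes, whereas an in-memory engine can reuse them, which is the usual explanation for the gap such comparisons show.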

  • 38 Big Data Storymap

  • 39

    J.P. Morgan. (2014). 2014 AFP Payments Fraud and Control Survey: Report of Survey Results.

    www.ismgcorp.com 94 .

    The Forrester Wave: Enterprise Fraud Management, Forrester, 2013

    IBM Corporation. (2015). Fraud Detection & Management System: A real time actionable counter fraud decision management system. Antonio Dell'Olio (Senior IT Architect), Barbara Camandone (Client IT Manager).

    Montazer, G. A., & ArabYarmohammadi, S. (2015). Detection of phishing attacks in Iranian e-banking using a fuzzy-rough hybrid system. Applied Soft Computing, 35, 482-492. doi:10.1016/j.asoc.2015.05.059

    Alcaraz, C., Cazorla, L., & Fernandez, G. (2014). Context-Awareness Using Anomaly-Based Detectors for Smart Grid Domains. In Risks and Security of Internet and Systems (pp. 17-34). Springer International Publishing. doi:10.1007/978-3-319-17127-2_2

    Pfitzmann, B., Powers, C., & Waidner, M. (2007). IBM's Unified Governance Framework (UGF) Initiative. IBM Research Division. Research Report RZ, 3699(99709), 10.

    Kaisler, S. H., Espinosa, J. A., Armour, F., & Money, W. H. (2014, January). Advanced Analytics: Issues and Challenges in a Global Environment. In System Sciences (HICSS), 2014 47th Hawaii International Conference on (pp. 729-738). IEEE.

    Katal, A., Wazid, M., & Goudar, R. H. (2013, August). Big data: Issues, challenges, tools and Good practices. In Contemporary Computing (IC3), 2013 Sixth International Conference on (pp. 404-409). IEEE.

    Mohanty, S., Jagadeesh, M., & Srivatsa, H. (2013). Big Data Imperatives: Enterprise Big Data Warehouse, BI Implementations and Analytics. Apress.

    Chandola, V., Banerjee, A., & Kumar, V. (2009). Anomaly detection: A survey. ACM Computing Surveys (CSUR), 41(3), 15. doi:10.1145/1541880.1541882

    Kovach, S., & Ruggiero, W. V. (2011). Online Banking Fraud Detection Based on Local and Global Behavior. The Fifth International Conference on Digital Society (pp. 29-43). ACM.

    Dean, J., & Ghemawat, S. (2008). MapReduce: simplified data processing on large clusters. Communications of the ACM, 51(1), 107-113.
