fight fraud with big data analytics
DESCRIPTION
You can view the full presentation of this webinar here: http://info.datameer.com/Slideshare-Fighting-Fraud-this-Holiday-Season.html
In 2012, retailers lost $3.5 billion in revenue to online fraud. These losses spike by an estimated 20% during the holiday season. Join Datameer and Hortonworks in this webinar to learn how Big Data Analytics can be used to identify new fraud schemes during peak fraud season.
In this webinar, you will learn about:
• current challenges in identifying fraud
• what to look for in a big data solution addressing fraud
• how big data analytics can identify credit card fraud
• best practices
TRANSCRIPT
© 2013 Datameer, Inc. All rights reserved.
Fight Fraud with Big Data Analytics this Holiday Season
View Full Recording
View the full recording of this webinar at:
http://info.datameer.com/Slideshare-Fighting-Fraud-this-Holiday-Season.html
About our Speakers
Karen Hsu (@Karenhsumar)
– Karen is Senior Director, Product Marketing at Datameer. With over 15 years of experience in enterprise software, she has co-authored 4 patents and worked in a variety of engineering, marketing and sales roles.
– Most recently she came from Informatica, where she worked with the start-ups Informatica purchased to bring data quality, master data management, B2B and data security solutions to market.
– Karen has a Bachelor of Science degree in Management Science and Engineering from Stanford University.
John Kreisa (@marked_man)
– A veteran of the enterprise marketing industry, John has worked on products at every level of the IT stack, from the depths of storage through to the insight of business intelligence and analytics. Currently John leads partner and strategic marketing initiatives at open source leader Hortonworks, which develops, distributes and supports Apache Hadoop.
Agenda
• Current challenges
• What to look for in a solution addressing fraud
• Demo
• Q&A
Challenges
Merchants paying $200-250B in fraud losses annually
Banks and Financial Organizations losing $12-15B annually
eTailers lost $3.5B to online fraud
Over 20B credit card transactions annually
[Figure: "Hello my name is greg" name tag, 7-ELEVEN, and dollar amounts ranging from $3.22 to $5.15]
Location Data Transactions Authorizations POS Reports
Face of Fraud is Changing
© Hortonworks Inc. 2013
Challenges with Existing Data Architecture
[Diagram layers]
Applications: Business Analytics, Custom Applications, Packaged Applications
Data System (Repositories): RDBMS, EDW, MPP
Sources: Existing Sources (CRM, ERP, Clickstream, Logs)
Source: IDC
2.8 ZB in 2012
85% from New Data Types
15x Machine Data by 2020
40 ZB by 2020
What to Look For in a Fraud Analytics Solution
Big Data Analytics Lifecycle
Identify Use Case
1. Integrate
2. Prepare
3. Analyze
4. Visualize
Deploy
Modern Day Architecture
▪ Use Cases: Customer Analytics, Operational Analytics, Legacy Modernization, Fraud and Compliance
▪ ROI and TCO Methodology: ROI customer metrics, ROI and TCO calculator
Example use cases: Funnel Optimization, Behavioral Analytics, Fraud Prevention, EDW Optimization, Customer Segmentation
Example results:
• Increase customer conversion by 3x
• Increase revenue by 2x
• Identify $2B in potential fraud
• 98% OpEx savings
• $1M+ CapEx savings
• Lower customer acquisition costs by 30%
Define!
Polling Question 1
What use cases are you looking at or implementing today?
▪ Profiling and segmentation
▪ Product development and operations optimization
▪ Cross-sell / up-sell
▪ Campaign management
▪ Acquisition and retention
▪ EDW optimization
▪ Fraud and compliance
▪ Other
Integrate!
Codeless Integration
• Reuse existing DB views and SQL
• 50+ Datameer connectors, plug-in API
Big Data Management
• Data partitioning
• Data retention policies
Prepare and Analyze!
Interactive Data Preparation
• JSON, XML, URL-specific functions
• Multi-column joins, unions
Interactive + Smart Analytics
• 250+ built-in functions
• Automated machine learning
• SmartSampling
Transparency + Governance
• Visual data lineage
• Complete audit trail
• Metadata catalog
Visualize!
Visualization Anywhere
• Infographic or dashboard
• Run on tablets and smart phone devices
Visual Discovery
• Machine learning algorithms
Deploy!
Scheduling
• Dependency triggers
• Data synchronization
• External scheduling integration
Monitoring
• Monitoring system, jobs, performance, throughput
• Error handling
• Log management
Security
• LDAP / Active Directory
• Role based access control
• Support for Kerberos
Modern Data Architecture Enabled
[Diagram layers]
Applications: Business Analytics, Custom Applications, Packaged Applications
Data System (Repositories): RDBMS, EDW, MPP
Sources: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
Operational Tools: Manage & Monitor
Dev & Data Tools: Build & Test
Requirements for Hadoop Adoption
3 Requirements for Hadoop's Role in the Modern Data Architecture
• Integrated: Interoperable with existing data center investments
• Key Services: Platform, operational and data services essential for the enterprise
• Skills: Leverage your existing skills: development, operations, analytics
Requirements for Enterprise Hadoop
1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
3. Integrated: Engineered with existing data center investments
HORTONWORKS DATA PLATFORM (HDP)
Deployment options: OS/VM, Cloud, Appliance
Platform Services (Enterprise Readiness): High Availability, Disaster Recovery, Rolling Upgrades, Security and Snapshots
Core: HDFS, YARN, MapReduce, Tez
Data Services: Hive & HCatalog, Pig, HBase
Load & Extract: Sqoop, Flume, NFS, WebHDFS, Knox*
Operational Services: Oozie, Ambari, Falcon*
Requirements for Enterprise Hadoop
1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
• Develop: Collect, Process, Build
• Analyze: Explore, Query, Deliver
• Operate: Provision, Manage, Monitor
3. Integration: Engineered with existing data center investments
Familiar and Existing Tools
1. Key Services: Platform, operational and data services essential for the enterprise
2. Skills: Leverage your existing skills: development, analytics, operations
• Develop: Collect, Process, Build
• Analyze: Explore, Query, Deliver
• Operate: Provision, Manage, Monitor
3. Integration: Interoperable with existing data center investments
Requirements for Enterprise Hadoop
[Diagram layers]
Applications: Business Analytics, Custom Applications, Packaged Applications
Data System (Repositories): RDBMS, EDW, MPP
Sources: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
Operational Tools: Manage & Monitor
Dev & Data Tools: Build & Test
3. Integration: Engineered with existing data center investments
Integrated with:
• Applications: Business Intelligence, Developer IDEs, Data Integration
• Systems: Data Systems & Storage, Systems Management
• Platforms: Operating Systems, Virtualization, Cloud, Appliances
Datameer in the Modern Data Architecture
[Diagram layers]
Applications
Data System: RDBMS, EDW, MPP, HANA
Sources: Existing Sources (CRM, ERP, Clickstream, Logs); Emerging Sources (Sensor, Sentiment, Geo, Unstructured)
Operational Tools
Dev & Data Tools
Infrastructure
Demonstration 1
Identifying Potential Fraud
How much has been spent at a vendor? Is that spend normal?
Were there transactions… when a credit card was stolen?
Identify Outliers in Transactions
1. Calculate average and standard deviation for each category
2. Identify outliers in all transactions
Transaction Amount - Category Average > 2 * Std Dev of Category
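The two outlier steps above can be sketched in a few lines of Python. This is a minimal illustration and not the Datameer workbook shown in the demo: the function name `find_outliers`, the `(category, amount)` record layout, the sample amounts, and the use of the population standard deviation are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_outliers(transactions, k=2.0):
    """Return transactions whose amount exceeds the category average
    by more than k standard deviations of that category.

    `transactions` is a list of (category, amount) tuples (assumed layout).
    """
    # Step 1: calculate average and standard deviation for each category.
    amounts_by_category = defaultdict(list)
    for category, amount in transactions:
        amounts_by_category[category].append(amount)
    stats = {
        category: (mean(amounts), pstdev(amounts))
        for category, amounts in amounts_by_category.items()
    }

    # Step 2: flag every transaction where
    # amount - category average > k * category standard deviation.
    return [
        (category, amount)
        for category, amount in transactions
        if amount - stats[category][0] > k * stats[category][1]
    ]


# Hypothetical sample data, not taken from the webinar.
sample = [
    ("gas", 40.0), ("gas", 35.5), ("gas", 42.0), ("gas", 38.25),
    ("gas", 41.0), ("gas", 39.0), ("gas", 250.0),
    ("groceries", 52.0), ("groceries", 48.5), ("groceries", 50.25),
]

for category, amount in find_outliers(sample):
    print(f"possible fraud: {category} {amount:.2f}")
```

Only spend above the category average is flagged here, which follows the rule as written on the slide. Wrapping the difference in `abs()` would also flag unusually small transactions.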
Demonstration 2
Fraud and Data Mining on Hadoop
Clustering Column Dependencies
Decision Tree Recommendations
Demonstration 3
[Diagram labels: Model Building; PMML (models); UPPI; Datameer Server; Model Deployment; Integration / Execution]
Predictive Modeling and Datameer
Predictive Modeling and Fraud
1. Bring in model
2. Apply the model function to the data to get the likelihood that a transaction is fraudulent
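As a rough illustration of these two steps, the sketch below reads a small PMML-style regression model and applies it to one transaction record. It is not the plug-in's API and does not reflect how Datameer executes models. The embedded document is a simplified, namespace-free fragment, and the predictor names, coefficients, and helper functions `load_model` and `fraud_likelihood` are hypothetical.

```python
import math
import xml.etree.ElementTree as ET

# Simplified PMML-style model (hypothetical coefficients; real PMML files
# also carry a namespace, a header, and a data dictionary).
PMML_DOC = """
<PMML version="4.1">
  <RegressionModel functionName="regression" normalizationMethod="logit">
    <RegressionTable intercept="-4.0">
      <NumericPredictor name="amount_zscore" coefficient="1.5"/>
      <NumericPredictor name="is_foreign" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>
"""


def load_model(pmml_text):
    """Step 1: bring in the model (intercept plus per-field coefficients)."""
    table = ET.fromstring(pmml_text).find("./RegressionModel/RegressionTable")
    intercept = float(table.get("intercept"))
    coefficients = {
        predictor.get("name"): float(predictor.get("coefficient"))
        for predictor in table.findall("NumericPredictor")
    }
    return intercept, coefficients


def fraud_likelihood(model, record):
    """Step 2: apply the model to a record; the logit normalization maps
    the linear score to a value between 0 and 1."""
    intercept, coefficients = model
    linear = intercept + sum(
        coefficient * record[name] for name, coefficient in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


model = load_model(PMML_DOC)
suspicious = {"amount_zscore": 2.0, "is_foreign": 1.0}
typical = {"amount_zscore": 0.0, "is_foreign": 0.0}
print(f"suspicious: {fraud_likelihood(model, suspicious):.3f}")
print(f"typical:    {fraud_likelihood(model, typical):.3f}")
```

Keeping the model in a portable file and the scoring in a separate function mirrors the split on the slide between model building and model deployment.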
Next Steps:
More about Datameer and Big Data www.datameer.com
Get started with Datameer and Hortonworks http://hortonworks.com/hadoop-tutorial/datameer/
Contact us: John Kreisa [email protected] Karen Hsu [email protected]
Polling Question
What part of the webinar did you find the most useful?
▪ Use cases
▪ Tool ease of use of setup comparison
▪ Tool quality comparison
▪ Best practices
▪ Demonstration
Q&A
Best Practices
Calculating ROI is a process
Apply ROI to Multiple Projects
Business Benefit
Software Savings
Hardware Savings
Productivity
Project 1
Project 2
Project 3
Calculating Return
Benefits - Costs = Return
Costs: Hardware, Software, Operations, People, Integration
Benefits: Identify Fraud, Increase Sales, Improve Marketing, Increase Conversion, Improve Product, Lower IT expenses
Return: $$$, Time, Flexibility, Logistics
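The return calculation can be applied project by project, as in the short sketch below. All figures are hypothetical placeholders rather than numbers from the webinar, and the function name `project_return` and the dictionary layout are assumptions made for the example.

```python
def project_return(benefits, costs):
    """Return = Benefits - Costs, each given as a dict of line items."""
    return sum(benefits.values()) - sum(costs.values())


# Hypothetical figures in dollars, for illustration only.
projects = {
    "Project 1": {
        "benefits": {"identify_fraud": 400_000, "lower_it_expenses": 80_000},
        "costs": {"hardware": 100_000, "software": 50_000, "operations": 30_000,
                  "people": 120_000, "integration": 20_000},
    },
    "Project 2": {
        # A later project may reuse hardware that has already been paid for.
        "benefits": {"increase_conversion": 150_000},
        "costs": {"software": 50_000, "people": 60_000},
    },
}

returns = {
    name: project_return(project["benefits"], project["costs"])
    for name, project in projects.items()
}
for name, value in returns.items():
    print(f"{name}: return = ${value:,}")
print(f"Total: ${sum(returns.values()):,}")
```

Tracking each project separately makes it easier to see how shared costs, such as hardware, spread across later projects.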
Universal Plug-In Overview Features and Model Types
The Plug-in delivers a wide range of predictive analytics for high performance scoring, including:
• Decision Trees for classification and regression
• Neural Network Models: Back-Propagation, Radial-Basis Function, and Neural-Gas
• Support Vector Machines for regression, binary and multi-class classification
• Linear and Logistic Regression (binary and multinomial)
• Naïve Bayes Classifiers
• General and Generalized Linear Models
• Cox Regression Models
• Rule Set Models (flat decision trees)
• Clustering Models: Distribution-Based, Center-Based, and 2-Step Clustering
• Scorecards (including reason codes)
• Association Rules
• Multiple Models: Model ensemble, segmentation, chaining and composition
It also implements a data dictionary, missing/invalid value handling and data pre-processing.