www.fico.com Make every decision count™
INSIGHTS WHITE PAPER
Big Data: Overhyped or Underexploited?
Six analytic practices that extract more value and amplify performance
Number 81
The honeymoon of business and Big Data is over. Lately, Big Data has been the target of a bit
of backlash, including from the New York Times, Harvard Business Review, Wired and the
Financial Times. That’s probably because we’ve reached a moment of truth—a point where, early
vision aside, companies must figure out conclusively how to use Big Data analytics in profitable ways.
That can be easier said than done. Given the enormous volume of data, “signal” (useful information
for your purposes) must be separated from a whole lot of “noise” (useless information). Across the
wide variety of data, extracting more value can require adopting a diverse set of analytic techniques.
And the velocity with which Big Data accumulates and ages means analytic insights impact business
performance only to the extent they can be quickly brought into operations to drive actions.
So how are the leaders using Big Data analytics to achieve
better, faster customer decisions? Based on FICO’s work
with organizations of different sizes, in different industries
around the world, this paper examines six fundamental
practices for driving more value.
Find out how a leader boosted customer acquisition by 300% and cross-sell penetration by 15%—without raising risk
October 2014 www.fico.com
While the effort to extract business value from Big Data is not without challenges, it’s clearly
underway. Last year, two surveys of large companies found that a significant percentage of
respondents (NewVantage Partners (NVP): 90%; IDG Enterprise: 49%) already had one or more Big
Data applications in place or in process. Meanwhile, cloud-based solutions are leveling the playing
field by expanding access to Big Data analytics, infrastructure and services for organizations of all sizes.
The value of most of these investments
will be measured in terms of the ability to
make better operational decisions. In the
NVP survey, 70% of companies investing in
Big Data projects said accelerating time-to-
answer—the speed with which they can
gain insights for answering critical business
questions to enable better fact-based
decisions—was their top goal. Decision
areas ranking high as investment drivers
included sales, marketing, risk, fraud and
customer management. In the IDG survey,
59% of respondents said improving the
quality of decision making was the top goal,
and 53% cited making quicker decisions as
of primary importance. According to 20% of
respondents, Big Data projects had already
improved the quality and speed of their
decision making.
This paper covers six best practices for
obtaining more business value from the
ever-growing volume, variety and velocity
of available data. We also share stories of
companies in various industries—including
financial services, retail and telecom—at the
forefront of implementing these practices
to drive performance gains.
1. Start with a business problem in mind
Exploring huge amounts of data with Hadoop and other advanced analytic tools can be lots of fun
for data scientists. But it can be a huge waste of time and resources if the results do not translate into
something that solves real-world business problems.
To identify projects that are both promising and practical, work with business experts to understand
their challenges and opportunities. Also, understand the types of problems the various types of Big
Data and analytic techniques can solve.
For instance, while much of the Big Data buzz has been around analyzing unstructured data such
as text and speech (discussed in best practice #3), the most important source of Big Data for most
businesses is consumer transactions. Payment card, DDA/current account and loyalty program
transactions produce abundant, timely streams of data. These are replete with granular details on the
what, when, how much and how often of individual spending.
Getting Down to Business with Big Data
Behind the big data backlash is the classic hype cycle, in which a technology’s early proponents make overly grandiose claims, people sling arrows when those promises fall flat, but the technology eventually transforms the world... (The Economist, April 2014)
The digital economy is all about capturing, analyzing, and using information to serve customers. (Harvard Business Review, December 2013)
Gartner released last week its latest Hype Cycle for Emerging Technologies. Last year, big data reigned supreme, at what Gartner calls the “peak of inflated expectations.” But now big data has moved down the “trough of disillusionment”… (Forbes, August 2014)
Until recently, most analysis of such data was done by banks and other creditors for fraud
detection. Today, however, with standards-based technologies greatly reducing the cost
of processing huge amounts of streaming data, transactional analysis is being adopted for
a much wider range of purposes. Throughout this paper, we showcase examples of FICO
clients tapping into transactional data for greater insight into both the risk and reward side of
customer relationships.
[Figure 1 content: customer shopping transactions feed automatically generated customer archetypes (#1 through #200). Nina’s current transaction is allocated in real time across archetypes (example allocation values: 35.2%, 2.4%, 50.5%, 0.1%, 11.8%), and the allocation values are available as variables for time-to-event predictive models. A chart of propensity over time, starting today, for offers A, B, C and D indicates that the best offer timing is six weeks from now. A simple probability model generates a constant estimate over an extended time horizon.]
PROBLEM: Target and interact with customers earlier in the purchase cycle
SOLUTION: Increase depth and speed of analytic insights
A leading North American supermarket chain has already seen
the predictive power of time-to-event (TTE) models that pinpoint
when a customer is likely to purchase particular products. Results
include targeted offers that produced a 150% lift in average visits for
redeemers over nonredeemers.
As a next step, FICO is exploring whether a combination of
descriptive and predictive analytics will lift performance further.
The descriptive technique used, FICO’s streaming Collaborative
Profiles, analyzed stock keeping unit (SKU) data and their
hierarchical groupings to automatically discover 200 similarity-
based customer groupings (called “archetypes”). As shown
in Figure 1, this technique allocates individual customers (by
percentage of similarity) across any number of archetypes—and
updates allocations in real time with every purchase.
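As a rough illustration of allocating a customer across archetypes and refreshing that allocation with each purchase, here is a minimal Python sketch. It is not the Collaborative Profiles algorithm, which this paper does not specify; it swaps in a simple online mixture update. The archetype names, product categories, probabilities and the `learning_rate` parameter are all hypothetical.

```python
# Hypothetical archetypes: each is a probability distribution over product categories.
ARCHETYPES = {
    "thai_cook": {"noodles": 0.6, "organic_produce": 0.3, "nail_polish": 0.1},
    "organic_shopper": {"noodles": 0.1, "organic_produce": 0.8, "nail_polish": 0.1},
    "beauty_buyer": {"noodles": 0.1, "organic_produce": 0.2, "nail_polish": 0.7},
}


def uniform_allocation():
    """Start a new customer with equal weight on every archetype."""
    n = len(ARCHETYPES)
    return {name: 1.0 / n for name in ARCHETYPES}


def update_allocation(allocation, category, learning_rate=0.2):
    """Shift the allocation toward archetypes that best explain one purchase."""
    # How responsible each archetype is for this purchase (Bayes' rule).
    weighted = {a: allocation[a] * ARCHETYPES[a][category] for a in ARCHETYPES}
    total = sum(weighted.values())
    posterior = {a: w / total for a, w in weighted.items()}
    # Blend old allocation and posterior; the result still sums to 1.
    return {
        a: (1 - learning_rate) * allocation[a] + learning_rate * posterior[a]
        for a in ARCHETYPES
    }


nina = uniform_allocation()
for purchase in ["noodles", "organic_produce", "noodles"]:
    nina = update_allocation(nina, purchase)  # one update per transaction
```

Each update touches only the customer's current allocation, so no purchase history has to be stored to keep the percentages current.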
The resulting insights tell the retailer, for example, that a customer named
Nina purchases Thai foods at a certain frequency, prefers organic
products and is likely to buy nail polish while shopping for groceries.
The company can reach out to Nina at the right time in her purchase
cycle to offer recipes she’s likely to enjoy, a personalized shopping
list and discounts on ingredients. Knowing when to tuck a nail polish
e-coupon into Nina’s package is especially valuable to the retailer,
since this is a low-volume but high-margin product category.
Preliminary results indicate manifold improvements in model
performance. The biggest improvements are being seen in low-
volume SKUs, where the limited amount of data on observed
customer behavior makes it difficult to predict time-based
propensities with the plain-vanilla TTE model alone. For these
items, including archetypes in TTE models significantly improves
targeting capability.
FIGURE 1: ANALYZING TRANSACTIONAL DATA IN MORE WAYS TO DEEPEN UNDERSTANDING OF CUSTOMERS
CASE STUDY
2. Look ahead to how you’ll deploy insights in operations
To achieve real business value, you have to be able to operationalize the results of your analysis.
Although this seems obvious, far too many projects are left gathering dust or encounter delays
because it is too hard to leverage findings where they could provide value. The opportunity cost to
the company—from all the suboptimal decisions made in the interim—can be immense.
Wise selection of data is critical. What looks wonderful in the lab may not be available or may be
too expensive to obtain at the time needed for use in day-to-day business operations. Industry
regulations may affect where and how data can be used. Moreover, most analytics require extensive
calculations to be made from the raw data to turn it into useful variables and engineered predictive
features. All that has to happen in efficient, automated ways that make insights available fast enough
to drive operational decisions and actions.
Analytic development teams must carefully consider how
their models will be published and used by operations teams.
Models that rely on manually intensive data processing steps,
for instance, can cause problems at implementation. The
quality of scripts and how well they’ve been documented
may determine whether recoding is required for deployment.
And such issues can have far-reaching effects, especially in
regulated areas like lending and insurance underwriting,
where they make it difficult to explain and defend data-driven
decisions to auditors and customers.
Technology advances are helping organizations avoid these problems and speed up analytic lifecycle
processes. In the past, for instance, it wasn’t just models that might have to be recoded for target
applications. Each customer characteristic usually also had to be implemented separately, sometimes
requiring several hours for coding and testing. The current state-of-the-art is to use the same
characteristic coding in both development and production—enabling deployment of thousands of
characteristics at once in about the same amount of time it used to take for just one.
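One way to picture using the same characteristic coding in development and production is a single registry of definitions that both paths call. The sketch below assumes this design for illustration only; the characteristic names and the two helper functions are hypothetical and do not describe any particular product.

```python
from statistics import mean

# Single source of truth: each characteristic is defined once as a function of
# an account's transaction amounts. All names here are hypothetical.
CHARACTERISTICS = {
    "txn_count": lambda amounts: len(amounts),
    "avg_amount": lambda amounts: mean(amounts) if amounts else 0.0,
    "max_amount": lambda amounts: max(amounts, default=0.0),
}


def compute_characteristics(amounts):
    """Evaluate every registered characteristic for one account."""
    return {name: fn(amounts) for name, fn in CHARACTERISTICS.items()}


def build_development_dataset(accounts):
    """Batch path used by modelers: one row of characteristics per account."""
    return {acct: compute_characteristics(amts) for acct, amts in accounts.items()}


def score_in_production(amounts):
    """Real-time path: reuses the same definitions, so nothing is recoded."""
    return compute_characteristics(amounts)


accounts = {"A1": [20.0, 35.0, 5.0], "A2": []}
dev_rows = build_development_dataset(accounts)
```

Adding a characteristic to the registry makes it available to both paths at once, which is the property the paragraph above describes.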
A key reason for this improvement is widespread use of business rules management, enabling
applications to execute or access analytic models as part of making decisions. As a result,
implementations increasingly revolve around deployable analytic libraries—sets of models,
characteristics and business rules codifying additional logic needed for production. In some cases,
everything needed for a scorecard model (good/bad schema, log odds slope and intercept,
performance window, scoring exclusions, and characteristics and their associated bins with multiple ranges,
unexpected flags and reason codes, etc.) can be imported as a single package, ready for consumption
by the application. These streamlined methods not only reduce time to operational value, but also
make analytic work easier to share and reuse for multiple purposes.
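To make the idea of a scorecard package concrete, the following Python sketch bundles a few of the listed elements (slope and intercept, binned characteristics, unexpected-value flags and reason codes) into one structure that an application can score against. The layout, the numbers and the reason codes are assumptions for illustration; the good/bad schema, performance window and scoring exclusions are omitted.

```python
import math

# Hypothetical scorecard package; every number and reason code is illustrative.
# Each bin is (low inclusive, high exclusive, log-odds weight, reason code or None).
SCORECARD = {
    "intercept": 600.0,             # score when the log odds are 0
    "slope": 20.0 / math.log(2.0),  # 20 points for each doubling of the odds
    "base_log_odds": 1.0,
    "characteristics": {
        "utilization": {
            "bins": [(0.0, 0.3, 0.5, None), (0.3, 0.7, 0.0, None), (0.7, 10.0, -0.8, "R01")],
            "unexpected": (-0.4, "R99"),  # used for missing or out-of-range values
        },
        "months_on_file": {
            "bins": [(0, 12, -0.6, "R02"), (12, 1200, 0.4, None)],
            "unexpected": (-0.3, "R98"),
        },
    },
}


def score(applicant, scorecard=SCORECARD):
    """Return (score, reason codes) for one applicant dictionary."""
    log_odds = scorecard["base_log_odds"]
    reasons = []
    for name, spec in scorecard["characteristics"].items():
        weight, reason = spec["unexpected"]  # default until a bin matches
        value = applicant.get(name)
        if value is not None:
            for low, high, bin_weight, bin_reason in spec["bins"]:
                if low <= value < high:
                    weight, reason = bin_weight, bin_reason
                    break
        log_odds += weight
        if reason is not None:
            reasons.append(reason)
    return scorecard["intercept"] + scorecard["slope"] * log_odds, reasons


good_score, good_reasons = score({"utilization": 0.1, "months_on_file": 60})
bad_score, bad_reasons = score({"utilization": 0.9})  # months_on_file is missing
```

Because the whole model lives in one data structure, deploying a new version means importing a new package rather than recoding logic in the application.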
In addition to being able to deploy analytics quickly, having a mechanism in place to enforce best
practices in model lifecycle management helps avoid development and implementation delays. It
also reduces the time and cost of regulatory compliance.
Our survey suggests most companies aren’t yet able to quickly introduce change to their operational systems, but that they are working to do better. (The Era of Intimate Customer Decision is at Hand, Forrester Consulting, June 2013)
Centralized model management, depicted in
Figure 2, automates manual tasks and accelerates
lifecycle processes. It captures granular detail
about the tasks performed and decisions made
at each step—data quality analysis, new variable/
predictor generation, segmentation analysis,
model engineering, pre-deployment testing,
periodic validation and updates. State-of-the-art
solutions make it possible to enforce consistent,
approved methods across the lifecycle of all
models, while giving business units and analytic
teams the flexibility to create workflows that fit
their needs.
Companies can also keep a detailed inventory of
every model in their operational environment.
With centralized management, they can even
monitor which predictive characteristics are in
use by which models, evaluate performance and
stability over time and manage the downstream
effects of characteristics changes. A top-five
US bank we work with considers the ability to
comprehensively manage not only models, but
also characteristics as essential for maximizing
analytic value.
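A minimal sketch of the bookkeeping behind such an inventory follows: it inverts a model-to-characteristics mapping so that a change to one characteristic can be traced to every model that consumes it. The model and characteristic names are hypothetical, and a real model management system would track far more (versions, validation status, performance history).

```python
from collections import defaultdict

# Hypothetical inventory: model name -> characteristics it consumes.
MODEL_INVENTORY = {
    "card_risk_v3": ["utilization", "months_on_file"],
    "cross_sell_v1": ["utilization", "avg_balance"],
    "fraud_v7": ["txn_velocity"],
}


def build_usage_index(inventory):
    """Invert the inventory: characteristic -> sorted list of models using it."""
    index = defaultdict(list)
    for model, characteristics in inventory.items():
        for characteristic in characteristics:
            index[characteristic].append(model)
    return {c: sorted(models) for c, models in index.items()}


def impacted_models(changed_characteristics, usage_index):
    """Models that need review when the given characteristics change."""
    impacted = set()
    for characteristic in changed_characteristics:
        impacted.update(usage_index.get(characteristic, []))
    return sorted(impacted)


usage = build_usage_index(MODEL_INVENTORY)
to_review = impacted_models(["utilization"], usage)
```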
FIGURE 2: SHARED INFRASTRUCTURE FOR ENTERPRISE-WIDE MODEL LIFECYCLE GOVERNANCE
[Figure 2 content, components shown: Model Data Mart; Development & Calibration; Deployment & Verification; Tracking; Monitoring; Ongoing Validation; Management Reporting; Alerts; Decision Simulation; Decision Execution; Scoring Services; Decision Optimization. Layer labels: Advanced, Professional, Decisioning, Development, Foundation.]
Source: FICO® Model Central™
CASE STUDY
PROBLEM: Acquire and cultivate more profitable customers
SOLUTION: Eliminate silos via a shared analytic learning hub
With consumer credit behavior changing and competition for good
customers intensifying, a super-regional bank needed a breakthrough
in cross-selling, while also reaching out to additional population
segments for new customers. Both depended on eliminating the
inefficiencies that occurred when individuals pre-approved for credit
did not qualify in underwriting for those offers.
Today the bank targets, for both acquisition and cross-selling,
customers who are likely to be approved by underwriting. A shared analytic learning hub,
shown in Figure 3, enables marketing and originations
to base decisions on the same criteria. It also continuously captures
operational data and outcomes. Analyzing current customer
responses to credit decisions, business experts edit business rules
weekly to improve strategies. Marketing and originations learn from
each other’s results and work toward common profit and loss (P&L)
objectives for driving portfolio growth.
In the first year after deployment, acquisition volume rose 300%,
cross-sell penetration 15% and average balance per account 8%. And
by getting the right offers to the right individuals, the bank has been
able to maintain a steady risk profile.
3. Leverage analytic innovation
Innovations in Big Data processing and analytics are transforming how businesses get value from
their customer data. We’re seeing a shift from approaches that supply periodic snapshots in the
form of descriptive reports and dashboards (what happened) to systems that continuously analyze
incoming data to produce predictions (what is likely to happen) and prescriptions (what to do about
it) that are actionable in real time.
Many types of analytics will increasingly operate inside production streams. Relying less on persistent
historical data, they’ll respond more to changes in the current environment. Analytic outputs will be
combined with complex event processing to enable very rapid responses to customer behavior.
Big Data tools and infrastructure are also making it easier to apply machine learning techniques to
explore huge datasets that include a wide variety of structured and unstructured data. The right
balance of these techniques with human analytic and domain expertise not only lifts business
performance but also improves the ability of companies to learn at a fast pace from data-driven
experiments.
Here are a few highlights of advanced techniques delivering tangible business value:
Fraud detection is at the vanguard of Big Data analytics for business. Fraud management
systems have analyzed huge amounts of streaming transaction data for decades, and have continued
to incorporate leading-edge innovations. FICO® Falcon® Fraud Manager models, for instance, rely
on transaction profiles that summarize data in the stream as it passes by in order to compute the
pertinent fraud feature variables without relying on the persistence of data in production. Initially
applied to customer accounts, the technique is now extensible to other entities, such as merchants,
ATMs and point-of-sale terminals, providing a more complete picture of payment card transactions. A
“bolt-on” adaptive model layer automatically adjusts its model feature weights based on production
data, improving sensitivity to emerging fraud patterns. Self-calibrating technologies for both profiles
and models increase detection accuracy where service/channel usage and other customer behaviors
are changing.
[Figure 3 content, populations: Existing Customers; Bank Footprint Prospects; Marketing Prospects. Flow: Credit Offers Taken to Market; Direct Mail Responders; Prescreen of One Responders; Test and Learn Responders; Originations Management System; Booked Accounts; Acquisition Performance Tracking; Originations Performance Tracking. Analytic Learning Hub: Analytics (1. Risk Score, 2. Economic Impact Model, 3. Segmentation Model, 4. Action-Effect Model); Analytic Data Mart (1. Tracking, 2. Simulation, 3. Learning); Decision Strategies for Acquisitions and Originations (1. Direct Mail, 2. Prescreen of One, 3. Test and Learn, 4. Accept/Reject, 5. Initial Credit Line, 6. Test and Learn).]
FIGURE 3: ANALYTIC LEARNING HUB ENABLES MARKETING AND ORIGINATIONS TO SHARE DATA, STRATEGIES AND RESULTS
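The notion of a transaction profile that summarizes the stream as it passes, without storing the transactions themselves, can be sketched as a small set of recursively updated, time-decayed variables. This is a generic illustration under stated assumptions, not the Falcon profile design, which the paper does not detail; the class, its fields and the 24-hour half-life are hypothetical.

```python
class TransactionProfile:
    """Compact per-entity summary updated in place; no history is stored."""

    def __init__(self):
        self.last_time = 0.0       # time of the previous transaction, in hours
        self.decayed_count = 0.0   # recency-weighted number of transactions
        self.decayed_amount = 0.0  # recency-weighted spend

    def features(self, amount):
        """Feature variables for an incoming transaction, as of the last update."""
        if self.decayed_count == 0.0:
            return {"velocity": 0.0, "amount_ratio": 0.0}
        typical = self.decayed_amount / self.decayed_count
        ratio = amount / typical if typical else 0.0
        return {"velocity": self.decayed_count, "amount_ratio": ratio}

    def update(self, time_hours, amount, half_life_hours=24.0):
        """Age the existing summary, then fold in the new transaction."""
        decay = 0.5 ** ((time_hours - self.last_time) / half_life_hours)
        self.decayed_count = self.decayed_count * decay + 1.0
        self.decayed_amount = self.decayed_amount * decay + amount
        self.last_time = time_hours


profile = TransactionProfile()
profile.update(0.0, 40.0)
profile.update(24.0, 60.0)
incoming = profile.features(400.0)  # an unusually large purchase for this profile
```

The same class could be keyed by merchant or terminal instead of account, which is one way to read the extension to other entities described above.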
Unstructured data analytics can increase model
predictiveness. Up to 80% of the Big Data available to
businesses is text, speech, video and other unstructured data.
A growing number of automated techniques for transforming
these inputs into numerical representations can be used
with statistical analysis to discover predictive features. Other
techniques find patterns without such transformation, including
from a messy, mixed bag of different types of data.
Either way, features and patterns from unstructured data
can be combined with those from traditional structured
data into predictive models. In one project (see Figure
4), FICO demonstrated that a risk scorecard imbued with
text-extracted insights lifted predictiveness by 8% over a
traditional scorecard. In another project, analyzing notes
from sales inquiries, the addition of text insights enabled
a scorecard to identify 3% more leads resulting in sales.
Named entity extraction, a complementary text analytic
technique, identified individuals likely to have authority to
make a purchase decision—a strong predictor for improving
intelligent automated lead generation.
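As one simple example of transforming text into a numerical representation, the sketch below converts free-text notes into term-count vectors that could sit alongside structured variables in a model. It uses a plain bag-of-words approach chosen for brevity, not the techniques used in the projects above; the notes and the `min_count` threshold are made up.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a note and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(notes, min_count=2):
    """Keep terms that appear in at least min_count different notes."""
    doc_freq = Counter(term for note in notes for term in set(tokenize(note)))
    return sorted(term for term, count in doc_freq.items() if count >= min_count)


def vectorize(note, vocabulary):
    """Represent one note as term counts aligned to the vocabulary order."""
    counts = Counter(tokenize(note))
    return [counts[term] for term in vocabulary]


# Hypothetical account notes.
notes = [
    "Customer disputes late fee, promises payment Friday",
    "Late payment again; customer unreachable",
    "Customer requested higher limit",
]
vocabulary = build_vocabulary(notes)
vectors = [vectorize(note, vocabulary) for note in notes]
```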
Machine learning can speed improvement cycles. Champion-challenger contests (pitting
the current best-performing strategy against proposed alternatives) are a widely used method of
improving data-driven decisions. But to accelerate learning and provide even more momentum for
performance improvement, they should incorporate some amount of deliberate experimentation.
That’s the only way to introduce enough diversity into the resulting outcome data to analyze causal
relationships (this change in action A causes outcome Y to change in this specific way). Machine
learning algorithms can help by automatically generating challenger strategies that maximize
learning speed within company-specified constraints on testing cost and risk.
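The deliberate experimentation described here comes down to routing a controlled share of decisions to challenger strategies at random. The sketch below shows only that assignment step, with a cap on the tested share standing in for the cost and risk constraints; it does not generate challengers with machine learning. The strategy names, the 10% cap and the seed are hypothetical.

```python
import random


def assign_strategy(rng, challengers, test_fraction=0.1):
    """Send most accounts to the champion and a capped share to challengers."""
    if challengers and rng.random() < test_fraction:
        return rng.choice(challengers)  # random choice keeps test groups comparable
    return "champion"


def run_assignment(account_ids, challengers, test_fraction=0.1, seed=42):
    """Assign every account to a strategy; seeded so results are reproducible."""
    rng = random.Random(seed)
    return {acct: assign_strategy(rng, challengers, test_fraction) for acct in account_ids}


assignments = run_assignment(range(10_000), ["challenger_a", "challenger_b"])
share_tested = sum(s != "champion" for s in assignments.values()) / len(assignments)
```

Random assignment is what allows differences in outcomes to be attributed to the strategy rather than to the kind of account that received it.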
4. Embrace analytic diversity
R, Python, Hive, Groovy, Scala, MATLAB, SQL, SAS. One of the side effects of the exploding world of
analytic innovation is that taking advantage of the latest techniques often requires learning a new
set of tools. Analytic teams will inevitably need to use multiple development methods to deliver the
insights the business needs.
FIGURE 4: INSIGHTS FROM UNSTRUCTURED DATA ANALYTICS CAN RAISE PREDICTIVE ACCURACY
Analyzing hidden signals in text lifts predictive performance.
[Chart: percentage of “bads” (vertical axis, 0% to 100%) plotted against percentage of “goods” (horizontal axis, 0% to 100%) for a traditional scorecard and a semantic scorecard.]
At any percentage of “goods” (current, fully paid accounts), the scorecard with text insights predicts a higher percentage of “bads” (charged-off and defaulted accounts) than the traditional scorecard.
It’s also clear that combining different types of analytic techniques often delivers superior results.
In the retail case study under best practice #1, for instance, we described the benefits of using
Collaborative Profiles with time-to-event (TTE) predictive models. In this implementation, individual
customer allocation values, updated in real time with every transaction, become a pool of potential
variables for the tens of thousands of TTE models. During model generation and refresh, these
variables are automatically selected and combined with other variables based on how strongly they
predict the target outcome of that model (say, the purchase of noodles within the next 10 days). For
some product categories, this combination of techniques improved predictiveness over the TTE model
alone by as much as a factor of four.
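Automatic selection of variables by predictive strength can be illustrated with a simple ranking step. The sketch below scores each candidate by its absolute Pearson correlation with the target and keeps the strongest ones. Correlation is a stand-in chosen for brevity; the paper does not say which measure is used, and the variable names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_variables(candidates, target, top_k=2):
    """Return the top_k candidate names ranked by absolute correlation with the target."""
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson(candidates[name], target)),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical data: one value per customer. The target is 1 if the customer
# bought noodles within the next 10 days, else 0.
candidates = {
    "archetype_thai_cook": [0.7, 0.1, 0.6, 0.0, 0.8, 0.2],
    "archetype_beauty_buyer": [0.1, 0.5, 0.2, 0.4, 0.1, 0.3],
    "days_since_last_visit": [3, 4, 5, 3, 4, 5],
}
bought_noodles = [1, 0, 1, 0, 1, 0]
selected = select_variables(candidates, bought_noodles)
```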
In addition, Collaborative Profiles substantially improve the ability to predict new behaviors that
are probable but never before observed for a particular customer. They also pick up early signs
of behavioral change. A significant shift in archetype allocation could indicate, for example, that
the customer is transitioning to a vegan diet or a new baby has joined the household. Both are
opportunities for the retailer to adjust its offers for greater relevance and value to the customer.
To get multiple types of analytic models to work together like that in an efficient development
environment and robust production environment, you need a flexible infrastructure that embraces
diversity. Fundamental requirements include the ability to operationalize models authored by a wide
range of tools by supporting extensible libraries, web services and standards such as the Predictive
Modeling Markup Language (PMML). Centralized lifecycle management should extend across
models, business rules and analytic assets from any source.
5. Leverage cloud services and productivity platforms
Creating Big Data analytics no longer requires making a huge investment in expensive infrastructure
and specialized skills. By leveraging cloud services, companies can let a dedicated third party securely
handle the underlying systems and services, paying just for the capacity and services they need.
In the bank case study under best practice #2, the shared analytic learning hub for marketing and originations
was rapidly deployed using FICO hosting (similar infrastructure-as-a-service is also available today
through the FICO® Analytic Cloud). In addition, the open, hub-based architecture is a quicker,
less costly way to improve cross-functional visibility and coordination than traditional one-to-one
systems integration.
CASE STUDY
PROBLEM: Improve customer lifetime value and cash flow
SOLUTION: Balance risk and reward by optimizing customer acquisition decisions
A global telecom company is leading the way for an industry
transitioning from years of go-go growth to a new era of deliberate,
precise management of risk and reward. Intense competitive pressure
in saturated markets had led to marketing campaigns netting too
many customers who ended up in collections and/or cost the company
more than their accounts were worth.
To make more profitable originations decisions, the company
is moving beyond traditional credit classes to more granular
analytic segmentation that separates populations by credit risk
and customer lifetime value. To do it, it’s tapping an increasingly
wide range of data, including customer transactions and service
interactions. And it’s using analytics not only to predict customer
behavior, but also to balance all the key elements of risk and
reward in an originations decision to prescribe the best action for
maximizing discounted cash flow over time.
These deeper analytic insights are also helping the company reduce
attrition of valuable customers. Now the company knows when
allowing some leniency (such as to an otherwise current account
that falls behind during the holidays) is likely to encourage long-term
customer loyalty and profitability.
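Prescribing the action that maximizes discounted cash flow, as in the telecom case study above, can be illustrated with a small expected-value calculation. The sketch below is a simplified assumption, not the company's model: the three actions, their costs and margins, the 24-month horizon and the 10% annual discount rate are all hypothetical, and default is treated as a single up-front probability.

```python
def discounted_cash_flow(monthly_cash_flows, annual_rate=0.10):
    """Present value of a series of month-end cash flows."""
    monthly_rate = (1 + annual_rate) ** (1 / 12) - 1
    return sum(
        cash / (1 + monthly_rate) ** month
        for month, cash in enumerate(monthly_cash_flows, start=1)
    )


# Hypothetical originations actions and their economics.
ACTIONS = {
    "decline": {"upfront_cost": 0.0, "monthly_margin": 0.0, "loss_if_default": 0.0},
    "prepaid_plan": {"upfront_cost": 20.0, "monthly_margin": 8.0, "loss_if_default": 0.0},
    "postpaid_with_device": {"upfront_cost": 200.0, "monthly_margin": 30.0, "loss_if_default": 400.0},
}


def expected_value(action, p_default, months=24, annual_rate=0.10):
    """Expected discounted cash flow of an action for a given default probability."""
    spec = ACTIONS[action]
    margin_value = discounted_cash_flow([spec["monthly_margin"]] * months, annual_rate)
    return (
        -spec["upfront_cost"]
        + (1 - p_default) * margin_value
        - p_default * spec["loss_if_default"]
    )


def best_action(p_default):
    """Pick the action with the highest expected discounted cash flow."""
    return max(ACTIONS, key=lambda action: expected_value(action, p_default))
```

Under these made-up numbers, a low-risk applicant is offered the postpaid plan with a device, a riskier one is steered to prepaid, and a very high-risk one is declined, which is the kind of risk and reward balancing the case study describes.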
For more information: www.fico.com
North America: +1 888 342 6336
Latin America & Caribbean: +55 11 5189 8222
Europe, Middle East & Africa: +44 (0) 207 940 8718
Asia Pacific: +65 6422 7700
FICO, Falcon, Model Central and “Make every decision count” are trademarks or registered trademarks of Fair Isaac Corporation in the United States and in other countries. Other product and company names herein may be trademarks of their respective owners. © 2014 Fair Isaac Corporation. All rights reserved.
4057WP 10/14 PDF
The Insights white paper series provides briefings on research findings and product development directions from FICO. To subscribe, go to www.fico.com/insights.
Unless analytics are to interact only with applications, you also need tools for packaging analytic
services for business users. Today’s application development productivity platforms (available for site-
install or via cloud services) provide everything needed to create complete applications, including
user forms and workflows powered by the analytic models.
6. Give control to the business experts
Everything we’ve discussed so far produces substantial value only when companies nail this final
best practice. The whole point of Big Data analytics is to give business experts new insights they can
quickly turn into decision strategies that ultimately improve results with customers.
For instance, visual tools for building strategies (decision trees) enable business experts to quickly
segment customer populations using any mix of policies and data-driven insights. A direct
marketing company that deployed FICO® Analytic Modeler Decision Tree Professional was able to
re-segment its customer database, extending credit to pockets of customers previously misidentified
as high-risk—generating nearly $12 million in incremental sales in just four months.
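A decision strategy of this kind is essentially a tree of business rules that ends in actions. The sketch below represents one as nested dictionaries and walks it for a single customer. It is an illustrative assumption and not the output format of any FICO tool; the fields, thresholds and actions are hypothetical.

```python
# Hypothetical strategy tree: internal nodes test a field against a threshold,
# and leaves (plain strings) are the actions to take.
STRATEGY = {
    "field": "risk_score",
    "threshold": 620,
    "below": {
        "field": "months_as_customer",
        "threshold": 24,
        "below": "decline",
        "at_or_above": "approve_low_limit",  # long-tenured pocket within a low-score group
    },
    "at_or_above": "approve_standard_limit",
}


def decide(customer, node=STRATEGY):
    """Walk the tree from the root until an action (leaf) is reached."""
    while isinstance(node, dict):
        branch = "below" if customer[node["field"]] < node["threshold"] else "at_or_above"
        node = node[branch]
    return node


loyal_low_score = decide({"risk_score": 600, "months_as_customer": 36})
```

Because the tree is plain data, a business expert can change a threshold or add a branch without touching the code that executes it.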
A European bank currently automating its originations process provides another example. Its leap
from manual methods to industry best practices—including use of visual strategy development
tools—is a major transformation. But this bank will avoid the “silos” of information that still challenge
many institutions that automated earlier. It will support its ambitious growth plans with systems
that make it easy for product marketing and risk management to collaboratively develop, test and
improve operational decision strategies.
Conclusion
The value of Big Data to business is easy to understand. But it’s not as easy to extract customer
insights from immense stores and incoming streams of data in an actionable form, and in time to
make a difference. Fortunately a reliable set of best practices for Big Data analytics is proving itself
in industries and markets around the world. There’s no need to “reinvent the wheel”; just take
advantage of its momentum.
To learn more about best practices for Big Data analytics, visit the FICO Blog and read these other
Insights white papers:
• Harnessing the Speech Analytics Advantage (No. 76)
• Cloud Democratizes Access to Big Data Analytics (No. 74)
• Extracting Value from Unstructured Data (No. 71)
• When Is Big Data the Way to Customer Centricity? (No. 67)