Having it All is not Having it All at All!
Problem Formulation in the Face of Overwhelming
Quantities of Data
A journey of discovery… Where’s the fire?
START FROM THE BEGINNING -- “Before the beginning of great brilliance, there must be Chaos.” -- (I Ching)
“At the beginning of the 21st century, the population of the Earth [was] 6,300,000,000, who annually experience a reported 7,000,000–8,000,000 fires with 70,000–80,000 fire deaths and 500,000–800,000 fire injuries.”
Dr. Ing. Peter Wagner, 2006
Data everywhere.
Who knew?
Gone are the days when there was a single source of “truth”…
Baker Library
Entries in a book on Australia business owners
About a storekeeper in Halifax County, N.C. – June 1873:
“purchaser of stolen goods, a great scamp.”
Entry about one J. B. Alford, who sold groceries and liquors: June 1870
“This man is said to be in thriving circumstances. He has some Real & personal estate & I think it is safe to trust him.”
Entry on Hannah Griffith, a milliner in Springfield, Ill. In 1869
“about to marry a fellow [of] no account.”
An entry two years later noted with some relief, that that plan had fallen through.
Harvard R.G. Dun Credit Report Collection
“is not much of a businessman, but had some capital, it is said, advanced by his father, who is reputed well off” -- About J.D. Rockefeller, who turned out to be a good credit risk; 1863 was the year he set up a refinery that blossomed into Standard Oil.
Hold on… things are changing.
Framing our case for change…
• We all know that the world is changing
• We are aware that the rate of change is increasing at an unprecedented rate
• We see new types of data, technologies, and behaviors every day
• More and more, we are tasked with discerning the discoverable need from the articulated want
The Operating Environment
• What has made us successful so far is insufficient
• We now have the ability to succeed… or fail, much faster
• The connectedness of information and the ways in which it is changing are affecting the risk and opportunity space in ways we are only beginning to understand
The Case for Change
Sometimes, a picture is worth a thousand words.
Pope Benedict Inauguration
Pope Francis Inauguration
Lately, a thousand pictures are taken in the time it takes to speak a single word!
• What about the digital footprint of all of the smartphones?
• What about the social networks of the crowd?
• What about the metadata in the photos?
• What are the opportunity costs to other activities?
• The largest corpus of data preceded the event
• Most data created about the event had significant and asymmetric latency
• The rate of “data decay” attributable to the participants in the event is significant
Asking the right question
How deep would the ocean be if sponges didn’t live there?
What if the Hokey Pokey really is what it’s all about?
What if there were no hypothetical questions?
How many more of these silly questions till the next slide?
Questions about risk and opportunity are at the heart of our focus.
What other companies is this individual associated with?
How do I identify changes with my current contact relationships?
Who is the right decision maker at this company and how do I effectively reach them?
How can I de-dupe my current customer base at the contact level?
I need answers!
I need insights about a contact to help me target my messaging
What other risks should I know about before doing business with this small company?
Should I extend credit?
What about fraud?
What is the right credit limit?
What do my best customers look like?
Which customers should I call on next?
Which prospects are most promising?
It is extremely important to frame the question in the right context.
The right universe of data is often implied by the scope and context of the question.
Foundational Firmographic
• Business Name
• Telephone
• Address
• SIC
• Employee Size
• Sales Revenue
• Year Started
• Primary Contact
• Linkage
• Unit of Analysis: Set of matched results
• Response Variables
• CC = Confidence Code Attribution
• MG = MatchGrade Attribution
• WACC = Weighted Average Confidence Code
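The weighted average confidence code can be made concrete with a small calculation. The sketch below assumes the weights are the number of matched records at each confidence code, which the slide does not state; the function name and the sample counts are hypothetical.

```python
def weighted_average_confidence_code(counts_by_code):
    """Return the record-weighted mean confidence code for a set of matched results.

    Assumption: `counts_by_code` maps each integer confidence code to the
    number of matched records that received it, and that count is the weight.
    """
    total = sum(counts_by_code.values())
    if total == 0:
        raise ValueError("no matched results to summarize")
    return sum(code * n for code, n in counts_by_code.items()) / total


# Hypothetical batch of six matched records.
sample_counts = {10: 2, 8: 3, 4: 1}
wacc = weighted_average_confidence_code(sample_counts)
```

Computing the same figure per rational subgroup would let batches be compared cluster by cluster rather than only in aggregate.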
Rational Subgroups
• By Confidence Code Cluster
• By MatchGrade “cousin” cluster within Confidence Code
• Potential Explainable Factors:
  • Cleansing Process – things we do to the Korean text which may cause it to be ‘less matchable’
  • Candidate retrieval methods that we use
  • Evaluation & Decisioning – we may need to adjust our definition of A / B / F for Korea
  • Availability of AME-K data
  • Distribution bias in aggregate file behavior of scoring system
  • MatchGrade mappings
– Unknown or ignored, potentially explainable, causes of variation
• Unexplainable
  • Quality of customer input
  • Completeness of customer input
  • Emergence of new jargon/Acronyms
  • New Chinese Idioms
  • Statutory changes
  • Differences in privacy expectations
  • Differences in word order, sound, stroke weight
• Data in hand
• Discoverable data
• Computable data
• Extant, unavailable data (opportunity cost)
• Understanding of cause systems
• Relevant theory
D&B Proprietary information
Veracity: How do I adjudicate the truth when the malfeasants are learning so much faster?
Volume: How much data is “too much” to see the answer?
Velocity: Can the rate of change of data itself be part of the answer?
Variety: How can heterogeneous and unstructured data inform new ways of inquiry?
Leveraging the “V’s” to get to the best answer
A typical M&A takes 6-9 months from announcement to deal completion
• Some take longer, or may never close
• Regulatory requirements sometimes drive pre- and post- close changes over years
Family trees updated as the deal completes
• Average update within 10 days
• Linkage updates frequently precede official registry changes
• Updates include re-linking records, re-structuring tree levels, marking entities as out of business, and creating new entities
Announced restructuring and re-organizations often take 6 months to 2 years
A good example can be seen in tracking mergers, acquisitions, and divestitures.
Traditional analysis of this data can reveal interesting risks
CITGO PETROLEUM CORPORATION (Texas, USA)
PDV AMERICA, INC (Oklahoma, USA)
Propernyn B.V. (Netherlands)
3 additional subsidiary levels
National Government: Republic of Venezuela
Combining the articulated want (family tree) with the discoverable need (what’s really going on)…
Ceramics Inc: 50 Employees, Glass Mfr, Wichita, Kansas
Medi-Cell: 125 Employees, Lab Equip Mfr., Abayance, FL
AdvDesigns AG: 30 Employees, R&D, Stem Cell Rsrch, Frankfurt, Germany
Mediquip: 1000 Employees
Monsanto: 500 member family tree, Largest Genetically modified food producer
Pending Decision: Underwrite Directors and Officers Policy
49% 30%
The story is true. The names have been changed to protect the innocent.
Language, identity, and intention can significantly impact the complexity of the situation.
株式会社カワサキモータースジャパン “Kabushikigaisha Kawasaki Mōtāsu Jyapan”
(aka Kawasaki Motors Japan)
한국가와사키 “Hanguggawasaki”
(aka Kawasaki Korea)
川崎重工咨询 “Chuanqi Zhonggong Zixun”
(aka Kawasaki Heavy Industries Consulting)
KAWASAKI KK (Local electricians in a suburb of Kawasaki)
川崎涂料有限公司 “Chuanqi Tuliao Youxian Gongsi”
(aka Kawasaki Paint Co, Dongguan)
川崎重工業株式会社 “Kawasaki Jūkōgyō Kabushiki-gaisha”
(aka Kawasaki Heavy Industries)
“Ka-wa-sa-ki”: Kawasaki (idiom) - “river beside mountainous terrain”
Privacy and other statutory constraints
Multiple names
Digital natives vs. digital immigrants
Overlapping “identities”
People are strange…
As the boundary between people and small business becomes increasingly blurred, we continue to focus on the concept of People In The Context of Business
Cleanse, de-dupe, identity resolution and enrichment services for your contact data
Understand when people move from organization to organization
Sharpen the line between the individual and the business when engaging small businesses
Malfeasance and fraud are perpetrated by people, not by businesses. This solution reveals relationships that will help all of us more effectively identify potential for bad behavior.
THE CHALLENGE THE GOAL THE VALUE
#1 – the “John Smith” problem – multiple people with the same name
#2 – the “Ann Taylor” problem – data about businesses named after people
Caroline M Smith, 302 N Liberty St., Albion, IA (Addr. Type: Residential)
Carrie Smith, Meredith Corporation, 1716 Locust St., Des Moines, IA (Addr. Type: Commercial)
Caroline Smith, University of Iowa, 21 E Market St., Iowa City, IA (Addr. Type: Commercial)
#3 – the “Sybil” problem – one person with multiple personas or names
Carrie Smith, Tenderheart Daycare, 2635 Cleveland Dr., Adel, IA (Addr. Type: Commercial)
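One way to picture the three problems above is as a candidate-grouping step that runs before any final identity decision. The sketch below is illustrative only: the nickname table, the record fields, and the grouping rule are assumptions, not the method used in the solution described here.

```python
from collections import defaultdict

# Hypothetical nickname table; a real system would need a far larger one.
NICKNAMES = {"carrie": "caroline"}


def name_key(full_name):
    """Reduce a name to (canonical first name, last name), ignoring middle parts."""
    parts = full_name.lower().replace(".", "").split()
    first, last = parts[0], parts[-1]
    return (NICKNAMES.get(first, first), last)


def candidate_groups(records):
    """Group records whose names might refer to the same person.

    Sharing a key only makes records candidates. They could be one person
    with several personas (the "Sybil" problem) or several people with one
    name (the "John Smith" problem); more evidence is needed to decide.
    """
    groups = defaultdict(list)
    for record in records:
        groups[name_key(record["name"])].append(record)
    return dict(groups)


records = [
    {"name": "Caroline M Smith", "city": "Albion, IA", "addr_type": "Residential"},
    {"name": "Carrie Smith", "city": "Des Moines, IA", "addr_type": "Commercial"},
    {"name": "Caroline Smith", "city": "Iowa City, IA", "addr_type": "Commercial"},
    {"name": "Carrie Smith", "city": "Adel, IA", "addr_type": "Commercial"},
]
groups = candidate_groups(records)
```

All four records land in one candidate group, which is exactly why the name alone cannot settle whether they describe one, two, or four people.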
Many people connected to one business
Many businesses connected to one person
Businesses connected through people
People connected through associations with other people
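The relationship types listed above can all be derived from a single table of person-to-business associations. The following sketch assumes such a table is available as simple pairs; the names in it are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def businesses_linked_through_people(associations):
    """Map each pair of businesses to the set of people they share.

    `associations` is an iterable of (person, business) pairs.
    """
    businesses_by_person = defaultdict(set)
    for person, business in associations:
        businesses_by_person[person].add(business)

    shared = defaultdict(set)
    for person, businesses in businesses_by_person.items():
        # Every pair of businesses tied to the same person becomes a link.
        for pair in combinations(sorted(businesses), 2):
            shared[pair].add(person)
    return dict(shared)


# Invented example data.
associations = [
    ("Person A", "Business X"),
    ("Person A", "Business Y"),
    ("Person B", "Business Y"),
    ("Person B", "Business Z"),
    ("Person C", "Business X"),
]
links = businesses_linked_through_people(associations)
```

Swapping the roles of person and business in the same function would yield people connected through shared businesses.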
A single view of customers and prospects, both in the context of entities and people, will drive key actionable outcomes for your business.
Creating the foundation for People in the Context of Business.
• There will be a point of inflection reached whereby we have sufficiency of indicia (by quality and count) to say we can recognize a “soul”
• Dynamic clustering will allow us to adjust our opinion of existing indicia or an existing Soul as new Flexible Alternative Indicia are identified
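The idea of "sufficiency of indicia (by quality and count)" can be illustrated with a toy threshold test. This is not a clustering algorithm, and the thresholds, quality scores, and class name below are all hypothetical; the slides give no numbers.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds for "sufficiency".
MIN_COUNT = 3
MIN_TOTAL_QUALITY = 2.0


@dataclass
class Cluster:
    """A growing set of indicia that may come to be recognized as one 'soul'."""

    indicia: dict = field(default_factory=dict)  # indicium -> quality in [0, 1]

    def add(self, indicium, quality):
        # New evidence can also revise the quality of an existing indicium.
        self.indicia[indicium] = quality

    def is_recognized(self):
        enough_count = len(self.indicia) >= MIN_COUNT
        enough_quality = sum(self.indicia.values()) >= MIN_TOTAL_QUALITY
        return enough_count and enough_quality


cluster = Cluster()
cluster.add("name", 0.5)
cluster.add("business email", 0.75)
before = cluster.is_recognized()   # two indicia: below the count threshold
cluster.add("phone", 0.75)
after = cluster.is_recognized()    # three indicia, total quality 2.0
cluster.add("phone", 0.25)         # opinion of an existing indicium is lowered
revised = cluster.is_recognized()
```

The last step shows the dynamic part in miniature: recognition is not permanent and can be withdrawn when the opinion of an indicium changes.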
Soul
Indicia
Dynamic Clustering
I’ll bet you knew this was coming
Learning from the way things move, even if you don’t understand them fully… seriously?
How do you predict something that has no precedent?
Predictions, predictions…
Commercial signal and proxy are now added to existing predictive attributes to provide deeper insights and even more predictive analytics.
[Chart: Predictive Content (Low to High) from Traditional Business Data and Non-Traditional Insight, for businesses with Robust Predictive Data Available, Limited Data Available, and No Data Available]
Signal & proxy sources add significant decisioning content on small businesses with limited or no traditional predictive data footprint
‘Signals’ aggregated and analyzed over time, and correlated with other data sources, expose hard-to-find patterns.
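As a rough illustration of aggregating a signal over time and correlating it with another source, the sketch below rolls daily counts into weekly totals and computes a Pearson correlation against a second weekly series. The series names and all values are made up, and Pearson correlation is just one possible measure, not necessarily the one used in practice.

```python
from math import sqrt


def weekly_totals(daily_counts):
    """Sum a list of daily counts into consecutive 7-day totals."""
    return [sum(daily_counts[i:i + 7]) for i in range(0, len(daily_counts), 7)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up data: three weeks of daily inquiry counts about one company,
# and a second weekly series from a different (hypothetical) source.
daily_inquiries = [1] * 7 + [2] * 7 + [3] * 7
weekly_late_payments = [2, 3, 7]

inquiries_per_week = weekly_totals(daily_inquiries)
r = pearson(inquiries_per_week, weekly_late_payments)
```

A high value of `r` on real data would only flag a pattern worth investigating; it would not by itself establish that one series drives the other.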
BIG DISPARATE SOURCES OF DATA
SIGNAL EXTRACTION
ADVANCED ANALYTICS
PREDICTIVE MODEL GAINS
We’re harnessing the massive flow of data through our systems and distilling the signals that describe a company’s behavior.
This is helping to increase levels of precision in predictive models.
• Customer Cross-border Inquiries
• Customer Match Inquiries
• Global Trade Experiences
• Transactional WorldBase Updates
• Third Party Exchange
• Customer Portfolio Monitoring
• Intelligence Engine Traffic
• Phone and Email Connectivity Testing
• Call Center Activity
• Other Proprietary Sources
Extending the deployed capability to better understand malfeasance…
• Apply learning and integrate new targeted severe risk prevention and detection rules in data supply processes and platforms
Continuous Improvement
Data Collection & Input
Combining people, linkage, and daily signals to quickly recognize and analyze patterns and take action…
In the above use case, with millions of payment experiences a week, we were able to quickly identify and analyze a suspicious pattern and take action, not only on all related cases but also on the three “ring leaders”.
“Ring Leaders”
Data sensing: Advanced analytics also play a significant role in acquiring new data sources.
Other Data
• Scale: Multi-national footprint?
• Depth: Comprehensive coverage across all verticals and sizes of business?
• Value: Positive correlation with trade or other predictors to serve as a proxy?
Some current efforts under way to utilize this hybrid capability…
Helping you gain visibility into your supplier’s suppliers, from tier 1 to tier N.
With this knowledge you can reduce the risk of being blind-sided by disruption(s) anywhere within your supplier network.
We use analytical methods to build an implied supply chain using our extensive knowledge of buyer-seller relationships.
TIER-N SUPPLY CHAIN RISK | LINKAGE DISCOVERY ENGINE | MATERIAL CHANGE
Tier 1 → Tier 2 → Tier … → Tier N
A (Buyer) → B (Seller)
B (Buyer) → C (Seller)
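The buyer-seller chain above (A buys from B, B buys from C) suggests how an implied supply chain could be assembled from pairwise relationships. The sketch below uses a breadth-first search to assign each supplier a tier; it is an assumption about one workable approach, not the analytical method actually used, and company "D" is a hypothetical addition to the example.

```python
from collections import deque


def supplier_tiers(buyer, buyer_seller_pairs):
    """Assign each reachable supplier its tier (1 = direct supplier) by breadth-first search."""
    sellers_of = {}
    for b, s in buyer_seller_pairs:
        sellers_of.setdefault(b, set()).add(s)

    tiers = {}
    seen = {buyer}
    queue = deque([(buyer, 0)])
    while queue:
        company, depth = queue.popleft()
        for seller in sorted(sellers_of.get(company, ())):
            if seller not in seen:  # also guards against cycles in the data
                seen.add(seller)
                tiers[seller] = depth + 1
                queue.append((seller, depth + 1))
    return tiers


# A -> B -> C as on the slide, plus a hypothetical second direct supplier D.
pairs = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C")]
tiers = supplier_tiers("A", pairs)
```

Because the search is breadth-first, a supplier reachable by several paths is recorded at its shallowest tier, here C at tier 2.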
Providing you more linked families with a focus on small and medium businesses.
Gain a more comprehensive view of your multi-site business partners, revealing new opportunities and overall risk.
Innovative technology and analytics are efficiently guiding us to potential linkage relationships we had not previously seen.
Ultimate Parent
Headquarters
Branch
Branch
Parent Subsidiary
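The linkage roles in this diagram imply a simple roll-up: follow each entity's parent link until no parent remains. The sketch below assumes linkage is stored as a child-to-parent mapping; the entity names are placeholders and the structure shown is hypothetical.

```python
def ultimate_parent(entity, parent_of):
    """Follow parent links upward until an entity with no parent is reached."""
    seen = {entity}
    while entity in parent_of:
        entity = parent_of[entity]
        if entity in seen:
            raise ValueError("cycle detected in linkage data")
        seen.add(entity)
    return entity


# Hypothetical linkage: branches roll up to a headquarters,
# and a subsidiary rolls up to its parent.
parent_of = {
    "Branch 1": "Headquarters",
    "Branch 2": "Headquarters",
    "Headquarters": "Ultimate Parent",
    "Subsidiary": "Ultimate Parent",
}
top = ultimate_parent("Branch 1", parent_of)
```

The cycle check matters in practice because newly discovered linkage is only a candidate until verified, and a bad link could otherwise make the loop run forever.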
Helping you stay ahead by anticipating important changes before they occur.
Knowing which businesses are poised for growth, or which may be headed for elevated risk, is valuable foresight.
Anticipatory analytics is helping us identify unique drivers, root causes, and sensitivities leading to material change.
Signals that predict a change…
…in traditional predictors…
…that predict business outcomes
Derive insights from signals over time
Pinpoint combinations with greatest predictive value
New Techniques to address Big Data
New approaches to Discovery, Curation, and Synthesis
Data sensing at the “Event Horizon”
We are increasingly faced with information that is rich, varied, and replete with opportunity – our focus is shifting from “hunting and gathering” to new challenges.
“And now we welcome the new year, full of things that have never been” – Rainer Maria Rilke