Why Big Data Analytics Needs Business Intelligence Too
DESCRIPTION
Business and IT are facing the challenge of getting real and urgent value from ever-expanding information sources. Building independent silos of big data analytics is no longer enough. True progress comes only by integrating data from traditional operational and informational sources with the new sources that are becoming available, whether from social media or interconnected machines. In this April 2014 BrightTALK webinar, Dr. Barry Devlin describes the thinking, architecture, tools and methods needed to achieve a new joined-up, comprehensive data environment.
TRANSCRIPT
Copyright © 2014 9sight Consulting, All Rights Reserved
Dr Barry Devlin
Founder & Principal
9sight Consulting
Why Big Data Analytics Needs Business Intelligence Too
BrightTALK Webinar, 9 April 2014
Dr. Barry Devlin
Founder and Principal, 9sight Consulting, www.9sight.com
Dr. Barry Devlin is a founder of the data warehousing industry and among the foremost authorities worldwide on business intelligence (BI) and beyond. He is a widely respected consultant, lecturer and author of the seminal “Data Warehouse: From Architecture to Implementation”. His new book, “Business unIntelligence: Insight and Innovation Beyond Analytics and Big Data” (http://bit.ly/BunI-Technics), was published in October 2013.
Barry has 30 years of experience in IT, previously with IBM, as an architect, consultant, manager and software evangelist.
As founder and principal of 9sight Consulting (www.9sight.com), Barry provides strategic consulting and thought leadership to buyers and vendors of BI solutions. He is currently developing new architectural models for fully consistent business support, from informational to operational and collaborative work.
Barry is based in Cape Town, South Africa, and his knowledge and expertise are in demand both locally and internationally.
Email: [email protected]
Twitter: @BarryDevlin

Big data analytics began with social media and web logs
Understanding and tracking sentiment
– What do you think? How do you react?
– Basic analytics and BI activity on a new data source
Real-time insight into and influence on website activities
– Why did you abandon your cart?
– What would you most likely buy on getting a cross-sell?
– Deep, real-time analytics and BI with operational integration

Add the Internet of Things to big data analytics and reinvent businesses
Significant new considerations
– Micro-management of supply chains and extension all the way to the consumer
  – Sourcing and delivery
– Completely new business models (usually depending on big data analytics)
  – Motor insurance
  – Health monitoring

But wait… it’s not just big data… we also need traditional business data
Traditional business processes
– Data created, managed and used in a structured and regulated way
– “Process-mediated data”
– The legal basis of business
Big data analytics
– Data gathered from unreliable sources, often designed for unrelated purposes
Business value of big data depends on linking it to traditional business processes

Process-mediated data has been the core of BI and the layered data warehouse since the early ’90s
Characteristics
– Tactical decision making based on reconciled data
– Consistency and truth
– Separation of operational and informational needs
– Vertical and horizontal segmentation of data
– Unidirectional data flow
Note: key business needs and technology limitations of the ’80s and ’90s
[Figure: layered data warehouse. Labels: operational systems, enterprise data warehouse, data marts, metadata, data warehouse]
“An architecture for a business and information system”, B. A. Devlin, P. T. Murphy, IBM Systems Journal (1988)

The tri-domain model shows two new types of data/information
Process-mediated data
– “Traditional” operational & informational data
– Via data entry and cleansing processes
Machine-generated data
– Output of machines and sensors
– The Internet of Things
Human-sourced information
– Subjectively interpreted record of personal experiences
– From Tweets to Videos
[Figure: the tri-domain model. Domains: human-sourced information, machine-generated data, process-mediated data. Axes: Structure/Context and Timeliness/Consistency (Historical, Reconciled, Stable, Live, In-flight)]
[In the context of these domains, “data” signifies well-structured and/or modeled content, while “information” is more loosely structured and human-centric.]
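As a rough illustration of the tri-domain model, the sketch below tags incoming sources with one of the three domains. The `Domain` enum, the `SOURCE_DOMAINS` mapping and the source-type names are hypothetical, chosen only to mirror the examples on the slide. A real classification would depend on how each source is actually created and governed.

```python
from enum import Enum


class Domain(Enum):
    """The three domains of the tri-domain model."""
    PROCESS_MEDIATED = "process-mediated data"
    MACHINE_GENERATED = "machine-generated data"
    HUMAN_SOURCED = "human-sourced information"


# Hypothetical source types, mapped to the domain the slide suggests.
SOURCE_DOMAINS = {
    "order_entry": Domain.PROCESS_MEDIATED,
    "cleansed_customer_master": Domain.PROCESS_MEDIATED,
    "sensor_reading": Domain.MACHINE_GENERATED,
    "machine_output": Domain.MACHINE_GENERATED,
    "tweet": Domain.HUMAN_SOURCED,
    "video": Domain.HUMAN_SOURCED,
}


def classify(source_type: str) -> Domain:
    """Return the domain for a known source type, or raise ValueError."""
    try:
        return SOURCE_DOMAINS[source_type]
    except KeyError:
        raise ValueError(f"unknown source type: {source_type!r}") from None
```

Calling `classify("tweet")` returns `Domain.HUMAN_SOURCED`. An unknown source type raises an error rather than being assigned a domain by guesswork.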

The modern, REAL logical architecture
Realistic, Extensible, Actionable, Labile
Three interconnected pillars of information
– Messages, events, measures and transactions from the real world
– Metadata is context-setting information
Adaptive process
– Business and IT
– Information processing
– Instantiation, assimilation and reification: ETL, ELT, virtualization
– Workflows and activities
– Choreography
[Figure: the REAL logical architecture. Labels: events, measures, messages and transactions; human-sourced (information), machine-generated (data) and process-mediated (data) pillars; transactional (data); context-setting (information); instantiation, assimilation and reification; utilization; choreography; organization]

Key characteristics of information pillars
Single architecture includes all types of data/information
– Mix/match technology as needed
– Relational, NoSQL, CEP, Graph, etc.
Integration of sources and stores
– Operational processes gather measures, events, messages and transactions
– Assimilation integrates stored information
Data flows as fast as needed and is reconciled when necessary
– No unnecessary storage or transformations
[Figure: the information pillars. Labels: events, measures, messages and transactions; operational processes; human-sourced (information), machine-generated (data) and process-mediated (data) pillars; transactional (data); context-setting (information); assimilation]

Process-mediated data: Relational databases evolve to allow de-layering and reintegration
Drivers: Stability, Consistency and Reliability
Relational databases remain core technology
– “New” approaches to storage and processing
  – Columnar (and compressed) to hybrid
  – Solid-state disk and in-memory
  – Massively parallel processing
– Advantages:
  – Reduced physical modelling
  – Faster read and write
Sample offerings:
– Upgraded databases: e.g. IBM DB2 BLU
– Appliances: e.g. Actian ParAccel, HP Vertica, SAP HANA
– BI Tools: e.g. Tableau, QlikView

Machine-generated data: NoSQL and streaming take on relational at the extremes
Drivers: Speed, Size and Flexible Structure
NoSQL is the current darling, especially at the extreme of all three drivers
CEP (complex event processing) / Streaming at extreme speed
Relational can address many of these drivers
– Even flexible structure (see my relational vs. Hadoop session)
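Complex event processing generally means detecting a pattern across a stream of events within a time window. The sketch below is a minimal in-memory illustration, not a CEP engine. The rule (three or more readings above a threshold within ten seconds), the event layout and the name `detect_bursts` are all assumptions made for the example.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

Event = Tuple[float, float]  # (timestamp in seconds, measured value)


def detect_bursts(
    events: Iterable[Event],
    threshold: float = 100.0,
    window_seconds: float = 10.0,
    min_count: int = 3,
) -> Iterator[float]:
    """Yield the timestamp of each event that completes a burst.

    A burst is min_count or more readings above threshold whose
    timestamps all fall within window_seconds of the latest one.
    Events are assumed to arrive in timestamp order.
    """
    recent: Deque[float] = deque()  # timestamps of recent high readings
    for timestamp, value in events:
        if value <= threshold:
            continue  # only high readings can contribute to a burst
        recent.append(timestamp)
        # Drop high readings that have slid out of the time window.
        while timestamp - recent[0] > window_seconds:
            recent.popleft()
        if len(recent) >= min_count:
            yield timestamp
```

Because the function is a generator over an iterable, it processes one event at a time and keeps only the current window in memory, which is the property that matters at high event rates.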

Human-sourced information: Hadoop and/or enterprise content management
Drivers: Soft, Large and Ill-defined data
Hadoop, Hadoop and more Hadoop
– Hadoop 2.0 enables more real-time processing
Traditional ECM tools should not be forgotten
– Enterprise content management
– Soft information needs to be managed

Information processing creates, maintains and mediates access to all information
Instantiation
– Turns measures, events and messages into information instances
– File access, ETL, change capture…
Assimilation
– Creation of reconciled and consistent information sets prior to business use
– Key to linking big data and BI
– With context-setting information
– ETL, ELT and virtualization
Reification (making the abstract real)
– Providing real-time, consistent, cross-pillar access to information according to an overarching model
– Virtualization
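The three information processing functions can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Mention` type, the `handle|text` message layout, the `handle_to_customer` lookup and all function signatures are hypothetical, and real implementations would rely on ETL, ELT or virtualization tooling rather than in-memory lists.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Mention:
    """An information instance created from a raw social media message."""
    handle: str
    text: str
    customer_id: Optional[str] = None  # filled in during assimilation


def instantiate(raw_messages: List[str]) -> List[Mention]:
    """Instantiation: turn raw 'handle|text' messages into instances."""
    mentions = []
    for raw in raw_messages:
        handle, _, text = raw.partition("|")
        mentions.append(Mention(handle.strip().lower(), text.strip()))
    return mentions


def assimilate(
    mentions: List[Mention], handle_to_customer: Dict[str, str]
) -> List[Mention]:
    """Assimilation: reconcile instances with process-mediated customer keys.

    handle_to_customer plays the role of context-setting information.
    """
    return [
        Mention(m.handle, m.text, handle_to_customer.get(m.handle))
        for m in mentions
    ]


def reify(
    customer_id: str, customers: Dict[str, dict], mentions: List[Mention]
) -> dict:
    """Reification: one combined cross-pillar view, built at query time."""
    return {
        "customer": customers[customer_id],
        "mentions": [m.text for m in mentions if m.customer_id == customer_id],
    }
```

For example, `assimilate(instantiate(["@Ann | Love the new app"]), {"@ann": "C001"})` links the mention to customer `C001`, and `reify` then returns that customer's record together with the linked mention texts. Mentions with no matching customer keep `customer_id` as `None` instead of being forced into a match.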
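[Figure: the information pillars with the information processing functions. Labels: events, measures, messages and transactions; instantiation, assimilation and reification; human-sourced (information), machine-generated (data) and process-mediated (data) pillars; transactional (data); context-setting (information); organization]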

Context-setting information (metadata) is key
Metadata is two four-letter words!
– Information (not data)
– Describes all “stuff” (not just data)
– Indistinguishable from “business information” by non-IT people (and some IT people)
Context-setting information (CSI)
– New image: describes what it is and does
– Context-setting information provides the background to each piece of information, to every process component and to all the people that constitute the business
– All information is actually context-setting for something else
How to create CSI
– Modeling up-front combined with Text Mining on the fly
Mars Climate Orbiter, lost in 1999, $325M: metadata error
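The Mars Climate Orbiter loss is widely attributed to one system reporting impulse in pound-force seconds while another expected newton-seconds. A minimal sketch of how context-setting information can travel with a measure might look like the following. The `Measure` type, the unit labels and the conversion table are assumptions for illustration only.

```python
from dataclasses import dataclass

# Conversion factors to newton-seconds (the unit labels are illustrative).
TO_NEWTON_SECONDS = {
    "N*s": 1.0,
    "lbf*s": 4.4482216152605,  # 1 pound-force = 4.4482216152605 newtons
}


@dataclass(frozen=True)
class Measure:
    """A value that carries its own context-setting information (its unit)."""
    value: float
    unit: str

    def to_newton_seconds(self) -> "Measure":
        """Convert to newton-seconds, refusing values with unknown units."""
        try:
            factor = TO_NEWTON_SECONDS[self.unit]
        except KeyError:
            raise ValueError(f"no unit context for {self.unit!r}") from None
        return Measure(self.value * factor, "N*s")
```

A bare number such as `10.0` cannot be checked, whereas `Measure(10.0, "lbf*s")` can be converted safely, and a value with a missing or unknown unit fails loudly instead of being silently misread.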

From BI to Business unIntelligence
Rationality of thought and far beyond it
Logic of process, predefined and emergent
Information, knowledge and meaning
The confluence of
– Reason and inspiration
– Emotion and intention
– Collaboration and competition
– All that comprises the human and social milieu that is business
Not business intelligence
Business unIntelligence
http://bit.ly/BunI-Technics: 25% discount with code “BIInsights25”

Conclusions
Big data and the Internet of Things only offer background to “real” business
Reconciled and consistent data built via Data Warehouse and BI contains the reality of the business: legally binding actions and transactions
The emerging architecture consists of three interconnected information pillars based on appropriate technologies
Thank you
Questions?