big data management demystified - dama chicago · intelligent business intelligencesm big data...
TRANSCRIPT
www.EWSolutions.com © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 1
EWSolutions
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Big Data Management
Demystified
By David Marco
President
EWSolutions
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 2
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
EWSolutions’ Background EWSolutions is a Chicago-headquartered strategic partner and full life-cycle
systems integrator providing both award winning strategic consulting and full-
service implementation services. This combination affords our client partners a
full range of services for any size enterprise information management, meta data
management, data governance and data warehouse/business intelligence
initiative. Our notable client partner projects have been featured in the Chicago
Tribune, Federal Computer Weekly, Journal of the American Medical Informatics
Association (JAMIA), Crain’s Chicago Business, and won the 2004 Intelligent
Enterprise’s RealWare award, 2007 Excellence in Information Integrity Award
nomination and DM Review’s 2005 World Class Solutions award.
For more information on our Strategic Consulting Services, Implementation Services,
or World-Class Training, email us at [email protected] or call at 630.920.0005
Best Business Intelligence Application
Information Integration
Client: Department of Defense
World Class
Solutions Award
Data Management
2007 Excellence in Information
Integrity Award Nomination
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 3
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Partial Client Partner List
For more information on our Strategic Consulting Services, Implementation Services,
or World-Class Training email us at [email protected]
AFLAC Arizona Supreme Court Bank of Montreal
BankUnited
Basic American Foods
Becton, Dickinson and Company
Blue Cross Blue Shield companies Booz Allen Hamilton
Branch Banking & Trust (BB&T) British Petroleum (BP)
California DMV
California State Fund Canadian National Railway
Capella University
Cigna
College Board
Comcast Corning Cable Systems
Countrywide Financial
Defense Logistics Agency (DLA)
Delta Dental
Department of Defense (DoD) Driehaus Capital Management
Eli Lilly and Company Environment Protection Agency
Farmers Insurance Group Federal Aviation Administration
Federal Bureau of Investigation (FBI)
Fidelity Information Services
Ford Motor Company
GlaxoSmithKline
Harbor Funds Harris Bank The Hartford
Harvard Pilgrim HealthCare
Health Care Services Corporation
Hewitt Associates
HP (Hewlett-Packard) Information Resources Inc.
International Paper Janus Mutual Funds
Johnson Controls
Key Bank LiquidNet
Loyola Medical Center
Manulife Financial
Mayo Clinic
McDonalds Microsoft
MoneyGram
NASA
National City Bank
Nationwide Neighborhood Health Plan
NORC Physicians Mutual Insurance
Pillsbury Quintiles
Sallie Mae
Schneider National
Secretary of Defense/Logistics
Singapore Defense Science & Technology Agency
Social Security Administration
South Orange County Community College
Standard Bank of South Africa
SunTrust Bank Target Corporation
The Regence Group
Thomson Multimedia (RCA) Thrivent Financial
United Health Group United Nations (ICAO)
United States Air Force
United States Army
United States Department of State United States Navy
United States Transportation Command
University of Michigan
University of Wisconsin Health
USAA US Cellular
Waste Management
Wells Fargo
Wisconsin Department of Transportation
Workers’ Compensation Insurance Rating Bureau of California (WCIRB)
Zurich Cantonal Bank
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 4
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
David Marco – Professional Profile
Best known as the world’s foremost authority on meta data management and the father of the Managed Meta Data Environment, he is an internationally recognized expert in the fields of data governance, big data, data warehousing, master data management and enterprise information management (EIM). In 2004 David Marco was named the “Melvil Dewey of Metadata” by Crain’s Chicago Business as he was selected to their very prestigious “Top 40 Under 40” list. David Marco has authored several books including the widely acclaimed “Universal Meta Data Models” (Wiley, 2004) and the classic “Building and Managing the Meta Data Repository: A Full Life-Cycle Guide” (Wiley, 2000).
2014 EWSolutions was inducted into the Hinsdale business Hall-of-Fame after 6 consecutive years of receiving “Best of” awards in Enterprise Information Management
Selected to the prestigious 2004 Crain’s Chicago Business “Top 40 Under 40”
2008 DAMA Data Management Hall of Fame (Professional Achievement Award)
2007 DePaul University named him one of their “Top 14 Alumni Under 40”
Presented hundreds of keynotes/seminars across four continents
Published hundreds of articles on information technology
Author of several best selling information technology books
Taught at the University of Chicago and DePaul University
Holds both a CDMP and a CBIP certification
Email: [email protected]
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 5
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Agenda
Introduction – an Overview of Big Data Definitions
Concepts in Big Data
Big Data Success Rates
Synopsis of Big Data Technologies Overview of major technologies used in big data
5 V’s of Big Data
Data Management and Big Data
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 6
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Big Data Defined
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 7
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
What is Big Data? Definition #1: “Big data is a broad term for data sets so large or
complex that traditional data processing applications are inadequate.” (www.wikipedia.com)
#2: “data of a very large size, typically to the extent that its manipulation and management present significant logistical challenges.” (Oxford English Dictionary)
#3: “datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze,” (widely-quoted 2011 big data study by McKinsey)
#4: “The ability of society to harness information in novel ways to produce useful insights or goods and services of significant value” and “…things one can do at a large scale that cannot be done at a smaller one, to extract new insights or create new forms of value.” (“Big Data: A Revolution that Will Transform How We Live, Work, and Think”, Viktor Mayer-Schönberger & Kenneth Cukier)
#5: “The broad range of new and massive data types that have appeared over the last decade or so.” (Tom Davenport )
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 8
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
What is Big Data?
#6: The new tools helping us find relevant data
and analyze its implications.
#7: The convergence of enterprise and
consumer IT
#8: The shift (for enterprises) from processing
internal data to mining external data.
#9: A new attitude by businesses, non-profits,
government agencies, and individuals that
combining data from multiple sources could
lead to better decisions List adapted from Forbes, “12 Big Data Definitions: What's Yours?”, 9/3/2014, Gil Press
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 9
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Big Data Defined
Big Data: a collection of data sets so large
and complex that it becomes difficult to
process using traditional database
management tools or conventional data
processing applications
This definition has much meaning
Big Data technology and techniques should
be used when traditional database
management tools CANNOT
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 10
Big Data is NOT..
The cure for cancer and global hunger
The answer to all data ecosystem issues and
challenges
A replacement for the enterprise data warehouse
Reliable, scalable, and flexible on its own without
hard work performed by skilled professionals
Solving your data management issues
A fully matured solution
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 11
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Big Data
Landscape
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 12
Big Data Growth Rates - 1
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 13
Big Data Growth Rates - 2
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 14
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
What is Big Data?
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 15
What is Big Data?
A MASSIVE marketing campaign
Actual technology, methods and techniques
A significant impact to the data management
field!!
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 16
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Explosive Data Growth
Every minute of the day:
YouTube receives 48 hours of uploaded
video
Over 2 million search queries hit Google
Twitter users post about 100,000 tweets
571 new websites are created
Over 200,000,000 email messages are
created and sent http://ftp-hosting-services-review.toptenreviews.com/the-explosive-growth-of-digital-data.html
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 17
Need for Data Management
Trends in data growth show need for data
management in every industry (Gartner 2014,
US Bureau of Labor Statistics 2014,
ComputerWorld 2014, etc..) Data Architect
Data Modeler
Data Governance specialist
Data Quality specialist
Master Data Management specialist
Etc……
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 18
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Start of Big Data
Big Data is nothing but collection of data sets so large and complex that it becomes difficult to process using traditional database management tools or conventional data processing applications
When you search for word “Big Data” into Google, you get 780 million results in 0.31 seconds
Companies increasingly want to do enterprise-level decision making, based on all data sources
Social networks (LinkedIn, Facebook, Twitter, etc.) dramatically increase the amount of data that we bring into our companies
Other unstructured data sources, e.g., video, images, etc. add complexity to the management of available data
This is ONLY the beginning!!
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 19
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Start of Big Data “Smart” is a new term that we are beginning to hear about
Smart appliances are linked to a central system (sometimes the manufacturer) and send that central system information from sensors on the appliance
Smart grid is a modernized electrical grid that uses analog or digital information and communications technology to gather and act on information - such as information about the behaviors of suppliers and consumers - in an automated fashion to improve the efficiency, reliability, economics, and sustainability of the production and distribution of electricity. (wikipedia)
There are also smart homes, cars, toll ways, etc., etc.
Smart can be translated to “it is monitoring and recording you”
All of these “smart” stuff creates massive new volumes of data
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 20
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Key Definitions
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 21
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Hadoop
Apache Hadoop is an open-source software framework that supports data-intensive distributed applications
Hadoop was derived from Google’s GFS (Google File System) and MapReduce systems
The library, instead of the hardware, is designed to detect and handle failures, to deliver highly-available service on top of a cluster of computers
• Google File System (GFS or GoogleFS): is a proprietary distributed file system developed by Google for its own use. It is designed to provide efficient, reliable access to data using large clusters of commodity hardware
• Hadoop Common: The common utilities that support the other Hadoop modules.
• Hadoop Distributed File System (HDFS™): A distributed file system that provides high-throughput access to application data
• Hadoop Yet Another Resource Negotiator (YARN): A framework for job scheduling and cluster resource management
• Hadoop MapReduce: A YARN-based system for parallel processing of large data sets
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 22
Purpose of Hadoop
Hadoop implements Google’s MapReduce, using
HDFS
MapReduce divides applications into many small
blocks of work
MapReduce can then process the data where it is
located
Hadoop‘s processes can run on thousands of
clusters, making processing very fast
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 23
Hadoop Use Examples
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
• Yahoo! : More than 100,000 CPUs in ~20,000 computers running Hadoop; biggest cluster: 2000 nodes (2*4cpu boxes with 4TB disk each); used to support research for Ad Systems and Web Search
• AOL : Used for a variety of things ranging from statistics generation to running advanced algorithms for doing behavioral analysis and targeting; cluster size is 50 machines, Intel Xeon, dual processors, dual core, each with 16GB Ram and 800 GB hard-disk giving us a total of 37 TB HDFS capacity
• Facebook: To store copies of internal log and dimension data sources and use it as a source for reporting/analytics and machine learning; 320 machine cluster with 2,560 cores and about 1.3 PB raw storage
• FOX Interactive Media : 3 X 20 machine cluster (8 cores/machine, 2TB/machine storage) ; 10 machine cluster (8 cores/machine, 1TB/machine storage); Used for log analysis, data mining and machine learning
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 24
Why is Hadoop Different?
Scalable: Reliably store and process high volumes of data (e.g. petabytes)
Economical: Distributes the data and processing across clusters of
commonly available computers (in thousands)
Efficient: By distributing the data, it can process data in parallel on the
nodes where the data is located
Reliable: Automatically maintains multiple copies of data and automatically
redeploys computing tasks based on failures
Anybody notice anything in the “Reliable” bullet?
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 25
NoSQL
NoSQL (Not Only SQL) is a database
format that provides for storage and retrieval
of data that is modeled in ways other than
tabular formats used in relational databases
NoSQL databases give more flexibility and
availability for less-structured data, and can
make some operations faster, especially
when traversing very large unstructured
datasets
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 26
Types of NoSQL Databases
The list of NoSQL database types seems to
change on a daily basis
We will discuss the following 4 types NoSQL
databases: Key-Value
Document
Column-Oriented
Graph
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 27
Types of NoSQL Databases
Key-Value: have a single table with key-value pairs, meaning two columns: one being the (Primary) Key, and the other being the Value generally useful for storing session information, user
profiles, preferences, shopping cart data, unstructured data
Document: these types of NoSQL differ from each other but that are built to store documents and encode data (or information) in some standard format(s) generally useful for content management systems,
blogging platforms, web analytics, real-time analytics, ecommerce-applications
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 28
Types of NoSQL Databases
Column-Oriented: each storage block contains data from only one column. RDBMS store a single row as a continuous disk entry. Different rows are stored in different places on disk while Columnar databases store all the cells corresponding to a column as a continuous disk entry thus makes the search/access faster generally useful for content management systems, blogging
platforms, maintaining counters, expiring usage, heavy write volume such as log aggregation
Graph: are the kind of NoSQL database designed for data whose relations are well represented as a graph (elements interconnected with an undetermined number of relations between them) very well suited to problem spaces for connected data, such as social
networks, spatial data, routing information for goods, social relations, road maps, network topologies, money, recommendation engines
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 29
Schema vs. Schema-less
All NoSQL databases claim to be schema-less
Schema-less means there is no schema enforced by
the database itself
Rather the scheme is built into the access code
Databases with strong schemas, such as relational
databases, can be migrated by saving each schema
change, plus its data migration, in a version-
controlled sequence
Schema-less databases need careful migration and
data management due to the implicit schema in any
code that accesses the data
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 30
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
5 Vs of Big Data
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 31
5 V’s of Big Data
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Volume
(scale of data)
Variety (different forms of
data)
Veracity (uncertainty
of data)
Velocity (speed of
analysis of data)
Value (worth of the data)
Most US
companies
have at least
100 terabytes
of data stored
4+ billion hours
of content are
watched monthly
on YouTube
alone
Poor data quality cost the
US economy $3.1 trillion in
2012 (Gartner)
Every session
of the NYSE
captures at
least 1 TB of
data
Big data is not
valuable in and
of itself. It must
have a definitive
business case
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 32
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Is Big Data
Successful?
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 33
Big Data Success
In 2013 companies spent $31 billion on Big Data
The market is expected to top $114 billion by 2018
60% of executive believe Big Data will upend their industries within 3 years*
55% of Big Data projects do not get completed**
Through 2017, 60% of Big Data projects will fail to go beyond piloting and experimentation and will be abandoned
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
* “Cracking the Data Conundrum: How Successful Companies Make Big Data Operational”, Cap Gemini, January 14, 2015
** “CIOs & Big Data : What Your IT Team Wants You To Know”, infochimps, 2013
*** Ted Friedman, VP of Gartner, Twitter 12/29/2014
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 34
Big Data Success
Executives describe the own Big Data projects
as*:
8% “Very Successful”
27% “Successful”
65% Below “Successful”
Some other sources show Big Data failure
rates at 30%, 50%, 75% and one showed
90%**
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
* “Cracking the Data Conundrum: How Successful Companies Make Big Data Operational”, Cap Gemini, January 14, 2015
** “Marketing IT In-House: Don’t Let BI Failure Statistics Stop You”, Max T. Russell, February 18, 2014
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 35
Big Data Success
Big Data Implementations
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 36
Big Data Success
Big Data Issues
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 37
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Top 10 Reasons for
Big Data Failure
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 38
Big Data have been over hyped and over
marketed, especially to the C-Level
They believe that it is a magic bullet
In reality, most don’t even know what Big Data
actually means
They buy into the marketing and the false
promises
They attempt overly ambitus projects
#1: Believing the Hype
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 39
Remember our definition?
Big Data: a collection of data sets so large
and complex that it becomes difficult to
process using traditional database
management tools or conventional data
processing applications
Big Data should not be used when a traditional
database will suffice
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#2: Using Big Data When There Wasn’t a Need
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 40
Do you know what a Data Lake is?
Data Lake: is a large data repository sourced
from multiple applications where the data is
stored in its native format
Most Big Data initiatives are stand-alone
efforts and lack integration
They are just silos of data; regardless, of if
they are Big or not
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#3: Big Data Silos
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 41
Many Big Data projects start with poor or
unclear business objectives
Big Data should NOT be a science project!!
They should have a clear, definable and
profitable business case
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#4: Poor Business Objectives
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 42
Big Data skills are sparse in the industry
Truly qualified people are difficult to find and
are likely to be hired away
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#5: Lacking the Right Skills
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 43
Many companies select the wrong uses for Big
Data
Most management users their “gut” to make
decisions as opposed to trusting the actual
analytics being presented
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#6: Management Failure
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 44
Too many Big Data projects are ran as
projects and not programs
They use a Waterfall methodology, instead of
an iterative approach
Big Data programs are best built iteratively
and over time
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#7: Big Data Waterfall
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 45
Big Data technology can handle petabytes of data and over time it will have greater capacity
In general, going beyond 1 – 3 petabytes is very dangerous and takes high levels of expertise and money
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#8: Targeting too Many Petabytes
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 46
Newer server cabinets are much more powerful than before and the also produce much more heat
Temperatures in most data centers aisles range from 80 to 115 degrees
Larger data centers can be 7 – 9 times the size of a football field
Typically a 1/3 of the energy consumed by a Big Data center is used by the cooling system for the facility
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#9: Can You Feel The Heat?
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 47
Common complaints heard during Big Data failures include:
Poor data quality
Business didn’t understand the analytics they were being presented
Ineffective governance models
Data security and privacy issues
This is all data management!!
You cannot build a sustainable Big Data solution without meta data management and data governance
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
#10: Lack of Data Management
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 48
DAMA-DMBOK ©
Strategic Partner & Systems Integrator
Intelligent Business Intelligencesm
•Analysis
•Data
Modeling
•Specification
Analysis
Measurement
Improvement
•Enterprise Data Modeling •Value Chain Analysis
•Standards
•Administration
•Authentication
•Auditing
•Code
Management
•Common Data
Management
•Analysis
•Definition
•Rationalization
•Architecture
•Integration
•Control
•Delivery
•Architecture
•Training & Support
•Implementation and
Maintenance
www.EWSolutions.com
Contact us at [email protected] © 2016 Enterprise Warehousing Solutions, Inc. (EWSolutions) – 49
Strategic Partner & Systems Integrator Intelligent Business Intelligencesm
Questions