Why are Data Lakes the Future of Big Data? - Oracle · 2019-11-22



Page 1: Why are Data Lakes the Future of Big Data? - Oracle · 2019-11-22 · Title: Why are Data Lakes the Future of Big Data? Author: Oracle Corporation Subject: As big data gets bigger,

DATA SOURCES

DATA LAKES IN USE

Why are Data Lakes the Future of Big Data?

Go to oracle.com/data-lake to learn why data lakes optimize data management in the era of big data.

IS YOUR BUSINESS READY TO GET THE MOST OUT OF ITS DATA?

Copyright © 2019, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

As big data gets bigger, the increasing volume of data and data sources can easily overwhelm data scientists. A data lake puts it all in one simple, cost-effective, and configurable repository.

THE NEED FOR DATA MANAGEMENT

Without a smart data management strategy, organizations can quickly get overwhelmed and fall behind the competition.

Data Lake vs Data Warehouse

4.4 GB of data are used by Americans every minute

50% of businesses say that big data has changed their sales and marketing

90% of data has been generated since 2016

95% of businesses handle unstructured data

Structured or unstructured data

Data comes in all types in the digital age from countless sources

Sourced internally or externally

DATA LAKE VS DATA WAREHOUSES

Data lakes and data warehouses both act as repositories, but they are designed for very different purposes. Data warehouses work best for specific projects with set resources while data lakes are optimized for managing all incoming big data.

Data Lake
Structured, unstructured, and raw data
Schema-on-read
Large repository for data scientists
Flexible configuration
Low cost, high volume

Data Warehouse
Structured, processed data
Schema-on-write
Ready for reports by business users
Fixed configuration
Cost scales with volume
Use with BI / analytics
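The schema-on-read and schema-on-write distinction above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Oracle's implementation: the SCHEMA mapping, the function names, and the sample records are all hypothetical, and plain Python lists stand in for the warehouse table and the lake store.

```python
import json

# Hypothetical schema for illustration: field name -> type to coerce into.
SCHEMA = {"id": int, "amount": float}


def write_to_warehouse(table, record):
    """Schema-on-write: validate and coerce before storing; reject bad rows."""
    row = {}
    for name, cast in SCHEMA.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        row[name] = cast(record[name])
    table.append(row)


def write_to_lake(store, raw):
    """Schema-on-read: store the raw payload untouched."""
    store.append(raw)


def read_from_lake(store, schema):
    """Apply a schema only at read time, skipping payloads that do not fit."""
    rows = []
    for raw in store:
        try:
            record = json.loads(raw)
            rows.append({name: cast(record[name]) for name, cast in schema.items()})
        except (ValueError, KeyError, TypeError):
            continue  # payload is not usable under this particular schema
    return rows


# Made-up sample inputs: one clean record, one partial record, one free text.
raw_inputs = [
    '{"id": "1", "amount": "9.50"}',
    '{"id": "2", "note": "free-form text"}',
    "not json at all",
]

warehouse, lake = [], []
rejected = 0
for raw in raw_inputs:
    write_to_lake(lake, raw)  # the lake accepts everything as-is
    try:
        write_to_warehouse(warehouse, json.loads(raw))
    except ValueError:
        rejected += 1  # the warehouse refuses rows that break its schema
```

In this sketch the lake keeps all three payloads while the warehouse keeps only the one that matches its fixed schema. A later reader can call read_from_lake with a different schema, such as {"id": int}, and recover records the warehouse had to reject.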

Data comes from many types of sources. In fact, businesses rarely come to a data-driven decision with just one source. Instead, it takes numerous inputs, and many of them can be unstructured.

Data Lakes place these all in a single repository, saving time, effort, and cost

5 Average number of data sources consulted to reach a data-driven decision

80% Amount of data that is unstructured and incapable of being handled by a data warehouse (original source: IDC)


Structured Data Sources
Invoices and receipts
Sensor data
Online forms
CRM profiles
Spreadsheets

Unstructured Data Sources
Social media content
Emails
Podcasts
Security footage
Transcripts

Data Lakes create a single repository for both structured and unstructured data, making it easy for data scientists to pull exactly what they need for analysis.
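As a rough sketch of what "a single repository" can mean in practice, the Python below keeps structured and unstructured payloads side by side and lets an analyst pull only the sources they want. Everything here is an assumption made for illustration: the LakeObject and DataLake classes, the source labels, and the sample payloads are hypothetical and do not describe any Oracle product.

```python
from dataclasses import dataclass, field


@dataclass
class LakeObject:
    source: str       # hypothetical label such as "email" or "sensor"
    structured: bool  # whether the payload follows a fixed layout
    payload: bytes    # raw content, stored without transformation


@dataclass
class DataLake:
    objects: list = field(default_factory=list)

    def ingest(self, source, structured, payload):
        """Store any payload as-is, tagged with where it came from."""
        self.objects.append(LakeObject(source, structured, payload))

    def pull(self, sources):
        """Return only the objects from the requested sources."""
        wanted = set(sources)
        return [obj for obj in self.objects if obj.source in wanted]


# Made-up sample data covering both kinds of source.
lake = DataLake()
lake.ingest("spreadsheet", True, b"region,sales\nwest,120\n")
lake.ingest("email", False, b"Customer asked about delivery times.")
lake.ingest("sensor", True, b"t=21.5")

selected = lake.pull(["email", "sensor"])
```

The design choice worth noting is that ingestion never inspects the payload; only the tags are used for retrieval, which is what allows mixed data types to share one store.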