Post on 24-May-2020


Page 1: Introduction to Big Data Technology · the sole discretion of Oracle. Agenda • What is Big Data • Real World Use of Big Data • Architecture Overview • Oracle Big Data Solution

Introduction to Big Data Technology

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions.

The development, release, and timing of any features or functionality described for Oracle’s products remain at the sole discretion of Oracle.


Agenda

• What is Big Data

• Real World Use of Big Data

• Architecture Overview

• Oracle Big Data Solution

• Summary


WHAT IS BIG DATA?

Why Is Big Data Important?

• US health care: increase industry value per year by $300 B
• US retail: increase net margin by 60+%
• Manufacturing: decrease development and assembly costs by 50%
• Global personal location data: increase service provider revenue by $100 B
• Europe public sector administration: increase industry value per year by €250 B

Source: McKinsey Global Institute, Big Data: The next frontier for innovation, competition and productivity (May 2011)


Data are Growing

• 400+ Million Twitter accounts (100m+ tweeting)

• 900+ Million Facebook subscribers

• 1.2+ Billion Mobile Web users

• Sensors everywhere

What is Big data

• Big data analytics is concerned with the analysis of large volumes of transaction/event data and behavioral analysis of human/human and human/system interactions. (Gartner)

• Big data represents the collection of technologies that handle large data volumes well beyond that inflection point and for which, at least in theory, hardware resources required to manage data by volume track to an almost straight line rather than a curve. (IDC)

What is Big data

• Volume
• Velocity
• Variety
• Value

Example sources: social media, blogs, smart meters

Structured … Non Structured

Level             Example
Structured        Relational database
Semi-structured   XML data files
Quasi-structured  Text documents
Unstructured      Images and video

A new class of problems has emerged which demands an ability to accept and manage data without advance knowledge of its structure or format.
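As a rough illustration of accepting data without advance knowledge of its structure, the sketch below keeps heterogeneous records as raw text and only discovers their field names at read time. The sample records and the `discover_fields` helper are hypothetical and are not taken from the slides.

```python
import json
from collections import Counter

# Hypothetical raw inputs: each record arrives with its own shape.
RAW_RECORDS = [
    '{"user": "alice", "text": "hello", "tags": ["a", "b"]}',
    '{"meter_id": 17, "kwh": 3.2}',
    '{"user": "bob", "text": "hi"}',
]


def discover_fields(raw_records):
    """Count how often each top-level field appears across the records."""
    counts = Counter()
    for raw in raw_records:
        record = json.loads(raw)  # structure is interpreted only at read time
        counts.update(record.keys())
    return counts


if __name__ == "__main__":
    for field, count in sorted(discover_fields(RAW_RECORDS).items()):
        print(field, count)
```

Nothing about the record layout is declared up front; a fixed relational schema would instead have to be designed before the first record is stored.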

Real world use of big data

Industry | Today’s Challenge | New Data | What’s Possible
Healthcare | Expensive office visits | Remote patient monitoring | Preventive care, reduced hospitalization
Manufacturing | In-person support | Product sensors | Automated diagnosis, support
Location-Based Services | Based on home zip code | Real time location data | Geo-advertising, traffic, local search
Public Sector | Standardized services | Citizen surveys | Tailored services, cost reductions
Retail | One size fits all marketing | Social media | Sentiment analysis, segmentation

Public Healthcare: Management of Outbreak Through Early Detection of Clusters

[Diagram: personal health data flowing through a Big Data Cloud]

1. Blood pressure, heart rate tracking, sleep tracking, coach tracking (integrated medical devices)
2. Government creates/revises National Health Program
3. Public/private hospital executes Health Program (integrated EHR/EMR systems)
4. Health tracking: health checkup records

Users: government officer, doctor, nurse, officer
Output: suggested health improvement (secured personal access), personal health improvement

What does a Big Data World look like? Utilities

What they collect
• Smart metering: monitors power usage

How they use it
• Better demand planning
• Better targeted marketing
• Better targeted products based on individuals’ power needs
• The ability to predict demand at household level
• Reduce exposure to spot market

Big Data means…
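To make "predict demand at household level" concrete, here is a minimal sketch that averages smart-meter readings per household and hour of day. The readings, the field layout, and the `average_demand_by_hour` helper are hypothetical; a real forecast would use far more data and a proper model.

```python
from collections import defaultdict

# Hypothetical smart-meter readings: (household_id, hour_of_day, kWh used).
READINGS = [
    ("house-1", 18, 1.2),
    ("house-1", 18, 1.6),
    ("house-1", 3, 0.2),
    ("house-2", 18, 2.0),
]


def average_demand_by_hour(readings):
    """Return {(household, hour): mean kWh} as a naive demand estimate."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for household, hour, kwh in readings:
        totals[(household, hour)] += kwh
        counts[(household, hour)] += 1
    return {key: totals[key] / counts[key] for key in totals}


if __name__ == "__main__":
    for key, mean_kwh in sorted(average_demand_by_hour(READINGS).items()):
        print(key, round(mean_kwh, 2))
```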

Real world use of big data

• Media/Entertainment: viewers / advertising effectiveness
• Communications: location-based advertising
• Education & Research: experiment sensor analysis
• Consumer Packaged Goods: sentiment analysis of what’s hot, problems
• Health Care: patient sensors, monitoring, EHRs; quality of care
• Life Sciences: clinical trials; genomics
• High Technology / Industrial Mfg.: mfg quality; warranty analysis
• Oil & Gas: drilling exploration sensor analysis
• Financial Services: risk & portfolio analysis
• Automotive: auto sensors reporting location, problems
• Retail: consumer sentiment; optimized sales & marketing
• Law Enforcement & Defense: threat analysis (social media monitoring, photo analysis)
• Travel & Transportation: sensor analysis for optimal traffic flows; customer sentiment
• Utilities: smart meter analysis
• On-line Services / Social Media: people & career matching; web-site optimization

Challenged by: Data Volume, Velocity, Variety in finding Value


ARCHITECTURE OVERVIEW


Basic Architecture

Client

HDFS: Name Node
Map Reduce: Job Tracker

Worker nodes (nine shown), each running:
• Data Node
• Task Tracker
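The split between a Name Node that holds metadata and Data Nodes that hold the blocks can be sketched with a toy model. Everything here is an illustrative assumption: the `ToyNameNode` class, the round-robin placement, and the node names are hypothetical, and real HDFS uses a rack-aware placement policy rather than this simple rotation.

```python
class ToyNameNode:
    """Keeps only metadata: which data nodes hold each block of a file."""

    def __init__(self, data_nodes, replication=3):
        if replication > len(data_nodes):
            raise ValueError("replication exceeds number of data nodes")
        self.data_nodes = list(data_nodes)
        self.replication = replication
        self.block_map = {}  # (filename, block_index) -> list of node names
        self._next = 0       # rotating start position for placement

    def add_file(self, filename, size, block_size):
        """Split a file into blocks and assign each block to several nodes."""
        num_blocks = -(-size // block_size)  # ceiling division
        count = len(self.data_nodes)
        for index in range(num_blocks):
            start = self._next
            self.block_map[(filename, index)] = [
                self.data_nodes[(start + r) % count]
                for r in range(self.replication)
            ]
            self._next = (start + 1) % count
        return num_blocks

    def locate(self, filename, block_index):
        """Answer a client's question: where does this block live?"""
        return self.block_map[(filename, block_index)]


if __name__ == "__main__":
    name_node = ToyNameNode(["dn1", "dn2", "dn3", "dn4"])
    name_node.add_file("log.txt", size=250, block_size=100)
    print(name_node.locate("log.txt", 0))
```

A client asks the name node where a block lives and then reads it from one of the listed data nodes; the name node itself never stores file contents.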

The concept is not new …


Mapping to world of Big data: Data

• Non Structured Data (Input)

Mapping to world of Big data: Data Integration

• Integration (Data Import, Log Collection)

Mapping to world of Big data: Data Management

• File System
• DBMS

Mapping to world of Big data: Data Processing

• Map ()
• Reduce ()
• Output
• Serialization
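The Map () and Reduce () steps can be illustrated with the classic word-count example, run here in a single process. The function names and the `shuffle` helper are hypothetical; a real framework distributes these phases across many nodes and handles the grouping itself.

```python
from collections import defaultdict


def map_phase(line):
    """Map (): emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce (): combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence and return the output."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "Big deal"]))
```

Because each call to `map_phase` looks at one line and each call to `reduce_phase` looks at one key, both phases can be spread over many machines without coordination between the calls.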

Mapping to world of Big data: Data Query

Mapping to world of Big data: Analysis

• Statistical computing and graphics environment
• Machine learning and data mining library
• Displaying, monitoring & analyzing

Mapping to world of Big data: Management and Co-ordination

• Reliable coordination service for distributed computing environments
• Workflow/coordination/job scheduling service to orchestrate dependencies between jobs
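Orchestrating dependencies between jobs comes down to running each job only after the jobs it depends on. The sketch below shows that idea with Python's standard `graphlib` module (Python 3.9+). The job names and the `execution_order` helper are hypothetical, and this is not how any particular scheduler is implemented.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each job maps to the set of jobs it depends on.
JOB_DEPENDENCIES = {
    "import_logs": set(),
    "clean_logs": {"import_logs"},
    "aggregate": {"clean_logs"},
    "export_report": {"aggregate"},
}


def execution_order(dependencies):
    """Return one order in which every job runs after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for job in execution_order(JOB_DEPENDENCIES):
        print(job)
```

If the dependencies contain a cycle, `static_order()` raises `graphlib.CycleError`, which is the kind of error a workflow service would have to report before running anything.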

Big Data Functional Building Blocks: Ecosystem

Function groups: Data Management Fx, Processing Fx, Development Fx, Integration Fx, Mgmt Fx, Analysis Fx

Building blocks:
• Distributed Database
• Workflow
• Script based Data flow / transformation
• Block based Query Language
• Data Import
• Log Collection
• Monitor all Building Blocks
• Coordination
• Distributed File System
• Job: Input -> Plot (key,data) -> Process -> Output
• Display & Monitoring SDK
• Streaming Data Collection
• Data Serialization
• Machine learning & data mining Lib
• Statistical computing and graphics env
• Script based log processing
• High Level Data Processing API
• NLP and Math Lib
• X-Language service Dev

Big Data Solution Building Blocks: Technologies

Function groups: Data Management Fx, Processing Fx, Development Fx, Integration Fx, Mgmt Fx, Analysis Fx

Technologies:
• Cassandra, HBase, BigTable
• Oozie
• Pig
• Hive
• Sqoop
• Scribe
• Cloudera Mgmt Tools
• ZooKeeper
• HDFS, MogileFS, Lustre
• MapReduce
• Chukwa
• Flume
• Avro
• Mahout
• R
• Sawzall
• Cascading
• Clojure
• Thrift


ORACLE BIG DATA SOLUTION

Oracle Big Data Appliance: Big Data for the Enterprise

• HDFS
• Map Reduce
• Ecosystems
• Enterprise Support

Big Data: Infrastructure Requirements

Acquire
• Low, predictable Latency
• High Transaction Volume
• Flexible Data Structures

Organize
• High Throughput
• In-Place Preparation
• All Data Sources/Structures

Analyze
• Deep Analytics
• Agile Development
• Massive Scalability
• Real Time Results

Divided Solution Spectrum

Phases: Acquire, Organize, Analyze
Data variety axis: Dynamic Schema to Schema

• MapReduce Solutions
• DBMS (DW)
• DBMS (OLTP)
• Advanced Analytics
• Distributed File Systems
• Transaction (Key-Value) Stores
• ETL

Divided Solution Spectrum

Phases: Acquire, Organize, Analyze
Data variety axis: Dynamic Schema to Schema

• Oracle Database (DW)
• Oracle Database (OLTP)
• Oracle NoSQL DB
• HDFS
• Map Reduce
• In-DB Analytics: “R”, Mining
• Oracle BI EE
• Oracle Data Integrator (ODI)
• Oracle Loader for Hadoop

Divided Solution Spectrum

Phases: Acquire, Organize, Analyze
Data variety axis: Dynamic Schema to Schema

Oracle Big Data
• Hadoop
• NoSQL Database
• Oracle Loader for Hadoop
• Oracle Data Integrator, R

Oracle Exadata
• OLTP & DW
• Data Mining & Oracle R
• Semantics
• Spatial

Oracle Exalytics

Oracle Big Data Appliance Hardware

• 18 Sun X4270 M2 Servers
  – 48 GB memory per node = 864 GB memory
  – 12 Intel cores per node = 216 cores
  – 24 TB storage per node = 432 TB storage
• 40 Gb/sec InfiniBand
• 10 Gb/sec Ethernet

[Diagram: rack of Name Node and Data Node servers]
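The rack totals on the hardware slide are simply the per-node figures multiplied by 18 servers. The short sketch below reproduces that arithmetic; the `rack_totals` helper and the dictionary keys are hypothetical names chosen for illustration.

```python
# Figures taken from the hardware slide: 18 servers per rack.
NODES_PER_RACK = 18
PER_NODE = {"memory_gb": 48, "cores": 12, "storage_tb": 24}


def rack_totals(nodes=NODES_PER_RACK, per_node=PER_NODE):
    """Multiply each per-node figure by the number of nodes in the rack."""
    return {name: value * nodes for name, value in per_node.items()}


if __name__ == "__main__":
    for name, total in rack_totals().items():
        print(name, total)
```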

Oracle Big Data Appliance Software

• Oracle Linux 5.6
• Apache Hadoop Distribution
• R Distribution
• Oracle NoSQL Enterprise Edition
• Oracle Data Integrator Application Adapter for Hadoop
• Oracle Loader for Hadoop


Scale Out to Infinity

Scale out by connecting racks to each other using InfiniBand

• Expand up to eight racks without additional switches

• Scale beyond eight racks by adding an additional switch

Big Data Appliance: Batch Usage Model

[Diagram: Oracle Big Data Appliance, Oracle Exadata, and Oracle Exalytics connected over InfiniBand, spanning the Acquire, Organize, and Analyze phases]


SUMMARY

Big Data, An Integrated Architecture

[Diagram: integrated architecture; labels recovered from the slide]

Stages: Capture, Store/Process, Integrate, Organize, Analyze
Data types: Structured, Semi-structured, Unstructured
Data sources: Transaction Data; Master & Ref Data; Machine Generated; Social Media; Text, Image, Video, Audio
Components: DBMS (OLTP); Key-Value Data Store; Hadoop Cluster w MapReduce; ODS; Data Warehouse; Data Marts; In-Database Analytics; Message-Based; DB Replic; ETL/ELT; ChangeDC; Streaming (CEP Engine)
Analysis and delivery: Advanced Analytics; Visual Discovery; Text Analytics and Search; Reporting & Dashboards; Real-Time; Alerting; EPM; BI Applications
Cross-cutting: Management, Security, Governance
