ibm big data platform 2013-may-14 finalsesam.smart-lab.se/seminarier/sem130514/ss130514ibm.pdf ·...

IBM Big Data Platform

Turning big data into smarter decisions

© 2013 IBM Corporation

May 16, 2013

Stefan Söderlund, IBM client architect, Försvarsmakten (Swedish Armed Forces)

Sesam spring seminar “Big Data, Bigga byte kräver Pigga Hertz!” (“Big Data: big bytes demand lively hertz!”)


By 2015, 80% of all available data will be uncertain


By 2015 the number of networked devices will be double the entire global population. All sensor data has uncertainty.

The total number of social media accounts exceeds the entire global population. This data is highly uncertain in both its expression and content.

[Chart: global data volume in exabytes and aggregate uncertainty %, 2005 to 2015. Multiple sources: IDC, Cisco]

Data quality solutions exist for enterprise data like customer, product, and address data, but this is only a fraction of the total enterprise data.

Smarter Defence

• Ever increasing range of sensors
• Volume, velocity, variety
• Military collectors & open source
• Agility & mobility
• Highly connected systems – blurred edges
• Collaboration across coalitions
• From data to actionable intelligence
• From reactive to proactive
• Whole lifecycle system optimisation

Instrumented, Interconnected, Intelligent

Sustained Information Superiority


Big data is a hot topic because technology makes it possible to analyze ALL available data

Cost-effectively manage and analyze all available data in its native form – unstructured, structured, streaming

[Diagram: data sources shown are ERP, CRM, RFID, website, network switches, social media, command and control]

In order to realize new opportunities, you need to think beyond traditional sources of data

Transactional & Application Data
• Volume
• Structured
• Throughput

Machine Data
• Velocity
• Semi-structured
• Ingestion

Social Data
• Variety
• Highly unstructured
• Veracity

Enterprise Content
• Variety
• Highly unstructured
• Volume

Analysis expanding from enterprise data to big data, creating new cost-effective opportunities for competitive advantage

Traditional Approach: structured, analytical, logical
• Data Warehouse
• Structured, repeatable, linear
• Traditional sources: transaction data, internal app data, ERP data, core business data, OLTP system data

New Approach: creative, holistic thought, intuition
• Hadoop, streaming data
• Unstructured, exploratory, iterative
• New sources: web logs, URLs, social data, text data (emails, chats), RFID and sensor data, network data

Enterprise-wide integration

The IBM Big Data Platform

• Process any type of data
  – Structured, unstructured, in-motion, at-rest
• Built-for-purpose engines
  – Designed to handle different requirements
• Analyze data in motion
• Manage and govern data in the ecosystem
• Enterprise data integration
• Grow and evolve on current infrastructure

[Diagram: Solutions; Analytics and Decision Management; IBM Big Data Platform (Visualization & Discovery, Application Development, Systems Management, Accelerators, Hadoop System, Stream Computing, Data Warehouse, Information Integration & Governance); Big Data Infrastructure]

The IBM Big Data Platform

Delivers deep insight with advanced in-database analytics & operational analytics

• PureData System – expert integrated systems to make deep and operational analytics faster & simpler
• InfoSphere Warehouse – data warehouse software to access operational info in real time

[Diagram highlight: Data Warehouse]

The IBM Big Data Platform

Analyze streaming data and large data bursts for real-time insights

• InfoSphere Streams – software enabling continuous analysis of massive volumes of streaming data with sub-millisecond response times

[Diagram highlight: Stream Computing]
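The idea of continuous analysis over data in motion can be sketched in a few lines of plain Python. This is not InfoSphere Streams code and does not use its API; `sliding_average`, the window size, and the sample readings are hypothetical, chosen only to show each value being processed as it arrives rather than after it has been stored.

```python
from collections import deque
from typing import Iterable, Iterator


def sliding_average(readings: Iterable[float], window: int = 3) -> Iterator[float]:
    """Yield the mean of the most recent `window` readings after each arrival."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent: deque = deque(maxlen=window)  # oldest reading drops out automatically
    total = 0.0
    for value in readings:
        if len(recent) == window:
            total -= recent[0]  # this reading is about to be evicted
        recent.append(value)
        total += value
        yield total / len(recent)


if __name__ == "__main__":
    # Hypothetical sensor readings; a real feed would be unbounded.
    for avg in sliding_average([10.0, 20.0, 30.0, 40.0], window=2):
        print(avg)
```

Because the function is a generator, it keeps only the current window in memory and emits a result per input, which is the basic contrast with batch analysis of data at rest.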

The IBM Big Data Platform

Cost-effectively analyze petabytes of unstructured and structured data

• InfoSphere BigInsights – enterprise-grade Hadoop system enhanced with advanced text analytics, data visualization, tools, & performance features for analyzing massive volumes of structured and unstructured data

[Diagram highlight: Hadoop System]
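The MapReduce framework at the core of a Hadoop system follows a map, shuffle, reduce pattern. The single-process Python sketch below illustrates that pattern with a word count. It is not BigInsights or Hadoop code; `map_phase`, `shuffle`, `reduce_phase`, `word_count`, and the sample documents are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as happens between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence over all documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data", "big insights"]))
```

On a real cluster the map and reduce functions run in parallel across many nodes and the shuffle moves data over the network; here all three steps run in one process.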

BigInsights Content

Function | Version | Basic Edition | Enterprise Edition
Integrated Install | | Inc | Inc
Hadoop (including common utilities, HDFS, MapReduce framework) | 1.0.3 | Inc | Inc
Jaql (programming / query language) | 0.5.2 | Inc | Inc
Pig (programming / query language) | 0.10.0 | Inc | Inc
Flume (data collection/aggregation) | 0.9.4 | Inc | Inc
Hive (data summarization/querying) | 0.9.0 | Inc | Inc
Lucene (text search)* | 3.3.0 | Inc | Inc
Zookeeper (process coordination) | 3.4.3 | Inc | Inc
Avro (data serialization) | 1.6.3 | Inc | Inc
HBase (real time read/write) | 0.94.0 | Inc | Inc
HCatalog (table and storage management service) | 0.4.0 | Inc | Inc
Sqoop (RDBMS bulk data transfer) | 1.4.1 | Inc | Inc
Oozie (workflow / job orchestration) | 3.2.0 | Inc | Inc
Online documentation | | Inc | Inc
Integration with JDBC sources through general-purpose Jaql module | | Inc | Inc
Integration with DB2 (sample functions to submit jobs, read data) | | Inc | Inc

BigInsights Content (cont’d)

Function | Basic Edition | Enterprise Edition
Integration with R (Jaql module to invoke R statistical capabilities from BigInsights) | n/a | Inc
Integration with Netezza, DB2 LUW with DPF from Jaql | n/a | Inc
LDAP authentication, Guardium support, etc. | n/a | Inc
Integrated Web Console | n/a | Inc
Business process accelerators (social data, machine data analytics) | n/a | Inc
Platform performance enhancements (Adaptive MapReduce, large scale indexing, efficient processing of compressed text files, flexible job scheduler, etc.) | n/a | Inc
Text analytics | n/a | Inc
Eclipse tools for text analytic development, Jaql, Hive, Java | n/a | Inc
Applications for data import/export, Web crawl, machine learning, etc. | n/a | Inc
Web-based application catalog | n/a | Inc
Spreadsheet-like analytical tool | n/a | Inc
IBM support | Opt | Inc
Streams, Data Explorer, Cognos BI (limited use licenses) | n/a | Inc
Unlimited storage | n/a | Inc

The IBM Big Data Platform

Govern data quality and manage the information lifecycle

• InfoSphere Information Server – cleanses data, monitors quality and integrates big data with existing systems
• InfoSphere Optim – manages business information throughout its lifecycle
• InfoSphere Master Data Management – manages and maintains trusted views of master and reference data
• InfoSphere Guardium – real-time database security and monitoring

[Diagram highlight: Information Integration & Governance]

The IBM Big Data Platform

Speed time to value with analytic and application accelerators

• Analytic Accelerators – text analytics, geospatial, time-series, data mining
• Application Accelerators – defence services, financial services, machine data, social data, Telco event data
• Industry Models – comprehensive data models based on deep expertise and industry best practice

[Diagram highlight: Accelerators]

The IBM Big Data Platform

Discover, understand, search, and navigate federated sources of big data

• InfoSphere Data Explorer – discovery and navigation software that provides real-time access and fusion of big data with rich and varied data from enterprise applications for greater insight

[Diagram highlight: Visualization & Discovery]


An example of the big data platform in practice

[Diagram: architecture zones]
• Ingestion and Real-time Analytic Zone: Streams
• Landing and Analytics Sandbox Zone: Hadoop (MapReduce, Hive/HBase column stores, documents in a variety of formats)
• Warehousing Zone: Enterprise Warehouse, Data Marts
• Analytics and Reporting Zone: BI & Reporting, Predictive Analytics, Visualization & Discovery
• Metadata and Governance Zone: ETL, MDM, Data Governance
• Connectors

Applied Research: International Technology Alliance (ITA)

• Strategic Goals:
  – Enhance distributed, secure, and flexible decision-making for coalition operations
  – Enable the rapid and secure formation of ad hoc teams
• Coalition Focus:
  – Develop interoperable data acquisition, processing, and management technologies
  – Enable hybrid wireless networking among coalition partners
  – Embed adaptable security in coalition networks and information services
  – Techniques to represent, position, find, and link data/information to coalition decisions

[Diagram: research areas are Agile Security / Network Management; Secure Distributed Information Services; Hybrid Wireless Networking; Information Representation, Aggregation, and Fusion]

[Slide footer date: March 2011]

System Integration: UK Air Defence (UCCS Project)

Project Goal:

• Monitor UK airspace for terrorist or enemy incursions & initiate intercept

Solution:

• IBM (as prime contractor) implemented a state-of-the-art air surveillance and interceptor command & control system
• Developed software applications, integrating multi-radar tracking and voice systems and refurbishing entire computer facilities at two RAF bases

Selected Benefits:

• Reduced cost (by using commercial software)
• Intuitive human-computer interface boosts controller performance & reduces training
• New levels of availability & maintainability

[Map: Indicative Locations]

Sensor Fusion – Surveillance / Border Control

Big Data

THINK


Get Started on Your Big Data Journey Today

Get Educated

– IBM Big Data: ibm.com/bigdata
– IBMBigDataHub.com
– BigDataUniversity.com
– IBV study on big data
– Books / analyst papers

The End
