CeBIT Big Data 2012 - Raanan Dagan, Big Data Product Marketing, Splunk

Post on 13-Jan-2015


Real-time Analytics from Small Data, Big Data and Huge Data

Raanan Dagan, Big Data Solutions, Splunk

Copyright © 2012 Splunk Inc.

What I’ll Talk About

Machine Data

Splunk and Big Data, Real-time Analytics

Customer Use Cases


Big Data Comes from Machines

Volume | Velocity | Variety | Variability

GPS, RFID, Hypervisor, Web Servers, Email, Messaging, Clickstreams, Mobile, Telephony, IVR, Databases, Sensors, Telematics, Storage, Servers, Security Devices, Desktops

Machine-generated data is one of the fastest growing, most complex and most valuable segments of big data


What Does Machine Data Look Like?


Sources: Twitter, Care IVR, Middleware Error, Order Processing

Machine Data Contains Critical Insights

Fields highlighted across the sources (Twitter, Care IVR, Middleware Error, Order Processing): Order ID, Customer ID, Product ID, Time Waiting On Hold, Twitter ID, Customer’s Tweet, Company’s Twitter ID
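The Customer ID appears in more than one of the sources on this slide, which is what makes it possible to tie them together. A minimal Python sketch of that idea follows. The event records, field names and values are invented for illustration and are not taken from the slides.

```python
from collections import defaultdict

# Hypothetical, already-parsed events; every field name and value is made up.
events = [
    {"source": "order_processing", "customer_id": "C42", "order_id": "O-1001", "product_id": "P-7"},
    {"source": "middleware_error", "customer_id": "C42", "order_id": "O-1001"},
    {"source": "care_ivr", "customer_id": "C42", "seconds_on_hold": 600},
    {"source": "twitter", "customer_id": "C42", "twitter_id": "@example_user"},
    {"source": "order_processing", "customer_id": "C77", "order_id": "O-1002", "product_id": "P-9"},
]


def correlate_by_customer(events):
    """Group events from all sources under the customer ID they share."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event["customer_id"]].append(event)
    return dict(grouped)


def sources_touched(grouped, customer_id):
    """Return the sources in which a customer appears, in arrival order."""
    return [event["source"] for event in grouped.get(customer_id, [])]


timeline = correlate_by_customer(events)
print(sources_touched(timeline, "C42"))
```

Reading the grouped events for one customer in order gives a single view of an order, the error it triggered, the support call and the tweet that followed.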

Splunk: The Platform for Machine Data

Machine Data | Splunk Index | Operational Intelligence

• Search and Investigation

• Proactive Monitoring

• Statistical Analysis

• Insight and Visualizations for Executives

Splunk Collects and Indexes Machine Data

No upfront schema. No RDBMS. No custom connectors.

• Customer Facing Data: Click-stream data, Shopping cart data, Online transaction data

• Outside the Datacenter: Manufacturing, logistics…, CDRs & IPDRs, Power consumption, RFID data, GPS data

• Applications: Web logs; Log4J, JMS, JMX; .NET events; Code and scripts

• Networking: Configurations, syslog, SNMP, netflow

• Databases: Configurations, Audit/query logs, Tables, Schemas

• Virtualization & Cloud: Hypervisor; Guest OS, Apps; Cloud

• Linux/Unix: Configurations; syslog; File system; ps, iostat, top

• Windows: Registry, Event logs, File system, sysinternals

Data types: Logfiles, Configs, Messages, Traps, Alerts, Metrics, Scripts, Tickets, Changes

Operational Intelligence for IT and Business Users

Use cases: IT Operations Management, Application Management, Security & Compliance, Web Intelligence, Business Analytics

Users: LOB Owners/Executives, Customer Support, System Administrator, Operations Teams, Security Analysts, IT Executives, Development Teams, Auditors, Website/Business Analysts

The Technical Part

Splunk Has Four Primary Functions

• Searching and Reporting (Search Head)

• Indexing and Search Services (Indexer)

• Local and Distributed Management (Deployment Server)

• Data Collection and Forwarding (Forwarder)

A Splunk install can be one or all roles…


Scalability to Tens of TBs/Day on Commodity Servers

Send data from 1000s of servers using a combination of Splunk Forwarders, syslog, WMI, message queues, or other remote protocols

Auto load-balanced forwarding to as many Splunk Indexers as you need to index terabytes/day

Offload search load to Splunk Search Heads
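To make the idea of load-balanced forwarding concrete, here is a minimal Python sketch that spreads batches of events across several indexers. It is not how Splunk forwarders are implemented or configured: the `distribute` function, the host names and the round-robin policy are all assumptions chosen for illustration.

```python
from itertools import cycle


def distribute(batches, indexers):
    """Assign each batch to the next indexer in turn (simple round-robin)."""
    if not indexers:
        raise ValueError("at least one indexer is required")
    assignment = {name: [] for name in indexers}
    targets = cycle(indexers)
    for batch in batches:
        assignment[next(targets)].append(batch)
    return assignment


indexers = ["indexer-a", "indexer-b", "indexer-c"]  # hypothetical host names
batches = [f"batch-{i}" for i in range(7)]  # stand-ins for chunks of forwarded events

plan = distribute(batches, indexers)
for name, assigned in plan.items():
    print(name, assigned)
```

Adding a fourth name to `indexers` spreads the same batches over more machines without touching the sender logic, which is the property the slide describes as scaling out by adding indexers.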


Analyzing Heterogeneous Data

Universal Indexing | Late Structure Binding | Analysis and Visualization

• No data normalization

• Automatically handles timestamps

• Parsers not required

• Index every term & pattern “blindly”

• No attempt to “understand” up front

• Normalization as it’s needed

• Faster implementation

• Easy search language

• Multiple views into the same data

• Knowledge applied at search-time

• No brittle schema to work around

• Find transactions, patterns and trends

Rapid time-to-deploy: hours or days
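The combination of indexing every term "blindly" and applying knowledge at search-time can be sketched in a few lines of Python. This is an illustration of the general technique only, not Splunk's index format or search language. The sample events, the tokenizer and the key=value extraction rule are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical raw events in two unrelated formats; nothing is normalized up front.
raw_events = [
    "2012-03-06 10:15:02 action=purchase status=200 user=alice",
    "2012-03-06 10:15:09 action=purchase status=503 user=bob",
    "Mar  6 10:15:11 host1 sshd[411]: Failed password for carol",
]


def build_index(events):
    """Index time: map every lowercase token to the events that contain it."""
    index = defaultdict(set)
    for number, text in enumerate(events):
        for token in re.findall(r"[A-Za-z0-9_]+", text.lower()):
            index[token].add(number)
    return index


def search(index, events, term):
    """Return the raw events containing the term, in original order."""
    return [events[n] for n in sorted(index.get(term.lower(), ()))]


def extract_fields(text):
    """Search time: apply a key=value rule; events without pairs yield {}."""
    return dict(re.findall(r"(\w+)=(\w+)", text))


index = build_index(raw_events)
hits = search(index, raw_events, "purchase")
statuses = [extract_fields(hit).get("status") for hit in hits]
print(statuses)
```

Because the extraction rule runs only when a search asks for it, a different rule can be applied to the same raw events later without re-indexing them.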


Real-time Analytics

Inputs: Monitor Input, TCP/UDP Input, Scripted Input

Data flow: Data → Parsing Queue → Parsing Pipeline → Index Queue → Indexing Pipeline → Splunk Index (Raw data, Index Files)

Real-time Buffer → Real-time Search Process

Parsing Pipeline

• Source, event typing

• Character set normalization

• Line breaking

• Timestamp identification

• Regex transforms
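Three of the parsing pipeline steps named on this slide (line breaking, timestamp identification and regex transforms) can be sketched in Python as follows. This is a simplified illustration, not Splunk's implementation: the timestamp format, the rule that a new event starts at a line beginning with a timestamp, and the card-number masking transform are all assumptions.

```python
import re
from datetime import datetime

# Assumed timestamp layout for this example only.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def break_lines(stream):
    """Line breaking: start a new event at each line that begins with a timestamp."""
    events = []
    for line in stream.splitlines():
        if TIMESTAMP.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line  # continuation line, e.g. a stack trace
    return events


def identify_timestamp(event):
    """Timestamp identification: parse the leading timestamp, if there is one."""
    match = TIMESTAMP.match(event)
    return datetime.strptime(match.group(), "%Y-%m-%d %H:%M:%S") if match else None


def mask_card_numbers(event):
    """Regex transform: mask any 16-digit number before the event is stored."""
    return re.sub(r"\b\d{16}\b", "X" * 16, event)


def parse(stream):
    """Run the three steps in order and return one record per event."""
    return [
        {"time": identify_timestamp(event), "raw": mask_card_numbers(event)}
        for event in break_lines(stream)
    ]


card = "4" + "1" * 15  # made-up 16-digit number
stream = (
    f"2012-03-06 10:15:02 ERROR payment failed card={card}\n"
    "  at billing.charge(line 88)\n"
    "2012-03-06 10:15:05 INFO retry scheduled\n"
)
parsed = parse(stream)
print(len(parsed), parsed[0]["time"])
```

The indented second line is folded into the first event, so the three input lines become two events, each with its own timestamp.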


Splunk and Hadoop


Splunk Hadoop Connect: Reliable Data Export, Import Hadoop Data

Splunk App for HadoopOps: End-to-end monitoring, troubleshooting, analysis of Hadoop environment

Real-time Collection and Analysis

Dashboards, Reports, Access Controls

Splunk Hadoop Connect

• Delivers reliable integration between Splunk and Hadoop

• Export events collected and aggregated in Splunk to HDFS

• Explore and browse HDFS directories and files

• Import and index data from HDFS for secure searching, reporting, analysis and visualizations in Splunk

Splunk App for HadoopOps

• End-to-end monitoring and troubleshooting for Hadoop

• Monitoring of entire Hadoop environment (Network, Switch, Operating System and Database)

• Integrated alerting to track and respond to activities from MapReduce to the individual node in the cluster

• Centralized real-time view of Hadoop nodes using intuitive heatmap display

Summary - Splunk Big Data Solution

Product-based Solution

Performance at scale

Integrated and End-to-end


Thank You
