how to integrate splunk with any data solution

26
Copyright © 2012 Splunk Inc.

Upload: julian-hyde

Post on 26-May-2015

4.785 views

Category:

Technology


2 download

DESCRIPTION

A presentation Julian Hyde gave to the Splunk 2012 User conference in Las Vegas, Tue 2012/9/11. Julian demonstrated a new technology called Optiq, described how it could be used to integrate data in Splunk with other systems, and demonstrated several queries accessing data in Splunk via SQL and JDBC.

TRANSCRIPT

Page 1: How to integrate Splunk with any data solution

Copyright © 2012 Splunk Inc.

Page 2: How to integrate Splunk with any data solution

How to Integrate Splunkwith any Data Solution

Julian Hyde (Optiq) @julianhyde

http://github.com/julianhyde/optiqhttp://github.com/julianhyde/optiq-splunk

Splunk Worldwide UsersConference 2012

Page 3: How to integrate Splunk with any data solution

Why are we here?I'm going to explain how to use Splunk to access all of the data in your

enterprise.

And also to let people in your enterprise use data in Splunk.

This isn't easy. We'll be showing some raw technology – the new Optiq project and its Splunk adapter.

But it's open source, so you can all get your hands on it. :)

Page 4: How to integrate Splunk with any data solution

About meDatabase hacker

Open source hacker

Author of Mondrian (Pentaho Analysis)

Startup fiend

Page 5: How to integrate Splunk with any data solution

http://www.flickr.com/photos/torkildr/3462606643

Page 6: How to integrate Splunk with any data solution

http://www.flickr.com/photos/sylvar/31436961/

Page 7: How to integrate Splunk with any data solution

“Big Data”Right data, right time

Diverse data sources / Performance / Suitable format

Page 8: How to integrate Splunk with any data solution

ExampleAccessing Splunk data via SQL

Sqlline (a standard JDBC client)

Page 9: How to integrate Splunk with any data solution

How do it (wrong)

Splunk Optiq

SELECT “source”, “product_id”FROM “splunk”.”splunk”WHERE “action” = 'purchase'

“search”

filter

action ='purchase'

Page 10: How to integrate Splunk with any data solution

How do it (right)

Splunk Optiq

SELECT “source”, “product_id”FROM “splunk”.”splunk”WHERE “action” = 'purchase'

“searchaction=purchase”

Page 11: How to integrate Splunk with any data solution

Example #2Combining data from 2 sources (Splunk & MySQL)

Also possible: 3 or more sources; 3-way joins; unions

Page 12: How to integrate Splunk with any data solution

MySQL

Splunk

Expression tree SELECT p.“product_name”, COUNT(*) AS cFROM “splunk”.”splunk” AS s JOIN “mysql”.”products” AS p ON s.”product_id” = p.”product_id”WHERE s.“action” = 'purchase'GROUP BY p.”product_name”ORDER BY c DESC

join

Key: product_id

group

Key: product_nameAgg: count

filter

Condition:action =

'purchase'

sort

Key: c DESC

scan

scan

Table: splunk

Table: products

Page 13: How to integrate Splunk with any data solution

Splunk

Expression tree(optimized)

SELECT p.“product_name”, COUNT(*) AS cFROM “splunk”.”splunk” AS s JOIN “mysql”.”products” AS p ON s.”product_id” = p.”product_id”WHERE s.“action” = 'purchase'GROUP BY p.”product_name”ORDER BY c DESC

join

Key: product_id

group

Key: product_nameAgg: count

filter

Condition:action =

'purchase'

sort

Key: c DESC

scan

Table: splunk

MySQL

scan

Table: products

Page 14: How to integrate Splunk with any data solution

Optiq is not a database.

Page 15: How to integrate Splunk with any data solution

http://www.flickr.com/photos/torkildr/3462606643

Page 16: How to integrate Splunk with any data solution

http://www.flickr.com/photos/telstra-corp/5069403309/

Page 17: How to integrate Splunk with any data solution

Conventional database architecture

JDBC server

SQL parser /validatorQuery

optimizer

Metadata

DataData

Data-flowoperators

JDBC client

Page 18: How to integrate Splunk with any data solution

Optiq architecture

JDBC server

SQL parser /validatorQuery

optimizer

3rd partydata

3rd partydata

JDBC client

3rd

partyops

3rd

partyops

Optional

Pluggable

Core

MetadataSPI

Pluggablerules

Page 19: How to integrate Splunk with any data solution

What is Optiq?A really, really smart JDBC driver

Framework

Potential core of a data management system

Page 20: How to integrate Splunk with any data solution

Writing an adapterDriver – if you want a vanity URL like “jdbc:splunk:”

Schema – describes what tables exist (Splunk has just one)

Table – what are the columns, and how to get the data. (Splunk's table has any column you like... just ask for it.)

Operators (optional) – non-relational operations

Rules (optional, but recommended) – improve efficiency by changing the question

Parser (optional) – to query via a language other than SQL

Page 21: How to integrate Splunk with any data solution

Splunk AdapterRules for pushing down filters, projections

The tricky bit: changed the validator to allow tables to have any column

To be written: rules for pushing down aggregations, joins

(What you've seen today is in github.)

Would be really nice if... Splunk pushed down filters, projections, aggregations from its search pipeline to the MySQL connector. (Currently you have to hand-write a SQL statement.)

Page 22: How to integrate Splunk with any data solution

http://www.flickr.com/photos/walkercarpenter/4697637143/

Page 23: How to integrate Splunk with any data solution

Optiq roadmap ideasMondrian use Optiq to read from data sources such as Splunk

Kettle integration (read/write SQL to ETL)

Adapters: Cascading, MongoDB, Hbase, Apache Drill, …?

Front-ends: linq4j, Scala SLICK, Java8 streams

Contributions

Page 24: How to integrate Splunk with any data solution

ConclusionsLiberate your data!

Optiq is a framework

Build & share Optiq adapters

Page 25: How to integrate Splunk with any data solution

Questions?

@julianhyde

http://julianhyde.blogspot.com

http://github.com/julianhyde/optiq

http://github.com/julianhyde/optiq-splunk

Page 26: How to integrate Splunk with any data solution

Additional material: The following queries were used in the demo

select s."source", s."sourcetype" from "splunk"."splunk" as s;

select s."source", s."sourcetype", s."action" from "splunk"."splunk" as s

where s."action" = 'purchase';

select s."source", s."sourcetype", s."action" from "splunk"."splunk" as s

where s."action" = 'purchase';

select s."action", count(*)

from "splunk"."splunk" as s

group by s."action";

select s."action", s."method", count(*)

from "splunk"."splunk" as s

group by s."action", s."method";

select * from "mysql"."products";

select p."product_name", s."action"

from "splunk"."splunk" as s

join "mysql"."products" as p

on s."product_id" = p."product_id";

select p."product_name", s."action", COUNT(*) AS c

from "splunk"."splunk" AS s

join "mysql"."products" AS p

on s."product_id" = p."product_id"

where s."action" = 'purchase'

group by p."product_name", s."action"

order by c desc;