DevOps Days Tel Aviv 2013: Ignite Talk: Monitoring Patterns with Riemann - Itai Frenkel & Eli...

Post on 10-May-2015

DESCRIPTION

Riemann aggregates events from your servers and applications with a powerful stream processing language, which enables concise monitoring rule declarations. This 5-minute ignite talk gives a taste of common monitoring pattern implementations (heartbeat, statistics, event enrichment, state-based filters, multi-tenant monitoring) and reviews what you can do with Riemann after processing these patterns.

Speakers: Itai Frenkel and Eli Polonsky, GigaSpaces.

Eli Polonsky and Itai Frenkel work at GigaSpaces, developing the Cloudify open source devops and cloud automation suite. Part of their work includes evaluating open source devops tools such as Riemann.

TRANSCRIPT

Built for monitoring distributed systems

Event Stream Processing (like Esper/Drools Fusion)

Shared-State (index)

Open Source (written by aphyr (Kyle Kingsbury))

riemann.io

Concepts

host ‘A’

service ‘req_latency’

state ‘ok’

metric 1

ttl 60

tags ‘important’

event

Concepts

(host, service) last event

(‘A’, ‘req_latency’)

(‘B’, ‘req_latency’)

(‘C’, ‘req_latency’)

(‘D’, ‘req_latency’)

index
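
The two concept slides above can be modelled outside Riemann as a small Python sketch: an event is a plain record, and the index keeps only the last event per (host, service) pair. This is an illustrative model, not Riemann's implementation; the `Index` class, its method names, and the default ttl of 60 seconds are assumptions made for the example.

```python
import time


class Index:
    """Toy model of an index: the last event seen per (host, service)."""

    def __init__(self):
        self._events = {}  # (host, service) -> (event, time received)

    def update(self, event, now=None):
        now = time.time() if now is None else now
        self._events[(event["host"], event["service"])] = (event, now)

    def lookup(self, host, service):
        entry = self._events.get((host, service))
        return entry[0] if entry else None

    def expire(self, now=None):
        """Remove events whose ttl has elapsed and return them marked expired."""
        now = time.time() if now is None else now
        expired = []
        for key, (event, seen) in list(self._events.items()):
            if now - seen > event.get("ttl", 60):  # assumed default ttl
                del self._events[key]
                expired.append({**event, "state": "expired"})
        return expired


index = Index()
index.update({"host": "A", "service": "req_latency", "state": "ok",
              "metric": 1, "ttl": 60, "tags": ["important"]}, now=0)
index.update({"host": "A", "service": "req_latency", "state": "ok",
              "metric": 3, "ttl": 60, "tags": ["important"]}, now=30)

latest = index.lookup("A", "req_latency")  # the second event replaced the first
expired = index.expire(now=100)            # 70 seconds of silence exceeds ttl 60
```

An event that is not refreshed within its ttl drops out of the index and re-appears as an expired event, which is what the heartbeat pattern reacts to.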

Heartbeat Trigger

(expired (tagged "keep_alive" (email "alert@devops.tlv")))

Threshold Trigger

(where (and (service "req_latency") (> metric 10)) (email "alert@devops.tlv"))

Change State

(host, service)        metric  state
('A', 'req_latency')   20      error
('B', 'req_latency')   1       ok
('C', 'req_latency')   5       error
('D', 'req_latency')   5       ok

(where (service "req_latency")
  (split
    (< metric 2) (with :state "ok" index)
    (> metric 10) (with :state "error" index)))

(changed-state {:init "ok"} (email "alert@devops.tlv"))
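
One way to read the rule above: metrics below 2 mark the service "ok", metrics above 10 mark it "error", and values in between leave the previous state untouched, so an alert fires only on a transition. The Python sketch below models that reading. The names `classify` and `ChangedState` are hypothetical, and tracking the last state per (host, service) is an assumption of this model.

```python
def classify(metric, previous_state):
    """Below 2 is ok, above 10 is error, anything else keeps the old state."""
    if metric < 2:
        return "ok"
    if metric > 10:
        return "error"
    return previous_state


class ChangedState:
    """Report an event only when its state differs from the last one seen."""

    def __init__(self, init="ok"):
        self._init = init
        self._last = {}  # (host, service) -> last state

    def changed(self, event):
        key = (event["host"], event["service"])
        previous = self._last.get(key, self._init)
        self._last[key] = event["state"]
        return event["state"] != previous


detector = ChangedState(init="ok")
state = "ok"
alerts = []
for metric in [1, 5, 20, 5, 1]:
    state = classify(metric, state)
    event = {"host": "A", "service": "req_latency",
             "metric": metric, "state": state}
    if detector.changed(event):
        alerts.append((metric, state))

print(alerts)  # [(20, 'error'), (1, 'ok')]
```

The gap between the two thresholds acts as hysteresis: a metric of 5 is reported as "error" or "ok" depending on where the service came from, which matches rows 'C' and 'D' in the table.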

Time Window Statistics

Cluster Statistics

; index-max-of-median must be defined before the rule that uses it
(def index-max-of-median (smap folds/maximum index))

(by [:host] (where (service "req_latency") (percentiles 60 [0.5] index-max-of-median)))
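
The intent of this pattern, as far as the slide shows, is to take the median latency per host over a time window and then keep the worst host. A minimal Python sketch of that computation over one window of events follows. The function name `max_of_medians` and the sample data are made up for illustration.

```python
from statistics import median


def max_of_medians(events):
    """Return (host, median) for the host with the highest median metric."""
    by_host = {}
    for event in events:
        by_host.setdefault(event["host"], []).append(event["metric"])
    medians = {host: median(metrics) for host, metrics in by_host.items()}
    worst = max(medians, key=medians.get)
    return worst, medians[worst]


# One 60-second window of req_latency events (hypothetical values).
window = [
    {"host": "A", "metric": 1}, {"host": "A", "metric": 2}, {"host": "A", "metric": 30},
    {"host": "B", "metric": 5}, {"host": "B", "metric": 6}, {"host": "B", "metric": 7},
]

result = max_of_medians(window)
print(result)  # ('B', 6)
```

Host A has the single largest sample (30), but its median is 2. Using the median per host keeps one outlier from dominating the cluster-wide figure.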

Event Storm Filtering

(def alert-devops
  (throttle 100 3600
    (rollup 3 3600 (email "alert@devops.tlv"))))

(where (tagged "db-connection-exception") alert-devops)
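
The throttling half of this pattern can be sketched in Python as a fixed-window limiter: at most `limit` events pass per `window` seconds and the rest are dropped. This is a simplified model under that assumption, not Riemann's implementation, and the `Throttle` class is hypothetical. The rollup step, which batches events into fewer emails, is not modelled here.

```python
class Throttle:
    """Pass at most `limit` events per `window` seconds; drop the rest."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self._window_start = None
        self._count = 0

    def allow(self, now):
        # Start a new window on the first event or once the old one has elapsed.
        if self._window_start is None or now - self._window_start >= self.window:
            self._window_start = now
            self._count = 0
        if self._count < self.limit:
            self._count += 1
            return True
        return False


throttle = Throttle(limit=3, window=3600)

# Ten exceptions arrive one second apart; only the first three get through.
passed = [t for t in range(10) if throttle.allow(now=t)]
print(passed)  # [0, 1, 2]

# An hour after the window opened, events pass again.
later = throttle.allow(now=3600)
```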

Event Enrichment

host ‘A’

service ‘req_latency’

state ‘ok’

metric 1

ttl 60

tags ‘important’

tenant 1

(defn change-event [my-key my-value & children]
  (fn [event]
    (let [my-event (assoc event my-key my-value)]
      (call-rescue my-event children))))

(change-event :tenant "1" index)

Tenant 1 Tenant 2 Tenant 3

Multi-Tenancy

(def riemann-agg (tcp-client :host "agg-hostname"))

(changed-state (change-event :tenant "1" (forward riemann-agg)))

http://riemann.io
