DevOps Days Tel Aviv 2013: Ignite Talk: Monitoring Patterns with Riemann - Itai Frenkel & Eli Polonsky

Upload: devops-israel

Post on 10-May-2015

DESCRIPTION

Riemann aggregates events from your servers and applications with a powerful stream processing language, which enables concise monitoring rule declarations. This 5-minute ignite talk gives a taste of common monitoring pattern implementations (heartbeat, statistics, event enrichment, state-based filters, multi-tenant monitoring) and reviews what you can do with Riemann after processing these patterns.

Speakers: Itai Frenkel and Eli Polonsky, GigaSpaces. Eli Polonsky and Itai Frenkel work at GigaSpaces, developing the Cloudify open source devops and cloud automation suite. Part of their work includes evaluating open source devops tools such as Riemann.

TRANSCRIPT

Page 1: DevOps Days Tel Aviv 2013: Ignite Talk: Monitoring Patterns with Riemann - Itai Frenkel & Eli Polonsky

Built for monitoring distributed systems

Event Stream Processing (like ESPER/Drools Fusion)

Shared-State (index)

Open Source (written by aphyr (Kyle Kingsbury))

riemann.io

Page 2:
Page 3:

Concepts: event

host 'A'

service 'req_latency'

state 'ok'

metric 1

ttl 60

tags 'important'
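Inside a Riemann configuration an event behaves like a Clojure map, so the fields on this slide can be read with ordinary keyword lookups. A minimal sketch in plain Clojure, using the values from the slide (the name `example-event` is illustrative and not part of Riemann):

```clojure
;; The event from the slide, written as a plain Clojure map.
;; `example-event` is an illustrative name, not part of Riemann.
(def example-event
  {:host    "A"
   :service "req_latency"
   :state   "ok"
   :metric  1
   :ttl     60              ; seconds the event stays valid
   :tags    ["important"]})

(:service example-event)    ; => "req_latency"
(:metric example-event)     ; => 1
```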

Page 4:

Concepts: index

The index maps each (host, service) pair to its last event:

('A', 'req_latency')

('B', 'req_latency')

('C', 'req_latency')

('D', 'req_latency')
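The code on the later slides refers to `index` without defining it. A typical configuration, modelled on the sample `riemann.config` that ships with Riemann, binds it roughly as follows. The 5-second scan interval and the 60-second default TTL are assumptions, not values from the talk:

```clojure
; Scan the index for expired events every 5 seconds (assumed interval).
(periodically-expire 5)

(let [index (index)]
  (streams
    ; Events that arrive without a TTL get 60 seconds (assumed default),
    ; then replace the previous event for the same (host, service).
    (default :ttl 60
      index)))
```

Streams shown on the following slides would sit inside the same `(streams ...)` form so that `index` is in scope.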

Page 5:

Heartbeat Trigger

Page 6:

(expired
  (tagged "keep_alive"
    (email "[email protected]")))

Heartbeat Trigger
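The heartbeat rule only fires if keep-alive events are indexed with a TTL: when no newer event for the same (host, service) arrives before the TTL runs out, the index emits a copy with state "expired". The sketch below shows the surrounding configuration the slide leaves out. It assumes that expired events keep only `:host` and `:service` unless `:keep-keys` says otherwise, which is worth checking against the Riemann version in use; the sender and recipient addresses are placeholders:

```clojure
; Assumption: keep :tags on expired events so that `tagged` can still match.
(periodically-expire 5 {:keep-keys [:host :service :tags]})

; `mailer` returns a function that builds email streams.
; Both addresses below are placeholders.
(def email (mailer {:from "riemann@example.com"}))

(let [index (index)]
  (streams
    ; Heartbeat events must be indexed, otherwise they can never expire.
    (default :ttl 60 index)

    ; When the index expires a keep_alive event, send an alert.
    (expired
      (tagged "keep_alive"
        (email "ops@example.com")))))
```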

Page 7:

Threshold Trigger

Page 8:

Threshold Trigger

(where (and (service "req_latency") (> metric 10)) (email "[email protected]"))

Page 9:

Change State

(host, service)         metric   state
('A', 'req_latency')    20       error
('B', 'req_latency')    1        ok
('C', 'req_latency')    5        error
('D', 'req_latency')    5        ok

Page 10:

Change State

(where (service "req_latency")
  (split
    (< metric 2)  (with :state "ok" index)
    (> metric 10) (with :state "error" index)))

(changed-state {:init "ok"}
  (email "[email protected]"))
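The two thresholds leave a dead band between 2 and 10 in which the state is not reassigned, which is why hosts C and D on the previous slide can both report a metric of 5 with different states. The plain-Clojure sketch below models that behaviour for a single (host, service) stream. It is a simplified model, not Riemann's implementation, and the function names `classify` and `transitions` are made up for illustration:

```clojure
(defn classify
  "Returns \"ok\", \"error\", or nil when the metric falls in the 2..10 dead band."
  [metric]
  (cond
    (< metric 2)  "ok"
    (> metric 10) "error"
    :else         nil))

(defn transitions
  "Returns the sequence of states entered, starting from the state `init`.
  Metrics in the dead band are ignored, so they never cause a transition."
  [init metrics]
  (->> metrics
       (keep classify)   ; drop dead-band readings
       (cons init)       ; start from the initial state
       (dedupe)          ; collapse repeated states
       (rest)))          ; the initial state itself is not a transition

(transitions "ok" [1 5 20 5 12 1])
; => ("error" "ok")
```

Only two alerts would be sent for the six readings: one when the metric first exceeds 10, and one when it drops back below 2.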

Page 11:

Time Window Statistics

Page 12:

Cluster Statistics

Page 13:

Cluster Statistics

(def index-max-of-median (smap folds/maximum index))

(by [:host]
  (where (service "req_latency")
    (percentiles 60 [0.5] index-max-of-median)))

Page 14:

Event Storm Filtering

Page 15:

Event Storm Filtering

(def alert-devops
  (throttle 100 3600
    (rollup 3 3600
      (email "[email protected]"))))

(where (tagged "db-connection-exception") alert-devops)
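The outer `throttle` caps how many events reach the mailer per hour, and the inner `rollup` batches what remains into a few emails. The rate-limiting idea can be sketched in plain Clojure as a function over event timestamps. This is a simplified model that assumes sorted timestamps and windows aligned to multiples of `dt`; it is not Riemann's implementation, and `throttle-times` is a made-up name:

```clojure
(defn throttle-times
  "Keeps at most n timestamps from each dt-second window.
  Assumes `times` is sorted and windows are aligned to multiples of dt."
  [n dt times]
  (->> times
       (partition-by #(quot % dt))   ; group timestamps by window number
       (mapcat #(take n %))))        ; keep the first n of each window

(throttle-times 2 10 [0 1 2 3 11 12 25])
; => (0 1 11 12 25)
```

With the values from the slide (`n` = 100, `dt` = 3600), a storm of thousands of connection exceptions in one hour would be cut to the first hundred before `rollup` sees them.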

Page 16:

Event Enrichment

host 'A'

service ‘req_latency’

state ‘ok’

metric 1

ttl 60

tags ‘important’

tenant 1

Page 17:

Event Enrichment

(defn change-event [my-key my-value & children]
  (fn [event]
    (let [my-event (assoc event my-key my-value)]
      (call-rescue my-event children))))

(change-event :tenant "1" index)
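Enrichment relies on `assoc` returning a new map rather than modifying the event in place, so other streams that receive the same event are unaffected. A minimal plain-Clojure sketch, with an event shaped like the one on the slide (the names `event` and `enriched` are illustrative):

```clojure
(def event
  {:host "A" :service "req_latency" :state "ok" :metric 1})

; assoc returns a new map; the original event is left unchanged.
(def enriched (assoc event :tenant "1"))

(:tenant enriched)   ; => "1"
(:tenant event)      ; => nil
```

For a fixed key and value, the built-in `with` stream already used on the Change State slide should cover the same case, for example `(with :tenant "1" index)`; a custom stream function is mainly useful when the new value has to be computed from the event.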

Page 18:

Tenant 1 Tenant 2 Tenant 3

Page 19:

Multi-Tenancy

(def riemann-agg (tcp-client :host "agg-hostname"))

(changed-state
  (change-event :tenant "1"
    (forward riemann-agg)))

Page 20:

http://riemann.io