Transcript
Page 1: DevOps Days Tel Aviv 2013: Ignite Talk: Monitoring Patterns with Riemann - Itai Frenkel & Eli Polonsky

Built for monitoring distributed systems

Event Stream Processing (like ESPER/Drools Fusion)

Shared-State (index)

Open Source (written by aphyr (Kyle Kingsbury))

riemann.io

Page 2:
Page 3:

Concepts

host ‘A’

service ‘req_latency’

state ‘ok’

metric 1

ttl 60

tags ‘important’

event

Page 4:

Concepts

(host, service) last event

(‘A’, ‘req_latency’)

(‘B’, ‘req_latency’)

(‘C’, ‘req_latency’)

(‘D’, ‘req_latency’)

index
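As a rough illustration of the two concepts so far, the sketch below models an event as a plain Clojure map and the index as a map from [host service] to the most recent event. It uses no Riemann code: `update-index` and the sample events are hypothetical names chosen for illustration, and the real index also deals with TTLs and expiry.

```clojure
;; Sketch only: plain Clojure, not Riemann's actual index implementation.
;; An event is a map with the fields shown on the Concepts slide.
(def sample-event
  {:host "A" :service "req_latency" :state "ok"
   :metric 1 :ttl 60 :tags ["important"]})

;; Hypothetical helper: keep only the latest event per (host, service) pair.
(defn update-index [index event]
  (assoc index [(:host event) (:service event)] event))

(def index-after
  (reduce update-index {}
          [sample-event
           (assoc sample-event :host "B" :metric 3)
           (assoc sample-event :metric 7)]))   ; newer event for host "A"

;; The entry for ("A", "req_latency") now holds the event with metric 7.
(println (get-in index-after [["A" "req_latency"] :metric]))
```

The key point is that the index holds one entry per (host, service), so a newer event replaces the older one rather than accumulating.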

Page 5:

Heartbeat Trigger

Page 6:

Heartbeat Trigger

(expired (tagged "keep_alive" (email "[email protected]")))
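The heartbeat pattern relies on TTLs: if a host stops sending its keep-alive event, the indexed entry outlives its ttl and Riemann emits an event for it with state "expired", which the stream above routes to email. A minimal sketch of that staleness check in plain Clojure, assuming event time and ttl are both expressed in seconds; `stale?` is a hypothetical helper, not a Riemann function.

```clojure
;; Sketch only: models the TTL check, not Riemann's index expiry code.
(defn stale?
  "True when the event's ttl has elapsed at time `now` (all values in seconds)."
  [event now]
  (> now (+ (:time event) (:ttl event))))

(def heartbeat
  {:host "A" :service "keep_alive" :tags ["keep_alive"]
   :time 1000 :ttl 60})

(println (stale? heartbeat 1030))  ; checked 30 s after the event, within the ttl
(println (stale? heartbeat 1100))  ; checked 100 s after the event, past the ttl
```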

Page 7:

Threshold Trigger

Page 8:

Threshold Trigger

(where (and (service "req_latency") (> metric 10)) (email "[email protected]"))
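Conceptually, a threshold trigger is a predicate over the event map plus a list of child streams that receive the matching events. The plain-Clojure sketch below mirrors the condition above; `when-matching` and the `alerts` atom (standing in for email) are hypothetical names, not part of Riemann.

```clojure
;; Sketch only: a hand-rolled filter stream, not Riemann's `where` macro.
(defn when-matching
  "Returns a stream (a function of one event) that passes matching events to children."
  [pred & children]
  (fn [event]
    (when (pred event)
      (doseq [child children]
        (child event)))))

(def alerts (atom []))   ; stand-in for the email stream

(def high-latency
  (when-matching
    (fn [e] (and (= "req_latency" (:service e)) (> (:metric e) 10)))
    (fn [e] (swap! alerts conj e))))

(high-latency {:host "A" :service "req_latency" :metric 20})
(high-latency {:host "B" :service "req_latency" :metric 1})

;; Only the event from host "A" exceeded the threshold.
(println (map :host @alerts))
```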

Page 9:

Change State

(host, service)         metric   state
('A', 'req_latency')    20       error
('B', 'req_latency')    1        ok
('C', 'req_latency')    5        error
('D', 'req_latency')    5        ok

Page 10:

Change State

(where (service "req_latency")
  (split
    (< metric 2)  (with :state "ok" index)
    (> metric 10) (with :state "error" index)))

(changed-state {:init "ok"}
  (email "[email protected]"))
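The table on the previous slide shows two hosts with the same metric (5) but different states. One way to read this: a value between the two thresholds matches neither branch, so the previously indexed state stays in place. The sketch below models that reading in plain Clojure and reports only the transitions, similar in spirit to changed-state. `next-state` and `transitions` are hypothetical helpers, and the metric sequence is made up.

```clojure
;; Sketch only: thresholds taken from the slide, helpers are illustrative.
(defn next-state
  "Below 2 is ok, above 10 is error, anything in between keeps the previous state."
  [prev metric]
  (cond (< metric 2)  "ok"
        (> metric 10) "error"
        :else         prev))

(defn transitions
  "Returns [metric new-state] for each metric that changed the state, starting from init."
  [init metrics]
  (let [states (rest (reductions next-state init metrics))
        prevs  (cons init states)]
    (for [[m prev cur] (map vector metrics prevs states)
          :when (not= prev cur)]
      [m cur])))

;; A metric of 5 never triggers a transition: it inherits whatever came before.
(println (transitions "ok" [1 5 20 5 1]))
```

The gap between the thresholds acts as hysteresis, which keeps a metric that hovers around a single limit from generating a stream of alternating alerts.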

Page 11:

Time Window Statistics

Page 12:

Cluster Statistics

Page 13:

Cluster Statistics

(def index-max-of-median (smap folds/maximum index))

(by [:host]
  (where (service "req_latency")
    (percentiles 60 [0.5] index-max-of-median)))
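To make the intent concrete: the stream takes a per-host median of req_latency over a 60-second window, then the largest of those medians. The sketch below does the same arithmetic on in-memory data in plain Clojure. The sample latencies and helper names are hypothetical, and the median here simply picks the element at position floor(n/2) of the sorted values, which may differ from how Riemann selects its percentile event.

```clojure
;; Sketch only: batch arithmetic on made-up data, not a Riemann stream.
(defn median [xs]
  (let [sorted (vec (sort xs))]
    (nth sorted (quot (count sorted) 2))))

(defn max-of-medians
  "latencies-by-host maps a host name to the metrics seen in one time window."
  [latencies-by-host]
  (apply max (map median (vals latencies-by-host))))

(def window
  {"A" [1 2 30]      ; a single outlier does not move the median
   "B" [4 5 6]
   "C" [3 3 9 9]})

(println (max-of-medians window))
```

Taking the maximum of medians highlights the worst-behaving host while ignoring isolated spikes on otherwise healthy hosts.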

Page 14:

Event Storm Filtering

Page 15:

Event Storm Filtering

(def alert-devops
  (throttle 100 3600
    (rollup 3 3600 (email "[email protected]"))))

(where (tagged "db-connection-exception") alert-devops)
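Here throttle caps the alert stream at 100 events per 3600 seconds, and rollup limits delivery to 3 emails in that period, holding back further events and passing them on together as a batch. A minimal sketch of the throttling idea in plain Clojure, using fixed windows over explicit timestamps so that it is deterministic. `throttle-events` is a hypothetical helper, and Riemann's own windowing may differ in detail.

```clojure
;; Sketch only: fixed-window throttling over a finished sequence of events.
(defn throttle-events
  "Keeps at most n events per window of dt seconds and drops the rest.
   Windows are aligned to multiples of dt (an assumption of this sketch)."
  [n dt events]
  (->> events
       (group-by #(quot (:time %) dt))
       (sort-by key)
       (mapcat (fn [[_ evs]] (take n evs)))))

(def storm
  (for [t [0 1 2 3 3600 3601]]
    {:service "db" :tags ["db-connection-exception"] :time t}))

;; With n = 2, the events at t = 2 and t = 3 are dropped from the first window.
(println (map :time (throttle-events 2 3600 storm)))
```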

Page 16:

Event Enrichment

host ‘A’

service ‘req_latency’

state ‘ok’

metric 1

ttl 60

tags ‘important’

tenant 1

Page 17:

Event Enrichment

(defn change-event [my-key my-value & children]
  (fn [event]
    (let [my-event (assoc event my-key my-value)]
      (call-rescue my-event children))))

(change-event :tenant "1" index)
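Because events are immutable maps, enrichment amounts to assoc-ing a new key and handing the resulting map to the child streams. A self-contained sketch in plain Clojure follows. `enrich` and the `seen` atom (standing in for the index) are hypothetical, and unlike call-rescue this version does not catch exceptions thrown by children.

```clojure
;; Sketch only: same idea as change-event, without Riemann's call-rescue.
(defn enrich
  "Returns a stream that adds key k with value v to each event before passing it on."
  [k v & children]
  (fn [event]
    (let [enriched (assoc event k v)]
      (doseq [child children]
        (child enriched)))))

(def seen (atom []))   ; stand-in for the index

(def tag-tenant
  (enrich :tenant "1" (fn [e] (swap! seen conj e))))

(tag-tenant {:host "A" :service "req_latency" :metric 1})

;; The child received the original fields plus the new :tenant key.
(println (first @seen))
```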

Page 18:

Tenant 1 Tenant 2 Tenant 3

Page 19:

Multi-Tenancy

(def riemann-agg (tcp-client :host "agg-hostname"))

(changed-state
  (change-event :tenant "1"
    (forward riemann-agg)))

Page 20:

http://riemann.io

