Zendesk @ clj-melb

Transcript
Page 1: Zendesk @ clj-melb

Data @ Zendesk

Clojure, Cascalog, Hadoops and Datas

Page 2: Zendesk @ clj-melb
Page 3: Zendesk @ clj-melb
Page 4: Zendesk @ clj-melb

Web

Data

Page 5: Zendesk @ clj-melb

We has Data!

But…
● There is (too?) much of it
● I t ’ s   s p r e a d   o u t
● Optimised for other stuff

Page 6: Zendesk @ clj-melb

What we want from our data

Lower barrier to entry for analytics

Add value for our customers

Page 7: Zendesk @ clj-melb

What we want from our solution

Understandable & Concise

not

Open Source

Extensible

Customisable

Page 8: Zendesk @ clj-melb

Headphones

Page 9: Zendesk @ clj-melb

We settled on

Page 10: Zendesk @ clj-melb

(def cascalog "Pretty")

(ns impatient.core
  (:use [cascalog.api]
        [cascalog.more-taps :only (hfs-delimited)])
  (:require [clojure.string :as s]
            [cascalog.ops :as c])
  (:gen-class))

(defmapcatop split [line]
  "reads in a line of string and splits it by regex"
  (s/split line #"[\[\]\\\(\),.)\s]+"))

(defn -main [in out & args]
  (?<- (hfs-delimited out)
       [?word ?count]
       ((hfs-delimited in :skip-header? true) _ ?line)
       (split ?line :> ?word)
       (c/count ?count)))

Page 11: Zendesk @ clj-melb

It was a journey, we learnt lots

Page 12: Zendesk @ clj-melb

Cascalog Basics in Gorilla

● Taps & Sinks
● Group By, Aggregation & Filters
● Joins & Function Calls

Page 13: Zendesk @ clj-melb

(def review-scores (repeatedly 5000 rand))

(defn grab-score [x] {:score [x]})

; BAD - stack overflow
(def combine-score (partial merge-with concat))
; BETTER - no stack overflow, but wait for GC
(def combine-score (partial merge-with (comp doall concat)))
; BEST - snappy fast
(def combine-score (partial merge-with into))

(defparallelagg bucket-scores
  :init-var #'grab-score
  :combine-var #'combine-score)

(defn median-scores [bucketed-scores]
  {:median-score (median (:score bucketed-scores))})

(??<- [?median-score]
      (review-scores :> ?score)
      (bucket-scores :< ?score :> ?bucketed-scores)
      (median-scores :< ?bucketed-scores :> ?median-score))
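
The slide code calls `median` without showing its definition. A minimal sketch of what such a helper might look like, assuming a non-empty collection of numbers (the function below is hypothetical, not the one used in the talk):

```clojure
;; Hypothetical helper: the slide uses `median` but does not define it.
(defn median
  "Returns the median of a non-empty collection of numbers."
  [xs]
  (let [sorted (vec (sort xs))   ; vector gives cheap indexed access
        n      (count sorted)
        mid    (quot n 2)]
    (if (odd? n)
      (nth sorted mid)
      ;; even count: average the two middle values
      (/ (+ (nth sorted (dec mid)) (nth sorted mid)) 2))))

(median [3 1 2])   ; => 2
(median [4 1 3 2]) ; => 5/2
```

With integer input an even-sized collection yields a ratio; the `rand` scores on the slide are doubles, so the result there would be a double.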

Learnings

Lazy sequences are not always your friend
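
The BAD versus BEST difference can be reproduced in plain Clojure without Cascalog. This is only a sketch of the likely mechanism: `reduce` stands in for the repeated combine step, and the names other than `grab-score` are made up for illustration.

```clojure
;; Same shape as the slide: each score becomes a one-element bucket.
(defn grab-score [x] {:score [x]})

(def buckets (map grab-score (range 100000)))

;; concat is lazy: every combine wraps the previous result in another
;; unrealised layer, giving 100000 nested lazy seqs and no actual work.
(def lazy-combined
  (reduce (partial merge-with concat) buckets))

;; into is eager: each combine appends to a vector straight away.
(def eager-combined
  (reduce (partial merge-with into) buckets))

(count (:score eager-combined)) ; => 100000

;; Realising the nested concat has to descend through every layer,
;; which on a default-sized JVM stack typically overflows.
(def lazy-result
  (try
    (count (:score lazy-combined))
    (catch StackOverflowError _ :stack-overflow)))
```

The `(comp doall concat)` variant avoids the nesting by realising each intermediate result, at the cost of re-walking and re-allocating a growing sequence on every combine, which would be consistent with the "wait for GC" remark.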

Page 14: Zendesk @ clj-melb

Midje for Testing. And why it’s good

Page 15: Zendesk @ clj-melb

The Result

Page 16: Zendesk @ clj-melb

Bonus!

Clojure from Python (for prettier graphs)

