Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Create Custom Analytics in Your Browser PyData 2013 Ville Tuulos CEO, Co-Founder

Upload: pydata

Post on 10-May-2015


DESCRIPTION

Video can be found here: https://vimeo.com/63298686

TRANSCRIPT

Page 1: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Create Custom Analytics in Your Browser

PyData 2013

Ville Tuulos
CEO, Co-Founder

Page 2: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Everybody (Click & Play)

Business Analysts (Excel)

IT / DBAs (SQL, Python)

Data Hackers (MapReduce)

People who implement their own infrastructure

Page 3: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Everybody (Click & Play)

Business Analysts (Excel)

IT / DBAs (SQL, Python)

Data Hackers (MapReduce)

People who implement their own infrastructure

Disco

Page 4: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Everybody (Click & Play)

Business Analysts (Excel)

IT / DBAs (SQL, Python)

Data Hackers (MapReduce)

People who implement their own infrastructure

Page 5: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Python is great

Page 6: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Python is great
MapReduce is hard

Page 7: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Python is great
MapReduce is hard

Servers are annoying (cloud or not)

Page 8: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Python is great
MapReduce is hard

Servers are annoying (cloud or not)
Everybody likes real-time

Page 9: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Python is great
MapReduce is hard

Servers are annoying (cloud or not)
Everybody likes real-time

Support healthy workflows

Page 10: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Demo

Page 11: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)
Page 12: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Customer A, Customer B, Customer C

what makes some users very active?

how to reduce churn?

why some users return?

[Figure: three charts, one per customer, with axes labeled Users and Daily Activity]

Page 13: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Simple Complex

Discover

Explore

Page 14: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Simple Complex

Discover

Explore

Infographics

Basic Statistics

Reports

Page 15: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Simple Complex

Discover

Explore

Infographics

Basic Statistics

Reports

Segments

Funnels

Visualizations

Page 16: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Simple Complex

Discover

Explore

Infographics

Basic Statistics

Reports

Query

Segments

Funnels

Slice & Dice

Descriptive Models

Visualizations

Page 17: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Simple Complex

Discover

Explore

Infographics

Basic Statistics

Reports

Query

Segments

Funnels

Clustering

Slice & Dice

Descriptive Models

Visualizations

Predictive Models

Page 18: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

DiscoDB

persistent, immutable, compressed, lightning fast, key-value(s) mapping that supports lazy boolean queries.

Code: https://github.com/discoproject/discodb

Docs: http://discoproject.org/doc/discodb/

Page 19: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

from discodb import DiscoDB

FILES = ['a.txt', 'b.txt', 'c.txt']

def extract_words():
    # Yield (key, value) pairs: each word maps to the file it occurs in.
    for fname in FILES:
        for word in open(fname).read().split():
            yield word, fname

db = DiscoDB(extract_words())

db['dog']           # files that mention 'dog'
db.keys()           # all distinct words
db.unique_values()  # all distinct filenames
db.items()          # all (word, iter(fname)) pairs

Page 20: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Hash Map: hash(Key) → Key ID

Value Map: Key ID → [Value ID, ...]

Keys: Key ID → Key

Values: Value ID → Value

DiscoDB Chunk
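
The four tables above can be mimicked with plain Python containers. The following is a minimal sketch, assuming an ordinary dict in place of the real hash function. The `ToyChunk` class and its method names are hypothetical and are not part of DiscoDB's API or file format.

```python
class ToyChunk:
    """Illustrative stand-in for the four tables (not DiscoDB's real format)."""

    def __init__(self, pairs):
        self.keys = []       # Keys: Key ID -> Key
        self.values = []     # Values: Value ID -> Value
        self.hash_map = {}   # Hash Map: Key -> Key ID (a dict replaces the hash function)
        self.value_map = []  # Value Map: Key ID -> sorted [Value ID, ...]
        value_ids = {}
        for key, value in pairs:
            if key not in self.hash_map:
                self.hash_map[key] = len(self.keys)
                self.keys.append(key)
                self.value_map.append([])
            if value not in value_ids:
                value_ids[value] = len(self.values)
                self.values.append(value)
            ids = self.value_map[self.hash_map[key]]
            if value_ids[value] not in ids:
                ids.append(value_ids[value])
        for ids in self.value_map:
            ids.sort()

    def lookup(self, key):
        """Resolve Key -> Key ID -> Value IDs -> Values."""
        key_id = self.hash_map[key]
        return [self.values[vid] for vid in self.value_map[key_id]]


chunk = ToyChunk([("dog", "a.txt"), ("cat", "a.txt"), ("dog", "c.txt")])
print(chunk.lookup("dog"))
```

Because each distinct value is stored once and referenced by ID, a value shared by many keys costs only one small integer per key.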

Page 21: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Hash Map: hash(Key) → Key ID

Value Map: Key ID → [Value ID, ...]

Keys: Key ID → Key

Values: Value ID → Value

DiscoDB Chunk

Perfect hashing by CMPH, guaranteed O(1)

The list of Value IDs is delta-encoded

Values are compressed with a global Huffman codebook
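
To illustrate why delta-encoding helps, here is a minimal sketch, assuming the Value IDs of a key are kept as a sorted list of integers. The helper names `delta_encode` and `delta_decode` are hypothetical; DiscoDB's actual on-disk encoding is not shown in the slides.

```python
def delta_encode(sorted_ids):
    """Store the first ID as-is and every later ID as the gap to its predecessor."""
    deltas, previous = [], 0
    for value_id in sorted_ids:
        deltas.append(value_id - previous)
        previous = value_id
    return deltas


def delta_decode(deltas):
    """Rebuild the original IDs by accumulating the gaps."""
    ids, current = [], 0
    for delta in deltas:
        current += delta
        ids.append(current)
    return ids


ids = [1000, 1003, 1004, 1010]
encoded = delta_encode(ids)
print(encoded)  # [1000, 3, 1, 6]
```

The gaps are much smaller numbers than the IDs themselves, so they can be stored in fewer bits.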

Page 22: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

[Diagram: Node 1, Node 2, ... Node N. Each is a Disco Node with a Python Worker and several DiscoDB Chunks; DDFS is shown alongside the nodes.]

Page 23: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

DiscoDB: Querying with Conjunctive Normal Form

from discodb.query import Q

A → [Apple, Orange, Banana]
B → [Apple, Banana]
C → [Banana, Melon]

Q("A & B") → Apple, Banana
Q("A | B") → Apple, Orange, Banana
Q("(A | B) & C") → Banana
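
The query semantics can be sketched in plain Python. This is a minimal illustration, assuming a CNF query is given as a list of clauses, where each clause is a list of keys that are OR-ed together. The function `eval_cnf` is hypothetical and does not use the discodb library.

```python
INDEX = {
    "A": {"Apple", "Orange", "Banana"},
    "B": {"Apple", "Banana"},
    "C": {"Banana", "Melon"},
}


def eval_cnf(index, clauses):
    """AND together the clauses; each clause is the OR (union) of its keys' values."""
    result = None
    for clause in clauses:
        union = set()
        for key in clause:
            union |= index.get(key, set())
        result = union if result is None else result & union
    return result if result is not None else set()


print(sorted(eval_cnf(INDEX, [["A"], ["B"]])))       # A & B
print(sorted(eval_cnf(INDEX, [["A", "B"]])))         # A | B
print(sorted(eval_cnf(INDEX, [["A", "B"], ["C"]])))  # (A | B) & C
```

This sketch builds full sets eagerly for clarity, whereas the slides describe DiscoDB's queries as lazy.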

Page 24: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Model: Event → Users

Query (sequence of events): Q("Event A & Event B & ...")

Funnel: https://github.com/tuulos/bd3-mixpanel-funnel
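
The funnel query can be sketched as a chain of set intersections. This is a minimal example with made-up data, assuming each event maps to the set of users who triggered it. The names `EVENT_USERS` and `funnel` are hypothetical and are not taken from the linked repository.

```python
EVENT_USERS = {
    "signup": {"u1", "u2", "u3", "u4"},
    "add_to_cart": {"u1", "u2", "u5"},
    "purchase": {"u2", "u5"},
}


def funnel(event_users, steps):
    """Return (step, user count) pairs; each step keeps users seen in all steps so far."""
    remaining = None
    counts = []
    for step in steps:
        users = event_users.get(step, set())
        remaining = set(users) if remaining is None else remaining & users
        counts.append((step, len(remaining)))
    return counts


print(funnel(EVENT_USERS, ["signup", "add_to_cart", "purchase"]))
```

Like the AND query on the slide, this only checks that a user triggered every event; it does not verify the order in which the events happened.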

Page 25: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Model: Day N → Users

Query (weekly cohorts):

Q("(dayN | dayN+1) & (dayM | dayM+1...)")

Cohort Analysis: https://github.com/tuulos/bd3-mixpanel-cohort
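
The weekly cohort query combines a union over the days of each week with an intersection between weeks. Below is a minimal sketch with made-up data, assuming days are numbered as integers. The helpers `active_in_week` and `retained` are hypothetical and are not taken from the linked repository.

```python
def active_in_week(day_users, first_day, length=7):
    """Union of users seen on any day of the window (dayN | dayN+1 | ...)."""
    users = set()
    for day in range(first_day, first_day + length):
        users |= day_users.get(day, set())
    return users


def retained(day_users, cohort_start, later_start, length=7):
    """Users active in the cohort week AND in a later week."""
    cohort_week = active_in_week(day_users, cohort_start, length)
    later_week = active_in_week(day_users, later_start, length)
    return cohort_week & later_week


DAY_USERS = {0: {"u1", "u2"}, 3: {"u3"}, 7: {"u1"}, 9: {"u3", "u4"}}
cohort = active_in_week(DAY_USERS, 0)
back = retained(DAY_USERS, 0, 7)
print(len(back), "of", len(cohort), "users returned")
```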

Page 26: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Model: Day N → Users

Query (one time series):

[Q(Day K) for K in range(start, end)]

Time Series: https://github.com/tuulos/bd3-mixpanel-trends
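
A time series is then one lookup per day. This is a minimal sketch with made-up data, assuming the value plotted for each day is the number of users active on it. The function `daily_counts` is hypothetical and is not taken from the linked repository.

```python
def daily_counts(day_users, start, end):
    """One lookup per day K in range(start, end); returns the number of users per day."""
    return [len(day_users.get(day, set())) for day in range(start, end)]


DAY_USERS = {0: {"u1", "u2"}, 1: {"u1"}, 3: {"u2", "u3", "u4"}}
print(daily_counts(DAY_USERS, 0, 4))  # [2, 1, 0, 3]
```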

Page 27: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)
Page 28: Bitdeli - A Platform for Creating Custom Analytics in Your Browser (PyData SV 2013)

Thank You!

Free analytics for your GitHub repos: https://bitdeli.com/free

Interested? Contact [email protected]