Scaling real-time visualisations for Elections 2014

S Anand, Chief Data Scientist, Gramener. Scaling Real-time Visualisations for Elections 2014. @sanand0

Upload: gramener

Post on 19-Aug-2014


DESCRIPTION

How Gramener scaled up the Indian Elections 2014 live website

TRANSCRIPT

Page 1: Scaling real-time visualisations for Elections 2014

S Anand, Chief Data Scientist, Gramener

Scaling Real-time Visualisations for Elections 2014

@sanand0

Page 2: Scaling real-time visualisations for Elections 2014
Page 3: Scaling real-time visualisations for Elections 2014

https://gramener.com/election/story.ddp

Page 4: Scaling real-time visualisations for Elections 2014

What’s the largest number of people that stood in an election?

Page 5: Scaling real-time visualisations for Elections 2014
Page 6: Scaling real-time visualisations for Elections 2014

“We’ll cross 5 million visitors tomorrow”

Page 7: Scaling real-time visualisations for Elections 2014
Page 8: Scaling real-time visualisations for Elections 2014
Page 9: Scaling real-time visualisations for Elections 2014

[Architecture diagram
Machines: Nielsen’s server; CNN WinXP laptop (Noida, India); CNN Windows server (Noida, India); Azure Ubuntu server (Singapore)
Components: ETL (twice); Candidate Votes (twice); SQL Server; rsync; nginx; Gramener Visualisation server; Visualisation template; 1 2 3 4 (twice)
Timing labels: Real time; Every 10 seconds (twice)]

Let’s optimize backwards

Page 10: Scaling real-time visualisations for Elections 2014

WHY NGINX?

http://wiki.dreamhost.com/Web_Server_Performance_Comparison

Page 11: Scaling real-time visualisations for Elections 2014

Split load

Cache it
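The slide gives no detail on how caching was set up. As a minimal sketch of the idea only, assuming the 10-second refresh interval shown on the architecture slide: a response rendered once can be reused for every request that arrives before the next refresh. `TTLCache` and `render` are hypothetical names, and in the deck's setup this job presumably belongs to nginx rather than application code.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Reuse a computed response until it is older than `ttl` seconds."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable so the expiry logic can be tested
        self.store: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str, compute: Callable[[], str]) -> str:
        now = self.clock()
        hit = self.store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]      # still fresh: skip the expensive render
        value = compute()      # stale or missing: render once and remember it
        self.store[key] = (now, value)
        return value


if __name__ == "__main__":
    cache = TTLCache(ttl=10)

    def render() -> str:  # hypothetical stand-in for the real page render
        return "<html>results</html>"

    print(cache.get("/election/", render))
```

With a 10-second TTL the backend renders each page at most once per refresh, however many visitors request it.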

Page 12: Scaling real-time visualisations for Elections 2014

Serve static files directly

Page 13: Scaling real-time visualisations for Elections 2014

Compress content

Page 14: Scaling real-time visualisations for Elections 2014

1,518 KB gzipped to 379 KB
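The slide's figures (1,518 KB down to 379 KB) are a reduction to roughly a quarter of the original size. A small sketch with Python's standard `gzip` module shows the same effect on a repetitive JSON payload. The records below are hypothetical, so the exact ratio will differ from the slide's.

```python
import gzip
import json

# Hypothetical payload: repetitive records, loosely shaped like results data.
rows = [
    {"constituency": "Constituency %d" % i, "party": "Party %d" % (i % 5), "votes": 1000 + i}
    for i in range(2000)
]

raw = json.dumps(rows).encode("utf-8")
packed = gzip.compress(raw)

print(len(raw), "bytes raw")
print(len(packed), "bytes gzipped")
print("gzipped size is %.0f%% of raw" % (100 * len(packed) / len(raw)))
```

Repeated keys and similar values compress well, which is why text formats such as JSON, SVG and HTML benefit most.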

Page 15: Scaling real-time visualisations for Elections 2014

wiki.nginx.org

Page 16: Scaling real-time visualisations for Elections 2014

h5bp.github.io

Page 17: Scaling real-time visualisations for Elections 2014

Only 1 image

… but a 3MB SVG

Page 18: Scaling real-time visualisations for Elections 2014

Kraken.io

Page 19: Scaling real-time visualisations for Elections 2014
Page 20: Scaling real-time visualisations for Elections 2014

Inkscape

Page 21: Scaling real-time visualisations for Elections 2014

SVG Compression

2 decimal places: 95KB
3 decimal places: 145KB
4 decimal places: 613KB
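The deck names Inkscape and Kraken.io but does not show how coordinate precision was reduced. As an illustration of why fewer decimal places shrink an SVG, here is a naive sketch that rounds every decimal number in the SVG text. `round_coords` is a hypothetical helper, and it also touches decimals outside path data, so it is not a substitute for a real optimiser.

```python
import re

# Matches decimal numbers such as 12.345678 or -0.123456.
NUMBER = re.compile(r"-?\d+\.\d+")


def round_coords(svg: str, places: int) -> str:
    """Round every decimal number in the SVG text to `places` decimals."""

    def shorten(match):
        text = "%.*f" % (places, float(match.group()))
        # Drop trailing zeros and a dangling point: "12.50" -> "12.5", "7.00" -> "7"
        return text.rstrip("0").rstrip(".")

    return NUMBER.sub(shorten, svg)


path = '<path d="M12.345678,7.000001 L98.765432,-0.123456"/>'
print(round_coords(path, 2))  # <path d="M12.35,7 L98.77,-0.12"/>
```

Each coordinate loses a few characters, and a map with many thousands of coordinates loses a large share of its size.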

Page 22: Scaling real-time visualisations for Elections 2014

[Architecture diagram, repeated from Page 9]

Now, optimize the rendering

Page 23: Scaling real-time visualisations for Elections 2014

We need these filters to work instantly

We cannot afford a server request for every filter change

We need client-side content generation, driven by data

Page 24: Scaling real-time visualisations for Elections 2014

How content is written

Declarative: HTML, XML, Prolog

Procedural: Javascript, Python, Java

Page 25: Scaling real-time visualisations for Elections 2014

How data is used to write it

Templates: Create HTML strings

Binding: Map attributes to functions
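The two approaches on this slide can be contrasted in a short sketch. It is written in Python rather than with the JavaScript libraries the deck goes on to use. The data, `TEMPLATE`, `BINDINGS` and both render functions are hypothetical and only illustrate the distinction.

```python
from html import escape
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]

# Hypothetical data, purely illustrative.
data: List[Row] = [{"party": "A", "seats": 30}, {"party": "B", "seats": 12}]

# Templates: fill placeholders in an HTML string, once per row.
TEMPLATE = '<div class="bar" style="width:{width}px">{label}</div>'


def render_template(rows: List[Row]) -> str:
    return "\n".join(
        TEMPLATE.format(width=row["seats"] * 10, label=escape(str(row["party"])))
        for row in rows
    )


# Binding: describe each attribute as a function of the datum,
# then apply those functions to every row.
BINDINGS: Dict[str, Callable[[Row], str]] = {
    "class": lambda row: "bar",
    "style": lambda row: "width:%dpx" % (row["seats"] * 10),
}


def render_bound(
    rows: List[Row],
    bindings: Dict[str, Callable[[Row], str]],
    text: Callable[[Row], str],
) -> str:
    out = []
    for row in rows:
        attrs = " ".join('%s="%s"' % (name, fn(row)) for name, fn in bindings.items())
        out.append("<div %s>%s</div>" % (attrs, escape(text(row))))
    return "\n".join(out)


print(render_template(data))
print(render_bound(data, BINDINGS, lambda row: str(row["party"])))
```

Both calls produce the same markup. The difference is where the data-to-attribute mapping lives: inside a string, or in functions that can be changed one attribute at a time.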

Page 26: Scaling real-time visualisations for Elections 2014

Examples of representative libraries

Declarative, Templates: Underscore

Procedural, Templates: jQuery

Declarative, Binding: knockout

Procedural, Binding: d3

Let’s make a bar chart with each of these

https://github.com/sanand0/fifthel-2014

Page 27: Scaling real-time visualisations for Elections 2014

underscore: declare a template

Page 28: Scaling real-time visualisations for Elections 2014

jQuery: procedurally create the HTML

Page 29: Scaling real-time visualisations for Elections 2014

knockout: declaratively bind data to HTML

Page 30: Scaling real-time visualisations for Elections 2014

d3: procedurally bind data to elements and attributes

Page 31: Scaling real-time visualisations for Elections 2014

[Architecture diagram, repeated from Page 9]

Finally, optimize data

Page 32: Scaling real-time visualisations for Elections 2014

1.5 MB of data every second

but some of it is static, some is redundant

and some misspelt or wrong

Page 33: Scaling real-time visualisations for Elections 2014

Correct mis-spellings

Load just what you need (query time reduced by 70%)

Page 34: Scaling real-time visualisations for Elections 2014

Normalise static data
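One way to read "normalise static data", sketched under assumptions: fields that never change during counting (names, parties) are sent once as a lookup, and each refresh carries only an id and the live vote count. The field names and the `normalise` / `denormalise` helpers are hypothetical, not taken from the deck.

```python
# Hypothetical feed rows: names and parties repeat on every refresh, votes change.
feed = [
    {"id": 1, "name": "Candidate One", "party": "Party A", "votes": 120},
    {"id": 2, "name": "Candidate Two", "party": "Party B", "votes": 95},
]

STATIC_FIELDS = ("name", "party")


def normalise(rows):
    """Split rows into a static lookup (send once) and a live table (send often)."""
    static = {row["id"]: {field: row[field] for field in STATIC_FIELDS} for row in rows}
    live = [[row["id"], row["votes"]] for row in rows]
    return static, live


def denormalise(static, live):
    """Rebuild full rows on the client by joining the live table to the lookup."""
    return [dict(id=row_id, votes=votes, **static[row_id]) for row_id, votes in live]


static, live = normalise(feed)
print(live)  # [[1, 120], [2, 95]]
```

The live table is what gets refreshed, so its size, not the size of the full rows, determines the recurring bandwidth.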

Page 35: Scaling real-time visualisations for Elections 2014

Refresh only the changed data

27KB

JSON? Redundancy

When gzipped, JSON is no larger than CSV

JSON is natively parsed and more flexible
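"Refresh only the changed data" can be sketched as a diff between two snapshots. The snapshot shape (candidate id mapped to votes) and the helper names `changed` and `apply_delta` are assumptions for illustration. Removed keys are not handled.

```python
import json


def changed(previous: dict, current: dict) -> dict:
    """Return only the entries of `current` that are new or differ from `previous`."""
    return {
        key: value
        for key, value in current.items()
        if key not in previous or previous[key] != value
    }


def apply_delta(previous: dict, delta: dict) -> dict:
    """Client side: merge a delta into the last known snapshot."""
    merged = dict(previous)
    merged.update(delta)
    return merged


# Hypothetical snapshots taken one refresh apart.
before = {"c1": 100, "c2": 200, "c3": 50}
after = {"c1": 100, "c2": 230, "c3": 50, "c4": 10}

delta = changed(before, after)
print(json.dumps(delta))  # {"c2": 230, "c4": 10}
```

Only the delta travels over the wire on each refresh. The client keeps the previous snapshot and merges the delta into it.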

Page 36: Scaling real-time visualisations for Elections 2014

“We’ll cross 5 million visitors tomorrow”

Page 37: Scaling real-time visualisations for Elections 2014

[Chart: x-axis 1 to 23, y-axis 0 to 1,400,000]

Half a million just in the first hour

Page 38: Scaling real-time visualisations for Elections 2014
Page 39: Scaling real-time visualisations for Elections 2014

[Chart: x-axis 1 to 23, y-axis 0 to 1,400,000]

Over 1.3 million in the next!

Page 40: Scaling real-time visualisations for Elections 2014

[Chart: x-axis 1 to 23, y-axis 0 to 1,400,000]

10 million visits on election day

Page 41: Scaling real-time visualisations for Elections 2014
Page 42: Scaling real-time visualisations for Elections 2014

Does age make a difference? Do old candidates win less often?

Page 43: Scaling real-time visualisations for Elections 2014

[Chart: Lok Sabha (2004 onwards)]

Win %: The number of winning candidates as a % of candidates in the age group (axis 0% to 40%)

Candidates: The number of candidates in each age group (axis 0 to 2500)

Age group   Win %
25-30       1%
30-35       2%
35-40       4%
40-45       6%
45-50       9%
50-55       11%
55-60       14%
60-65       11%
65-70       16%
70-75       18%
75-80       22%
80-85       22%
85-90       33%
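The chart's "Win %" is defined as winners divided by candidates within each age group. A sketch of that arithmetic follows. The candidate list is hypothetical, and it assumes 5-year bins with the lower bound inclusive, which the chart labels do not make explicit.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def win_rate_by_age(candidates: Iterable[Tuple[int, bool]], width: int = 5) -> Dict[str, float]:
    """Win % per age group: winners as a percentage of candidates in the group."""
    total: Counter = Counter()
    wins: Counter = Counter()
    for age, won in candidates:
        start = age // width * width          # e.g. 62 -> 60
        group = "%d-%d" % (start, start + width)
        total[group] += 1
        wins[group] += int(won)
    return {group: 100.0 * wins[group] / total[group] for group in total}


# Hypothetical (age, won) pairs, not real election data.
sample = [(27, False), (29, True), (31, False), (61, False), (62, True), (63, False), (64, True)]
print(win_rate_by_age(sample))  # {'25-30': 50.0, '30-35': 0.0, '60-65': 50.0}
```

Groups with few candidates give noisy percentages, so a second series showing the number of candidates per group helps in reading the chart.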

Page 44: Scaling real-time visualisations for Elections 2014
Page 45: Scaling real-time visualisations for Elections 2014
Page 46: Scaling real-time visualisations for Elections 2014

Name length

Page 47: Scaling real-time visualisations for Elections 2014