business intelligence barista: what dataviz tool to use, and when?
Datavis Barista: How to choose what dataviz tool, and when
Jen Stirrup, Founder, Data Relish
Level: Intermediate
Who am I? Jen Stirrup
What dataviz tool to choose, and when?
• SSRS
• Excel
• Tableau
• Power BI
• Datazen
• Kibana
What makes a good Visualisation?
• Effective
• Accurate
• Efficient
• Aesthetic
• Adaptable
https://www.quora.com/What-are-the-worst-infographics-youve-ever-seen
http://www.designyourway.net/blog/inspiration/when-infographics-go-bad-or-how-not-to-design-data-visualization/
Excel, Datazen, SSRS, Power BI
SQL Server 2016
Datazen
Treemaps in Power BI
Waterfall Charts in Power BI
Example in Power BI, SSRS
• Demo
Why R?
• Most widely used data analysis software: used by 2M+ data scientists, statisticians and analysts
• Most powerful statistical programming language
• Flexible, extensible and comprehensive for productivity
• Create beautiful and unique data visualisations, as seen in the New York Times, Twitter and Flowing Data
• Thriving open-source community at the leading edge of analytics research
• Fills the talent gap: new graduates prefer R
Growth in Demand
• Rexer Data Mining survey, 2007 - 2013
• R is the highest-paid IT skill. Dice.com, Jan 2014
• R is the most-used data science language after SQL. O'Reilly, Jan 2014
• R is used by 70% of data miners. Rexer, Sept 2013
Growth in Demand
• R is #15 of all programming languages. RedMonk, Jan 2014
• R is growing faster than any other data science language. KDnuggets
• R is in-memory and limited in the size of data that you can process.
What do I need to install?
• Install R – www.r-project.org
• Install RStudio – www.rstudio.com
• Handy Shortcuts
– Tab: autocomplete of available functions
– Control and Up Arrow: command history
– Control and Enter: executes the current line of code
What tools do we have in R?
• 80% of your time will be spent preparing and wrangling data
• The remainder of your time will be spent complaining about it.
• dplyr: the essential data manipulation toolset
• In data wrangling, what are the main tasks?
– Filtering rows
– Selecting columns of data
– Adding new variables
– Sorting
– Aggregating
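The five wrangling tasks are not specific to R. As a minimal sketch only, here they are in plain Python on a tiny made-up dataset, with the corresponding dplyr verb named in each comment. The `sales` rows, the field names and the `wrangle` helper are all hypothetical, chosen purely for illustration; this is not the demo from the session.

```python
# Hypothetical sample data: field names and values are invented for illustration.
sales = [
    {"region": "North", "product": "tea", "units": 10, "unit_price": 2.5},
    {"region": "South", "product": "coffee", "units": 4, "unit_price": 3.0},
    {"region": "North", "product": "coffee", "units": 7, "unit_price": 3.0},
    {"region": "South", "product": "tea", "units": 1, "unit_price": 2.5},
]


def wrangle(rows, min_units=2):
    """Run the five wrangling tasks in order; returns (sorted rows, totals per region)."""
    # 1. Filtering rows (dplyr verb: filter)
    kept = [r for r in rows if r["units"] >= min_units]
    # 2. Selecting columns of data (dplyr verb: select)
    slim = [{k: r[k] for k in ("region", "units", "unit_price")} for r in kept]
    # 3. Adding new variables (dplyr verb: mutate)
    with_revenue = [{**r, "revenue": r["units"] * r["unit_price"]} for r in slim]
    # 4. Sorting, largest revenue first (dplyr verb: arrange)
    ordered = sorted(with_revenue, key=lambda r: r["revenue"], reverse=True)
    # 5. Aggregating revenue per region (dplyr verbs: group_by + summarise)
    totals = {}
    for r in ordered:
        totals[r["region"]] = totals.get(r["region"], 0.0) + r["revenue"]
    return ordered, totals


ordered, totals = wrangle(sales)
print(totals)
```

Each step takes the output of the previous one, which is the same pipeline shape that dplyr code usually has.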
Example in R
• Demo
The Big Data problem
• Reaction Time
• Enrichment
• Insights
• Optimize for query, not for storage.
Can you check the errors between 12.02 and 12.04 yesterday?
Can you check the errors between 12.02 and 12.04 last Friday?
…. Are you kidding me?
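A narrow time-window question like the ones above becomes manageable once the logs are indexed for querying. Kibana reads its data from Elasticsearch, so here is a minimal sketch of an Elasticsearch-style query body for "errors between 12:02 and 12:04 on a given day". The helper name `error_window_query` and the field names `@timestamp` and `level` are assumptions; a real index may name and map these fields differently.

```python
import json
from datetime import date, datetime, timedelta


def error_window_query(day, start_hm, end_hm,
                       ts_field="@timestamp", level_field="level"):
    """Build a query body for error events inside a time window on one day.

    ts_field and level_field are assumed field names, not taken from the talk.
    """
    start = datetime.combine(day, datetime.strptime(start_hm, "%H:%M").time())
    end = datetime.combine(day, datetime.strptime(end_hm, "%H:%M").time())
    return {
        "query": {
            "bool": {
                "filter": [
                    # Keep only events whose level is exactly "error".
                    {"term": {level_field: "error"}},
                    # Keep only events inside the half-open window [start, end).
                    {"range": {ts_field: {"gte": start.isoformat(),
                                          "lt": end.isoformat()}}},
                ]
            }
        }
    }


yesterday = date.today() - timedelta(days=1)
print(json.dumps(error_window_query(yesterday, "12:02", "12:04"), indent=2))
```

For "last Friday", only the `day` argument changes; the window logic stays the same.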
What is Kibana?
Kibana
• Highly customizable dashboarding
• Dashboards are made up of panels:
– Time picker / Query / Filtering
– Charts / Table / Text
• Flexible analytics and visualization platform
• Real-time summary and charting of streaming data
• Intuitive interface for a variety of users
• Instant sharing and embedding of dashboards
To better understand large volumes of data, easily create:
• bar charts
• line and scatter plots
• histograms
• pie charts
• maps
Kibana DataViz Types
Default Chart Types
Chart Type | Basis           | Values               | Types                            | Purpose
Histogram  | Timestamp based | Count, Mean, Total   | Bars, lines, stacks, percentages | Queries
Table      | Paging          | Fields list          | Highlighting, sorting            | Fine grained analysis
Pie Charts | Terms           | Missing terms, other | Doughnut, legends, tables        | Proportion
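To make the histogram row concrete: a timestamp-based histogram groups events into time buckets and reports a count, mean or total per bucket. The following is a minimal sketch of that idea in plain Python, not Kibana's own implementation. The `minute_histogram` helper, the one-minute bucket size and the sample events (timestamp, response time in ms) are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def minute_histogram(events):
    """Group (timestamp, value) pairs into one-minute buckets.

    Returns {bucket_start: {"count", "total", "mean"}} ordered by bucket start.
    """
    buckets = defaultdict(list)
    for ts, value in events:
        # Truncate the timestamp to the start of its minute.
        buckets[ts.replace(second=0, microsecond=0)].append(value)
    return {
        start: {
            "count": len(values),
            "total": sum(values),
            "mean": sum(values) / len(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical events: (timestamp, response time in ms).
events = [
    (datetime(2016, 1, 15, 12, 2, 5), 120),
    (datetime(2016, 1, 15, 12, 2, 40), 80),
    (datetime(2016, 1, 15, 12, 3, 10), 200),
]
for start, stats in minute_histogram(events).items():
    print(start.strftime("%H:%M"), stats)
```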
Summary
• SSRS
• Excel
• Tableau
• Power BI
• Datazen
• Kibana