High-Performance Calculations
Simple tricks to make some Tableau calculations execute hundreds of times faster
PRESENTED BY
Richard Wesley, Senior Software Engineer
©2012 Tableau Software Inc. All rights reserved.
Overview
• Why do we need fast calculations?
• Real-life examples:
  • Computing dates from Unix timestamps
  • Computing dates from “yyyymmdd” columns
  • Displaying numbers as text in a viz
  • Computing a nested set (combined field)
• Coming attractions
  • How version 8 speeds up some common calculations
Overview: The Need for Speed
Your organisation can respond faster
You can stay in a “flow” state longer
Overview: One…Billion…Rows!
All calculations run against 1 billion rows
Amplifies differences to human scale
• Range is from 6 seconds to 5 hours
Computers Compute!
A case study in calculation performance
Unix Times: The Problem
Customer Task:
• Convert a column of Unix timestamps to dates
Timestamps are 64-bit integers
• Contain the number of milliseconds since 1970-01-01
Need to convert timestamps to dates for analysis
• Human-style units like years, months, days (“binning”)
Unix Times: Original Version
Meaning:
• Convert the number to a string
• Take the left 10 characters
• Change that to an integer (seconds, for a 13-digit millisecond timestamp)
• Divide by 86400 to get days
• Add to the “zero date”
Computing one billion values takes 3 hours and 45 minutes!
• Version 8 still takes about 30 minutes.
DATE("1/1/1970") + INT( INT( LEFT( STR( [unix] ), 10 ) ) / 86400 )
Unix Times: Numeric Version
Meaning:
• Convert the number to seconds by dividing
• Add those seconds to the zero date
• Remove the time part
Computing one billion values takes 45 seconds
• That is 300x faster than the string version in version 7 (3 hours 45 minutes)!
• Still 40x faster than the string version in version 8 (30 minutes).
DATE( DATEADD( 'second', INT( [unix] / 1000 ), #1970-01-01# ) )
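The string-based and numeric approaches can be compared outside Tableau. The following is a minimal Python sketch, not Tableau code. The function names are made up for illustration, and it assumes 13-digit millisecond timestamps, so that the first 10 digits are whole seconds and 86400 seconds make one day.

```python
from datetime import date, datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # the "zero date"


def unix_ms_to_date_via_string(unix_ms: int) -> date:
    """String route: number -> text -> first 10 characters -> integer -> days."""
    seconds = int(str(unix_ms)[:10])  # only valid for 13-digit timestamps
    return EPOCH.date() + timedelta(days=seconds // 86400)


def unix_ms_to_date_numeric(unix_ms: int) -> date:
    """Numeric route: divide to get seconds, add to the zero date, drop the time."""
    return (EPOCH + timedelta(seconds=unix_ms // 1000)).date()


if __name__ == "__main__":
    stamp = 1340000000000  # hypothetical sample value
    print(unix_ms_to_date_via_string(stamp), unix_ms_to_date_numeric(stamp))
```

Both functions agree on the sample value, but only the numeric one avoids building and slicing a string for every row.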
Unix Times: Strings are Slow
Need to look at each character
Need to figure out how many characters there are
Need to find space for the answer
Need to copy each character
…and so on
Often takes 10-100 instructions per value
Unix Times: Numbers are Fast
Computers are good at arithmetic
• They “compute”!
Many arithmetic operations take only one instruction
• 2.66 GHz processor = roughly 2.66 billion instructions / second
The more arithmetic you use in place of string handling, the faster your calculations will be
Unix Times: When Does it Help?
These numbers are for the Tableau Data Engine
Should also work for most analytic databases:
• Vertica, ParAccel, VectorWise, etc.
May not help on slow databases
• MySQL, Text Files, Excel
• If you extract, it should help
Creating Dates
Use date arithmetic instead of string parsing
Creating Dates: The Problem
Customer Task:
• Convert a column of numbers to dates
The numbers are in the form yyyymmdd
Need to convert them to dates for analysis
• Time series
• Binning (weeks, quarters)
Creating Dates: U.S.A. Strings
Answer taken from an old in-house training manual
Meaning:
• Build a string in “mm/dd/yyyy” format
• Cast it to a date
Problems:
• One billion values takes 5 hours
• Only works where dates are read as month/day/year (the U.S.)
DATE( MID( STR( [yyyymmdd] ), 5, 2 ) + "/" + RIGHT( STR( [yyyymmdd] ), 2 ) + "/" + LEFT( STR( [yyyymmdd] ), 4 ) )
Creating Dates: ISO Strings
Meaning:
• Build a string in “yyyy-mm-dd” format
• Cast it to a date
Works in any country
• Data engine tries this format first
One billion values still takes 5 hours
DATE( LEFT( STR( [yyyymmdd] ), 4 ) + "-" + MID( STR( [yyyymmdd] ), 5, 2 ) + "-" + RIGHT( STR( [yyyymmdd] ), 2 ) )
Creating Dates: Date Arithmetic
Meaning:
• Get date parts using division (/) and remainder (%)
• Use date arithmetic to add the parts to a date constant
• Division gives real numbers, INT fixes this.
One billion values takes 64 seconds
• 280x faster
• This is the difference between stretching your legs and coming back in the morning!
DATEADD( 'day', [yyyymmdd] % 100 - 1,
  DATEADD( 'month', INT( ( [yyyymmdd] % 10000 ) / 100 ) - 1,
    DATEADD( 'year', INT( [yyyymmdd] / 10000 ) - 1900, #1900-01-01# ) ) )
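The division-and-remainder step can be checked with a small Python sketch. This is an illustration under assumptions, not the Tableau calculation itself: the function names are hypothetical, and Python's `date` constructor stands in for the nested DATEADD calls, so invalid inputs raise an error here rather than rolling over into a later date.

```python
from datetime import date


def yyyymmdd_to_date(n: int) -> date:
    """Split an integer like 20120615 into parts using only integer arithmetic."""
    year = n // 10000           # 2012
    month = (n % 10000) // 100  # 6
    day = n % 100               # 15
    return date(year, month, day)


def yyyymmdd_to_date_via_string(n: int) -> date:
    """String route for comparison: build an ISO 'yyyy-mm-dd' string and parse it."""
    s = str(n)
    return date.fromisoformat(f"{s[:4]}-{s[4:6]}-{s[6:]}")


if __name__ == "__main__":
    print(yyyymmdd_to_date(20120615), yyyymmdd_to_date_via_string(20120615))
```

The arithmetic version never allocates a string, which is the property the timings above depend on.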
Creating Dates: Strings are Slow
The original calculation has four concatenations
• Each one needs different amounts of memory
• Each one needs to copy the characters
• …and so on
Changing a number to a string has similar problems
Reading dates from text is tricky
• What country are we in?
• Is 5/3/1983 the 3rd of May or the 5th of March?
Creating Dates: Numbers are Fast
Numbers all have the same size
Copying numbers is fast
Date arithmetic is still arithmetic
• Not as simple as addition BUT
• Still only a few instructions
• Computers are good at arithmetic!
Creating Dates: Useful Numeric Functions
LOG
• How many digits are there?
ABS / SIGN
• Remove / extract the sign of the number
MIN / MAX
• Smallest / largest of two values
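As a rough illustration of how these functions can replace string operations, here is a short Python sketch. The helper names are hypothetical. It assumes non-zero integers, and because it relies on floating-point logarithms it may need extra care for very large values that sit right next to a power of ten.

```python
import math


def digit_count(n: int) -> int:
    """Number of decimal digits, using LOG instead of the length of a string."""
    return int(math.log10(abs(n))) + 1


def sign(n: int) -> int:
    """-1, 0 or 1, like a SIGN function."""
    return (n > 0) - (n < 0)


def leading_digits(n: int, k: int) -> int:
    """First k digits of a positive integer: a numeric stand-in for LEFT(STR(n), k)."""
    return n // 10 ** (digit_count(n) - k)


if __name__ == "__main__":
    print(digit_count(20120615), sign(-42), leading_digits(1340000000000, 10))
```

For example, `leading_digits` reproduces the "take the left 10 characters" step of a string-based timestamp conversion without creating any text.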
Presenting Numbers
Move display formatting out of your calculations
Presenting Numbers: The Problem
Customer Task:
• Data only has the day of the quarter
• User wants to group data by the week of a quarter
• Weeks should be labeled nicely
Need a calculation to convert the day to the week
Need to format it for display
Presenting Numbers: If Then Else
Meaning:
• Check all 14 possible ranges one after another
• Label out of bounds values as “Other”
Lots of typing means lots of mistakes
A billion rows takes several minutes
• ~7 minutes in 7.0
• ~4 minutes in 8.0
IF [Day Of Quarter] < 7 THEN "Week #1"
ELSEIF [Day Of Quarter] < 14 THEN "Week #2"
…
ELSEIF [Day Of Quarter] < 91 THEN "Week #13"
ELSE "Other"
END
Presenting Numbers: Aliases
Solution 1: Use aliases
• Rewrite calculation to return numbers
• Create aliases for the values: “Week #1” etc.
Only takes 36s on a billion rows
• 12x faster than the IF/THEN/ELSE version in 7.0
• 6x faster than the IF/THEN/ELSE version in 8.0
Problems:
• Typing aliases is still error prone
• The alias dialogue is slow because Tableau must find all the values
INT( [Day Of Quarter] / 7 ) + 1
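A quick way to convince yourself that the one-line numeric calculation matches a chain of range checks is to run both over every valid day. The Python sketch below is illustrative only: the function names are invented, and it assumes the day of the quarter is zero-based and runs from 0 to 90.

```python
def week_label_if_chain(day_of_quarter: int) -> str:
    """Range checks one after another, like an IF / ELSEIF chain."""
    for week in range(1, 14):
        if day_of_quarter < 7 * week:
            return f"Week #{week}"
    return "Other"


def week_number(day_of_quarter: int) -> int:
    """Numeric version: integer division, no strings involved."""
    return day_of_quarter // 7 + 1


if __name__ == "__main__":
    mismatches = [d for d in range(91)
                  if week_label_if_chain(d) != f"Week #{week_number(d)}"]
    print("mismatches:", mismatches)
```

The two agree for every in-range day. Out-of-range days differ (the chain labels them “Other”, the arithmetic returns 14 or more), so that case needs separate handling if it can occur in the data.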
Presenting Numbers: Formatting
Solution 2: Use column formatting
• Rewrite the calculation to return numbers
• Apply number formatting to the column: "Week #"0
Still only takes 36s on a billion rows
Viz updates live as you edit!
• Much easier to correct mistakes
• Formatting editor doesn’t need to run queries
INT( [Day Of Quarter] / 7 ) + 1
Presenting Numbers: Strings are Slow
Databases can format output for historical reasons
• Remember teletypes? Line printers?
Database formatting has to be done on every row
Grouping by string calculations can be much slower than grouping by numbers
• Need to compare entire strings instead of string identifiers
Presenting Numbers: Presentation is Fast
Grouping by numbers only compares numbers
• 10x-100x faster than strings
Grouping reduces the number of rows returned
Aliases and formatting are applied after the query
• Changing the formatting in Tableau does not run queries
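The idea of grouping first and formatting afterwards can be sketched in Python. This is a toy model, not how Tableau is implemented: `count_by_week` plays the role of the query and `present` plays the role of display formatting, and both names are made up for the example.

```python
from collections import Counter


def count_by_week(days_of_quarter):
    """The 'query': group rows by a numeric week key and count them."""
    return Counter(d // 7 + 1 for d in days_of_quarter)


def present(counts, template="Week #{}"):
    """The 'formatting': applied once per group, after the query has finished."""
    return {template.format(week): n for week, n in sorted(counts.items())}


if __name__ == "__main__":
    rows = list(range(91)) * 3          # 273 hypothetical rows
    counts = count_by_week(rows)        # touches every row, numbers only
    print(present(counts))              # touches only 13 groups
    print(present(counts, "W{}"))       # new labels, no re-grouping needed
```

Changing the label template only re-runs `present` over the 13 groups; the grouping over all rows is not repeated.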
Combined Fields
Use Sets instead of concatenated strings
Combined Fields: The Problem
Customer Task:
• User wants to create a multi-column set from two or more string columns
• The user may want to change the column separator
Combined Fields: Concatenation
Taken from in-house training as an alternative to nested fields (called “combined fields” in version 8)
Meaning:
• Concatenate two strings together
• Use a separator string that can be changed
Problems:
• One billion rows takes almost 9 minutes
• Changing the separator requires re-running the query
[Month] + ", " + [Weekday]
Combined Fields: Set
Use the “Combine Fields…” menu item to create a field that shows both fields with a user-specified separator
• “Create Set…” in v7
Changing the separator does not run a new query
Performance is extremely fast
• 6 seconds on one billion rows
• 90x faster than concatenation
Combined Fields: Strings are Slow
String concatenation is very hard to make fast
• Must build all the combinations from every row
Grouping by calculated strings is slow
• Calculations don’t have string identifiers
Combined Fields: Numbers are Fast
Unmodified string columns are really numbers
• One number per unique string
• Grouping by them is like grouping by numbers
Tableau formats combined fields after the query
• Changing the formatting doesn’t run another query
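The “one number per unique string” idea is commonly called dictionary encoding. The Python sketch below shows the general technique under that assumption. It is not a description of the data engine's internals, and all function names are hypothetical.

```python
from collections import Counter


def dictionary_encode(values):
    """Return (ids, dictionary): one small integer per row, one entry per unique string."""
    dictionary, index, ids = [], {}, []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        ids.append(index[v])
    return ids, dictionary


def group_combined(ids_a, ids_b):
    """Group by the pair of identifiers; no strings are compared or built."""
    return Counter(zip(ids_a, ids_b))


def format_groups(groups, dict_a, dict_b, separator=", "):
    """Build display labels once per group, after grouping is done."""
    return {dict_a[a] + separator + dict_b[b]: n for (a, b), n in groups.items()}


if __name__ == "__main__":
    month_ids, months = dictionary_encode(["Jan", "Jan", "Feb", "Jan"])
    day_ids, days = dictionary_encode(["Mon", "Mon", "Tue", "Tue"])
    groups = group_combined(month_ids, day_ids)
    print(format_groups(groups, months, days))
    print(format_groups(groups, months, days, separator=" / "))
```

Switching the separator only re-runs `format_groups` over the handful of groups, which mirrors why changing the separator of a combined field does not need a new query.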
Coming Attractions
Some things we have made faster in the version 8 data engine
Coming Attractions: If/Then/Else
Version 7 evaluated both sides of IFs
• Computed the ELSE side even when true
• Computed the THEN side even when false
Especially bad when Tableau nested many of them
Fixed!
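The cost of evaluating both sides can be illustrated with a small Python model that counts how often each branch runs. It is a sketch of the general difference between eager and lazy branch evaluation, with invented helper names, and makes no claim about how either Tableau version is actually implemented.

```python
def if_eager(conditions, values, then_fn, else_fn):
    """Evaluate both branches for every row, then pick one per row."""
    then_results = [then_fn(v) for v in values]
    else_results = [else_fn(v) for v in values]
    return [t if c else e for c, t, e in zip(conditions, then_results, else_results)]


def if_lazy(conditions, values, then_fn, else_fn):
    """Evaluate only the branch that each row actually needs."""
    return [then_fn(v) if c else else_fn(v) for c, v in zip(conditions, values)]


def counted(fn):
    """Wrap fn so the number of calls can be inspected afterwards."""
    def wrapper(x):
        wrapper.calls += 1
        return fn(x)
    wrapper.calls = 0
    return wrapper


if __name__ == "__main__":
    conds, vals = [True, False, True, True], [10, 20, 30, 40]
    t, e = counted(lambda v: v * 2), counted(lambda v: -v)
    print(if_eager(conds, vals, t, e), t.calls + e.calls)
    t, e = counted(lambda v: v * 2), counted(lambda v: -v)
    print(if_lazy(conds, vals, t, e), t.calls + e.calls)
```

Both functions return the same values, but the eager one does twice the branch work here, and the gap grows with every extra level of nesting.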
Coming Attractions: Case
Version 7 did not have a CASE statement
• Made us build huge if/then/else statements
• If the nesting was deep enough, we would crash
Fixed!
• Also computes only the outputs it needs
• THEN "string" much faster too
Coming Attractions: String Functions
In Version 7, string functions were computed one value at a time
In Version 8, many functions have been “chunked”
• Compute 1000 values at a time
Converting to/from strings is much faster in Version 8
Coming Attractions: Parallel Execution
Version 7 computed values on only one processor
Version 8 tries to spread calculations across processors
• If you have 4 cores, calculations can be up to 4x faster
Coming Attractions: Combined Fields
Version 7 could not edit the column order or the separator
Version 8 lets you edit the column order and the separator
Questions?