Nikolay Novozhilov Wego.com
Using BigQuery as a main Big Data solution
About Wego
Wego.com is Asia Pacific and the Middle East’s leading flight/hotel metasearch engine used by millions of travelers.
Wego was founded in 2005 in Singapore
Introducing BigQuery
Service for interactive analysis of massive datasets (TBs)
Query billions of rows: seconds to write, seconds to return
Uses a SQL-style query syntax
It's a service, accessed by a RESTful API
Pay only for what you use
Based on internal Google tool - Dremel
Column oriented, append only…
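To make the "service, accessed by a RESTful API" point concrete, here is a minimal sketch of how a query request could be assembled for the v2 `jobs.query` method. It only builds the request and does not send it. The project ID, dataset, table and column names are hypothetical, and authentication is left out.

```python
import json

# Base URL of the BigQuery v2 REST API.
BASE_URL = "https://bigquery.googleapis.com/bigquery/v2"


def build_query_request(project_id, sql, dry_run=False):
    """Return (url, json_body) for a synchronous query call; nothing is sent."""
    url = f"{BASE_URL}/projects/{project_id}/queries"
    body = {"query": sql, "dryRun": dry_run}
    return url, json.dumps(body)


# Hypothetical project and table, for illustration only.
sql = (
    "SELECT country, COUNT(*) AS searches "
    "FROM logs.flight_searches GROUP BY country"
)
url, body = build_query_request("my-project", sql)
print(url)
print(body)
```

The request would be sent as an authenticated HTTP POST; any HTTP client can do that, which is what "it's a service" amounts to in practice.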
Data architecture in Wego
...
Why did we do it?
MySQL
“Zoo”
BigQuery
Why is Hadoop more popular?
My collection of concerns
Your data goes to cloud
Not open-source, Google can stop the service
“Strange” pricing model
Hadoop is trending, has bigger community
Append only database
???
Costs: storage + cost per query
Same fallacy again:
“I want to launch a mom-and-pop shop – let’s buy a building”
“I want to build a site – let’s buy servers”
“I want big data – let’s build a data warehouse”
Usual concerns:
No realistic estimate upfront
“Fear of running a query”
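The "no realistic estimate upfront" concern can be reduced with simple arithmetic: in a column-oriented, pay-per-query model, the cost depends on the bytes in the columns a query actually reads. Below is a rough sketch of that calculation. The column sizes and the price per terabyte are made-up numbers, not real quotes, and treating a terabyte as 2^40 bytes is an assumption.

```python
def estimate_query_cost(column_sizes_bytes, columns_used, price_per_tb):
    """Estimate the cost of one query when only referenced columns are scanned.

    Returns (bytes_scanned, estimated_cost).
    """
    scanned = sum(column_sizes_bytes[name] for name in columns_used)
    terabytes = scanned / 1024 ** 4  # assumption: 1 TB = 2**40 bytes
    return scanned, terabytes * price_per_tb


TB = 1024 ** 4

# Hypothetical table layout: names and sizes are illustrative only.
columns = {
    "timestamp": 2 * TB,
    "country": 1 * TB,
    "raw_request": 20 * TB,
}

# The query touches two small columns and never reads the large one.
scanned, cost = estimate_query_cost(
    columns, ["timestamp", "country"], price_per_tb=5.0  # hypothetical price
)
print(scanned, cost)
```

In practice the bytes figure does not have to be guessed: a dry-run query reports how many bytes would be processed without running the query, which helps with the "fear of running a query".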
StackOverflow support
53 minutes!
Append only…
Slowly changing dimensions: daily re-load from MySQL
daily upload from MySQL, keeping history
Absolutely necessary updates: do you really need it?
BigQuery allows saving a query result back to the initial table:
Your table
Query
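The two workarounds for an append-only store can be sketched with a small in-memory model. This is not BigQuery code: the table is a plain Python list, and the hotel rows, column names and dates are hypothetical. It only illustrates the pattern of appending dated daily snapshots and of "updating" by saving a query result over the original table.

```python
from datetime import date


def append_snapshot(table, rows, snapshot_date):
    """Append a full daily copy of a dimension, tagged with its load date."""
    return table + [dict(row, snapshot_date=snapshot_date) for row in rows]


def latest_snapshot(table):
    """Return only the rows from the most recent load (the 'current' view)."""
    newest = max(row["snapshot_date"] for row in table)
    return [row for row in table if row["snapshot_date"] == newest]


def overwrite_with_query(table, query):
    """Emulate an update: run a query over the table, save the result as the table."""
    return query(table)


# Two daily loads of the same (hypothetical) hotel, keeping history.
history = []
history = append_snapshot(history, [{"hotel_id": 1, "name": "Old Name"}], date(2014, 1, 1))
history = append_snapshot(history, [{"hotel_id": 1, "name": "New Name"}], date(2014, 1, 2))
current = latest_snapshot(history)

# "Update" by rewriting: keep only rows loaded on or after 2 January.
history = overwrite_with_query(
    history,
    lambda rows: [r for r in rows if r["snapshot_date"] >= date(2014, 1, 2)],
)
print(current, len(history))
```

Keeping every snapshot costs storage but preserves history; rewriting the table is the fallback for the rare update that is truly necessary.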
Actually useful - “Discovery mode”
Actually useful
Huge joins
REGEXP_MATCH(), …
Rich SQL - window functions
Nested data
My answer
What is Big Data revolution?
There is no difference between big data and small data anymore
“Yes, Sir, I tried to build an ROI case for our BI project - but I couldn’t access any reliable data!”
TimoElliott.com