Data Warehousing and Data Analytics in a Data-Driven Startup
Christian Ammendola
20th February 2014
Outline
● About 42Matters AG
● Data Analytics Requirements and Challenges in
Implementing a Data Analytics Platform
● Data Analytics Solution
● Lessons Learned and Conclusions
About
About 42Matters
Early pioneer in app discovery solutions for Android and iOS. Has developed various innovative app recommendation solutions for customers, developers and carriers.
Founded in May 2011 as a spin-off of ETH Zurich. Currently has 9 employees based in Zurich.
Project A and its co-investors have invested in 42Matters in a Series A round.
The customers of 42Matters include major corporations like ProSiebenSat.1 Group, E-Plus Group, and Orange.
42Matters Products
● Playboard App
● Developer API
● Ad Network
● Playboard White-Labels
Data Analytics Requirements
What Kind of Data do we Have?
● Sources: Playboard App, API, Playboard Web, SDK
● Web usage data
● API usage data
● Mobile user data
● Mobile ads data
● Usage data
● Installations data
All data is anonymised
Data Analytics Requirements at 42Matters
Performance Analytics
● Playboard App Performance
● Playboard Web Performance
● Ads Performance
Embedded Analytics
● Recommendations
● Ads Targeting
● Top Computations
How can we Implement those Analytics?
Dashboards
Data Mining
Notifications
Data Warehouse
Roll-up / Drill-down
Real-time Algorithms
Data quality
Data Exploration
Extract Transform Load (ETL)
Scalability
System Performance
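To make the "Roll-up / Drill-down" item above concrete, here is a minimal Python sketch. It is an illustration only: the fact rows, the dimension names (`date`, `country`, `app`) and the `aggregate` helper are invented for this example and are not taken from the 42Matters platform.

```python
from collections import defaultdict

# Hypothetical, invented fact rows: (date, country, app, installs).
FACTS = [
    ("2014-02-01", "CH", "app_a", 3),
    ("2014-02-01", "DE", "app_a", 5),
    ("2014-02-02", "CH", "app_b", 2),
    ("2014-02-02", "CH", "app_a", 4),
]
DIMENSIONS = ("date", "country", "app")


def aggregate(facts, group_by):
    """Sum the installs measure, grouped by the requested dimensions."""
    positions = [DIMENSIONS.index(name) for name in group_by]
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[p] for p in positions)
        totals[key] += row[-1]  # last column is the measure
    return dict(totals)


# Roll-up: fewer dimensions, coarser totals.
rollup = aggregate(FACTS, ["country"])
# Drill-down: add a dimension to split the same totals further.
drilldown = aggregate(FACTS, ["country", "app"])
```

In a data warehouse the same idea is usually expressed as SQL `GROUP BY` queries over a fact table; the sketch only shows that roll-up and drill-down are the same aggregation run at different levels of detail.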
Challenges in Implementing those Analytics
Small team
Low budget
Limited time
Changing Requirements
Data Analytics Solution
Solutions for the Data Analytics Requirements
Performance Analytics
● Playboard App Performance
● Playboard Web Performance
● Ads Performance
Embedded Analytics
● Recommendations
● Ads Targeting
● Top Computations
Own Data Analytics Platform
Data Analytics Platform Overview
● Clients: Apps, SDK, Web
● Data: User data, Ads data
● Embedded Analytics: Recommendations, Ads targeting, Top computations
● Performance Analytics: Ads performance; Dashboards, Reports, Data Exploration, ...
● Parts: Online, Offline
● Online Algorithms
● Offline Algorithms
● Operational Data
● Enriched and Cleaned Data
● Two components marked "?"
Important
Performance
Scalability
Flexibility
Maintenance
Costs
Time
Data Analytics Platform
Version 1.0
Data Analytics Platform Version 1.0 - Overview
● Clients: Apps, SDK, Web
● Data: User data, Ads data
● Embedded Analytics: Recommendations, Ads targeting, Top computations
● Performance Analytics: Ads performance; Dashboards, Reports, Data Exploration, ...
● Parts: Online, Offline
Data Analytics Platform Version 1.0 - Details
● Embedded Analytics: Clients
● Performance Analytics: Dashboards, Data Exploration, Ad-hoc Reports
● SQL
● PHP
● In-Memory
● Operational Database
● Data Warehouse
● In-Memory
● ETL + Orchestration
● Hadoop/Pig Cluster
Important
Performance
Scalability
Flexibility
Maintenance X
Costs
Time
Data Analytics Platform
Version 2.0
Data Analytics Platform Version 2.0 - Overview
● Clients: Apps, SDK, Web
● Data: User data, Ads data
● Embedded Analytics: Recommendations, Ads targeting, Top computations
● Performance Analytics: Ads performance; Dashboards, Reports, Data Exploration, ...
● Parts: Online, Offline
Data Analytics Platform Version 2.0 - Details
● Embedded Analytics: Clients
● Performance Analytics: Dashboards, Data Exploration, Ad-hoc Reports
● In-Memory
● PHP
● SQL
● Operational Database
● Data Warehouse
● In-Memory
● ETL + Orchestration
● Hadoop/Pig Cluster
Important
Performance
Scalability
Flexibility
Maintenance
Costs
Time
Lessons Learned and Conclusions
Lessons Learned
● Buying a solution can be cheaper than in-house development.
● Cloud technologies open new frontiers to small companies and
allow fast prototyping for validating ideas.
● The choice of technologies depends strongly on the requirements
and on the team (e.g. ETL tools are not always the perfect
solution).
● Hadoop and data warehousing fit well together.
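One way to picture the last lesson is a batch job that pre-aggregates raw logs, with the result then loaded into a SQL table for dashboards and ad-hoc reports. The Python sketch below is a rough illustration under stated assumptions: a plain function stands in for the Hadoop/Pig job, an in-memory SQLite database stands in for the data warehouse, and the log format, the `fact_installs` table and all values are invented for this example.

```python
import sqlite3
from collections import Counter

# Invented raw log lines in a hypothetical "date<TAB>app<TAB>event" format.
RAW_LOG = [
    "2014-02-01\tapp_a\tinstall",
    "2014-02-01\tapp_a\tinstall",
    "2014-02-01\tapp_b\tview",
    "malformed line",
    "2014-02-02\tapp_b\tinstall",
]


def batch_aggregate(lines):
    """Stand-in for an offline batch job: count installs per (date, app)."""
    counts = Counter()
    for line in lines:
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # basic data-quality step: skip malformed records
        date, app, event = parts
        if event == "install":
            counts[(date, app)] += 1
    return counts


def load_into_warehouse(conn, counts):
    """Load the pre-aggregated counts into a (hypothetical) fact table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_installs "
        "(date TEXT, app TEXT, installs INTEGER)"
    )
    conn.executemany(
        "INSERT INTO fact_installs VALUES (?, ?, ?)",
        [(date, app, n) for (date, app), n in sorted(counts.items())],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # assumption: SQLite as a toy warehouse
load_into_warehouse(conn, batch_aggregate(RAW_LOG))

# The warehouse side then serves cheap SQL queries over the small fact table.
rows = conn.execute(
    "SELECT app, SUM(installs) FROM fact_installs GROUP BY app ORDER BY app"
).fetchall()
```

The split mirrors the division of labour suggested by the slides: the heavy, offline work happens in the batch layer, while the warehouse holds the cleaned, aggregated data that reports query directly.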
Conclusions
Embedded analytics: Own solution
Performance analytics: Google Analytics, … + own solution
Data analytics platform: Online + offline
Next steps: Extend system to have a “nearline” part.
Thanks! Questions?