The Big Data Revolution: “When Bigger is Better” Real Use Cases for the Online Gaming Business
Roberto Fenaroli & Brunino Criniti, Big Data Engineers
About Ringmaster
Ringmaster was founded on 27 October 2011 as a joint venture between Lottomatica Group (now IGT) and Reply S.p.A., with the aim of creating a software factory for building multi-user gaming platforms in the media and online gaming sectors.
About 60 software engineers work at Ringmaster, more than 90% of them university graduates, in a young, dynamic and informal environment.
Reply Worldwide
Americas
• US (Chicago, Detroit)
• Brazil (Belo Horizonte, Sao Paulo)
Europe
• Germany (Berlin, Bremen, Dusseldorf, Frankfurt, Gutersloh, Hamburg, Munich)
• Italy (Bari, Milano, Padova, Roma, Torino, Trieste, Verona)
• The UK (London, Basingstoke, Chester, Cockpole Green)
• Benelux & France (Amsterdam, Brussels, Luxembourg, Paris)
• Poland & Romania (Katowice, Bucharest)
• Belarus (Minsk)
Asia
• China (Beijing)
Romania & Poland: near shore
Italy: home country
Germany & UK: huge presence
6,000 employees (70% in Italy)
Introduction
• «The data volumes are exploding, more data has been created in the past two years than in the entire previous history of the human race.»
• «Within five years there will be over 50 billion smart connected devices in the world, all developed to collect, analyze and share data.»
• «73% of organizations have already invested or plan to invest in Big Data by 2016.»
At the moment, less than 1% of all data is ever analyzed and used.
«Big Data is like teenage sex. Everyone talks about it, nobody really knows how to do it, everyone thinks everyone else is doing it, so everyone claims they are doing it.»
Dan Ariely, Duke University
What is Big Data?
Customer’s Goals
Customers want to get the most out of their data
• Analyze raw data in order to transform it into valuable information
• In other words, they want to make money!
Our Client’s Goals
Create smart applications based on Big Data technologies, for any kind of use case, such as:
• Analytic reports and dashboards
• User Profiling
• Games Management
JokeR* - IGT Recommendation Engine
A Recommendation Engine is a perfect example of a “smart” application that reaches all these goals and returns valuable information to the client.
JokeR* - Overview
• Promote appropriate content to players
  • Provide similar content
  • Smart targeting for player engagement
  • Drive up player retention
• Use different techniques/algorithms
  • Collaborative filtering
  • Content-based filtering
  • Matrix Factorization algorithms
  • Configurable Boosting Factor
Advanced analytics and event-driven products for user retention and revenue boosting
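To make the collaborative filtering idea and the boosting factor more concrete, here is a minimal sketch in plain Python. It is not the JokeR* implementation, which the slides describe as built on Spark and Mahout. The play counts, the function names and the choice to apply the boost as a simple per-game multiplier on the score are all assumptions made for illustration.

```python
from math import sqrt

# Hypothetical play counts: player -> {game: number of sessions}.
PLAYS = {
    "alice": {"slots": 5, "poker": 1},
    "bob": {"slots": 4, "bingo": 2},
    "carol": {"poker": 3, "blackjack": 4},
    "dave": {"slots": 2, "bingo": 5},
}


def item_vectors(plays):
    """Turn player -> game counts into game -> {player: count}."""
    vectors = {}
    for player, games in plays.items():
        for game, count in games.items():
            vectors.setdefault(game, {})[player] = count
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def recommend(player, plays, boosts=None, top_n=3):
    """Rank games the player has not played yet (item-based filtering).

    Each candidate is scored by its similarity to the games the player
    already plays, weighted by play count, then multiplied by an optional
    per-game boosting factor (assumed default: 1.0).
    """
    boosts = boosts or {}
    vectors = item_vectors(plays)
    played = plays.get(player, {})
    scores = {}
    for candidate, cand_vec in vectors.items():
        if candidate in played:
            continue
        score = sum(
            count * cosine(cand_vec, vectors[game])
            for game, count in played.items()
        )
        scores[candidate] = score * boosts.get(candidate, 1.0)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda g: (-scores[g], g))[:top_n]


print(recommend("alice", PLAYS))
print(recommend("alice", PLAYS, boosts={"blackjack": 3.0}))
```

In this toy data, a boost on one game is enough to move it ahead of a game that is more similar to what the player already plays, which is the kind of business-driven override a configurable boosting factor would allow.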
JokeR* – Solution
• We addressed these challenges by selecting from the best open-source solutions available
• Our architecture relies on:
  • Cloudera CDH
  • Apache Spark
  • MongoDB
  • Apache Mahout
  • Spring Framework
Why Cloudera CDH?
• CDH is the most complete and popular distribution of Apache Hadoop (and related products); we chose it to ensure computational power and scalability
• CDH provides:
  • Scalability
  • Availability
  • Efficiency
  • Flexibility
  • Security
  • Usability
  • Integration
Cloudera Manager
JokeR* - High Level Architecture
[Architecture diagram. Stages: Data Ingestion, Data Processing, Data Consumption.]
• JokeR* Core: Java, Spark, Mahout
• GameR Backoffice: AngularJS
• REST Interface: Spring REST APIs
• Data Gathering Components: Spark, Flume, Kafka
• Big Data Environment: Cloudera CDH, MongoDB
Components shown in the diagram: External Systems, Game Transactions, Games Catalog, Game Attributes Handler, Data Gathering Component, Big Data Environment, NoSQL Database, JokeR Core, Data Digest, JokeR ML Engine, REST Interface, GameR Backoffice.
JokeR* - Lifecycle
A Model is a combination of Dataset, Similarity Metrics and Algorithm/Parameters.
Training and Validation Phase
• Create Model: a dataset, a KPI, and an algorithm are chosen from a set of available ones.
• Configure Parameters: a specific configuration is defined based on the parameters offered by the algorithm.
• Train the Model: a training data set is needed to train our algorithm.
• Test and Validate the Model: the model is tested to evaluate the quality of the produced outcomes.
• Model is Ready for Production: as a result, a set of potential Active Models is provided.
Production Phase
• Choose an Active Model
• Schedule Recommendations
• Refresh (daily, monthly, …)
Once a set of Active Models is available, we are ready to provide recommendations to any client system. The model can also be scheduled to be retrained based on newly gathered data.
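The lifecycle steps can be read as a small state machine. The sketch below is one possible way to model it in Python. It is an illustration only: the class, the stage names, the method names and the rule that a refresh sends an active model back to be retrained are assumptions, not the actual JokeR* design.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    CREATED = 1
    CONFIGURED = 2
    TRAINED = 3
    VALIDATED = 4
    ACTIVE = 5


@dataclass
class Model:
    """A model as described on the slide: dataset + similarity metric + algorithm."""
    dataset: str
    similarity_metric: str
    algorithm: str
    parameters: dict = field(default_factory=dict)
    stage: Stage = Stage.CREATED

    def _advance(self, expected, target):
        # Refuse to skip steps: each transition has exactly one valid origin.
        if self.stage is not expected:
            raise ValueError(
                f"expected stage {expected.name}, got {self.stage.name}"
            )
        self.stage = target

    def configure(self, **parameters):
        self._advance(Stage.CREATED, Stage.CONFIGURED)
        self.parameters.update(parameters)

    def train(self):
        self._advance(Stage.CONFIGURED, Stage.TRAINED)

    def validate(self):
        self._advance(Stage.TRAINED, Stage.VALIDATED)

    def activate(self):
        self._advance(Stage.VALIDATED, Stage.ACTIVE)

    def refresh(self):
        # Assumed behaviour: a scheduled refresh sends an active model
        # back to be retrained on newly gathered data.
        self._advance(Stage.ACTIVE, Stage.CONFIGURED)


model = Model("game_transactions", "cosine", "item_based_cf")
model.configure(neighbours=20)
model.train()
model.validate()
model.activate()
print(model.stage.name)
```

Keeping the transitions explicit makes it impossible to put an untested model into production, which matches the split between the training and validation phase and the production phase.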
JokeR* – Summary
• Information gathering from different sources
  • Games played and amount spent data
  • Games database for game attributes
• Leverages Big Data technologies
  • Hadoop and Spark, and the Java machine learning library Mahout
• Transform raw data into valuable information
  • Provide “explorable” aggregated data to Business Analysts
  • Generate game recommendations for final users
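As a small illustration of turning raw data into aggregated, explorable information, the following Python sketch rolls up game transactions into per-player, per-game totals. The record layout (player, game, amount spent) and the function name are hypothetical. The slides indicate that the real pipeline runs on Hadoop and Spark, so this only shows the shape of the aggregation on a toy scale.

```python
from collections import defaultdict

# Hypothetical raw transactions: (player, game, amount spent).
TRANSACTIONS = [
    ("alice", "slots", 2.5),
    ("alice", "slots", 1.5),
    ("bob", "bingo", 10.0),
    ("alice", "poker", 4.0),
]


def aggregate(transactions):
    """Sum plays and spend for every (player, game) pair."""
    totals = defaultdict(lambda: {"plays": 0, "spent": 0.0})
    for player, game, amount in transactions:
        entry = totals[(player, game)]
        entry["plays"] += 1
        entry["spent"] += amount
    return dict(totals)


for (player, game), stats in sorted(aggregate(TRANSACTIONS).items()):
    print(player, game, stats["plays"], stats["spent"])
```

Totals like these could feed both an analyst dashboard and the player-by-game input that a recommendation algorithm needs.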