![Page 1: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/1.jpg)
Big & Fast: A quest for relevant and real-time analytics
Natalino Busa, @natalinobusa
![Page 2: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/2.jpg)
Parallelism, Mathematics, Programming Languages, Machine Learning, Statistics, Big Data, Algorithms, Cloud Computing
Natalino Busa, @natalinobusa
www.natalinobusa.com
![Page 3: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/3.jpg)
Big and Fast
Methodology
Architecture
Roles and organization
![Page 4: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/4.jpg)
Conversion is the ultimate form of permission marketing
Permission marketing is about the honour of being heard.
How do you earn it? Provide the right suggestions at the right time. This is what makes data analysis valuable.
![Page 5: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/5.jpg)
When do you really know your customer?
When you know their last unique:
5 songs?
100 songs?
10,000 songs?
![Page 6: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/6.jpg)
Old & New stuff.
We evolve slowly, and so do our personality and our habits.
But events and trends can affect us at short notice.
How do you combine old with new?
![Page 7: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/7.jpg)
The customer’s context
Complex along many dimensions:
Personal history: number of transactions ever made
Long-term interaction: how the user’s actions correlate with those of others
Real-time events: trends and recent events
![Page 8: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/8.jpg)
The customer’s context
Context is related to time:
Slow-changing: the defining characteristics of a person
Fast-changing: events that influence our lives, trends
These require very different technology solutions!
![Page 9: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/9.jpg)
Challenges
millions of billions of
Not much time to react: the window of opportunity is sometimes just a few seconds
Loads of information to process: you want to understand the user’s history well
![Page 10: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/10.jpg)
Slow and fast
Slow: ranking and preference analysis, segmentation and clustering
Tens of terabytes of data. This can take hours.
Fast: short-term trending topics, rule-based recommendations
Hundreds of events per second. This must be fast.
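As an illustration of the fast path, short-term trending topics can be approximated by counting events inside a sliding time window. The sketch below is a minimal, single-process example and not something taken from the slides: the `TrendingTopics` class, the 60-second window, and the sample topics are all hypothetical, and it assumes events arrive in non-decreasing timestamp order.

```python
from collections import Counter, deque


class TrendingTopics:
    """Counts topic mentions inside a sliding time window (hypothetical helper)."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self.events = deque()    # (timestamp, topic) pairs, oldest first
        self.counts = Counter()  # topic -> mentions still inside the window

    def add(self, timestamp: float, topic: str) -> None:
        """Record one event, then drop everything that fell out of the window."""
        self.events.append((timestamp, topic))
        self.counts[topic] += 1
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Assumes timestamps are non-decreasing, so the oldest event is at the left.
        while self.events and self.events[0][0] <= now - self.window_seconds:
            _, old_topic = self.events.popleft()
            self.counts[old_topic] -= 1
            if self.counts[old_topic] == 0:
                del self.counts[old_topic]

    def top(self, n: int):
        """Return the n most mentioned topics in the current window."""
        return self.counts.most_common(n)


# Made-up sample stream: (timestamp in seconds, topic).
tracker = TrendingTopics(window_seconds=60)
for ts, topic in [(0, "jazz"), (10, "rock"), (20, "rock"), (70, "pop"), (75, "rock")]:
    tracker.add(ts, topic)
```

Each event costs a constant amount of work on average, which is why this kind of computation suits the per-event path, while ranking and clustering over the full history stay in the batch path.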
![Page 11: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/11.jpg)
Hadoop: Distributed Data OS
Reliable: distributed, replicated file system
Low cost: ↓ cost vs ↑ performance/storage
Computing powerhouse: all of the cluster’s CPUs work in parallel to run queries
![Page 12: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/12.jpg)
Scala / Akka / Spray: a reactive web API framework
[Diagram: Actor A, Actor B, and Actor C exchanging messages msg 1 to msg 4]
● it scales horizontally (can run in cluster mode)
● it makes maximum use of the available cores/memory
1. processing is non-blocking, threads are reused
2. computation can be parallelized across many actors
Very fast: thousands of messages per second
Very reliable: automatic recovery
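The actor idea can be sketched without Akka. The example below is not Akka code and does not reproduce its API; it is a small Python `asyncio` illustration of the same principle, where each actor owns a mailbox, handles one message at a time, and never blocks a thread while waiting. The chain A → B → C, the `actor` and `run_pipeline` names, and the `None` stop signal are assumptions made for this sketch.

```python
import asyncio


async def actor(name, mailbox, forward_to, log):
    """Handle messages one at a time; waiting on the mailbox does not block a thread."""
    while True:
        msg = await mailbox.get()
        if msg is None:  # stop signal (an assumption of this sketch): pass it on and exit
            if forward_to is not None:
                await forward_to.put(None)
            return
        log.append((name, msg))  # stand-in for real message processing
        if forward_to is not None:
            await forward_to.put(msg)


async def run_pipeline(messages):
    """Wire three actors into a chain A -> B -> C and push messages through it."""
    log = []
    box_a, box_b, box_c = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    actors = [
        asyncio.create_task(actor("A", box_a, box_b, log)),
        asyncio.create_task(actor("B", box_b, box_c, log)),
        asyncio.create_task(actor("C", box_c, None, log)),
    ]
    for msg in messages:
        await box_a.put(msg)
    await box_a.put(None)  # ask the chain to shut down after the last message
    await asyncio.gather(*actors)
    return log


log = asyncio.run(run_pipeline(["msg 1", "msg 2"]))
```

All three actors share a single event loop thread here, which mirrors the "threads are reused" point. Spreading actors across cores or machines, as Akka does in cluster mode, is outside the scope of this sketch.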
![Page 13: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/13.jpg)
Distributed computing: lambda architecture
Lambda architecture: batch + streaming
[Diagram components: data warehouses, messaging buses, batch computing, streaming computing, in-memory distributed databases, low-latency web API services (HTTP RESTful API)]
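The core of the lambda architecture is that a query combines a slowly recomputed batch view with an incrementally updated real-time view. The sketch below shows that merge in plain Python under simplifying assumptions: in-memory `Counter` objects stand in for the distributed databases, and `build_batch_view`, `SpeedLayer`, `query`, and the sample events are hypothetical names and data, not part of the original slides.

```python
from collections import Counter


def build_batch_view(historical_events):
    """Batch layer: recompute per-user counts from the full history (slow, complete)."""
    return Counter(user for user, _item in historical_events)


class SpeedLayer:
    """Streaming layer: count events that arrived after the last batch run."""

    def __init__(self):
        self.realtime_view = Counter()

    def on_event(self, user, item):
        # Constant-time update per event, so it can keep up with the stream.
        self.realtime_view[user] += 1


def query(user, batch_view, realtime_view):
    """Serving layer: answer from both views so recent events are not missed."""
    return batch_view[user] + realtime_view[user]


# Made-up data: (user, item) pairs.
history = [("alice", "song1"), ("alice", "song2"), ("bob", "song3")]
batch_view = build_batch_view(history)

speed = SpeedLayer()
speed.on_event("alice", "song4")  # arrived after the batch view was built

alice_total = query("alice", batch_view, speed.realtime_view)
bob_total = query("bob", batch_view, speed.realtime_view)
```

In a real deployment the real-time view would be reset whenever a new batch view covering the same events is published; that hand-over is omitted here.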
![Page 14: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/14.jpg)
Distributed computing: some technologies
Hadoop
Cassandra
millions of billions of
λ = conversions
(lambda)
![Page 15: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/15.jpg)
All Things Distributed
Distributed computing and storage: more machines = more storage/computing
Open-source software solutions: mature enough for pragmatic adopters
Near-real-time + big data technologies: Hadoop, Scala, Akka, Spray, Cassandra
![Page 16: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/16.jpg)
Science & Engineering
Statistics, Data Science
Python, R, Visualization
IT Infra, Big Data
Java, Scala, SQL
Hadoop: Big Data Infrastructure, Data Science on large datasets
Big Data and Fast Data require different profiles to achieve the best results.
![Page 17: Big and fast a quest for relevant and real-time analytics](https://reader034.vdocument.in/reader034/viewer/2022051610/54820eb9b07959600c8b46bd/html5/thumbnails/17.jpg)
Thanks! Any questions?