Patterns for Deploying Analytics in the Real World
Sriskandarajah Suhothayan
Associate Director / Architect,
WSO2
Problems to think about
• Can it handle my load?
• How costly is it?
• Agility?
• Adaptability?
• Can it analyse 3rd party systems?
• etc.
Where to start?
• Think big!
But...
• Start simple!
• Eat your own dog food
• Analyse what you already have
Collect Data Internally
• Don’t worry about
– Data formats
– Data sources
– Platforms
– Protocols
Start with WSO2 DAS: it has a unified data capturing framework!
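To make the idea of unified data capture concrete, here is a minimal sketch in which sources with different formats are adapted into one common event shape before any analysis. The `Event` class, the adapter functions, and the `ts` field name are hypothetical illustrations, not the WSO2 DAS event model or API.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Event:
    """Hypothetical common event shape (not the WSO2 DAS event model)."""
    stream: str
    timestamp: int
    payload: Dict[str, Any] = field(default_factory=dict)


def from_json(stream: str, raw: str) -> Event:
    """Adapter for a JSON source; assumes the record carries a 'ts' field."""
    data = json.loads(raw)
    return Event(stream, int(data.pop("ts")), data)


def from_csv_line(stream: str, raw: str, columns: List[str]) -> Event:
    """Adapter for a delimited source; one column must be named 'ts'."""
    values = raw.strip().split(",")
    record = dict(zip(columns, values))
    return Event(stream, int(record.pop("ts")), record)


# Two differently formatted inputs end up in the same structure.
json_event = from_json("orders", '{"ts": 1000, "item": "book", "qty": 2}')
csv_event = from_csv_line("orders", "1001,pen,5", ["ts", "item", "qty"])
```

Everything downstream works on `Event` only, so adding a new format or protocol means adding one adapter rather than changing the analytics.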
Deployment for Interactive & Batch Analytics …
• Enable Searchability
– Full text search
– Drill down search
• See what has happened
– Summarise the data
– Understand patterns and behaviors
• Simple Deployment
– 2 Nodes
– Use RDBMS to store the data
Deployment for Realtime Analytics
• Keep informed
– Dashboard
– Alerts
– Feedback loops
• High Availability
– Zero downtime
– Zero data loss
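As an illustration of the kind of rule behind a realtime alert, the sketch below fires once the average of the last few readings crosses a threshold. The `WindowAlert` class, the window size, and the threshold are assumptions made for this example; a real deployment would express the rule in the product's own query language rather than in hand-written code.

```python
from collections import deque


class WindowAlert:
    """Fire when the average of the last `size` values exceeds `threshold`."""

    def __init__(self, size: int, threshold: float):
        self.values = deque(maxlen=size)  # oldest value drops out automatically
        self.threshold = threshold

    def on_event(self, value: float) -> bool:
        self.values.append(value)
        window_full = len(self.values) == self.values.maxlen
        average = sum(self.values) / len(self.values)
        # Stay silent until the window is full to avoid alerts on sparse data.
        return window_full and average > self.threshold


alert = WindowAlert(size=3, threshold=100.0)
fired = [alert.on_event(v) for v in [90, 95, 100, 120, 130]]
# fired == [False, False, False, True, True]
```

The same boolean could drive a dashboard indicator, a notification, or a feedback loop into the system that produced the events.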
Analyse Business with API Analytics
• APIs involved
• Who invokes the APIs
• Extract business information from
– Payloads
– Resource URIs
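A small sketch of what extracting business information from payloads and resource URIs can look like. The request records, the `/store/v1/<resource>/<id>` URI layout, and the `total` field are invented for illustration only.

```python
import json
from collections import Counter
from urllib.parse import urlparse

# Hypothetical API invocation records, as an analytics receiver might see them.
requests = [
    {"consumer": "app-a", "uri": "/store/v1/orders/17?expand=items", "body": '{"total": 40.0}'},
    {"consumer": "app-b", "uri": "/store/v1/orders/18", "body": '{"total": 15.5}'},
    {"consumer": "app-a", "uri": "/store/v1/customers/3", "body": "{}"},
]


def resource_of(uri: str) -> str:
    """Return the resource name, assuming a /<api>/<version>/<resource>/<id> layout."""
    segments = urlparse(uri).path.strip("/").split("/")
    return segments[2]


# Who invokes the APIs, and which resources are involved.
calls_per_consumer = Counter(r["consumer"] for r in requests)
calls_per_resource = Counter(resource_of(r["uri"]) for r in requests)

# Business figure taken from the payloads: total value of order calls.
order_value = sum(
    json.loads(r["body"]).get("total", 0.0)
    for r in requests
    if resource_of(r["uri"]) == "orders"
)
```

Counts per consumer are the natural input for usage-based billing, while figures such as `order_value` describe the business behind the traffic.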
Monetize APIs!
Scaling Analytics Deployment… The Changes!
• Realtime
– Supported by Apache Storm
– For high memory requirements or CPU-intensive processing
– No query changes
• Batch
– Move from RDBMS to HBase/Cassandra
– WSO2 DAS has a Data Abstraction Layer
– Independent of the underlying data store
Seamless migration :)
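The reason a data abstraction layer makes the store swappable can be sketched as follows: the analytics code depends only on an interface, so replacing the backend does not touch it. `RecordStore`, `SqlStore`, and `WideColumnStore` are hypothetical names; SQLite stands in for an RDBMS and a plain dictionary stands in for HBase/Cassandra. None of this is the WSO2 DAS API.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import List, Tuple


class RecordStore(ABC):
    """Hypothetical abstraction layer: analytics code sees only this interface."""

    @abstractmethod
    def put(self, table: str, key: str, value: float) -> None: ...

    @abstractmethod
    def scan(self, table: str) -> List[Tuple[str, float]]: ...


class SqlStore(RecordStore):
    """Stand-in for an RDBMS backend (in-memory SQLite)."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE records (tbl TEXT, k TEXT, v REAL, PRIMARY KEY (tbl, k))"
        )

    def put(self, table: str, key: str, value: float) -> None:
        self.conn.execute(
            "INSERT OR REPLACE INTO records VALUES (?, ?, ?)", (table, key, value)
        )

    def scan(self, table: str) -> List[Tuple[str, float]]:
        rows = self.conn.execute(
            "SELECT k, v FROM records WHERE tbl = ? ORDER BY k", (table,)
        )
        return list(rows)


class WideColumnStore(RecordStore):
    """Stand-in for an HBase/Cassandra backend (a dictionary, for illustration)."""

    def __init__(self) -> None:
        self.tables = {}

    def put(self, table: str, key: str, value: float) -> None:
        self.tables.setdefault(table, {})[key] = value

    def scan(self, table: str) -> List[Tuple[str, float]]:
        return sorted(self.tables.get(table, {}).items())


def total(store: RecordStore, table: str) -> float:
    """Analytics logic written once, against the abstraction only."""
    return sum(value for _, value in store.scan(table))


results = []
for store in (SqlStore(), WideColumnStore()):
    store.put("sales", "mon", 10.0)
    store.put("sales", "tue", 20.0)
    results.append(total(store, "sales"))
```

Both backends return the same total, which is the property that lets a deployment change stores without rewriting its analytics.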
Realtime Scalable Deployment ...
Event processing is offloaded seamlessly to Siddhi running on Apache Storm :)
Analytics Life Cycle
Predefined analytics
• Bundled as CApps
• Allows migration and continuous integration
Dev → Test → Preprod → Prod
Analytics on Production Environment
• Interactive Analytics
• Personalizing Dashboards
• Customised Alerts
Deployment Management
WSO2 servers are already puppetized!
Fewer configuration hazards for DevOps
https://github.com/wso2/puppet-modules
Summary
• Start small and scale as you grow
• Minimum HA Deployment
– 2 Nodes
• Fully Distributed Deployment
– 8+ Nodes
– Scale based on need, horizontally and vertically
• Analyser, Indexer, Receiver, Realtime (with Apache Storm), Dashboard
• Use Puppet to manage deployment