Introduction to WSO2 Analytics Platform
Analytics is Growing Up
▪It is no longer about doing your first analytics use case.
▪It is about
  ▪How to do it every day, efficiently?
  ▪How to recover?
  ▪How to make decisions?
  ▪How to do other forms like real-time, interactive, and predictive analytics
Analytics 2.0 Platform
▪One platform for all four forms of analytics
▪Single consistent programming model
▪One analytics archive format
▪Support for the lifecycle of analytics apps
Integrate well with the rest of the enterprise!!
Collect Data
▪One Sensor API to publish events - REST, Thrift, JMS, Kafka - Java clients, JavaScript clients*
▪First you define streams (think of a stream as an infinite table in a SQL DB)
▪Then send events via the Sensor API
Can send to the batch pipeline, the realtime pipeline, or both via configuration!
Collecting Data: Example
Java example: create and send events. Events are sent asynchronously. See the client given in http://goo.gl/vIJzqc for more info.

    // Initialize Agent
    Agent agent = new Agent(agentConfiguration);
    publisher = new AsyncDataPublisher("tcp://hostname:7612", .. );

    // Define Stream
    StreamDefinition definition = new StreamDefinition(STREAM_NAME, VERSION);
    definition.addPayloadData("sid", STRING);
    ...
    publisher.addStreamDefinition(definition);
    ...

    // Send events
    Event event = new Event();
    event.setPayloadData(eventData);
    publisher.publish(STREAM_NAME, VERSION, event);
Analytics logic with SQL-like Queries
▪Both BAM and CEP provide a SQL-like data processing language
▪Since many people understand SQL, these languages made large-scale (Big Data) processing accessible to many
▪Expressive, short, and sweet.
▪Define core operations that cover 90% of problems
▪Let experts dig in when they like! (via user-defined functions)
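To make the point about brevity concrete, a query in the spirit of these SQL-like languages might read `from TempStream[temp > 30] select roomNo, avg(temp) as avgTemp group by roomNo` (illustrative syntax only; the stream and attribute names are made up here). The plain-Java sketch below computes the same filter, group-by, and average over a finite batch of events. It uses no WSO2 API, and every class and method name in it is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class QuerySketch {
    // Hypothetical event type; not part of any WSO2 API.
    static class TempEvent {
        final String roomNo;
        final double temp;
        TempEvent(String roomNo, double temp) { this.roomNo = roomNo; this.temp = temp; }
    }

    // Filter (temp > threshold), group by roomNo, and average temp per group.
    static Map<String, Double> avgHighTempByRoom(List<TempEvent> events, double threshold) {
        Map<String, double[]> acc = new TreeMap<>(); // roomNo -> {sum, count}
        for (TempEvent e : events) {
            if (e.temp > threshold) {
                double[] sumCount = acc.get(e.roomNo);
                if (sumCount == null) {
                    sumCount = new double[2];
                    acc.put(e.roomNo, sumCount);
                }
                sumCount[0] += e.temp;
                sumCount[1] += 1;
            }
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, double[]> en : acc.entrySet()) {
            result.put(en.getKey(), en.getValue()[0] / en.getValue()[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        List<TempEvent> events = Arrays.asList(
            new TempEvent("A", 32.0), new TempEvent("A", 36.0),
            new TempEvent("B", 25.0), new TempEvent("B", 40.0));
        System.out.println(avgHighTempByRoom(events, 30.0));
    }
}
```

The one-line query and the roughly twenty lines of hand-written Java express the same logic, which is the accessibility argument the slide makes.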
Scaling CEP Queries on top of Storm
▪Accepts CEP queries with hints about how to partition streams
▪Partition streams, build an Apache Storm topology running CEP nodes as Storm Spouts, and run it. (see http://goo.gl/pP3kdX )
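The partitioning step can be pictured as hash routing: every event with the same partition key goes to the same worker, so each CEP node sees a consistent sub-stream. The following is a minimal plain-Java sketch of that idea. It does not use the Storm or CEP APIs, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PartitionSketch {
    // Map a partition key to one of `workers` buckets; the same key always maps to the same bucket.
    static int workerFor(String key, int workers) {
        return Math.floorMod(key.hashCode(), workers);
    }

    // Split a list of event keys into per-worker sub-streams, preserving arrival order.
    static List<List<String>> partition(List<String> keys, int workers) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            out.add(new ArrayList<String>());
        }
        for (String k : keys) {
            out.get(workerFor(k, workers)).add(k);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("house-1", "house-2", "house-1", "house-3");
        System.out.println(partition(keys, 2));
    }
}
```

`Math.floorMod` is used rather than `%` because `hashCode()` can be negative, which would otherwise produce a negative bucket index.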
Predictive Analytics
▪Predictive analytics learns a decision function (a model) using examples
  ▪Is this fraud?
  ▪How to drive?
  ▪Handwritten text
▪Build models and use them with WSO2 CEP, BAM and ESB using the WSO2 Machine Learner product (2015 Q3)
▪Build models using R, export them as PMML, and use them within WSO2 CEP
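As a toy illustration of "learning a decision function from examples", the sketch below fits a single threshold on transaction amount to labeled fraud examples and then uses it to classify new amounts. This is a deliberately simple decision stump written for this note, not what WSO2 Machine Learner or R would produce; all names and data are hypothetical.

```java
class StumpSketch {
    // The learned model: predict "fraud" when the amount exceeds the threshold.
    static boolean predict(double threshold, double amount) {
        return amount > threshold;
    }

    // Try each observed amount (and "flag everything") as a threshold;
    // keep the one with the fewest misclassified training examples.
    static double learnThreshold(double[] amounts, boolean[] isFraud) {
        double best = Double.NEGATIVE_INFINITY;
        int bestErrors = countErrors(best, amounts, isFraud);
        for (double candidate : amounts) {
            int errors = countErrors(candidate, amounts, isFraud);
            if (errors < bestErrors) {
                bestErrors = errors;
                best = candidate;
            }
        }
        return best;
    }

    // Number of training examples the given threshold classifies wrongly.
    static int countErrors(double threshold, double[] amounts, boolean[] isFraud) {
        int errors = 0;
        for (int i = 0; i < amounts.length; i++) {
            if (predict(threshold, amounts[i]) != isFraud[i]) {
                errors++;
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        double[] amounts = {10, 20, 30, 500, 700};
        boolean[] isFraud = {false, false, false, true, true};
        double t = learnThreshold(amounts, isFraud);
        System.out.println("threshold=" + t + " fraud(100)=" + predict(t, 100));
    }
}
```

Real models use many features and far more robust algorithms, but the workflow is the same: train on labeled examples, then apply the resulting function to new events.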
WSO2 Machine Learner
▪A wizard to sample, explore, and understand data through visualizations
▪A wizard to configure and train machine learning models, and select the best model
▪Find and use those models with WSO2 CEP, BAM and ESB
▪Powered by Apache Spark MLlib
Communicate: Dashboards
▪Idea is to give an “overall idea” at a glance (e.g. car dashboard)
▪Support for personalization: you can build your own dashboard.
▪Also the entry point for drill down
▪How to build?
- Dashboard via Google Gadget and content via HTML5 + JavaScript
- Use charting libraries like Vega or D3
Communicate: Alerts
▪Detecting conditions can be done via CEP queries
▪Key is the “Last Mile”
- Email
- SMS
- Push notifications to a UI
- Pager
- Trigger physical alarm
▪How?
- Select the Email sender “Output Adaptor” from CEP, or send from CEP to ESB, which has a lot of connectors
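The split between detecting a condition and delivering the alert over the "last mile" can be sketched in plain Java as follows. The `Notifier` interface stands in for an output adaptor or ESB connector; it and every other name here are hypothetical, and the in-memory notifier replaces a real email or SMS gateway.

```java
import java.util.ArrayList;
import java.util.List;

class AlertSketch {
    // Stand-in for an output adaptor (email, SMS, push, ...); hypothetical interface.
    interface Notifier {
        void send(String message);
    }

    // In-memory notifier used here instead of a real email or SMS gateway.
    static class RecordingNotifier implements Notifier {
        final List<String> sent = new ArrayList<>();
        public void send(String message) {
            sent.add(message);
        }
    }

    // Detect the condition (reading above limit) and push one alert to every channel.
    // Returns the number of channels notified.
    static int checkAndAlert(double reading, double limit, List<Notifier> channels) {
        if (reading <= limit) {
            return 0;
        }
        String message = "Reading " + reading + " exceeded limit " + limit;
        for (Notifier n : channels) {
            n.send(message);
        }
        return channels.size();
    }

    public static void main(String[] args) {
        RecordingNotifier email = new RecordingNotifier();
        List<Notifier> channels = new ArrayList<>();
        channels.add(email);
        checkAndAlert(150.0, 100.0, channels);
        System.out.println(email.sent);
    }
}
```

Keeping the channel behind an interface means a new delivery method can be added without touching the detection logic.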
Communicate: APIs
▪With mobile apps, most data is exposed and shared as APIs (REST/JSON) to end users.
▪Need to expose analytics results as APIs
▪Following are some challenges
- Security and permissions
- API discovery
- Billing, throttling, quotas & SLA
▪How?
- Write data to a database from CEP event tables
- Build services via WSO2 Data Service
- Expose them as APIs via API Manager
Event Stream Store
▪One-stop place for all event stream definitions
▪Let users
  ▪ Publish and consume through multiple protocols like REST, JMS, Thrift, Web Sockets etc.
  ▪ Discover event streams
  ▪ Enforce security and authorization
  ▪ Pay-per-use subscriptions
  ▪ Effectively an Event Stream Market Place!!
▪This will automate API creation as discussed in the slide before.
What is it good for?
▪Batch Analytics
▪Realtime Streaming analytics
▪Realtime Interactive analytics
▪Lambda Architecture
▪Train and use a ML model
▪Selective Detailed Analysis
http://tinybuddha.com/blog/a-simple-technique-to-solve-problems-before-they-get-bigger/
Selective Detailed Analysis
• Too expensive to do detailed analysis on all the data
• Instead detect the condition, and dig into related data
• Fraud toolbox
• Other use cases
– Dynamic offers at retail site
– Weather
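The "detect first, then dig in" pattern can be sketched as a cheap check applied to every event, with the expensive lookup of related data run only for the events that get flagged. The plain-Java example below assumes a simple amount limit as the detection rule and "all transactions on the same account" as the related data; both are illustrative choices, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SelectiveSketch {
    // Hypothetical transaction record; field names are made up for illustration.
    static class Txn {
        final String account;
        final double amount;
        Txn(String account, double amount) { this.account = account; this.amount = amount; }
    }

    // Cheap check applied to every event: is this one suspicious?
    static boolean isSuspicious(Txn t, double limit) {
        return t.amount > limit;
    }

    // Expensive step, run only for flagged events: gather all activity for the account.
    static List<Txn> relatedTo(Txn flagged, List<Txn> all) {
        List<Txn> related = new ArrayList<>();
        for (Txn t : all) {
            if (t.account.equals(flagged.account)) {
                related.add(t);
            }
        }
        return related;
    }

    // Returns one "account:relatedCount" entry per flagged event,
    // so the detailed step runs once per detection rather than once per event.
    static List<String> analyze(List<Txn> all, double limit) {
        List<String> reports = new ArrayList<>();
        for (Txn t : all) {
            if (isSuspicious(t, limit)) {
                reports.add(t.account + ":" + relatedTo(t, all).size());
            }
        }
        return reports;
    }

    public static void main(String[] args) {
        List<Txn> txns = Arrays.asList(
            new Txn("acc-1", 20.0), new Txn("acc-2", 15.0),
            new Txn("acc-1", 9000.0), new Txn("acc-1", 35.0));
        System.out.println(analyze(txns, 1000.0));
    }
}
```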
Lambda Architecture
• Same code in both batch and realtime layers
• Idea is to fill the time between two batch runs
• Batch layer writes the data to a DB
• Realtime layer merges with batch data via Event Tables
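A minimal plain-Java sketch of the merge described above: the same counting function runs over the events covered by the last batch run and over the events that arrived since, and the two views are added together at read time. The maps stand in for the DB and the Event Tables; no WSO2 API is used, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LambdaSketch {
    // The same counting logic is used by both the batch and the realtime layer.
    static Map<String, Long> countByKey(List<String> keys) {
        Map<String, Long> counts = new HashMap<>();
        for (String k : keys) {
            counts.merge(k, 1L, Long::sum);
        }
        return counts;
    }

    // Serving step: add realtime counts (events since the last batch run) to the batch view.
    static Map<String, Long> merge(Map<String, Long> batchView, Map<String, Long> realtimeView) {
        Map<String, Long> merged = new HashMap<>(batchView);
        for (Map.Entry<String, Long> e : realtimeView.entrySet()) {
            merged.merge(e.getKey(), e.getValue(), Long::sum);
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> batch = countByKey(Arrays.asList("a", "b", "a"));
        Map<String, Long> realtime = countByKey(Arrays.asList("a", "c"));
        System.out.println(merge(batch, realtime));
    }
}
```

Because both layers share `countByKey`, the merged view equals what a single batch run over all events would have produced, which is the property the "same code in both layers" bullet is after.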
Real Life Use Cases
▪Health, Smart Parking solutions
▪Financial Monitoring
▪Smart City project, Vehicle tracking, Building monitoring
▪Railway monitoring
▪Throttling and Anomaly Detection
▪API Analytics (13+ customers)
▪Connected Car
Case Study: DEBS Grand Challenges
▪DEBS (Distributed Event Based Systems) Grand Challenge is a yearly event processing challenge.
▪2014 Challenge:
  ▪Smart Home electricity data: 2000 sensors, 40 houses, 4 billion events. We posted 400K events/sec, and close to one million events/sec distributed throughput with 4 nodes.
  ▪We were one of the four finalists.
▪2015 Challenge:
  ▪Based on taxi activities collected from New York City over the year 2013: 14,144 taxis, 173 million taxi trip records. We posted 300K events/sec on a single node and were one of the finalists.
https://www.flickr.com/photos/shedboy/3681317392/
Case Study: Realtime Soccer Analysis
Watch at: https://www.youtube.com/watch?v=nRI6buQ0NOM
Case Study: TFL Traffic Analysis
Built using TFL (Transport for London) open data feeds.
http://goo.gl/04tX6k
http://goo.gl/9xNiCm
Select the Product
Product | Features
WSO2 Data Analytics Server (DAS) | Everything: Batch, Realtime, Interactive, and Predictive Analytics
WSO2 Complex Event Processor (CEP) | Realtime Analytics only
WSO2 Machine Learner | Predictive Analytics only