Improving Mobile Payments with Real time Spark

Upload: datamantra

Post on 15-Apr-2017

Category:

Data & Analytics


TRANSCRIPT

Improving Mobile Payments with Real time Spark

● Madhukara Phatak

● Big data consultant and trainer at datamantra.io

● Consults in Hadoop, Spark and Scala

● www.madhukaraphatak.com

Agenda
● Mobile as a driver for big data
● Our customer solution
● Existing data solution
● Improved solution
● Technical details
● Future enhancements
● Q & A

Mobile as a big data driver
● Mobile has changed the way we interact with the world
● Most buying and selling happens on mobile today
  ○ Myntra went fully mobile
  ○ Flipkart and Amazon say 50% of their purchases happen on mobile
  ○ Quikr and OLX are mobile-based selling platforms
  ○ Ola etc.

Challenges in Mobile
● Customers expect the service to be available 24/7
● Tiny screens make typical software flows very challenging
● Flaky connectivity of mobile networks makes it tougher
● Users are constantly on the move, which results in dropped interactions
● Downtime is no longer acceptable
● Everything has to be done in real time

Mobile payments
● Almost every app mentioned earlier needs some kind of payment
● Getting payments right on mobile is very hard
● Globally, 21% of online shoppers abandon their basket due to payment failures or delays
● Some companies are building SDKs to help app developers
● Our customer is one of them

Why are mobile payments hard?

Too many inputs

Terrible interface by Banks

OTP vs Password

Our customer solution

Our customer solution
● Mobile SDK for applications that simplifies payments
● SDK provides a better user interface, such as big buttons to generate an OTP, and other flows
● SDK also helps fill in the different kinds of forms presented by different banks using a consistent UI
● Better user experience across applications
● Application sends anonymous payment details across apps to our customer's servers

Some numbers
● 40+ customers
● Over 1 million transactions per month as of March
● Around 55% success rate (5% above average)
● Supports major banks, payment gateways and wallet providers
● Will soon be available outside the mobile payment space

Why does data matter?
● As the number of transactions increases, things will go wrong
● There are many different combinations that can go wrong
● Examples
  ○ Airtel OTP failing with State Bank netbanking
  ○ Customers stuck on the password page
  ○ Not able to read OTP from some specific
● Understanding customer pain and reacting to it is paramount
● Every bit of help results in a payment

Initial BI solution

[Diagram: Events, Hourly Push, JSON Data, S3FS, Session Wise Aggregations]

Initial BI solution
● Phone SDK pushes events such as transaction initiation and payment completion to logging servers
● Logging servers roll the log every hour and push it to S3
● A single-node Spark machine aggregates the data by session and pushes it to MySQL
● Google BigQuery is used for ad hoc querying
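
The session-wise aggregation step can be pictured with a small plain-Python sketch. The event fields (`session_id`, `event`, `ts`) and the event names are assumptions made for illustration; the talk does not show the actual schema, and the real job ran on Spark rather than in a single Python loop.

```python
import json
from collections import defaultdict

# Hypothetical event schema: one JSON object per log line.
SAMPLE_LOG = """\
{"session_id": "s1", "event": "transaction_initiated", "ts": 100}
{"session_id": "s2", "event": "transaction_initiated", "ts": 105}
{"session_id": "s1", "event": "otp_requested", "ts": 130}
{"session_id": "s1", "event": "payment_complete", "ts": 190}
"""


def aggregate_by_session(lines):
    """Group JSON event lines by session and summarise each session."""
    events_by_session = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the rolled log
        event = json.loads(line)
        events_by_session[event["session_id"]].append(event)

    summaries = {}
    for session_id, events in events_by_session.items():
        timestamps = [e["ts"] for e in events]
        summaries[session_id] = {
            "event_count": len(events),
            # Seconds between the first and last event of the session.
            "duration": max(timestamps) - min(timestamps),
            "completed": any(e["event"] == "payment_complete" for e in events),
        }
    return summaries


summaries = aggregate_by_session(SAMPLE_LOG.splitlines())
```

Each summary row is the kind of record that could then be written to a reporting table, one row per session.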

Challenges with BI solution
● Batch processing
● Geared more towards a report-generation flow
● Very minimal use of Spark APIs, as the team was not well aware of their potential
● No integration with the mobile SDK for a feedback loop

Requirements for consulting
● Bring the same reporting calculations to real time
● Understand user behaviour and track each user's flow over a session
● Close the loop by providing automatic alerts based on the metric calculations
● Some new specific business cases like loyalty management etc.
● Improve the team's expertise in Spark
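
One way to picture the alerting requirement is to compute the success rate per bank and operator combination and flag the combinations that fall below a threshold. This is only a sketch: the field names, the 0.5 threshold, and the minimum sample size are illustrative assumptions, not values from the talk.

```python
from collections import Counter

# Hypothetical per-session records; the real field names are not given in the talk.
SAMPLE_SESSIONS = [
    {"bank": "bank_a", "operator": "op_x", "completed": False},
    {"bank": "bank_a", "operator": "op_x", "completed": False},
    {"bank": "bank_a", "operator": "op_x", "completed": True},
    {"bank": "bank_a", "operator": "op_x", "completed": False},
    {"bank": "bank_a", "operator": "op_y", "completed": True},
    {"bank": "bank_a", "operator": "op_y", "completed": True},
    {"bank": "bank_a", "operator": "op_y", "completed": True},
    {"bank": "bank_b", "operator": "op_x", "completed": False},
    {"bank": "bank_b", "operator": "op_x", "completed": False},
]


def find_failing_combinations(sessions, threshold=0.5, min_sessions=3):
    """Return (combination, success_rate) pairs whose rate is below threshold."""
    totals = Counter()
    successes = Counter()
    for session in sessions:
        key = (session["bank"], session["operator"])
        totals[key] += 1
        if session["completed"]:
            successes[key] += 1

    alerts = []
    for key, total in totals.items():
        if total < min_sessions:
            continue  # too few sessions to judge this combination
        rate = successes[key] / total
        if rate < threshold:
            alerts.append((key, rate))
    return sorted(alerts)


alerts = find_failing_combinations(SAMPLE_SESSIONS)
```

The minimum sample size keeps a combination with only a couple of sessions from raising an alert on its own.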

Choosing Spark Streaming
● The company was already invested in Spark, so Spark Streaming was a no-brainer
● Porting the Spark batch code to streaming was also mostly straightforward, as both use the same API
● The company used Python as its Spark API language, which Spark Streaming also supported
● So we didn't consider Storm and went ahead with Spark Streaming

First version

[Diagram: Events, Five Minute Push, JSON Data, FileStream, Session Wise Aggregations]

First version
● We used the fileStream API of Spark Streaming, which allowed us to poll an S3 bucket every few minutes
● A new rolling appender was added to the log servers to push logs to S3 every 5 minutes
● Exactly the same batch code was used for the calculations, which made the transition very easy
● All downstream applications remained the same
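
The idea of polling for newly rolled files and reusing the batch calculation unchanged can be sketched without Spark. The code below is a local stand-in only: it polls a temporary directory instead of an S3 bucket, it is not Spark's file stream implementation, and `count_events` is a hypothetical placeholder for the existing batch code.

```python
import os
import tempfile


def count_events(lines):
    """Hypothetical stand-in for the existing batch calculation."""
    return sum(1 for line in lines if line.strip())


def poll_new_files(directory, seen):
    """Return paths of files that have not been processed yet, sorted by name."""
    new_paths = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and path not in seen:
            new_paths.append(path)
    return new_paths


def run_once(directory, seen, batch_fn):
    """One polling interval: apply the batch function to new files only."""
    results = {}
    for path in poll_new_files(directory, seen):
        with open(path) as handle:
            results[os.path.basename(path)] = batch_fn(handle)
        seen.add(path)
    return results


with tempfile.TemporaryDirectory() as log_dir:
    seen_files = set()
    # First five-minute log file arrives.
    with open(os.path.join(log_dir, "events-0000.json"), "w") as f:
        f.write('{"event": "a"}\n{"event": "b"}\n')
    first_poll = run_once(log_dir, seen_files, count_events)
    # Second file arrives; only it should be processed on the next poll.
    with open(os.path.join(log_dir, "events-0005.json"), "w") as f:
        f.write('{"event": "c"}\n')
    second_poll = run_once(log_dir, seen_files, count_events)
    # Nothing new arrived, so the third poll has no work to do.
    third_poll = run_once(log_dir, seen_files, count_events)
```

Because the batch function is passed in as an argument, the same calculation runs whether a file covers an hour or five minutes.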

Second version

[Diagram: Events, JSON Data, Session Wise Aggregations, Hourly Push, Realtime]

Amazon Kinesis
● A Kafka-like distributed message queue by Amazon
● Used as a managed Kafka-like source on AWS
● Highly scalable with low latency
● Persistence with fault tolerance across multiple availability zones
● Great integration with Spark

Second version
● Amazon Kinesis is added as the real-time stream source
● Logging servers push logs to Kinesis as they arrive
● Streaming application pulls the data from Kinesis every few minutes
● Support for multiple partitions added for parallel streams
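
Parallel streams depend on how records are spread across partitions (shards, in Kinesis terms). Kinesis maps each record's partition key to a shard through an MD5 hash of the key. The sketch below mimics that idea under two assumptions that are not from the talk: that the shards split the 128-bit hash space evenly, and that the session id is used as the partition key so one session's events stay together.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def shard_for_key(partition_key, shard_count):
    """Map a partition key to a shard index, assuming evenly split hash ranges."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_value = int.from_bytes(digest, "big")
    # Scale the hash into [0, shard_count) using integer arithmetic.
    return hash_value * shard_count // HASH_SPACE


# Hypothetical session ids used as partition keys.
session_ids = ["session-1", "session-2", "session-3", "session-4"]
assignments = {sid: shard_for_key(sid, 4) for sid in session_ids}
```

Every event carrying the same key lands on the same shard, so per-session state can be built by whichever stream reads that shard.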

Challenges with Python
● The Spark Streaming API for Python was introduced in 1.2, whereas Spark Streaming for Scala/Java has been available since 0.8
● No AWS Kinesis connector for Python was available as of March
● The team had to write its own
● No support for Python in Spark Job Server

Challenges from batch to streaming
● Sessions typically last 1-10 minutes. In batch this is easy, since most sessions complete within one hour of data, but it is challenging for real-time data
● Designing state for sessions
● Designing checkpointing and deciding on the interval
● Weird checkpointing issues with S3 due to eventual consistency
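
A session that spans several micro-batches needs state carried from one batch to the next. The function below is a minimal sketch of such a state update; its shape (new values plus previous state) matches what PySpark's `updateStateByKey` expects from an update function, but the talk does not say which operator the team used. The state layout, the event names, and the 10-minute timeout are assumptions for illustration.

```python
# Hypothetical event names that mark the end of a session.
SESSION_END_EVENTS = {"payment_complete", "payment_failed"}


def update_session(new_events, state):
    """Merge one micro-batch of events for a session into its running state."""
    if state is None:
        state = {"event_count": 0, "first_ts": None, "last_ts": None, "closed": False}
    else:
        state = dict(state)  # copy so the previous state is not mutated
    for event in new_events:
        ts = event["ts"]
        state["event_count"] += 1
        state["first_ts"] = ts if state["first_ts"] is None else min(state["first_ts"], ts)
        state["last_ts"] = ts if state["last_ts"] is None else max(state["last_ts"], ts)
        if event["event"] in SESSION_END_EVENTS:
            state["closed"] = True
    return state


def is_expired(state, now, timeout=600):
    """Treat a session as finished once it is closed or idle past the timeout."""
    return state["closed"] or (now - state["last_ts"]) > timeout


batch_1 = [{"event": "transaction_initiated", "ts": 100}]
batch_2 = [{"event": "otp_requested", "ts": 400}, {"event": "payment_complete", "ts": 460}]

state_1 = update_session(batch_1, None)
state_2 = update_session(batch_2, state_1)
state_3 = update_session([], state_2)  # a batch with no events leaves the state as-is
```

The timeout matters because a session that never sends an end event would otherwise stay in state indefinitely.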

Improvements to batch code
● Most of the code was written in the RDD paradigm, as it was the only one known to the team
● Team was trained on Spark SQL and Spark Streaming
● The majority of the code was ported to a Spark SQL based solution to improve readability and maintainability
● Recently moved to DataFrame-based code

Third version

[Diagram: Events, JSON Data, Session Wise Aggregations, Hourly Push, Realtime]

Choosing Mesos
● Mesos is a great cluster manager for Spark-only workloads
● Has a specific coarse-grained mode which is dedicated to real-time systems
● Minimal overhead compared to YARN
● Easy to set up on EC2

Fourth version

[Diagram: Events, JSON Data, Session Wise Aggregations, Hourly Push, Realtime]

Grafana
● Added Grafana for visualization and dashboards
● Grafana supports both Graphite and InfluxDB as data sources
● Moved away from MySQL to the time series database InfluxDB
● Scales much better compared to MySQL
● Data scientists or product managers can monitor customers using these dashboards
● Integrates with the mobile SDK
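
To show what a metric point destined for a time series database might look like, the sketch below formats one point in InfluxDB's line protocol (`measurement,tags fields timestamp`). The talk does not say how points were written, so the use of line protocol, the measurement name, and the tag and field names are all assumptions, and the escaping is simplified.

```python
def escape_key(value):
    """Escape commas, equals signs and spaces in tag keys, tag values and field keys."""
    return str(value).replace(",", r"\,").replace("=", r"\=").replace(" ", r"\ ")


def format_field_value(value):
    """Render a field value: booleans, integers (i suffix), floats, or quoted strings."""
    if isinstance(value, bool):  # check bool first, since bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return "{}i".format(value)
    if isinstance(value, float):
        return repr(value)
    return '"{}"'.format(str(value).replace('"', '\\"'))


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Format one point as a line protocol string with sorted tags and fields."""
    name = str(measurement).replace(",", r"\,").replace(" ", r"\ ")
    tag_part = "".join(
        ",{}={}".format(escape_key(k), escape_key(v)) for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        "{}={}".format(escape_key(k), format_field_value(v))
        for k, v in sorted(fields.items())
    )
    return "{}{} {} {}".format(name, tag_part, field_part, timestamp_ns)


# Hypothetical metric: success rate for one bank and operator combination.
line = to_line_protocol(
    "payment_success_rate",
    {"bank": "bank_a", "operator": "op_x"},
    {"rate": 0.25, "sessions": 4},
    1429056000000000000,
)
```

Tags such as bank and operator are what a dashboard would filter and group by, while the fields carry the measured values.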