improving video-on-demand (vod) application performance … · 2020-04-23 · kafka (queuer)...

8
Improving video-on-demand (VoD) application performance through robust monitoring toolchain Praveen C Vishwa Nigam Credits

Upload: others

Post on 20-Jun-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Improving video-on-demand (VoD) application performance through robust monitoring toolchain

Praveen C Vishwa Nigam Credits

Page 2: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Confidential & Restricted

DSP’s complex video-on-demand (VoD) ecosystem

2

More and more DSPs are moving towards microservices for their VoD services to handle huge number of requests with minimum response time and to maintain high availability (in the order of 5 9’s). Here are some of the key reasons for this transition.

DSP’s complex VoD ecosystem

Implementing the right toolchain is critical in effective performance monitoring of microservices-based applications. This insight focuses on building a robust toolchain to monitor microservices-based VoD application.

Deliver video at scale to meet customers’ demand -- billion requests per day!

Complexity of microservices-based applications A single application (several microservices) runs on multiple hosts in a very dynamic environment and needs to interact with multiple other systems that are also dynamic in nature.

Ensure reliable delivery of video content and high availability of service

Elasticity – handling spiky loads during special events (e.g. Premier league foot ball games)

Implementing auto-scaling algorithms - save cost by running at optimum capacity during quiet hours

DSPs are opting for microservices to reap its benefits. However, monitoring microservices are very challenging.

Page 3: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Confidential & Restricted

`

3

Video-on-Demand (VoD) application monitoring - Overview

VoD service/host uptime

System load, read & write latency

Cache get/put time for each category of service

Request latency to each VoD application (e.g. recommendation engine, purchase manager, etc.)

Success/failure rate at each touch points of VoD service

Trending requests by user/content/category/genre etc.

Request

Request

Request

Req

ues

t

Response

Response

Req

ues

t

Resp

on

se

Resp

on

se

Monitoring application performance through

HTTP responses at each touchpoint

Recommendation engine. e.g. Reng

Purchase & entitlement

manager. e.g.- Traxis

Video-on -Demand

Set-top box

Display

Video server

Key areas to monitor

To deliver video at scale meeting customers’ demand, the following aspects need to be monitored.

Database

Microservices-based VoD application architecture

With VoD application running on microservices, no single tool can provide effective monitoring capabilities. It is recommended to build a toolchain choosing the best-in-class tools.

Page 4: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Confidential & Restricted 4

Monitoring the VoD application: Types of tools required to monitor key VoD KPIs

Activities involved in VoD application monitoring VoD application has distributed touchpoints and requires a set of activities to capture KPIs like success/failure rate, request latency etc.

Scrape and ship the log files from different containers on which VoD applications are running

Sort and process the log files carrying request/response types, success/failure, and time of request/response for various touch points based on the flow (VoD architecture, Application ID) in order to get meaningful values for monitoring KPIs

Store the sorted log files in an external database for further analysis such as trend analysis, types of request failing, success/ failure ratio by time

Visualize and monitor the KPIs using dashboard

Log scrapper To scrape microservices log files containing request/response type, timestamp etc. from different hosts.

Tools required to support monitoring activities efficiently

Log shipper To collate the application logs collected by the scrappers from individual containers and ship to an external storage system.

Log aggregator To consolidate and sort microservices’ logs that come from multiple containers under one application in order to analyze performance metrics. Aggregator helps to identify touchpoints where requests/responses are getting stuck or failing.

Message Queuer To help in alleviating back pressure of host level disk space while large amount of data is being written to the database or while network issues and DB unavailability scenarios.

Database/storage An external centralized storage system for better accessibility of processed log files for root cause analysis, trend analysis etc.

Performance monitoring dashboard Rule-based dashboard view to show performance KPIs’ report from processed log files and send notifications in case of exceptions.

Page 5: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Confidential & Restricted 5

Recommended toolchain for efficient monitoring of VoD application

Kafka (Queuer)

Storm/Spark (Aggregator)

Kafka (Queuer) ElasticSearch/Influx DB/

Promethus/Graphite (Database)

Grafana (Dashboard)

Alert E-mail

Automation Action

Automated scripts to send notifications/

alerts and actions

To categorize information and save in a presentable

manner Queues large number of VoD log

files to avoid unnecessary pressure on database

Storm is a distributed real-time computation

system. However, for VoD application monitoring,

Spark is recommended as it is fast and general

purpose engine for large-scale data processing

applications.

Beats is a lightweight agent used for frequent

scrapping of logs. However, for VoD

application monitoring, Logstash is recommended

as it provides various plug-ins to transform the

log data that helps to monitor VoD applications

effectively.

Flume/Fluentd (Log shipper)

Identifying and stitching together right tools into one unified stack and making it work is the key to successful microservices monitoring

Fluentd is suitable for monitoring at the edge level. However, for VoD application monitoring, Flume is recommended as it works better when log files need to be

queued, aggregated, processed and stored in an external storage.

Page 6: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Confidential & Restricted 6

Once the toolchain is in place, leverage Grafana for visualization of key KPIs

Request latency to each VoD application

VoD system load Success failure rate at each touch point of VoD

Cache get/put time

Sample dashboard screens for the key VoD service KPIs monitored

Request latency to purchase manager

Failure request to recommendation engine

Page 7: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Confidential & Restricted

Key takeaways

7

DSPs can achieve improved VoD application performance with the help of effective application log and service flow monitoring

With the recommended toolchain discussed in this insight, DSPs can reduce monitoring overheads/ costs up to

20%

Data collected from monitoring can help DSPs in their trend analysis (user preferences, top trending content, busy/free slots etc.)

With microservices monitoring in place, it becomes easier for DSPs to manage and improve customer experience and resource utilization

Page 8: Improving video-on-demand (VoD) application performance … · 2020-04-23 · Kafka (Queuer) ElasticSearch/Influx DB/ Promethus/Graphite (Database) Grafana (Dashboard) Alert E-mail

Chennai

Johannesburg

New York

Dallas

Tualatin

Amsterdam

London

Bengaluru

Prodapt Solutions Pvt. Ltd.

INDIA

Chennai: 1. Prince Infocity II, OMR Ph: +91 44 4903 3000

2. “Chennai One” SEZ, Thoraipakkam Ph: +91 44 4230 2300

SOUTH AFRICA USA

Prodapt North America Tualatin: 7565 SW Mohawk St., Ph: +1 503 636 3737

Dallas: 222 W. Las Colinas Blvd., Irving Ph: +1 972 201 9009

New York: 1 Bridge Street, Irvington Ph: +1 646 403 8158

Prodapt SA (Pty) Ltd. Johannesburg: No. 3, 3rd Avenue, Rivonia Ph: +27 (0) 11 259 4000

THE NETHERLANDS

Prodapt Solutions Europe Amsterdam: Zekeringstraat 17A, 1014 BM Ph: +31 (0) 20 4895711

Prodapt Consulting BV Rijswijk: De Bruyn Kopsstraat 14 Ph: +31 (0) 70 4140722

UK

Prodapt (UK) Limited Reading: Davidson House, The Forbury, Reading RG1 3EU Ph: +44 (0) 11 8900 1068

Bangalore: “CareerNet Campus” No. 53, Devarabisana Halli, Outer Ring Road

THANK YOU!

[email protected]