transactional monitoring for loosely coupled service architectures

Post on 18-Jan-2017

@dkhan

Transactional monitoring for loosely coupled service architectures
Daniel Khan
Node.js Technology Lead

Some Background
Who I am and what I do
• Daniel Khan
• @dkhan
• daniel.khan@dynatrace.com
• Technology lead @Dynatrace
• Performance Monitoring

The Consumer's View

2000

2005

2016

The new world of Microservices

• Teams choose their technologies freely
• Independent deployment
• Elastic scaling
• Service brokers
• Circuit breakers
• Unknown or obscure dependencies
• Randomly interwoven third-party dependencies
• The monoliths are still somewhere

The website is slow!

Find the Faulty Part

Find out before the User does

So we have to Monitor

Follow each Transaction

Complete Transaction Coverage

[Diagram. Monitored tiers: Browser / Native Mobile; Java/.NET; Web Server/PHP; C++, VB, ADK; CICS; Mainframe z/OS; MQ/ESB; Database. Components: PurePath Collector, Dynatrace Server, Dynatrace Client, Performance Warehouse, Sessions Store, Exported Session, Offline Session Analysis.]
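Following each transaction across tiers usually means tagging every request with an identifier that each service passes on. The snippet below is a minimal sketch of that idea only; the header name `x-trace-id`, the `handle` function, and the in-memory `collected` list are hypothetical stand-ins, not how any particular product implements it.

```python
import uuid

# Hypothetical header name, chosen for illustration only.
TRACE_HEADER = "x-trace-id"

# Stands in for a central collector that receives one record per hop.
collected = []


def handle(service, headers, downstream=()):
    """Record this hop, then call downstream services with the same trace ID."""
    # Reuse the incoming ID if there is one; otherwise this is the entry point.
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex
    collected.append((trace_id, service))
    for next_service, next_downstream in downstream:
        handle(next_service, {TRACE_HEADER: trace_id}, next_downstream)
    return trace_id


# One user request that fans out over four services.
trace_id = handle(
    "frontend",
    {},
    [("orders", [("database", [])]), ("payments", [])],
)

# All hops share one ID, so the full path can be reassembled afterwards.
path = [service for tid, service in collected if tid == trace_id]
print(path)
```

Because every hop carries the same identifier, the collector can stitch the individual records back into one end-to-end transaction.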

2016

3 Metrics per Service

5 Metrics per Host

5 Metrics per Runtime

40 Services = 120 Metrics

20 Hosts = 100 Metrics

40 Runtimes = 200 Metrics

420 Metrics
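The arithmetic above can be reproduced in a few lines. This is only a sketch of the slide's calculation; the dictionary names are illustrative.

```python
# Numbers taken from the slides; the variable names are illustrative.
metrics_per_entity = {"service": 3, "host": 5, "runtime": 5}
entity_counts = {"service": 40, "host": 20, "runtime": 40}

# Metrics per kind of entity = number of entities * metrics per entity.
per_kind = {
    kind: entity_counts[kind] * metrics_per_entity[kind]
    for kind in entity_counts
}
total = sum(per_kind.values())

print(per_kind)  # {'service': 120, 'host': 100, 'runtime': 200}
print(total)     # 420
```

The total grows linearly with every service, host, and runtime added, which is why watching the metrics by hand stops being practical.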

We cannot watch 400+ metrics, so we need ways to find anomalies automatically.

Response Times

Error Rates

Load

Anomaly Detection

[Diagram. Nodes: Historic Data, "Normal" Model, New Data, Hypothesis, Likeliness, Judgement, Anomaly? Edge labels: update, calculate, derive, test, produces, defines.]

Anomaly Detection Workflow
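One simple way to read this workflow is as a baseline check: calculate a "normal" model from historic data, test each new value against it, and judge whether it is likely enough to be normal. The sketch below assumes a mean and standard deviation model with a z-score threshold; that technique, the function names, and the sample numbers are assumptions for illustration, not the method used in the talk.

```python
import statistics


def build_model(historic):
    """Calculate the 'normal' model: mean and sample standard deviation."""
    return statistics.mean(historic), statistics.stdev(historic)


def is_anomaly(model, value, threshold=3.0):
    """Judge a new value: anomalous if it lies more than `threshold`
    standard deviations away from the historic mean (assumed rule)."""
    mean, stdev = model
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold


def observe(historic, value):
    """Test new data; only normal values are used to update the model."""
    anomalous = is_anomaly(build_model(historic), value)
    if not anomalous:
        historic.append(value)
    return anomalous


# Made-up response times in milliseconds.
historic_ms = [100, 102, 98, 101, 99, 103, 97, 100]

print(observe(historic_ms, 104))  # within 3 standard deviations
print(observe(historic_ms, 150))  # far outside the baseline
```

Keeping anomalous values out of the history is a design choice in this sketch: it stops one incident from shifting what counts as normal.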

Distinguish Impact from Cause

Automated Analysis of Problems
• Service slowdown
• Dependent services slow down
• Users are affected
• Analyze dependencies
• Exclude non-relevant services
• Follow causality chain
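The steps above can be sketched as a walk over the service dependency graph: start at the service where users are affected, ignore healthy services, and follow slow dependencies until one has no slow dependency of its own. The graph, the service names, and the `root_causes` function below are hypothetical; this is one possible reading of the slides, not the product's algorithm.

```python
# Hypothetical call graph: each service maps to the services it calls.
calls = {
    "frontend": ["orders", "search"],
    "orders": ["payments", "inventory"],
    "payments": ["database"],
    "inventory": [],
    "search": [],
    "database": [],
}

# Services currently flagged as slow, e.g. by an anomaly detector.
anomalous = {"frontend", "orders", "payments", "database"}


def root_causes(entry, calls, anomalous):
    """Follow the causality chain from the user-facing entry point."""
    roots, seen, stack = [], set(), [entry]
    while stack:
        service = stack.pop()
        # Exclude non-relevant (healthy) and already visited services.
        if service in seen or service not in anomalous:
            continue
        seen.add(service)
        slow_deps = [d for d in calls.get(service, []) if d in anomalous]
        if slow_deps:
            # The slowdown is impact here; the cause lies further down.
            stack.extend(slow_deps)
        else:
            # Slow, but nothing it depends on is slow: a likely cause.
            roots.append(service)
    return roots


print(root_causes("frontend", calls, anomalous))  # ['database']
```

In this example the frontend, orders, and payments slowdowns are impact, and only the database is reported as the probable cause.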

Productized

Thank You! | Daniel Khan | @dkhan | daniel.khan@dynatrace.com
