Monitor and manage 9000+ servers 30+ Azure Hosted Services 10 global data center facilities & 6...

Upload: dante-fothergill

Post on 14-Dec-2015

TRANSCRIPT

Page 1: Monitor and manage 9000+ servers 30+ Azure Hosted Services 10 global data center facilities & 6 domains 110+ internet web sites & 6,900+ databases
Page 2:

How Microsoft Monitors Applications Using APM, Global Service Monitor, and Microsoft Visual Studio Web Testing

Charlie Satterfield, Senior Program Manager

MDC-B317

Page 3:

Session Objectives And Takeaways

Session Objective(s): Review capabilities and use of:

• Web Availability Monitors using Global Service Monitor (GSM)
• Visual Studio Web Tests using GSM
• Application Performance Monitoring (APM)

• Benefits of GSM web testing features
• Benefits of APM for real world web applications

Page 4:

Agenda

• About Monitoring and Management (M&M)
• How M&M thinks about monitoring
• System Center 2012 SP1 app monitoring
• App monitoring in action
• Challenges and changes in M&M implementation

Page 5:

About Monitoring and Management

Datacenter Monitoring & Management

• Monitor and manage 9000+ servers
• 30+ Azure Hosted Services
• 10 global data center facilities & 6 domains
• 110+ internet web sites & 6,900+ databases
• Generate 3,000+ OpsMgr alerts and Service Manager incidents per day

Our Customers

• Microsoft.com
• Windows Update / Microsoft Update
• The Windows Store
• MSDN
• TechNet
• Windows Intune
• System Center Advisor
• Visual Studio Online
• Tofino
• GSM
• And more…

Page 6:

ABOUT M&M – TEAM STRUCTURE

• Tier 1
  • 40 vendors split between Redmond and India, 24/7
  • 15-minute SLA to resolve or escalate
  • 1 full-time manager
• Tier 2
  • 4 vendor service engineers, 24/7
  • SLA varies by severity
  • 1 full-time Tier 2 manager
• Tier 3 / 4
  • 4 full-time service engineers
• PM / Service Manager
  • 1 full-time PM/Architect
  • 1 full-time Service Manager

Page 8:

How M&M thinks about monitoring: Inside Out

App-level monitors based on events and/or counters

[Diagram: Web, WS, and DB tiers, each with Monitor 1, Monitor 2, and Monitor 3]

Custom MPs for unique application events

HW, OS, and service component monitoring through retail MPs

[Diagram labels: Web Application, IIS, SQL, Windows, Hardware Infra; Operations Manager 2012; OpsMgr Agent]

Page 10:

How M&M thinks about monitoring: Outside In

External probes / synthetic transactions

• HTTP Probes (SCOM): uses same tools as SynTran
• Synthetic Transactions (SCOM)

Web Service with Client UI: test core user paths in the UI with synthetic transactions.

Web Service Only: expose a secured web page that performs API-level tests and returns result codes. Test for event codes with HTTP probes.

[Diagram: two sequences of steps S1, S2, S3, S4]

[Diagram labels: Web Application, IIS, SQL, Windows, Hardware Infra; Operations Manager 2012; OpsMgr Agent; 3rd Party URL Monitor; Custom Dev URL Monitor; HTTP Probes]
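
The "Web Service Only" approach can be sketched roughly as follows. The slides do not describe the format of the secured test page, so the `name=code` line format, the convention that code 0 means success, and the function names `parse_result_codes` and `evaluate_probe` are all assumptions made for illustration, not part of SCOM.

```python
from typing import Dict, List, Tuple


def parse_result_codes(body: str) -> Dict[str, int]:
    """Parse lines of the hypothetical form 'test_name=code' into a dict."""
    results: Dict[str, int] = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        name, _, code = line.partition("=")
        try:
            results[name.strip()] = int(code)
        except ValueError:
            results[name.strip()] = -1  # unparseable code counts as a failure
    return results


def evaluate_probe(status: int, body: str) -> Tuple[bool, List[str]]:
    """Return (healthy, failed_test_names) for one HTTP probe response.

    Assumes HTTP 200 plus all-zero result codes means healthy.
    """
    if status != 200:
        return False, ["http_status_%d" % status]
    codes = parse_result_codes(body)
    failed = sorted(name for name, code in codes.items() if code != 0)
    # An empty page is treated as unhealthy: no test reported a result.
    healthy = bool(codes) and not failed
    return healthy, failed
```

A probe would fetch the page, pass the status code and body to `evaluate_probe`, and alert on the returned test names.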

Page 12:

System Center 2012 SP1 App Monitoring

• Web Application Transaction Monitoring
• Web Availability Monitoring
• Visual Studio Web Tests
• Application Performance Monitoring

Page 13:

Web Availability Monitoring and GSM

Used by application owners/engineers to:

• Measure web application availability and performance
• Pinpoint resource failures for quick resolution

Application owners/engineers leverage several test types:

Test Type | Executed by | Alerting | Availability Reporting
URL | GSM | A failure of this test indicates a problem with one or more networks, sites, or servers | Provides availability % seen by users of that region
VIP | Internal Watcher Nodes (Management Pools) | A failure of this test indicates a problem with one or more servers belonging to this site | Provides availability % specific to this site
DIP | Internal Watcher Nodes (Management Pools) | A failure of this test indicates a problem with a specific server | Provides availability % specific to this server
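
As a rough illustration of the availability reporting column, the sketch below computes an availability percentage per test type and target from a list of pass/fail results. The record layout, the target names, and the function name `availability_by_target` are hypothetical; the slides do not show how OpsMgr or GSM calculate the figure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each result is (test_type, target, passed); test_type is "URL", "VIP", or "DIP".
Result = Tuple[str, str, bool]


def availability_by_target(results: Iterable[Result]) -> Dict[Tuple[str, str], float]:
    """Return availability % keyed by (test_type, target)."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for test_type, target, ok in results:
        key = (test_type, target)
        total[key] += 1
        if ok:
            passed[key] += 1
    # Percentage of test runs that passed for each (type, target) pair.
    return {key: 100.0 * passed[key] / total[key] for key in total}
```

Keying on the test type keeps the three scopes apart: region for URL tests, site for VIP tests, and server for DIP tests.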

Page 14:

Demo: Web Availability Monitors and GSM

Charlie Satterfield

Page 20:

Web Availability Monitoring

Benefits

• Ability to provision an application test with multiple URLs
• Easy to provision tests to internal and external (GSM) watcher nodes
• Automatic health rollups
• Availability and Performance dashboards
• Availability reporting
• It's free! Saving our customers as much as $600/month per base URL test as compared to 3rd party

Page 21:

Web Availability Monitoring

Challenges

• No way to create or modify multiple web application test settings at once
• No way to view alerts corresponding to outages from OpsMgr reports
• No breakdown of performance data for page components

Page 22:

Visual Studio Web Tests and GSM

Used by application owners/engineers to determine the health of end-to-end user scenarios:

• Search, validate expected results
• Login, validate content, logout
• And more…

Considerations

• The web test file size must be less than 100 KB.
• The test cannot have more than 100 steps.
• The test overall must execute in less than 30 seconds.
• The test cannot contain loop statements, plugins, or references to other tests.
• ThinkTime parameters in the test must be set to 0.
• Each subscription cannot have more than 3 tests per location, or 45 tests total.
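
The considerations above can be expressed as a simple pre-upload check. This is a minimal sketch: the `WebTestInfo` record, its field names, and both functions are hypothetical, and it assumes 1 KB means 1024 bytes. It does not parse a real `.webtest` file.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WebTestInfo:
    """Hypothetical summary of one web test; field names are illustrative."""
    file_size_bytes: int
    step_count: int
    run_seconds: float
    uses_loops_plugins_or_test_refs: bool
    max_think_time: int


def check_web_test(t: WebTestInfo) -> List[str]:
    """Return the considerations this test violates (empty if none)."""
    problems = []
    if t.file_size_bytes >= 100 * 1024:  # assumes 1 KB = 1024 bytes
        problems.append("file size must be less than 100 KB")
    if t.step_count > 100:
        problems.append("no more than 100 steps")
    if t.run_seconds >= 30:
        problems.append("must execute in less than 30 seconds")
    if t.uses_loops_plugins_or_test_refs:
        problems.append("no loops, plugins, or references to other tests")
    if t.max_think_time != 0:
        problems.append("ThinkTime must be 0")
    return problems


def check_subscription(tests_per_location: Dict[str, int]) -> List[str]:
    """Check the per-subscription limits: 3 tests per location, 45 total."""
    problems = []
    for location, count in sorted(tests_per_location.items()):
        if count > 3:
            problems.append("more than 3 tests at " + location)
    if sum(tests_per_location.values()) > 45:
        problems.append("more than 45 tests total")
    return problems
```

Returning a list of violations rather than a single boolean lets the caller report every problem in one pass.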

Page 23:

Demo: Visual Studio Web Tests and GSM

Charlie Satterfield

Page 28:

Visual Studio Web Tests and GSM

Benefits

• Ability to record web application user actions
• Transactional and authentication capabilities
• Ability to add specific validation to determine success or failure of test
• Validation and performance collection for each test step
• It's free!

Page 29:

Visual Studio Web Tests and GSM

Challenges

• No way to run a Visual Studio web test from an internal OpsMgr watcher node

Page 30:

Microsoft.com and APM

Page 31:

About Microsoft.com

What we run

• www.Microsoft.com
• Download.Microsoft.com
• Profile.Microsoft.com
• Careers.Microsoft.com
• Plus a bunch more…

By the numbers

• 20K to 28K web requests per second to WWW.Microsoft.com
• ~1.6B requests per day from 57M unique IPs
• 550K concurrent connections
• #9 corporate web site on the web in terms of reach
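
To put the daily figures on the same scale as the per-second figure, the short calculation below converts them to averages. It uses only the numbers quoted on the slide. The slide does not say how either figure was measured, so no conclusion is drawn from the difference between this daily average and the quoted 20K to 28K range.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

requests_per_day = 1.6e9  # "~1.6B requests per day" from the slide
unique_ips = 57e6         # "57M unique IPs" from the slide

# Average request rate if the daily total were spread evenly over 24 hours.
avg_requests_per_second = requests_per_day / SECONDS_PER_DAY
# Average number of requests per unique IP address per day.
avg_requests_per_ip = requests_per_day / unique_ips

print(round(avg_requests_per_second))  # 18519
print(round(avg_requests_per_ip, 1))   # 28.1
```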

Page 32:

About Microsoft.com

By the numbers - WWW

• 24 WWW front-end Application Request Routing servers
• 64 WWW back-end servers
• Multiple other clusters serving sites like /surface, /licensing/servicecenter, etc.
• Global SLA of 99.90% platform availability, as measured by GSM
• Global objective of 99.80% for page delivery, as measured by Keynote

IIS Config WWW

• Windows Server 2012 / IIS 8
• 3100+ Web Applications
• 26 Application Pools
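
The availability targets above translate into a downtime budget. The sketch below assumes a 30-day measurement window, which the slide does not specify; the function name is illustrative.

```python
def allowed_downtime_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted by an availability target over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - availability_percent) / 100.0


# 99.90% platform availability allows about 43.2 minutes per 30 days.
print(round(allowed_downtime_minutes(99.90), 1))
# 99.80% page delivery allows about 86.4 minutes per 30 days.
print(round(allowed_downtime_minutes(99.80), 1))
```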

Page 33:

Using APM in practice

What debugging in production used to look like:

Page 34:

Using APM in practice

Debugging Today

• APM is now the primary tool for debugging on WWW
• With 3100+ applications we cannot target all of them all of the time
• We push out an APM MP for targeted apps and collect data
• Tight integration with our development team through TFS and APM data exchange

Page 37:

Using APM in practice

A real world example – Identify most problematic application(s)

AppAdvisor Console – Summary Failure Analysis

• Graphical view of event count over time for APM monitored applications
• Allows for quick identification of the most problematic applications over a time period
• Drill in to get more information on top 5 exception events
• Drill in further to get the exception call stack details
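
The drill-down described above amounts to ranking applications by event count and then ranking exceptions within one application. The sketch below mimics that over an in-memory list. The `(application, exception_type)` record shape and both function names are hypothetical; this is not how AppAdvisor stores or queries its data.

```python
from collections import Counter
from typing import List, Sequence, Tuple

# Hypothetical event record: (application, exception_type)
Event = Tuple[str, str]


def most_problematic_apps(events: Sequence[Event], n: int = 5) -> List[Tuple[str, int]]:
    """Rank applications by number of recorded events."""
    return Counter(app for app, _ in events).most_common(n)


def top_exceptions(events: Sequence[Event], app: str, n: int = 5) -> List[Tuple[str, int]]:
    """Rank exception types for one application (the 'top 5' drill-in)."""
    return Counter(exc for a, exc in events if a == app).most_common(n)
```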

Page 42:

Using APM in practice

A real world example – Health of my application over time

• AppAdvisor Console – Application Status
• Quick review of app health over time
• Compares current, previous, and average events and performance
• Highlights top 10 New Problems
• Displays top 10 most frequent problems with trending
• Drill in to get exception call stack details

Page 43:

Demo: Escalate APM exceptions

Charlie Satterfield

Page 48:

Using APM in practice

Benefits

• Ability to quickly configure monitors using IIS application inventory
• Ability to profile applications for exceptions or performance events without modifying code
• Near zero-touch, non-intrusive debugging and integration with IntelliTrace data
• Statistical views and analysis of top failures and worst performance
• Tight integration with TFS for seamless bug fix integration between Operations and Development

Page 49:

Using APM in practice

Challenges

• Helps to have IT Operations resources with code analysis skills to read the APM data
• IIS reset required to enable APM profiling
• Some apps not discovered with defaults: handlers that may not have .aspx files on disk require extra work
• Does not provide insight into memory leak investigations

Page 50:

Challenges in M&M Implementation

Some application monitoring features are not easily leveraged in a large multi-tenant management group.

Product constraints

• 400 APM Agents / 700 APM monitored applications
• Single GSM account per Management Group

Implementation constraints

• Restricted Operations Manager Console access for security, performance, and reliability

[Diagram labels: Operations Manager 2012 (Single Mgmt Group, 9 Business Units, ~7000 agents); Alerts; Service Manager 2012]

Page 51:

Resolution to M&M Challenges

Implementation Changes

• Provision new dedicated Operations Manager Mgmt Groups per logical business unit
• Each management group sized by business unit, monitoring as many as 3000 agents

Enables

• Operations Manager Console access
• GSM account per business unit
• APM configuration by application owner
• Web testing via GSM by application owner

[Diagram labels: OpsMgr 2012 SP1 Mgmt Group (1 Business Unit, < 3000 agents); Alerts; Service Manager 2012]
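
The sizing rule can be sketched as a check of whether one business unit fits a dedicated management group. The limits come from the slides (400 APM agents, 700 APM monitored applications, up to 3000 agents); the APM limits are read here as per-management-group limits, which the slides imply but do not state. The `BusinessUnit` record and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import List

MAX_APM_AGENTS = 400         # product constraint quoted on the slides
MAX_APM_APPLICATIONS = 700   # product constraint quoted on the slides
MAX_AGENTS_PER_GROUP = 3000  # sizing figure for the per-business-unit groups


@dataclass
class BusinessUnit:
    """Hypothetical inventory record for one business unit."""
    name: str
    agents: int
    apm_agents: int
    apm_applications: int


def exceeded_limits(bu: BusinessUnit) -> List[str]:
    """Return the limits this business unit would exceed (empty if it fits)."""
    exceeded = []
    if bu.agents > MAX_AGENTS_PER_GROUP:
        exceeded.append("agents")
    if bu.apm_agents > MAX_APM_AGENTS:
        exceeded.append("apm_agents")
    if bu.apm_applications > MAX_APM_APPLICATIONS:
        exceeded.append("apm_applications")
    return exceeded
```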

Page 52:

In Review: Session Objectives And Takeaways

Session Objective(s): Review capabilities and use of:

• Web Availability Monitors using GSM
• Visual Studio Web Tests using GSM
• Application Performance Monitoring (APM)

• Benefits of GSM web testing features
• Benefits of APM for real world web applications

Page 53:

Track resources

• Learn more about Windows Server 2012 R2 Preview, download the datasheet and evaluation bits on http://aka.ms/WS2012R2
• Learn more about System Center 2012 R2 Preview, download the datasheet and evaluation bits on http://aka.ms/SC2012R2

Page 54:

Related content

Breakout Sessions (session codes and titles)

• DEV-B312 DevOps: Increasing Application Lifecycle Efficiencies with Microsoft Visual Studio and System Center
• MDC-H209 Microsoft System Center 2012: Application Performance Monitoring
• MDC-B208 Microsoft System Center 2012 SP1 – Operations Manager: Overview and What's New

Page 55:

Resources

• MSDN: Resources for Developers, http://microsoft.com/msdn
• Learning: Microsoft Certification & Training Resources, www.microsoft.com/learning
• TechNet: Resources for IT Professionals, http://microsoft.com/technet
• Sessions on Demand: http://channel9.msdn.com/Events/TechEd

Page 56:

Complete an evaluation on CommNet and enter to win!

Page 57:

Evaluate this session

Scan this QR code to evaluate this session and be automatically entered in a drawing to win a prize

Page 58:

© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.