Windows 2000 - The Rhetoric and the Reality for Performance Managers

Des Atkinson

Metron Technology Ltd

[email protected]@metron.co.uk

Contents

• NT 4.0 Performance Monitoring

– the shortcomings

• Windows 2000

– System Monitor

– New Metrics

– Event Trace

• Windows 2000 Scalability and other issues

NT 4.0 PM Shortcomings

• All tools were snapshot-based

• Process/thread termination between sampling intervals

– no accounting function, unlike UNIX or a mainframe OS

– threads come and go much more than processes, so loss not too great

– dependent on sampling granularity
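To illustrate the sampling problem above, here is a minimal Python sketch. It assumes snapshots are taken at fixed instants (t = 0, interval, 2 × interval, ...) and reports which processes start and finish entirely between two snapshots. The `(name, start, end)` tuple format, the function name and the workload values are hypothetical and chosen only for illustration; they are not NT measurements.

```python
def missed_by_sampling(processes, interval):
    """Return the names of processes that no snapshot ever observes.

    processes: list of (name, start, end) in whole seconds (hypothetical format)
    interval:  seconds between snapshots, taken at t = 0, interval, 2*interval, ...
    """
    missed = []
    for name, start, end in processes:
        # First snapshot instant at or after the process start (integer ceiling).
        first_sample = -(-start // interval) * interval
        # The process is visible only if it is still alive at that instant.
        if first_sample >= end:
            missed.append(name)
    return missed


# Hypothetical workload: one long-lived server and two short-lived processes.
workload = [("server", 0, 100), ("batch", 12, 18), ("helper", 29, 31)]
print(missed_by_sampling(workload, 10))
print(missed_by_sampling(workload, 60))
```

With a 10-second interval only `batch` escapes observation; with a 60-second interval `helper` is lost as well, which is the dependence on sampling granularity noted on the slide.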

NT 4.0 PM Shortcomings

• Per Process I/O Counts

– Counters in NT 3.1 returned crazy values

– In NT 3.51 values were consistently zero

– In NT 4.0 they disappeared!

• Can attempt some inferences using PF Delta

• If server dedicated to a single app, e.g. Oracle, then data available at that level

– I/O reads per user session

– I/O writes by the DBWR

NT 4.0 PM Shortcomings

• Per Process/Thread per Device I/O

– a dream for capacity planning/modelling

– not present even on most mainframe or mid-range systems

• Workarounds for the above

– work it out once then remember

– infer by file to device mappings, e.g. from Oracle statistics

NT 4.0 PM Shortcomings

• Transaction Boundaries

– Mainframes have TP Monitors such as CICS on MVS or TPMS on VME

– Block mode terminals with associated message types make transaction boundaries clear, both conceptually and in their metrics

– Midrange systems such as OpenVMS or UNIX lack this clarity, as does NT

NT 4.0 PM Shortcomings

• Per Process/Thread Hard Page Fault Counts

– Which processes are the most memory intensive?

– The work of which processes actually results in physical page faults as opposed to logical page faults?

– What is the size and rate of these hard page faults?

NT 4.0 Performance Monitor

• Good for real-time monitoring

• Data logging is not sophisticated

• NT Resource Kit has a data logging service (MONITOR.EXE, DATALOG.EXE)

– does not pick up new processes

• No built-in data manipulation, trending or modelling

Enhanced Disk Performance counters on this system are currently set to start at boot.
Note that Logical Disk counters of striped disk sets may not be correct.

DISKPERF [-Y[E] | -N] [\\computername]

-Y[E] Sets the system to start disk performance counters when the system is restarted.

E Enables the disk performance counters used for measuring performance of the physical drives in striped disk set when the system is restarted. Specify -Y without the E to restore the normal disk performance counters.

-N Sets the system disable disk performance counters when the system is restarted.

\\computername Is the name of the computer you want to see or set disk performance counter use.

New PM Features in W2K

• System Monitor

– Supersedes the NT 4.0 Performance Monitor

– An embeddable component

– May be programmed/configured

– Performance logs and alerts service

• Event Trace as “additional data collection technology”

– OS Kernel instrumented since Beta 1

– Active Directory, Exchange etc. “in progress”

System Monitor Architecture

[Diagram. Components shown: Windows NT Performance Monitor; user defined VB or HTML application; System Monitor graph control; Custom Performance Tool; Sysmon log and alert service (writing to files); PDH.DLL; RegQueryValueEx(); Perflib; WMI; Hi-Perf Data Provider Object; System Performance DLLs; Performance Extension DLLs.]

System Monitor Interfaces

• Methods

– e.g. typical user interface tasks such as adding counters

• Properties

– e.g. data source properties or those of counter displays

• Events

– e.g. where a control has been changed, such as when a counter has been added

New W2K System Metrics

• New PnP DiskPerf

– Correct logical and physical counters for FT devices at the same time!

– Counters on a per disk or per volume basis

• Disk counters

– Idle time

– Split I/O count

» Count of I/Os that were sub-divided internally

» Provides an indication of disk fragmentation
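One way to use the split I/O count as a fragmentation indicator is to express it as a fraction of all transfers in the same interval. The sketch below is only an illustration under that assumption: the function name, its parameters and the sample figures are hypothetical, and no particular threshold is implied by the slides.

```python
def split_io_ratio(split_ios, total_transfers):
    """Fraction of disk transfers that were sub-divided internally.

    Both arguments are counts taken over the same sampling interval
    (hypothetical inputs; how they are collected is not shown here).
    """
    if total_transfers == 0:
        # No I/O in the interval, so there is nothing to report.
        return 0.0
    return split_ios / total_transfers


# Illustrative figures only, not measured values.
print(split_io_ratio(25, 100))
```

A ratio that rises over time on the same volume would be the kind of signal the slide describes; a single value in isolation says little.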

System Monitor Counter “Fixes”

• For disk counters, Microsoft say you no longer have to run “diskperf -y” to switch these on

– confirmed this to be true

• Microsoft claim to have fixed the I/O by process counters in beta 3

– confirmed that these metrics do return values, but they look rather strange (see earlier slide)

What is Event Tracing?

• Event trace “is a recorded and ordered set of events”

• Provides information to supplement the standard counters

• Events traced may include disk I/O, TCP/IP traffic, thread creation/deletion, file I/O

Exploiting Trace Logs

• Running a trace creates an event log file (e.g. test_000001.etl) in the PerfLogs directory (6MB in 30 minutes!)

• Microsoft talk of “detailed analysis tools or enterprise tools”

• BUT nothing in beta 3 will actually look at these .etl files (Jee Pang’s tracedmp.c files not on the CDs!)
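The growth figure quoted above (6MB in 30 minutes) can be extrapolated to longer runs. The sketch below assumes the log grows linearly at the observed rate, which is an assumption and may not hold for other workloads or trace options; the function name and parameters are hypothetical.

```python
def projected_log_mb(observed_mb, observed_minutes, target_minutes):
    """Linear extrapolation of trace log size, in MB.

    Assumes the log keeps growing at the observed average rate.
    """
    return observed_mb * target_minutes / observed_minutes


# Observed on the slide: 6 MB after 30 minutes.
print(projected_log_mb(6, 30, 120))      # two hours
print(projected_log_mb(6, 30, 24 * 60))  # one day
```

At that rate a two-hour trace would reach about 24 MB and a full day about 288 MB, which gives a feel for the disk space to reserve in the PerfLogs directory.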

Tracing Active Directory

• Microsoft claim that every request or transaction may be traced for the following:

– LDAP (Lightweight Directory Access Protocol)

– Replication

– Kerberos authentication

• I have not yet been able to confirm this or measure any overhead

System Monitor vs Event Trace Data

• System Monitor Counters

– very low overhead

– best for continuous monitoring

– problems relating system to user

• Event Trace Data

– higher overhead

– best for detailed analysis or capacity planning

– good for relating system to user

Microsoft Help Extract:

Note

Trace logging of file I/O and page faults can generate an extremely large amount of data. It is recommended that you limit trace logging using the file I/O and page fault options to a maximum of two hours.

What Overhead does Event Tracing Impose?

• Jee Pang claims that the overhead of the Kernel tracing is “very low”

• Decided to test this by running a set of benchmarks and measuring elapsed times before and after.

• 3 benchmarks were run (Perl scripts):

– I/O intensive

– Memory intensive

– Compute intensive

    #
    # I/O INTENSIVE BENCHMARK
    #
    print "Enter required number of iterations: ";
    #
    # Typical number of iterations was 250, with avg elapsed time of 53 secs
    #
    chomp($input_var = <STDIN>);
    $fname = time;
    $start_t = time;
    for ($x = 1; $x <= $input_var ; $x++) {
        for ($y = 1; $y <= $input_var; $y++) {
            $z = $x * $y;
            #
            # Open and close the file each time to increase the I/O overhead
            #
            open (OUTF, ">>$fname");
            print OUTF "$z\n";
            close (OUTF);
        }
    }
    $end_t = time;
    $tot_t = $end_t - $start_t;
    print "Start time : $start_t\n";
    print "End time : $end_t\n";
    print "Elapsed time: $tot_t secs for $input_var iterations\n";
    print "Output file : $fname\n";

[Chart: I/O Intensive Benchmark (elapsed time in seconds). X axis: Benchmark Iterations, 1 to 11. Y axis: 47 to 56 seconds. Series: No Trace, Std Trace, Plus PFs, Plus FIOs.]

    #
    # MEMORY INTENSIVE BENCHMARK
    #
    print "Enter number of iterations: ";
    #
    # Typical number of iterations was 3000000 with average elapsed time of 19 secs
    #
    chomp ($iters = <STDIN>);
    $start_t = time;
    for ($num=0; $num < $iters; $num++) {
        $grocerylist[$num]= $num;
    }
    $end_t = time;
    $tot_t = $end_t - $start_t;
    print "Start time : $start_t\n";
    print "End time : $end_t\n";
    print "Elapsed time: $tot_t secs for $iters iterations\n";

[Chart: Memory Intensive Benchmark (elapsed time in seconds). X axis: Benchmark Iterations, 1 to 11. Y axis: 0 to 25 seconds. Series: No Trace, Std Trace, Plus PFs, Plus FIOs.]

    #
    # COMPUTE INTENSIVE BENCHMARK
    #
    print "Enter required number of iterations: ";
    #
    # Typical number of iterations was 5000 with average elapsed time of 45 seconds
    #
    chomp($input_var = <STDIN>);
    $start_t = time;
    $blob = 0;
    for ($x = 1; $x <= $input_var ; $x++) {
        for ($y = 1; $y <= $input_var; $y++) {
            $z = $x * $y;
        }
    }
    $end_t = time;
    $tot_t = $end_t - $start_t;
    print "\nStart time : $start_t\n";
    print "End time : $end_t\n";
    print "Elapsed time: $tot_t secs for $input_var iterations\n";

[Chart: Compute Intensive Benchmark (elapsed time in seconds). X axis: Benchmark Iterations, 1 to 11. Y axis: 43.4 to 45 seconds. Series: No Trace, Std Trace, Plus PFs, Plus FIOs.]
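The comparison behind these benchmarks amounts to a relative overhead: the extra elapsed time with tracing enabled, as a percentage of the untraced baseline. The Python sketch below shows that arithmetic only. The function name is hypothetical, and the sample timings are placeholders, not the values measured in the benchmark runs.

```python
from statistics import mean


def overhead_percent(baseline_secs, traced_secs):
    """Percentage increase in mean elapsed time caused by tracing.

    baseline_secs: elapsed times (seconds) of runs with no trace active
    traced_secs:   elapsed times (seconds) of the same benchmark with tracing on
    """
    base = mean(baseline_secs)
    return 100.0 * (mean(traced_secs) - base) / base


# Placeholder timings for illustration only (NOT the measured results).
placeholder_no_trace = [50, 50, 50]
placeholder_with_trace = [55, 55, 55]
print(overhead_percent(placeholder_no_trace, placeholder_with_trace))
```

Averaging over several iterations, as the benchmark runs do, helps because the Perl `time` function used in the scripts has only one-second resolution.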

Event Trace and Applications

• Event Trace has an API that can be embedded in a user-written application

• Per process buffer pool etc.

• Can be used to measure transaction throughputs/response times

• Hey - what about ARM?
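As a rough illustration of how application-level trace events could yield response times, the sketch below pairs a start event with its matching end event for each transaction. It does not use the Event Trace API itself: the `(timestamp, transaction_id, kind)` record layout and all names are hypothetical stand-ins for whatever an instrumented application would log.

```python
def response_times(events):
    """Map each completed transaction id to its response time in seconds.

    events: time-ordered list of (timestamp_secs, transaction_id, kind),
            where kind is "start" or "end" (hypothetical record layout).
    """
    started = {}
    durations = {}
    for timestamp, txn_id, kind in events:
        if kind == "start":
            started[txn_id] = timestamp
        elif kind == "end" and txn_id in started:
            # Pair the end event with its start and drop the open entry.
            durations[txn_id] = timestamp - started.pop(txn_id)
    return durations


# Hypothetical trace of two overlapping transactions.
trace = [
    (0.0, "t1", "start"),
    (0.5, "t2", "start"),
    (1.5, "t1", "end"),
    (3.0, "t2", "end"),
]
times = response_times(trace)
print(times)
# Throughput over the traced period: completed transactions per second.
print(len(times) / (trace[-1][0] - trace[0][0]))
```

Because every event carries its own timestamp, this kind of analysis does not depend on a sampling interval, which is what makes trace data good for relating system activity to user work.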

Clustering Technologies on Windows

• MSCS

– 2-node failover cluster

– no scalability features

• Load Balancing Server

– a step-up from DNS or IP load balancing

– designed for the middle-tier on HTTP, FTP etc.

• 3rd party solutions such as Oracle Parallel Server

Oracle Parallel Server on Windows NT Architecture

[Diagram: two-node failover database solution.]

W2K and Clusters

• W2K Datacenter Server will be shipped “90 to 180 days” after rest of W2K

• Beta 3 issued early May 1999 so probable shipment date of final release of W2K is October 1999

• Assume therefore that Datacenter Server will ship around April 2000

• Microsoft still ambivalent about what extensions to clustering will be in it
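The date estimate above can be checked with simple calendar arithmetic. The sketch below assumes the final W2K release falls somewhere in October 1999, as the slide guesses, and applies the quoted lag of 90 to 180 days. The function name and the choice of 1 and 31 October as bounds are assumptions made for illustration.

```python
from datetime import date, timedelta


def ship_window(release_earliest, release_latest, min_lag_days=90, max_lag_days=180):
    """Earliest and latest ship dates given a release window and a lag range."""
    earliest = release_earliest + timedelta(days=min_lag_days)
    latest = release_latest + timedelta(days=max_lag_days)
    return earliest, latest


# Assumed release window: any day in October 1999.
earliest, latest = ship_window(date(1999, 10, 1), date(1999, 10, 31))
print(earliest, latest)  # 1999-12-30 2000-04-28
```

The window runs from late December 1999 to late April 2000, so "around April 2000" corresponds to the pessimistic end of the quoted range.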

Terminal Server Counters

• Terminal Services Object

– Session counts (active, inactive, total)

• Terminal Services Session

– 75 counters available

– CPU and memory usage

– Many relating to transmission of data

– “Protocol Glyph Cache Hit Ratio”!
