millsap - how to make an application easy to diagnose (slides)

Slide 1
Copyright © 1999–2005 by Hotsos Enterprises, Ltd. www.hotsos.com

Cary Millsap
Hotsos Enterprises, Ltd.
Hotsos Symposium 2005
3:00pm–4:00pm Wednesday 9 March 2005

How to Make an Application Easy to Diagnose

Slide 2

Agenda

• Motives
• Instrumenting your Oracle db calls
• Instrumenting everything else

Slide 3

“If you can’t measure it, you can’t manage it.” —Peter Drucker

• Software performance is measured by its speed
• Speed = Result ÷ Time
• If you can’t measure the time it takes for an application to produce a result, then you can’t manage its performance.

Slide 4

Software developers use profilers and tracers to determine how long their code runs. And why.

• Example: GNU gprof

  %   cumulative   self              self     total
 time   seconds   seconds    calls  us/call  us/call  name
 60.37      0.49     0.49 62135400     0.01     0.01  step
 39.63      0.82     0.33   499999     0.65     1.64  nseq

• Example: GNU strace

times(NULL) = 53821310
gettimeofday({1105483456, 234638}, NULL) = 0
_llseek(11, 6971392, [6971392], SEEK_SET) = 0
readv(11, [{"\6\242\0\0S\3@\0\247\274\0\0\0"..., 8192}], 6) = 49152
gettimeofday({1105483456, 253209}, NULL) = 0
times(NULL) = 53821312
write(5, "WAIT #5: nam=\'db file scattered read\' ela="..., 65) = 65
write(5, "\n", 1) = 1

Slide 5

But you can do much more if you instrument your application

• There are things a developer knows that an OS tool cannot
  – Aggregate by unit of business work
  – Reveal context-specific application information
• With instrumentation inside your application
  – Better, faster code
  – Easier to diagnose and repair

The result: happier customers, lower support costs.

Slide 6

Instrumenting your Oracle db calls

Slide 7

Oracle has provided profiler-ready timing instrumentation for the database kernel since version 6.

• Oracle kernel instrumentation
  – Version 6: database call timings
  – Version 7: non-dbcall timed events
  – Versions 8–10: enhanced code path coverage

Slide 8

The design of your application largely determines how easy or difficult it is to collect Oracle trace files.

• Conceptually, data collection is simple
  – DBMS_SUPPORT (v7,8,9)
  – DBMS_MONITOR (v10)
• Practically, data collection can be quite difficult
  – Business task to Oracle trace file is not 1-to-1

Slide 9

Instrumenting your Oracle application is pretty easy.

• Instrumentation is minimal extra work

-- ➊ Turn on extended SQL trace for this session (waits=true, binds=true)
exec dbms_monitor.session_trace_enable(null,null,true,true);
-- ➋ Name the business task that is about to run (module 'demo', action 'greeting')
exec dbms_application_info.set_module('demo','greeting');
select 'hello world' from dual;
-- ➌ Name the next business task before running its SQL
exec dbms_application_info.set_module('demo','real business');
select count(*) from dba_objects where owner='SYSTEM';
disconnect;

• Difficulty comes when it’s not your application
  – How do you instrument someone else’s compiled code?

Slide 10

This kind of instrumentation gives you what you need to profile the response time of a business task.

• DBMS_MONITOR call enables the trace
• DBMS_APPLICATION_INFO calls identify business tasks

…
BEGIN dbms_application_info.set_module('demo','greeting'); END;
…
*** ACTION NAME:(greeting) 2005-02-03 15:23:47.189
*** MODULE NAME:(demo) 2005-02-03 15:23:47.189
…
select 'hello world' from dual
…
*** ACTION NAME:(real business) 2005-02-03 15:23:47.193
*** MODULE NAME:(demo) 2005-02-03 15:23:47.193
…
select count(*) from dba_objects where owner='SYSTEM'
…
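
A profiler can use the MODULE NAME and ACTION NAME markers to attribute each statement to a business task. The Python sketch below illustrates the idea under simplifying assumptions: its input is the abbreviated excerpt from this slide (a real Oracle trace file surrounds each statement with additional cursor lines), and tag_statements is a hypothetical helper name, not part of any Oracle or Hotsos tool.

```python
import re

# Excerpt from the slide, with the elided ("…") lines left out.
TRACE = """\
BEGIN dbms_application_info.set_module('demo','greeting'); END;
*** ACTION NAME:(greeting) 2005-02-03 15:23:47.189
*** MODULE NAME:(demo) 2005-02-03 15:23:47.189
select 'hello world' from dual
*** ACTION NAME:(real business) 2005-02-03 15:23:47.193
*** MODULE NAME:(demo) 2005-02-03 15:23:47.193
select count(*) from dba_objects where owner='SYSTEM'
"""

# Matches the marker lines and captures the kind and the name in parentheses.
MARKER = re.compile(r"^\*\*\* (ACTION|MODULE) NAME:\((.*?)\) ")


def tag_statements(trace_text):
    """Pair each select statement with the (module, action) in effect."""
    module = action = None
    tagged = []
    for line in trace_text.splitlines():
        m = MARKER.match(line)
        if m:
            if m.group(1) == "MODULE":
                module = m.group(2)
            else:
                action = m.group(2)
        elif line.lower().startswith("select"):
            tagged.append((module, action, line))
    return tagged


for module, action, sql in tag_statements(TRACE):
    print(f"{module}/{action}: {sql}")
```

Grouping by (module, action) is what lets a profiler report response time per business task rather than per session.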

Slide 11

Instrumenting everything else

Slide 12

So, what if most of your user’s response time is spent outside the Oracle tier?

• It happens a lot more these days
  – Fancier user interfaces
  – Fancier post-retrieval processing
  – More tiers

Your custom application probably has a lot more bad code in it than your Oracle kernel does.

Slide 13

My ideas about how to design trace data are influenced by having studied Oracle trace files for so many years.

• Oracle’s trace diagnostics are tremendous!
• But…
  – Difficult to understand
  – Very difficult to profile
• You can do it better
  – Some proposed requirements…

Slide 14

Requirement: files and file identification…

• Trace data must be written to a file.
• The application user gets to decide where this file should be written and what its name shall be.
• The application user gets to decide whether to run a program with tracing turned on, or with tracing turned off.
• The file has a version number in it and whatever additional information is required (such as a field key) so the application user (and his profiler software) can understand how to interpret the particular version of the data he’s looking at. This allows the format of trace to improve over time without breaking older profilers.
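
A minimal Python sketch of these four points follows. It assumes the user chooses the trace file through an environment variable; the variable name APP_TRACE_FILE and the function open_trace are hypothetical, and the version and key values are borrowed from the example trace file shown in these slides.

```python
import os

TRACE_VERSION = "1.1"
FIELD_KEY = "time ela usr sys dep caller callee p1 p2 p3"


def open_trace(path):
    """Open a trace file where the user asked for it.

    Returns None when no path is given, which means tracing is off.
    """
    if not path:
        return None
    trace = open(path, "w")
    # The header tells a profiler how to interpret the rest of the file.
    trace.write(f"version={TRACE_VERSION}\n")
    trace.write(f"key={FIELD_KEY}\n")
    return trace


def trace_from_environment():
    """Let the user pick the file name and location (hypothetical variable name)."""
    return open_trace(os.environ.get("APP_TRACE_FILE"))
```

Because the header travels with the data, a later format can add fields to the key without breaking a profiler that reads the key before the events.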

Slide 15

An example trace file…

version=1.1
key=time ela usr sys dep caller callee p1 p2 p3
1107275831.899634=2005/02/01/10:37:11.899634
1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2
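
To show how the version and key lines make the file self-describing, here is a small Python reader for the example above. It is a sketch, not the author's profiler: parse_trace is a hypothetical name, and it assumes header lines are the ones whose first token contains "=" while every other line is an event.

```python
SAMPLE = """\
version=1.1
key=time ela usr sys dep caller callee p1 p2 p3
1107275831.899634=2005/02/01/10:37:11.899634
1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2
"""


def parse_trace(text):
    """Return (header, events); each event is a dict keyed by the field key."""
    header = {}
    fields = []
    events = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if "=" in line.split()[0]:
            # Header line: name=value (version, key, or timestamp anchor).
            name, _, value = line.partition("=")
            header[name] = value
            if name == "key":
                fields = value.split()
        else:
            # Event line: values in the order the key declares.
            # Trailing p1..p3 are optional, so zip stops at the shorter list.
            events.append(dict(zip(fields, line.split())))
    return header, events


header, events = parse_trace(SAMPLE)
print(header["version"], len(events), events[1]["callee"])
```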

Slide 16

Requirement: vendor support…

• The application vendor must fully support the application’s trace data. The vendor must fully document the format of the trace file and the meaning of its content.

Slide 17

Requirement: business task orientation…

• A trace file “event” line maps to a logical unit of work—usually a subroutine. The unit of work must be small enough that the reader of the trace data doesn’t require more detail about the unit of work than is rendered in the trace file. The unit of work must be large enough to minimize the measurement intrusion effect of the instrumentation.

• Every time a business-level task begins or ends, the application must emit information to the trace file to signify the business task boundary.

Slide 18

An example trace file…

version=1.1
key=time ela usr sys dep caller callee p1 p2 p3
1107275831.899634=2005/02/01/10:37:11.899634
1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2

Note that my example doesn’t yet demonstrate the second point (task begin/end markers).
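
Since the example file has no task boundary markers, the Python sketch below shows one possible way to emit them. Everything about the marker format is an assumption made for illustration: the "task-begin" and "task-end" words, the field order, and the business_task helper are hypothetical and do not come from the slides or the paper.

```python
import io
import time
from contextlib import contextmanager


@contextmanager
def business_task(trace, name, clock=time.time):
    """Bracket a business task with begin/end marker lines in the trace."""
    start = clock()
    trace.write(f"{start:.6f} task-begin {name}\n")
    try:
        yield
    finally:
        # The end marker is written even if the task raises an exception.
        end = clock()
        trace.write(f"{end:.6f} task-end {name} {end - start:.6f}\n")


# Usage: everything inside the with-block belongs to the "book-order" task.
trace = io.StringIO()
with business_task(trace, "book-order"):
    pass  # the instrumented work would run here
print(trace.getvalue(), end="")
```

A context manager keeps the begin and end markers paired without asking the application author to remember the closing call.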

Slide 19

Requirement: coverage…

• The collection of “event” lines in the trace file must provide complete coverage of the application’s code path.

• If a business task is permitted to execute piecewise across two or more OS processes, then the trace data must contain markers sufficient to assemble the relevant fragments of trace data into one contiguous time-sequential description of the task’s response time.

• Each tier must be instrumented so that a user can compute end-to-end response time for the measured task.
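
The assembly step can be sketched in Python as follows. The sketch assumes each event carries a task identifier (the "task" field is hypothetical), that the timestamp marks the event's conclusion, and that the clocks of the participating processes are synchronized well enough to compare.

```python
def assemble_task(fragments, task_id):
    """Merge per-process trace fragments into one time-ordered list for a task."""
    events = [e for fragment in fragments for e in fragment if e["task"] == task_id]
    return sorted(events, key=lambda e: e["time"])


def response_time(events):
    """End-to-end response time: latest conclusion minus earliest start."""
    start = min(e["time"] - e["ela"] for e in events)
    end = max(e["time"] for e in events)
    return end - start


# Two hypothetical fragments, one per OS process.
web_tier = [
    {"task": "T1", "time": 10.5, "ela": 0.25, "callee": "parse-request"},
    {"task": "T1", "time": 11.75, "ela": 0.5, "callee": "render-page"},
    {"task": "T2", "time": 12.0, "ela": 0.25, "callee": "parse-request"},
]
db_tier = [
    {"task": "T1", "time": 11.0, "ela": 0.5, "callee": "run-query"},
]

timeline = assemble_task([web_tier, db_tier], "T1")
print([e["callee"] for e in timeline], response_time(timeline))
```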

Slide 20

Requirement: timestamps...

• Each “event” line must have a timestamp. The trace documentation must explain to what event that timestamp refers. (Typically, it’s the time of the event’s conclusion.)

• If the trace file’s timestamp values aren’t human-readable, then the trace file must provide information that allows for easy conversion of timestamps into human-readable wall-clock values.

Slide 21

An example trace file…

version=1.1
key=time ela usr sys dep caller callee p1 p2 p3
1107275831.899634=2005/02/01/10:37:11.899634
1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2
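
The third line of the example pairs an epoch-style timestamp with its wall-clock equivalent, which is enough to convert every other timestamp in the file. The Python sketch below shows the arithmetic; parse_anchor and to_wall_clock are hypothetical names, and Decimal is used only to keep the microsecond digits exact.

```python
from datetime import datetime, timedelta
from decimal import Decimal


def parse_anchor(line):
    """Split an anchor line such as '1107275831.899634=2005/02/01/10:37:11.899634'."""
    epoch_text, _, wall_text = line.partition("=")
    wall = datetime.strptime(wall_text, "%Y/%m/%d/%H:%M:%S.%f")
    return Decimal(epoch_text), wall


def to_wall_clock(time_text, anchor):
    """Convert a trace timestamp to the wall-clock time the anchor implies."""
    epoch, wall = anchor
    offset = Decimal(time_text) - epoch  # seconds since the anchor
    return wall + timedelta(microseconds=int(offset * 1_000_000))


anchor = parse_anchor("1107275831.899634=2005/02/01/10:37:11.899634")
# Timestamp of the last event line in the example trace file.
print(to_wall_clock("1107275836.902033", anchor))
```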

Slide 22

Requirement: event attributes…

• Each “event” line must show an elapsed time consumption.
• Each “event” line must show resource consumption for both kernel mode and user mode CPU usage.
• Each “event” line must show the name of the “event,” its call stack depth, and the name of its caller.
• Each “event” line must have the provision for displaying context-sensitive values about the instrumented event.
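
One way to collect these attributes is a decorator that wraps each instrumented subroutine. The Python sketch below is an illustration only: the Tracer class is hypothetical, the field order copies the key line of the example trace file, and the positional arguments of the call stand in for the context-sensitive values.

```python
import functools
import io
import os
import time


class Tracer:
    """Writes one event line per instrumented call, when the call concludes."""

    def __init__(self, out):
        self.out = out
        self.stack = []  # names of instrumented calls currently in progress

    def event(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            caller = self.stack[-1] if self.stack else "<>"
            depth = len(self.stack)
            self.stack.append(func.__name__)
            t0, c0 = time.time(), os.times()
            try:
                return func(*args, **kwargs)
            finally:
                t1, c1 = time.time(), os.times()
                self.stack.pop()
                # time ela usr sys dep caller callee p1 p2 ...
                fields = [
                    f"{t1:.6f}", f"{t1 - t0:.6f}",
                    f"{c1.user - c0.user:.6f}", f"{c1.system - c0.system:.6f}",
                    str(depth), caller, func.__name__, *map(str, args),
                ]
                self.out.write(" ".join(fields) + "\n")
        return wrapper


tracer = Tracer(io.StringIO())


@tracer.event
def randomer(state, count):
    return count


@tracer.event
def dad(iteration):
    return randomer("TX", 4)


dad(1)
print(tracer.out.getvalue(), end="")
```

Because a line is written when its event concludes, the nested randomer call appears before the dad call that invoked it, as in the example trace file.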

Slide 23

An example trace file…

version=1.1
key=time ela usr sys dep caller callee p1 p2 p3
1107275831.899634=2005/02/01/10:37:11.899634
1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2
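
With these attributes on every line, a simple profile takes only a few lines of code. The Python sketch below sums elapsed time and counts calls per callee for the event lines above; profile is a hypothetical helper. Its totals are inclusive: the elapsed time of dad includes the randomer calls nested inside it, so the per-callee totals should not be added together.

```python
from collections import defaultdict
from decimal import Decimal

# Event lines from the example trace file (header lines omitted).
EVENTS = """\
1107275831.899634 0.000472 0.000000 0.000000 0 <> open-trace STDOUT
1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4
1107275833.456690 1.556574 1.520000 0.000000 0 <> dad 1
1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584
1107275835.307857 1.634442 1.500000 0.000000 1 dad randomer TX 4
1107275836.901840 1.593879 1.510000 0.000000 1 dad randomer TX 4
1107275836.902033 3.228636 3.010000 0.000000 0 <> dad 2
"""


def profile(event_text):
    """Return {callee: (call count, total inclusive elapsed seconds)}."""
    totals = defaultdict(lambda: [0, Decimal("0")])
    for line in event_text.splitlines():
        time_, ela, usr, sys_, dep, caller, callee, *params = line.split()
        totals[callee][0] += 1
        totals[callee][1] += Decimal(ela)
    return {name: (calls, ela) for name, (calls, ela) in totals.items()}


for name, (calls, ela) in sorted(profile(EVENTS).items(), key=lambda kv: -kv[1][1]):
    print(f"{name:<12} {calls:>3} {ela}")
```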

Slide 24

Requirement: un-buffered output...

• The application must flush trace lines to the trace file as events complete. If the application can buffer its trace emissions, then there must exist a user-selectable option to produce un-buffered output.
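
In Python, for instance, a user-selectable unbuffered mode could be sketched with line buffering. The function name and the unbuffered parameter are hypothetical. Note that this hands each line to the operating system as soon as it is written; it does not force the data onto the disk.

```python
import os
import tempfile


def open_trace(path, unbuffered=True):
    """Open a trace file for appending.

    With unbuffered=True, buffering=1 selects line buffering in text mode,
    so each complete event line is flushed as soon as it is written.
    With unbuffered=False, Python's default block buffering is used.
    """
    return open(path, "a", buffering=1 if unbuffered else -1)


# Usage: the line is visible to another reader before the file is closed.
path = os.path.join(tempfile.mkdtemp(), "app.trc")
trace = open_trace(path)
trace.write("1107275833.673210 0.216378 0.000000 0.000000 0 <> sleeper 0.202584\n")
with open(path) as reader:
    print(reader.read(), end="")
trace.close()
```

Unbuffered output matters most when a program hangs or crashes: the last lines written show where it was.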

Slide 25

Requirement: conservation of storage...

• Trace data should be reasonably conservative about space consumption (and the time it takes to write the trace data). For example, a single key defining the meaning of delimiter-separated fields is more efficient than using a name=value style syntax for every field throughout the trace file.
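
The space difference is easy to estimate. The Python sketch below compares the two styles for one event line taken from the example trace file; the function names are hypothetical, and the comparison counts characters only.

```python
FIELDS = "time ela usr sys dep caller callee p1 p2".split()
VALUES = "1107275833.456488 1.556308 1.520000 0.000000 1 dad randomer TX 4".split()


def keyed_size(n_events):
    """Characters needed when one key line names the fields for the whole file."""
    key_line = "key=" + " ".join(FIELDS) + "\n"
    event_line = " ".join(VALUES) + "\n"
    return len(key_line) + n_events * len(event_line)


def name_value_size(n_events):
    """Characters needed when every field on every line repeats its name."""
    event_line = " ".join(f"{k}={v}" for k, v in zip(FIELDS, VALUES)) + "\n"
    return n_events * len(event_line)


for n in (1, 1000):
    print(n, keyed_size(n), name_value_size(n))
```

The key line is a one-time cost, so its advantage grows with the number of events in the file.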

Slide 26

Requirement: minimized invasiveness…

• The application instrumentation must be minimally invasive upon the response time of the application.

• The application instrumentation must be minimally invasive upon the author of the application code.
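
Both kinds of invasiveness can be kept low with a decorator that costs the author one line per subroutine and costs nothing at run time when tracing is off. The Python sketch below is one possible shape; traced and the sink list are hypothetical names, and a real implementation would write event lines to the trace file instead of appending to a list.

```python
import functools
import time


def traced(enabled, sink):
    """Return a decorator; with tracing off it hands back the function untouched."""
    def decorate(func):
        if not enabled:
            return func  # no wrapper at all, so no per-call overhead
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            t0 = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                sink.append((func.__name__, time.perf_counter() - t0))
        return wrapper
    return decorate


events = []


@traced(enabled=True, sink=events)
def lookup(key):
    return key.upper()


lookup("tx")
print(events[0][0])
```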

Slide 27

Wrap-up

• You can’t manage what you can’t measure
• Instrumented code is faster, better code that’s easier and cheaper to support
• You can learn a lot about instrumentation by watching how Oracle does it
• The “requirements” proposed in my paper will help you create trace files that are easier to use than Oracle’s

If you will ever be responsible for the performance of your application, then you’ll thank yourself later if you instrument it today.

So will your support staff.

Slide 28

Hotsos: Come see us…

• Products
  – Hotsos Profiler
  – Laredo
  – Interceptor technologies
• Education
  – Oracle performance curriculum
  – Hotsos Symposium
• Thought leadership
  – Optimizing Oracle Performance
  – Oracle Insights
  – Method R
• Services
  – 1-week performance assessment
  – On-site consulting and education
  – Remote consulting

Slide 29

References

Finnigan, P. 2004. “How to set trace for others’ sessions, for your own session, and at instance level.” www.petefinnigan.com

Millsap, C. 2005. “Profiling Oracle: how it works.” Hotsos Symposium 2005

Millsap, C. 2004. “How to activate extended SQL trace.” www.hotsos.com

Millsap, C.; Holt, J. 2003. Optimizing Oracle Performance. Sebastopol CA: O’Reilly & Associates

Norgaard, M.; et al. 2004. Oracle Insights: Tales of the Oak Table. Berkeley CA: Apress
A collection of stories about experiences with Oracle performance, including a history of Oracle’s extended SQL trace mechanism.

The Open Group 1988. ARM 2.0 Technical Standard. www.opengroup.org/tech/management/arm/
A description of the “Application Response Measurement (ARM) API,” an application measurement system implemented in C and Java.