Diagnosing WebSphere Thread Dumps

IBM Software Group ® Ricky Marley, Advisory Software Engineer, WebSphere Application Server L2 Support Team Lead WebSphere® Support Technical Exchange Diagnosing WebSphere Application Server Hangs on AIX

Upload: hgentyala

Post on 22-Nov-2014

Page 1: Diagnosing WebSphere Thread Dumps

IBM Software Group®

Ricky Marley, Advisory Software Engineer, WebSphere Application Server L2 Support Team Lead

WebSphere® Support Technical Exchange

Diagnosing WebSphere Application Server Hangs on AIX

Page 2: Diagnosing WebSphere Thread Dumps

Agenda

- Background/Overview
- What Data Should be Collected to Diagnose the Problem
- Javacores
  - Basics
  - Thread Analysis
  - Lock Analysis
  - Understanding Javacores from a WebSphere AppServer
- Summary
- Questions?

Page 3: Diagnosing WebSphere Thread Dumps

What is a Hang or Degradation in Performance?

A hang is a condition where the Java Virtual Machine (JVM) becomes unresponsive to client requests.

When a client complains about a hang, the first thing to understand is “What type of requests to WebSphere are unresponsive?”

Starting with V5.1.1, the message below is written to the SystemOut file when a hung thread is detected:

[7/15/04 15:03:11:502 EDT] 3c3b4e37 ThreadMonitor W WSVR0605W: Thread "Servlet.Engine.Transports : 0" (37c18e37) has been active for 680,839 milliseconds and may be hung. There are 1 threads in total in the server that may be hung.
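As a rough illustration of how the WSVR0605W warning above could be pulled out of a SystemOut file, here is a minimal Python sketch. The regular expression is written only against the single sample message shown on this slide, and the function name `find_hung_threads` is hypothetical; messages in other releases may be worded differently.

```python
import re

# Pattern modeled on the one WSVR0605W sample line shown above (an assumption,
# not an official message grammar).
HUNG_THREAD = re.compile(
    r'WSVR0605W: Thread "(?P<name>[^"]+)" \((?P<id>[0-9a-fA-F]+)\) '
    r'has been active for (?P<ms>[\d,]+) milliseconds'
)

def find_hung_threads(log_text):
    """Return (thread name, thread id, active milliseconds) per warning found."""
    hits = []
    for match in HUNG_THREAD.finditer(log_text):
        millis = int(match.group("ms").replace(",", ""))
        hits.append((match.group("name"), match.group("id"), millis))
    return hits

sample = (
    '[7/15/04 15:03:11:502 EDT] 3c3b4e37 ThreadMonitor W WSVR0605W: '
    'Thread "Servlet.Engine.Transports : 0" (37c18e37) has been active for '
    '680,839 milliseconds and may be hung. There are 1 threads in total in '
    'the server that may be hung.'
)

for name, thread_id, millis in find_hung_threads(sample):
    print(f"{name} ({thread_id}) active for {millis / 1000:.0f} s")
```

The thread name reported here is the same name that appears in the javacore, so it can serve as the starting point for the thread analysis described later.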

Page 4: Diagnosing WebSphere Thread Dumps

What Can Cause a Hang?

- Circular dependency in application code causing a deadlock in JVM
- Bottleneck caused by:
  - Improper tuning of the webserver, Web Container (JVM), or database
  - Synchronization of Java code in the JVM
  - Web Container waiting for a response from an external resource
  - Limitation of resources
    - CPU
    - Memory (typically Java heap)

Page 5: Diagnosing WebSphere Thread Dumps

Threadpools and Their Role in the Appserver

Within the WebSphere Application Server process, there are various types of thread pools that serve client requests:

- WebContainer Thread pool
- Object Request Broker (ORB) Thread pool
- Data Replication Service (DRS) Thread pool
- Java Message Service (JMS) Thread pool
- Alarm Thread pool that will be reused for various purposes
- SOAP Connector Thread pool
- Application defined Thread pool

The following link illustrates the various states in which you may find a WebContainer thread: http://www-1.ibm.com/support/docview.wss?uid=swg21137491

Page 6: Diagnosing WebSphere Thread Dumps

Data Needed to Debug an AppServer Hang Issue

Application logs alone cannot be used to debug this type of problem. Documentation that needs to be collected at the time of the problem can be found by clicking on the link for “Hangs / Performance Degradation” at the following website:

http://www-1.ibm.com/support/docview.wss?uid=swg21145599

These documents include instructions for gathering:

- Javacores
  - Will be found in the AppServer Working Directory
    - WebSphere 5.X - /usr/WebSphere/AppServer
    - WebSphere 6.X - /usr/WebSphere/AppServer/profiles/server1
- netstat
- CPU utilization - tprof
- Memory utilization - verbose GC data

Page 7: Diagnosing WebSphere Thread Dumps

Introduction to Javacores – The Basics

A javacore, also known as a Java™ dump, Java™ thread dump, or thread dump, is a file that contains a large amount of useful information about the JVM. The data is presented in sections that cover:

- All of the threads that run on a Java™ Virtual Machine (JVM).
- All of the monitors on a JVM.
- Some useful information about the system that the JVM runs under.

Page 8: Diagnosing WebSphere Thread Dumps

Introduction to Javacores – Dump Routines

The following chart presents the logical subcomponents in which a Javacore presents information from the JVM. To debug a performance degradation, focus should be placed on the LK and XM subcomponents.

Tags

- Title
- XHPI Subcomponent
- CI Subcomponent
- LK Subcomponent
- XM Subcomponent
- CL Subcomponent

Page 9: Diagnosing WebSphere Thread Dumps

Introduction to Javacores – Dump Routines

The following figure is an example of the “Title” dump routine which presents the reason the JVM produced the Javacore file. The TITLE subcomponent contains the date and time of occurrence and the location within your file system. You can use this information to determine if the Java dump was created by a user or was system generated.

NULL ------------------------------------------------------------------------
0SECTION TITLE subcomponent dump routine
NULL ===============================
1TISIGINFO signal 3 received
1TIDATETIME Date: 2006/08/29 at 14:47:04
1TIFILENAME Javacore filename: /usr/WebSphere/AppServer/javacore29534.1156877224.txt
NULL ------------------------------------------------------------------------
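The TITLE section lends itself to a small parser. The Python sketch below reads the signal number, date, and file name from the tags in the sample, and splits the file name into a process ID and a timestamp. Treating the two numbers in `javacore29534.1156877224.txt` as process ID and epoch seconds is an assumption inferred from this one sample (the second number does correspond to the dump time shown); both function names are hypothetical.

```python
import re
from datetime import datetime, timezone

def parse_title_section(javacore_text):
    """Pull the signal number, dump date/time, and file name from TITLE tags."""
    info = {}
    for line in javacore_text.splitlines():
        line = line.strip()
        if line.startswith("1TISIGINFO"):
            m = re.search(r"signal (\d+) received", line)
            if m:
                info["signal"] = int(m.group(1))
        elif line.startswith("1TIDATETIME"):
            m = re.search(r"Date:\s+(\d{4}/\d{2}/\d{2}) at (\d{2}:\d{2}:\d{2})", line)
            if m:
                info["date"], info["time"] = m.group(1), m.group(2)
        elif line.startswith("1TIFILENAME") and "Javacore filename:" in line:
            info["filename"] = line.split("Javacore filename:", 1)[1].strip()
    return info

def parse_javacore_name(path):
    """Split javacore<pid>.<epoch>.txt (naming assumed from the sample above)."""
    m = re.search(r"javacore(\d+)\.(\d+)\.txt$", path)
    if not m:
        return None
    return int(m.group(1)), datetime.fromtimestamp(int(m.group(2)), tz=timezone.utc)

sample = """\
0SECTION TITLE subcomponent dump routine
NULL ===============================
1TISIGINFO signal 3 received
1TIDATETIME Date: 2006/08/29 at 14:47:04
1TIFILENAME Javacore filename: /usr/WebSphere/AppServer/javacore29534.1156877224.txt
"""

info = parse_title_section(sample)
print(info)
print(parse_javacore_name(info["filename"]))
```

Signal 3 is SIGQUIT, the signal sent by `kill -3 <pid>`, so a value of 3 here is a reasonable hint that the dump was requested rather than produced by a crash.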

Page 10: Diagnosing WebSphere Thread Dumps

Introduction to Javacores – Dump Routines

The following figure is an example of the “XHPI subcomponent” dump routine. This section will identify the operating environment, memory information, user limits, and other operating system related details.

NULL ------------------------------------------------------------------------
0SECTION XHPI subcomponent dump routine
NULL ==============================
1XHTIME Tue Aug 29 14:47:04 2006
1XHSIGRECV SIGQUIT received at 0x0 in <unknown>.
1XHFULLVERSION J2RE 1.4.2 IBM AIX build ca1420-20040626
1XHOPENV Operating Environment
NULL ---------------------
2XHHOSTNAME Host : aixwas3.rtp.raleigh.ibm.com:9.42.115.17
2XHOSLEVEL OS Level : AIX 5.3.0.0
1XHENVVARS Environment Variables
NULL ---------------------
2XHENVVAR MANPATH=/usr/dt/man:/usr/share/man….

Page 11: Diagnosing WebSphere Thread Dumps

Introduction to Javacores – Dump Routines

The following figure is an example of the “CI subcomponent” dump routine. Use this section to identify the JVM build, the class path that the JVM uses, the JVM system property variables, and other system information related to the JVM.

NULL ----------------------------------------------------------------
0SECTION CI subcomponent dump routine
NULL ============================
1CIJAVAVERSION J2RE 1.4.2 IBM AIX build ca1420-20040626
1CIRUNNINGAS Running as a standalone JVM
1CICMDLINE /usr/WebSphere/AppServer/java/bin/java -Djava.net.preferIPv4Stac
1CIJAVAHOMEDIR Java Home Dir: /usr/WebSphere/AppServer/java/bin/../jre
1CIJAVADLLDIR Java DLL Dir: /usr/WebSphere/AppServer/java/bin/../jre/bin
1CISYSCP Sys Classpath: /usr/WebSphere/AppServer/java/bin/../jre/lib/co
1CIUSERARGS UserArgs:
2CIUSERARG vfprintf 0x30000EF4
2CIUSERARG -Djava.net.preferIPv4Stack=true
2CIUSERARG -Dwas.status.socket=35676
2CIUSERARG -Xbootclasspath/p:/usr/WebSphere/AppServer/java/jre/li
2CIUSERARG -Xms50m
2CIUSERARG -Xmx256m
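Because memory (typically Java heap) is one of the resource limits named earlier, it can help to read the heap settings straight from the CI section. The following Python sketch collects the 2CIUSERARG lines and picks out -Xms (initial heap) and -Xmx (maximum heap). The function names are hypothetical, and the sample lines are a subset of the figure above.

```python
def user_args(javacore_text):
    """Return the argument text of every 2CIUSERARG line."""
    args = []
    for line in javacore_text.splitlines():
        line = line.strip()
        if line.startswith("2CIUSERARG"):
            args.append(line[len("2CIUSERARG"):].strip())
    return args

def heap_settings(args):
    """Pick the initial (-Xms) and maximum (-Xmx) heap sizes out of the args."""
    settings = {}
    for arg in args:
        if arg.startswith("-Xms"):
            settings["initial"] = arg[len("-Xms"):]
        elif arg.startswith("-Xmx"):
            settings["maximum"] = arg[len("-Xmx"):]
    return settings

sample = """\
1CIUSERARGS UserArgs:
2CIUSERARG vfprintf 0x30000EF4
2CIUSERARG -Djava.net.preferIPv4Stack=true
2CIUSERARG -Dwas.status.socket=35676
2CIUSERARG -Xms50m
2CIUSERARG -Xmx256m
"""

print(heap_settings(user_args(sample)))
```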

Page 12: Diagnosing WebSphere Thread Dumps

Introduction to Javacores – Dump Routines

The following figure is an example of the “LK subcomponent” dump routine. This component provides information for all of the monitors in the JVM. You can use this section to analyze how resources are being used by various threads.

0SECTION LK subcomponent dump routine
NULL ============================
NULL
1LKPOOLINFO Monitor pool info:
2LKPOOLINIT Initial monitor count: 32
2LKPOOLEXPNUM Minimum number of free monitors before expansion: 5
2LKPOOLEXPBY Pool will next be expanded by: 182
2LKPOOLTOTAL Current total number of monitors: 364
2LKPOOLFREE Current number of free monitors: 163
NULL
1LKMONPOOLDUMP Monitor Pool Dump (flat & inflated object-monitors):
2LKMONINUSE sys_mon_t:0x44F79558 infl_mon_t: 0x00000000:
3LKMONOBJECT java.lang.Object@359F1130/359F1138: Flat locked by thread ident 0x26
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "Servlet.Engine.Transports : 0" (0x450914A0)

Page 13: Diagnosing WebSphere Thread Dumps

Introduction to Javacores – Dump Routines

The following figure is an example of the “XM subcomponent” dump routine. This section provides a detailed list of all the threads and their states. It also provides the current stack trace for each thread listed. You can use the stack trace to relate the current activity of the thread to the source code.

0SECTION XM subcomponent dump routine
NULL ============================
NULL
1XMCURTHDINFO Current Thread Details
NULL ----------------------
3XMTHREADINFO "Signal dispatcher" (TID:0x300FB960, sys_thread_t:0x412CDAA0, state:R
3XHNATIVESTACK Native Stack
NULL ------------
3XHSTACKLINEERR unavailable - stack address not valid
1XMTHDINFO All Thread Details
NULL ------------------
2XMFULLTHDDUMP Full thread dump Classic VM (J2RE 1.4.2 IBM AIX build ca1420-20040626
3XMTHREADINFO "Servlet.Engine.Transports : 3" (TID:0x30285098, sys_thread_t:0x4660CD20
4XESTACKTRACE at TE02Servlet.doLock02(TE02Servlet.java:77)
4XESTACKTRACE at TE02Servlet.doPost(TE02Servlet.java:36)
4XESTACKTRACE at TE02Servlet.doGet(TE02Servlet.java:18)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
. . .

Page 14: Diagnosing WebSphere Thread Dumps

Introduction to javacores – Monitor analysis

The following table shows how the LK Dump routine is broken down into five subsections:

- Monitor Pool Info: Provides basic monitor information, such as, the current total number of monitors, and so on.
- Monitor Pool Dump: Lists the monitors that exist in the JVM at the time of the Java dump along with the threads waiting for that particular monitor and the owner of each monitor. Monitors can also be owned or not owned which will be indicated in the file. You can use this section to identify any problems, such as, too many threads waiting on a monitor.
- JVM System Monitor Dump: Similar to the monitor pool dump except that instead of listing any monitor, it lists all of the system monitors on the JVM for which the Java dump was captured.
- Java Object Monitor Dump: The same as the monitor pool dump except that additional JVM internal information is provided.
- Thread Identifiers: Provides an association between the XM dump routine and the monitor pool dump. You can use this section as a pointer to the location of the problem identified in the monitor pool dump.

Page 15: Diagnosing WebSphere Thread Dumps

Introduction to javacores – Thread analysis

Thread States

State | Name | Description
R | Runnable | Thread that has the ability to run or is running.
CW | Conditional Wait | Thread waiting on a condition variable. Typically threads in this state are waiting for a certain condition to occur, for example, a thread waiting for a resource to become available.
MW | Monitor Wait | Thread waiting on a monitor lock.
S | Suspended | Thread suspended.
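A quick first pass over a javacore is to count how many threads are in each state. The Python sketch below does this by reading the `state:` field of each 3XMTHREADINFO line. It assumes each thread header sits on one line in the layout shown elsewhere in this deck; the thread names, addresses, and states in `sample` are made up purely for illustration, and `count_thread_states` is a hypothetical name.

```python
import re
from collections import Counter

# Names for the state codes, taken from the thread state table.
STATE_NAMES = {
    "R": "Runnable",
    "CW": "Conditional Wait",
    "MW": "Monitor Wait",
    "S": "Suspended",
}

THREAD_INFO = re.compile(r'3XMTHREADINFO\s+"([^"]+)".*?state:([A-Z]+)')

def count_thread_states(javacore_text):
    """Count threads per state code found on 3XMTHREADINFO lines."""
    counts = Counter()
    for line in javacore_text.splitlines():
        m = THREAD_INFO.search(line)
        if m:
            counts[m.group(2)] += 1
    return counts

# Made-up header lines in the layout of the deck's samples (illustration only).
sample = """\
3XMTHREADINFO "example-thread-1" (TID:0x1000, sys_thread_t:0x2000, state:MW, native ID:0x1) prio=5
3XMTHREADINFO "example-thread-2" (TID:0x1001, sys_thread_t:0x2001, state:MW, native ID:0x2) prio=5
3XMTHREADINFO "example-thread-3" (TID:0x1002, sys_thread_t:0x2002, state:CW, native ID:0x3) prio=5
3XMTHREADINFO "example-thread-4" (TID:0x1003, sys_thread_t:0x2003, state:R, native ID:0x4) prio=5
"""

for code, total in count_thread_states(sample).most_common():
    print(f"{code} ({STATE_NAMES.get(code, 'unknown')}): {total}")
```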

Page 16: Diagnosing WebSphere Thread Dumps

Analyzing Javacores

There are many ways to approach analyzing the Javacores to determine the cause of a hang or performance degradation. It is suggested to use the following methodology in the order listed:

- Is there a deadlock?
- Is there a monitor that has a long list of threads waiting for it?
- What are the Servlet.Engine.Transport threads doing?
  A. Are they idle waiting for requests?
  B. Are they busy and many of them have the same or similar stacks?
  C. Are some or most of the threads busy and they all appear to be doing something different?

Javacores can be analyzed manually or by using the ThreadAnalyzer tool (http://www-128.ibm.com/developerworks/websphere/downloads/thread_analyzer.html)

Page 17: Diagnosing WebSphere Thread Dumps

Analyzing javacores - Deadlock

The Javacore will report a deadlock as follows:

1LKDEADLOCK Deadlock detected !!!
NULL ---------------------
NULL
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)
3LKDEADLOCKWTR is waiting for:
4LKDEADLOCKMON sys_mon_t:0x4532E7C8 infl_mon_t: 0x45170CC0:
4LKDEADLOCKOBJ java.lang.Object@359F1120/395F1128:
3LKDEADLOCKOWN which is owned by:
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 0" (0x450914A0)
3LKDEADLOCKWTR which is waiting for:
4LKDEADLOCKMON sys_mon_t:0x44F79558 infl_mon_t: 0x00000000:
4LKDEADLOCKOBJ java.lang.Object@359F1130/395F1138:
3LKDEADLOCKOWN which is owned by:
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)
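The deadlock report reads as a chain: a thread, the monitor it waits for, and the thread that owns that monitor. A minimal Python sketch of turning that chain into (waiter, monitor object, owner) triples follows. It relies only on the 2LKDEADLOCKTHR and 4LKDEADLOCKOBJ tags seen in the sample above, and `parse_deadlock_chain` is a hypothetical name.

```python
import re

def parse_deadlock_chain(javacore_text):
    """Return [(waiting thread, monitor object, owning thread), ...]."""
    thread_re = re.compile(r'2LKDEADLOCKTHR\s+Thread "([^"]+)"')
    object_re = re.compile(r'4LKDEADLOCKOBJ\s+(\S+?):?\s*$')
    chain = []
    waiter = None   # thread named by the most recent 2LKDEADLOCKTHR line
    wanted = None   # monitor object that thread is waiting for
    for line in javacore_text.splitlines():
        line = line.strip()
        m = thread_re.match(line)
        if m:
            # A new thread line closes the previous link: it names the owner.
            if waiter is not None and wanted is not None:
                chain.append((waiter, wanted, m.group(1)))
            waiter, wanted = m.group(1), None
            continue
        m = object_re.match(line)
        if m:
            wanted = m.group(1)
    return chain

sample = """\
1LKDEADLOCK Deadlock detected !!!
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)
3LKDEADLOCKWTR is waiting for:
4LKDEADLOCKMON sys_mon_t:0x4532E7C8 infl_mon_t: 0x45170CC0:
4LKDEADLOCKOBJ java.lang.Object@359F1120/395F1128:
3LKDEADLOCKOWN which is owned by:
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 0" (0x450914A0)
3LKDEADLOCKWTR which is waiting for:
4LKDEADLOCKMON sys_mon_t:0x44F79558 infl_mon_t: 0x00000000:
4LKDEADLOCKOBJ java.lang.Object@359F1130/395F1138:
3LKDEADLOCKOWN which is owned by:
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)
"""

for waiter, monitor, owner in parse_deadlock_chain(sample):
    print(f"{waiter} waits for {monitor}, owned by {owner}")
```

When the owner in the last link is the waiter in the first, the chain is circular, which is the deadlock condition the report is flagging.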

Page 18: Diagnosing WebSphere Thread Dumps

Analyzing Javacores – Threads waiting on Monitor

The following output from a Javacore shows several Servlet.Engine.Transport threads waiting on a monitor named java.lang.Object@359F1140/359F1148. This monitor is owned by thread identifier 0x25.

2LKMONINUSE sys_mon_t:0x44F75D48 infl_mon_t: 0x00000000:
3LKMONOBJECT java.lang.Object@359F1140/359F1148: Flat locked by thread ident 0x25, entry count 1
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "Servlet.Engine.Transports : 1" (0x450EBAA0)
3LKWAITNOTIFY "Servlet.Engine.Transports : 0" (0x450914A0)
3LKWAITNOTIFY "Servlet.Engine.Transports : 3" (0x4660CD20)
3LKWAITNOTIFY "Servlet.Engine.Transports : 4" (0x4670CA14)
3LKWAITNOTIFY "Servlet.Engine.Transports : 5" (0x4760CF50)
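To spot a monitor with a long list of waiting threads, the monitor pool dump can be summarized per monitor. The Python sketch below maps each monitor object to its owner ident and the threads listed under 3LKWAITNOTIFY. It handles only the "Flat locked by thread ident" wording shown in this sample; other monitor forms are not covered, and `waiters_by_monitor` is a hypothetical name.

```python
import re

def waiters_by_monitor(javacore_text):
    """Map monitor object -> (owner ident, [waiting thread names])."""
    object_re = re.compile(
        r'3LKMONOBJECT\s+(\S+): Flat locked by thread ident (0x[0-9A-Fa-f]+)'
    )
    waiter_re = re.compile(r'3LKWAITNOTIFY\s+"([^"]+)"')
    monitors = {}
    current = None
    for line in javacore_text.splitlines():
        line = line.strip()
        if line.startswith("2LKMONINUSE"):
            current = None  # a new monitor entry begins
            continue
        m = object_re.match(line)
        if m:
            current = m.group(1)
            monitors[current] = (m.group(2), [])
            continue
        m = waiter_re.match(line)
        if m and current is not None:
            monitors[current][1].append(m.group(1))
    return monitors

sample = """\
2LKMONINUSE sys_mon_t:0x44F75D48 infl_mon_t: 0x00000000:
3LKMONOBJECT java.lang.Object@359F1140/359F1148: Flat locked by thread ident 0x25, entry count 1
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "Servlet.Engine.Transports : 1" (0x450EBAA0)
3LKWAITNOTIFY "Servlet.Engine.Transports : 0" (0x450914A0)
3LKWAITNOTIFY "Servlet.Engine.Transports : 3" (0x4660CD20)
3LKWAITNOTIFY "Servlet.Engine.Transports : 4" (0x4670CA14)
3LKWAITNOTIFY "Servlet.Engine.Transports : 5" (0x4760CF50)
"""

for monitor, (owner, waiters) in waiters_by_monitor(sample).items():
    print(f"{monitor}: owner ident {owner}, {len(waiters)} waiting")
```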

Page 19: Diagnosing WebSphere Thread Dumps

Analyzing Javacores – Threads waiting on Monitor

Examine the “Thread identifiers” section of the LK subcomponent dump to determine the name of the thread associated with “thread ident 0x25.” In this case, thread ident 0x25 is “Servlet.Engine.Transports : 2”

1LKFLATMONDUMP Thread identifiers (as used in flat monitors):
2LKFLATMON ident 0x26 "Servlet.Engine.Transports : 3" (0x4660CD20)
2LKFLATMON ident 0x25 "Servlet.Engine.Transports : 2" (0x464900A0)
2LKFLATMON ident 0x14 "Servlet.Engine.Transports : 1" (0x450EBAA0)
2LKFLATMON ident 0x13 "Servlet.Engine.Transports : 0" (0x450914A0)
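The ident lookup described above is a simple table read. A Python sketch under the assumption that every 2LKFLATMON line follows the layout in this sample; `thread_idents` is a hypothetical name.

```python
import re

FLATMON = re.compile(
    r'2LKFLATMON\s+ident (0x[0-9A-Fa-f]+) "([^"]+)" \((0x[0-9A-Fa-f]+)\)'
)

def thread_idents(javacore_text):
    """Map flat-monitor ident -> (thread name, system thread address)."""
    return {
        m.group(1): (m.group(2), m.group(3))
        for m in FLATMON.finditer(javacore_text)
    }

sample = """\
1LKFLATMONDUMP Thread identifiers (as used in flat monitors):
2LKFLATMON ident 0x26 "Servlet.Engine.Transports : 3" (0x4660CD20)
2LKFLATMON ident 0x25 "Servlet.Engine.Transports : 2" (0x464900A0)
2LKFLATMON ident 0x14 "Servlet.Engine.Transports : 1" (0x450EBAA0)
2LKFLATMON ident 0x13 "Servlet.Engine.Transports : 0" (0x450914A0)
"""

name, address = thread_idents(sample)["0x25"]
print(f"ident 0x25 is {name} ({address})")
```

The address in parentheses matches the `sys_thread_t` value on the thread's 3XMTHREADINFO line, which gives a second way to find the owner in the XM section.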

Page 20: Diagnosing WebSphere Thread Dumps

Analyzing Javacores – Threads waiting on Monitor

“Servlet.Engine.Transports : 2” is found in the XM component section of the Javacore. The Java stack trace for this thread is as follows:

3XMTHREADINFO "Servlet.Engine.Transports : 2" (TID:0x3018A658, sys_thread_t:0x464900A0, state:CW, native ID:0x2438) prio=5
4XESTACKTRACE at java.lang.Thread.sleep(Native Method)
4XESTACKTRACE at TE02Servlet.doLock01(TE02Servlet.java:64)
4XESTACKTRACE at TE02Servlet.doPost(TE02Servlet.java:33)
4XESTACKTRACE at TE02Servlet.doGet(TE02Servlet.java:18)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)

- The TE02Servlet has captured the lock and has gone to sleep.
- Until this thread releases the monitor, all threads waiting for this monitor will be blocked.
- Review subsequent Javacores and monitor the state of this thread.
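Reviewing the same thread across subsequent javacores can be partly automated. The Python sketch below extracts one thread's state and stack from each javacore text and reports whether they are identical in every dump. It assumes the thread header occupies a single line; `thread_snapshot` and `unchanged_across` are hypothetical names.

```python
import re

HEADER = re.compile(r'3XMTHREADINFO\s+"([^"]+)".*?state:([A-Z]+)')

def thread_snapshot(javacore_text, thread_name):
    """Return (state, [stack frames]) for the named thread, or None if absent."""
    state, frames, inside = None, [], False
    for line in javacore_text.splitlines():
        line = line.strip()
        if line.startswith("3XMTHREADINFO"):
            if inside:
                break  # reached the next thread; our stack is complete
            m = HEADER.match(line)
            if m and m.group(1) == thread_name:
                inside, state = True, m.group(2)
        elif inside and line.startswith("4XESTACKTRACE"):
            frames.append(line[len("4XESTACKTRACE"):].strip())
    return (state, frames) if inside else None

def unchanged_across(javacores, thread_name):
    """True if the thread appears in every dump with the same state and stack."""
    snaps = [thread_snapshot(text, thread_name) for text in javacores]
    return bool(snaps) and None not in snaps and all(s == snaps[0] for s in snaps)

sample = """\
3XMTHREADINFO "Servlet.Engine.Transports : 2" (TID:0x3018A658, sys_thread_t:0x464900A0, state:CW, native ID:0x2438) prio=5
4XESTACKTRACE at java.lang.Thread.sleep(Native Method)
4XESTACKTRACE at TE02Servlet.doLock01(TE02Servlet.java:64)
4XESTACKTRACE at TE02Servlet.doPost(TE02Servlet.java:33)
4XESTACKTRACE at TE02Servlet.doGet(TE02Servlet.java:18)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
"""

# Two copies of the same text stand in for two javacores taken minutes apart.
print(unchanged_across([sample, sample], "Servlet.Engine.Transports : 2"))
```

An unchanged stack across several dumps suggests the monitor owner has not progressed, which is the situation to look for when other threads are queued behind it.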

Page 21: Diagnosing WebSphere Thread Dumps

Analyzing Javacores – Servlet Engine Transport Threads

A. Threads are idle waiting for requests

Idle threads could indicate that requests are not making it into the JVM. At this point, you need to determine if any HTTP clients have been able to establish connections to the webserver/plugin or WebSphere transport.

- Is the transport thread present in the JVM?
- Are there any established connections to the webserver or application server?
- Review the webserver’s plugin log for potential errors connecting to the application server

Page 22: Diagnosing WebSphere Thread Dumps

Analyzing Javacores – Servlet Engine Transport Threads

B. Threads are busy and many of them have the same or similar stacks

1. Most of the threads are Runnable (State: R) and the top of their stack is java.net.SocketInputStream.socketRead

This indicates that there is not a bottleneck regarding synchronized Java resources. If the threads have similar stacks and are Runnable, they are most likely in socketRead waiting for a response from a remote server.

- What does the stack of the thread look like?
- What is the thread waiting on (LDAP, Oracle, HttpURLConnection, etc.)?
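One way to see whether many threads share the same or similar stacks is to group them by the frame at the top of each stack. The Python sketch below counts top frames for threads whose names start with a given prefix. The sample threads, addresses, and the `com.example` frame are made up for illustration only, and `top_frames` is a hypothetical name.

```python
import re
from collections import Counter

HEADER = re.compile(r'3XMTHREADINFO\s+"([^"]+)"')

def top_frames(javacore_text, name_prefix="Servlet.Engine.Transports"):
    """Count the first stack frame of each thread whose name has the prefix."""
    counts = Counter()
    awaiting_first_frame = False
    for line in javacore_text.splitlines():
        line = line.strip()
        if line.startswith("3XMTHREADINFO"):
            m = HEADER.match(line)
            awaiting_first_frame = bool(m) and m.group(1).startswith(name_prefix)
        elif awaiting_first_frame and line.startswith("4XESTACKTRACE"):
            counts[line[len("4XESTACKTRACE"):].strip()] += 1
            awaiting_first_frame = False  # only the top frame is counted
    return counts

# Made-up threads in the layout of the deck's samples (illustration only).
sample = """\
3XMTHREADINFO "Servlet.Engine.Transports : 0" (TID:0x1000, sys_thread_t:0x2000, state:R, native ID:0x1) prio=5
4XESTACKTRACE at java.net.SocketInputStream.socketRead(Native Method)
4XESTACKTRACE at com.example.RemoteCall.invoke(RemoteCall.java:10)
3XMTHREADINFO "Servlet.Engine.Transports : 1" (TID:0x1001, sys_thread_t:0x2001, state:R, native ID:0x2) prio=5
4XESTACKTRACE at java.net.SocketInputStream.socketRead(Native Method)
4XESTACKTRACE at com.example.RemoteCall.invoke(RemoteCall.java:10)
3XMTHREADINFO "Servlet.Engine.Transports : 2" (TID:0x1002, sys_thread_t:0x2002, state:R, native ID:0x3) prio=5
4XESTACKTRACE at com.example.Other.work(Other.java:5)
3XMTHREADINFO "Signal dispatcher" (TID:0x1003, sys_thread_t:0x2003, state:R, native ID:0x4) prio=5
4XESTACKTRACE at com.example.Unrelated.run(Unrelated.java:1)
"""

for frame, total in top_frames(sample).most_common():
    print(f"{total} thread(s): {frame}")
```

Grouping on the top frame alone is a coarse choice; reading a few frames further down the stack usually shows which remote resource (LDAP, a database, an HTTP call) the threads are waiting on.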

Page 23: Diagnosing WebSphere Thread Dumps

Analyzing Javacores – Servlet Engine Transport Threads

B. Threads are busy and many of them have the same or similar stacks

2. Most of the threads are in monitor wait or conditional wait (State: MW or CW) indicating a potential bottleneck in synchronized code

- What does the stack look like for the threads that are in MW or CW?
- What thread holds the monitor that is blocking these threads? This thread should be reviewed in each thread dump.
  – Does the stack of this thread remain the same? If so, none of the threads that are waiting on that monitor have been able to continue processing.
  – Does the stack of this thread change? If enough threads are attempting to pass through the code that is synchronized and the thread that holds the monitor takes a relatively long time to release it, this could cause a hang or performance degradation

Page 24: Diagnosing WebSphere Thread Dumps

Analyzing Javacores – Servlet Engine Transport Threads

C. Threads are not idle, some or most of the available threads are in use, and they all appear to be doing something different. This can indicate a resource issue.

Is CPU utilization high? High CPU will cause threads to contend for CPU cycles

What does memory utilization look like?

What does VerboseGC data look like? When GC runs, all other threads are suspended.

Page 25: Diagnosing WebSphere Thread Dumps

Summary

Identifying when a JVM is encountering a performance degradation

Collecting the documentation required to debug the problem

Locating the data collected

Introduction to the content of javacores

Analyzing javacores for a WebSphere Application Server JVM

Page 26: Diagnosing WebSphere Thread Dumps

Additional WebSphere Product Resources

- Discover the latest trends in WebSphere Technology and implementation, participate in technically-focused briefings, webcasts and podcasts at: www.ibm.com/developerworks/websphere/community/
- Learn about other upcoming webcasts, conferences and events: www.ibm.com/software/websphere/events_1.html
- Join the Global WebSphere User Group Community: www.websphere.org
- Access key product show-me demos and tutorials by visiting IBM Education Assistant: ibm.com/software/info/education/assistant
- Learn about the Electronic Service Request (ESR) tool for submitting problems electronically: www.ibm.com/software/support/viewlet/probsub/ESR_Overview_viewlet_swf.html
- Sign up to receive weekly technical My support emails: www.ibm.com/software/support/einfo.html

Page 27: Diagnosing WebSphere Thread Dumps

Questions and Answers