diagnosing websphere thread dumps
IBM Software Group
Ricky Marley, Advisory Software Engineer, WebSphere Application Server L2 Support Team Lead
WebSphere® Support Technical Exchange
Diagnosing WebSphere Application Server Hangs on AIX
Agenda
Background/Overview
What Data Should be Collected to Diagnose the Problem
Javacores
- Basics
- Thread Analysis
- Lock Analysis
- Understanding Javacores from a WebSphere AppServer
Summary
Questions?
What is a Hang or Degradation in Performance?
A hang is a condition where the Java Virtual Machine (JVM) becomes unresponsive to client requests
When a client complains about a hang, the first thing to understand is “What type of requests to WebSphere are unresponsive?”
Starting with V5.1.1, the message below is written to the SystemOut file when a hung thread is detected:
[7/15/04 15:03:11:502 EDT] 3c3b4e37 ThreadMonitor W WSVR0605W: Thread "Servlet.Engine.Transports : 0" (37c18e37) has been active for 680,839 milliseconds and may be hung. There are 1 threads in total in the server that may be hung.
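As a rough illustration, the hung-thread warnings can be pulled out of a SystemOut file with a short script. The sketch below is modeled only on the single sample message shown above; the function name `find_hung_threads` is hypothetical, and other WebSphere versions may word the message differently.

```python
import re

# Pattern modeled on the sample WSVR0605W line above (an assumption:
# other releases may format the message differently).
HUNG_THREAD = re.compile(
    r'WSVR0605W: Thread "(?P<name>[^"]+)" \((?P<id>[0-9a-fA-F]+)\) '
    r'has been active for (?P<ms>[\d,]+) milliseconds'
)

def find_hung_threads(lines):
    """Return (thread name, thread id, active milliseconds) per WSVR0605W line."""
    hits = []
    for line in lines:
        m = HUNG_THREAD.search(line)
        if m:
            millis = int(m.group("ms").replace(",", ""))
            hits.append((m.group("name"), m.group("id"), millis))
    return hits

sample = [
    '[7/15/04 15:03:11:502 EDT] 3c3b4e37 ThreadMonitor W WSVR0605W: Thread '
    '"Servlet.Engine.Transports : 0" (37c18e37) has been active for 680,839 '
    'milliseconds and may be hung. There are 1 threads in total in the server '
    'that may be hung.',
]
print(find_hung_threads(sample))
```

In practice the `lines` argument would come from reading the SystemOut file; the thread names reported here are the ones to look for in the javacores.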
What Can Cause a Hang?
- Circular dependency in application code causing a deadlock in JVM
- Bottleneck caused by:
  - Improper tuning of the webserver, Web Container (JVM), or database
  - Synchronization of Java code in the JVM
  - Web Container waiting for a response from an external resource
  - Limitation of resources
    - CPU
    - Memory (typically Java heap)
Threadpools and Their Role in the Appserver
Within the WebSphere Application Server process, there are various types of thread pools that serve client requests:
- WebContainer Thread pool
- Object Request Broker (ORB) Thread pool
- Data Replication Service (DRS) Thread pool
- Java Message Service (JMS) Thread pool
- Alarm Thread pool that will be reused for various purposes
- SOAP Connector Thread pool
- Application defined Thread pool
The following link illustrates the various states in which you may find a WebContainer thread:
http://www-1.ibm.com/support/docview.wss?uid=swg21137491
Data Needed to Debug an AppServer Hang Issue
Application logs alone cannot be used to debug this type of problem. Documentation that needs to be collected at the time of the problem can be found by clicking on the link for “Hangs / Performance Degradation” at the following website:
http://www-1.ibm.com/support/docview.wss?uid=swg21145599
These documents include instructions for gathering:
- Javacores
  • Will be found in the AppServer Working Directory
    WebSphere 5.X – /usr/WebSphere/AppServer
    WebSphere 6.X – /usr/WebSphere/AppServer/profiles/server1
- netstat
- CPU utilization – tprof
- Memory utilization – verbose GC data
Introduction to Javacores – The Basics
A javacore, also known as a Java™ dump, Java™ thread dump or a thread dump, is a file that contains a large amount of useful information about the JVM. The data is presented in sections that cover the following:
- All of the threads that run on a Java™ Virtual Machine (JVM).
- All of the monitors on a JVM.
- Some useful information about the system that the JVM runs under.
Introduction to Javacores – Dump Routines
The following chart presents the logical subcomponents in which a Javacore presents information from the JVM. To debug a performance degradation, focus should be placed on the LK and XM subcomponents.
[Chart: Tags]
- Title
- XHPI Subcomponent
- CI Subcomponent
- LK Subcomponent
- XM Subcomponent
- CL Subcomponent
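To work with one subcomponent at a time, a javacore can be split on its `0SECTION` lines, which is how each dump routine is introduced in the examples in this presentation. The following is a minimal sketch under that assumption; `split_sections` is a hypothetical helper name, and the sample lines are abbreviated.

```python
def split_sections(lines):
    """Group javacore lines under the name found on each 0SECTION line."""
    sections = {}
    current = None
    for line in lines:
        if line.startswith("0SECTION"):
            # "0SECTION LK subcomponent dump routine" -> "LK"
            current = line.split()[1]
            sections[current] = []
        elif current is not None and not line.startswith("NULL"):
            # NULL lines are separators only, so they are skipped.
            sections[current].append(line.rstrip())
    return sections

sample = [
    "NULL ------------------------------------------------------------------------",
    "0SECTION TITLE subcomponent dump routine",
    "NULL ===============================",
    "1TISIGINFO signal 3 received",
    "0SECTION LK subcomponent dump routine",
    "1LKPOOLINFO Monitor pool info:",
    "0SECTION XM subcomponent dump routine",
    "1XMCURTHDINFO Current Thread Details",
]
sections = split_sections(sample)
print(list(sections))
print(sections["LK"])
```

With the sections separated, the LK and XM entries can be handed to more specific parsing without scanning the whole file each time.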
Introduction to Javacores – Dump Routines
The following figure is an example of the “Title” dump routine which presents the reason the JVM produced the Javacore file. The TITLE subcomponent contains the date and time of occurrence and the location within your file system. You can use this information to determine if the Java dump was created by a user or was system generated.
NULL ------------------------------------------------------------------------
0SECTION TITLE subcomponent dump routine
NULL ===============================
1TISIGINFO signal 3 received
1TIDATETIME Date: 2006/08/29 at 14:47:04
1TIFILENAME Javacore filename: /usr/WebSphere/AppServer/javacore29534.1156877224.txt
NULL ------------------------------------------------------------------------
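When several javacores are collected, it helps to order them by the time they were taken. The sketch below assumes the file name has the form `javacore<pid>.<seconds>.txt`, where the second number is a Unix timestamp; that reading is an assumption, although it is consistent with the sample above (1156877224 is 18:47:04 UTC on 2006/08/29, four hours ahead of the 14:47:04 recorded in the dump). The helper name is hypothetical.

```python
import re
from datetime import datetime, timezone

def parse_javacore_name(path):
    """Split 'javacore<pid>.<seconds>.txt' into (pid, UTC datetime).

    Assumption: the two numbers are the process ID and the Unix time
    at which the dump was written.
    """
    m = re.search(r"javacore(\d+)\.(\d+)\.txt$", path)
    if m is None:
        raise ValueError("not a javacore file name: " + path)
    pid = int(m.group(1))
    taken = datetime.fromtimestamp(int(m.group(2)), tz=timezone.utc)
    return pid, taken

pid, taken = parse_javacore_name(
    "/usr/WebSphere/AppServer/javacore29534.1156877224.txt"
)
print(pid, taken.isoformat())
```

Sorting a directory listing by the returned timestamp gives the dumps in the order they were produced.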
Introduction to Javacores – Dump Routines
The following figure is an example of the “XHPI subcomponent” dump routine. This section will identify the operating environment, memory information, user limits, and other operating system related details.
NULL ------------------------------------------------------------------------
0SECTION XHPI subcomponent dump routine
NULL ==============================
1XHTIME Tue Aug 29 14:47:04 2006
1XHSIGRECV SIGQUIT received at 0x0 in <unknown>.
1XHFULLVERSION J2RE 1.4.2 IBM AIX build ca1420-20040626
…
1XHOPENV Operating Environment
NULL ---------------------
2XHHOSTNAME Host : aixwas3.rtp.raleigh.ibm.com:9.42.115.17
2XHOSLEVEL OS Level : AIX 5.3.0.0
…
1XHENVVARS Environment Variables
NULL ---------------------
2XHENVVAR MANPATH=/usr/dt/man:/usr/share/man….
Introduction to Javacores – Dump routines
The following figure is an example of the “CI subcomponent” dump routine. Use this section to identify the JVM build, the class path that the JVM uses, the JVM system property variables, and other system information related to the JVM.
NULL ----------------------------------------------------------------
0SECTION CI subcomponent dump routine
NULL ============================
1CIJAVAVERSION J2RE 1.4.2 IBM AIX build ca1420-20040626
1CIRUNNINGAS Running as a standalone JVM
1CICMDLINE /usr/WebSphere/AppServer/java/bin/java -Djava.net.preferIPv4Stac
1CIJAVAHOMEDIR Java Home Dir: /usr/WebSphere/AppServer/java/bin/../jre
1CIJAVADLLDIR Java DLL Dir: /usr/WebSphere/AppServer/java/bin/../jre/bin
1CISYSCP Sys Classpath: /usr/WebSphere/AppServer/java/bin/../jre/lib/co
1CIUSERARGS UserArgs:
2CIUSERARG vfprintf 0x30000EF4
2CIUSERARG -Djava.net.preferIPv4Stack=true
2CIUSERARG -Dwas.status.socket=35676
2CIUSERARG -Xbootclasspath/p:/usr/WebSphere/AppServer/java/jre/li
2CIUSERARG -Xms50m
2CIUSERARG -Xmx256m
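Because memory (typically Java heap) is one of the resource limits mentioned earlier, the heap settings are a useful detail to read from this section. The sketch below collects the `2CIUSERARG` entries and picks out `-Xms` and `-Xmx`, the standard JVM options for initial and maximum heap size. The helper names are hypothetical and the parsing is based only on the sample lines above.

```python
def user_args(ci_lines):
    """Return the arguments listed on 2CIUSERARG lines, in order."""
    args = []
    for line in ci_lines:
        if line.startswith("2CIUSERARG"):
            args.append(line[len("2CIUSERARG"):].strip())
    return args

def heap_settings(args):
    """Pick out the -Xms / -Xmx values (initial and maximum Java heap)."""
    found = {}
    for arg in args:
        for flag in ("-Xms", "-Xmx"):
            if arg.startswith(flag):
                found[flag] = arg[len(flag):]
    return found

sample = [
    "1CIUSERARGS UserArgs:",
    "2CIUSERARG vfprintf 0x30000EF4",
    "2CIUSERARG -Djava.net.preferIPv4Stack=true",
    "2CIUSERARG -Dwas.status.socket=35676",
    "2CIUSERARG -Xms50m",
    "2CIUSERARG -Xmx256m",
]
print(heap_settings(user_args(sample)))
```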
Introduction to Javacores – Dump Routines
The following figure is an example of the “LK subcomponent” dump routine. This component provides information for all of the monitors in the JVM. You can use this section to analyze how resources are being used by various threads.
0SECTION LK subcomponent dump routine
NULL ============================
NULL
1LKPOOLINFO Monitor pool info:
2LKPOOLINIT Initial monitor count: 32
2LKPOOLEXPNUM Minimum number of free monitors before expansion: 5
2LKPOOLEXPBY Pool will next be expanded by: 182
2LKPOOLTOTAL Current total number of monitors: 364
2LKPOOLFREE Current number of free monitors: 163
NULL
1LKMONPOOLDUMP Monitor Pool Dump (flat & inflated object-monitors):
2LKMONINUSE sys_mon_t:0x44F79558 infl_mon_t: 0x00000000:
3LKMONOBJECT java.lang.Object@359F1130/359F1138: Flat locked by thread ident 0x26
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "Servlet.Engine.Transports : 0" (0x450914A0)
Introduction to Javacores – Dump Routines
The following figure is an example of the “XM subcomponent” dump routine. This section provides a detailed list of all the threads and their states. It also provides the current stack trace for each thread listed. You can use the stack trace to relate the current activity of a thread with the source code.
0SECTION XM subcomponent dump routine
NULL ============================
NULL
1XMCURTHDINFO Current Thread Details
NULL ----------------------
3XMTHREADINFO "Signal dispatcher" (TID:0x300FB960, sys_thread_t:0x412CDAA0, state:R
3XHNATIVESTACK Native Stack
NULL ------------
3XHSTACKLINEERR unavailable - stack address not valid
1XMTHDINFO All Thread Details
NULL ------------------
2XMFULLTHDDUMP Full thread dump Classic VM (J2RE 1.4.2 IBM AIX build ca1420-20040626
3XMTHREADINFO "Servlet.Engine.Transports : 3" (TID:0x30285098, sys_thread_t:0x4660CD20
4XESTACKTRACE at TE02Servlet.doLock02(TE02Servlet.java:77)
4XESTACKTRACE at TE02Servlet.doPost(TE02Servlet.java:36)
4XESTACKTRACE at TE02Servlet.doGet(TE02Servlet.java:18)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
. . .
Introduction to javacores – Monitor analysis
The following table shows how the LK Dump routine is broken down into five subsections:
Monitor Pool Info: Provides basic monitor information, such as the current total number of monitors, and so on.
Monitor Pool Dump: Lists the monitors that exist in the JVM at the time of the Java dump along with the threads waiting for that particular monitor and the owner of each monitor. Monitors can also be owned or not owned, which will be indicated in the file. You can use this section to identify any problems, such as too many threads waiting on a monitor.
JVM System Monitor Dump: Similar to the monitor pool dump except that instead of listing any monitor, it lists all of the system monitors on the JVM for which the Java dump was captured.
Java Object Monitor Dump: The same as the monitor pool dump except that additional JVM internal information is provided.
Thread Identifiers: Provides an association between the XM dump routine and the monitor pool dump. You can use this section as a pointer to the location of the problem identified in the monitor pool dump.
Introduction to javacores – Thread analysis
Thread States
State   Name               Description
R       Runnable           Thread that has the ability to run or is running.
CW      Conditional Wait   Thread waiting on a condition variable. Typically threads in this state are waiting for a certain condition to occur, for example, a thread waiting for a resource to become available.
MW      Monitor Wait       Thread waiting on a monitor lock
S       Suspended          Thread suspended.
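A quick first look at a javacore is to count how many threads are in each of these states. The sketch below reads the `state:` field from `3XMTHREADINFO` lines, following the layout of the thread entries shown elsewhere in this presentation. The function name is hypothetical, and apart from "Servlet.Engine.Transports : 2" the sample threads and their IDs are invented for illustration.

```python
import re
from collections import Counter

# Matches the state field on a 3XMTHREADINFO line, e.g. "state:CW".
STATE = re.compile(r'^3XMTHREADINFO\s+"(?P<name>[^"]+)".*?state:(?P<state>[A-Z]+)')

def count_states(lines):
    """Count threads per state code (R, CW, MW, S)."""
    counts = Counter()
    for line in lines:
        m = STATE.match(line)
        if m:
            counts[m.group("state")] += 1
    return counts

sample = [
    '3XMTHREADINFO "Servlet.Engine.Transports : 2" (TID:0x3018A658, '
    'sys_thread_t:0x464900A0, state:CW, native ID:0x2438) prio=5',
    # The three entries below are made-up threads for illustration only.
    '3XMTHREADINFO "example-thread-a" (TID:0x1, sys_thread_t:0x2, state:MW, native ID:0x3) prio=5',
    '3XMTHREADINFO "example-thread-b" (TID:0x4, sys_thread_t:0x5, state:MW, native ID:0x6) prio=5',
    '3XMTHREADINFO "example-thread-c" (TID:0x7, sys_thread_t:0x8, state:R, native ID:0x9) prio=5',
    '4XESTACKTRACE at java.lang.Thread.sleep(Native Method)',
]
counts = count_states(sample)
print(counts)
```

A large share of MW or CW threads points toward the lock analysis described later, while mostly R threads points toward the stack comparison.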
Analyzing Javacores
There are many ways to approach analyzing the Javacores to determine the cause of a hang or performance degradation. It is suggested to use the following methodology in the order listed:
1. Is there a deadlock?
2. Is there a monitor that has a long list of threads waiting for it?
3. What are the Servlet.Engine.Transport threads doing?
   A. Are they idle waiting for requests?
   B. Are they busy and many of them have the same or similar stacks?
   C. Are some or most of the threads busy and they all appear to be doing something different?
Javacores can be analyzed manually or by using the ThreadAnalyzer tool (http://www-128.ibm.com/developerworks/websphere/downloads/thread_analyzer.html)
Analyzing javacores - Deadlock
The Javacore will report a deadlock as follows:
1LKDEADLOCK Deadlock detected !!!
NULL ---------------------
NULL
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)
3LKDEADLOCKWTR is waiting for:
4LKDEADLOCKMON sys_mon_t:0x4532E7C8 infl_mon_t: 0x45170CC0:
4LKDEADLOCKOBJ java.lang.Object@359F1120/395F1128:
3LKDEADLOCKOWN which is owned by:
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 0" (0x450914A0)
3LKDEADLOCKWTR which is waiting for:
4LKDEADLOCKMON sys_mon_t:0x44F79558 infl_mon_t: 0x00000000:
4LKDEADLOCKOBJ java.lang.Object@359F1130/395F1138:
3LKDEADLOCKOWN which is owned by:
2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)
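The deadlock report reads as a chain: each thread waits for an object that is owned by the next thread listed. A minimal sketch of turning that report into (waiter, object, owner) triples follows. It relies only on the `2LKDEADLOCKTHR` and `4LKDEADLOCKOBJ` tags as they appear in the sample above; the function name is hypothetical.

```python
import re

THREAD = re.compile(r'^2LKDEADLOCKTHR\s+Thread "([^"]+)"')

def deadlock_chain(lines):
    """Return [(waiting thread, object waited for, owning thread), ...]."""
    chain = []
    waiter = None
    obj = None
    for line in lines:
        line = line.strip()
        t = THREAD.match(line)
        if t:
            # A new thread line names the owner of the previous object.
            if waiter is not None and obj is not None:
                chain.append((waiter, obj, t.group(1)))
            waiter, obj = t.group(1), None
        elif line.startswith("4LKDEADLOCKOBJ"):
            obj = line.split()[1].rstrip(":")
    return chain

sample = [
    '1LKDEADLOCK Deadlock detected !!!',
    '2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)',
    '3LKDEADLOCKWTR is waiting for:',
    '4LKDEADLOCKMON sys_mon_t:0x4532E7C8 infl_mon_t: 0x45170CC0:',
    '4LKDEADLOCKOBJ java.lang.Object@359F1120/395F1128:',
    '3LKDEADLOCKOWN which is owned by:',
    '2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 0" (0x450914A0)',
    '3LKDEADLOCKWTR which is waiting for:',
    '4LKDEADLOCKMON sys_mon_t:0x44F79558 infl_mon_t: 0x00000000:',
    '4LKDEADLOCKOBJ java.lang.Object@359F1130/395F1138:',
    '3LKDEADLOCKOWN which is owned by:',
    '2LKDEADLOCKTHR Thread "Servlet.Engine.Transports : 3" (0x4660CD20)',
]
for waiter, obj, owner in deadlock_chain(sample):
    print(waiter, "waits for", obj, "owned by", owner)
```

For the sample, the two triples close the loop between "Servlet.Engine.Transports : 3" and "Servlet.Engine.Transports : 0", which is the circular dependency described at the start of the presentation.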
Analyzing Javacores – Threads waiting on Monitor
The following output from a Javacore shows several Servlet.Engine.Transport threads waiting on a monitor named java.lang.Object@359F1140/359F1148
This monitor is owned by thread identifier 0x25
2LKMONINUSE sys_mon_t:0x44F75D48 infl_mon_t: 0x00000000:
3LKMONOBJECT java.lang.Object@359F1140/359F1148: Flat locked by thread ident 0x25, entry count 1
3LKNOTIFYQ Waiting to be notified:
3LKWAITNOTIFY "Servlet.Engine.Transports : 1" (0x450EBAA0)
3LKWAITNOTIFY "Servlet.Engine.Transports : 0" (0x450914A0)
3LKWAITNOTIFY "Servlet.Engine.Transports : 3" (0x4660CD20)
3LKWAITNOTIFY "Servlet.Engine.Transports : 4" (0x4670CA14)
3LKWAITNOTIFY "Servlet.Engine.Transports : 5" (0x4760CF50)
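To find the monitor with the longest list of threads, the monitor pool dump can be scanned and the monitors ranked by the number of `3LKWAITNOTIFY` entries under them. The sketch below handles only the "Flat locked by thread ident" form that appears in the samples in this presentation; other monitor forms would need additional patterns. The function name is hypothetical.

```python
import re

OWNER = re.compile(
    r"^3LKMONOBJECT\s+(\S+): Flat locked by thread ident (0x[0-9A-Fa-f]+)"
)
WAITER = re.compile(r'^3LKWAITNOTIFY\s+"([^"]+)"')

def monitors_with_waiters(lines):
    """Return flat-locked monitors, the one with the most listed threads first."""
    monitors = []
    current = None
    for line in lines:
        line = line.strip()
        m = OWNER.match(line)
        if m:
            current = {"object": m.group(1), "owner_ident": m.group(2), "waiters": []}
            monitors.append(current)
        elif line.startswith("2LKMONINUSE"):
            current = None  # a new monitor entry begins
        else:
            w = WAITER.match(line)
            if w and current is not None:
                current["waiters"].append(w.group(1))
    return sorted(monitors, key=lambda mon: len(mon["waiters"]), reverse=True)

sample = [
    '2LKMONINUSE sys_mon_t:0x44F79558 infl_mon_t: 0x00000000:',
    '3LKMONOBJECT java.lang.Object@359F1130/359F1138: Flat locked by thread ident 0x26',
    '3LKNOTIFYQ Waiting to be notified:',
    '3LKWAITNOTIFY "Servlet.Engine.Transports : 0" (0x450914A0)',
    '2LKMONINUSE sys_mon_t:0x44F75D48 infl_mon_t: 0x00000000:',
    '3LKMONOBJECT java.lang.Object@359F1140/359F1148: Flat locked by thread ident 0x25, entry count 1',
    '3LKNOTIFYQ Waiting to be notified:',
    '3LKWAITNOTIFY "Servlet.Engine.Transports : 1" (0x450EBAA0)',
    '3LKWAITNOTIFY "Servlet.Engine.Transports : 0" (0x450914A0)',
    '3LKWAITNOTIFY "Servlet.Engine.Transports : 3" (0x4660CD20)',
    '3LKWAITNOTIFY "Servlet.Engine.Transports : 4" (0x4670CA14)',
    '3LKWAITNOTIFY "Servlet.Engine.Transports : 5" (0x4760CF50)',
]
ranked = monitors_with_waiters(sample)
print(ranked[0]["object"], ranked[0]["owner_ident"], len(ranked[0]["waiters"]))
```

The `owner_ident` value of the top entry is the identifier to look up in the "Thread identifiers" part of the LK dump.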
Analyzing Javacores – Threads waiting on Monitor
Examine the “Thread identifiers” section of the LK subcomponent dump to determine the name of the thread associated with “thread ident 0x25.” In this case, thread ident 0x25 is “Servlet.Engine.Transports : 2”
1LKFLATMONDUMP Thread identifiers (as used in flat monitors):
2LKFLATMON ident 0x26 "Servlet.Engine.Transports : 3" (0x4660CD20)
2LKFLATMON ident 0x25 "Servlet.Engine.Transports : 2" (0x464900A0)
2LKFLATMON ident 0x14 "Servlet.Engine.Transports : 1" (0x450EBAA0)
2LKFLATMON ident 0x13 "Servlet.Engine.Transports : 0" (0x450914A0)
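The lookup from a flat-monitor ident to a thread name can be sketched as a small table built from the `2LKFLATMON` lines. The function name is hypothetical, and the pattern follows only the layout of the sample above.

```python
import re

IDENT = re.compile(
    r'^2LKFLATMON\s+ident (0x[0-9A-Fa-f]+) "([^"]+)" \((0x[0-9A-Fa-f]+)\)'
)

def thread_identifiers(lines):
    """Map flat-monitor ident -> (thread name, sys_thread_t address)."""
    table = {}
    for line in lines:
        m = IDENT.match(line.strip())
        if m:
            table[m.group(1)] = (m.group(2), m.group(3))
    return table

sample = [
    '1LKFLATMONDUMP Thread identifiers (as used in flat monitors):',
    '2LKFLATMON ident 0x26 "Servlet.Engine.Transports : 3" (0x4660CD20)',
    '2LKFLATMON ident 0x25 "Servlet.Engine.Transports : 2" (0x464900A0)',
    '2LKFLATMON ident 0x14 "Servlet.Engine.Transports : 1" (0x450EBAA0)',
    '2LKFLATMON ident 0x13 "Servlet.Engine.Transports : 0" (0x450914A0)',
]
table = thread_identifiers(sample)
print(table["0x25"])
```

The second value in each entry matches the `sys_thread_t` field on the thread's `3XMTHREADINFO` line in the samples, which gives a second way to find the thread in the XM section.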
Analyzing Javacores – Threads waiting on Monitor
“Servlet.Engine.Transports : 2” is found in the XM component section of the Javacore. The Java stack trace for this thread is as follows:
3XMTHREADINFO "Servlet.Engine.Transports : 2" (TID:0x3018A658, sys_thread_t:0x464900A0, state:CW, native ID:0x2438) prio=5
4XESTACKTRACE at java.lang.Thread.sleep(Native Method)
4XESTACKTRACE at TE02Servlet.doLock01(TE02Servlet.java:64)
4XESTACKTRACE at TE02Servlet.doPost(TE02Servlet.java:33)
4XESTACKTRACE at TE02Servlet.doGet(TE02Servlet.java:18)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
4XESTACKTRACE at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
The TE02Servlet has captured the lock and has gone to sleep.
Until this thread releases the monitor, all threads waiting for this monitor will be blocked
Review subsequent Javacores and monitor the state of this thread
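Pulling the stack for one named thread out of the XM section can be sketched as follows. The code collects the `4XESTACKTRACE` lines that follow each `3XMTHREADINFO` line, as laid out in the sample above; the function name is hypothetical.

```python
import re

INFO = re.compile(r'^3XMTHREADINFO\s+"([^"]+)"')

def thread_stacks(lines):
    """Map thread name -> list of stack frames, top of stack first."""
    stacks = {}
    current = None
    for line in lines:
        line = line.strip()
        m = INFO.match(line)
        if m:
            current = stacks.setdefault(m.group(1), [])
        elif line.startswith("4XESTACKTRACE") and current is not None:
            frame = line[len("4XESTACKTRACE"):].strip()
            if frame.startswith("at "):
                frame = frame[3:]
            current.append(frame)
    return stacks

sample = [
    '3XMTHREADINFO "Servlet.Engine.Transports : 2" (TID:0x3018A658, '
    'sys_thread_t:0x464900A0, state:CW, native ID:0x2438) prio=5',
    '4XESTACKTRACE at java.lang.Thread.sleep(Native Method)',
    '4XESTACKTRACE at TE02Servlet.doLock01(TE02Servlet.java:64)',
    '4XESTACKTRACE at TE02Servlet.doPost(TE02Servlet.java:33)',
    '4XESTACKTRACE at TE02Servlet.doGet(TE02Servlet.java:18)',
]
stacks = thread_stacks(sample)
print(stacks["Servlet.Engine.Transports : 2"][:2])
```

Running the same extraction on each javacore in a series makes it straightforward to check whether the lock owner is still in the same place.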
Analyzing Javacores – Servlet Engine Transport Threads
A. Threads are idle waiting for requests
Idle threads could indicate that requests are not making it into the JVM. At this point, you need to determine if any HTTP clients have been able to establish connections to the webserver/plugin or WebSphere transport.
Is the transport thread present in the JVM?
Are there any established connections to the webserver or application server?
Review the webserver’s plugin log for potential errors connecting to the application server
Analyzing Javacores – Servlet Engine Transport Threads
B. Threads are busy and many of them have the same or similar stacks
1. Most of the threads are Runnable (State: R) and the top of their stack is java.net.SocketInputStream.socketRead
This indicates that there is not a bottleneck regarding synchronized java resources. If the threads have similar stacks and are Runnable, they are most likely in socketRead waiting for a response from a remote server.
What does the stack of the thread look like? What is the thread waiting on (LDAP, Oracle, HttpURLConnection, etc)?
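One way to see whether many threads share the same or similar stacks is to count threads by their top stack frame. The sketch below assumes the stacks have already been extracted into a mapping of thread name to frame list (top of stack first). The function name, thread names, and frames in the sample are invented for illustration.

```python
from collections import Counter

def top_frame_counts(stacks):
    """Count how many threads share each top-of-stack frame."""
    return Counter(frames[0] for frames in stacks.values() if frames)

# Illustrative data only: these thread names and frames are made up.
sample = {
    "example-thread-0": ["java.net.SocketInputStream.socketRead(Native Method)",
                         "example.Dao.query(Dao.java:10)"],
    "example-thread-1": ["java.net.SocketInputStream.socketRead(Native Method)",
                         "example.Dao.query(Dao.java:10)"],
    "example-thread-2": ["java.net.SocketInputStream.socketRead(Native Method)",
                         "example.Ldap.lookup(Ldap.java:22)"],
    "example-thread-3": ["java.lang.Thread.sleep(Native Method)"],
    "example-thread-4": [],  # a thread with no Java stack is ignored
}
counts = top_frame_counts(sample)
print(counts.most_common(1))
```

Once a dominant top frame is found, the frames just below it show which remote resource the threads are waiting on.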
Analyzing Javacores – Servlet Engine Transport Threads
B. Threads are busy and many of them have the same or similar stacks
2. Most of the threads are in monitor wait or conditional wait (State: MW or CW) indicating a potential bottleneck in synchronized code
What does the stack look like for the threads that are in MW or CW?
What thread holds the monitor that is blocking these threads? This thread should be reviewed in each thread dump.
– Does the stack of this thread remain the same? If so, none of the threads that are waiting on that monitor have been able to continue processing.
– Does the stack of this thread change? If enough threads are attempting to pass through the code that is synchronized and the thread that holds the monitor takes a relatively long time to release it, this could cause a hang or performance degradation
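The comparison of the monitor owner's stack across thread dumps can be sketched as a simple check. The input is assumed to be the same thread's frame list taken from each javacore, oldest first; the function name and the return labels are hypothetical, and the second sample is invented for illustration.

```python
def compare_owner_stack(stack_by_dump):
    """Classify the lock owner's stack across dumps (oldest dump first)."""
    if len(stack_by_dump) < 2:
        return "need at least two dumps"
    first = stack_by_dump[0]
    if all(stack == first for stack in stack_by_dump[1:]):
        # Same stack every time: waiters have not been able to proceed.
        return "unchanged"
    # Stack moves between dumps: the owner is progressing, possibly slowly.
    return "changed"

same = [
    ["java.lang.Thread.sleep(Native Method)", "TE02Servlet.doLock01(TE02Servlet.java:64)"],
    ["java.lang.Thread.sleep(Native Method)", "TE02Servlet.doLock01(TE02Servlet.java:64)"],
]
moving = [
    ["java.lang.Thread.sleep(Native Method)", "TE02Servlet.doLock01(TE02Servlet.java:64)"],
    ["TE02Servlet.doPost(TE02Servlet.java:33)"],  # made-up later stack
]
print(compare_owner_stack(same), compare_owner_stack(moving))
```

An "unchanged" result corresponds to the first question above, and a "changed" result to the second.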
Analyzing Javacores – Servlet Engine Transport Threads
C. Threads are not idle, some or most of the available threads are in use, and they all appear to be doing something different. This can indicate a resource issue.
Is CPU utilization high? High CPU will cause threads to contend for CPU cycles
What does memory utilization look like?
What does VerboseGC data look like?
When GC runs, all other threads are suspended.
Summary
Identifying when a JVM is encountering a performance degradation
Collecting the documentation required to debug the problem
Locating the data collected
Introduction to the content of javacores
Analyzing javacores for a WebSphere Application Server JVM
Additional WebSphere Product Resources
Discover the latest trends in WebSphere Technology and implementation, participate in technically-focused briefings, webcasts and podcasts at: www.ibm.com/developerworks/websphere/community/
Learn about other upcoming webcasts, conferences and events: www.ibm.com/software/websphere/events_1.html
Join the Global WebSphere User Group Community: www.websphere.org
Access key product show-me demos and tutorials by visiting IBM Education Assistant: ibm.com/software/info/education/assistant
Learn about the Electronic Service Request (ESR) tool for submitting problems electronically: www.ibm.com/software/support/viewlet/probsub/ESR_Overview_viewlet_swf.html
Sign up to receive weekly technical My support emails: www.ibm.com/software/support/einfo.html
Questions and Answers