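1.1 Check CPU usage
When this kind of fault occurs, it typically shows up as long user response times, an abnormally slow WebLogic server, and requests or operations timing out. After receiving the fault notification, log in to the affected machine and list the Java processes: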
ps -ef | grep java
Use the details in the alert to pick out the process that needs to be examined:
sxydfw 9391 9342 99 20:10 pts/1 01:00:22 /app/wls10/jdk1.6.0_45/bin/java -server -Xms1536m -Xmx1536m -XX:PermSize=128m -XX:MaxPermSize=256m -Dweblogic.Name=ydfwServer2 -Djava.security.policy=/app/wls10/Oracle/Middleware/wlserver_10.3/server/lib/weblogic.policy -Dweblogic.ProductionModeEnabled=true -Dweblogic.security.SSL.trustedCAKeyStore=/app/wls10/Oracle/Middleware/wlserver_10.3/server/lib/cacerts -Dport=7001 -DappName=sxydfw_ydfwServer2 -da -Dplatform.home=/app/wls10/Oracle/Middleware/wlserver_10.3 -Dwls.home=/app/wls10/Oracle/Middleware/wlserver_10.3/server -Dweblogic.home=/app/wls10/Oracle/Middleware/wlserver_10.3/server -Dweblogic.management.discover=false -Dweblogic.management.server=t3://10.190.118.154:37001 -Dfile.encoding=GBK -Dwlw.iterativeDev=false -Dwlw.testConsole=false -Dwlw.logErrorsToConsole=false -Dweblogic.ext.dirs=/app/wls10/Oracle/Middleware/patch_wls1035/profiles/default/sysext_manifest_classpath:/app/wls10/Oracle/Middleware/patch_ocp360/profiles/default/sysext_manifest_classpath weblogic.Server
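When several Java processes run on the same machine, the -Dweblogic.Name argument visible in the output above identifies the server. The following is a minimal sketch of filtering on it, assuming the usual ps -ef column layout (PID in the second column, command line starting at the eighth). The function name find_wls_pid is hypothetical and not part of the original procedure.

```shell
# Print the PID of every process whose command line contains exactly
# -Dweblogic.Name=<server>. Reads `ps -ef` style output on stdin, so it
# also works on saved output.
# Assumption: PID is field 2 and the command line starts at field 8.
find_wls_pid() {
    awk -v name="-Dweblogic.Name=$1" '
        {
            for (i = 8; i <= NF; i++) {
                if ($i == name) { print $2; break }
            }
        }'
}
```

It could be used as: ps -ef | find_wls_pid ydfwServer2. Matching the whole argument avoids also selecting a server whose name merely starts with the same prefix.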
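1.2 Get the process ID
The process information shows that the PID is 9391 and that the process was started by the application account sxydfw. Next, examine this process's CPU usage in detail with the following command, where -H lists the individual threads of the process: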
top -Hp 9391 -d 1 -n 1
  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
 9509 sxydfw   25   0 2213m 815m  29m R 59.7 10.4  6:57.44 java
 9422 sxydfw   25   0 2213m 815m  29m R 57.7 10.4  7:12.52 java
10017 sxydfw   19   0 2213m 815m  29m R 41.8 10.4  5:26.36 java
 9507 sxydfw   19   0 2213m 815m  29m R 39.8 10.4  7:26.53 java
 9508 sxydfw   25   0 2213m 815m  29m R 39.8 10.4  9:11.57 java
 9632 sxydfw   19   0 2213m 815m  29m R 39.8 10.4  3:03.78 java
10116 sxydfw   25   0 2213m 815m  29m R 39.8 10.4  1:44.88 java
10010 sxydfw   19   0 2213m 815m  29m R 31.8 10.4  4:28.82 java
10014 sxydfw   25   0 2213m 815m  29m R 25.8 10.4  3:04.17 java
 9419 sxydfw   25   0 2213m 815m  29m R 19.9 10.4 12:03.31 java
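Repeat this command a few times at intervals of about 5 seconds to confirm which threads are consistently consuming CPU. In the output above, the PID column holds thread IDs, and the threads using the most CPU are listed first. Threads 9509 and 9422 each use more than 50% of the CPU, so these are the two threads to analyze in detail.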
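The repeated sampling described above can be scripted. The following is a rough sketch, assuming top is run in batch mode (-b) with its default column layout, so that %CPU is the ninth column. The function name hot_threads and the 50% default threshold are illustrative choices, not part of the original procedure.

```shell
# Read `top -H -b` output on stdin and print "<thread id> <%CPU>" for every
# thread at or above the given threshold (default 50).
# Assumption: default top layout, i.e. thread ID in field 1, %CPU in field 9.
hot_threads() {
    awk -v min="${1:-50}" '$1 ~ /^[0-9]+$/ && $9 + 0 >= min { print $1, $9 }'
}

# Example use (not executed here): three samples, about 5 seconds apart.
#   for i in 1 2 3; do
#       top -H -b -n 1 -p 9391 | hot_threads 50
#       sleep 5
#   done
```

Threads that appear in every sample are the ones worth looking up in the thread dump.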
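2.1 Monitor the log
In the current production environment, all WebLogic processes are started by the standard nohup start/stop scripts, so go to the nohup log directory of the affected server and follow the log with tail -f.
In this example the log is nohup_start_ydfwServer2_XXXX.log under /applog/sxydfwlog, so run: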
tail -f nohup_start_ydfwServer2_XXXX.log
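Notify the on-duty member of the application support team and ask them to run kill -3 <PID> on the server using the application account (kill -3 sends SIGQUIT, which makes the JVM write a thread dump to standard output without stopping the process). It should be run 3 times, once every 15 seconds.
In this example, using the process ID obtained earlier: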
kill -3 9391
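To keep the 15-second spacing consistent, the three signals can be sent from a small loop. This is only a sketch: the function name take_thread_dumps and its argument order are hypothetical, and it has to be run by an account that is allowed to signal the process (here the application account).

```shell
# Send SIGQUIT (kill -3) to a JVM several times so that it writes one
# thread dump per signal to its standard output.
# Usage: take_thread_dumps <pid> [count=3] [interval_seconds=15]
take_thread_dumps() {
    pid=$1
    count=${2:-3}
    interval=${3:-15}
    i=1
    while [ "$i" -le "$count" ]; do
        kill -3 "$pid" || return 1      # stop if the signal cannot be sent
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"           # wait between dumps, not after the last
        fi
        i=$((i + 1))
    done
}

# Example use (not executed here):
#   take_thread_dumps 9391
```

Taking several dumps makes it possible to tell a thread that is stuck in the same frame from one that only happened to be there at a single instant.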
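The Thread Dump for the process now appears in the monitored log nohup_start_ydfwServer2_XXXX.log. Convert the two thread IDs obtained earlier from decimal to hexadecimal and search the thread dump for the entries whose nid matches: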
9509 -> 2525:
"[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x00002aaab4ebb000 nid=0x2525 runnable [0x00000000442a8000]
java.lang.Thread.State: RUNNABLE
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:482)
at com.portal.struts.action.common.UpLoadFileServlet.mergeFile(UpLoadFileServlet.java:224)
at com.portal.struts.action.common.UpLoadFileServlet.doPost(UpLoadFileServlet.java:152)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:300)
at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:26)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
at com.portal.filter.EncodeFilter.doFilter(EncodeFilter.java:48)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3715)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3681)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2277)
at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2183)
at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1454)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:178)
9422 -> 24ce:
"[ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" daemon prio=10 tid=0x00002aaab4bfb800 nid=0x24ce runnable [0x0000000042389000]
java.lang.Thread.State: RUNNABLE
at java.io.RandomAccessFile.writeBytes(Native Method)
at java.io.RandomAccessFile.write(RandomAccessFile.java:482)
at com.portal.struts.action.common.UpLoadFileServlet.mergeFile(UpLoadFileServlet.java:224)
at com.portal.struts.action.common.UpLoadFileServlet.doPost(UpLoadFileServlet.java:152)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at weblogic.servlet.internal.StubSecurityHelper$ServletServiceAction.run(StubSecurityHelper.java:227)
at weblogic.servlet.internal.StubSecurityHelper.invokeServlet(StubSecurityHelper.java:125)
at weblogic.servlet.internal.ServletStubImpl.execute(ServletStubImpl.java:300)
at weblogic.servlet.internal.TailFilter.doFilter(TailFilter.java:26)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
at com.portal.filter.EncodeFilter.doFilter(EncodeFilter.java:48)
at weblogic.servlet.internal.FilterChainImpl.doFilter(FilterChainImpl.java:56)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.wrapRun(WebAppServletContext.java:3715)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:3681)
at weblogic.security.acl.internal.AuthenticatedSubject.doAs(AuthenticatedSubject.java:321)
at weblogic.security.service.SecurityManager.runAs(SecurityManager.java:120)
at weblogic.servlet.internal.WebAppServletContext.securedExecute(WebAppServletContext.java:2277)
at weblogic.servlet.internal.WebAppServletContext.execute(WebAppServletContext.java:2183)
at weblogic.servlet.internal.ServletRequestImpl.run(ServletRequestImpl.java:1454)
at weblogic.work.ExecuteThread.execute(ExecuteThread.java:209)
at weblogic.work.ExecuteThread.run(ExecuteThread.java:178)
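The decimal-to-hexadecimal conversion and the lookup in the dump can also be done from the shell. The sketch below assumes that each thread entry in the log is followed by a blank line, as in typical HotSpot thread dumps. The function names tid_to_nid and stack_for_tid are hypothetical.

```shell
# Convert a decimal thread ID from top into the nid form used in thread dumps.
tid_to_nid() {
    printf '0x%x\n' "$1"
}

# Read a log containing thread dumps on stdin and print every entry whose
# header line carries the nid of the given decimal thread ID.
# Assumption: an entry ends at the first blank line after its header.
stack_for_tid() {
    awk -v key="nid=$(tid_to_nid "$1") " '
        index($0, key) { show = 1 }                  # header of the wanted thread
        show && NF == 0 { show = 0; print ""; next } # blank line ends the entry
        show { print }
    '
}

# Example use (not executed here):
#   stack_for_tid 9509 < nohup_start_ydfwServer2_XXXX.log
```

With three dumps in the log, this prints the same thread three times in a row, which makes it easy to see whether its stack changed between dumps.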
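Analyze this information to determine what is causing the high CPU usage. In this example the fault occurs while a particular function is executing: both threads are in com.portal.struts.action.common.UpLoadFileServlet.mergeFile, writing through java.io.RandomAccessFile. The development team should therefore be asked to focus on these method calls, which were highlighted in the original post. If the problem lies in WebLogic itself, contact BEA customer support.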
Original: http://www.cnblogs.com/rubeitang/p/5740416.html