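Jenkins slaves hang / Jenkins wedged

Time: 2012-03-30 18:36:30

Tags: jenkins

We have an intermittent problem where slaves hang after a job has finished. In the post-processing step (?), the console log shows the following line: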

Description set: vap_current_iter-2012_03_29_19_01_03

And then nothing. Normally, it looks like this:

Description set: prod_pull-2012_03_28_19_01_03
Notifying upstream build armada_Launch_prod_pull #13 of job completion
Project armada_Launch_prod_pull still waiting for 1 builds to complete
Notifying upstream projects of job completion
Notifying upstream of completion: armada_Launch_prod_pull #13
Finished: SUCCESS
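
The visible difference between the two console logs above is that the hung build never prints a `Finished:` line. A minimal Python sketch of how a watchdog might test for that, assuming the console text has already been fetched by some other means; the function name `looks_hung` is hypothetical and not part of Jenkins:

```python
def looks_hung(console_log: str) -> bool:
    """Return True if the log reached 'Description set:' but has no 'Finished:' line.

    Hypothetical helper: it only inspects text and does not talk to Jenkins.
    """
    lines = [line.strip() for line in console_log.splitlines() if line.strip()]
    saw_description = any(line.startswith("Description set:") for line in lines)
    saw_finished = any(line.startswith("Finished:") for line in lines)
    return saw_description and not saw_finished
```

A build that is still legitimately running its post-build steps would match as well, so a check like this would only be meaningful once the log has stopped growing for some time.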

I set up a logger for hudson.model.Run, and so far it has:

    at java.lang.Thread.run(Thread.java:619)

Mar 30, 2012 12:44:00 PM hudson.model.Run run
INFO: galleon_allUnit #1134 main build action completed: SUCCESS
Mar 30, 2012 12:44:00 PM hudson.model.Run setResult
FINE: galleon_allUnit #1134 : result is set to SUCCESS
java.lang.Exception
    at hudson.model.Run.setResult(Run.java:352)
    at hudson.model.Run.run(Run.java:1410)
    at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:46)
    at hudson.model.ResourceController.execute(ResourceController.java:88)
    at hudson.model.Executor.run(Executor.java:238)

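This repeats for each hung slave.

The main Hudson log contains no additional information.

Disconnecting the slave has no effect.

Attempting an orderly shutdown of Jenkins has no effect either (Jenkins itself actually seems to hang during shutdown).

The only way we have found to recover is to kill -9 the tomcat process.

The thread dump from one of the slaves (they are all the same) is: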

Thread Dump
Channel reader thread: channel

"Channel reader thread: channel" Id=9 Group=main RUNNABLE (in native)
    at java.io.FileInputStream.readBytes(Native Method)
    at java.io.FileInputStream.read(FileInputStream.java:199)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
    -  locked java.io.BufferedInputStream@1ae615a
    at java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2249)
    at java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2542)
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2552)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1297)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at hudson.remoting.Channel$ReaderThread.run(Channel.java:1030)


main

"main" Id=1 Group=main WAITING on hudson.remoting.Channel@e1d5ea
    at java.lang.Object.wait(Native Method)
    -  waiting on hudson.remoting.Channel@e1d5ea
    at java.lang.Object.wait(Object.java:485)
    at hudson.remoting.Channel.join(Channel.java:766)
    at hudson.remoting.Launcher.main(Launcher.java:420)
    at hudson.remoting.Launcher.runWithStdinStdout(Launcher.java:366)
    at hudson.remoting.Launcher.run(Launcher.java:206)
    at hudson.remoting.Launcher.main(Launcher.java:168)


Ping thread for channel hudson.remoting.Channel@e1d5ea:channel

"Ping thread for channel hudson.remoting.Channel@e1d5ea:channel" Id=10 Group=main TIMED_WAITING
    at java.lang.Thread.sleep(Native Method)
    at hudson.remoting.PingThread.run(PingThread.java:86)


Pipe writer thread: channel

"Pipe writer thread: channel" Id=12 Group=main WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@14263ed
    at sun.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@14263ed
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)


pool-1-thread-267

"pool-1-thread-267" Id=285 Group=main RUNNABLE
    at sun.management.ThreadImpl.dumpThreads0(Native Method)
    at sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:374)
    at hudson.Functions.getThreadInfos(Functions.java:872)
    at hudson.util.RemotingDiagnostics$GetThreadDump.call(RemotingDiagnostics.java:93)
    at hudson.util.RemotingDiagnostics$GetThreadDump.call(RemotingDiagnostics.java:89)
    at hudson.remoting.UserRequest.perform(UserRequest.java:118)
    at hudson.remoting.UserRequest.perform(UserRequest.java:48)
    at hudson.remoting.Request$2.run(Request.java:287)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)

    Number of locked synchronizers = 1
    - java.util.concurrent.locks.ReentrantLock$NonfairSync@1186f88


Finalizer

"Finalizer" Id=3 Group=system WAITING on java.lang.ref.ReferenceQueue$Lock@1798fdd
    at java.lang.Object.wait(Native Method)
    -  waiting on java.lang.ref.ReferenceQueue$Lock@1798fdd
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)


Reference Handler

"Reference Handler" Id=2 Group=system WAITING on java.lang.ref.Reference$Lock@1d40442
    at java.lang.Object.wait(Native Method)
    -  waiting on java.lang.ref.Reference$Lock@1d40442
    at java.lang.Object.wait(Object.java:485)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)


Signal Dispatcher

"Signal Dispatcher" Id=4 Group=system RUNNABLE
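
None of the threads in the dump above is in the BLOCKED state. When comparing dumps from several slaves, it can help to reduce each one to a name-to-state map. The following is a rough Python sketch that assumes the header format shown above (`"name" Id=N Group=G STATE`); the function name `thread_states` and the `sample` text are illustrative only:

```python
import re
from collections import Counter

# Matches header lines such as:
#   "main" Id=1 Group=main WAITING on hudson.remoting.Channel@e1d5ea
HEADER = re.compile(r'^"(?P<name>.+)" Id=(?P<id>\d+) Group=(?P<group>\S+) (?P<state>[A-Z_]+)')


def thread_states(dump: str) -> dict:
    """Map each thread name in a Jenkins-style thread dump to its state."""
    states = {}
    for line in dump.splitlines():
        match = HEADER.match(line.strip())
        if match:
            states[match.group("name")] = match.group("state")
    return states


# A shortened excerpt of the dump above, used only for illustration.
sample = '''"Channel reader thread: channel" Id=9 Group=main RUNNABLE (in native)
    at java.io.FileInputStream.readBytes(Native Method)
"main" Id=1 Group=main WAITING on hudson.remoting.Channel@e1d5ea
    at java.lang.Object.wait(Native Method)
"Ping thread for channel hudson.remoting.Channel@e1d5ea:channel" Id=10 Group=main TIMED_WAITING
    at java.lang.Thread.sleep(Native Method)
'''

states = thread_states(sample)
summary = Counter(states.values())  # number of threads per state
```

Two dumps that produce the same map are likely to be stuck in the same place, which would be consistent with the observation that all slaves look alike.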

Any ideas on how to better recover from this, or prevent it, would be greatly appreciated.

1 answer:

Answer 0 (score: 0)

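Honestly, we just wrote a script that restarts Jenkins every night at 4. We found that our breakages happened around 3 a.m., give or take half an hour. Since we started restarting the server at that time, we have not seen any further hangs. It is a way to prevent the problem, as you asked, although it obviously does not "solve" it!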
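
The script itself is not shown in the answer. As a rough Python sketch of the idea, assuming "4" means 04:00 local time and that Jenkins runs inside a tomcat service that can be restarted with a single command; `RESTART_COMMAND` is a hypothetical placeholder and must be adapted to the actual installation:

```python
import subprocess
import time
from datetime import datetime, timedelta

RESTART_HOUR = 4  # assumption: 04:00 local time
# Hypothetical command; the real service name and tooling are not given in the answer.
RESTART_COMMAND = ["service", "tomcat", "restart"]


def seconds_until_next_restart(now: datetime, hour: int = RESTART_HOUR) -> float:
    """Seconds from `now` until the next occurrence of `hour`:00:00."""
    target = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if target <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


def run_forever() -> None:
    """Sleep until the next restart slot, run the restart command, repeat."""
    while True:
        time.sleep(seconds_until_next_restart(datetime.now()))
        subprocess.run(RESTART_COMMAND, check=False)
```

Calling `run_forever()` starts the loop. In practice a cron entry that invokes the restart command directly would achieve the same result with less code.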