Debugging a daemon that terminates unexpectedly

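Date: 2009-08-22 19:27:47

Tags: c debugging signals daemon coredump

I'm writing a daemon in C on Linux. It catches SIGHUP, SIGTERM, SIGINT and SIGQUIT, logs them with syslog and exits. If it receives SIGSEGV, it dumps core. When any of these happen, everything works as expected, but occasionally it exits without shutting down cleanly, without logging a signal, and without leaving a core dump. I'm stumped and don't know how to debug the problem. In what ways could it exit other than via these signals? Is there an obvious answer I'm missing? What other debugging practices would you suggest for tracking down a seemingly sporadic problem like this in a daemon?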

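5 answers:

Answer 0 (score: 3)

If your daemon uses network sockets, it could well be SIGPIPE. You get this signal if you try to write to a socket (or pipe) that the other side has already closed. Note that even if you check that the socket is writable before writing to it (e.g. with select()), it can still be closed between that check and the write itself.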

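Answer 1 (score: 2)

You can have the daemon's parent stay resident and wait for it, then have the parent log why the daemon exited (i.e. whether it was killed by a signal or exited on its own).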
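A rough sketch of such a waiting parent, assuming the daemon can be started as a separate program via execvp(). The names supervise and describe_status are hypothetical, and WCOREDUMP is a common extension rather than part of POSIX, hence the #ifdef.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <syslog.h>
#include <unistd.h>

/* Turn a waitpid() status into a human-readable reason. */
void describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status)) {
        snprintf(buf, len, "exited with status %d", WEXITSTATUS(status));
    } else if (WIFSIGNALED(status)) {
        int dumped = 0;
#ifdef WCOREDUMP
        dumped = WCOREDUMP(status) ? 1 : 0;
#endif
        snprintf(buf, len, "killed by signal %d%s", WTERMSIG(status),
                 dumped ? " (core dumped)" : "");
    } else {
        snprintf(buf, len, "unexpected wait status 0x%x", (unsigned)status);
    }
}

/* Fork, exec argv in the child, wait for it in the parent and log why
 * it went away. Returns the raw wait status, or -1 on fork/wait error. */
int supervise(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)         /* retry if interrupted by a signal */
            return -1;
    }

    char reason[64];
    describe_status(status, reason, sizeof reason);
    syslog(LOG_NOTICE, "daemon pid %ld %s", (long)pid, reason);
    return status;
}
```

Because the parent sees the status the kernel reports, this also catches SIGKILL, which the daemon itself can never log.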

Answer 2 (score: 2)

Attach gdb to it with

gdb -p <pid>

Make sure you compile with the -g flag, and take a backtrace as soon as it exits. Good luck!

Answer 3 (score: 1)

Well, there are plenty of other signals that will cause it to exit, including of course SIGKILL, which you can't do anything about. Basically anything in the following excerpt from man 7 signal with an Action of Term or Core (although the latter would at least leave a core dump):

First the signals described in the original POSIX.1-1990 standard.

Signal     Value     Action   Comment
-------------------------------------------------------------------------
SIGHUP        1       Term    Hangup detected on controlling terminal
                              or death of controlling process
SIGINT        2       Term    Interrupt from keyboard
SIGQUIT       3       Core    Quit from keyboard
SIGILL        4       Core    Illegal Instruction
SIGABRT       6       Core    Abort signal from abort(3)
SIGFPE        8       Core    Floating point exception
SIGKILL       9       Term    Kill signal
SIGSEGV      11       Core    Invalid memory reference
SIGPIPE      13       Term    Broken pipe: write to pipe with no readers
SIGALRM      14       Term    Timer signal from alarm(2)
SIGTERM      15       Term    Termination signal
SIGUSR1   30,10,16    Term    User-defined signal 1
SIGUSR2   31,12,17    Term    User-defined signal 2
SIGCHLD   20,17,18    Ign     Child stopped or terminated
SIGCONT   19,18,25    Cont    Continue if stopped
SIGSTOP   17,19,23    Stop    Stop process
SIGTSTP   18,20,24    Stop    Stop typed at tty
SIGTTIN   21,21,26    Stop    tty input for background process
SIGTTOU   22,22,27    Stop    tty output for background process

The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.

Next the signals not in the POSIX.1-1990 standard but described in SUSv2 and POSIX.1-2001.

Signal       Value     Action   Comment
-------------------------------------------------------------------------
SIGBUS      10,7,10     Core    Bus error (bad memory access)
SIGPOLL                 Term    Pollable event (Sys V). Synonym of SIGIO
SIGPROF     27,27,29    Term    Profiling timer expired
SIGSYS      12,-,12     Core    Bad argument to routine (SVr4)
SIGTRAP        5        Core    Trace/breakpoint trap
SIGURG      16,23,21    Ign     Urgent condition on socket (4.2BSD)
SIGVTALRM   26,26,28    Term    Virtual alarm clock (4.2BSD)
SIGXCPU     24,24,30    Core    CPU time limit exceeded (4.2BSD)
SIGXFSZ     25,25,31    Core    File size limit exceeded (4.2BSD)

Up to and including Linux 2.2, the default behaviour for SIGSYS, SIGXCPU, SIGXFSZ, and (on architectures other than SPARC and MIPS) SIGBUS was to terminate the process (without a core dump). (On some other Unices the default action for SIGXCPU and SIGXFSZ is to terminate the process without a core dump.) Linux 2.4 conforms to the POSIX.1-2001 requirements for these signals, terminating the process with a core dump.

Next various other signals.

Signal       Value     Action   Comment
--------------------------------------------------------------------
SIGIOT         6        Core    IOT trap. A synonym for SIGABRT
SIGEMT       7,-,7      Term
SIGSTKFLT    -,16,-     Term    Stack fault on coprocessor (unused)
SIGIO       23,29,22    Term    I/O now possible (4.2BSD)
SIGCLD       -,-,18     Ign     A synonym for SIGCHLD
SIGPWR      29,30,19    Term    Power failure (System V)
SIGINFO      29,-,-             A synonym for SIGPWR
SIGLOST      -,-,-      Term    File lock lost
SIGWINCH    28,28,20    Ign     Window resize signal (4.3BSD, Sun)
SIGUNUSED    -,31,-     Term    Unused signal (will be SIGSYS)

Answer 4 (score: 1)

A shell wrapper can capture the daemon's exit status. Here's how it works:

$ ./waitstatus true
pid 1512: exit status 0 (success)

$ ./waitstatus false
pid 1514: exit status 1 (abnormal)

$ ./waitstatus perl -e 'exit 21'
pid 1518: exit status 21 (abnormal)

$ ./waitstatus perl -e 'kill TERM => $$'
pid 1520: terminated on signal 15

$ ./waitstatus no-such-command
pid 1522: command not found: no-such-command

$ ./waitstatus /sbin/EACCES.contrived
pid 1524: command not executable: /sbin/EACCES.contrived

...and here's how it's implemented:

$ cat ./waitstatus
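#! /bin/bash

# Run the given command in the background and remember its PID.
"$@" &
PID=$!

# wait returns the command's exit status, or 128 + N if signal N killed it.
wait "$PID"
STATUS=$?

if [ "$STATUS" -gt 128 ]; then
  MSG="terminated on signal $(( STATUS - 128 ))"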
else
  case $STATUS in
    0)
      MSG="exit status 0 (success)"
      ;;
    127)
      MSG="command not found: $1"
      ;;
    126)
      MSG="command not executable: $1"
      ;;
    *)
      MSG="exit status $STATUS (abnormal)"
      ;;
  esac
fi

echo "pid $PID: $MSG"
exit $STATUS

You may want to change the final echo line into a call to the system logger command, e.g. to direct the status message to syslog.