sem_wait not unblocking with EINTR

Date: 2015-11-22 10:44:55

Tags: c linux semaphore netbsd

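I'm new to semaphores and want to add multithreading to my program, but I can't get past the following problem: sem_wait() should be able to fail with EINTR and unblock, as long as I don't set the SA_RESTART flag. I send SIGUSR1 to a worker thread that is blocked in sem_wait(). It does receive the signal and gets interrupted, but then it goes on blocking, so it never gives me a -1 return code with errno = EINTR. However, if I do a sem_post from the main thread, it unblocks and gives me an errno of EINTR, but an RC of 0. I'm thoroughly puzzled by this behavior. Is this some odd NetBSD implementation, or am I doing something wrong here? According to the man page, sem_wait conforms to POSIX.1 (ISO/IEC 9945-1:1996). A simple piece of code: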

#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <signal.h>
#include <pthread.h>
#include <semaphore.h>

typedef struct workQueue_s
{
   int full;
   int empty;
   sem_t work;
   int sock_c[10];
} workQueue_t;

void signal_handler( int sig )
{
   switch( sig )
   {
      case SIGUSR1:
      printf( "Signal: I am pthread %p\n", pthread_self() );
      break;
   }
}

extern int errno;
workQueue_t queue;
pthread_t workerbees[8];

void *BeeWork( void *t )
{
   int RC;
   pthread_t tid;
   struct sigaction sa;
   sa.sa_handler = signal_handler;
   sigaction( SIGUSR1, &sa, NULL );

   printf( "Bee: I am pthread %p\n", pthread_self() );
   RC = sem_wait( &queue.work );
   printf( "Bee: got RC = %d and errno = %d\n", RC, errno );

   RC = sem_wait( &queue.work );
   printf( "Bee: got RC = %d and errno = %d\n", RC, errno );
   pthread_exit( ( void * ) t );
}

int main()
{
   int RC;
   long tid = 0;
   pthread_attr_t attr;
   pthread_attr_init( &attr );
   pthread_attr_setdetachstate( &attr, PTHREAD_CREATE_JOINABLE );

   queue.full = 0;
   queue.empty = 0;
   sem_init( &queue.work, 0, 0 );

   printf( "I am pthread %p\n", pthread_self() );
   pthread_create( &workerbees[tid], &attr, BeeWork, ( void * ) tid );
   pthread_attr_destroy( &attr );

   sleep( 2 );
   sem_post( &queue.work );
   sleep( 2 );
   pthread_kill( workerbees[tid], SIGUSR1 );
   sleep( 2 );

   // Remove this and sem_wait will stay blocked
   sem_post( &queue.work );
   sleep( 2 );
   return( 0 );
}

I know that printf is not allowed in a signal handler; it is only there for the heck of it, and if I remove it I get the same results.

These are the results without the second sem_post:

I am pthread 0x7f7fffc00000
Bee: I am pthread 0x7f7ff6c00000
Bee: got RC = 0 and errno = 0
Signal: I am pthread 0x7f7ff6c00000

With the second sem_post:

I am pthread 0x7f7fffc00000
Bee: I am pthread 0x7f7ff6c00000
Bee: got RC = 0 and errno = 0
Signal: I am pthread 0x7f7ff6c00000
Bee: got RC = 0 and errno = 4

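I know I don't really need to unblock and could simply exit main, but I wanted to see it working anyway. The reason I'm using sem_wait is that I want to keep the worker threads alive and, whenever there is a new client connection from Postfix, have the main thread wake the longest-waiting one with sem_post. I don't want to call pthread_create all the time, because I will receive calls multiple times per second and I don't want to lose speed and make Postfix unresponsive to new smtpd clients. It's a policy daemon for Postfix, and the server is quite busy.

Am I missing something here? Is NetBSD just messed up?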

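1 answer:

Answer 0: (score: 0)

My post is about the behavior on Linux, but I think you may be seeing similar behavior, or at least I thought it could be helpful. If not, let me know and I'll remove this useless 'noise'.

I tried to reproduce your setup, and I was quite surprised to see what you describe actually happening. Looking deeper helped me figure out that there is something more going on. If you trace the program with strace, you will see something like this: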

[pid  6984] futex(0x6020e8, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  6983] rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
[pid  6983] rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
[pid  6983] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  6983] nanosleep({2, 0}, 0x7fffe5794a70) = 0
[pid  6983] tgkill(6983, 6984, SIGUSR1 <unfinished ...>
[pid  6984] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid  6983] <... tgkill resumed> )      = 0
[pid  6984] --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_TKILL, si_pid=6983, si_uid=500} ---
[pid  6983] rt_sigprocmask(SIG_BLOCK, [CHLD],  <unfinished ...>
[pid  6984] rt_sigreturn( <unfinished ...>
[pid  6983] <... rt_sigprocmask resumed> [], 8) = 0
[pid  6984] <... rt_sigreturn resumed> ) = -1 EINTR (Interrupted system call)

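Look at the lines with ERESTARTSYS and EINTR: the call that strace reports as returning EINTR is actually rt_sigreturn (the "rt_sigreturn resumed" line), not futex (the system call underlying sem_wait) as you would expect. I must say I was quite puzzled, but reading the man page gave some interesting clues (man 7 signal):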

   If  a blocked call to one of the following interfaces is interrupted by
   a signal handler, then the call will be automatically  restarted  after
   the  signal  handler returns if the SA_RESTART flag was used; otherwise
   the call will fail with the error EINTR:
[...]

       * futex(2)  FUTEX_WAIT  (since  Linux  2.6.22;  beforehand,  always
         failed with EINTR).

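So I guess you have a kernel with similar behavior (see the NetBSD documentation?), and what you are observing is the system call being restarted automatically, without you getting any chance to see it.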

That said, I removed sem_post() from your program completely and just sent the signal to 'break' the sem_wait(). Looking at the strace output (filtered on the bee thread), I saw:

[pid  8309] futex(0x7fffc0470990, FUTEX_WAIT_PRIVATE, 0, NULL <unfinished ...>
[pid  8309] <... futex resumed> )       = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
[pid  8309] --- SIGUSR1 {si_signo=SIGUSR1, si_code=SI_TKILL, si_pid=8308, si_uid=500} ---
[pid  8309] rt_sigreturn()              = -1 EINTR (Interrupted system call)
[pid  8309] madvise(0x7fd5f6019000, 8368128, MADV_DONTNEED) = 0
[pid  8309] _exit(0)

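I must say I don't master the details, but the kernel seems to work out where I am trying to get to and makes the whole thing behave correctly: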

Bee: got RC = -1 and errno = Interrupted system call