A simplified version of my client/server code:
/*------CLIENT CODE------*/
SOCKET server_socket;
...
result = send(server_socket, buffer, ...); //error handling after send
shutdown(server_socket, SD_SEND);
log("data sent successfully, bytes: %d, waiting for response", buffer.size());
result = recv(server_socket, ...); //wait for response
/*------SERVER CODE------*/
SOCKET client_socket;
... // client_socket I/O mode is initialized to non-blocking
for (int retry_num = 0;;)
{
    result = recv(client_socket, ...);
    if (result == 0)
    {
        log("connection gracefully closed by client");
        break;
    }
    else if (result > 0)
    {
        log("received %d bytes", result);
        // add received chunk to buffer and loop again
    }
    else // error occurred
    {
        switch (WSAGetLastError())
        {
        case WSAEWOULDBLOCK:
            if (++retry_num == MAX_RETRY_NUMBER) // MAX_RETRY_NUMBER = 10
            {
                log("timeout reached on recv()");
                throw ...
            }
            log("socket would block, retrying after short timeout");
            sleep(DELAY); // about 50 milliseconds
            break;
        ...
        }
    }
}
result = send(client_socket, ...); //send response
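To make the timing of the server loop above explicit, here is a minimal model of it with the socket replaced by a callback. It is only a sketch: `RecvOutcome`, `LoopResult`, `run_receive_loop` and `fake_recv` are hypothetical names introduced for illustration, and sleeping is counted rather than performed. With MAX_RETRY_NUMBER = 10 and DELAY of about 50 ms, the loop sleeps 9 times, so the client's FIN has roughly 450 ms to arrive before the timeout fires.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for the three outcomes the server loop distinguishes.
enum class RecvOutcome { Data, Closed, WouldBlock };

struct LoopResult {
    bool timed_out;  // true if the retry limit was reached
    int  sleeps;     // number of times the loop would have slept
    int  waited_ms;  // total time the loop would have spent sleeping
};

// Models the server loop: fake_recv replaces recv(), sleeping is only counted.
LoopResult run_receive_loop(const std::function<RecvOutcome()>& fake_recv,
                            int max_retry_number, int delay_ms)
{
    assert(max_retry_number > 0 && delay_ms >= 0);
    LoopResult r{false, 0, 0};
    for (int retry_num = 0;;) {
        switch (fake_recv()) {
        case RecvOutcome::Closed:
            return r;                 // "connection gracefully closed by client"
        case RecvOutcome::Data:
            break;                    // keep reading; the counter is not reset, as in the loop above
        case RecvOutcome::WouldBlock:
            if (++retry_num == max_retry_number) {
                r.timed_out = true;   // "timeout reached on recv()"
                return r;
            }
            ++r.sleeps;               // "socket would block, retrying after short timeout"
            r.waited_ms += delay_ms;
            break;
        }
    }
}
```

Calling `run_receive_loop` with a callback that returns `Data` once and `WouldBlock` afterwards, with limits 10 and 50, reproduces the bad flow: 9 sleeps, 450 ms waited, then the timeout. Note also that the retry counter is shared across the whole loop, so would-block results seen before the data arrives reduce the time left for the FIN.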
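Example logs from the correct flow:
CLIENT: data sent successfully, bytes: 150, waiting for response
SERVER: received 150 bytes
SERVER: connection gracefully closed by client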
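Sometimes the flow fails with a timeout on the server side. Logs from the bad flow:
CLIENT: data sent successfully, bytes: 150, waiting for response
SERVER: received 150 bytes
SERVER: socket would block, retrying after short timeout
... the above line repeats until it has been logged MAX_RETRY_NUMBER - 1 times
SERVER: timeout reached on recv()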
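It looks like the server sometimes fails to receive the FIN that the client sends when it calls shutdown().
The client application is triggered by Task Scheduler after Windows startup.
The bad case accounts for 5-10% of all runs, and I cannot reproduce it on a Windows system that is already up and running; it happens only when the application is triggered right after Windows startup.
What can cause this strange behavior? Could Winsock sometimes be faulty or busy right after Windows startup?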
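One way to tell a lost FIN from a merely late one would be to measure how long the close takes to show up, with a generous deadline instead of a fixed retry count. The following is a diagnostic sketch under that assumption, not a fix: `PollOutcome`, `wait_for_close_ms` and `poll` are hypothetical names, and `poll` stands in for the non-blocking recv() call plus its error check.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical stand-ins for the outcomes of one non-blocking recv() attempt.
enum class PollOutcome { Data, Closed, WouldBlock };

// Returns the milliseconds elapsed until the peer's close was observed,
// or -1 if the deadline passed while the socket still reported would-block.
long long wait_for_close_ms(const std::function<PollOutcome()>& poll,
                            std::chrono::milliseconds deadline,
                            std::chrono::milliseconds delay)
{
    assert(delay.count() > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (;;) {
        const PollOutcome outcome = poll();
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::milliseconds>(clock::now() - start);
        if (outcome == PollOutcome::Closed) {
            return elapsed.count();          // log this value to see how late the FIN was
        }
        if (outcome == PollOutcome::WouldBlock) {
            if (elapsed >= deadline) {
                return -1;                   // nothing arrived within the deadline
            }
            std::this_thread::sleep_for(delay);
        }
        // PollOutcome::Data: real code would append the chunk to the buffer here
    }
}
```

If the logged value after a boot-time run is consistently a little above 450 ms, the FIN is arriving late rather than being dropped, and the retry limit is simply too tight for a system that is still starting up.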