Question

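My configuration:

Dedicated server (Ubuntu 16.04 LTS) used only for Jenkins (2.7.1),
over 100 Jenkins jobs, each of which brings up a Vagrant instance on AWS (Vagrantfile),
each job (provisioning script) can take 1-2 hours to run,
most server configuration files (e.g. SSH) have the default system configuration.

When I run multiple Jenkins jobs at the same time, they are more likely to fail with this error: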

00:00:00.774 + vagrant up --no-provision --destroy-on-error --provider=aws
00:00:09.635 Bringing machine 'MT-aws' up with 'aws' provider...
...
00:01:16.498     MT-aws: Running: inline script
...
00:01:26.415 ==> MT-aws: + echo
00:01:26.415 ==> MT-aws: + sleep 20
00:01:26.427 The SSH connection was unexpectedly closed by the remote end. This
00:01:26.427 usually indicates that SSH within the guest machine was unable to
00:01:26.427 properly start up. Please boot the VM in GUI mode to check whether
00:01:26.427 it is booting properly.
00:01:26.625 Build step 'Execute shell' marked build as failure

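Facts:

The provisioning script fails at random points (no specific code precedes the failure),
the server is not overloaded, has plenty of free memory, and has access to a Gbit network,
the more jobs I run in parallel, the more likely they are to fail,
re-running the same job on its own usually works fine,
/etc/ssh/ssh_config has the default settings, and Jenkins has no ~/.ssh/config.

How can I fix the unexpected connection closure described above?

Do I need to increase some SSH timeout setting, or something else?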

Answer 1

Open your sshd configuration file (/etc/ssh/sshd_config):

# vi /etc/ssh/sshd_config

Modify the settings as follows:

ClientAliveInterval 30
ClientAliveCountMax 5

where:

ClientAliveInterval: sets a timeout interval in seconds (30 here) after which, if no data has been received from the client, sshd sends a message through the encrypted channel to request a response from the client. The default is 0, meaning these messages are not sent to the client. This option applies to protocol version 2 only.

ClientAliveCountMax: sets the number of client alive messages (5 here) that may be sent without sshd receiving any message back from the client. If this threshold is reached while client alive messages are being sent, sshd disconnects the client, terminating the session.

Save and close the file, then restart sshd, for example:

# /etc/init.d/ssh restart

or:

# service sshd restart
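
If the same change has to be applied to many machines, the manual edit can be scripted. The following is only a minimal sketch under stated assumptions: `set_option` is a hypothetical helper (not part of OpenSSH), and it assumes a simple sshd_config where each keyword appears on its own line. Because sshd uses the first value it finds for a keyword, the helper writes the new setting at the top of the file and drops any older uncommented line for the same keyword.

```shell
#!/bin/sh
# set_option FILE KEY VALUE  (hypothetical helper, for illustration only)
# Puts "KEY VALUE" on the first line of FILE and removes any existing
# uncommented line that sets KEY (keywords are matched case-insensitively).
set_option() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || return 1
    {
        printf '%s %s\n' "$key" "$value"
        # grep exits non-zero when nothing is left to print; that is fine here
        grep -v -i -E "^[[:space:]]*${key}[[:space:]=]" "$file" || true
    } > "$tmp"
    cat "$tmp" > "$file"   # overwrite in place, keeping the file's permissions
    rm -f "$tmp"
}

# Possible usage as root (back up first, validate before restarting):
#   cp /etc/ssh/sshd_config /etc/ssh/sshd_config.bak
#   set_option /etc/ssh/sshd_config ClientAliveInterval 30
#   set_option /etc/ssh/sshd_config ClientAliveCountMax 5
#   sshd -t && service ssh restart
```

Running `sshd -t` before the restart checks the configuration file for syntax errors, which helps avoid locking yourself out with a broken file.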

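Another option is to enable ServerAliveInterval in the ssh_config file on the client (your workstation), for example:

# vi /etc/ssh/ssh_config

Then append or modify the values as follows:

ServerAliveInterval 30
ServerAliveCountMax 5

where:

ServerAliveInterval: sets a timeout interval in seconds after which, if no data has been received from the server, ssh sends a message through the encrypted channel to request a response from the server.

With ServerAliveInterval set to 30 and ServerAliveCountMax set to 5, ssh disconnects from an unresponsive server after roughly 150 seconds (30 × 5). For comparison, if ServerAliveInterval is set to 15 and ServerAliveCountMax is left at its default of 3, ssh disconnects after approximately 45 seconds. Again, this option applies to protocol version 2 only.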
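
As a rough check when choosing values, the time before a silent peer is dropped is approximately the interval multiplied by the maximum count. The helper below is a hypothetical illustration of that arithmetic only (`keepalive_timeout` is not an OpenSSH command), and the same estimate applies to the ClientAlive* pair on the server side.

```shell
#!/bin/sh
# keepalive_timeout INTERVAL COUNT_MAX  (hypothetical helper)
# Prints the approximate number of seconds before an unresponsive peer is
# dropped: one probe every INTERVAL seconds, giving up after COUNT_MAX
# unanswered probes.
keepalive_timeout() {
    echo $(( $1 * $2 ))
}

keepalive_timeout 30 5   # interval 30, count 5
keepalive_timeout 15 3   # interval 15, default count of 3
```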

Answer 2

As suggested by Chris Roberts, another approach is to add an SSH keep_alive line to the Vagrantfile, for example:

config.ssh.keep_alive = true

This sends SSH keep-alive packets every 5 seconds by default to keep the connection alive.

For details, see: config.ssh related settings.

How to fix the issue 'The SSH connection was unexpectedly closed by the remote end'?
