Question

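I need to run a Java program on many remote machines. I use ssh in a loop and call a remote script that runs the Java program.

As you can imagine, this is for testing a distributed system on a cluster.

The problem is that the script hangs immediately after I enter the password for the first ssh session. This is probably a bash error, because the Java program runs fine locally.

The exact structure is this: a local bash script that runs many remote bash scripts. Each remote script compiles and runs a Java program. The Java program starts a separate thread that does some work. When a SIGINT signal is received, the Java thread is notified so that it can exit cleanly.

I made a simplified working example.

Edit: the code below now works (fixed for posterity).

If you want to answer, please do not change the structure of the code too much, otherwise it will no longer resemble the original and I will not be able to understand the error.

Bash script, run manually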

#!/bin/bash

function startBatch()
{
    #the problem was using -n
    ssh -f "$1" "cd $projectDir;./startBatch.sh $2"
}

function stopBatch()
{
    #the problem was using -n
    ssh -f "$1" "pkill -f jnode_.*"
}

projectDir=NetBeansProjects/Runner

#start nodes
nodeNumber=0
while read node; do
    startBatch "$node" "$nodeNumber"
    nodeNumber=$(($nodeNumber + 1))
done < ./nodes.txt

sleep 3

#stop nodes
while read node; do
    stopBatch "$node"
done < ./nodes.txt

Bash script, run by the other script

#!/bin/bash

#this is a simplified working example
myNumber=$1
$(exec -a jnode_"$myNumber" java -cp build/classes runner.Runner "$myNumber.txt")

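The script above is the simplified version; the fuller batch version follows. If you want proper logging, see the second part of the accepted answer.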

#!/bin/bash

batchNumber=$1
procNumber=0
batchSize=3
while [ "$procNumber" -lt "$batchSize" ]; do
    procName="$batchNumber"_"$procNumber"
    #this line was no good
    #$(exec -a jnode_"$procName" java -cp build/classes runner.Runner "$procName.txt" &)
    #this line works fine
    exec -a jnode_"$procName" java -cp build/classes runner.Runner "$procName.txt" 1>/dev/null 2>/dev/null &
    procNumber=$(($procNumber + 1))
done
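The difference between the two exec lines above most likely comes down to who still holds the output stream. A command substitution $( ... ) waits until every process holding its pipe has closed it, and ssh likewise waits for the remote command's stdout and stderr to close. The sketch below reproduces the effect locally; sleep stands in for the long-running java process, which is an assumption made only for illustration.

```shell
#!/bin/bash
# sleep stands in for the long-running java process (illustrative assumption).

start=$SECONDS
out=$(sleep 2 &)              # the background job inherits the substitution's pipe
held=$((SECONDS - start))     # so the substitution waits until sleep exits

start=$SECONDS
out=$(sleep 2 >/dev/null &)   # stdout detached from the pipe
detached=$((SECONDS - start)) # so the substitution returns almost at once

echo "inherited stdout: waited ${held}s"
echo "redirected stdout: waited ${detached}s"
```

This is consistent with the working line in the script, which drops the $( ... ) wrapper and redirects both stdout and stderr to /dev/null before backgrounding the process.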

Java Runner (the class that starts the thread)

import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintStream;

public class Runner {

    public static void main(String[] args) throws FileNotFoundException, InterruptedException {
        //redirect all outputs to a given file
        PrintStream output = new PrintStream(new File(args[0]));
        System.setOut(output);
        System.setErr(output);

        //controlled object
        final MyRunnable myRunnable = new MyRunnable();

        //shutdown the controlled process on command
        Runtime.getRuntime().addShutdownHook(new Thread() {
            @Override
            public void run() {
                myRunnable.stop = true;
            }
        });

        //run the process
        new Thread(myRunnable).start();
    }
}

Java MyRunnable (the thread that runs)

public class MyRunnable implements Runnable {

    //volatile so the write made by the shutdown hook thread is visible to the loop below
    public volatile boolean stop = false;

    @Override
    public void run() {
        while (!stop) {
            try {
                System.out.println("running");
                Thread.sleep(1000);
            } catch (InterruptedException ex) {
                System.out.println("interrupted");
            }
        }
        System.out.println("stopping");
    }
}

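Do not use System.exit() in the Java program, otherwise the shutdown hook will not be called properly (or will not run to completion). Send a SIGINT from outside instead. (The pkill call in the script above sends SIGTERM by default, which also triggers the shutdown hook.)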
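The stop-flag pattern can be tried out without Java. The following is only a shell analogue, not the Runner class itself: a background worker traps SIGTERM, flips a flag, and leaves its loop cleanly. The names worker and log are made up for this sketch.

```shell
#!/bin/bash
# Shell analogue of the Java stop flag; worker and log are hypothetical names.
worker() {
    stop=0
    trap 'stop=1' TERM          # plays the role of the shutdown hook
    while [ "$stop" -eq 0 ]; do
        sleep 0.2               # plays the role of the periodic work
    done
    echo "stopping"
}

log=$(mktemp)
worker > "$log" &
pid=$!

sleep 1
kill -TERM "$pid"               # ask the worker to stop, as pkill does remotely
wait "$pid"

result=$(cat "$log")
rm -f "$log"
echo "$result"
```

The worker finishes its current sleep, sees the flag, and prints its final message before exiting, which is the behaviour the Java thread is meant to have.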

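As mentioned in the comments, typing passwords gets tedious. A passwordless RSA key is one option, but we can do better. Let's add some security.

Create a public/private key pair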

ssh-keygen -t rsa
Enter file in which to save the key (/home/your_user/.ssh/id_rsa): [input ~/.ssh/nameOfKey]
Enter passphrase (empty for no passphrase): [input a passphrase not weaker than your ssh password]

Add the public key to the authorized_keys file on the remote host so that the key can be used to authenticate.

#first option (use proper command)
ssh-copy-id -i ~/.ssh/nameOfKey.pub user@123.45.67.89

#second option (append the key at the end of the file)
cat ~/.ssh/nameOfKey.pub | ssh user@123.45.67.89 "cat >> ~/.ssh/authorized_keys"
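Note that the second option appends the key every time it is run, so repeating it leaves duplicate lines. A hedged sketch of an idempotent variant follows. authorize_key is a hypothetical helper, it is exercised here against temporary local files rather than a real remote host, and the key text is made up.

```shell
#!/bin/bash
# Hypothetical helper: append a public key to an authorized_keys file only if
# that exact line is not already present.
authorize_key() {
    local pubKey=$1 authFile=$2
    mkdir -p "$(dirname "$authFile")"
    touch "$authFile"
    chmod 600 "$authFile"
    # -x matches whole lines, -F treats the key as a fixed string
    grep -qxF -- "$(cat "$pubKey")" "$authFile" || cat "$pubKey" >> "$authFile"
}

# Demonstration on temporary files with a made-up key line.
demoDir=$(mktemp -d)
echo "ssh-rsa AAAAexamplekey user@example" > "$demoDir/nameOfKey.pub"

authorize_key "$demoDir/nameOfKey.pub" "$demoDir/.ssh/authorized_keys"
authorize_key "$demoDir/nameOfKey.pub" "$demoDir/.ssh/authorized_keys"  # adds nothing

keyLines=$(grep -c "" "$demoDir/.ssh/authorized_keys")
echo "lines in authorized_keys: $keyLines"
```

On a real host the same check-then-append logic would have to run on the remote side of the ssh call.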

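Now, if we use ssh-agent, the passphrase is asked for only once (when the first command is executed). Note that it asks for the passphrase (the one entered when the key was created), not for the actual ssh password.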

#activate the agent
eval `ssh-agent`

#add the key, its passphrase will be asked
ssh-add ~/.ssh/keyName1

#add more keys, if needed
ssh-add ~/.ssh/keyName2

You now have a very simple but powerful test framework for your distributed system. Have fun.

Answer 1

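The ssh man page indicates that -n will not work if ssh needs to ask for a password. You should use -f instead, or set up passwordless ssh so that no password has to be entered.

Quoting from the Mac OS X man page for ssh: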

 -n      Redirects stdin from /dev/null (actually, prevents reading from stdin).  This must be used when ssh is run in the background.  A common trick is to use this to run X11
         programs on a remote machine.  For example, ssh -n shadows.cs.hut.fi emacs & will start an emacs on shadows.cs.hut.fi, and the X11 connection will be automatically
         forwarded over an encrypted channel.  The ssh program will be put in the background.  (This does not work if ssh needs to ask for a password or passphrase; see also the -f
         option.)

And also:

 -f      Requests ssh to go to background just before command execution.  This is useful if ssh is going to ask for passwords or passphrases, but the user wants it in the
         background.  This implies -n.  The recommended way to start X11 programs at a remote site is with something like ssh -f host xterm.

         If the ExitOnForwardFailure configuration option is set to "yes", then a client started with -f will wait for all remote port forwards to be successfully established
         before placing itself in the background.
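The stdin redirection described for -n (and implied by -f) matters for a second reason in a while read loop like the one in the question: a command that reads stdin inside the loop swallows the rest of the input file. The sketch below illustrates this without any network access; consume_stdin is a hypothetical stand-in for an ssh call that reads its standard input, and the node names are made up.

```shell
#!/bin/bash
# Hypothetical stand-in for a command that, like ssh without -n, reads stdin.
consume_stdin() { cat > /dev/null; }

nodes=$'node1\nnode2\nnode3'   # made-up node list

# Unprotected: the first call swallows the remaining lines of the list.
unprotected=0
while read -r node; do
    consume_stdin
    unprotected=$((unprotected + 1))
done <<< "$nodes"

# Protected: stdin comes from /dev/null, which is what -n arranges for ssh.
protected=0
while read -r node; do
    consume_stdin < /dev/null
    protected=$((protected + 1))
done <<< "$nodes"

echo "unprotected iterations: $unprotected"
echo "protected iterations: $protected"
```

The unprotected loop runs once and the protected loop runs once per node, so some form of stdin protection is needed whenever ssh is called from such a loop.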

Answer 2

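When executing a remote command, SSH does not exit until the remote command completes. Your remote script does not exit until the Java program finishes, the Java program does not exit until all of its non-daemon threads have exited, and your Java program runs forever. As a result, the server-side SSH invocation runs forever (well, until you kill it by some other means) and your script hangs.

You need to decide at which point the SSH remote command should return. You have options. The simplest is probably to invoke the server script with &, as in: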

ssh -n "$1" "cd $projectDir;./startBatch.sh $2 &"

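A more robust option is to put the & on the java invocation inside the remote script and let the server-side script run as it does now (without &), so that you still get the chance to read, in full, things such as error messages produced by the remote script.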

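Side note: as for the passwords themselves (which you will eventually have to deal with once you are past the current hurdle), as I mentioned in a comment on the question, one possibility is to create a passwordless key (ssh-keygen -t rsa) on your machine and then paste the public key into authorized_keys2 on each remote machine (the question uses authorized_keys). After that you no longer have to deal with passwords when connecting from your machine. SSH password prompts can sometimes wreak havoc on script interactivity. This has security implications, but they may not matter in your situation.

In response to the comment below: you have a few options. If you want to capture everything in the same log file, use append; do not redirect the program's output, just redirect everything the while loop does to the log, for example:

while [ "$procNumber" -lt "$batchSize" ]; do
    procName="$batchNumber"_"$procNumber"
    exec -a jnode_"$procName" java -cp build/classes runner.Runner "$procName.txt" &
    procNumber=$(($procNumber + 1))
done >> "$myLog" 2>&1

If you want one log per process, append each process's output to its own file on the java line rather than on the loop.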
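For the one-log-per-process case, a minimal sketch might look like the following. It is not the answer's original snippet: run_node is a hypothetical stand-in for the exec -a ... java invocation, and the log directory and the .log file naming are assumptions.

```shell
#!/bin/bash
# Hypothetical stand-in for the java invocation; prints one line and exits.
run_node() { echo "running $1"; }

batchNumber=0
batchSize=3
logDir=$(mktemp -d)     # assumed log location

procNumber=0
while [ "$procNumber" -lt "$batchSize" ]; do
    procName="${batchNumber}_${procNumber}"
    # One log per process: append this process's stdout and stderr to its own file.
    run_node "$procName" >> "$logDir/$procName.log" 2>&1 &
    procNumber=$((procNumber + 1))
done
wait    # only so the sketch can inspect the logs afterwards

logs=("$logDir"/*.log)
logCount=${#logs[@]}
firstLog=$(cat "$logDir/0_0.log")
echo "log files written: $logCount"
```

Because the redirection sits on the individual command, each process writes only to its own file, and anything else the loop prints still goes wherever the loop's own output goes.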

You can also combine the two if you want to keep the application output separate from the output of the other commands in the loop.

Using Bash and ssh to run a Java program on multiple remote machines
