Question

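I want to start a cluster dynamically from my Jupyter notebook for specific functions. I can start the cluster and get the engines running, but I have two problems:

(1) I can't run the ipcluster command in the background. When I run the command from the notebook, the cell keeps running for as long as the cluster is up, so I can't run any other cells in the same notebook. Once the engines have started, I can use them from a different notebook. How can I run ipcluster in the background?

(2) My code always starts 8 engines, regardless of the settings in ipcluster_config.py.

Code: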

server_num = 3
ip_new = '10.0.1.' + str(10+server_num)
cluster_profile = "ssh" + str(server_num)

import commands
import time
commands.getstatusoutput("ipython profile create --parallel --profile=" + cluster_profile)

text = """
c.SSHEngineSetLauncher.engines = {'""" +ip_new + """' : 12}
c.LocalControllerLauncher.controller_args = ["--ip=10.0.1.163"]
c.SSHEngineSetLauncher.engine_cmd = ['/home/ubuntu/anaconda2/pkgs/ipyparallel-6.0.2-py27_0/bin/ipengine']
"""

with open("/home/ubuntu/.ipython/profile_" + cluster_profile + "/ipcluster_config.py", "a") as myfile:
    myfile.write(text)

result = commands.getstatusoutput("(ipcluster start --profile='"+ cluster_profile+"' &)")
time.sleep(120)
print(result[1])

Answer 1

I nearly had a heart attack when I saw that your question on StackOverflow had no answers, because I had run into the same problem.

But running

ipcluster start --help

shows the following option:

--daemonize

This makes it run in the background.

So, in your notebook, you can do the following:

no_engines = 6
!ipcluster start -n {no_engines} --daemonize

Note: according to the output of ipcluster start --help, this option does not work on Windows.
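
For anyone who prefers to launch the daemonized cluster from Python rather than with the ! shell escape, here is a minimal sketch. The helper names build_ipcluster_start and start_cluster are made up for illustration; the only flags used are the ones that appear in this thread (-n, --profile, --daemonize). It assumes ipcluster is on the PATH and, per the note above, that you are not on Windows.

```python
import subprocess


def build_ipcluster_start(n_engines, profile=None, daemonize=True):
    """Build the argument list for 'ipcluster start' (hypothetical helper)."""
    cmd = ["ipcluster", "start", "-n", str(n_engines)]
    if profile is not None:
        cmd.append("--profile={}".format(profile))
    if daemonize:
        # --daemonize makes ipcluster detach and run in the background
        cmd.append("--daemonize")
    return cmd


def start_cluster(n_engines, profile=None):
    """Run the command without capturing its output; returns the exit code."""
    return subprocess.call(build_ipcluster_start(n_engines, profile))


# Usage in a notebook cell (not executed here):
# start_cluster(6)
```

Passing the arguments as a list avoids shell quoting issues, and not capturing the output means the notebook cell should return once ipcluster has detached.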

Answer 2

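I'm not familiar with the details of the commands module (deprecated since Python 2.6, according to https://docs.python.org/2/library/commands.html), but I do know that capturing output with the subprocess module makes the interpreter block until the system call completes.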
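
To illustrate the blocking behaviour, here is a small sketch that uses a sleeping child Python process as a stand-in for ipcluster (an assumption made so that the example runs anywhere). Popen returns as soon as the child is spawned, whereas check_output waits for the child to exit because it has to collect its output.

```python
import subprocess
import sys
import time

# Stand-in for a long-running command such as ipcluster:
# a child Python process that sleeps for one second and then prints.
child = [sys.executable, "-c", "import time; time.sleep(1); print('done')"]

# Popen does not wait: it returns while the child is still running.
t0 = time.time()
proc = subprocess.Popen(child, stdout=subprocess.DEVNULL)
popen_elapsed = time.time() - t0
still_running = proc.poll() is None  # None means "not finished yet"

# check_output captures stdout, so it blocks until the child exits.
t0 = time.time()
output = subprocess.check_output(child)
capture_elapsed = time.time() - t0

proc.wait()  # clean up the first child
```

The same reasoning would explain why a cell that captures the output of ipcluster start keeps running for as long as the cluster is up.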

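Also, if you use the ipcluster command, you can set the number of engines from the command line without touching the configuration files. So something like this worked for me: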

from ipyparallel import Client
import subprocess

nengines = 3 # or whatever
subprocess.Popen(["ipcluster", "start", "-n={:d}".format(nengines)])
rc = Client()
# send your jobs to the engines; when done do
subprocess.Popen(["ipcluster", "stop"])

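This of course doesn't solve the problem of adding or removing hosts dynamically (which your code looks like it is trying to do), but if you only care how many hosts are available, and not which ones, you could make a default ipcluster configuration that includes all possible hosts and allocate them as needed with code similar to the above.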

Also note that ipcluster can take a second or two to start, so you may want to add a time.sleep call between the first subprocess.Popen call and the attempt to create the client.
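
Instead of a fixed sleep, one option is to poll until the expected number of engines has registered. The helper below is a hypothetical sketch, not part of ipyparallel: it takes any callable that returns the current engine count, for example lambda: len(rc.ids) once a Client has connected.

```python
import time


def wait_for_engines(count_engines, expected, timeout=60.0, interval=0.5):
    """Poll count_engines() until it reports at least `expected` engines.

    Returns the last count seen, or raises TimeoutError if the engines
    have not all appeared within `timeout` seconds.
    """
    deadline = time.time() + timeout
    while True:
        n = count_engines()
        if n >= expected:
            return n
        if time.time() >= deadline:
            raise TimeoutError(
                "only {} of {} engines after {} s".format(n, expected, timeout)
            )
        time.sleep(interval)


# Usage with ipyparallel (assumes the controller is already reachable,
# so a short sleep or retry may still be needed before creating the Client):
# rc = Client()
# wait_for_engines(lambda: len(rc.ids), nengines)
```

This only replaces the wait for the engines; creating the Client itself can still fail if the controller has not started yet.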

Starting ipcluster from code
