Question

I have the following code that I am running:

import time

import pexpect

try:
    child = pexpect.spawn(
        'some command --path {0}  somethingmore --args {1}'.format(
            <iterator-output>, something),  # placeholder: value from the loop
        timeout=300)
    # pexpect writes bytes to the log on Python 3, so open in binary mode.
    child.logfile = open(file_name, 'wb')
    child.expect('x*')
    child.sendline(something)
    child.expect('E*')
    child.sendline(something)
    # child.read()
    child.interact()
    time.sleep(15)
    print(child.status)
except Exception as e:
    print("Exception in child process")
    print(str(e))

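The command passed to pexpect creates a child process, taking one of its inputs from a loop. Every time it spins up a child process, I try to capture the logs via child.read, but that waits for the child process to complete before going through the loop again. How do I make it keep running in the background? (I get the logs of the command input/output that I enter dynamically, but not of the process that runs afterwards, unless I use read or interact.) I tried "How do I make a command to run in background using pexpect.spawn?", but it uses interact, which again waits for the child process to finish. Since the loop will be iterated more than 100 times, I cannot wait for one child to finish before moving on to the next. The command run through pexpect is an AWS Lambda call; all I need is to make sure the command is triggered, but I am not able to capture the process output of that call without waiting for it to finish. Please let me know your suggestions.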

Answer 1

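If you want to run a process in the background while still interacting with it, the simplest solution is to start a thread to interact with the process.*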
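A minimal sketch of the one-thread-per-conversation idea. It assumes a stand-in child launched with subprocess rather than pexpect (so it runs without pexpect installed); the interact function and the results dictionary are hypothetical names, not part of the question's code.

```python
import subprocess
import sys
import threading

def interact(results, index):
    # Stand-in for the pexpect conversation: send one line, read the reply.
    child = subprocess.Popen(
        [sys.executable, "-c", "print(input().upper())"],
        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
        universal_newlines=True)
    out, _ = child.communicate("hello\n")
    results[index] = out.strip()

results = {}
t = threading.Thread(target=interact, args=(results, 0))
t.start()
# The main thread is free to do other work here while the child runs.
t.join()
```

With pexpect, the body of interact would presumably be the spawn/expect/sendline sequence from the question instead.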

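In your case, it sounds like you are running hundreds of processes, so you want to run some of them in parallel, but probably not all of them at once? If so, you should use a thread pool or executor. For example, using concurrent.futures from the stdlib (or pip install the futures backport if your Python is too old):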

import concurrent.futures
import time

import pexpect

def run_command(path, arg):
    try:
        child = pexpect.spawn(
            'some command --path {0}  somethingmore --args {1}'.format(path, arg),
            timeout=300)
        # pexpect writes bytes to the log on Python 3, so open in binary mode.
        child.logfile = open(file_name, 'wb')
        child.expect('x*')
        child.sendline(something)
        child.expect('E*')
        child.sendline(something)
        # child.read()
        child.interact()
        time.sleep(15)
        print(child.status)
    except Exception as e:
        print("Exception in child process")
        print(str(e))

# Run at most 8 commands at a time; the rest queue up behind them.
with concurrent.futures.ThreadPoolExecutor(max_workers=8) as x:
    fs = []
    for path, arg in some_iterable:
        fs.append(x.submit(run_command, path, arg))
    concurrent.futures.wait(fs)

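If you need to return a value (or raise an exception) from the thread code, you probably want to loop over as_completed(fs) instead of a plain wait. But here, you seem to just print things out and then forget about them.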
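As a rough sketch of the as_completed pattern: here run_command is a hypothetical stand-in that returns a value or raises instead of driving pexpect, and run_all and the job list are made up for illustration.

```python
import concurrent.futures

def run_command(path, arg):
    # Hypothetical stand-in for the pexpect-driven function: it returns a
    # value, or raises, instead of printing and swallowing errors.
    if not path:
        raise ValueError("empty path")
    return "{0}:{1}".format(path, arg)

def run_all(jobs, max_workers=8):
    results, errors = {}, {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as x:
        # Map each future back to the job that produced it.
        fs = {x.submit(run_command, path, arg): (path, arg)
              for path, arg in jobs}
        for f in concurrent.futures.as_completed(fs):
            job = fs[f]
            try:
                results[job] = f.result()  # re-raises the task's exception
            except Exception as e:
                errors[job] = e
    return results, errors

results, errors = run_all([("a", 1), ("b", 2), ("", 3)])
```

Keeping the future-to-job mapping means each result or exception can be attributed to the (path, arg) pair that caused it, whatever order the tasks finish in.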

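If path, arg really do come straight from an iterable like this, x.map is usually simpler than submit plus wait. Note that x.map(run_command, some_iterable) passes each (path, arg) pair as a single argument, so run_command would have to accept one tuple (or be wrapped in a small function that unpacks it).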
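A small sketch of the map variant, assuming the iterable yields (path, arg) pairs. Executor.map hands each item to the function as one argument, so this version uses a hypothetical run_job wrapper to unpack the pair; run_command is again a stand-in and the job list is invented.

```python
import concurrent.futures

def run_command(path, arg):
    # Hypothetical stand-in for the pexpect-driven function.
    return "{0}:{1}".format(path, arg)

def run_job(job):
    # Each item arrives as a single (path, arg) tuple; unpack it here.
    path, arg = job
    return run_command(path, arg)

some_iterable = [("a", 1), ("b", 2), ("c", 3)]  # made-up jobs

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as x:
    # map yields results in input order, not completion order.
    outputs = list(x.map(run_job, some_iterable))
```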

All of this (and other options) is explained well in the module documentation.

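See also the pexpect FAQ and common problems. I don't think there are any issues that would affect you in current versions (we always spawn the child and interact with it entirely within a single thread-pool task), but I vaguely remember there being an additional problem in the past (something to do with signals?).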

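* I think asyncio would be a better solution here, except that, as far as I know, none of the attempts to fork or reimplement pexpect in a non-blocking way are complete enough to actually use...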

Answer 2

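If you don't actually want to interact with many processes in parallel, but just want to interact with each process briefly, then ignore it while it runs and move on to interacting with the next one...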

import pexpect

# Do everything up to the final `interact`. After that, the child
# won't be writing to us anymore, but it will still be running for
# many seconds. So, return the child object so we can deal with it
# later, after we've started up all the other children.
def start_command(path, arg):
    try:
        child = pexpect.spawn(
            'some command --path {0}  somethingmore --args {1}'.format(path, arg),
            timeout=300)
        # pexpect writes bytes to the log on Python 3, so open in binary mode.
        child.logfile = open(file_name, 'wb')
        child.expect('x*')
        child.sendline(something)
        child.expect('E*')
        child.sendline(something)
        # child.read()
        child.interact()
        return child
    except Exception as e:
        print("Exception in child process")
        print(str(e))

# First, start up all the children and do the initial interaction
# with each one.
children = []
for path, args in some_iterable:
    child = start_command(path, args)
    # start_command returns None if it failed; skip those.
    if child is not None:
        children.append(child)

# Now we just need to wait until they're all done. This will get
# them in as-launched order, rather than as-completed, but that
# seems like it should be fine for your use case.
for child in children:
    try:
        child.wait()
        print(child.status)
    except Exception as e:
        print("Exception in child process")
        print(str(e))

A few things:

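Notice from the code comments that I'm assuming the child isn't writing anything to us (and waiting for us to read it) after the initial interaction. If that's not true, things get a bit more complicated.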

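If you want to not only do this, but also start up 8 children at a time, or even all of them at once, you can (as shown in my other answer) use an executor, or just a bunch of threads, for the initial start_command calls, and have those tasks/threads return the child object to be waited on later. For example, with the Executor version, each future's result() will be a pexpect child process. However, you definitely need to read the pexpect docs on threads in that case: on some versions of Linux, passing child process objects between threads can break the objects.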
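A hedged sketch of the executor variant, where each future's result() is the child object. To keep it runnable without pexpect, start_command here launches a stand-in child with subprocess.Popen, so it does not exercise the pexpect threading caveat mentioned above; the job list and the exit-code trick are invented for illustration.

```python
import concurrent.futures
import subprocess
import sys

def start_command(path, arg):
    # Stand-in for the pexpect version: launch a child and return it
    # without waiting for it to finish. The child exits with code `arg`.
    code = "import sys; sys.exit(int(sys.argv[2]))"
    return subprocess.Popen([sys.executable, "-c", code, path, str(arg)])

jobs = [("a", 0), ("b", 3)]  # made-up (path, arg) pairs

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as x:
    futures = [x.submit(start_command, path, arg) for path, arg in jobs]
    # Each result() is the child object returned by start_command.
    children = [f.result() for f in futures]

# Wait on the children afterwards, in launch order.
exit_codes = [child.wait() for child in children]
```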

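Finally, since you'll now be seeing output much more out of order than in the original version, you may want to change your print statements to show which child you're printing for (which probably also means changing children from a list of children to a list of (child, path, arg) tuples or the like).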
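One way the (child, path, arg) bookkeeping might look. As an assumption for the sake of a runnable example, the children are stand-ins started with subprocess.Popen rather than pexpect, and the report list and message format are hypothetical.

```python
import subprocess
import sys

def start_command(path, arg):
    # Stand-in for the pexpect version; the child exits with code `arg`.
    code = "import sys; sys.exit(int(sys.argv[2]))"
    return subprocess.Popen([sys.executable, "-c", code, path, str(arg)])

# Keep the inputs next to each child so later output can be labeled.
children = []
for path, arg in [("a", 0), ("b", 2)]:
    children.append((start_command(path, arg), path, arg))

report = []
for child, path, arg in children:
    status = child.wait()
    line = "path={0} arg={1} status={2}".format(path, arg, status)
    report.append(line)
    print(line)
```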

Running a pexpect child process in the background
