Python concurrent.futures stops before finishing

Date: 2015-01-22 23:12:44

Tags: python concurrency

I have a simple concurrency script set up as follows:

import concurrent.futures
import json
from datetime import datetime
from time import sleep

import requests

executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)

poll_worker = None
request_headers = {"Content-Type": "application/json"}

def addBulkListings(bulk_listings):
    print("Adding listings")
    future = executor.submit(runBulkListings, bulk_listings)
    future.add_done_callback(addPollBulkID)

# Future callback
def addPollBulkID(future):
    if not future.exception():
        poll_id = future.result().json()['results']
        print("Polling id: %s" % poll_id)
        new_future = executor.submit(pollBulkListing, poll_id)
        new_future.add_done_callback(callProcessMatches)
        print(new_future.result())
    else:
        print("Error getting Poll ID")
        print(future.exception())

# Future callback
def callProcessMatches(future):
    print("callProcessMatches")
    if not future.exception():
        print("Processing matches")
        result = future.result()
        new_future = executor.submit(processMatches, result.json())
        new_future.add_done_callback(finishBulkListing)
    else:
        print("Error polling")
        print(future.exception())

# Future callback
def finishBulkListing(future):
    if not future.exception():
        print(future.result())
    else:
        print("Error processing matches")
        print(future.exception())

# Executor called
def processMatches(response):
    results = []

    for product in response['results']:
        processResults(product, results)

    return results

# Executor called
def pollBulkListing(poll_id):
    start = datetime.now()
    overtime = False

    while not overtime:
        response = requests.get(MAIN_URL + poll_id,
            headers = request_headers)
        if response.status_code == requests.codes.ok:
            return response
        sleep(5)
        overtime = (datetime.now() - start).total_seconds() >= 60

    raise requests.exceptions.Timeout

# Executor called
def runBulkListings(bulk_listings):
    response = requests.post(MAIN_URL, 
        data=json.dumps(bulk_listings),
        headers = request_headers)
    response.raise_for_status()
    return response

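"addBulkListings" is called by another script, which is what starts the work on the executor. This works when I call addBulkListings only once, but it fails if I call it twice. The failure shows up in the "addPollBulkID" method: the print statement there runs without any exception, and then the program simply exits. Nothing in "callProcessMatches" is ever called, whether or not there is an exception. As I said, everything works fine when I call addBulkListings only once.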

Guesses: I've been hacking away at this, but I'm not sure what is going on. In the examples I've seen, people use:

with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:

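But that creates a context, which I don't want. I don't have all of the arguments for the "addBulkListings" function when I first start, and I need to be able to add them later without re-creating the executor. Maybe I'm misunderstanding something.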
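
To make that requirement concrete, here is a minimal sketch of a module-level executor that accepts work from several separate calls and is shut down explicitly at the end, with no with block. The names square, add_work, finish and pending are hypothetical and used only for illustration; square stands in for the real HTTP work.

```python
import concurrent.futures

# Module-level executor, created once and shared by every call.
executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)
pending = []  # hypothetical list that keeps track of submitted futures


def square(n):
    # Stand-in for real work such as an HTTP request.
    return n * n


def add_work(values):
    # Can be called any number of times; each call reuses the same executor.
    for value in values:
        pending.append(executor.submit(square, value))


def finish():
    # Block until everything submitted so far is done, then release the threads.
    results = [future.result() for future in pending]
    executor.shutdown(wait=True)
    return results
```

Calling add_work([1, 2]), then add_work([3]), then finish() collects the results of both calls in submission order. The with statement is only a convenience that calls shutdown(wait=True) on exit, so calling shutdown explicitly once all work has been added should give the same guarantee.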

Thanks for any help!

1 Answer:

Answer 0: (score: 0)

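Aha! So I'll answer my own question in case anyone else needs it. It turns out I could fix it by switching to:

executor = concurrent.futures.ProcessPoolExecutor(max_workers=5)

instead of ThreadPoolExecutor. I think there was some memory overlap between the threads I was creating. This StackOverflow answer helped me a lot by pointing me in the right direction (look at the second answer).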
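
A possible alternative, which I have not verified against the original script, is to avoid callbacks that resubmit work to the executor. The sketch below runs all three stages inside one submitted task and returns the future, so the calling script can wait for the whole chain before it exits. The functions submit_listing, poll_listing, process_matches, run_pipeline, add_bulk_listings and wait_for_all are hypothetical stand-ins for the functions in the question, and they return canned data instead of making HTTP requests.

```python
import concurrent.futures

executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)


def submit_listing(payload):
    # Hypothetical stand-in for runBulkListings: returns a poll id.
    return "poll-%s" % payload


def poll_listing(poll_id):
    # Hypothetical stand-in for pollBulkListing: returns raw results.
    return {"results": [poll_id, poll_id.upper()]}


def process_matches(response):
    # Hypothetical stand-in for processMatches.
    return list(response["results"])


def run_pipeline(payload):
    # All three stages run in order inside one worker, so a single future
    # represents the whole chain and no callback has to resubmit work.
    poll_id = submit_listing(payload)
    response = poll_listing(poll_id)
    return process_matches(response)


def add_bulk_listings(payload):
    # Return the future so the caller can wait on it or collect the result.
    return executor.submit(run_pipeline, payload)


def wait_for_all(futures):
    # Block until every pipeline has finished; result() re-raises any error.
    concurrent.futures.wait(futures)
    return [future.result() for future in futures]
```

The calling script would keep the futures returned by add_bulk_listings and pass them to wait_for_all before exiting. Because each pipeline is an ordinary function call chain, any exception raised in a stage surfaces from future.result() instead of being handled in a separate callback.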