Multiprocess sub-function does not return any results

Time: 2016-12-18 10:51:49

Tags: python python-multiprocessing multiprocess deco

I am trying to use the concurrency features provided by the deco module. The code works when run without concurrency, as shown in this answer:

Extract specific columns from a given webpage

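However, the following code does not put any elements into finallist (it stays empty). Inside the scope of the `slow` function it does produce results, as the print statement shows. So why is the outer list empty?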

    import urllib.request
    from bs4 import BeautifulSoup
    from deco import concurrent, synchronized

    finallist=list()
    urllist=list()

    @concurrent
    def slow(url):
        #print (url)
        try:
            page = urllib.request.urlopen(url).read()
            soup = BeautifulSoup(page)
            mylist=list()
            for anchor in soup.find_all('div', {'class':'col-xs-8'})[:9]:
                mylist.append(anchor.text)
                urllist.append(url)
            finallist.append(mylist)
            #print (mylist)
            print (finallist)
        except:
            pass


    @synchronized
    def run():
        finallist=list()
        urllist=list()
        for i in range(10):
            url='https://pythonexpress.in/workshop/'+str(i).zfill(3)
            print (url)
            slow(url)
        slow.wait()
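
A likely reason the outer list stays empty, sketched with the standard library rather than deco: when work runs in a separate process, that process operates on its own copy of module-level objects, so an append made there never reaches the parent's list. The names `results`, `append_in_child`, and `run_in_child` below are hypothetical, and the sketch assumes that deco's `@concurrent` dispatches calls to worker processes, as the question's title suggests.

```python
import multiprocessing as mp

results = []  # module-level list, playing the role of `finallist` in the question


def append_in_child(value, queue):
    """Runs in the child process: mutate the global, then report what the child sees."""
    results.append(value)
    queue.put(list(results))


def run_in_child(value):
    """Start one child process and return (child's view, parent's view) of `results`."""
    # Prefer the fork start method where it exists; otherwise use the platform default.
    method = "fork" if "fork" in mp.get_all_start_methods() else None
    ctx = mp.get_context(method)
    queue = ctx.Queue()
    proc = ctx.Process(target=append_in_child, args=(value, queue))
    proc.start()
    child_view = queue.get(timeout=30)  # read before join so the child can exit cleanly
    proc.join()
    return child_view, list(results)


if __name__ == "__main__":
    child_view, parent_view = run_in_child(42)
    print("child sees: ", child_view)   # [42]
    print("parent sees:", parent_view)  # []
```

The child prints a populated list, just as `slow` does in the question, while the parent's list is untouched. Data has to be sent back explicitly, here through a queue.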

1 answer:

Answer 0 (score: 1)

I refactored your code so that it works with the module. I fixed two of the common pitfalls outlined on the deco wiki:

  1. Do not use global variables.
  2. Perform all mutations through index (square-bracket) assignment: obj[key] = value

The result:

    import urllib.request
    from bs4 import BeautifulSoup
    from deco import concurrent, synchronized
    
    N = 10
    
    @concurrent
    def slow(url):
        # Return the scraped data instead of appending to a global list.
        try:
            page = urllib.request.urlopen(url).read()
            soup = BeautifulSoup(page, "html.parser")
            mylist = list()
            for anchor in soup.find_all('div', {'class': 'col-xs-8'})[:9]:
                mylist.append(anchor.text)
            return mylist
        except Exception:
            # Best effort: a failed fetch leaves None in that slot.
            return None
    
    @synchronized
    def run():
        finallist = [None] * N
        urllist = ['https://pythonexpress.in/workshop/' + str(i).zfill(3) for i in range(N)]
        for i, url in enumerate(urllist):
            print(url)
            # Index assignment is the form the deco wiki recommends for collecting results.
            finallist[i] = slow(url)
        return finallist
    
    if __name__ == "__main__":
        finallist = run()
        print(finallist)
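
For comparison, the same "return the data, then store it by index" pattern can be sketched without deco, using the standard library's `concurrent.futures`. This is a swapped-in technique, not what deco does internally. `fake_slow` is a hypothetical stand-in for `slow` that avoids network access, and the pool size of 4 is an arbitrary assumption.

```python
from concurrent.futures import ThreadPoolExecutor

N = 10


def fake_slow(url):
    """Hypothetical stand-in for slow(): returns data instead of appending to a global."""
    return [url.rsplit("/", 1)[-1]]  # last path segment, e.g. ['003']


def run(n=N):
    urls = ["https://pythonexpress.in/workshop/" + str(i).zfill(3) for i in range(n)]
    finallist = [None] * n
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Submit every call first, remembering which slot each result belongs in.
        futures = {i: pool.submit(fake_slow, url) for i, url in enumerate(urls)}
        for i, future in futures.items():
            finallist[i] = future.result()  # blocks until that call has finished
    return finallist


if __name__ == "__main__":
    print(run(3))  # [['000'], ['001'], ['002']]
```

Threads share memory, so the global-variable problem from the question would not appear here. Returning values and filling a preallocated list by index still keeps the results in URL order, and the same structure carries over to a process pool.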