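I am trying to use the concurrency provided by the deco module. The code works without multiple threads, as shown in the answer to this question:
Extract specific columns from a given webpage
The following code, however, never adds any elements to finallist (it stays empty). Inside the scope of the slow function it does produce results, as the print statement shows. So why is the outer list empty?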
import urllib.request
from bs4 import BeautifulSoup
from deco import concurrent, synchronized

finallist=list()
urllist=list()

@concurrent
def slow(url):
    #print (url)
    try:
        page = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(page)
        mylist=list()
        for anchor in soup.find_all('div', {'class':'col-xs-8'})[:9]:
            mylist.append(anchor.text)
        urllist.append(url)
        finallist.append(mylist)
        #print (mylist)
        print (finallist)
    except:
        pass

@synchronized
def run():
    finallist=list()
    urllist=list()
    for i in range(10):
        url='https://pythonexpress.in/workshop/'+str(i).zfill(3)
        print (url)
        slow(url)
    slow.wait()
Answer 0 (score: 1)
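I refactored your code to work with the module. By default, deco runs a @concurrent function in separate worker processes, so anything slow appends to a global list lands in the worker's copy and never reaches the list in the parent process. I fixed two of the common pitfalls outlined on the deco wiki: the worker no longer touches global variables and returns its result instead, and the caller stores each result with an index assignment (finallist[i] = ...).
Here is the result: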
import urllib.request
from bs4 import BeautifulSoup
from deco import concurrent, synchronized

N = 10  # number of workshop pages to fetch

@concurrent
def slow(url):
    # Runs in a worker; hands its result back through the return value.
    try:
        page = urllib.request.urlopen(url).read()
        soup = BeautifulSoup(page, "html.parser")
        mylist = list()
        for anchor in soup.find_all('div', {'class': 'col-xs-8'})[:9]:
            mylist.append(anchor.text)
        return mylist
    except Exception:
        # A failed fetch or parse leaves None in that slot of the result list.
        return None

@synchronized
def run():
    finallist = [None] * N
    urllist = ['https://pythonexpress.in/workshop/' + str(i).zfill(3) for i in range(N)]
    for i, url in enumerate(urllist):
        print(url)
        # Index assignment lets deco fill in the slot once the call finishes.
        finallist[i] = slow(url)
    return finallist

if __name__ == "__main__":
    finallist = run()
    print(finallist)
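The question also collected the URLs that were fetched successfully in urllist, while run() above returns only finallist, with None in the slot of any page that failed. A minimal sketch of recovering that pairing afterwards, assuming the URL list is rebuilt in the same order as in run(); the helper name pair_results and the sample data are hypothetical and only stand in for real output:

```python
from typing import List, Optional, Tuple


def pair_results(
    urls: List[str], results: List[Optional[List[str]]]
) -> List[Tuple[str, List[str]]]:
    """Pair each URL with its scraped rows, skipping failed fetches (None)."""
    return [(url, rows) for url, rows in zip(urls, results) if rows is not None]


# Hypothetical sample data standing in for the list returned by run().
sample_urls = [
    "https://pythonexpress.in/workshop/000",
    "https://pythonexpress.in/workshop/001",
    "https://pythonexpress.in/workshop/002",
]
sample_results = [["Title A", "Pune"], None, ["Title B", "Delhi"]]

paired = pair_results(sample_urls, sample_results)
for url, rows in paired:
    print(url, rows)
```

Because the pairing happens in the parent process after run() has returned, it does not depend on any state shared with the workers.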