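I've just started learning Python and Beautiful Soup, so please be gentle. The problem I've been struggling with is that, when scraping, the following code returns only one result: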
import bs4
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.dailyfaceoff.com/teams/pittsburgh-penguins/line-combinations/'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
containers = page_soup.findAll("div", {"class": "team-line-combination-wrap"})
for container in containers:
    name_container = container.findAll("span", {"class": "player-name"})
    name = name_container[0].text
    print(name)
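I'm just not sure what is causing this. When I inspect name_container, it contains all of the results I found with findAll, but when the loop runs, it returns only one result. Any help or guidance would be greatly appreciated.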
Answer 0 (score: 0)
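You need to iterate over every element of name_container, not just over containers. Indexing with name_container[0] picks only the first matching span in each container: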
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup

my_url = 'https://www.dailyfaceoff.com/teams/pittsburgh-penguins/line-combinations/'
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
data = page_soup.findAll("div", {"class": "team-line-combination-wrap"})
for container in data:
    # Every player-name span inside this container
    name_container = container.findAll("span", {"class": "player-name"})
    # Inner loop: print each span instead of only name_container[0]
    for cont in name_container:
        print(cont.text)
Output:
Jake Guentzel
Sidney Crosby
Bryan Rust
.
.
.
Matt Murray
Olli Maatta
Process finished with exit code 0
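To see the difference without hitting the network, here is a minimal offline sketch. The SAMPLE_HTML markup is a hypothetical stand-in that only reuses the class names from the question. The real page's structure, and which players share a container, are assumptions. The helper names first_name_per_container and all_names are made up for illustration. The second helper uses a CSS selector via select(), which is an alternative to the nested loop rather than the method from the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the class names come from the question;
# the grouping of players into containers is invented for this sketch.
SAMPLE_HTML = """
<div class="team-line-combination-wrap">
  <span class="player-name">Jake Guentzel</span>
  <span class="player-name">Sidney Crosby</span>
  <span class="player-name">Bryan Rust</span>
</div>
<div class="team-line-combination-wrap">
  <span class="player-name">Matt Murray</span>
</div>
"""


def first_name_per_container(html):
    """Mirror the question's code: keep only index 0 of each container."""
    soup = BeautifulSoup(html, "html.parser")
    names = []
    for container in soup.find_all("div", {"class": "team-line-combination-wrap"}):
        spans = container.find_all("span", {"class": "player-name"})
        if spans:  # guard against containers with no player spans
            names.append(spans[0].text)
    return names


def all_names(html):
    """Collect every player-name span nested in a wrapper div."""
    soup = BeautifulSoup(html, "html.parser")
    selector = "div.team-line-combination-wrap span.player-name"
    return [span.get_text(strip=True) for span in soup.select(selector)]


first_only = first_name_per_container(SAMPLE_HTML)
everyone = all_names(SAMPLE_HTML)
print(first_only)
print(everyone)
```

With this sample, first_only holds one name per wrapper div, while everyone holds all four names, which is the behaviour the nested loop in the answer produces.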