Unable to scrape a specific website with Python BeautifulSoup

Time: 2017-09-01 05:49:45

Tags: python web-scraping beautifulsoup

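I have been trying to scrape this web page with BeautifulSoup, without success. Can anyone help? I'm not sure whether something is wrong with the page or with my code.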

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup

my_url = "https://www.cea.gov.sg/Custom/CEA/PublicRegister/Page/PublicRegisterDetail.aspx?UserId=ae0cdf1d-a30c-4c8c-9f80-b2cec17b4bd9"

uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()
page_soup = Soup(page_html, "html.parser")
nameList2 = page_soup.findAll("span")

print (nameList2.string[1])

1 answer:

Answer 0 (score: 0)

You can try it like this. I didn't run into any problems.

import requests
from bs4 import BeautifulSoup

url = "https://www.cea.gov.sg/Custom/CEA/PublicRegister/Page/PublicRegisterDetail.aspx?UserId=ae0cdf1d-a30c-4c8c-9f80-b2cec17b4bd9"
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

# Within each ".form-wrap" element, look up the two fields by their ids.
for item in soup.select(".form-wrap"):
    name = item.select_one("#FtPublicRegisterDetail_LblName").get_text()
    agent_name = item.select_one("#FtPublicRegisterDetail_LblEstAgentName").get_text()
    print(name, agent_name)

Result:

A R N MADANAGOPALAN (MADAN) PROPNEX REALTY PTE LTD

If you would rather use only "span":

import requests
from bs4 import BeautifulSoup

response = requests.get("https://www.cea.gov.sg/Custom/CEA/PublicRegister/Page/PublicRegisterDetail.aspx?UserId=ae0cdf1d-a30c-4c8c-9f80-b2cec17b4bd9")
soup = BeautifulSoup(response.text, "html.parser")

# Iterate over the matched tags directly instead of re-running the selector for every index.
for span in soup.select("span"):
    print(span.text)
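
For what it's worth, two things in the question's snippet would fail no matter what the page returns: the class is imported as BeautifulSoup but called as Soup, and .string is read from the whole find_all result instead of from one tag. Here is a minimal offline sketch of the corrected pattern. The HTML is a made-up stand-in for the real page: only the two ids come from the code above, while the surrounding markup and the values are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page. The ids are the ones used in the
# answer; the markup around them and the text values are invented.
SAMPLE_HTML = """
<div class="form-wrap">
  <span id="FtPublicRegisterDetail_LblName">EXAMPLE NAME</span>
  <span id="FtPublicRegisterDetail_LblEstAgentName">EXAMPLE AGENCY PTE LTD</span>
</div>
"""

# The import is `BeautifulSoup`, so that is the name to call (not `Soup`).
page_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find_all returns a list-like ResultSet of tags.
name_list = page_soup.find_all("span")

# Index into the list first, then read the text of that single tag.
second_span_text = name_list[1].string
print(second_span_text)

# Reading `.string` from the whole list raises AttributeError.
try:
    name_list.string
    list_has_string = True
except AttributeError:
    list_has_string = False
print("ResultSet has .string:", list_has_string)
```

Whether the live page needs anything beyond this (headers, cookies, and so on) is not something the sketch can show; it only covers the two mistakes visible in the posted code.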