Web scraping in Python, empty output

Asked: 2018-07-26 07:33:27

Tags: python web-scraping beautifulsoup

I need help with the following code:

import csv
import requests
from bs4 import BeautifulSoup
import datetime

filename = "imob_" + datetime.datetime.now().strftime("%Y-%m-%d-%H-%M")+".csv"
with open(filename, "w+") as f:
    writer = csv.writer(f)
    writer.writerow(["Localizare","Pret","Data"])

    for i in range(1,100):
        r = requests.get("https://www.imobiliare.ro/inchirieri-case-vile/brasov?pagina="+format(i))

        soup = BeautifulSoup(r.text, "html.parser")
        array_price= soup.find_all('div', class_='pret')
        array_desc=soup.find_all('h2', class_='titlu-anunt hidden-xs',text=True)
        for iterator in range(0,len(array_price)):
            localizare = array_desc[iterator].text.strip()
            pret = array_price[iterator].text.strip()

            writer.writerow([localizare, pret, datetime.datetime.now()]) 

The output is empty. Can anyone give me a suggestion? Thanks.

1 answer:

Answer 0 (score: 0)

You have a couple of problems:

First, as mentioned in the comments, the class price does not exist. You can use pret, but it is easier to use soup.find_all('span', class_="pret-mare").

Second, array_desc=soup.find_all('h2', class_='titlu-anunt hidden-xs',text=True) returns an empty list. I removed text=True and it started working.
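A likely explanation for the second point, sketched on invented markup rather than the real page: when a text filter is combined with a tag name, Beautiful Soup keeps only tags whose .string is set, and .string is None as soon as a tag has more than one child (for example whitespace plus a nested link). The HTML below is an assumption made up for illustration, and the sketch uses string=True, the newer name for the text argument.

```python
from bs4 import BeautifulSoup

# Invented markup (an assumption, not copied from the real site):
# the first <h2> wraps its title in nested tags, the second holds plain text.
html = """
<h2 class="titlu-anunt hidden-xs">
  <a href="#"><span>Casa in Brasov</span></a>
</h2>
<h2 class="titlu-anunt hidden-xs">Vila simpla</h2>
"""

soup = BeautifulSoup(html, "html.parser")

# With a string filter, only tags whose .string is not None are kept.
with_string = soup.find_all("h2", class_="titlu-anunt hidden-xs", string=True)

# Without the filter, every matching <h2> is returned.
without_filter = soup.find_all("h2", class_="titlu-anunt hidden-xs")

print(len(with_string), len(without_filter))
# The nested title is still reachable through get_text().
print([h2.get_text(strip=True) for h2 in without_filter])
```

If every h2 on the real page is built like the first one here, the filtered search would come back empty, which would match the behaviour described above.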

import csv
import requests
from bs4 import BeautifulSoup
import datetime

filename = "imob_" + datetime.datetime.now().strftime("%Y-%m-%d-%H-%M")+".csv"
# newline="" is what the csv module expects; it avoids blank rows on Windows
with open(filename, "w+", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["Localizare","Pret","Data"])

    # Walk through result pages 1 to 99
    for i in range(1,100):
        r = requests.get("https://www.imobiliare.ro/inchirieri-case-vile/brasov?pagina="+format(i))

        soup = BeautifulSoup(r.text, "html.parser")
        # Prices: <span class="pret-mare"> instead of the div used in the question
        array_price = soup.find_all('span', class_="pret-mare")
        # Titles: same search as in the question, but without text=True
        array_desc=soup.find_all('h2', class_='titlu-anunt hidden-xs')
        for iterator in range(0,len(array_price)):
            localizare = array_desc[iterator].text.strip()
            pret = array_price[iterator].text.strip()

            writer.writerow([localizare, pret, datetime.datetime.now()])
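One thing to watch in the inner loop: it indexes array_desc by the length of array_price, so a page with more prices than titles would raise an IndexError. A minimal sketch of pairing the two lists with zip, which simply stops at the shorter one. The helper name pair_listings and the sample strings are hypothetical and stand in for the .text of the scraped tags; an in-memory buffer replaces the CSV file.

```python
import csv
import datetime
import io


def pair_listings(titles, prices, stamp):
    """Pair each title with its price; zip stops at the shorter list."""
    return [
        [title.strip(), price.strip(), stamp.isoformat()]
        for title, price in zip(titles, prices)
    ]


# Made-up sample values (assumptions, not real listings).
titles = ["  Casa in Brasov ", "Vila in Brasov", "Anunt fara pret"]
prices = [" 500 EUR ", "1.200 EUR"]
stamp = datetime.datetime(2018, 7, 26, 7, 33)

rows = pair_listings(titles, prices, stamp)

buffer = io.StringIO()  # in-memory stand-in for the output file
writer = csv.writer(buffer)
writer.writerow(["Localizare", "Pret", "Data"])
writer.writerows(rows)
print(buffer.getvalue())
```

The third title is dropped silently because it has no matching price. Whether that is acceptable depends on the page, so logging the two lengths when they differ may be worth adding.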