How do I export this scraped data to a CSV file?

Time: 2019-05-17 06:41:34

Tags: python-3.x selenium-webdriver web-scraping beautifulsoup export-to-csv

I'm new to coding and web scraping. I've watched many tutorials on YouTube, but I can't find a way to write this data to a CSV file. Can anyone help?

import pandas as pd
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup


options = Options()
options.add_argument("window-size=1400,600")
from fake_useragent import UserAgent
ua = UserAgent()
a = ua.random
user_agent = ua.random
print(user_agent)
options.add_argument(f'user-agent={user_agent}')


driver = webdriver.Chrome('/Users/raduulea/Documents/chromedriver', options=options)
driver.get('https://www.immoweb.be/fr/recherche/immeuble-de-rapport/a-vendre')

import time
time.sleep(10)

html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')

results = soup.find_all("div", {"class":"result-xl"})

for result in results:
    print(result.find("div", {"class":"title-bar-left"}).get_text())
    print(result.find("span", {"result-adress"}).get_text())
    print(result.find("div", {"class":"xl-price rangePrice"}).get_text())
    print(result.find("div", {"class":"xl-surface-ch"}).get_text())
    print(result.find("div", {"class":"xl-desc"}).get_text())

1 answer:

Answer 0 (score: 0)

Collect the scraped values in a pandas DataFrame, then export it to a CSV file.

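    import time

    import pandas as pd
    from bs4 import BeautifulSoup
    from fake_useragent import UserAgent
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    # Configure Chrome: fixed window size and a random user-agent string
    options = Options()
    options.add_argument("window-size=1400,600")
    user_agent = UserAgent().random
    print(user_agent)
    options.add_argument(f'user-agent={user_agent}')

    # Path to the local chromedriver binary. Passing it positionally works in
    # Selenium 3; Selenium 4 expects webdriver.Chrome(service=Service(path), options=options).
    driver = webdriver.Chrome('/Users/raduulea/Documents/chromedriver', options=options)
    driver.get('https://www.immoweb.be/fr/recherche/immeuble-de-rapport/a-vendre')

    # Give the page time to finish rendering before reading its HTML
    time.sleep(10)

    html = driver.page_source
    driver.quit()  # the browser is no longer needed once the HTML is captured
    soup = BeautifulSoup(html, 'html.parser')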

    results = soup.find_all("div", {"class":"result-xl"})

    # One list per CSV column
    title = []
    address = []
    price = []
    surface = []
    desc = []
    for result in results:
        # find() returns None when an element is missing, and get_text() would then
        # raise AttributeError; this loop assumes every listing has all five fields
        title.append(result.find("div", {"class":"title-bar-left"}).get_text().strip())
        address.append(result.find("span", {"class":"result-adress"}).get_text().strip())
        price.append(result.find("div", {"class":"xl-price rangePrice"}).get_text().strip())
        surface.append(result.find("div", {"class":"xl-surface-ch"}).get_text().strip())
        desc.append(result.find("div", {"class":"xl-desc"}).get_text().strip())

    # Build the table and write it out (pass index=False to omit the row-index column)
    df = pd.DataFrame({"Title": title, "Address": address, "Price": price, "Surface": surface, "Description": desc})
    df.to_csv("output.csv")
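If pandas is not otherwise needed, the same export can probably be done with the standard-library `csv` module instead. The sketch below is an alternative to the DataFrame approach, not part of the original answer. The names `FIELDS`, `text_or_blank` and `write_rows` are hypothetical, and the rows are made-up placeholders so the snippet runs without a browser. `text_or_blank` is meant to wrap each `result.find(...)` call so that a listing with a missing element yields an empty cell rather than an `AttributeError`.

```python
import csv
import io

# Column order for the CSV header (assumed to mirror the five scraped fields)
FIELDS = ["Title", "Address", "Price", "Surface", "Description"]


def text_or_blank(node):
    """Return the stripped text of a parsed element, or '' if it is None."""
    return node.get_text().strip() if node is not None else ""


def write_rows(rows, f):
    """Write a list of dicts (one per listing) to an open text file as CSV."""
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


# Placeholder rows with made-up values. In the scraper, each dict would be built
# inside the `for result in results:` loop, for example:
#   {"Title": text_or_blank(result.find("div", {"class": "title-bar-left"})), ...}
rows = [
    {"Title": "Example listing A", "Address": "1000 Example City",
     "Price": "100000", "Surface": "120", "Description": "Placeholder text"},
    {"Title": "Example listing B", "Address": "",
     "Price": "", "Surface": "", "Description": ""},
]

# Write to an in-memory buffer here so the result can be printed directly
buf = io.StringIO()
write_rows(rows, buf)
print(buf.getvalue())

# To write a real file instead:
# with open("output.csv", "w", newline="", encoding="utf-8") as f:
#     write_rows(rows, f)
```

Collecting one dict per listing keeps the fields of a listing together, so a missing value cannot shift the columns out of alignment the way unequal-length lists could.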

Output: your CSV file will look like this.

Output CSV
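One way to check the result without opening a spreadsheet is to read the first few rows back. This is a minimal sketch using `csv.DictReader`. The helper name `preview_csv` is hypothetical, and the sample text is made-up placeholder data standing in for `output.csv`. The unnamed first column is the row index that `DataFrame.to_csv` writes by default.

```python
import csv
import io
from itertools import islice


def preview_csv(f, limit=5):
    """Return up to `limit` rows of an open CSV file as a list of dicts."""
    return list(islice(csv.DictReader(f), limit))


# Placeholder content standing in for output.csv (values are made up)
sample = io.StringIO(
    ",Title,Address,Price,Surface,Description\n"
    "0,Example listing A,1000 Example City,100000,120,Placeholder text\n"
    "1,Example listing B,2000 Example Town,200000,250,Placeholder text\n"
)

for row in preview_csv(sample):
    print(row["Title"], "|", row["Price"], "|", row["Surface"])

# With the real file:
# with open("output.csv", newline="", encoding="utf-8") as f:
#     for row in preview_csv(f):
#         print(row)
```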
