Writing Beautiful Soup output to CSV

Time: 2016-04-20 21:08:20

Tags: python

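I want to write prices and their corresponding addresses to a CSV file that I can open in Excel. So far I have this code, which gives the output shown in the photo below.

What I want is one column with the price first and a second column with the address.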

from bs4 import BeautifulSoup
import requests
import csv


number = "1"
url = "http://www.trademe.co.nz/browse/categoryattributesearchresults.aspx?cid=5748&search=1&v=list&134=1&nofilters=1&originalsidebar=1&key=1654466070&page=" + number + "&sort_order=prop_default&rptpath=350-5748-3399-"
r= requests.get(url)
soup = BeautifulSoup(r.content)


output_file= open("output.csv","w")

price = soup.find_all("div",{"class":"property-card-price-container"})

address = soup.find_all("div",{"class":"property-card-subtitle"})


n = 1
while n != 150:
    b = (price[n].text)
    b = str(b)
    n = n + 1
    output_file.write(b)

output_file.close()


2 answers:

Answer 0 (score: 3)

Maybe something like this?

from bs4 import BeautifulSoup
import requests 
import csv
....
r = requests.get(url)
soup = BeautifulSoup(r.content)
price = soup.find_all("div",{"class":"property-card-price-container"})
address = soup.find_all("div",{"class":"property-card-subtitle"})

dataset = [(x.text, y.text) for x,y in zip(price, address)]

with open("output.csv", "w", newline='') as csvfile:
    writer = csv.writer(csvfile)
    for data in dataset[:150]: #truncate to 150 rows
        writer.writerow(data)
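The pairing and writing steps above can be tried without hitting the site. The following is a minimal sketch, assuming the scraped text is already available as two lists of strings; the helper name `rows_to_csv` and the sample values are made up for illustration, and the `strip()` calls are an addition that the answer's code does not make.

```python
import csv
import io
from itertools import islice


def rows_to_csv(prices, addresses, limit=150):
    """Return CSV text with one price/address pair per row (hypothetical helper)."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer)
    # zip stops at the shorter list; islice caps the number of rows written.
    pairs = zip(prices, addresses)
    writer.writerows(
        (price.strip(), address.strip()) for price, address in islice(pairs, limit)
    )
    return buffer.getvalue()


# Made-up sample strings standing in for the .text of the scraped tags.
print(rows_to_csv(["\n$450,000 ", "$620,000"], ["12 Example St", " 34 Sample Rd\n"]))
```

Because `csv.writer` quotes any field that contains a comma, a price such as `$450,000` still lands in a single column when the file is opened in Excel.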

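Answer 1 (score: 0)

There are a few problems with your code. Collecting the prices and addresses into separate lists is fragile: if the site changes the order of the items, or something similar happens, the two lists no longer line up and the values get mixed up. When scraping entries like these, it is important to find the larger enclosing container first and then narrow down from there.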
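One way the two lists could drift apart, sketched with plain Python data rather than real HTML: the listings below are hypothetical, and the missing price on the second one is an assumed scenario, not something observed on the site.

```python
# Hypothetical listings: the second one has no price element on the page.
listings = [
    {"price": "$450,000", "address": "12 Example St"},
    {"price": None, "address": "34 Sample Rd"},
    {"price": "$620,000", "address": "56 Demo Ave"},
]

# Approach 1: gather each field into its own page-wide list, then pair by position.
prices = [item["price"] for item in listings if item["price"] is not None]
addresses = [item["address"] for item in listings]
paired_by_position = list(zip(prices, addresses))

# Approach 2: handle one listing at a time, so a missing price stays with its address.
paired_by_listing = [(item["price"] or "", item["address"]) for item in listings]

print(paired_by_position)
print(paired_by_listing)
```

In the first approach every price after the gap is attached to the wrong address and the last listing is dropped, while the per-listing approach keeps each row intact and leaves the missing price blank.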

Unfortunately, the URL you provided no longer works, so I scraped a different set of listings from the same site instead:

from bs4 import BeautifulSoup
import requests
import csv

url = 'http://www.trademe.co.nz/property/residential-property-for-sale'
url += '/waikato/view-list'

r = requests.get(url)
soup = BeautifulSoup(r.content, 'html5lib')

with open('output.csv', 'w', newline='') as csvfile:

    propertyWriter = csv.writer(csvfile, quoting=csv.QUOTE_ALL)

    for listing in soup.find_all('div',
                                 {'class': 'property-list-view-card'}):
        price = listing.find_all('div',
                                 {'class': 'property-card-price-container'})
        address = listing.find_all('div',
                                   {'class': 'property-card-subtitle'})

        propertyWriter.writerow([price[0].text.strip(),
                                 address[0].text.strip()])