Question

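I've been working on a small program, and I need to do the following:

Take a CSV file, 'domains_prices.csv', containing a column of domains and a price for each one, e.g.: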

http://www.example1.com,$20
http://www.example2.net,$30

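and so on.

Then there is a second file, 'orders_list.csv', which is just a single column of blog post URLs from the same domains listed in the first file, e.g.: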

http://www.exmaple2.net/blog-post-1
http://www.example1.com/some-article
http://www.exmaple3.net/blog-post-feb-19

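and so on.

I need to check the full URLs in orders_list against the domains in the first file, look up the price of a blog post on that domain, and then write all the blog post URLs to a new file with the price next to each one, e.g.: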

http://www.example2.net/blog-post-1, $20

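The output file would then have a total amount at the end.

My plan was to create a dict for domains_prices, with k, v being domain & price, then put all the URLs from orders_list into a list, and then compare the elements of that list against the prices in the dict.

Here is my code. I'm stuck at the end: I have parsed_orders_list, which seems to return each URL as its own single-item list, so I think I need to get all of those URLs into one list?

The commented-out code at the very end is what I intended to do once I have a proper list of URLs: compare them against the k, v pairs of the dict. I'm not sure whether that is correct either.

Please note this is also the first complete Python program I have ever written from scratch, so if it's horrible, that's why :)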

import csv
from urlparse import urlparse

#get the csv file with all domains and prices in
reader = csv.reader(open("domains_prices.csv", 'r'))

#get all the completed blog post urls
reader2 = csv.reader(open('orders_list.csv', 'r'))

domains_prices={}


orders_list = []




for row in reader2:
    #put the blog post urls into a list
    orders_list.append(','.join(row))


for domain, price in reader:
    #strip the domains
    domain = domain.replace('http://', '').replace('/','')

    #insert the domains and prices into the dictionary
    domains_prices[domain] = price


for i in orders_list:
    #iterate over the blog post urls orders_list and
    #then parse them with urlparse
    data = urlparse(i)

    #use netloc to get just the domain from each blog post url
    parsed_orders =  data.netloc


    parsed_orders_list = parsed_orders.split()


    print parsed_orders_list


"""
for k in parsed_orders:
    if k in domains_prices:
        print k, domains_prices[k]
"""
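
To illustrate the symptom described above, here is a minimal sketch. It assumes Python 3 (so `urllib.parse` stands in for the `urlparse` module used in the question) and uses two sample URLs in place of the real 'orders_list.csv'; the names `first_items` and `all_domains` are made up for the illustration.

```python
from urllib.parse import urlparse

# Sample data standing in for the contents of orders_list.csv (assumption).
orders_list = [
    "http://www.example2.net/blog-post-1",
    "http://www.example1.com/some-article",
]

# Rebinding a name inside the loop keeps only the most recent value, and
# str.split() on a netloc with no whitespace gives a one-element list.
for url in orders_list:
    parsed_orders = urlparse(url).netloc
    parsed_orders_list = parsed_orders.split()
    print(parsed_orders_list)

# After the loop, parsed_orders is a plain string (the last netloc), so
# iterating over it yields single characters rather than domains.
first_items = list(parsed_orders)[:3]

# Appending to one list created before the loop collects every domain.
all_domains = []
for url in orders_list:
    all_domains.append(urlparse(url).netloc)
```

This suggests why each URL shows up as its own list, and why the commented-out loop over `parsed_orders` would probably not match any dict key as written.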

Answer 1

With help from others I've figured it out. I made the following changes to the 'for i in orders_list:' section:

parsed_orders = []

for i in orders_list:
    #iterate over the blog post urls orders_list and
    #then parse them with urlparse
    data = urlparse(i)

    #use netloc to get just the domain from each blog post url then put each netloc url into a list
    parsed_orders.append(data.netloc)


#print parsed_orders - to check that I'm getting a list of netloc urls back

#Iterate over the list of urls and dict of domains and prices to match them up
for k in parsed_orders:
    if k in domains_prices:
        print k, domains_prices[k]
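
The loop above prints the domain and its price. The question also asks for each full blog post URL with its price in an output file, followed by a total. A minimal sketch of one way to do that follows. It assumes Python 3, prices that are whole dollar amounts written like `$20`, and in-memory text standing in for the two CSV files; `load_prices` and `price_orders` are hypothetical helper names, not part of the original program.

```python
import csv
import io
from urllib.parse import urlparse


def load_prices(rows):
    """Build {domain: price} from rows like ['http://www.example1.com', '$20']."""
    prices = {}
    for row in rows:
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        # netloc drops the scheme and any path, leaving e.g. 'www.example1.com'
        domain = urlparse(row[0].strip()).netloc
        # Assumption: whole-dollar prices with a leading '$'
        prices[domain] = int(row[1].strip().lstrip("$"))
    return prices


def price_orders(order_urls, prices):
    """Return ([(url, price), ...], total); URLs on unknown domains are skipped."""
    priced = []
    for url in order_urls:
        domain = urlparse(url).netloc
        if domain in prices:
            priced.append((url, prices[domain]))
    total = sum(price for _, price in priced)
    return priced, total


# In-memory stand-ins for domains_prices.csv and orders_list.csv (assumption).
domains_csv = io.StringIO(
    "http://www.example1.com,$20\nhttp://www.example2.net,$30\n"
)
orders_csv = io.StringIO(
    "http://www.example2.net/blog-post-1\nhttp://www.example1.com/some-article\n"
)

prices = load_prices(csv.reader(domains_csv))
order_urls = [row[0].strip() for row in csv.reader(orders_csv) if row]
priced, total = price_orders(order_urls, prices)

# Write one 'url,price' row per order, then a final total row.
output = io.StringIO()
writer = csv.writer(output)
for url, price in priced:
    writer.writerow([url, "$%d" % price])
writer.writerow(["Total", "$%d" % total])
print(output.getvalue())
```

To work with real files, the `io.StringIO` objects could be replaced with `open('domains_prices.csv', newline='')` and similar calls. Keeping the full URL alongside the looked-up price, rather than only the netloc, is what lets the output list the blog post URLs themselves.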

CSV parsing program & how to flatten multiple lists into a single list
