Question

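I am doing web scraping. I have a list of about 140 page titles, but after writing it to a CSV the titles come out in a very strange format. In Python (using Spyder) I can see the correct result; it only becomes strange once it has been written to the CSV.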

Am I doing something wrong in my code that writes the file?

Output in the CSV (a space after every letter and quotes after every word):

H e r e ' s " " W h y " " T h e r e " " W a s " " a n " " E m p t y " " S e a t " " N e x t " " t o " " P r i n c e " " W i l l i a m " " a t " " t h e " " R o y a l " " W e d d i n g

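In addition, some rows of the CSV also have data in the second column. I am working with a single list of 140 page titles, and it looks correct in Spyder, so how and why do some page titles end up in the second column? Any ideas?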

Answer 1

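I don't understand why you still haven't provided the additional information that I (and others) asked for, so this is an educated guess at best. It implements what I suggested in one of my comments, namely converting h_list into a list of lists that each contain a single string: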

import csv

h_list = [
    "Here's Why There Was an Empty Seat Next to Prince William at the Royal Wedding",
    "NASA wrestles with what to do with International Space Station after 2024",
    "Father-son team pilot plane from Seattle to Amsterdam",
    # etc...
]

with open('headlines.csv', 'w', newline='') as o_file:
    writer = csv.writer(o_file)
    # Make each line in h_list a row with a single headline string in it.
    writer.writerows([headline] for headline in h_list)

print('done')

Contents of headlines.csv after execution:

Here's Why There Was an Empty Seat Next to Prince William at the Royal Wedding
NASA wrestles with what to do with International Space Station after 2024
Father-son team pilot plane from Seattle to Amsterdam

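I'm not sure this is what you want in the csv file, because that format doesn't make much sense when there is only one value (field) in each row, so no delimiter is needed. But, if nothing else, maybe it will help you figure out the right thing to do.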
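A possible explanation for the symptom in the question, sketched under assumptions: csv.writer treats each row as an iterable of fields, so a bare string passed as a row is written one character per field. The delimiter=' ' setting below is only a guess that happens to reproduce the reported output, since the asker's writing code is not shown, and write_rows is a hypothetical helper used for illustration.

```python
import csv
import io


def write_rows(rows, **writer_kwargs):
    """Write rows with csv.writer into memory and return the CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, **writer_kwargs).writerows(rows)
    return buffer.getvalue()


headline = "Empty Seat"

# Each row is a bare string, so every character becomes its own field.
# With a space delimiter (an assumption), the space character itself
# has to be quoted, which yields the " " between words.
broken = write_rows([headline], delimiter=' ')

# Each row is a one-element list, so the whole headline is one field.
fixed = write_rows([[headline]])

print(repr(broken))  # 'E m p t y " " S e a t\r\n'
print(repr(fixed))   # 'Empty Seat\r\n'
```

Wrapping each headline in its own list, as the answer does, is what keeps the string from being iterated character by character.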

Answer 2

We can avoid the loop and use a one-line solution instead:

Convert h_list to a DataFrame df, then save it in CSV format with df.to_csv:

import pandas as pd

df = pd.DataFrame({'Headline': h_list})
df.to_csv('file.csv', index=False)

The output file.csv will contain a Headline header row followed by the list elements, each on its own row.
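Whichever approach is used, one way to check the result is to read the CSV back and count the fields in each row. The sketch below uses only the standard library and an in-memory buffer; the headline containing a comma is made up for illustration, and count_fields is a hypothetical helper. It suggests that a properly written single-column file keeps a headline with a comma in one field, which may be relevant to the titles that showed up in a second column.

```python
import csv
import io


def count_fields(csv_text):
    """Return the number of fields found in each row of csv_text."""
    reader = csv.reader(io.StringIO(csv_text, newline=''))
    return [len(row) for row in reader]


# Hypothetical headline that contains a comma.
h_list = ["Father, son pilot plane from Seattle to Amsterdam"]

buffer = io.StringIO()
# One single-element row per headline, as in Answer 1.
csv.writer(buffer).writerows([headline] for headline in h_list)
csv_text = buffer.getvalue()

field_counts = count_fields(csv_text)
print(csv_text)      # the headline is wrapped in double quotes
print(field_counts)  # [1]
```

The writer quotes any field that contains the delimiter, so the comma inside the headline does not start a new column when the file is parsed as CSV.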

Writing output CSV has wrong/weird format
