How do I save the extracted information into separate txt files?

Date: 2020-10-30 11:55:59

Tags: python pandas csv beautifulsoup

I built a script that extracts information from my website's blog (the URLs are listed in an Excel file, so I read them from there). I would like the information extracted for each URL to be saved in its own .txt file (so far I have only managed to save everything into one). How can I do that? I don't even know where to start, I'm quite lost here :( Any help would be appreciated.

import urllib.request
from bs4 import BeautifulSoup
import pandas as pd
import time

i = []

# the URLs to crawl are listed in the 'Address' column of the Excel file
crawl = pd.read_excel('C:/Users/Acer/Desktop/internal_all2.xlsx')
addresses = crawl['Address'].tolist()

for row in addresses:
    url = row
    time.sleep(5)  # pause between requests
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response, 'html.parser')
    content = soup.find_all('p')

    for content2 in content:
        print(url, content2)
        i.append([url, content2])

    # every iteration overwrites the same single file
    df = pd.DataFrame(i)
    df.to_csv('C:/Users/Acer/Desktop/scripts/content/test.txt', index=False)

1 answer:

Answer 0 (score: 1)

Just append a string to the file name:

import urllib.request
from bs4 import BeautifulSoup
import pandas as pd
import time

i = []

crawl = pd.read_excel('C:/Users/Acer/Desktop/internal_all2.xlsx')
addresses = crawl['Address'].tolist()

for row in addresses:
    url = row
    time.sleep(5)
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response, 'html.parser')
    content = soup.find_all('p')

    for content2 in content:
        print(url, content2)
        i.append([url, content2])

    # the URL is now part of the output file name, so each URL gets its own file
    # (note that i is shared across iterations, so each file also repeats the
    # rows collected for earlier URLs)
    df = pd.DataFrame(i)
    df.to_csv(f'C:/Users/Acer/Desktop/scripts/content/test_{url}.txt', index=False)
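One caveat: a raw URL contains characters such as :// and / that are not valid in a Windows file name, and because i is shared across iterations each file also repeats the rows of the previous URLs. Below is a minimal sketch of a variant that sanitizes the URL and keeps one list per page; the safe_name helper and the get_text call are illustrative additions, not part of the original code:

import re
import urllib.request
from bs4 import BeautifulSoup
import pandas as pd
import time

def safe_name(url):
    # replace every character that is not a letter, digit, dot or dash
    # with an underscore so the URL can be used as a file name
    return re.sub(r'[^0-9A-Za-z.-]+', '_', url)

crawl = pd.read_excel('C:/Users/Acer/Desktop/internal_all2.xlsx')
addresses = crawl['Address'].tolist()

for url in addresses:
    time.sleep(5)
    response = urllib.request.urlopen(url)
    soup = BeautifulSoup(response, 'html.parser')

    # collect the paragraphs of this page only, so each file holds
    # just the content extracted from its own URL
    rows = [[url, p.get_text(strip=True)] for p in soup.find_all('p')]

    df = pd.DataFrame(rows, columns=['url', 'paragraph'])
    df.to_csv(f'C:/Users/Acer/Desktop/scripts/content/{safe_name(url)}.txt', index=False)

to_csv writes a comma-separated text file; if you prefer plain text, you could instead open each .txt file yourself and write the paragraphs line by line.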
