Python Beautiful Soup code worked yesterday but gives an error message today

Time: 2018-03-29 16:59:20

Tags: python-3.x web-scraping beautifulsoup

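I used Python 3.6 to create a very basic web scraper that takes a list of URLs stored in a CSV file and returns information from each page. Yesterday it was working.

Today it no longer works, even with the same CSV of URLs I used before. Instead, I get an error message.

Here is the code I am using: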

import pandas as pd
from bs4 import BeautifulSoup as bs
from urllib.request import urlopen
import time


dataset = pd.read_csv('read_csv.csv')
dataset = dataset.iloc[:, 0].str.strip('[]')


data = []
for i in dataset:
    page = urlopen(i)
    time.sleep(1)  # pause for one second between requests
    soup = bs(page, 'html.parser')
    title = soup.find(attrs = {'class': 'title'})
    title = title.text.strip()
    content = soup.find(attrs = {'class': 'articleContent articleTruncate'}, itemprop = 'text')
    content = content.text.strip()
    date = soup.find(attrs = {'class': 'date'})
    date = date.text.strip()
    author = soup.find(attrs = {'class': 'authorInfo'})
    author = author.text.strip()
    data.append((title, date, author, content))

Here is the console error message:

Traceback (most recent call last):

  File "<ipython-input-26-3a1fc158da11>", line 6, in <module>
    title = title.text.strip()

AttributeError: 'NoneType' object has no attribute 'text'
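
For context on the traceback: `soup.find` returns `None` when no element matches, so calling `.text` on the result raises this `AttributeError`. That suggests the fetched page had no element with class `title`, though the question does not say why (a changed page layout or a different server response are only guesses). The sketch below is a minimal illustration, not a confirmed fix: `text_or_none` is a hypothetical helper and the inline HTML is a made-up stand-in for a page that lacks the title element.

```python
from bs4 import BeautifulSoup


def text_or_none(soup, **find_kwargs):
    """Return the stripped text of the first match, or None if nothing matches.

    Hypothetical helper (not part of BeautifulSoup): it forwards its keyword
    arguments to soup.find and checks the result before reading .text.
    """
    tag = soup.find(**find_kwargs)
    return tag.text.strip() if tag is not None else None


# Made-up page: it has a "date" element but no element with class "title".
html = '<html><body><span class="date"> 2018-03-29 </span></body></html>'
soup = BeautifulSoup(html, 'html.parser')

title = text_or_none(soup, attrs={'class': 'title'})  # no match, so None
date = text_or_none(soup, attrs={'class': 'date'})    # match, stripped text

print(title, date)
```

With a check like this the loop keeps running and records `None` for a missing field, which makes it easier to see which URLs return a page without the expected elements.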

0 answers:

No answers