Python parsing with BeautifulSoup raises an error

Date: 2017-08-23 15:14:29

Tags: python selenium parsing beautifulsoup lxml

When I run this code, I get an error on the line soup = BeautifulSoup(sources, "lxml"): TypeError: 'module' object is not callable

from selenium import webdriver
import bs4 as BeautifulSoup

def html_pin():
    browser = webdriver.Chrome()
    browser.get('http://FULL_URL')
    sources = browser.page_source
    browser.quit()
    soup = BeautifulSoup(sources, "lxml")
    print(soup)

html_pin()

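Can you tell me what is wrong with my code? I thought it was a data type error, but when I applied type(sources) the result was <class 'str'>.

2 answers:

Answer 0: (score: 1)

Try this: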

from bs4 import BeautifulSoup
from selenium import webdriver

def html_pin():
    browser = webdriver.Chrome()
    browser.get('http://FULL_URL')
    sources = browser.page_source
    browser.quit()
    soup = BeautifulSoup(sources, "lxml")
    print(soup)

html_pin()

Answer 1: (score: 1)

You are importing the bs4 module, giving it the custom alias BeautifulSoup, and then trying to call (instantiate) that alias. The alias refers to a module, not a class, so the call fails.

Instead, you need to import the BeautifulSoup class from the bs4 module:

from bs4 import BeautifulSoup
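The same mistake can be reproduced without any third-party package, which may make the cause easier to see. The sketch below is only an analogy: it uses the standard-library decimal module in place of bs4, and the variable names (alias_is_module, alias_is_callable, error_message, value) are made up for illustration. It checks what the alias actually refers to, triggers the error, and then applies the corrected import.

```python
import types

# Same pattern as `import bs4 as BeautifulSoup`:
# the name Decimal now refers to the decimal *module*, not the Decimal class.
import decimal as Decimal

# Inspect the thing being called, not the argument passed to it.
alias_is_module = isinstance(Decimal, types.ModuleType)
alias_is_callable = callable(Decimal)
print(type(Decimal), alias_is_callable)

# Calling a module raises the TypeError from the question.
try:
    Decimal("1.5")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Corrected import: bind the name to the class inside the module.
from decimal import Decimal

value = Decimal("1.5")
print(type(value), value)
```

In the question, checking type(BeautifulSoup) rather than type(sources) would have pointed at the alias in the same way.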

Note that a modern IDE really helps avoid this kind of problem. This is what I saw in PyCharm when I pasted the code into the editor:

[Image: PyCharm editor screenshot]