Unable to get the HTML source of the Zomato website with Selenium and Python

Time: 2021-07-27 12:45:54

Tags: python selenium web-scraping beautifulsoup

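I am trying to scrape the Zomato website to collect reviews, but I cannot get the source HTML from the site. I am trying to get the review box, but the result is null or "NoneType". Here is my code: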

from bs4 import BeautifulSoup
import requests
import re
import pandas as pd
from selenium import webdriver
import codecs
import os
import numpy as np
import pandas as pd
#import nltk
#import matplotlib.pyplot as plt
#from tensorflow import keras
os.system('cls')


PATH = "C:\\Users\\HCES\\Downloads\\chromedriver.exe"
driver = webdriver.Chrome(PATH)
i=1
html = driver.get("https://www.zomato.com/beirut/divvy-ashrafieh/reviews?page={}&sort=dd&filter=reviews-dd".format(i))
driver.quit()
#soup=BeautifulSoup(html,"lxml")
#tag=soup.find_all('div', class_ = 'sc-esoVGF cHxNXn')
#print(atag)
print(html)

1 answer:

Answer 0 (score: 0)

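You are doing it wrong.

You are trying to use the return value of driver.get(), but that call only navigates to the page and returns None. Read the HTML from this attribute instead, before calling driver.quit():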

driver.page_source 

See below:

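i = 1
# driver.get() only navigates; it returns None, so do not assign its result
driver.get("https://www.zomato.com/beirut/divvy-ashrafieh/reviews?page={}&sort=dd&filter=reviews-dd".format(i))
# Read the rendered HTML while the browser session is still open
page_source = driver.page_source
soup = BeautifulSoup(page_source, "lxml")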
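
For completeness, here is a minimal sketch of the whole flow, so the order of operations is visible: navigate, read driver.page_source, quit the browser, then parse. The helper names build_reviews_url, fetch_page_source and main are hypothetical and not part of the original code. The sketch also assumes that chromedriver can be found without an explicit path and that lxml is installed; adjust both to your setup.

```python
def build_reviews_url(page):
    """Build the reviews URL for one page number, with no stray spaces."""
    return (
        "https://www.zomato.com/beirut/divvy-ashrafieh/reviews"
        "?page={}&sort=dd&filter=reviews-dd".format(page)
    )


def fetch_page_source(driver, url):
    """Load url in the given WebDriver and return the rendered HTML.

    driver.get() returns None, so the HTML is read from driver.page_source.
    """
    driver.get(url)
    return driver.page_source


def main():
    # Imported here so the helpers above can be used without a browser.
    from bs4 import BeautifulSoup
    from selenium import webdriver

    # Assumption: chromedriver is discoverable without an explicit path.
    driver = webdriver.Chrome()
    try:
        html = fetch_page_source(driver, build_reviews_url(1))
    finally:
        # Quit only after the HTML has been read.
        driver.quit()

    soup = BeautifulSoup(html, "lxml")
    print(soup.title)
```

Call main() to run it against the live site. Reading page_source inside the try block means the browser is always closed, even if navigation fails, and the HTML string remains usable after driver.quit().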