Getting image sources from a web page

Time: 2019-12-01 12:59:25

Tags: python beautifulsoup python-requests

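I want to get the image source from this site: https://www.pixiv.net/en/artworks/77619496

However, every time I try to scrape it with bs4, it fails. I have also tried the approaches from other posts, but could not get them to work. It keeps returning None.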

import requests
from bs4 import BeautifulSoup

url = 'https://www.pixiv.net/en/artworks/77564597'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
# Look for the first <img> tag in the HTML returned by the server
x = soup.find("img")
print(x)  # prints None

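2 answers:

Answer 0: (score: 0)

If you look at the Network tab of the Chrome developer console (or the equivalent in whatever browser you use), you should see that the initial HTML contains no img element; the page creates the img elements by running JavaScript. However, I inspected the page and found a meta element whose content attribute holds the image data as JSON, which you can parse as follows: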

import json

import requests
from bs4 import BeautifulSoup

url = 'https://www.pixiv.net/en/artworks/77564597'
r = requests.get(url)
soup = BeautifulSoup(r.content, 'html.parser')
# The image data is stored as JSON in the content attribute of this meta tag
x = soup.find("meta", {"id": "meta-preload-data"}).get("content")

usefulData = json.loads(x)

print(usefulData)

Sample output is here
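Once the JSON is parsed, the image URLs still have to be picked out of it. The following is a minimal sketch, not a confirmed description of the site's format: it assumes the payload is laid out as {"illust": {"<artwork id>": {"urls": {...}}}} with an "original" key, which is not shown in this answer and may differ or change. The helper name extract_image_urls and the sample payload are made up for illustration.

```python
import json


def extract_image_urls(preload_json: str, illust_id: str) -> dict:
    """Return the URL mapping for one artwork, or {} if it is not found.

    Assumption (not confirmed by the answer above): the payload looks like
    {"illust": {"<id>": {"urls": {"original": "...", ...}}}}.
    """
    data = json.loads(preload_json)
    # Use .get() at every level so a missing key yields {} instead of KeyError
    illust = data.get("illust", {}).get(illust_id, {})
    return illust.get("urls") or {}


# Made-up payload that mimics the assumed layout; not real site data
sample = json.dumps(
    {"illust": {"77564597": {"urls": {"original": "https://example.invalid/77564597_p0.png"}}}}
)

urls = extract_image_urls(sample, "77564597")
print(urls.get("original"))
```

With the real page, the string passed in would be the value of x from the snippet above. Returning an empty dict for a missing key keeps the caller from crashing if the layout turns out to be different.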

Answer 1: (score: 0)

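from selenium import webdriver
import time
from bs4 import BeautifulSoup

browser = webdriver.Firefox()

url = 'https://www.pixiv.net/en/artworks/77564597'
browser.get(url)  # get() returns None, so there is nothing to assign
time.sleep(3)  # crude wait for the page's JavaScript to render the images
source = browser.page_source
browser.quit()  # close the browser once the rendered HTML has been captured
soup = BeautifulSoup(source, 'html.parser')
# These class names look auto-generated and may change when the site is rebuilt
for item in soup.find_all('div', attrs={'class': 'sc-fzXfPI fRnFme'}):
    for img in item.find_all('img', attrs={'class': 'sc-fzXfPJ lclRkv'}):
        print(img.get('src'))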

Output:

https://i.pximg.net/c/250x250_80_a2/custom-thumb/img/2019/11/28/00/02/59/78026183_p0_custom1200.jpg
https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/10/31/04/15/04/77564597_p0_square1200.jpg
https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/08/30/07/23/45/76528190_p0_square1200.jpg
https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/08/23/08/01/08/76410568_p0_square1200.jpg
https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/07/24/03/41/47/75881545_p0_square1200.jpg
https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/05/30/04/24/27/74969583_p0_square1200.jpg
https://i.pximg.net/c/250x250_80_a2/custom-thumb/img/2019/11/28/00/02/59/78026183_p0_custom1200.jpg
https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/10/31/04/15/04/77564597_p0_square1200.jpg
https://i.pximg.net/c/250x250_80_a2/img-master/img/2019/08/30/07/23/45/76528190_p0_square1200.jpg