I have used Python and Beautiful Soup to detect the links on a website. Now I want to download the image files from the detected URLs and store them in a specific folder. What is the simplest way to do this?
The code I have developed so far:
from bs4 import BeautifulSoup as soup  # HTML data structure
from urllib.request import urlopen as uReq  # Web client
from PIL import Image
import requests

my_url = "https://abc/videos/vod/movies/actress/letter=a/sort=popular/page=1/"
uClient = uReq(my_url)
page_html = uClient.read()
uClient.close()

page_soup = soup(page_html, "html.parser")
for div in page_soup.findAll('div', attrs={'class': 'main'}):
    for ul in div.findAll('ul'):
        for li in ul.findAll('li'):
            for img in li.findAll('img', alt=True):
                link = img['src']
Detected URL links:
https://abcde/mono/actjpgs/abb1.jpg
https://abcde/mono/actjpgs/t31sw.jpg
https://abcde/mono/actjpgs/beaas.jpg
Desired resulting file names:
abb1.jpg
t31sw.jpg
beaas.jpg
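The URL-to-filename mapping above can be sketched without any network access; the `filename_from_url` helper is my own name for illustration, not part of the original code:

```python
import os
from urllib.parse import urlparse

def filename_from_url(url):
    # the last path component of the URL is the file name
    return os.path.basename(urlparse(url).path)

urls = [
    "https://abcde/mono/actjpgs/abb1.jpg",
    "https://abcde/mono/actjpgs/t31sw.jpg",
    "https://abcde/mono/actjpgs/beaas.jpg",
]
names = [filename_from_url(u) for u in urls]
# names == ['abb1.jpg', 't31sw.jpg', 'beaas.jpg']
```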
Answer 0 (score: 0)
import os
import shutil
import urllib.request
from urllib.parse import urlparse

# get filename from URL
url = "https://abcde/mono/actjpgs/abb1.jpg"
url_parsed = urlparse(url)
filename = os.path.basename(url_parsed.path)  # will contain abb1.jpg

# download file
with urllib.request.urlopen(url) as response, open(filename, 'wb') as out_file:
    shutil.copyfileobj(response, out_file)
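Since the question asks for a *specific folder*, the same approach can be extended to save each file under a target directory. A minimal sketch, assuming the `download_to` helper name and the default `images` folder (both my own, not from the answer):

```python
import os
import shutil
import urllib.request
from urllib.parse import urlparse

def download_to(url, folder="images"):
    # hypothetical helper: save the file behind `url` into `folder`
    os.makedirs(folder, exist_ok=True)  # create the target folder if missing
    filename = os.path.basename(urlparse(url).path)
    dest = os.path.join(folder, filename)
    with urllib.request.urlopen(url) as response, open(dest, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest
```

Calling `download_to(link)` inside the question's innermost loop would then fetch every detected image into the folder.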
Answer 1 (score: 0)
As Karl suggested, a quick Google search would tell you this, but since I was helped out early in my SO career, I'll do my best to spell it out for you.
import requests

link = "your/example/link.jpg"

# Get image and file name
r = requests.get(link, allow_redirects=True)
fname = link.split('/')[-1]

# save the file
with open(fname, 'wb') as f:
    f.write(r.content)
I have not tested this code.