Can't seem to print any links with BeautifulSoup

Time: 2016-06-20 16:05:23

Tags: python beautifulsoup python-requests

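Hey, I'm new to coding with modules, and just for fun I'm trying to write a program that lists all the new anime episodes from a website (nerdy, I know). But I can't seem to print the episode links, and I'm not sure what the problem is. Can someone enlighten me?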

import requests
from bs4 import BeautifulSoup

def spider():
    url = 'https://kissanime.to/AnimeList/LatestUpdate'
    source_code = requests.get(url)
    text = source_code.text
    soup = BeautifulSoup(text, 'html.parser')
    for link in soup.findAll('a', {'class': 'listing'}):
        href = link.get('href')
        print(href)
spider()

1 answer:

Answer 0: (score: 0)

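You can't get the real page source with requests because the site makes you wait up to five seconds first; what you get back is what your browser shows while it is still in the loading stage. You can read jschl_vc and pass from the form and send them to https://kissanime.to/cdn-cgi/l/chk_jschl, but the required third parameter, jschl_answer, is calculated by a JavaScript function, so I don't see a way to get that value: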

  <form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
    <input type="hidden" name="jschl_vc" value="7d830410394a469c1b9afc24326aa5dd"/>
    <input type="hidden" name="pass" value="1466454867.055-eUN53b1Q/B"/>
    <input type="hidden" id="jschl-answer" name="jschl_answer"/>
  </form>
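
To make the missing piece concrete, here is a minimal sketch that feeds the form quoted above to BeautifulSoup and collects its input fields. The helper name challenge_fields is made up for illustration and is not part of requests or bs4. The two server-supplied values can be read straight from the HTML, while jschl_answer has no value attribute at all because the page fills it in with JavaScript.

```python
from bs4 import BeautifulSoup

# The challenge form exactly as quoted above.
CHALLENGE_HTML = """
<form id="challenge-form" action="/cdn-cgi/l/chk_jschl" method="get">
  <input type="hidden" name="jschl_vc" value="7d830410394a469c1b9afc24326aa5dd"/>
  <input type="hidden" name="pass" value="1466454867.055-eUN53b1Q/B"/>
  <input type="hidden" id="jschl-answer" name="jschl_answer"/>
</form>
"""


def challenge_fields(html):
    """Return {name: value} for the challenge form's inputs, or None if no such form exists.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, "html.parser")
    form = soup.find("form", id="challenge-form")
    if form is None:
        return None
    # Inputs without a value attribute map to None.
    return {inp.get("name"): inp.get("value") for inp in form.find_all("input")}


fields = challenge_fields(CHALLENGE_HTML)
print(fields["jschl_vc"])      # present in the HTML
print(fields["pass"])          # present in the HTML
print(fields["jschl_answer"])  # None: the value is computed in the browser
```

Running the same helper on the text returned by requests is one way to check which page came back: a dict suggests the loading page, None suggests the form is absent.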

If you want to parse the site, you will need something like selenium, which you can combine with phantomjs for headless browsing:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def wait(dr, x):
    element = WebDriverWait(dr, 20).until(
        EC.presence_of_all_elements_located((By.XPATH, x))
    )
    return element

dr = webdriver.PhantomJS()
dr.get("https://kissanime.to/AnimeList/LatestUpdate")

# wait until the table is present in the DOM (returns a list of matching elements)
table = wait(dr, "//table[@class='listing']")
print(table)

To get all the links:

dr = webdriver.PhantomJS()
dr.get("https://kissanime.to/AnimeList/LatestUpdate")

table_links = wait(dr, "//table[@class='listing']//a")
for link in table_links:
    print(link.get_attribute("href"))

Which will give you:

https://kissanime.to/Anime/Hundred
https://kissanime.to/Anime/Hundred/Episode-012?id=126873
https://kissanime.to/Anime/Seisen-Cerberus-Ryuukoku-no-Fatalites
https://kissanime.to/Anime/Seisen-Cerberus-Ryuukoku-no-Fatalites/Episode-012?id=126872
https://kissanime.to/Anime/Wagamama-High-Spec
https://kissanime.to/Anime/Wagamama-High-Spec/Episode-011?id=126871
https://kissanime.to/Anime/Usakame
https://kissanime.to/Anime/Usakame/Episode-011?id=126870
https://kissanime.to/Anime/Bakuon
https://kissanime.to/Anime/Bakuon/Episode-012?id=126869
https://kissanime.to/Anime/Pretty-Guardian-Sailor-Moon-Crystal-Death-Busters
https://kissanime.to/Anime/Pretty-Guardian-Sailor-Moon-Crystal-Death-Busters/Episode-038?id=126868
https://kissanime.to/Anime/Kaitou-Joker-3rd-Season
https://kissanime.to/Anime/Kaitou-Joker-3rd-Season/Episode-038?id=126867
https://kissanime.to/Anime/Aoyama-Goushou-Tanpenshuu
https://kissanime.to/Anime/Yu-Gi-Oh-Arc-V-5
https://kissanime.to/Anime/Yu-Gi-Oh-Arc-V-5/Episode-110?id=126859
https://kissanime.to/Anime/New-Dream-Hunter-Rem-Yume-no-Kishitachi
https://kissanime.to/Anime/Dream-Hunter-Rem
https://kissanime.to/Anime/Seitokai-Yakuindomo-OVA-2
https://kissanime.to/Anime/Seitokai-Yakuindomo-OVA-2/Episode-018?id=126853
https://kissanime.to/Anime/Sansha-Sanyou-Dub
https://kissanime.to/Anime/Sansha-Sanyou-Dub/Episode-006?id=126852
https://kissanime.to/Anime/Macross-Delta
https://kissanime.to/Anime/Macross-Delta/Episode-012?id=126851
https://kissanime.to/Anime/Tanaka-kun-wa-Kyou-mo-Kedaruge
https://kissanime.to/Anime/Tanaka-kun-wa-Kyou-mo-Kedaruge/Episode-027?id=126850
https://kissanime.to/Anime/Kare-Baka-Wagahai-no-Kare-wa-Baka-de-R
https://kissanime.to/Anime/Re-Zero-kara-Hajimeru-Isekai-Seikatsu
https://kissanime.to/Anime/Re-Zero-kara-Hajimeru-Isekai-Seikatsu/Episode-012?id=126848
https://kissanime.to/Anime/Kuma-Miko-Girl-meets-Bear
https://kissanime.to/Anime/Sansha-Sanyou
https://kissanime.to/Anime/Sansha-Sanyou/Episode-011?id=126846
https://kissanime.to/Anime/Tonkatsu-DJ-Agetarou
https://kissanime.to/Anime/Tonkatsu-DJ-Agetarou/Episode-011?id=126845
https://kissanime.to/Anime/Nijiiro-Days
https://kissanime.to/Anime/Nijiiro-Days/Episode-023?id=126844
https://kissanime.to/Anime/Pan-de-Peace
https://kissanime.to/Anime/Pan-de-Peace/Episode-012?id=126843
https://kissanime.to/Anime/Mobile-Suit-Gundam-Iron-Blooded-Orphans-Dub
https://kissanime.to/Anime/Mobile-Suit-Gundam-Iron-Blooded-Orphans-Dub/Episode-003?id=126842
https://kissanime.to/Anime/Hunter-x-Hunter-2011-Dub
https://kissanime.to/Anime/Hunter-x-Hunter-2011-Dub/Episode-009?id=126841
https://kissanime.to/Anime/Momokuri-1
https://kissanime.to/Anime/Momokuri-1/Episode-010?id=126840
https://kissanime.to/Anime/Boku-no-Hero-Academia-My-Hero-Academia
https://kissanime.to/Anime/Boku-no-Hero-Academia-My-Hero-Academia/Episode-012?id=126839
https://kissanime.to/Anime/Dragon-Ball-Super
https://kissanime.to/Anime/Dragon-Ball-Super/Episode-048?id=126838
https://kissanime.to/Anime/Monster-Strike
https://kissanime.to/Anime/Monster-Strike/Episode-028?id=126837
https://kissanime.to/Anime/One-Piece
https://kissanime.to/Anime/One-Piece/Episode-746?id=126836
https://kissanime.to/Anime/Cardfight-Vanguard-G-Stride-Gate-hen
https://kissanime.to/Anime/Cardfight-Vanguard-G-Stride-Gate-hen/Episode-036?id=126835
https://kissanime.to/Anime/Folktales-from-Japan
https://kissanime.to/Anime/Folktales-from-Japan/Episode-219?id=126831
https://kissanime.to/Anime/Cardfight-Vanguard-G-Stride-Gate-hen-Dub
https://kissanime.to/Anime/Cardfight-Vanguard-G-Stride-Gate-hen-Dub/Episode-029?id=126834
https://kissanime.to/Anime/Mobile-Suit-Gundam-More-Information-on-the-Universal-Century
https://kissanime.to/Anime/Endride-Dub
https://kissanime.to/Anime/Endride-Dub/Episode-006?id=126822
https://kissanime.to/Anime/Pokemon-XY-Z-Dub
https://kissanime.to/Anime/Pokemon-XY-Z-Dub/Episode-018?id=126821
https://kissanime.to/Anime/Mahoutsukai-Precure
https://kissanime.to/Anime/Mahoutsukai-Precure/Episode-019?id=126747
https://kissanime.to/Anime/Aikatsu-Stars
https://kissanime.to/Anime/Aikatsu-Stars/Episode-011?id=126820
https://kissanime.to/Anime/Haifuri
https://kissanime.to/Anime/Haifuri/Episode-011?id=126818
https://kissanime.to/Anime/Kiznaiver
https://kissanime.to/Anime/Kiznaiver/Episode-011?id=126817
https://kissanime.to/Anime/Ragnastrike-Angels
https://kissanime.to/Anime/Gakusen-Toshi-Asterisk-2nd-Season
https://kissanime.to/Anime/Tanaka-kun-wa-Itsumo-Kedaruge
https://kissanime.to/Anime/Tanaka-kun-wa-Itsumo-Kedaruge/Episode-011?id=126815
https://kissanime.to/Anime/Kyoukai-no-Rinne-TV-2nd-Season-RIN-NE
https://kissanime.to/Anime/Kyoukai-no-Rinne-TV-2nd-Season-RIN-NE/Episode-036?id=126812
https://kissanime.to/Anime/Gyakuten-Saiban-Sono-Shinjitsu-Igi-Ari
https://kissanime.to/Anime/Gyakuten-Saiban-Sono-Shinjitsu-Igi-Ari/Episode-012?id=126813
https://kissanime.to/Anime/Sakamoto-desu-ga
https://kissanime.to/Anime/Sakamoto-desu-ga/Episode-010?id=126811
https://kissanime.to/Anime/Shounen-Maid-Dub
https://kissanime.to/Anime/Shounen-Maid-Dub/Episode-005?id=126810
https://kissanime.to/Anime/Mayoiga
https://kissanime.to/Anime/Shounen-Maid
https://kissanime.to/Anime/Shounen-Maid/Episode-010?id=126806
https://kissanime.to/Anime/Future-Card-Buddyfight-DDD
https://kissanime.to/Anime/Future-Card-Buddyfight-DDD/Episode-012?id=126807
https://kissanime.to/Anime/Bonobono-2016
https://kissanime.to/Anime/Bonobono-2016/Episode-011?id=126805
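
The list mixes series pages with episode pages. If only the episodes are wanted, a simple filter may be enough. This is a rough sketch that assumes every episode URL contains "/Episode-", as the ones in the output above do; the function name episode_links is made up for illustration, and the sample list is a handful of lines copied from that output.

```python
def episode_links(links):
    """Keep only the URLs that contain an /Episode- path segment."""
    return [url for url in links if "/Episode-" in url]


# A few URLs copied from the output above.
sample = [
    "https://kissanime.to/Anime/Hundred",
    "https://kissanime.to/Anime/Hundred/Episode-012?id=126873",
    "https://kissanime.to/Anime/Aoyama-Goushou-Tanpenshuu",
    "https://kissanime.to/Anime/One-Piece",
    "https://kissanime.to/Anime/One-Piece/Episode-746?id=126836",
]

for url in episode_links(sample):
    print(url)
```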

If you prefer to do the parsing with bs4, you can pass it the page source after the wait:

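from urllib.parse import urljoin  # on Python 2: from urlparse import urljoin

from bs4 import BeautifulSoup

dr = webdriver.PhantomJS()
dr.get("https://kissanime.to/AnimeList/LatestUpdate")

# wait for the links to load, then hand the rendered HTML to bs4
wait(dr, "//table[@class='listing']//a")
soup = BeautifulSoup(dr.page_source, "html.parser")
for link in soup.select("table.listing a"):
    # resolve each href against the site root
    print(urljoin("https://kissanime.to", link["href"]))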
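
Judging by the selectors used in this answer, the listing class sits on the table rather than on the individual anchors, which would leave the lookup in the question empty even on the fully loaded page. The sketch below compares the two lookups on a small made-up snippet; the markup is an assumption standing in for the rendered page, with site-relative hrefs as the use of urljoin suggests.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Made-up markup standing in for the rendered page. Only the class on the
# <table> is taken from the selectors used in the answer.
SAMPLE_HTML = """
<table class="listing">
  <tr>
    <td><a href="/Anime/Hundred">Hundred</a></td>
    <td><a href="/Anime/Hundred/Episode-012?id=126873">Episode 012</a></td>
  </tr>
</table>
"""

BASE_URL = "https://kissanime.to"

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# The question's lookup: anchors that themselves carry class="listing".
by_anchor_class = soup.find_all("a", {"class": "listing"})

# The answer's lookup: any anchor inside the table with class="listing".
by_table_class = [urljoin(BASE_URL, a["href"]) for a in soup.select("table.listing a")]

print(len(by_anchor_class))
for url in by_table_class:
    print(url)
```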