How can I use Python to get a Wikipedia article in a specific language?

Time: 2020-06-15 23:17:39

Tags: python-3.x beautifulsoup python-requests mediawiki

I want to get a Wikipedia article in a specific language.

I tried the following code:

import requests
from bs4 import BeautifulSoup

URL = "https://en.wikipedia.org/w/api.php"
PARAMS = {
        "action": "query",
        "titles": "Python",
        "prop": "langlinks",
        "lllang": "de",
        "format": "json"
        }
results = requests.get(url=URL, params=PARAMS)
soup = BeautifulSoup(results.content, 'html.parser')
print(soup.prettify())

But I don't get the whole article; I only get:

{"batchcomplete":"","query":{"pages":{"46332325":{"pageid":46332325,"ns":0,"title":"Python","langlinks":[{"lang":"de","*":"Python"}]}}}}

Could you help me understand what I am doing wrong?

2 answers:

Answer 0: (score: 0)

Change the URL to de.wikipedia.org to get the German version.

For example:

import requests
from bs4 import BeautifulSoup

URL = "https://de.wikipedia.org/w/api.php"  # <-- note the de.
PARAMS = {
        "action": "parse",
        "page": "Python (Programmiersprache)",
        "prop": "text",
        "section": 0,
        "format": "json"
        }

results = requests.get(url=URL, params=PARAMS).json()
soup = BeautifulSoup(results['parse']['text']['*'], 'html.parser')
print(soup.prettify())

Prints:

<div class="mw-parser-output">
 <table cellspacing="5" class="float-right infobox toccolours toptextcells" style="font-size:90%; margin-top:0; width:21em;">
  <tbody>
   <tr>
    <th class="hintergrundfarbe6" colspan="2" style="font-size:larger;">
     Python
    </th>
   </tr>
   <tr>

... and so on.

To get only the wiki templates/tags (the raw wikitext), you can do:

URL = "https://de.wikipedia.org/w/api.php"
PARAMS = {
        "action": "query",
        "titles": "Python (Programmiersprache)",
        "prop": "revisions",
        "rvprop": "content",
        "rvsection": 0,
        "format": "json"
        }

results = requests.get(url=URL, params=PARAMS).json()
print(results)
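
The printed result is a nested dictionary, so the wikitext still has to be pulled out of it. A minimal sketch of that step follows, assuming the legacy JSON layout that this request returns (no `rvslots` parameter, default `formatversion`), where the text sits under the `*` key of the first revision. The helper name `extract_wikitext` and the sample response are made up for illustration.

```python
def extract_wikitext(response):
    """Return the wikitext of the first page that has a revision, else None.

    Assumes the legacy layout: query -> pages -> <pageid> -> revisions[0]["*"].
    """
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            return revisions[0].get("*")
    return None


# Hypothetical, shortened response used only to show the expected shape.
sample = {
    "query": {
        "pages": {
            "123": {
                "pageid": 123,
                "ns": 0,
                "title": "Example",
                "revisions": [
                    {
                        "contentformat": "text/x-wiki",
                        "contentmodel": "wikitext",
                        "*": "{{Infobox}}\n'''Example''' text",
                    }
                ],
            }
        }
    }
}

print(extract_wikitext(sample))
```

With the request above, `extract_wikitext(results)` would be called in place of `print(results)`. A missing page has no `revisions` entry, so the helper returns `None` for it.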

Answer 1: (score: 0)

If you have the title of a Wikipedia page in one language and want to know its title in another language, you can use "langlinks", like this:

https://en.wikipedia.org/w/api.php?action=query&prop=langlinks&titles=Python+(programming+language)&lllang=de

Note that "lllang" is set to "de".

This gives you the title of the corresponding German article in the `langlinks` field of the response.

For more information, see: https://www.mediawiki.org/wiki/API:Langlinks
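
The two answers can be combined: read the target-language title out of the langlinks response, then build a request against that language's own API endpoint. The sketch below assumes the response shape shown in the question and that every language edition lives at `https://<lang>.wikipedia.org/w/api.php`. The helper names `langlink_title` and `parse_request` are hypothetical.

```python
def langlink_title(response, lang):
    """Return the title linked for `lang`, or None if no such link exists."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        for link in page.get("langlinks", []):
            if link.get("lang") == lang:
                return link.get("*")
    return None


def parse_request(title, lang):
    """Build the (url, params) pair for fetching the rendered article HTML."""
    url = f"https://{lang}.wikipedia.org/w/api.php"
    params = {
        "action": "parse",
        "page": title,
        "prop": "text",
        "format": "json",
    }
    return url, params


# The langlinks response quoted in the question.
sample = {
    "batchcomplete": "",
    "query": {
        "pages": {
            "46332325": {
                "pageid": 46332325,
                "ns": 0,
                "title": "Python",
                "langlinks": [{"lang": "de", "*": "Python"}],
            }
        }
    }
}

title = langlink_title(sample, "de")
url, params = parse_request(title, "de")
print(title, url)
```

The resulting `url` and `params` can then be passed to `requests.get(url=url, params=params).json()`, as in the first answer, to fetch the article itself.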
