Question

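I noticed there is a question similar to mine, only for C#: link text. Let me explain: I'm new to web service implementations in general, so I keep running into things I find hard to understand (especially given the vague MediaWiki API manual).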

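I want to retrieve an entire page (an XML file) as a string in PHP and then process it in PHP (I'm pretty sure there are other, more sophisticated ways to parse XML files, but anyway): Main Page wikipedia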

I tried $fp = fopen($url,'r');. It outputs: HTTP request failed! HTTP/1.0 400 Bad Request. The API does not require a key to connect to it.

Could you describe in detail how to connect to the API and get the page as a string?

Edit: the URL is $url='http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main Page';. I just want to read the entire content of the file into a string so I can use it.

Answer 1

Connecting to that API is as simple as retrieving a file.

fopen

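$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$fp = fopen($url, 'r');
if ($fp === false) {
    die('Could not open the URL');
}
$c = ''; // initialise the buffer before appending to it
while (!feof($fp)) {
    $c .= fread($fp, 8192); // read the response in 8 KB chunks
}
fclose($fp);
echo $c;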

file_get_contents

$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$c = file_get_contents($url);
echo $c;

Both of the above only work if the server has the fopen wrappers enabled (the allow_url_fopen setting in php.ini).

Otherwise, if your server has cURL installed, you can use:

$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$ch = curl_init($url);
// return the response as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$c = curl_exec($ch); // false on failure
echo $c;
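If you don't know in advance which of these options a given server supports, you can check at runtime. The following is only a minimal sketch: pick_transport() and fetch_url() are hypothetical helper names made up for this example, and the error handling is deliberately basic.

```php
<?php
// Decide which transport to use from two capability flags.
// Returns 'stream' (fopen wrappers), 'curl', or null when neither is available.
function pick_transport(bool $urlFopenEnabled, bool $curlLoaded): ?string
{
    if ($urlFopenEnabled) {
        return 'stream';
    }
    if ($curlLoaded) {
        return 'curl';
    }
    return null;
}

// Hypothetical helper: fetch $url as a string with whatever transport is available.
function fetch_url(string $url): string
{
    $transport = pick_transport(
        filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN),
        extension_loaded('curl')
    );

    if ($transport === 'stream') {
        $body = file_get_contents($url);
    } elseif ($transport === 'curl') {
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        $body = curl_exec($ch);
    } else {
        throw new RuntimeException('Neither allow_url_fopen nor cURL is available');
    }

    if ($body === false) {
        throw new RuntimeException("Request to $url failed");
    }
    return $body;
}
```

Usage would then be something like $c = fetch_url($url); with the same $url as above.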

Answer 2

You probably need to URL-encode the parameters passed in the query string; here, at least "Main Page" needs encoding. Without that encoding, I get a 400 error too.

If you try this, it should work better (note that the space is replaced by %20):

$url='http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=Main%20Page';
$str = file_get_contents($url);
var_dump($str);

With this, I get the content of the page.
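Since the question mentions processing the XML afterwards, here is a rough sketch of pulling the wikitext out of such a response with SimpleXML. It assumes the revision text sits in a rev element somewhere in the document; extract_wikitext() is a made-up helper name, and $sample is a hypothetical, heavily trimmed response used only for illustration, not real API output.

```php
<?php
// Extract the text of the first <rev> element from an XML response.
// Returns null when the XML cannot be parsed or contains no <rev> element.
function extract_wikitext(string $xml): ?string
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        return null;
    }
    $revs = $doc->xpath('//rev');
    if (!$revs) { // empty array, null or false
        return null;
    }
    return (string) $revs[0];
}

// Hypothetical, trimmed response (assumption, for illustration only).
$sample = '<api><query><pages><page title="Main Page">'
        . '<revisions><rev>Hello, wiki!</rev></revisions>'
        . '</page></pages></query></api>';

echo extract_wikitext($sample), "\n";
```

In real use you would pass the string returned by file_get_contents($url) instead of $sample.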

A better solution is to use urlencode, so you don't have to do the encoding yourself:

$url='http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=xml&redirects&titles=' . urlencode('Main Page');
$str = file_get_contents($url);
var_dump($str);
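If you have several parameters, another option is to let PHP encode the whole query string with http_build_query. This is just a sketch: build_api_url() is a hypothetical helper name, the parameters are copied from the URL above, and it assumes the API treats "redirects=" (empty value) the same as the bare "redirects" flag.

```php
<?php
// Build an api.php URL from an array of parameters.
function build_api_url(string $endpoint, array $params): string
{
    // PHP_QUERY_RFC3986 encodes spaces as %20; the default (RFC 1738) would use "+".
    return $endpoint . '?' . http_build_query($params, '', '&', PHP_QUERY_RFC3986);
}

$url = build_api_url('http://en.wikipedia.org/w/api.php', [
    'action'    => 'query',
    'prop'      => 'revisions',
    'rvprop'    => 'content',
    'format'    => 'xml',
    'redirects' => '',          // flag parameter, sent as "redirects=" (assumption, see above)
    'titles'    => 'Main Page', // encoded automatically
]);

echo $url, "\n";
```

The resulting $url can be passed to file_get_contents() exactly as in the snippets above.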

Answer 3

According to the MediaWiki API documentation, if you don't specify a User-Agent in your PHP request, Wikimedia will reject the connection with a 4xx HTTP response code:

https://www.mediawiki.org/wiki/API:Main_page#Identifying_your_client

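You can try updating your code to add that request header, or, if you have permission to edit it, change the default in php.ini (the user_agent directive).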
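For the first option, something along these lines should work with file_get_contents. Treat it as a sketch: the USER_AGENT value is a placeholder you should replace with your own tool name and contact details, and make_context() and fetch_with_user_agent() are hypothetical helper names.

```php
<?php
// Placeholder identifier (assumption): use your own tool name and contact info.
const USER_AGENT = 'MyWikiFetcher/0.1 (https://example.org/contact; me@example.org)';

// Build a stream context that sends a User-Agent header with HTTP requests.
function make_context(string $userAgent)
{
    return stream_context_create([
        'http' => [
            'user_agent' => $userAgent,
            'timeout'    => 10, // seconds
        ],
    ]);
}

// Fetch a URL with the User-Agent set; throws if the request fails.
function fetch_with_user_agent(string $url, string $userAgent = USER_AGENT): string
{
    $body = file_get_contents($url, false, make_context($userAgent));
    if ($body === false) {
        throw new RuntimeException("Request to $url failed");
    }
    return $body;
}
```

If you use cURL instead, the corresponding option is CURLOPT_USERAGENT; calling ini_set('user_agent', ...) at the top of the script is another way to set it without touching php.ini.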

PHP: connecting to the MediaWiki API and retrieving data
