PhantomJS cannot open local files with an unknown extension

Time: 2015-03-01 16:47:55

Tags: javascript phantomjs

I am using PhantomJS to take screenshots of local files. I am passing it a perfectly valid HTML file:

<!DOCTYPE html><html><head><title>Title of the document</title></head><body>The file name dummy</body></html> 

The file is named dummy.hoo.

PhantomJS seems unable to open it. Is this documented anywhere? Local files with the extensions .html and .htm work fine, however.

Example invocation (the path to the page is always converted to a file URI):

"Phantomjs.exe" --proxy-type=none --ssl-protocol=any --local-to-remote-url-access=true "Scripts\screenshot.js" "file:///D:/dummy.hoo" "base.png"

The script is simple:

var page = require('webpage').create();
var system = require('system');

if (system.args.length !== 3) {
    console.log('Usage: script.js <URL> <screenshot destination>');
    phantom.exit();
}

page.onResourceError = function(resourceError) {
    page.reason = resourceError.errorString;
    page.reason_url = resourceError.url;
};

page.open(system.args[1], function(status) {
    if (status !== 'success') {
        console.log('Failed to load address ' + system.args[1] + ' ' + page.reason_url + ': ' + page.reason);
        phantom.exit(-1);
        return; // do not fall through to render() after a failed load
    }
    page.render(system.args[2]);
    phantom.exit();
});

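When I copy the URI and paste it into Firefox or another browser, the HTML content of dummy.hoo is displayed correctly. Only PhantomJS seems to refuse to render it.

For dummy.hoo the script always takes the error path: the address cannot be loaded, the status is fail, and the onResourceError callback supplies no reason. (When I pass a URL that does not exist, I do get a proper reason.)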

Failed to load address file:///D:/dummy.hoo undefined: undefined

I used the script from this link to get verbose error output: Debugging PhantomJS webpage.open failures

This is the result:

= onNavigationRequested
  destination_url: file:///D:/dummy.hoo
  type (cause): Other
  will navigate: true
  from page's main frame: true
= onResourceRequested()
  request: {
    "headers": [
        {
            "name": "User-Agent",
            "value": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/534.34 (KHTML, like Gecko) PhantomJS/1.9.0 Safari/534.34"
        },
        {
            "name": "Accept",
            "value": "*/*"
        }
    ],
    "id": 1,
    "method": "GET",
    "time": "2015-03-01T16:40:11.080Z",
    "url": "file:///D:/dummy.hoo"
}
= onLoadStarted()
  leaving url: about:blank
= onResourceReceived()
  id: 1, stage: "start", response: {"bodySize":110,"contentType":null,"headers":[{"name":"Last-Modified","value":"Sun, 01 Mar 2015 17:13:02 GMT"},{"name":"Content-Length","value":"110"}],"id":1,"redirectURL":null,"stage":"start","status":null,"statusText":null,"time":"2015-03-01T16:40:11.082Z","url":"file:///D:/dummy.hoo"}
= onResourceReceived()
  id: 1, stage: "end", response: {"contentType":null,"headers":[{"name":"Last-Modified","value":"Sun, 01 Mar 2015 17:13:02 GMT"},{"name":"Content-Length","value":"110"}],"id":1,"redirectURL":null,"stage":"end","status":null,"statusText":null,"time":"2015-03-01T16:40:11.082Z","url":"file:///D:/dummy.hoo"}
= onLoadFinished()
  status: fail
Failed to load address file:///D:/dummy.hoo undefined: undefined

1 answer:

Answer 0 (score: 2)

I was able to find the code in PhantomJS that handles MIME types (there are several places, one per platform backend):

https://github.com/ariya/phantomjs/blob/48fabe06463460d2fb7026d6df9783216e26265c/src/qt/qtwebkit/Source/WebCore/platform/MIMETypeRegistry.cpp#L154

https://github.com/ariya/phantomjs/blob/48fabe06463460d2fb7026d6df9783216e26265c/src/qt/qtwebkit/Source/WebCore/platform/win/MIMETypeRegistryWin.cpp#L80

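The gist behind it (hehe) is that a local file comes with no header carrying a MIME type, so PhantomJS does not know which handler to invoke to render the content correctly. Over HTTP the extension does not matter: I could rename a .jpeg to .exe and it would still render correctly as long as the web server sends the JPEG MIME type. This is common on the web, where URLs are routinely rewritten or redirected based on anything (regular expressions, extensions, etc.).

PhantomJS has no mechanism for detecting the real content of a file (which is entirely reasonable), so it has to rely on the file extension and a fixed mapping.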
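To make the extension-based lookup concrete, here is a minimal sketch of the idea. It is an illustration only: the table EXTENSION_TO_MIME and the function guessMimeTypeFromPath are hypothetical names, and the table is a tiny made-up subset, not the list in WebKit's MIMETypeRegistry.

```javascript
// Hypothetical, heavily abbreviated extension table (NOT WebKit's real list).
var EXTENSION_TO_MIME = {
    html: 'text/html',
    htm: 'text/html',
    txt: 'text/plain',
    png: 'image/png',
    jpg: 'image/jpeg',
    jpeg: 'image/jpeg'
};

// Returns the MIME type implied by the extension, or null when it is unknown.
function guessMimeTypeFromPath(path) {
    var name = path.split(/[\\\/]/).pop();      // last path segment
    var dot = name.lastIndexOf('.');
    if (dot < 0) {
        return null;                            // no extension at all
    }
    var ext = name.slice(dot + 1).toLowerCase();
    return EXTENSION_TO_MIME.hasOwnProperty(ext) ? EXTENSION_TO_MIME[ext] : null;
}

console.log(guessMimeTypeFromPath('file:///D:/dummy.html')); // text/html
console.log(guessMimeTypeFromPath('file:///D:/dummy.hoo'));  // null
```

An unknown extension such as .hoo yields no MIME type, which is consistent with the "contentType":null entries in the trace from the question.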

So I have to accept that HTML data will only be rendered from files with the html or htm extension, and nothing else.
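One possible workaround, which is not part of the answer above and which I have not verified against PhantomJS 1.9.0 specifically, is to skip page.open and hand the markup to the page directly, so that no MIME lookup on the file name is needed. The sketch below assumes that fs.read and page.setContent (documented for PhantomJS 1.8 and later) are available and that setContent triggers onLoadFinished; fileUrlToPath is a hypothetical helper written for this example.

```javascript
// Hypothetical helper: turn "file:///D:/dummy.hoo" into "D:/dummy.hoo".
function fileUrlToPath(url) {
    var path = decodeURIComponent(url.replace(/^file:\/\//, ''));
    // Drop the leading slash in front of a Windows drive letter.
    return /^\/[A-Za-z]:\//.test(path) ? path.slice(1) : path;
}

// The PhantomJS-specific part only runs inside PhantomJS.
if (typeof phantom !== 'undefined') {
    var fs = require('fs');
    var system = require('system');
    var page = require('webpage').create();

    var url = system.args[1];
    var html = fs.read(fileUrlToPath(url));     // read the markup ourselves

    // Assumption: setContent fires onLoadFinished once the markup is loaded.
    page.onLoadFinished = function (status) {
        if (status === 'success') {
            page.render(system.args[2]);
        }
        phantom.exit(status === 'success' ? 0 : 1);
    };

    // The second argument sets the page URL so relative resources can resolve.
    page.setContent(html, url);
}
```

It would be called the same way as the original script, with the file URI and the screenshot destination as arguments. Renaming or copying the file to a .html name before calling page.open is the simpler alternative when that is acceptable.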