HtmlUnit: calling JavaScript from an href to download a file

Time: 2013-07-12 18:35:53

Tags: javascript download href attachment htmlunit

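I am trying to download a file that apparently has to be clicked in a browser. The site uses a form containing several hrefs that call a JavaScript function named downloadFile. Inside this function, the element named poslimit is looked up with document.getElementById:
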
function downloadFile(actionUrl, formId)
{
    document.getElementById(formId).action=actionUrl;
    document.getElementById(formId).submit();
}

HTML source snippet:

<form method="post" name="commandForm" action="position-limits" id="poslimit">
    <div id="content">
        <li><a href="javascript:downloadFile('position-limits?fileName=20130711&positionLimit=CURRENT_POSITION_LIMIT_', 'poslimit');" > July 11, 2013 </a></li>

So clicking the link in the snippet above calls the JavaScript function, which lives in another file.
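
For reference, the href above carries only two pieces of information: the action URL and the form id. Here is a minimal Java sketch of pulling them out with the standard library alone. The class name DownloadHref and the regular expression are illustrative assumptions, not part of the site or of HtmlUnit, and the pattern only handles single-quoted arguments as they appear in the snippet:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the two arguments of a javascript:downloadFile(...) href. */
final class DownloadHref {
    // Matches downloadFile('<actionUrl>', '<formId>') with single-quoted arguments.
    private static final Pattern CALL = Pattern.compile(
            "downloadFile\\(\\s*'([^']*)'\\s*,\\s*'([^']*)'\\s*\\)");

    final String actionUrl;
    final String formId;

    private DownloadHref(String actionUrl, String formId) {
        this.actionUrl = actionUrl;
        this.formId = formId;
    }

    /** Returns the parsed arguments, or empty if the href is not a downloadFile call. */
    static Optional<DownloadHref> parse(String href) {
        Matcher m = CALL.matcher(href);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new DownloadHref(m.group(1), m.group(2)));
    }

    public static void main(String[] args) {
        String href = "javascript:downloadFile('position-limits?fileName=20130711"
                + "&positionLimit=CURRENT_POSITION_LIMIT_', 'poslimit');";
        // Prints the form id and the action URL the link would submit to.
        parse(href).ifPresent(d ->
                System.out.println(d.formId + " -> " + d.actionUrl));
    }
}
```

With those two values in hand, the equivalent of downloadFile is to set the form's action to the URL and submit the form, which is what the browser does when the link is clicked.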

I tried:

WebClient webClient = new WebClient(BrowserVersion.CHROME_16);
HtmlPage page = webClient.getPage("http://www.theocc.com/webapps/position-limits");
HtmlForm elt = page.getHtmlElementById("poslimit");
elt.setAttribute("action", "position-limits?fileName=20130709&positionLimit=POSITIONLIMITCHANGE_");
InputStream is = elt.click().getWebResponse().getContentAsStream();
int b = 0;
while ((b = is.read()) != -1)
{
    System.out.print((char)b);
}
webClient.closeAllWindows();

I also tried using HtmlElement. I have also tried:

WebClient webClient = new WebClient(BrowserVersion.CHROME_16);
HtmlPage page = webClient.getPage("http://www.theocc.com/webapps/position-limits");
ScriptResult sr = page.executeJavaScript("downloadFile('position-limits?fileName=20130709&positionLimit=POSITIONLIMITCHANGE_', 'poslimit'");
InputStream is = sr.getNewPage().getWebResponse().getContentAsStream();
int b = 0;
while ((b = is.read()) != -1)
{
    System.out.print((char)b);
}
webClient.closeAllWindows();

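Both of these come from examples on this and other boards, but I keep getting just the original page rather than the attached file. I also wonder whether I need to look through the history to get the right page response, since the window/document I need may be an earlier one. I would appreciate polite links to a full explanation or good example documentation, as well as source I can try.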

1 Answer:

Answer 0: (score: 1)

I think this will help others, since I have not seen a working example.

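import java.io.InputStream;
import java.util.List;
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.Page;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlAnchor;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

WebClient webClient = new WebClient(BrowserVersion.CHROME_16);
HtmlPage page = webClient.getPage("http://www.theocc.com/webapps/position-limits");
// Find the link whose visible text is the date we want.
HtmlAnchor match = null;
List<HtmlAnchor> anchors = page.getAnchors();
for (HtmlAnchor anchor : anchors)
{
    // This date should come in from args
    if (anchor.asText().equals("July 9, 2013"))
    {
        match = anchor;
        break;
    }
}
// Without this check a missing date would fall through to the wrong link.
if (match == null)
{
    throw new IllegalStateException("No link found for July 9, 2013");
}
// Clicking the anchor runs downloadFile(), so the returned page is the file response.
Page p = match.click();
InputStream is = p.getWebResponse().getContentAsStream();
int b = 0;
while ((b = is.read()) != -1)
{
    System.out.print((char)b);
}
webClient.closeAllWindows();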
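
On the "date should come in from args" comment: the link text and the fileName parameter both look like they are derived from the same date (July 11, 2013 and 20130711 in the question's HTML). Below is a small sketch of producing both strings from one date. It assumes those two formats hold for every row, and PositionLimitDate is a made-up helper name:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical helper: formats one date the two ways the page appears to use it. */
final class PositionLimitDate {
    // Visible link text, e.g. "July 9, 2013" (assumed format, English month names).
    private static final DateTimeFormatter LINK_TEXT =
            DateTimeFormatter.ofPattern("MMMM d, yyyy", Locale.ENGLISH);
    // fileName query parameter, e.g. "20130709" (assumed format).
    private static final DateTimeFormatter FILE_NAME =
            DateTimeFormatter.ofPattern("yyyyMMdd", Locale.ENGLISH);

    static String linkText(LocalDate date) {
        return date.format(LINK_TEXT);
    }

    static String fileName(LocalDate date) {
        return date.format(FILE_NAME);
    }

    public static void main(String[] args) {
        // Accepts an ISO date such as 2013-07-09; falls back to the date used above.
        LocalDate date = args.length > 0 ? LocalDate.parse(args[0]) : LocalDate.of(2013, 7, 9);
        System.out.println(linkText(date));
        System.out.println(fileName(date));
    }
}
```

The result of linkText can then replace the hard-coded "July 9, 2013" in the comparison against the anchor text.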

This question helped me somewhat, since I tried the anchor approach and it worked: "struggling to click on link within htmlunit"
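
Printing each byte as a char is fine for checking the response, but it can mangle anything that is not plain text. Here is a sketch of writing the stream to disk instead, using only java.nio. SaveResponse is a hypothetical name, and the ByteArrayInputStream in main merely stands in for p.getWebResponse().getContentAsStream():

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper: saves a response body to a file byte for byte. */
final class SaveResponse {
    /** Copies the stream to target, replacing any existing file; returns the bytes written. */
    static long save(InputStream in, Path target) throws IOException {
        // try-with-resources closes the stream even if the copy fails.
        try (InputStream body = in) {
            return Files.copy(body, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the stream returned by the clicked page's web response.
        InputStream fake = new ByteArrayInputStream(
                "a,b\n1,2\n".getBytes(StandardCharsets.UTF_8));
        Path target = Files.createTempFile("position-limits", ".txt");
        long written = save(fake, target);
        System.out.println(written + " bytes -> " + target);
    }
}
```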