Question

I am trying to use HtmlUnit to parse an HTML page. Here is the code:

    String url = "http://192.168.1.1";
    URL link=new URL(url); 
    WebClient wc=new WebClient();
    WebRequest request=new WebRequest(link);
    request.setCharset("UTF-8");
    request.setAdditionalHeader("User-Agent", "Mozilla/5.0 (Windows NT 5.1; rv:6.0.2) Gecko/20100101 Firefox/6.0.2");

    wc.getCookieManager().setCookiesEnabled(true);
    wc.getOptions().setJavaScriptEnabled(false);
    wc.getOptions().setCssEnabled(false);
    wc.getOptions().setThrowExceptionOnFailingStatusCode(true); // here is my question.
    //wc.getOptions().setPrintContentOnFailingStatusCode(false);
    //wc.getOptions().setThrowExceptionOnScriptError(false);
    wc.getOptions().setTimeout(10000);

    HtmlPage page=null;
    page = wc.getPage(request);
    if(page==null)
    {

        return ;
    }
    String content=page.asText();
    String titleText = page.getTitleText();
    if(content==null)
    {

        return ;
    }
    System.out.println(content);
    System.out.println("title text:" + titleText);

If I use wc.getOptions().setThrowExceptionOnFailingStatusCode(true);, I cannot get my page, and the call fails with statusCode=[500] contentType=[text/html]. If I don't use it, everything works fine.

I want to know which page throws that exception, and why.

Answer 1

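options.setThrowExceptionOnFailingStatusCode(true); stops the WebClient's navigation at the first error it encounters by throwing a FailingHttpStatusCodeException. I usually set it to false. Perhaps there is a resource or image with a broken link (which produces HTTP status code 500).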

Set it to false and the WebClient will ignore the error and navigate to your page.
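To make the "why" concrete, here is a minimal plain-Java sketch of what the option does. It is not HtmlUnit's actual code: the class FailingStatusSketch, the exception FailingStatusException, and the placeholder response body are all hypothetical, and the rule "2xx is success, everything else is failing" is a simplification of the documented behaviour (the real library may treat a few other codes specially).

```java
public class FailingStatusSketch {

    /** Hypothetical stand-in for HtmlUnit's FailingHttpStatusCodeException. */
    static class FailingStatusException extends RuntimeException {
        final int statusCode;
        final String url;

        FailingStatusException(int statusCode, String url) {
            super("HTTP " + statusCode + " for " + url);
            this.statusCode = statusCode;
            this.url = url;
        }
    }

    /** Simplified rule (assumption): only 200-299 counts as success. */
    static boolean isFailing(int statusCode) {
        return statusCode < 200 || statusCode > 299;
    }

    /**
     * Mimics the option: the response body is handed back unless the
     * status is failing AND the throw-on-failing-status flag is on.
     */
    static String handle(String url, int statusCode, String body,
                         boolean throwOnFailingStatus) {
        if (throwOnFailingStatus && isFailing(statusCode)) {
            throw new FailingStatusException(statusCode, url);
        }
        return body;
    }

    public static void main(String[] args) {
        String url = "http://192.168.1.1";
        String body = "<html>placeholder body</html>"; // made-up content

        // Flag off: a 500 response still yields its body.
        System.out.println(handle(url, 500, body, false));

        // Flag on: the same response raises an exception naming the URL.
        try {
            handle(url, 500, body, true);
        } catch (FailingStatusException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

With the real library, the equivalent is to wrap wc.getPage(request) in a try/catch for FailingHttpStatusCodeException and log e.getStatusCode() together with e.getResponse().getWebRequest().getUrl(); that URL is the request that came back with the 500. A server can send a usable HTML body along with a 500 status, which would explain why the page loads once the option is off.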
