HtmlUnit does not return all headers

Time: 2013-08-21 18:52:10

Tags: htmlunit

    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage(webPageURL);

    final String pageAsXml = page.asXml();
    final String pageAsText = page.asText();

    // Log every response header as "name = value".
    List<NameValuePair> response = page.getWebResponse().getResponseHeaders();
    for (NameValuePair header : response) {
        log.info(header.toString() + " = " + header.getValue());
    }

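The web page returns several headers, but the log shows only the first one. How do I get the rest? The header I am looking for is Content-Type: application/javascript; charset=ISO-8859-1

The web page is an internal page.

1 Answer:

Answer 0: (score: 0)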

The code you provided works for me. The only change I made was to replace header.toString() with header.getName():

    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("http://www.debian.org");

    // Print every response header as "name = value".
    List<NameValuePair> response = page.getWebResponse().getResponseHeaders();
    for (NameValuePair header : response) {
        System.out.println(header.getName() + " = " + header.getValue());
    }

The output is:

Date = Thu, 22 Aug 2013 19:48:54 GMT
Server = Apache
Content-Location = index.en.html
Vary = negotiate,accept-language,Accept-Encoding
TCN = choice
Last-Modified = Thu, 22 Aug 2013 15:31:17 GMT
ETag = "3887-4e48afb257b40"
Accept-Ranges = bytes
Cache-Control = max-age=86400
Expires = Fri, 23 Aug 2013 19:48:54 GMT
Content-Encoding = gzip
Content-Length = 4605
Keep-Alive = timeout=15, max=100
Connection = Keep-Alive
Content-Type = text/html
Content-Language = en

As you can see, the Content-Type header is there. Can you confirm that the server is actually sending that header? It is a common one, so it should be.
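
One way to confirm what the server actually sends, independently of HtmlUnit, is to request the same URL with the JDK's own HttpURLConnection and look the header up by name. The following is only a minimal sketch: the class HeaderCheck and its methods fetchHeaders, findHeader, and demoContentType are hypothetical names, and the small local server is an assumed stand-in for the internal page, which is not reachable from here.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

public class HeaderCheck {

    /** Requests the URL with the JDK's own client and returns all response headers. */
    public static Map<String, List<String>> fetchHeaders(String url) {
        try {
            HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
            try {
                conn.getResponseCode();        // sends the request and reads the header block
                return conn.getHeaderFields(); // the status line is stored under a null key
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Case-insensitive lookup; returns the first value, or null if the header is absent. */
    public static String findHeader(Map<String, List<String>> headers, String name) {
        for (Map.Entry<String, List<String>> entry : headers.entrySet()) {
            if (name.equalsIgnoreCase(entry.getKey()) && !entry.getValue().isEmpty()) {
                return entry.getValue().get(0);
            }
        }
        return null;
    }

    /** Assumed stand-in for the internal page: serves one JavaScript response locally. */
    public static String demoContentType() {
        try {
            HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
            server.createContext("/", exchange -> {
                byte[] body = "var x = 1;".getBytes(StandardCharsets.ISO_8859_1);
                exchange.getResponseHeaders()
                        .set("Content-Type", "application/javascript; charset=ISO-8859-1");
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream out = exchange.getResponseBody()) {
                    out.write(body);
                }
            });
            server.start();
            try {
                String url = "http://127.0.0.1:" + server.getAddress().getPort() + "/";
                return findHeader(fetchHeaders(url), "Content-Type");
            } finally {
                server.stop(0); // release the port and the server's dispatcher thread
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Content-Type = " + demoContentType());
    }
}
```

To check the real page, pass its URL to fetchHeaders instead of using the demo server. If the header shows up here but not in the HtmlUnit log, the problem is on the client side; if it is missing here too, the server is not sending it. Within HtmlUnit itself, page.getWebResponse().getResponseHeaderValue("Content-Type") returns a single header by name without looping over the list.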