HtmlUnit does not load the whole page

Asked: 2019-06-24 11:54:05

Tags: htmlunit

HtmlUnit does not load part of this page:

https://www.milanuncios.com/mis-anuncios/

When I inspect the page in a browser, this element:

<div class="ma-LayoutBasicMainContent">

has a lot of content, but it is empty when the page is loaded by HtmlUnit.

I have tried various webClient settings, including:

webClient.setAjaxController(new NicelyResynchronizingAjaxController());
webClient.getOptions().setDownloadImages(true);
webClient.getOptions().setCssEnabled(true);
webClient.getOptions().setJavaScriptEnabled(true);
webClient.setJavaScriptTimeout(10000);

but the result is always the same: the "ma-LayoutBasicMainContent" element is not populated. This is the code I am using:

import com.gargoylesoftware.htmlunit.NicelyResynchronizingAjaxController;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.*;

class MarnvHtmlUnitTest {

    public static void main(String[] args) {

        WebClient webClient = null;

        try {

            final long javascriptTimeout = 10000;

            webClient = new WebClient();
            webClient.setAjaxController(new NicelyResynchronizingAjaxController());
            webClient.getOptions().setDownloadImages(true);
            webClient.getOptions().setCssEnabled(true);
            webClient.getOptions().setJavaScriptEnabled(true);
            webClient.setJavaScriptTimeout(javascriptTimeout);

            String loginURL = "https://www.milanuncios.com/mis-anuncios/";
            System.out.println("Connecting to " + loginURL + " (" + webClient.getBrowserVersion() + ")");

            HtmlPage page = webClient.getPage(loginURL);
            System.out.print("    Waiting for Javascript to complete...");
            long millis = System.currentTimeMillis();
            webClient.waitForBackgroundJavaScript(javascriptTimeout);
            System.out.println(System.currentTimeMillis() - millis + " milliseconds");
            if (!page.asText().contains("ACCESO A MIS ANUNCIOS")) {
                System.out.println("ERROR!");
                System.out.println(page.asXml());
                System.out.println("EXITING. " + webClient.getWebWindows().size());
                return;
            }

            System.out.println("OK");

        } catch (Exception e) {
            e.printStackTrace();
        }
        finally {
            if (webClient != null)
                webClient.close();
        }
    }
}

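If the page loads correctly, it should contain the text "ACCESO A MIS ANUNCIOS". Note that waitForBackgroundJavaScript returns immediately, which seems odd to me: it usually blocks for a few seconds until the page has fully loaded. I am using HtmlUnit 2.35.0.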
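One way to narrow this down might be to poll for the expected text instead of relying on a single waitForBackgroundJavaScript call. The helper below is only a sketch: PollUntil and waitFor are hypothetical names, not part of HtmlUnit, and it uses nothing beyond the Java standard library. It cannot make missing content appear. It only helps tell "the content arrives late" apart from "the content never arrives".

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not part of HtmlUnit): polls a condition until it
// holds or a timeout elapses.
class PollUntil {

    /**
     * Re-evaluates the condition every intervalMillis until it returns true
     * or timeoutMillis has elapsed. Returns true if the condition held in
     * time, false on timeout or if the waiting thread was interrupted.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMillis, long intervalMillis) {
        final long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out without the condition becoming true
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }
}
```

In the program above it could be called right after webClient.getPage(loginURL), for example as PollUntil.waitFor(() -> page.asText().contains("ACCESO A MIS ANUNCIOS"), 10000, 500). It may also be worth printing the int returned by waitForBackgroundJavaScript, which is the number of background JavaScript jobs still pending when the call returns. If that value is 0 and the call returns immediately, the page's scripts probably never scheduled any work, and polling alone is unlikely to help.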

0 answers:

No answers yet.