C# HtmlAgilityPack: "The operation has timed out"

Asked: 2011-08-02 17:10:04

Tags: c#

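How do I increase the timeout value for HtmlAgilityPack? I get this error a lot, and I would like to raise the timeout limit. Alternatively, how can I abort the request and try again?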

resultingHTML = null;
try
{
    string htmlstring = string.Empty;
    HttpWebRequest newwebRequest = (HttpWebRequest)WebRequest.Create(htmlURL);
    HttpWebResponse mywebResponce = (HttpWebResponse)newwebRequest.GetResponse();
    if (mywebResponce.StatusCode == HttpStatusCode.OK)
    {
        Stream ReceiveStream = mywebResponce.GetResponseStream();
        using (StreamReader reader = new StreamReader(ReceiveStream))
        {
            htmlstring = reader.ReadToEnd();
        }
        HtmlDocument doc = new HtmlDocument();
        doc.Load(htmlstring);
        HtmlWeb hwObject = new HtmlWeb();
        HtmlNode body = doc.DocumentNode.SelectSingleNode("//body");
        resultingHTML = body.InnerHtml.ToString();
    }
}

2 Answers:

Answer 0 (score: 3)

I assume you are using HtmlAgilityPack to read HTML over a web request?

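I would suggest using the framework's WebRequest object,

http://msdn.microsoft.com/en-us/library/system.net.webrequest.getresponse.aspx#Y700

..where you can specify a timeout. Timeouts (and other connection errors) can then be caught simply by wrapping the call in a try/catch block.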

Then parse the resulting HTML from the WebResponse object directly with HtmlAgilityPack.

Here is an example of how to get the HTML from a WebResponse:

http://msdn.microsoft.com/en-us/library/system.net.webresponse.getresponsestream.aspx

Once you have the HTML from the WebResponse as a string, you can do:

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(HTML);
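
The approach described above (set a timeout, catch the failure, try again) could be sketched roughly as follows. This is only an illustration under assumptions: the class name `HtmlFetcher` and the helpers `Download` and `WithRetry` are hypothetical names invented here, not part of HtmlAgilityPack or the framework, and the retry policy (retry only on `WebExceptionStatus.Timeout`, no delay between attempts) is one possible choice.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper class; the names are illustrative only.
public static class HtmlFetcher
{
    // Downloads the page at url as a string.
    // Throws WebException on a timeout or any other connection error.
    public static string Download(string url, int timeoutMs)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = timeoutMs; // time limit for GetResponse(), in milliseconds

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var stream = response.GetResponseStream())
        using (var reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }

    // Calls fetch until it succeeds, making at most maxAttempts calls.
    // Only timeouts are retried; every other error is rethrown immediately.
    public static string WithRetry(Func<string> fetch, int maxAttempts)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return fetch();
            }
            catch (WebException ex)
            {
                bool timedOut = ex.Status == WebExceptionStatus.Timeout;
                if (!timedOut || attempt >= maxAttempts)
                {
                    throw; // not a timeout, or out of attempts
                }
                // Timed out with attempts remaining: loop and try again.
            }
        }
    }
}
```

With this in place, the string could be handed to HtmlAgilityPack as shown above, for example `doc.LoadHtml(HtmlFetcher.WithRetry(() => HtmlFetcher.Download(htmlURL, 30000), 3));`, where the 30-second timeout and the three attempts are arbitrary values chosen for illustration.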

Answer 1 (score: 0)

using System.IO;
using System.Net;
using HtmlAgilityPack;

// Placeholder URL; WebRequest.Create needs an absolute URI including the scheme.
HttpWebRequest httpWebRequest = (HttpWebRequest)WebRequest.Create("http://www.someurl.com");
httpWebRequest.Timeout = 10000; // 10 second timeout
using (HttpWebResponse httpWebResponse = (HttpWebResponse)httpWebRequest.GetResponse())
{
    if (httpWebResponse.StatusCode == HttpStatusCode.OK)
    {
        using (Stream responseStream = httpWebResponse.GetResponseStream())
        {
            using (StreamReader reader = new StreamReader(responseStream))
            {
                var htmlstring = reader.ReadToEnd();
                HtmlDocument doc = new HtmlDocument();
                // LoadHtml parses markup from a string; Load(string) expects a file path.
                doc.LoadHtml(htmlstring);
            }
        }
    }
}

I would also take a look at: Adjusting HttpWebRequest Connection Timeout in C#

just to understand the difference between Timeout and ReadWriteTimeout on the HttpWebRequest class.
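
As a rough illustration of that difference: `Timeout` limits how long `GetResponse()` (and `GetRequestStream()`) may take, while `ReadWriteTimeout` limits each read or write on the stream returned afterwards, so a slow page body can still time out after the response headers have arrived. The sketch below sets both; the class name `RequestFactory` and its parameters are hypothetical names used only for this example.

```csharp
using System.Net;

// Hypothetical helper; the name and parameters are illustrative only.
public static class RequestFactory
{
    public static HttpWebRequest Create(string url, int responseTimeoutMs, int readWriteTimeoutMs)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);

        // Applies to GetResponse() / GetRequestStream(); the default is 100,000 ms.
        request.Timeout = responseTimeoutMs;

        // Applies to reads and writes on the request/response streams,
        // e.g. StreamReader.ReadToEnd(); the default is 300,000 ms.
        request.ReadWriteTimeout = readWriteTimeoutMs;

        return request;
    }
}
```

Creating the request does not contact the server; both limits only take effect once `GetResponse()` is called and the response stream is read.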
