Can't get the HTML source code string in C#

Time: 2015-10-24 16:37:06

Tags: c# html webclient html-agility-pack

I need to get the currency values from https://www.bcr.ro/en/exchange-rates, but when I fetch the HTML string with either of these methods:

  1. WebRequest:

    WebRequest req = HttpWebRequest.Create("https://www.bcr.ro/en/exchange-rates");
    req.Method = "GET";

    string source;
    using (StreamReader reader = new StreamReader(req.GetResponse().GetResponseStream()))
    {
        source = reader.ReadToEnd();
    }

  2. WebClient:

    WebClient wc = new WebClient();
    string s = wc.DownloadString("https://www.bcr.ro/en/exchange-rates");

both return a strange HTML string that does not contain the required data:

    <!DOCTYPE html>
    <html  lang="en" class="no-js false_EAM isEmil_false">
    <!-- Version: 2.16.7.0 (gportals2m1pvm1-044457035960000082024075) Date: 24.10.2015 18:19:59 -->
    <head>
      <meta charset="utf-8">
      <meta http-equiv="X-UA-Compatible" content="IE=edge">
      <title>Exchange rates | BCR</title>
      <link rel="shortcut icon" type="image/x-icon" href="https://www.bcr.ro/content/8ea9dd8a/-3b9c-429b-9f72-34e75b7512e3/favicon.ico">
      <meta name="author" content="Banca Comerciala Romana (BCR): loans, cards, deposits, Internet Banking, current account">
      <meta name="description" content="Banca Comerciala Romana (BCR), a member of Erste Group, is a universal bank serving both retail and corporate clients. ">
      <meta name="generator" content="Group Portal - 2.16.7.0"><meta name="keywords" content=" loans, cards, deposits, Internet Banking, current account">
    

How can I get the desired result?

2 answers:

Answer 0 (score: 1)

So after some quick research, the answer turned out to be pretty simple:

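  1. Both WebRequest and WebClient fetch only what the initial link returns, that is, the page source you see with Ctrl + U, and that source does not contain the required data.
  2. A quick look in the browser's developer tools (F12) made it clear that the required data is loaded dynamically. After checking the Network tab, I found the request (data) that fetches exactly the required data as nicely formatted JSON (perfect).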
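
The two steps above can be sketched as follows: call the JSON endpoint directly instead of the page URL, then parse the response. This is only a minimal sketch. `JsonUrl` is a hypothetical placeholder (the real request URL has to be copied from the Network tab), the payload shape `[{"currency": ..., "rate": ...}]` is an assumption rather than the bank's actual format, and `System.Text.Json` is assumed to be available (.NET Core 3.0 or later, or the NuGet package).

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.Json;

public static class ExchangeRates
{
    // Hypothetical placeholder: replace with the request URL shown in the Network tab.
    public const string JsonUrl = "https://example.com/url-copied-from-network-tab";

    // Downloads the raw JSON text from the given URL.
    public static string Download(string url)
    {
        using (var wc = new WebClient())
        {
            // Ask for JSON explicitly; some endpoints vary the response by Accept header.
            wc.Headers[HttpRequestHeader.Accept] = "application/json";
            return wc.DownloadString(url);
        }
    }

    // Parses an ASSUMED payload shape: [{"currency":"EUR","rate":4.43}, ...].
    // Adjust the property names to match the real response.
    public static Dictionary<string, decimal> Parse(string json)
    {
        var rates = new Dictionary<string, decimal>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            foreach (JsonElement item in doc.RootElement.EnumerateArray())
            {
                string currency = item.GetProperty("currency").GetString() ?? "";
                decimal rate = item.GetProperty("rate").GetDecimal();
                rates[currency] = rate;
            }
        }
        return rates;
    }
}
```

Usage would be along the lines of `var rates = ExchangeRates.Parse(ExchangeRates.Download(ExchangeRates.JsonUrl));`. Keeping the download and the parsing separate makes it possible to check the parser against a saved response without hitting the network.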

Answer 1 (score: -2)

Try creating a RegEx that matches the table ('<table class="overview glaze fullsize">') and grab everything inside that tag of the HTML page. Then use it wherever you need it.
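
A rough sketch of that idea, assuming the table markup is actually present in the downloaded string (which the other answer suggests is not the case for this page, since the data is loaded dynamically). `TableExtractor` and `ExtractTable` are hypothetical names used only for illustration. The non-greedy match stops at the first `</table>`, so nested tables would be cut short.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TableExtractor
{
    // Returns the inner HTML of the first matching table,
    // or an empty string if the table is not found.
    public static string ExtractTable(string html)
    {
        const string pattern =
            "<table class=\"overview glaze fullsize\">(.*?)</table>";

        // Singleline lets '.' match newlines, so the table can span several lines.
        Match m = Regex.Match(
            html, pattern, RegexOptions.Singleline | RegexOptions.IgnoreCase);

        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

For anything beyond a quick check, an HTML parser such as Html Agility Pack (already in the question's tags) is generally more robust than a regular expression.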