Parsing HTML with Jsoup

Date: 2015-02-05 01:30:05

Tags: java html parsing jsoup

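First off, I'm very new to coding in Java, and I'm using Android Studio. I'm using Jsoup to go to a URL and fetch the HTML source. My code does this successfully; now I need to parse one specific line of the HTML. The string I need is contained in a link, but I don't need the link's address, only the text that is displayed as the link. Here is the code for the class I'm using to do this: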

private class FetchAnton extends AsyncTask<Void, Void, Void> {

    String price;
    String url = "http://www.antoncoop.com/markets/cash.php";
    Elements hrefEles;
    String value = null;
    String html = null;
    Document doc = null;

    @Override
    protected Void doInBackground(Void... params) {

        try {
            //Connect to website
            html = Jsoup.connect(url).get().toString();

            if (html != null && html.length() > 0) {
                doc = Jsoup.parse(html);           
                if (doc != null) {
                    /** Get all A tag element with HREF attribute like '/markets/cashchart.php?c=2246' **/
                    hrefEles = doc.select("a[href*=/markets/cashchart.php?c=2246]");

                    if (hrefEles != null && hrefEles.size() > 0) {
                        for (Element e: hrefEles) {
                            //value = e.ownText();
                           // break;
                        }

                        price = value;
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return null;
    }
}

Here is the part of the HTML I'm interested in:

</table>
<br />
<table class="homepage_quoteboard" cellspacing="0" cellpadding="0" border="0" width="100%">
<thead>
<tr class="section">
<td colspan="10">Wheat</td>
</tr>
<tr>
<td width="10%">Name</td>
<td width="10%">Delivery</td>
<td width="10%">Delivery End</td>
<td width="10%">Futures Month</td>
<td width="10%" align="right">Futures Price</td>
<td width="10%" align="right">Change</td>
<td width="10%" align="right">Basis</td>
<td width="10%" align="right">Cash Price</td>
<td width="10%" align="right">Settlement</td>
<td width="10%">Notes</td>
</tr>
</thead>
<tbody>
<script language="javascript">          
writeBidRow('Wheat',-60,false,false,false,0.5,'01/15/2015','02/26/2015','All','&nbsp;','&nbsp;',60,'even','c=2246&l=3519&d=G15',quotes['KEH15'], 0-0);
writeBidRow('Wheat',-65,false,false,false,0.5,'07/01/2015','07/31/2015','All','&nbsp;','&nbsp;',60,'odd','c=2246&l=3519&d=N15',quotes['KEN15'], 0-0);
</script>
</tbody>
</table>

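The only thing I'm interested in is getting the $4.91 value into the string named "price". It is in the line of HTML code that is indented to the right. Can anyone tell me what code to use to accomplish this?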

1 Answer:

Answer 0 (score: 0)

Everything is explained in the comments in the source code below.

@Override
protected Void doInBackground(Void... params) {
    String value = null;

    try {
        // Connect to the website; get() already returns a parsed Document,
        // so there is no need to convert it to a String and parse it again
        Document doc = Jsoup.connect(url).get();

        // Get all <a> elements whose href contains '/markets/cashchart.php?c=2246'
        Elements hrefEles = doc.select("a[href*=/markets/cashchart.php?c=2246]");

        // select() never returns null; an empty result means nothing matched
        if (!hrefEles.isEmpty()) {
            // Take the text shown for the first matching link, not its address
            value = hrefEles.first().ownText();

            System.out.println("value: " + value);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    return null;
}
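
One caveat about the excerpt in the question: the table rows there are produced by writeBidRow(...) calls inside a <script> block, and Jsoup parses markup but does not execute JavaScript. If the link only exists after the script has run in a browser, the selector above may match nothing in the fetched HTML. The raw arguments are still present in the script text, which Jsoup exposes through Element.data() on a <script> element. The following is a minimal sketch, using only java.util.regex, that pulls the first two arguments and the 'c=...' query string out of each call. The class name BidRowSketch and the method extractBidRows are hypothetical names chosen for illustration, and the source does not say what each argument means or how writeBidRow computes the displayed price.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Jsoup or of the original page.
class BidRowSketch {

    // Captures, per writeBidRow call: the first quoted argument, the integer
    // second argument, and the quoted argument that starts with "c=".
    // '.' does not match line breaks here, so each call must sit on one line.
    private static final Pattern BID_ROW =
            Pattern.compile("writeBidRow\\('([^']*)',\\s*(-?\\d+).*?'(c=[^']*)'");

    /** Returns one {firstArg, secondArg, queryString} triple per call found. */
    static List<String[]> extractBidRows(String scriptText) {
        List<String[]> rows = new ArrayList<>();
        Matcher m = BID_ROW.matcher(scriptText);
        while (m.find()) {
            rows.add(new String[] { m.group(1), m.group(2), m.group(3) });
        }
        return rows;
    }

    public static void main(String[] args) {
        // Script text copied from the HTML excerpt in the question.
        String script =
            "writeBidRow('Wheat',-60,false,false,false,0.5,'01/15/2015','02/26/2015','All','&nbsp;','&nbsp;',60,'even','c=2246&l=3519&d=G15',quotes['KEH15'], 0-0);\n"
          + "writeBidRow('Wheat',-65,false,false,false,0.5,'07/01/2015','07/31/2015','All','&nbsp;','&nbsp;',60,'odd','c=2246&l=3519&d=N15',quotes['KEN15'], 0-0);";

        for (String[] row : extractBidRows(script)) {
            System.out.println(row[0] + " | " + row[1] + " | " + row[2]);
        }
    }
}
```

Note that this yields the inputs passed to the script, not the rendered $4.91. The displayed cash price presumably also depends on quotes['KEH15'], which is defined elsewhere on the page and is not shown in the excerpt.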