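How to extract data from this HTML using jsoup

Time: 2015-11-18 12:32:28

Tags: android html parsing jsoup

I am developing an Android application in which I have to display a hadith every day. I have a link to a page where the hadith is updated daily. This is the HTML source of that page.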

<div class="hadith-explanation" id="hadithcontent">
<h2>Today's Hadith</h2>
<br>
<h3>Commitments</h3>
<br>
<p>The Messenger of Allah (sal Allahu alaihi wa sallam) said: "He has (really) no faith who fulfills not his trust,
   and he has (really) no religion who fulfills not his promise." [Baihaqi]<br /><br />Always keep your word.
   Each time you keep a commitment you are rewarded by Allah (subhana wa ta'ala) for obeying Him. If you mix a few drops of wine in a glass
   full of water, it spoils the whole glass of water and makes it unfit for consumption. Similarly, dishonesty in any sphere of your life
   permeates and corrupts your entire nature and eeman. When a person&rsquo;s words carry no weight, it only reveals his/her treacherous nature.
   <br /><br />If you promise to be somewhere, make sure you are there on time. If you promise to call somebody back, do so on time. Don't commit
   what you cannot do. One minute means 60 seconds everywhere, no more. Make a habit of under-committing rather than over-committing.<br /><br />
   The online version of Daily Hadith is available. Please visit http://dailyhadith.adaptivesolutionsinc.com
</p>
</div>

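I want to extract the text from the start of the paragraph tag up to the first br tag, i.e. from "The Messenger of Allah ..." up to [Baihaqi].

I searched online and found that this can be done with the jsoup library, but I don't have much experience programming with it. Could someone please guide me?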

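1 Answer:

Answer 0: (score: 0)

You can use childNodes(). The first child node of the p element is the text node that precedes the first br tag.

Java: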

package com.github.davidepastore.stackoverflow33780189;

import java.io.IOException;
import java.io.InputStream;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

/**
 * Stackoverflow 33780189
 *
 */
public class App 
{
    public static void main( String[] args ) throws IOException
    {
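        // Load file.html from the classpath.
        ClassLoader classloader = Thread.currentThread()
                .getContextClassLoader();
        InputStream is = classloader.getResourceAsStream("file.html");
        Document document = Jsoup.parse(is, "UTF-8", "");
        // Select the first <p> inside the element with class "hadith-explanation".
        Element element = document.select(".hadith-explanation p").first();
        // The first child node is the text that precedes the first <br>.
        String result = element.childNodes().get(0).toString();

        System.out.println("Result: " + result);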
    }
}

file.html:

<div class="hadith-explanation" id="hadithcontent">
    <h2>Today's Hadith</h2>
    <br>
    <h3>Commitments</h3>
    <br>
    <p>
        The Messenger of Allah (sal Allahu alaihi wa sallam) said: "He has
        (really) no faith who fulfills not his trust, and he has (really) no
        religion who fulfills not his promise." [Baihaqi]<br />
        <br />Always keep your word. Each time you keep a commitment you are
        rewarded by Allah (subhana wa ta'ala) for obeying Him. If you mix a
        few drops of wine in a glass full of water, it spoils the whole glass
        of water and makes it unfit for consumption. Similarly, dishonesty in
        any sphere of your life permeates and corrupts your entire nature and
        eeman. When a person&rsquo;s words carry no weight, it only reveals
        his/her treacherous nature. <br />
        <br />If you promise to be somewhere, make sure you are there on
        time. If you promise to call somebody back, do so on time. Don't
        commit what you cannot do. One minute means 60 seconds everywhere, no
        more. Make a habit of under-committing rather than over-committing.<br />
        <br /> The online version of Daily Hadith is available. Please visit
        http://dailyhadith.adaptivesolutionsinc.com
    </p>
</div>
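
The childNodes() approach above can also be written in a slightly more defensive form. What follows is a minimal sketch, not the answerer's code: the class name HadithExtractor and the method textBeforeFirstBr are hypothetical, and the HTML is inlined as a shortened copy of the page so the example runs without file.html. It walks the child nodes of the paragraph, stops at the first br, and uses TextNode.text() so the result is plain text rather than the node's HTML.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.Node;
import org.jsoup.nodes.TextNode;

// Hypothetical helper class, for illustration only.
class HadithExtractor {

    /**
     * Returns the text of the first ".hadith-explanation p" element
     * up to (not including) its first br tag, or an empty string
     * if no such paragraph exists.
     */
    public static String textBeforeFirstBr(String html) {
        Document document = Jsoup.parse(html);
        Element paragraph = document.select(".hadith-explanation p").first();
        if (paragraph == null) {
            return "";
        }
        StringBuilder text = new StringBuilder();
        for (Node node : paragraph.childNodes()) {
            if (node instanceof Element) {
                Element child = (Element) node;
                if (child.tagName().equals("br")) {
                    break; // stop at the first <br>
                }
                text.append(child.text()); // inline markup such as <b> or <i>
            } else if (node instanceof TextNode) {
                text.append(((TextNode) node).text());
            }
        }
        return text.toString().trim();
    }

    public static void main(String[] args) {
        // Shortened copy of the page from the question.
        String html = "<div class=\"hadith-explanation\" id=\"hadithcontent\">"
                + "<h2>Today's Hadith</h2><br><h3>Commitments</h3><br>"
                + "<p>The Messenger of Allah (sal Allahu alaihi wa sallam) said: "
                + "\"He has (really) no faith who fulfills not his trust, "
                + "and he has (really) no religion who fulfills not his promise.\" "
                + "[Baihaqi]<br /><br />Always keep your word.</p></div>";
        System.out.println("Result: " + textBeforeFirstBr(html));
    }
}
```

If the paragraph ever contains inline markup before the first br, this variant still collects that text, whereas childNodes().get(0) returns only the first node.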

A similar case is discussed here on Stack Overflow.