Parsing paragraphs into separate strings with jsoup

Time: 2013-11-26 21:07:55

Tags: java jsoup

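How can I extract the paragraphs from the following string in Java, leaving out the image?

I am able to get the image link, but I'm still stuck on the <p> elements.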

<img width="300" height="246" src="http://something.mything.com/wp-content/uploads/2013/11/ray-300x246.jpg" class="attachment-medium wp-post-image" alt="rayi_slleiman_bkerke" style="float: right; margin-left: 5px;" />
<p>
<strong>Lorem Ipsum</strong> is simply dummy text of the priectronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
</p>
<p>
 <strong>Lorem Ipsum</strong> is simply dummy text of the priectronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
</p>

1 answer:

Answer 0 (score: 1)

Try this:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

String html = "<img width='300' height='246' src='http://mesrobian.sarnok.com/wp-content/uploads/2013/11/rayi_slleiman_bkerke-300x246.jpg'" +
                " class='attachment-medium wp-post-image' alt='rayi_slleiman_bkerke' style='float: right; margin-left: 5px;' /><p><strong>Lorem Ip" +
                "sum</strong> is simply dummy text of the priectronic typesetting, remaining essentially unchanged. It was popularised in the 1960" +
                "s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Al" +
                "dus PageMaker including versions of Lorem Ipsum.</p> <p><strong>Lorem Ipsum</strong> is simply dummy text of the priectronic type" +
                "setting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ip" +
                "sum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>";

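// Parse the HTML fragment; jsoup wraps it in a full document
Document doc = Jsoup.parse(html);
// Select every <p> element; the <img> is not matched by this selector
Elements paragraphs = doc.select("p");
// Printing an Elements object prints the outer HTML of each match
System.out.println(paragraphs);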

This will output:

<p><strong>Lorem Ipsum</strong> is simply dummy text of the priectronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>
<p><strong>Lorem Ipsum</strong> is simply dummy text of the priectronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.</p>
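
Since the question asks for separate strings, one possible follow-up is to loop over the selected elements and collect each paragraph on its own. The following is a minimal sketch, not part of the original answer: the class name `ParagraphExtractor`, the helper `paragraphTexts`, and the shortened sample HTML are assumptions made for illustration. It uses `Element.text()` to get the visible text without tags; `Element.html()` could be used instead if the inner markup (such as `<strong>`) should be kept.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class ParagraphExtractor {

    // Hypothetical helper: returns the visible text of each <p> as its own String.
    static List<String> paragraphTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> texts = new ArrayList<>();
        for (Element p : doc.select("p")) {
            // text() drops the tags; swap in p.html() to keep the inner markup
            texts.add(p.text());
        }
        return texts;
    }

    public static void main(String[] args) {
        // Shortened stand-in for the HTML in the question (assumed sample data)
        String html = "<img src='http://example.com/a.jpg' />"
                + "<p><strong>First</strong> paragraph.</p> <p>Second paragraph.</p>";

        List<String> texts = paragraphTexts(html);
        for (int i = 0; i < texts.size(); i++) {
            System.out.println(i + ": " + texts.get(i));
        }
    }
}
```

Each entry in the returned list is an independent String, so the paragraphs can be stored or processed one by one, while the image is skipped because it does not match the `p` selector.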