Android: extract a URL with a specific domain from a String

Date: 2015-06-11 19:01:52

Tags: android json

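I'm developing an app that consumes JSON. I can download all the data, but I've run into an interesting problem. I'm trying to pull a URL out of a string based on its domain name: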

http://www.prindlepost.org/

When I fetch all the JSON, I get a very large string that I can't paste here. The part I'm trying to parse is:

<p>The road through Belgrade was quiet at 4 A.M. Besides the occasional whir of another car speeding by, my taxi was largely alone on the road. Through the windshield I could see the last traces of apartment blocks pass by as we left the outskirts of the city. Somewhere beyond the limits of my vision, I knew the airport waited, its converging neon runway lines already lighting up the pre-dawn darkness.</p>
    <div class="more-link-wrap wpb_button"> <a href="http://www.prindlepost.org/2015/06/this-is-a-self-portrait/" class="more-link">Read more</a></div>

The part I care about:

<a href="http://www.prindlepost.org/2015/06/this-is-a-self-portrait/" class="more-link">Read more</a></div>

I'm not familiar with extracting strings like this. In the end, I'd like to save the URL as its own string. For example, the above would become:

String url = "http://www.prindlepost.org/2015/06/this-is-a-self-portrait/";

One thing to note: the content contains many URLs, so narrowing the match by class name would probably help a lot.

My initial guess was:

// <READ MORE>
Pattern p = Pattern.compile("href=\"(.*?)\"");
Matcher m = p.matcher(content);
String urlTemp = null;
if (m.find()) {
      urlTemp = m.group(1); // this variable should contain the link URL
}
Log.d("LINK WITHIN TEXT", ""+urlTemp);
// </READ MORE>

Any help is appreciated!

1 answer:

Answer 0 (score: 0)

Try using something like jsoup (http://jsoup.org/); it may do the trick.

If you look at its example for parsing links:

String html = "<p>The road through Belgrade was quiet at 4 A.M. Besides the occasional whir of another car speeding by, my taxi was largely alone on the road. Through the windshield I could see the last traces of apartment blocks pass by as we left the outskirts of the city. Somewhere beyond the limits of my vision, I knew the airport waited, its converging neon runway lines already lighting up the pre-dawn darkness.</p>"
            + "<div class=\"more-link-wrap wpb_button\">"
            + "<a href=\"http://www.prindlepost.org/2015/06/this-is-a-self-portrait/\" class=\"more-link\">"
            + "Read more</a></div>";

Document doc = Jsoup.parse(html);

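// Select by class name so that other links in the content are skipped.
Element link = doc.select("a.more-link").first();
// The href in this markup is already absolute, so both calls return the full URL.
String relHref = link.attr("href"); // "http://www.prindlepost.org/2015/06/this-is-a-self-portrait/"
String absHref = link.attr("abs:href"); // "http://www.prindlepost.org/2015/06/this-is-a-self-portrait/"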
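
If adding a library is not an option, the regex idea from the question can be tightened to check both the class name and the domain. The following is only a rough sketch using `java.util.regex` and `java.net.URI`; the names `MoreLinkExtractor`, `extract`, and `hasHost` are made up for illustration, and it assumes double-quoted attributes and no `>` inside attribute values. Regex-based HTML handling is fragile, so a real parser is likely the safer choice for arbitrary markup.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class MoreLinkExtractor {
    // One opening <a ...> tag; group 1 is the raw attribute text.
    // Assumption: attribute values do not contain '>'.
    private static final Pattern ANCHOR_TAG =
            Pattern.compile("<a\\s+([^>]*)>", Pattern.CASE_INSENSITIVE);
    // Assumption: attribute values are double-quoted.
    private static final Pattern HREF =
            Pattern.compile("(?:^|\\s)href\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);
    private static final Pattern CLASS_ATTR =
            Pattern.compile("(?:^|\\s)class\\s*=\\s*\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Returns the hrefs of <a> tags that carry cssClass and point at host. */
    static List<String> extract(String content, String cssClass, String host) {
        List<String> urls = new ArrayList<>();
        Matcher tag = ANCHOR_TAG.matcher(content);
        while (tag.find()) {
            String attrs = tag.group(1);
            Matcher cls = CLASS_ATTR.matcher(attrs);
            Matcher href = HREF.matcher(attrs);
            if (!cls.find() || !href.find()) {
                continue; // tag lacks a class or an href attribute
            }
            // A class attribute may hold several space-separated names.
            List<String> classes = Arrays.asList(cls.group(1).trim().split("\\s+"));
            String url = href.group(1);
            if (classes.contains(cssClass) && hasHost(url, host)) {
                urls.add(url);
            }
        }
        return urls;
    }

    /** True if url parses as a URI whose host equals host (case-insensitive). */
    static boolean hasHost(String url, String host) {
        try {
            String actual = new URI(url).getHost();
            return actual != null && actual.equalsIgnoreCase(host);
        } catch (URISyntaxException e) {
            return false; // malformed URLs are simply skipped
        }
    }
}
```

With the `content` string from the question, a call such as `MoreLinkExtractor.extract(content, "more-link", "www.prindlepost.org")` would be expected to return a list holding the single "Read more" URL, which can then be stored in `String url`.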