I have the following HTML string:
<h3>I only want this content</h3> I don't want this content <b>random content</b>
I want to get only the content of the h3 tag and discard everything else. This is what I have so far:
String getArticleBody = listArt.getChildText("body");
StringBuilder mainArticle = new StringBuilder();
String getSubHeadlineFromArticle;
if(getArticleBody.startsWith("<h3>") && getArticleBody.endsWith("</h3>")){
mainArticle.append(getSubHeadlineFromArticle);
}
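But this returns the entire content, which is not what I'm after. If anyone can help, it would be much appreciated.
Answer 0 (score: 1)
Thanks, guys. All of your answers work, but in the end I went with Jsoup.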
String getArticleBody = listArt.getChildText("body");
org.jsoup.nodes.Document docc = Jsoup.parse(getArticleBody);
// first() returns null when the body contains no <h3> element
org.jsoup.nodes.Element h3Tag = docc.getElementsByTag("h3").first();
String getSubHeadlineFromArticle = (h3Tag != null) ? h3Tag.text() : "";
Answer 1 (score: 0)
You can use the substring method, like this:
String a="<h3>I only want this content</h3> I don't want this content <b>random content</b>";
System.out.println(a.substring(a.indexOf("<h3>")+4,a.indexOf("</h3>")));
Output:
I only want this content
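The one-liner above throws a StringIndexOutOfBoundsException when either tag is missing, because indexOf then returns -1. A minimal sketch of a guarded variant follows; the class name H3Substring and the method firstH3 are made up for illustration, and it assumes the tag is written exactly as <h3> with no attributes.

```java
import java.util.Optional;

class H3Substring {
    // Hypothetical helper: returns the text between the first <h3> and the
    // next </h3> after it, or an empty Optional if either tag is missing.
    static Optional<String> firstH3(String html) {
        final String open = "<h3>";
        final String close = "</h3>";
        int start = html.indexOf(open);
        if (start < 0) {
            return Optional.empty();
        }
        int contentStart = start + open.length();
        // Search for the closing tag only after the opening tag.
        int end = html.indexOf(close, contentStart);
        if (end < 0) {
            return Optional.empty();
        }
        return Optional.of(html.substring(contentStart, end));
    }

    public static void main(String[] args) {
        String a = "<h3>I only want this content</h3> I don't want this content <b>random content</b>";
        System.out.println(firstH3(a).orElse("(no h3 found)"));
    }
}
```

Returning an Optional leaves it to the caller to decide what a missing heading should mean, instead of failing with an exception.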
Answer 2 (score: 0)
Try this:
String result = getArticleBody.substring(getArticleBody.indexOf("<h3>"), getArticleBody.indexOf("</h3>"))
.replaceFirst("<h3>", "");
System.out.println(result);
Answer 3 (score: 0)
You need to use a regular expression, like this:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class H3Demo {
    public static void main(String[] args) {
        String str = "<h3>asdfsdafsdaf</h3>dsdafsdfsafsadfa<h3>second</h3>";
        // your pattern goes here
        // ? is important: it makes .+ reluctant, so each match stops at the nearest closing tag
        Pattern pattern = Pattern.compile("<h3>(.+?)</h3>");
        Matcher matcher = pattern.matcher(str);
        while (matcher.find()) {
            System.out.println(matcher.group(1));
        }
    }
}
matcher.group(1) returns the text between the h3 tags.
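If the headings are needed as values instead of printed output, the same loop can collect them into a list. The sketch below is one possible way to do that, not a general HTML parser: the class name H3Regex and the method allH3 are hypothetical, the two flags are an added assumption to cope with upper-case tags and headings that span several lines, and the pattern still does not match tags carrying attributes such as <h3 class="x">.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class H3Regex {
    // CASE_INSENSITIVE also matches <H3>; DOTALL lets '.' match line breaks.
    private static final Pattern H3 =
            Pattern.compile("<h3>(.*?)</h3>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Hypothetical helper: returns the trimmed text of every <h3> element, in order.
    static List<String> allH3(String html) {
        List<String> result = new ArrayList<>();
        Matcher matcher = H3.matcher(html);
        while (matcher.find()) {
            result.add(matcher.group(1).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String str = "<h3>asdfsdafsdaf</h3>dsdafsdfsafsadfa<h3>second</h3>";
        for (String heading : allH3(str)) {
            System.out.println(heading);
        }
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call.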
Answer 4 (score: 0)
Use a regular expression.
This may help you:
String str = "<h3>I only want this content</h3> I don't want this content <b>random content</b>";
final Pattern pattern = Pattern.compile("<h3>(.+?)</h3>");
final Matcher matcher = pattern.matcher(str);
if (matcher.find()) { // group(1) throws IllegalStateException unless find() succeeded
    System.out.println(matcher.group(1)); // Prints String I want to extract
}
Output:
I only want this content
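Answer 5 (score: 0)
The other answers have already covered how to get the result you want. I'm going to comment your code to explain why it doesn't do that yet. (Note that I've changed your variable names, because strings don't get anything; they are a thing.)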
// declare a bunch of variables
String articleBody = listArt.getChildText("body");
StringBuilder mainArticle = new StringBuilder();
String subHeadlineFromArticle;
// check to see if the article body consists entirely of a subheadline
if(articleBody.startsWith("<h3>") && articleBody.endsWith("</h3>")){
// if it does, append an empty string to the StringBuilder
mainArticle.append(subHeadlineFromArticle);
}
// if it doesn't, don't do anything
// final result:
// articleBody = the entire article body
// mainArticle = empty StringBuilder (regardless of whether you appended anything)
// subHeadlineFromArticle = empty string
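The walkthrough above points at two problems: the check only passes when the whole body is a single h3 element, and subHeadlineFromArticle is never assigned before it is appended. A minimal sketch of a corrected flow might look like the following. The class name SubHeadlineFix and the helper subHeadlineOf are made up for illustration, and a literal string stands in for listArt.getChildText("body").

```java
class SubHeadlineFix {
    // Hypothetical helper: text of the first <h3>...</h3> pair, or "" if there is none.
    static String subHeadlineOf(String articleBody) {
        int start = articleBody.indexOf("<h3>");
        if (start < 0) {
            return "";
        }
        int end = articleBody.indexOf("</h3>", start);
        if (end < 0) {
            return "";
        }
        return articleBody.substring(start + "<h3>".length(), end);
    }

    public static void main(String[] args) {
        // Stand-in for listArt.getChildText("body") from the question.
        String articleBody =
                "<h3>I only want this content</h3> I don't want this content <b>random content</b>";
        StringBuilder mainArticle = new StringBuilder();
        // Assign the subheadline before using it, then append only that part.
        String subHeadlineFromArticle = subHeadlineOf(articleBody);
        mainArticle.append(subHeadlineFromArticle);
        System.out.println(mainArticle);
    }
}
```

The difference from the original is that the h3 text is located inside the body and stored in the variable first, so the StringBuilder receives only the subheadline.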