jsoup: remove a div that has a certain class

Date: 2016-02-23 17:40:01

Tags: java jsoup

I have a list in jsoup, Elements tbody = new Elements();, which may look like this (---- separates the elements in the tbody list):

<td>
    <div data-emission="56b2140adb6da7bf3cbf6228" class="mainCell">
        <a href="/tv/weather-country-12457/">
            <span class="left">16:00</span>
            <div>
                <p>Weather - country</p>
            </div>
        </a>
    </div>
    <div data-emission="56b2140adb6da7bf3cbf6237" class="mainCell shows pending">
        <a href="/shows/that's-70-show-550347/epi1201/">
            <span class="left">16:10</span>
            <div>
                <p>That's 70 show</p>
                <span class="info">epi. 1201, Show</span>
            </div>
            <p class="onAir">
                <span>Pending</span>
                <u></u>
                <u style="width: 5%"></u>
            </p>
        </a>
    </div>
</td>
---------------------------------------------------------------------------
<td>
    <div data-emission="56b23876db6da7bf3cbf6588" class="mainCell pending">
        <a href="/tv/weather-563806/">
            <span class="left">16:10</span>
            <div>
                <p>Weather</p>
            </div>
            <p class="onAir">
                <span>Pending</span>
                <u></u>
                <u style="width: 51%"></u>
            </p>
        </a>
    </div>
    <div data-emission="56b23876db6da7bf3cbf6589" class="mainCell">
        <a href="/tv/animal-cops-2615/">
            <span class="left">16:15</span>
            <div>
                <p>Animal Cops</p>
                <span class="info">epi. 3079, Show</span>
            </div>
        </a>
    </div>
    <div data-emission="56b23876db6da7bf3cbf658a" class="mainCell shows">
        <a href="/show/house-md-1601/odc137/">
            <span class="left">16:30</span>
            <div>
                <p>House MD</p>
                <span class="info">epi. 137, Show</span>
            </div>
        </a>
    </div>
</td>
---------------------------------------------------------------------------
<td>
    <div data-emission="56b213b3db6da7bf3cbf61a1" class="mainCell movies pending">
        <a href="/movie/star-trek-564170/">
            <span class="left">16:00</span>
            <div>
                <p>Star Trek</p>
                <span class="info">Movie</span>
                <span class="szh prem">| Premiere</span>
            </div>
            <p class="onAir">
                <span>Pending</span>
                <u></u>
                <u style="width: 21%"></u>
            </p>
        </a>
    </div>
</td>

My goal is to remove all pending/onAir movies and shows. So in this example I want to get rid of the whole div for:

  • that's 70 show
  • weather
  • star trek

I tried, for example:

for(int i = 0; i < tbody.size(); i++){
    tbody.get(i).select("div").select("p").select(".onAir").remove();
}

It removes only the matched element itself, not the whole div. I have tried many approaches without success. I would appreciate any help.

2 answers:

Answer 0 (score: 4)

It seems that pending shows also carry the pending CSS class. If that holds in all cases, you can do it like this:

doc.select("td>div.pending").remove();

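This removes every div element in the document doc that has the pending class, provided it is a direct child of a td element.

Alternatively, you can stay with your approach and filter on the p element with the onAir class and the matching inner text: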
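A minimal, self-contained sketch of that selector, assuming jsoup is on the classpath. The class name RemovePendingByClass, the method remainingText and the trimmed-down markup are made up for illustration. The cells are wrapped in a full table here because jsoup drops a bare td that is parsed outside of a table.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class RemovePendingByClass {
    // Hypothetical, shortened sample markup; single quotes keep the Java string readable.
    static final String HTML =
        "<table><tbody><tr><td>"
        + "<div class='mainCell'><a href='/tv/weather-country-12457/'>"
        + "<div><p>Weather - country</p></div></a></div>"
        + "<div class='mainCell shows pending'><a href='/shows/some-show/'>"
        + "<div><p>Some show</p></div>"
        + "<p class='onAir'><span>Pending</span></p></a></div>"
        + "</td></tr></tbody></table>";

    // Removes every div.pending that is a direct child of a td,
    // then returns the text of the cells that are left.
    static String remainingText(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("td>div.pending").remove();
        return doc.select("div.mainCell").text();
    }

    public static void main(String[] args) {
        System.out.println(remainingText(HTML));
    }
}
```

Because remove() is called on the selected div elements, each whole cell disappears together with everything nested inside it.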

doc.select("td>div:has(p.onAir:contains(Pending))").remove();
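As a rough sketch of when this second selector helps: in the hypothetical markup below the pending cell does not carry the pending class at all, only the p.onAir marker, so it is matched by its content. The class name RemovePendingByContent and the method countRemaining are illustrative assumptions, and jsoup is assumed to be on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class RemovePendingByContent {
    // Hypothetical sample: the second cell lacks the "pending" class on purpose.
    static final String HTML =
        "<table><tbody><tr><td>"
        + "<div class='mainCell'><a href='/tv/animal-cops-2615/'>"
        + "<div><p>Animal Cops</p></div></a></div>"
        + "<div class='mainCell movies'><a href='/movie/star-trek-564170/'>"
        + "<div><p>Star Trek</p></div>"
        + "<p class='onAir'><span>Pending</span></p></a></div>"
        + "</td></tr></tbody></table>";

    // Removes each td > div that contains a p.onAir with the text "Pending"
    // and returns how many cells remain.
    static int countRemaining(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("td>div:has(p.onAir:contains(Pending))").remove();
        return doc.select("div.mainCell").size();
    }

    public static void main(String[] args) {
        System.out.println(countRemaining(HTML));
    }
}
```

The :has() part filters the div by what it contains, while the div itself stays the selected element, which is why remove() takes out the whole cell rather than just the p.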

See the CSS selector syntax to get an idea of the power of Jsoup.

Answer 1 (score: 1)

Try the following snippet.

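Elements mainCells = tbody.select("div.mainCell");
for (int i = 0; i < mainCells.size(); i++) {
    // Collect the <p> elements inside the cell's link
    Elements mainCellsP = mainCells.get(i).select("div").select("a").select("p");
    // In the sample markup a pending cell has two <p> elements: the title and p.onAir
    if (mainCellsP.size() == 2) {
        // Remove this node from the DOM tree
        mainCells.get(i).remove();
    }
}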

First select the node you actually want to delete, then call that node's remove() method.
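The same idea can be sketched without counting p elements, starting from the Elements list in the question: find each p.onAir marker, walk up to its enclosing div.mainCell, and remove that node. This is only one possible variant, assuming jsoup is on the classpath; the class name RemoveEnclosingCell, the method removePendingCells and the shortened markup are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class RemoveEnclosingCell {
    // Hypothetical sample: two td cells, wrapped in a table so the td elements survive parsing.
    static final String HTML =
        "<table><tbody><tr>"
        + "<td><div class='mainCell'><a href='/tv/a/'><div><p>Weather - country</p></div></a></div>"
        + "<div class='mainCell shows pending'><a href='/shows/b/'><div><p>Some show</p></div>"
        + "<p class='onAir'><span>Pending</span></p></a></div></td>"
        + "<td><div class='mainCell movies pending'><a href='/movie/c/'><div><p>Star Trek</p></div>"
        + "<p class='onAir'><span>Pending</span></p></a></div></td>"
        + "</tr></tbody></table>";

    // For every p.onAir marker, climbs to the enclosing div.mainCell and removes it.
    // Returns the number of cells removed.
    static int removePendingCells(Elements cells) {
        int removed = 0;
        for (Element marker : cells.select("p.onAir")) {
            Element node = marker;
            // Walk up until the enclosing cell is found (or the tree ends).
            while (node != null && !node.hasClass("mainCell")) {
                node = node.parent();
            }
            if (node != null) {
                node.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(HTML);
        Elements tbody = doc.select("td"); // stands in for the list from the question
        int removed = removePendingCells(tbody);
        System.out.println(removed + " removed, " + doc.select("div.mainCell").size() + " left");
    }
}
```

The loop in the question calls remove() on the p.onAir elements it selected, so only those are removed; here remove() is called on the enclosing cell instead.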