Using simple_html_dom to parse a ul

Asked: 2011-05-03 22:02:02

Tags: php regex simple-html-dom

I want to get the inner text of each span in this ul.

<ul class="alternatingList">
     <li><strong>Last Played</strong><span id="ctl00_mainContent_lastPlayedLabel">04.29.2011</span></li>
     <li class="alt"><strong>Armory Completion</strong><span id="ctl00_mainContent_armorCompletionLabel">52%</span></li>
     <li><strong>Daily Challenges</strong><span id="ctl00_mainContent_dailyChallengesLabel">127</span></li>
     <li class="alt"><strong>Weekly Challenges</strong><span id="ctl00_mainContent_weeklyChallengesLabel">4</span></li>
     <li><strong>Matchmaking MP Kills</strong><span id="ctl00_mainContent_matchmakingKillsLabel">11,280 (1.18)</span></li>
     <li class="alt"><strong>Matchmaking MP Medals</strong><span id="ctl00_mainContent_medalsLabel">15,383</span></li>
     <li><strong>Covenant Killed</strong><span id="ctl00_mainContent_covenantKilledLabel">10,395</span></li>
     <li class="alt"><strong>Player Since</strong><span id="ctl00_mainContent_playerSinceLabel">09.13.2010</span></li>
     <li class="gamesPlayed"><strong>Games Played</strong><span id="ctl00_mainContent_gamesPlayedLabel">975</span></li>
</ul>

This is what I have right now, but I'd like to do it without writing the same code for each span.

//pull last played text
$last_played = '';
$last_played_el = $html->find(".alternatingList");
if (preg_match('|<span[^>]+>(.*)</span>|U', $last_played_el[0], $matches)) {
    $last_played = $matches[1];
}

echo $last_played;

2 answers:

Answer 0: (score: 0)

Take a look at preg_match_all. It fills an array with every match instead of stopping at the first one.
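As a rough sketch of that suggestion, the pattern from the question can be reused unchanged with preg_match_all. Here `$raw` is a hypothetical variable holding a shortened copy of the markup as a plain string (the question applies the pattern to a simple_html_dom element instead), and `$values` is just an illustrative name.

```php
<?php
// Hypothetical sample: a shortened copy of the list from the question.
$raw = '<ul class="alternatingList">'
     . '<li><strong>Last Played</strong><span id="ctl00_mainContent_lastPlayedLabel">04.29.2011</span></li>'
     . '<li class="alt"><strong>Armory Completion</strong><span id="ctl00_mainContent_armorCompletionLabel">52%</span></li>'
     . '</ul>';

// Same pattern as in the question; the U modifier makes the quantifiers lazy.
// With the default flags, $matches[1] holds the first capture group of every
// match, in document order.
$values = [];
if (preg_match_all('|<span[^>]+>(.*)</span>|U', $raw, $matches)) {
    $values = $matches[1];
}

foreach ($values as $value) {
    echo $value . "\n";
}
```

This removes the per-span repetition, though the pattern still assumes that no span in the list contains a nested span.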

Answer 1: (score: 0)

You're already using a parser, so why use a regex? You can access the inner text of each span very easily:

$html = new simple_html_dom();

// Load from a string, where $raw is your UL or the page it's on
$html->load($raw);

// The descendant selector matches every span inside the list
foreach ($html->find('.alternatingList span') as $found) {
    echo $found->innertext . "\n";
}
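
If you also want to know which value belongs to which label, the same loop-over-elements idea can pair each strong with its span. The sketch below is only an illustration: to stay self-contained it swaps simple_html_dom for PHP's built-in DOMDocument and DOMXPath, and `$raw` and `$stats` are hypothetical names, with `$raw` holding a shortened copy of the markup.

```php
<?php
// Hypothetical sample: a shortened copy of the list from the question.
$raw = '<ul class="alternatingList">'
     . '<li><strong>Last Played</strong><span id="ctl00_mainContent_lastPlayedLabel">04.29.2011</span></li>'
     . '<li class="alt"><strong>Armory Completion</strong><span id="ctl00_mainContent_armorCompletionLabel">52%</span></li>'
     . '</ul>';

// Collect libxml parse warnings internally instead of emitting them;
// real-world pages are rarely strictly valid HTML.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($raw);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$stats = [];

// Visit each list item and read its label (strong) and value (span).
foreach ($xpath->query('//ul[@class="alternatingList"]/li') as $li) {
    $label = $xpath->query('./strong', $li)->item(0);
    $value = $xpath->query('./span', $li)->item(0);
    if ($label !== null && $value !== null) {
        $stats[trim($label->textContent)] = trim($value->textContent);
    }
}

print_r($stats);
```

With the result keyed by label, a single value can be read as `$stats['Last Played']` rather than by position.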