PHP RegEx匹配部分数据

时间:2012-09-09 13:02:04

标签: php regex

我正在尝试使用RegEx解析ESPN底线提要但是我的时间有点短。而不是它获得所有的NFL比赛只有一半左右。

这是ESPN返回的数据。

 &nfl_s_delay=120&nfl_s_stamp=0909064837&nfl_s_left1=^Dallas 24   NY Giants 17 (FINAL)&nfl_s_right1_count=0&nfl_s_url1=http://sports.espn.go.com/nfl/boxscore?gameId=320905019&nfl_s_left2=Indianapolis at Chicago (1:00 PM ET)&nfl_s_right2_count=0&nfl_s_url2=http://sports.espn.go.com/nfl/preview?gameId=320909003&nfl_s_left3=Philadelphia at Cleveland (1:00 PM ET)&nfl_s_right3_count=0&nfl_s_url3=http://sports.espn.go.com/nfl/preview?gameId=320909005&nfl_s_left4=St. Louis at Detroit (1:00 PM ET)&nfl_s_right4_count=0&nfl_s_url4=http://sports.espn.go.com/nfl/preview?gameId=320909008&nfl_s_left5=New England at Tennessee (1:00 PM ET)&nfl_s_right5_count=0&nfl_s_url5=http://sports.espn.go.com/nfl/preview?gameId=320909010&nfl_s_left6=Atlanta at Kansas City (1:00 PM ET)&nfl_s_right6_count=0&nfl_s_url6=http://sports.espn.go.com/nfl/preview?gameId=320909012&nfl_s_left7=Jacksonville at Minnesota (1:00 PM ET)&nfl_s_right7_count=0&nfl_s_url7=http://sports.espn.go.com/nfl/preview?gameId=320909016&nfl_s_left8=Washington at New Orleans (1:00 PM ET)&nfl_s_right8_count=0&nfl_s_url8=http://sports.espn.go.com/nfl/preview?gameId=320909018&nfl_s_left9=Buffalo at NY Jets (1:00 PM ET)&nfl_s_right9_count=0&nfl_s_url9=http://sports.espn.go.com/nfl/preview?gameId=320909020&nfl_s_left10=Miami at Houston (1:00 PM ET)&nfl_s_right10_count=0&nfl_s_url10=http://sports.espn.go.com/nfl/preview?gameId=320909034&nfl_s_left11=San Francisco at Green Bay (4:25 PM ET)&nfl_s_right11_count=0&nfl_s_url11=http://sports.espn.go.com/nfl/preview?gameId=320909009&nfl_s_left12=Seattle at Arizona (4:25 PM ET)&nfl_s_right12_count=0&nfl_s_url12=http://sports.espn.go.com/nfl/preview?gameId=320909022&nfl_s_left13=Carolina at Tampa Bay (4:25 PM ET)&nfl_s_right13_count=0&nfl_s_url13=http://sports.espn.go.com/nfl/preview?gameId=320909027&nfl_s_left14=Pittsburgh at Denver (8:20 PM ET)&nfl_s_right14_count=0&nfl_s_url14=http://sports.espn.go.com/nfl/preview?gameId=320909007&nfl_s_left15=Cincinnati at Baltimore (7:00 PM ET)&nfl_s_right15_count=0&nfl_s_url15=http://sports.espn.go.com/nfl/preview?gameId=320910033&nfl_s_left16=San Diego at Oakland (10:15 PM ET)&nfl_s_right16_count=0&nfl_s_url16=http://sports.espn.go.com/nfl/preview?gameId=320910013&nfl_s_count=16&nfl_s_loaded=true

我正在使用两种RegEx模式,一种用于捕捉已完成/正在进行的游戏:

preg_match_all('/nfl_s_left\d{1,2}=\^?(?P<awayteam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s+(?P<awayscore>\d+)\s+\^?(?P<hometeam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s+(?P<homescore>\d+)\s+(?P<time>\(.*?\))/', $content, $matches_in_progress, PREG_SET_ORDER);

还有另外一个尚未开始的游戏:

preg_match_all('/nhl_s_left\d=\^?(?P<awayteam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s+(at)+\s+\^?(?P<hometeam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s(?P<time>\(.*?\))/', $content, $matches_upcoming, PREG_SET_ORDER);

16场比赛中只有8场比赛,我无法弄清楚原因。本周的匹配为Dallas at NY GiantsPhiladelphia at ClevelandNew England at TennesseeAtlanta at Kansas CityJacksonville at MinnesotaWashington at New OrleansBuffalo at NY Jets。< / p>

我非常感谢任何帮助让我的RegEx与本周和未来几周的所有16场比赛相匹配。

编辑:修改后的字符串匹配,忘记在将RegEx应用到它之前删除%20。

2 个答案:

答案 0 :(得分:0)

你的正则表达式非常复杂。由于原始字符串被编码为URL参数,因此您可以使用parse_str来获取密钥=&gt;值数组。然后使用正则表达式查找所需的条目。

$string = "&nfl_s_delay=120&nfl_s_stamp=0909054449&nfl_s_left1=^Dallas%2024%20%20%20NY%20Giants%2017%20(FINAL)&nfl_s_right1_count=0&nfl_s_url1=http://sports.espn.go.com/nfl/boxscore?gameId=320905019&nfl_s_left2=Indianapolis%20at%20Chicago%20(1:00%20PM%20ET)&nfl_s_right2_count=0&nfl_s_url2=http://sports.espn.go.com/nfl/preview?gameId=320909003&nfl_s_left3=Philadelphia%20at%20Cleveland%20(1:00%20PM%20ET)&nfl_s_right3_count=0&nfl_s_url3=http://sports.espn.go.com/nfl/preview?gameId=320909005&nfl_s_left4=St.%20Louis%20at%20Detroit%20(1:00%20PM%20ET)&nfl_s_right4_count=0&nfl_s_url4=http://sports.espn.go.com/nfl/preview?gameId=320909008&nfl_s_left5=New%20England%20at%20Tennessee%20(1:00%20PM%20ET)&nfl_s_right5_count=0&nfl_s_url5=http://sports.espn.go.com/nfl/preview?gameId=320909010&nfl_s_left6=Atlanta%20at%20Kansas%20City%20(1:00%20PM%20ET)&nfl_s_right6_count=0&nfl_s_url6=http://sports.espn.go.com/nfl/preview?gameId=320909012&nfl_s_left7=Jacksonville%20at%20Minnesota%20(1:00%20PM%20ET)&nfl_s_right7_count=0&nfl_s_url7=http://sports.espn.go.com/nfl/preview?gameId=320909016&nfl_s_left8=Washington%20at%20New%20Orleans%20(1:00%20PM%20ET)&nfl_s_right8_count=0&nfl_s_url8=http://sports.espn.go.com/nfl/preview?gameId=320909018&nfl_s_left9=Buffalo%20at%20NY%20Jets%20(1:00%20PM%20ET)&nfl_s_right9_count=0&nfl_s_url9=http://sports.espn.go.com/nfl/preview?gameId=320909020&nfl_s_left10=Miami%20at%20Houston%20(1:00%20PM%20ET)&nfl_s_right10_count=0&nfl_s_url10=http://sports.espn.go.com/nfl/preview?gameId=320909034&nfl_s_left11=San%20Francisco%20at%20Green%20Bay%20(4:25%20PM%20ET)&nfl_s_right11_count=0&nfl_s_url11=http://sports.espn.go.com/nfl/preview?gameId=320909009&nfl_s_left12=Seattle%20at%20Arizona%20(4:25%20PM%20ET)&nfl_s_right12_count=0&nfl_s_url12=http://sports.espn.go.com/nfl/preview?gameId=320909022&nfl_s_left13=Carolina%20at%20Tampa%20Bay%20(4:25%20PM%20ET)&nfl_s_right13_count=0&nfl_s_url13=http://sports.espn.go.com/nfl/preview?gameId=320909027&nfl_s_left14=Pittsburgh%20at%20Denver%20(8:20%20PM%20ET)&nfl_s_right14_count=0&nfl_s_url14=http://sports.espn.go.com/nfl/preview?gameId=320909007&nfl_s_left15=Cincinnati%20at%20Baltimore%20(7:00%20PM%20ET)&nfl_s_right15_count=0&nfl_s_url15=http://sports.espn.go.com/nfl/preview?gameId=320910033&nfl_s_left16=San%20Diego%20at%20Oakland%20(10:15%20PM%20ET)&nfl_s_right16_count=0&nfl_s_url16=http://sports.espn.go.com/nfl/preview?gameId=320910013&nfl_s_count=16&nfl_s_loaded=true";
parse_str($string, $array);
var_dump($array);

答案 1 :(得分:0)

发现错误。我在第二个正则表达式中没有d{1,2}来匹配游戏10-16。我也没有+(?:\.*)迫使圣路易斯被跳过。

我的新工作RegEx是:

/nfl_s_left\d{1,2}=\^?(?P<awayteam>[a-zA-Z]+(?:\.*)+(?:\s+[a-zA-Z]+)*)\s+(at)+\s+\^?(?P<hometeam>[a-zA-Z]+(?:\s+[a-zA-Z]+)*)\s(?P<time>\(.*?\))/
相关问题