PHP: regular expression for parsing data

Time: 2010-12-23 09:23:22

Tags: php regex

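My data is in the following format:

Football - 101 Carolina Panthers +15 -110 for Game

Football - 101 Carolina Panthers/Pittsburgh Steelers under 36½ -110 for Game

Football - 102 Pittsburgh Steelers -9 -120 for 1st Half


How can I convert it into PHP arrays like these: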

$game_data[] = array( 'sport_type'  => 'Football',
                      'game_number' => 101,
                      'game_name'   => 'Carolina Panthers',
                      'runline_odd' => '+15 -110',
                      'total_odd'   => '',
                      'odd_type'    => 'runline',
                      'period'      => 'Game' );

$game_data[] = array( 'sport_type'  => 'Football',
                      'game_number' => 101,
                      'game_name'   => 'Carolina Panthers/Pittsburgh Steelers',
                      'runline_odd' => '',
                      'total_odd'   => 'under 36½ -110',
                      'odd_type'    => 'total_odd',
                      'period'      => 'Game' );

$game_data[] = array( 'sport_type'  => 'Football',
                      'game_number' => 102,
                      'game_name'   => 'Pittsburgh Steelers',
                      'runline_odd' => '-9 -120',
                      'total_odd'   => '',
                      'odd_type'    => 'runline',
                      'period'      => '1st Half' );

2 answers:

Answer 0 (score: 1)

The following works, except for the case where the game name is followed by under:

/([^-]+)\s*-\s*(\d+)\s*([^\d+-]+)\s*((?:under\s*)?[\d\s+-]+)\s*for\s*(.+)/

Explanation:

([^-]+): Match anything other than -, which separates the sport type from the other details.
\s*-\s*: - surrounded by optional spaces
(\d+)  : Game number
([^\d+-]+): Anything other than +, -, or a digit. Matches the game name.
((?:under\s*)?[\d\s+-]+): Run line odd or total odd.
\s*for\s*(.+): The literal word for, followed by the period.
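
To see what the groups described above actually capture, here is a minimal sketch that runs the pattern with preg_match. The function name match_simple_pattern is made up for illustration, and the sample line is the English form used in the second answer's input. Because the character classes also swallow surrounding spaces, each capture is trimmed.

```php
<?php
// Pattern copied verbatim from the answer; the function name is hypothetical.
function match_simple_pattern($line)
{
    $pattern = '/([^-]+)\s*-\s*(\d+)\s*([^\d+-]+)\s*((?:under\s*)?[\d\s+-]+)\s*for\s*(.+)/';

    if (!preg_match($pattern, $line, $m)) {
        return null; // no match, e.g. when the odds contain the ½ character
    }

    // Drop the full match in $m[0] and trim the spaces the classes swallowed.
    return array_map('trim', array_slice($m, 1));
}

// Captures, in order: sport type, game number, game name, odds, period.
print_r(match_simple_pattern('Football - 101 Carolina Panthers +15 -110 for Game'));
```

The sample line containing under 36½ does not match this pattern at all, since ½ is not in the odds character class; that is what the notes below refer to.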

PS:

  1. Take care of the case that contains "under". The regex above lumps it in with game_name.
  2. Take care of Unicode characters.
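
One possible way to handle both points is sketched below. It is not the answerer's code: the helper name parse_odds_line, the group names, and the assumption that totals always start with the word under are all illustrative. The pattern is anchored, takes the game name lazily so that under stays with the odds, and uses \x{00BD} together with the /u modifier so that ½ is treated as a single character.

```php
<?php
// Hypothetical helper, not from either answer: parses one line into the
// array layout requested in the question. Returns null if the line does not fit.
function parse_odds_line($line)
{
    // \x{00BD} is the ½ character; /u makes PCRE read pattern and subject as UTF-8.
    $pattern = '/^(?<sport_type>[^-]+?)\s*-\s*'
             . '(?<game_number>[0-9]+)\s+'
             . '(?<game_name>.+?)\s+'
             . '(?<odd>(?:under\s+)?[+-]?[0-9\x{00BD}]+\s+[+-]?[0-9]+)\s+for\s+'
             . '(?<period>.+)$/u';

    if (!preg_match($pattern, $line, $m)) {
        return null;
    }

    // Assumption: a total is recognisable by its leading "under".
    $is_total = strpos($m['odd'], 'under') === 0;

    return array(
        'sport_type'  => $m['sport_type'],
        'game_number' => (int) $m['game_number'],
        'game_name'   => $m['game_name'],
        'runline_odd' => $is_total ? '' : $m['odd'],
        'total_odd'   => $is_total ? $m['odd'] : '',
        'odd_type'    => $is_total ? 'total_odd' : 'runline',
        'period'      => $m['period'],
    );
}

// "\xC2\xBD" is the UTF-8 byte sequence for ½, written out so the example
// does not depend on the encoding of the source file.
print_r(parse_odds_line("Football - 101 Carolina Panthers/Pittsburgh Steelers under 36\xC2\xBD -110 for Game"));
```

If the feed also contains totals that start with a word other than under, the optional prefix in the odd group would need to be extended accordingly.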

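Answer 1 (score: 1)

Normally I wouldn't solve an entire problem for someone, but the ½ character made it interesting. I'm no super expert at regular expressions, so this may not be the most optimized or elegant solution, but it seems to get the job done, at least with the sample input provided.

EDIT: Oops. I didn't catch that under was actually part of the runline_odd data. So this doesn't actually get the job done. I'll be back.

EDIT2: Revised the regex slightly; it now correctly distinguishes between runline_odd and runline_total.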

<?php
$input = array(
'Football - 101 Carolina Panthers +15 -110 for Game',
'Football - 101 Carolina Panthers/Pittsburgh Steelers under 36½ -110 for Game',
'Football - 102 Pittsburgh Steelers -9 -120 for 1st Half'
);

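// Two alternatives share the sport/number prefix and the trailing period:
//   totals:   "<names> under <points> <price> for "  (game_nameb, runline_total)
//   run line: "<name> <spread> <price> for "         (game_namea, runline_odd)
// \x{00BD} is the ½ character; it requires the /u modifier used below.
$regex = '^(?<sport_type>[[:alpha:]]*) - '.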
         '(?<game_number>[0-9]*) '.
         '('.
            '(?<game_nameb>[[:alpha:]\/ ]*?) '.
            '(?<runline_total>(under ([0-9\x{00BD}]+){1}) ((-|\+)?([-+0-9\x{00BD}]+){1})) for '.
         '|'.
            '(?<game_namea>[[:alpha:]\/ ]*) '.
            '(?<runline_odd>((-|\+)?([0-9\x{00BD}]+){1}) ((-|\+)?([-+0-9\x{00BD}]+){1})) for '.
         ')'.
         '(?<period>.*)$';


$game_data = array();

foreach ($input as $in) {
    $matches = false;
    $cnt = preg_match('/' . $regex . '/ui', $in, $matches);

    if ($cnt && is_array($matches) && count($matches)) {
        if (empty($matches['game_nameb'])) {
            $game_name = $matches['game_namea'];
            $runline_odd = $matches['runline_odd'];
            $total_odd = '';
        } else {
            $game_name = $matches['game_nameb'];
            $runline_odd = '';
            $total_odd = $matches['runline_total'];
        }


        $result = array(
            'sport_type' => $matches['sport_type'],
            'game_number' => $matches['game_number'],
            'game_name' => $game_name,
            'runline_odd' => $runline_odd,
            'total_odd' => $total_odd,
            'period' => $matches['period']
        );

        array_push($game_data, $result);
    }
}

var_dump($game_data);

This produces the following output:

$ /usr/local/bin/php preg-match.php 
array(3) {
  [0]=>
  array(6) {
    ["sport_type"]=>
    string(8) "Football"
    ["game_number"]=>
    string(3) "101"
    ["game_name"]=>
    string(17) "Carolina Panthers"
    ["runline_odd"]=>
    string(8) "+15 -110"
    ["total_odd"]=>
    string(0) ""
    ["period"]=>
    string(4) "Game"
  }
  [1]=>
  array(6) {
    ["sport_type"]=>
    string(8) "Football"
    ["game_number"]=>
    string(3) "101"
    ["game_name"]=>
    string(37) "Carolina Panthers/Pittsburgh Steelers"
    ["runline_odd"]=>
    string(0) ""
    ["total_odd"]=>
    string(15) "under 36½ -110"
    ["period"]=>
    string(4) "Game"
  }
  [2]=>
  array(6) {
    ["sport_type"]=>
    string(8) "Football"
    ["game_number"]=>
    string(3) "102"
    ["game_name"]=>
    string(19) "Pittsburgh Steelers"
    ["runline_odd"]=>
    string(7) "-9 -120"
    ["total_odd"]=>
    string(0) ""
    ["period"]=>
    string(8) "1st Half"
  }
}
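
Each row in the output above has six keys, while the arrays in the question also carry an odd_type key. A minimal sketch of deriving it from the rows built by the loop follows; the helper name with_odd_type is hypothetical, and it assumes every row has exactly one of runline_odd and total_odd filled in.

```php
<?php
// Hypothetical helper: adds the odd_type key that the question's arrays contain.
function with_odd_type(array $row)
{
    // A non-empty total_odd marks a totals row; everything else is a run line.
    $row['odd_type'] = $row['total_odd'] !== '' ? 'total_odd' : 'runline';
    return $row;
}

// One row in the shape produced by the loop above.
$game_data = array(
    array(
        'sport_type'  => 'Football',
        'game_number' => '102',
        'game_name'   => 'Pittsburgh Steelers',
        'runline_odd' => '-9 -120',
        'total_odd'   => '',
        'period'      => '1st Half',
    ),
);

$game_data = array_map('with_odd_type', $game_data);
var_dump($game_data[0]['odd_type']);
```

Note that game_number comes back from preg_match as a string; cast it with (int) if the integer form shown in the question is required.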