Split/parse a PHP string by certain individual words

Time: 2012-05-04 03:16:21

Tags: php string parsing split explode

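I've searched the PHP manual, Stack Overflow, and a few forums, but I'm stumped on some PHP logic. Maybe I'm just tired, but I'd really appreciate any help or guidance.

I have a PHP string, say: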

 $string = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';

Ultimately I'd like my final output to look something like this, but for now just getting the arrays would be fine:

 <h2>cats</h2>
 <ul>
     <li>cat1</li>
     <li>cat2</li>
     <li>cat3</li>
 </ul>

 <h2>dogs</h2>
 <ul>
     <li>dog1</li>
     <li>dog2</li>
 </ul>

 <h2>monkey creatures</h2>
 <ul>
     <li>monkey_creature1</li>
     <li>monkey_creature2</li>
     <li>monkey_creature3</li>
 </ul>

There's a catch, though: sometimes the string will be slightly different:

 $string = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';

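Anyway, this is my first question on Stack Overflow, and thanks in advance to everyone who helps!

EDIT: I'm working under certain constraints and can't change any of the code that comes before the string. I know all the parents ('cats', 'dogs', 'lemurs', 'monkey creatures' (with a space)).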

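5 answers:

Answer 0 (score: 4)

I've designed an answer that works whether or not there are spaces inside the "keywords", as long as each keyword ends in a plural "s" and the first word of a multi-word keyword is not itself plural :)

Below is the code; feel free to look it over. What you can do with text is really pretty neat :).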

<?php
$string = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';

$current_prefix = '';
$potential_prefix_elements = array();

$word_mapping = array();

// explode() replaces split(), which was deprecated in PHP 5.3 and removed in PHP 7
foreach(explode(" ", $string) as $substring) {
    if(strlen($current_prefix)) {
        // Check whether the current substring starts with the prefix
        // (strpos finds the first occurrence; strrpos would find the last one)
        if(strpos($substring, $current_prefix) === 0)
            $word_mapping[$current_prefix . 's'][] = $substring;
        else
            $current_prefix = '';
    }

    if(!strlen($current_prefix)) {
        if(preg_match("/(?P<new_prefix>.+)s$/", $substring, $matches)) {
            $potential_prefix_elements[] = $matches['new_prefix'];

            // Join the collected words with underscores to form the prefix
            // (the plural 's' is added back when the key is built)
            $current_prefix = implode("_", $potential_prefix_elements);

            // Initialize an array for the current word mapping
            $word_mapping[$current_prefix . 's'] = array();

            // Clear the potential prefix elements
            $potential_prefix_elements = array();
        } else {
            $potential_prefix_elements[] = $substring;
        }
    }
}

print_r($word_mapping);

Here's the output. I've given it to you as an array so you can easily build the ul/li hierarchy :)

Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
        )

    [dogs] => Array
        (
            [0] => dog1
            [1] => dog2
        )

    [monkey_creatures] => Array
        (
            [0] => monkey_creature1
            [1] => monkey_creature2
            [2] => monkey_creature3
        )

)
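
To go from an array like the one above to the markup the question asks for, something along these lines might work. This is only a sketch: `render_groups` is a hypothetical helper name that does not appear in any answer, the PHP 7+ type declarations are an assumption about the runtime, and turning underscores in the keys back into spaces is an assumption about how the headings should read.

```php
<?php
// Hypothetical helper: render array('heading' => array('item', ...)) as <h2>/<ul> blocks.
function render_groups(array $groups): string
{
    $html = '';
    foreach ($groups as $heading => $items) {
        // Assumption: a key such as "monkey_creatures" should be shown as "monkey creatures"
        $title = str_replace('_', ' ', (string) $heading);
        $html .= '<h2>' . htmlspecialchars($title) . "</h2>\n<ul>\n";
        foreach ($items as $item) {
            // Escape each item in case the input ever contains HTML special characters
            $html .= '    <li>' . htmlspecialchars($item) . "</li>\n";
        }
        $html .= "</ul>\n\n";
    }
    return $html;
}

echo render_groups(array(
    'cats' => array('cat1', 'cat2', 'cat3'),
    'dogs' => array('dog1', 'dog2'),
    'monkey_creatures' => array('monkey_creature1', 'monkey_creature2', 'monkey_creature3'),
));
```

The function returns a string rather than echoing directly, so the caller decides where the markup ends up.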

Answer 1 (score: 2)

You may want to use the preg_match_all function with a regular expression. That way you don't have to write any loops:

$matches = array();
$string = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';
preg_match_all('/((?:[a-z]+ )*?[a-z]+s) ((?:[a-z_]+[0-9] ?)+)*/i', $string, $matches);

// $matches now contains a multidimensional array with 3 elements; indices
// 1 and 2 contain the animal names and the lists of those animals, respectively
$animals = array_combine($matches[1], $matches[2]);
$animals = array_map(function($value) {
    return explode(' ', trim($value));
}, $animals);
print_r($animals);

Output:

Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
        )

    [dogs] => Array
        (
            [0] => dog1
            [1] => dog2
        )

    [monkey creatures] => Array
        (
            [0] => monkey_creature1
            [1] => monkey_creature2
            [2] => monkey_creature3
        )

)
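
One thing to be aware of: the pattern above accepts a single digit (`[0-9]`) per item, so an item such as `cat12` would be cut off after `cat1`. If multi-digit suffixes can occur, a variant using `\d+` might look like the sketch below. The function name `parse_animals` and the sample string are made up for illustration, and the PHP 7+ type declarations are an assumption about the runtime.

```php
<?php
// Hypothetical wrapper around the same idea, allowing multi-digit suffixes.
function parse_animals(string $string): array
{
    // Group 1: optional leading words plus a word ending in "s" (the heading).
    // Group 2: one or more items made of letters/underscores followed by digits.
    $pattern = '/((?:[a-z]+ )*?[a-z]+s) ((?:[a-z_]+\d+ ?)+)/i';
    preg_match_all($pattern, $string, $matches);

    // Pair each heading with its raw item list, then split the lists on spaces
    $animals = array_combine($matches[1], $matches[2]);
    return array_map(function ($value) {
        return explode(' ', trim($value));
    }, $animals);
}

print_r(parse_animals('cats cat1 cat12 dogs dog3'));
```

Like the original pattern, this still relies on every heading ending in "s" and every item ending in digits.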

Answer 2 (score: 1)

Using the string from your second example:

<?php

$parents = array('cats', 'dogs', 'monkey creatures', 'lemurs');
$result = array();

$dataString = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';
foreach ($parents as $parent) {
  // Consider group only if it is present in the data string
  if (strpos($dataString, $parent) !== false) {
    $result[$parent] = array();
  }
}
$parts = explode(' ', $dataString);
foreach (array_keys($result) as $group) {
  $normalizedGroup = str_replace(' ', '_', $group);
  foreach ($parts as $part) {
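    // The "?" makes the group's final letter (the plural "s") optional,
    // so both "cat1" and "cats6" match the "cats" group
    if (preg_match("/^$normalizedGroup?\d+$/", $part)) {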
      $result[$group][] = $part;
    }
  }
}
print_r($result);

Output:

Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
            [3] => cat4
            [4] => cat5
            [5] => cats6
        )

    [dogs] => Array
        (
            [0] => dogs1
            [1] => dogs2
        )

    [monkey creatures] => Array
        (
            [0] => monkey_creature1
        )

    [lemurs] => Array
        (
            [0] => lemur1
            [1] => lemur2
            [2] => lemur3
        )

)

Answer 3 (score: 1)

Here's my $0.50

<?php
$parents = array('cats', 'dogs', 'lemurs', 'monkey creatures');

// Convert all spaces to underscores in parents
$cleaned_parents = array();
foreach ($parents as $parent)
{
    $cleaned_parents[] = str_replace(' ', '_', $parent);
}

$input = 'cats cat1 cat2 cat3 dogs dog1 dog2 monkey creatures monkey_creature1 monkey_creature2 monkey_creature3';

// Change all parents to the "cleaned" versions with underscores
$input = str_replace($parents, $cleaned_parents, $input);

// Make an array of all tokens in the input string
$tokens = explode(' ', $input);
$result = array();

// Loop through all the tokens
$currentParent = null; // Keep track of current parent
foreach ($tokens as $token)
{
    // Is this a parent?
    if (in_array($token, $cleaned_parents))
    {
        // Create the parent in the $result array
        $currentParent = $token;
        $result[$currentParent] = array();
    }
    elseif ($currentParent !== null)
    {
        // Add as child to the current parent
        $result[$currentParent][] = $token;
    }
}

print_r($result);

Output:

Array
(
    [cats] => Array
        (
            [0] => cat1
            [1] => cat2
            [2] => cat3
        )

    [dogs] => Array
        (
            [0] => dog1
            [1] => dog2
        )

    [monkey_creatures] => Array
        (
            [0] => monkey_creature1
            [1] => monkey_creature2
            [2] => monkey_creature3
        )

)
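
The result above is keyed by the underscore form (`monkey_creatures`), while the question wants the heading `monkey creatures`. A possible variation keeps a lookup from the cleaned name back to the original one. This is a sketch under assumptions: `group_by_known_parents` is a hypothetical name, the PHP 7+ type declarations are an assumption about the runtime, and it assumes no parent name occurs as part of an unrelated token.

```php
<?php
// Hypothetical helper: group tokens under known parents, keyed by the original parent name.
function group_by_known_parents(string $input, array $parents): array
{
    // Map the underscore form to the original, e.g. "monkey_creatures" => "monkey creatures"
    $labels = array();
    foreach ($parents as $parent) {
        $labels[str_replace(' ', '_', $parent)] = $parent;
    }

    // Rewrite multi-word parents in the input so that each parent becomes a single token
    $input = str_replace(array_values($labels), array_keys($labels), $input);

    $result = array();
    $current = null; // original name of the parent currently being filled
    foreach (explode(' ', $input) as $token) {
        if (isset($labels[$token])) {
            $current = $labels[$token];
            $result[$current] = array();
        } elseif ($current !== null && $token !== '') {
            $result[$current][] = $token;
        }
    }
    return $result;
}

print_r(group_by_known_parents(
    'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3',
    array('cats', 'dogs', 'monkey creatures', 'lemurs')
));
```

Because membership is decided by position rather than by name, `cats6` and `dogs1` land under the parent that precedes them without any special handling.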

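Answer 4 (score: 1)

I figured I wouldn't be able to submit the best answer, so I decided to compete on the fewest lines instead. (Just kidding; sorry for the extremely dirty code.)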

$string = 'cats cat1 cat2 cat3 cat4 cat5 cats6 dogs dogs1 dogs2 monkey creatures monkey_creature1 lemurs lemur1 lemur2 lemur3';
$categories = array( 'cats', 'dogs', 'monkey creatures', 'lemurs' );

for( $i=0; $i<count( $categories ); $i++ ) $parts[] = @explode( ' ', strstr( $string, $categories[$i] ) );
for( $i=0; $i<count( $parts ); $i++ ) $groups[] = ($i<count($parts)-1) ? array_diff( $parts[$i], $parts[$i+1] ) : $parts[$i];
for( $i=0; $i<count( $groups ); $i++ ) for( $j=0; $j<count( $groups[$i] ); $j++ ) if( ! is_numeric( substr( $groups[$i][$j], -1 ) ) ) unset($groups[$i][$j]);

print_r( $groups );

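You may notice that my approach depends on the elements having a numeric suffix. That is really nonsense, but then so is the input we're dealing with.

My output is: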

Array
(
    [0] => Array
        (
            [1] => cat1
            [2] => cat2
            [3] => cat3
            [4] => cat4
            [5] => cat5
            [6] => cats6
        )

    [1] => Array
        (
            [1] => dogs1
            [2] => dogs2
        )

    [2] => Array
        (
            [2] => monkey_creature1
        )

    [3] => Array
        (
            [1] => lemur1
            [2] => lemur2
            [3] => lemur3
        )

)
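
The groups above are numbered rather than named, and their inner keys have gaps. If named, zero-based arrays are preferred, one possible post-processing step is sketched below. It assumes every category occurs in the string, in the same order as in `$categories`, so that the two arrays line up by position; the sample data is copied from the output above, and `$named` is a name introduced here for illustration.

```php
<?php
$categories = array('cats', 'dogs', 'monkey creatures', 'lemurs');

// Sample data shaped like the output above: numeric outer keys, gaps in the inner keys
$groups = array(
    array(1 => 'cat1', 2 => 'cat2', 3 => 'cat3', 4 => 'cat4', 5 => 'cat5', 6 => 'cats6'),
    array(1 => 'dogs1', 2 => 'dogs2'),
    array(2 => 'monkey_creature1'),
    array(1 => 'lemur1', 2 => 'lemur2', 3 => 'lemur3'),
);

// array_values() reindexes each group from 0;
// array_combine() pairs category names with groups by position
$named = array_combine($categories, array_map('array_values', $groups));

print_r($named);
```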