Matching/comparing text strings in PHP

Date: 2013-04-02 14:09:14

Tags: php compare matching similarity


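Hi everyone, I want to compare some strings, basically to find out whether a product I have appears in a product feed. Because the sources differ, a perfect (identical) match is not guaranteed. Since product names sometimes have more or fewer characters (iPad white vs. iPad Apple white), I would like to do an approximate match, perhaps something like the fuzzy search (~) in Lucene.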

So far I know of and have used preg_match and levenshtein. Can you recommend any other methods for similarity matching of strings in PHP?

1 answer:

Answer 0: (score: 2)

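You asked whether anyone had ideas on what to use: well, this is an example from the PHP website, but I think it can help you.

(I have modified the code to suit a scenario like the one on your site):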

<?php

$productString = 'Apple white IPOD';

// array of products to check against
$products = array('zen', 'dell laptop', 'apple laptop', 'apple black ipod',
                'apple mini', 'Random product');

// no closest match or shortest distance found yet
$closest  = null;
$shortest = -1;

// loop through products to find the closest product
foreach ($products as $product) {

    // calculate the distance between the input string and the current
    // product; levenshtein() is case-sensitive, so compare lowercased copies
    $lev = levenshtein(strtolower($productString), strtolower($product));

    // check for an exact match (ignoring case)
    if ($lev == 0) {

        // closest product is this one (exact match)
        $closest = $product;
        $shortest = 0;

        // break out of the loop; we've found an exact match
        break;
    }

    // if this distance is not greater than the shortest distance found
    // so far, OR if no shortest distance has been found yet
    // (with <=, a tie goes to the later product in the array)
    if ($lev <= $shortest || $shortest < 0) {
        // set the closest match, and shortest distance
        $closest  = $product;
        $shortest = $lev;
    }
}

echo "Search product: $productString\n";
if ($shortest == 0) {
    echo "Exact match found: $closest\n";
} else {
    echo "Did you mean: $closest?\n";
}

?>

The code above searches the list of products (an array) and finds the closest match. If an exact match is found, that match is used.
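
Since the question asks about other methods, one possible alternative is PHP's built-in similar_text(), which reports a similarity percentage rather than an edit distance. The following is only a minimal sketch: the function name findClosestProduct and the 60% cut-off are assumptions made for illustration, not something from the PHP manual or the question, and the threshold would need tuning against real feed data.

```php
<?php

// Hypothetical helper (name and default threshold are assumptions):
// returns the candidate most similar to $needle, or null if no candidate
// reaches $minPercent similarity.
function findClosestProduct(string $needle, array $candidates, float $minPercent = 60.0): ?string
{
    $best = null;
    $bestPercent = 0.0;
    $needle = strtolower(trim($needle));

    foreach ($candidates as $candidate) {
        // similar_text() stores the similarity percentage (0-100) in $percent
        similar_text($needle, strtolower(trim($candidate)), $percent);

        // keep the candidate only if it passes the threshold and beats
        // the best percentage seen so far
        if ($percent >= $minPercent && $percent > $bestPercent) {
            $best = $candidate;
            $bestPercent = $percent;
        }
    }

    return $best;
}

$products = array('zen', 'dell laptop', 'apple laptop', 'apple black ipod',
                'apple mini', 'Random product');

$match = findClosestProduct('Apple white IPOD', $products);
echo $match === null ? "No close match\n" : "Did you mean: $match?\n";
```

Unlike the levenshtein loop, this version can answer "no match" when nothing is similar enough, which may be useful when a product is simply missing from the feed.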