Check which tag is most relevant to a title

Time: 2017-07-10 12:53:49

Tags: php mysql

There are two database tables, tag and title (in that order):

+-----+----------------+
| id  | tag            |
+-----+----------------+
|  1  | Audi           |
|  2  | BMW            |
|  3  | Volkswagen     |
|  4  | Mercedes Benz  |
+-----+----------------+

+-----+---------------------+------+
| id  | title               | tag  |
+-----+---------------------+------+
|  1  | Audi is a great car | NULL |
+-----+---------------------+------+

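What I need to do:

1. Check which tag is most relevant to the title.

2. Take the most relevant tag and store it in the database next to the title.

What I have done so far: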

function compareStrings($s1, $s2) {
    //one is empty, so no result
    if (strlen($s1)==0 || strlen($s2)==0) {
        return 0;
    }

    //replace non-alphanumeric characters with spaces
    //hyphens are kept in case they are used to join words
    $s1clean = preg_replace("/[^A-Za-z0-9-]/", ' ', $s1);
    $s2clean = preg_replace("/[^A-Za-z0-9-]/", ' ', $s2);

    //remove double spaces
    while (strpos($s1clean, "  ")!==false) {
        $s1clean = str_replace("  ", " ", $s1clean);
    }
    while (strpos($s2clean, "  ")!==false) {
        $s2clean = str_replace("  ", " ", $s2clean);
    }

    //create arrays
    $ar1 = explode(" ",$s1clean);
    $ar2 = explode(" ",$s2clean);
    $l1 = count($ar1);
    $l2 = count($ar2);

    //flip the arrays if needed so ar1 is always largest.
    if ($l2>$l1) {
        $t = $ar2;
        $ar2 = $ar1;
        $ar1 = $t;
    }

    //flip array 2, to make the words the keys
    $ar2 = array_flip($ar2);


    $maxwords = max($l1, $l2);
    $matches = 0;

    //find matching words
    foreach($ar1 as $word) {
        if (array_key_exists($word, $ar2))
            $matches++;
    }

    return ($matches / $maxwords) * 100;    
}


$all_values = '';

$sql_object = "SELECT * FROM tag";
$result_object = mysql_query($sql_object);
while($row_object = mysql_fetch_array($result_object))
{
    $tag = $row_object['tag'];

    $sql_subject = "SELECT * FROM title ORDER BY added";
    $result_subject = mysql_query($sql_subject);
    while($row_subject = mysql_fetch_array($result_subject))
    {
        $title = $row_subject['title'];

        $all_values .= "Title($title) and Tag($tag) relevancy:". compareStrings($tag, $title) . "%"."<br/>";
    }
}
echo $all_values;

Output:

Title(Audi is a great car) and Tag(Audi) relevancy:20%
Title(Audi is a great car) and Tag(BMW) relevancy:0%
Title(Audi is a great car) and Tag(Volkswagen) relevancy:0%
Title(Audi is a great car) and Tag(Mercedes Benz) relevancy:0%
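
For reference, the 20% on the first line comes from one matching word divided by the word count of the longer string (five words in the title). A minimal sketch of that arithmetic, using array_intersect in place of the array_flip lookup purely for brevity (an assumption, not the code above):

```php
<?php
// Same inputs as the first output line.
$titleWords = explode(' ', 'Audi is a great car'); // 5 words
$tagWords   = explode(' ', 'Audi');                // 1 word

// Words of the title that also appear in the tag.
$matches = count(array_intersect($titleWords, $tagWords));

// compareStrings divides by the larger word count, so long titles pull the score down.
$score = $matches / max(count($titleWords), count($tagWords)) * 100;
```

Because the divisor is the larger word count, a one-word tag can never score above 100 / (number of words in the title), even on an exact word match.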

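The question is: how can I extract the most relevant tag from $all_values and insert it into the database? This is where I am stuck. Or maybe there is a better solution. I would appreciate any help.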
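
One possible approach, sketched under assumptions rather than offered as a definitive answer: instead of building the $all_values string and parsing it afterwards, keep track of the best-scoring tag while looping, then write it back with a prepared statement. In the sketch below, tagRelevancy, findBestTag and assignBestTags are hypothetical names; tagRelevancy is a stand-in for compareStrings that matches case-insensitively and divides by the tag's word count; PDO replaces the mysql_* functions (which were removed in PHP 7); and the tag column of the title table is assumed to hold the tag text.

```php
<?php
// Hypothetical scorer: percentage of the tag's words that occur in the title.
// Assumption: matching is case-insensitive and on whole words.
function tagRelevancy(string $tag, string $title): float
{
    $split = function (string $s): array {
        // Lowercase, then split on anything that is not a letter, digit or hyphen.
        $words = preg_split('/[^a-z0-9-]+/', strtolower($s), -1, PREG_SPLIT_NO_EMPTY);
        return array_unique($words);
    };
    $tagWords = $split($tag);
    if (count($tagWords) === 0) {
        return 0.0;
    }
    $common = array_intersect($tagWords, $split($title));
    return count($common) / count($tagWords) * 100;
}

// Returns the best-scoring tag row (['id' => ..., 'tag' => ...]),
// or null when no tag shares a word with the title.
function findBestTag(string $title, array $tags): ?array
{
    $best = null;
    $bestScore = 0.0;
    foreach ($tags as $row) {
        $score = tagRelevancy($row['tag'], $title);
        if ($score > $bestScore) {
            $bestScore = $score;
            $best = $row;
        }
    }
    return $best;
}

// Assumption: $pdo is an open PDO connection to the database holding both tables.
function assignBestTags(PDO $pdo): void
{
    // Load the tags once instead of re-querying inside a nested loop.
    $tags   = $pdo->query('SELECT id, tag FROM tag')->fetchAll(PDO::FETCH_ASSOC);
    $titles = $pdo->query('SELECT id, title FROM title')->fetchAll(PDO::FETCH_ASSOC);
    $update = $pdo->prepare('UPDATE title SET tag = :tag WHERE id = :id');
    foreach ($titles as $row) {
        $best = findBestTag($row['title'], $tags);
        if ($best !== null) {
            $update->execute([':tag' => $best['tag'], ':id' => $row['id']]);
        }
    }
}
```

With the sample data, findBestTag('Audi is a great car', $tags) returns the Audi row, and titles that match no tag keep their NULL. To keep the original scoring, compareStrings($row['tag'], $title) can be called in place of tagRelevancy inside findBestTag.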

0 answers:

No answers