Question

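I'm building a basic search feature for my framework and am looking for advice on the best way to display the search results (Google style). My MySQL query returns different pages depending on the search query. The results coming back from MySQL are perfect; I just need help with the following: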

For example, someone might search for the term "Hello World". My search will return all rows that contain "hello" and "world".

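What I'm trying to achieve is:

Highlight the words from the search query, but show only part of the result. I'd like to return just 200 characters and highlight (in bold) the first occurrence of any of the words in the search term.
The copy being displayed is created in a CMS and contains HTML markup. I can strip the HTML tags before displaying it, but I'd like feedback on whether I'm going about this the right way.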

The code I'm currently using is:

  // The query string:
  <?php $q = urldecode($_GET['qString']); ?>

  // Run a loop through the results:
  <?php foreach ($this->get("pageResults") AS $result): ?>
      // a clickable H3 to the actual page:
      <h3><?= $this->html->link($result['sub_heading'] . " " . $result['heading'], array("controller" => "pages", "action" => "viewer", "properties" => array($result['name']))) ?></h3>
      <?php
      // Strip all HTML tags, as the content comes from a WYSIWYG editor:
      $value = preg_replace('/<[^>]*>/', '', $result['content']);
      // Find the position within the text:
      $position = stripos($value, $q);
      // If a positive position, display 200 characters and start -100 from the first occurrence
      if ($position == true) {
           $string = substr($value, $position - 100, 200);
      } else {
           $string = " ... ";
      }
      ?>
      <p><?= $string ?></p>
      <hr />
 <?php endforeach; ?>

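The main problems I'm running into are:

The search returns rows even when the query string is not an exact match (so a row is returned if the column contains "hello" and "world"), whereas stripos will only find "hello world".
I don't know the best way to wrap <strong></strong> tags around the first occurrence of a word or phrase in the stripped HTML. I know this could be tricky, especially where occurrences are concerned. I can live without this feature, but if there's a neat way to do it, that would be great :)

Any ideas are much appreciated!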

Answer 1

I suggest you read up on Natural Language Full-Text Searches.

In my opinion, this is the most optimized way to implement a search feature.
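
To make that suggestion concrete, here is a minimal sketch of what a natural-language full-text query might look like when built from PHP. The table name `pages`, the columns `heading` and `content`, and the helper `buildFullTextSearch` are hypothetical, and the query assumes a FULLTEXT index already exists on exactly those columns.

```php
<?php
// Hypothetical helper: builds a MySQL natural-language full-text query.
// Assumes a FULLTEXT index covering exactly the listed columns, e.g.
//   ALTER TABLE pages ADD FULLTEXT (heading, content);
function buildFullTextSearch(string $table, array $columns): string
{
    $cols = implode(', ', $columns);
    // The same MATCH ... AGAINST expression serves as the relevance score
    // and as the filter. Each ? is a bound parameter, never concatenated input.
    return "SELECT *, MATCH ($cols) AGAINST (? IN NATURAL LANGUAGE MODE) AS score "
         . "FROM $table "
         . "WHERE MATCH ($cols) AGAINST (? IN NATURAL LANGUAGE MODE) "
         . "ORDER BY score DESC";
}

$sql = buildFullTextSearch('pages', ['heading', 'content']);
echo $sql, PHP_EOL;

// With an existing PDO connection (not shown), the search term is bound twice:
// $stmt = $pdo->prepare($sql);
// $stmt->execute(['Hello World', 'Hello World']);
```

Table and column names are interpolated into the SQL here, so they should come from your own code and never from user input; only the search term goes through the bound parameters.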

Answer 2

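The search returns rows even when the query string is not an exact match (so a row is returned if the column contains "hello" and "world"), whereas stripos will only find "hello world".

This may seem like a simplistic answer, but since you're passing the query string through the URL, I assume it looks something like this: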

?searchText=Hello%20World

So you can split the words on the space (using explode) and build an array of positions:

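// Positions of each search word within the stripped content
$positionArray = array();

// $q has already been urldecode()d, so the words are separated by spaces
$qs = explode(' ', $q);

// Strip the HTML tags from the content
$value = preg_replace('/<[^>]*>/', '', $result['content']);

foreach ($qs as $qword) {
    $position = stripos($value, $qword);
    // stripos returns false when the word is not found
    if ($position !== false) {
        array_push($positionArray, $position);
    }
}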

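So now you have an array of the positions at which your words appear in the result:

$positionArray = [4, 15, 32];

You can then open the relevant highlight tag (strong, or whatever you're using) at those positions and close it at the end of each word. Alternatively, you can record both the start AND end position of each word with something like this: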

foreach ($qs as $qword) {
    $start_position = stripos($value, $qword);
    if ($start_position === false) {
        continue; // word not found in this result
    }
    $end_position = $start_position + strlen($qword);
    array_push($positionArray, array('qword' => $qword, 'start_position' => $start_position, 'end_position' => $end_position));
}

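Unfortunately I don't have time right now to work out how to insert the tags at those positions; I'm sure you'll figure it out (you could use something like substr_replace). I hope this gives you some ideas anyway.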
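
One possible way to do the insertion with substr_replace is sketched below. The helper name `highlightFirstOccurrences` is hypothetical, and the sketch assumes the matched words do not overlap one another in the text. It works from the last match back to the first, so the offsets of earlier matches are not shifted by tags that have already been inserted.

```php
<?php
// Hypothetical helper: wraps the first occurrence of each word in <strong> tags.
// Assumes the matches do not overlap one another.
function highlightFirstOccurrences(string $text, array $words): string
{
    $matches = [];
    foreach ($words as $word) {
        if ($word === '') {
            continue; // an empty needle is never a useful match
        }
        $start = stripos($text, $word);
        if ($start !== false) {
            $matches[$start] = strlen($word); // key = start offset, value = length
        }
    }

    // Insert from the last match to the first so that earlier offsets
    // stay valid after each replacement.
    krsort($matches);
    foreach ($matches as $start => $length) {
        $original = substr($text, $start, $length); // keeps the original casing
        $text = substr_replace($text, "<strong>$original</strong>", $start, $length);
    }
    return $text;
}

echo highlightFirstOccurrences('Say hello to the whole world', ['Hello', 'World']);
// prints: Say <strong>hello</strong> to the whole <strong>world</strong>
```

stripos, strlen and substr_replace all work on byte offsets, so they stay consistent with each other, but case-insensitive matching of non-ASCII text would need the mb_* equivalents.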

Answer 3

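Here's a fairly simple way to achieve what you're asking for.

First, you haven't said what transformations you perform on the search input, but I'm guessing you split it into words and do a case-insensitive search. So I would create a data structure containing the original search string and a parsed version that has been split up and lowercased: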

// $input is your sanitised query
$arr = explode(" ", strtolower($input));

$search_arr = [
    'original' => $input,
    'parsed' => $arr
];

Now, to process the results from the database. Let's call the result from the database $text.

# strip the html tags
$stripped = strip_tags($text);

# first, see if the original search query is in the page
$pos = stripos($stripped, $search_arr['original']);
if ($pos !== false) {
    # if it is, take a 200 character snippet of the page (note that
    # if the search string occurs earlier than the first 50 characters,
    # we just take the first 200 characters of the page [I used 50 rather
    # than 100 as 100 seemed too many]):
    if ($pos < 50) {
        $stripped = substr($stripped, 0, 200);
    }
    else {
        $stripped = substr($stripped, $pos-50, 200);
    }
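    # use a regular expression to enclose the search string in a <strong> tag
    # (preg_quote escapes regex metacharacters; the parentheses create the
    # capture group that $1 refers to)
    $quoted = preg_quote($search_arr['original'], '/');
    $stripped = preg_replace("/($quoted)/i", "<strong>$1</strong>", $stripped);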
}
else {
    # otherwise, for each word in the parsed version of the search query...
    foreach ($search_arr['parsed'] as $s) {
        # skip empty words left over from repeated spaces
        if ($s === '') { continue; }
        # surround it with <> and </> (I'm doing this in case part of the query
        # matches within the <strong> tag - of course, if <> and </> appear in
        # the source text, this could be a problem!)
        $stripped = preg_replace('/(' . preg_quote($s, '/') . ')/i', "<>$1</>", $stripped);
    }
    # now replace the <> and </> with strong tags
    $find = [ '<>', '</>'];
    $replace = ['<strong>', '</strong>'];

    $stripped = str_replace($find, $replace, $stripped);

    # find the first <strong> tag...
    $pos = strpos($stripped, "<strong>");
    if ($pos < 50) {
        $stripped = substr($stripped, 0, 200);
    }
    else {
        $stripped = substr($stripped, $pos-50, 200);
    }
}

echo $stripped;

This is fairly rough and you'll probably want to improve a few things, but it should give you an idea of how to proceed.
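
As one possible refinement, the sketch below cuts the excerpt first and adds the tags afterwards, so a <strong> tag can never be truncated at the 200-character boundary, and it escapes the text for HTML output. It is a variation on the approach above, not the same code. The function name `makeSnippet` and its parameters are hypothetical, it needs PHP 7.4 or later for the arrow function, and it highlights every occurrence of each word inside the excerpt.

```php
<?php
// Hypothetical helper: returns an HTML-safe excerpt with the search words in bold.
function makeSnippet(string $html, string $query, int $length = 200, int $lead = 50): string
{
    $text = strip_tags($html);

    // Split the query into non-empty words.
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return htmlspecialchars(substr($text, 0, $length));
    }

    // One case-insensitive pattern that matches any of the words.
    $quoted = array_map(fn(string $w): string => preg_quote($w, '/'), $words);
    $pattern = '/(' . implode('|', $quoted) . ')/i';

    // Start the excerpt $lead bytes before the first match, if there is one.
    $start = 0;
    if (preg_match($pattern, $text, $m, PREG_OFFSET_CAPTURE)) {
        $start = max(0, $m[0][1] - $lead);
    }
    $excerpt = substr($text, $start, $length);

    // Split on the matches while keeping them: with a single capture group,
    // the odd-numbered pieces are the matched words.
    $pieces = preg_split($pattern, $excerpt, -1, PREG_SPLIT_DELIM_CAPTURE);
    $out = '';
    foreach ($pieces as $i => $piece) {
        $escaped = htmlspecialchars($piece);
        $out .= ($i % 2 === 1) ? "<strong>$escaped</strong>" : $escaped;
    }
    return $out;
}

echo makeSnippet('<p>Say <em>hello</em> to the world</p>', 'Hello World');
// prints: Say <strong>hello</strong> to the <strong>world</strong>
```

The offsets here are in bytes, so an excerpt of multibyte text can be cut in the middle of a character; mb_substr with character offsets would be needed to avoid that.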

PHP search results display
