优化字符串搜索的前缀树

时间:2021-07-13 14:52:38

标签: algorithm scala tree prefix-tree

我正在考虑改进前缀树。它允许我搜索包含输入字符串的指定数量的单词。

任务:我们需要一个通过子字符串实现公司名称列表的类——从所有可用名称的列表中,输出一定数量的以输入行开头的公司。假设在具有高 RPS(每秒请求数)的网站/移动应用程序上填写表单时将调用该类。

我的解决方案:

 class SuggestService(companyNames : Seq[String]) {

  val arrayCompany = companyNames.toArray
  val tree = getTree(Ternary.apply, 0)

  def getTree(tree: Ternary, index: Int): Ternary = {
    if(index == arrayCompany.length-1)
      return tree.insert(arrayCompany(index))

    getTree(tree.insert(arrayCompany(index)), index+1)
  }

  def suggest(input: String, numberOfSuggest : Int) : Seq[String] = {
    val result = tree.keysWithPrefix(input)
    result.take(numberOfSuggest)
  }
}

树类:

sealed trait Ternary {
  def insert(key: String): Ternary = Ternary.insert(this, key, 0)

  def keysWithPrefix(prefix: String): List[String] = Ternary.keys(this, prefix)
}

case class Node(value: Option[Int], char: Char, left: Ternary, mid: Ternary, right: Ternary) extends Ternary

case object Leaf extends Ternary

object Ternary {
  def apply: Ternary = Leaf

  private def keys(root: Ternary, prefix: String): List[String] =
    get(root, prefix, 0) match {
      case None => Nil
      case Some(node) =>
        collect(node, prefix.dropRight(1))
    }

  private def collect(node: Ternary, prefix: String): List[String] =
    node match {
      case Leaf => Nil
      case node: Node if node.value.isDefined =>
        (prefix + node.char) +: (collect(node.left, prefix) ++ collect(node.mid, prefix + node.char) ++ collect(node.right, prefix))
      case node: Node =>
        collect(node.left, prefix) ++ collect(node.mid, prefix + node.char) ++ collect(node.right, prefix)
    }

  private def get(root: Ternary, prefix: String, step: Int): Option[Ternary] = root match {
    case Leaf => None
    case node: Node if node.char > prefix.charAt(step) => get(node.left, prefix, step)
    case node: Node if node.char < prefix.charAt(step) => get(node.right, prefix, step)
    case node: Node if step < prefix.length - 1 => get(node.mid, prefix, step + 1)
    case node: Node => Some(node)
  }

  private def insert(root: Ternary, key: String, step: Int): Ternary = root match {
    case Leaf =>
      val node = Node(None, key.charAt(step), Leaf, Leaf, Leaf)
      insert(node, key, step)

    case node: Node if node.char > key.charAt(step) =>
      val left = insert(node.left, key, step)
      node.copy(left = left)

    case node: Node if node.char < key.charAt(step) =>
      val right = insert(node.right, key, step)
      node.copy(right = right)

    case node: Node if step < key.length - 1 =>
      val mid = insert(node.mid, key, step + 1)
      node.copy(mid = mid)

    case node: Node =>
      node.copy(value = Some(0))
  }
}

该解决方案运行良好,并且似乎有望非常有效地工作,但我对它的不满程度已经足够了。 根据条件,我们必须返回数量等于 numberOfSuggest 的单词列表。

并且我强制树返回包含输入的所有单词。然后我才从结果列表中获取所需数量的单词:

def suggest(input: String, numberOfSuggest : Int) : Seq[String] = {
    val result = tree.keysWithPrefix(input)
    result.take(numberOfSuggest)
  }

我想尽量节省时间,教树返回一个现成的受numberOfSuggest数量限制的单词列表。

实验:https://scastie.scala-lang.org/m0MxnlChT0GkpGJIBnNnUQ

0 个答案:

没有答案
相关问题