Spark DataFrame udf: No TypeTag available

Time: 2016-05-30 10:24:52

Tags: scala apache-spark

I am trying to extend a Spark ML pipeline model with a filter transformer, as follows:

abstract class RuleFilter[IN, T <: RuleFilter[IN, T]]
    extends RuleTransformer with HasInputCol  {
  // def filterFuntion: String
  /** @group setParam */
  def setInputCol(value: String): T = set(inputCol, value).asInstanceOf[T]

  protected def createFilterFunc: IN => Boolean

  override def transform(df: DataFrame): DataFrame = {
    transformSchema(df.schema, logging = true)
    val transformUDF = udf[Boolean, IN](this.createFilterFunc)
    df.filter(transformUDF(df($(inputCol))))
  }
}

This code fails to compile with the error:

 No TypeTag available for IN
[error]     val transformUDF = udf[Boolean, IN](this.createFilterFunc)

How can I make this work?

I need subclasses to be able to fix IN to a concrete, well-defined type, for example:

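// IN is fixed to Double; the second type argument is the subclass itself.
class PriceFilter extends RuleFilter[Double, PriceFilter] {
  // `val` is a reserved word in Scala, so the parameter is named `price`.
  protected def createFilterFunc: Double => Boolean = price => price > 500
}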

1 answer:

Answer 0: (score: 1)

You need to tell the compiler explicitly that you want a TypeTag for the type In:

import scala.reflect.runtime.universe._
abstract class RuleFilter[In: TypeTag, T <: RuleFilter[In, T]]
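The reason this helps is that Spark's udf takes an implicit TypeTag for its result type and for each argument type, and inside a generic class the compiler cannot produce one for an unconstrained type parameter. A minimal sketch of the mechanism without any Spark dependency follows, assuming Scala 2 with scala-reflect on the classpath. The names TypeTagSketch, describeUdf, Filter and signature are hypothetical and exist only for illustration; describeUdf merely mimics the shape of udf's type parameters.

```scala
import scala.reflect.runtime.universe._

object TypeTagSketch {
  // Hypothetical stand-in for Spark's `udf`: it asks for an implicit TypeTag
  // for the result type RT and for the argument type A1.
  def describeUdf[RT: TypeTag, A1: TypeTag](f: A1 => RT): String =
    s"${typeOf[A1]} => ${typeOf[RT]}"

  // The context bound `In: TypeTag` adds an implicit TypeTag[In] constructor
  // parameter, so the tag is in scope wherever `In` is used inside the class.
  // Without the bound, the call to describeUdf below would not compile
  // ("No TypeTag available for In").
  abstract class Filter[In: TypeTag] {
    protected def createFilterFunc: In => Boolean
    def signature: String = describeUdf[Boolean, In](createFilterFunc)
  }

  // The subclass fixes In = Double; the compiler supplies TypeTag[Double] here.
  class PriceFilter extends Filter[Double] {
    protected def createFilterFunc: Double => Boolean = _ > 500
  }
}
```

Note that context bounds are allowed on classes and abstract classes but not on traits in Scala 2, which is one reason to keep RuleFilter an abstract class if this approach is used.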