Dataset.map() throws ClassCastException

Asked: 2019-06-18 14:00:32

Tags: java apache-spark-sql classcastexception

I am trying to use the map function to iterate over a Dataset, returning each element without making any changes, and then call collect. I get a java.lang.ClassCastException. What am I missing?

import org.apache.spark.sql.{Dataset, Encoders}

def fun(): Unit = {
   // toDF() requires spark.implicits._ to be in scope
   val df = Seq(Person("Max", 33),
                Person("Adam", 32),
                Person("Muller", 62)).toDF()

   val encoderPerson = Encoders.product[Person]

   val personDS: Dataset[Person] = df.as[Person](encoderPerson)

   // identity map: return each element unchanged
   val newPersonDS = personDS.map { iter2 => iter2 }

   newPersonDS.collect()
}


case class Person(name: String, age: Int)

java.lang.ClassCastException: com.query.Person cannot be cast to com.query.Person
    at com.query.MyClass$$anonfun$1.apply(MyClass.scala:42)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.mapelements_doConsume_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.deserializetoobject_doConsume_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.agg_doAggregateWithKeysOutput_0$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10$$anon$1.hasNext(WholeStageCodegenExec.scala:614)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:253)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:247)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:830)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
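
For reference, below is a minimal, self-contained sketch of the same flow that can be compiled and run as a standalone application. The SparkSession setup, the MapRepro object, and the local[*] master are illustrative additions that are not part of the original project; Person is deliberately defined at the top level of the file.

import org.apache.spark.sql.{Dataset, Encoders, SparkSession}

// Top-level case class, compiled together with the rest of the application
case class Person(name: String, age: Int)

object MapRepro {
  def main(args: Array[String]): Unit = {
    // Local session used only for this sketch; the names and master are illustrative
    val spark = SparkSession.builder()
      .appName("dataset-map-repro")
      .master("local[*]")
      .getOrCreate()

    import spark.implicits._

    val df = Seq(Person("Max", 33),
                 Person("Adam", 32),
                 Person("Muller", 62)).toDF()

    // Explicit encoder, as in the question; the implicit one from spark.implicits._ would also work
    val encoderPerson = Encoders.product[Person]

    val personDS: Dataset[Person] = df.as[Person](encoderPerson)

    // Identity map: each Person is deserialized, returned unchanged, and re-encoded
    val newPersonDS = personDS.map(p => p)(encoderPerson)

    newPersonDS.collect().foreach(println)

    spark.stop()
  }
}

Run as a standalone application, a setup like this should complete without error. A "Person cannot be cast to Person" message, where the same class name appears on both sides of the cast, usually indicates the class was loaded by two different classloaders (for example when the case class is defined inside another class or an interactive session), although that cannot be confirmed from the snippet alone.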

0 Answers:

No answers