【发布时间】:2015-10-25 09:47:14
【问题描述】:
我正在尝试使用模型进行预测来制作火花流程序,但这样做时出现错误:任务不可序列化。
代码:
val model = sc.objectFile[DecisionTreeModel]("DecisionTreeModel").first()
val parsedData = reducedData.map { line =>
val arr = Array(line._2._1,line._2._2,line._2._3,line._2._4,line._2._5,line._2._6,line._2._7,line._2._8,line._2._9,line._2._10,line._2._11)
val vector = LabeledPoint(line._2._4, Vectors.dense(arr))
model.predict(vector.features))
}
我粘贴错误:
scala> val parsedData = reducedData.map { line =>
| val arr = Array(line._2._1,line._2._2,line._2._3,line._2._4,line._2._5,line._2._6,line._2._7,line._2._8,line._2._9,line._2._10,line._2._11)
| val vector=LabeledPoint(line._2._4, Vectors.dense(arr))
| model.predict(vector.features)
| }
org.apache.spark.SparkException: Task not serializable
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:304)
at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:294)
at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:122)
at org.apache.spark.SparkContext.clean(SparkContext.scala:2030)
at org.apache.spark.streaming.dstream.DStream$$anonfun$map$1.apply(DStream.scala:528)
at org.apache.spark.streaming.dstream.DStream$$anonfun$map$1.apply(DStream.scala:528)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.SparkContext.withScope(SparkContext.scala:709)
at .......
我该如何解决这个问题?
谢谢!
【问题讨论】:
-
你能打印错误堆栈吗?
-
我刚刚粘贴了它;-) 谢谢!
标签: scala serialization apache-spark apache-spark-mllib