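【Posted】: 2015-08-28 10:40:43
【Question】:
I have a Spark (version 1.3.1) application. In it, I am trying to convert a Java bean RDD, JavaRDD<Message>, into a DataFrame. The bean has many fields of different data types (Integer, String, List, Map, Double).
But when I execute my code: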
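messages.foreachRDD(new Function2<JavaRDD<Message>, Time, Void>() {
    @Override
    public Void call(JavaRDD<Message> arg0, Time arg1) throws Exception {
        SQLContext sqlContext = SparkConnection.getSqlContext();
        // Infers the schema from the Message bean; per the trace, this call fails
        DataFrame df = sqlContext.createDataFrame(arg0, Message.class);
        df.registerTempTable("messages");
        // ... (rest of the method is not shown in the original post)
        return null;
    }
});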
I get this error:
15/06/12 17:27:40 INFO JobScheduler: Starting job streaming job 1434110260000 ms.0 from job set of time 1434110260000 ms
15/06/12 17:27:40 ERROR JobScheduler: Error running job streaming job 1434110260000 ms.1
scala.MatchError: interface java.util.List (of class java.lang.Class)
at org.apache.spark.sql.SQLContext$$anonfun$getSchema$1.apply(SQLContext.scala:1193)
at org.apache.spark.sql.SQLContext$$anonfun$getSchema$1.apply(SQLContext.scala:1192)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.mutable.ArrayOps$ofRef.map(ArrayOps.scala:108)
at org.apache.spark.sql.SQLContext.getSchema(SQLContext.scala:1192)
at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:437)
at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:465)
【Discussion】:
Tags: java scala apache-spark spark-streaming apache-spark-sql
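The trace suggests that SQLContext.getSchema walks the bean's properties and pattern-matches each property's Java type, and that the match has no case for java.util.List. The snippet below is a minimal plain-Java sketch of that idea, not Spark's actual code. The Message bean and its field names are hypothetical stand-ins for the one in the question, and FLAT_TYPES is only an assumed approximation of the types the bean-based schema inference accepted in 1.3.x. It uses java.beans.Introspector to list the properties whose types fall outside that set.

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class BeanSchemaCheck {

    // Hypothetical stand-in for the Message bean from the question (field names invented).
    public static class Message {
        private int id;
        private String body;
        private double score;
        private List<String> tags;
        private Map<String, String> headers;

        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
        public String getBody() { return body; }
        public void setBody(String body) { this.body = body; }
        public double getScore() { return score; }
        public void setScore(double score) { this.score = score; }
        public List<String> getTags() { return tags; }
        public void setTags(List<String> tags) { this.tags = tags; }
        public Map<String, String> getHeaders() { return headers; }
        public void setHeaders(Map<String, String> headers) { this.headers = headers; }
    }

    // Assumption: a rough list of "flat" property types; not copied from Spark's source.
    private static final Set<Class<?>> FLAT_TYPES = new HashSet<Class<?>>(Arrays.<Class<?>>asList(
            String.class,
            short.class, int.class, long.class, float.class, double.class,
            byte.class, boolean.class,
            Short.class, Integer.class, Long.class, Float.class, Double.class,
            Byte.class, Boolean.class,
            java.math.BigDecimal.class, java.sql.Date.class, java.sql.Timestamp.class));

    // Returns "name: type" for every bean property whose type is not in FLAT_TYPES, sorted by name.
    public static List<String> unsupportedProperties(Class<?> beanClass) {
        List<String> result = new ArrayList<String>();
        try {
            for (PropertyDescriptor pd : Introspector.getBeanInfo(beanClass).getPropertyDescriptors()) {
                if ("class".equals(pd.getName())) {
                    continue; // getClass() shows up as a bean property; skip it
                }
                Class<?> type = pd.getPropertyType();
                if (!FLAT_TYPES.contains(type)) {
                    result.add(pd.getName() + ": " + type);
                }
            }
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Could not introspect " + beanClass, e);
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        // Prints:
        //   headers: interface java.util.Map
        //   tags: interface java.util.List
        for (String line : unsupportedProperties(Message.class)) {
            System.out.println(line);
        }
    }
}
```

Note that the type is printed as "interface java.util.List", the same value the MatchError reports. If that diagnosis holds, one workaround on 1.3.1 is to skip bean inference: map each Message to a Row and call sqlContext.createDataFrame(JavaRDD<Row>, StructType) with an explicitly built schema. As far as I know, later Spark releases extended the bean inference to cover List and Map properties, so upgrading may also resolve it.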