【Posted】: 2016-04-10 01:04:41
【Problem description】:
I am trying to create an empty DataFrame in Spark (PySpark).
I am using an approach similar to the one discussed here, but it does not work.
This is my code:
df = sqlContext.createDataFrame(sc.emptyRDD(), schema)
This is the error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/Me/Desktop/spark-1.5.1-bin-hadoop2.6/python/pyspark/sql/context.py", line 404, in createDataFrame
    rdd, schema = self._createFromRDD(data, schema, samplingRatio)
  File "/Users/Me/Desktop/spark-1.5.1-bin-hadoop2.6/python/pyspark/sql/context.py", line 285, in _createFromRDD
    struct = self._inferSchema(rdd, samplingRatio)
  File "/Users/Me/Desktop/spark-1.5.1-bin-hadoop2.6/python/pyspark/sql/context.py", line 229, in _inferSchema
    first = rdd.first()
  File "/Users/Me/Desktop/spark-1.5.1-bin-hadoop2.6/python/pyspark/rdd.py", line 1320, in first
    raise ValueError("RDD is empty")
ValueError: RDD is empty
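The traceback passes through `_inferSchema`, which suggests Spark tried to infer the schema from the data instead of using the `schema` argument as given. The post does not show how `schema` was defined, so the following is only a simplified pure-Python model of that control flow, not PySpark code. `ExplicitSchema`, `FakeRDD`, `infer_schema` and `create_dataframe` are hypothetical stand-ins, and the assumption that `None` or a bare list of column names triggers inference is mine. In PySpark itself, the explicit-schema role would presumably be played by a `pyspark.sql.types.StructType`.

```python
# Simplified model of the control flow visible in the traceback.
# None of these names belong to PySpark; they only mimic the branch that matters.


class ExplicitSchema:
    """Hypothetical stand-in for a fully specified schema."""

    def __init__(self, names):
        self.names = list(names)


class FakeRDD:
    """Hypothetical stand-in for an RDD, backed by a plain Python list."""

    def __init__(self, rows):
        self.rows = list(rows)

    def first(self):
        # Same failure mode as the last frame of the traceback.
        if not self.rows:
            raise ValueError("RDD is empty")
        return self.rows[0]


def infer_schema(rdd):
    # Inference has to inspect a row, so it cannot work on empty data.
    first = rdd.first()
    return ExplicitSchema("_%d" % (i + 1) for i in range(len(first)))


def create_dataframe(rdd, schema=None):
    # Assumption: None or a bare list/tuple of names sends the call
    # down the inference path; a full schema object skips it.
    if schema is None or isinstance(schema, (list, tuple)):
        inferred = infer_schema(rdd)
        names = list(schema) if schema else inferred.names
        return names, rdd.rows
    return schema.names, rdd.rows


empty = FakeRDD([])

# A bare list of names still triggers inference and fails on empty data.
try:
    create_dataframe(empty, ["id", "name"])
    outcome = "ok"
except ValueError as exc:
    outcome = str(exc)

# A full schema object skips inference, so empty data is accepted.
columns, rows = create_dataframe(empty, ExplicitSchema(["id", "name"]))
```

If this model matches what happened, the thing to check is whether `schema` in the original call was a full schema object or only a list of column names.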
【Discussion】:
Tags: apache-spark pyspark