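【Posted】:2018-01-07 07:19:25
【Question】:
I want to create and save a table filled with random ints. So far so good, but I don't understand how I can put the multidimensional array tmp into a DataFrame that uses the schema defined at the top.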
import org.apache.spark.sql.types.{StructType, StructField, StringType, IntegerType, DoubleType}
import org.apache.spark.sql.Row
val schema = StructType(
  StructField("rowId", IntegerType, true) ::
  StructField("t0_1", DoubleType, true) ::
  StructField("t0_2", DoubleType, true) ::
  StructField("t0_3", DoubleType, true) ::
  StructField("t0_4", DoubleType, true) ::
  StructField("t0_5", DoubleType, true) ::
  StructField("t0_6", DoubleType, true) ::
  StructField("t0_7", DoubleType, true) ::
  StructField("t0_8", DoubleType, true) ::
  StructField("t0_9", DoubleType, true) ::
  StructField("t0_10", DoubleType, true) :: Nil)
val columnNo = 10
val rowNo = 50
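// tmp is indexed as tmp(column)(row): columnNo columns by rowNo rows
val tmp = Array.ofDim[Double](columnNo, rowNo)
val rnd = new scala.util.Random // one generator reused for every cell
for (r <- 1 to rowNo) {
  for (c <- 1 to columnNo) {
    tmp(c - 1)(r - 1) = rnd.nextDouble
    println("Value of " + c + "/" + r + ":" + tmp(c - 1)(r - 1))
  }
}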
// Problem step: the schema defined above is never applied here
val df = sc.parallelize(tmp).toDF
df.show
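For reference, here is a minimal sketch of one way the array might be loaded with that schema. It is not taken from the thread. It assumes Spark 2.x or later and a locally created `SparkSession`; the names `spark`, `rnd` and `rows`, the `local[*]` master and the app name are illustrative choices. The idea is to transpose the column-major `tmp` into one array per table row, prepend a row id, wrap each row in a `Row`, and pass the result to `createDataFrame` together with the schema.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StructType, StructField, IntegerType, DoubleType}

// Assumption: a local session; in spark-shell a `spark` value already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("random-table")
  .getOrCreate()

val columnNo = 10
val rowNo = 50

// Same schema as in the question, built programmatically: rowId, t0_1 .. t0_10.
val schema = StructType(
  StructField("rowId", IntegerType, true) +:
    (1 to columnNo).map(i => StructField(s"t0_$i", DoubleType, true))
)

// Same layout as in the question: tmp(column)(row).
val rnd = new scala.util.Random
val tmp = Array.fill(columnNo, rowNo)(rnd.nextDouble())

// transpose yields one Array[Double] per table row. Seq[Any] keeps the
// row id an Int so that it matches IntegerType in the schema.
val rows = tmp.transpose.zipWithIndex.map { case (values, idx) =>
  Row.fromSeq(Seq[Any](idx + 1) ++ values.toSeq)
}

val df = spark.createDataFrame(spark.sparkContext.parallelize(rows.toSeq), schema)
df.show(5)
```

Filling the array as `tmp(row)(column)` from the start would make the `transpose` step unnecessary; it is kept here only so that the sketch matches the layout used in the question.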
【Discussion】:
Tags: sql scala apache-spark dataframe spark-dataframe