【问题标题】:java.lang.RuntimeException: scala.collection.immutable.$colon$colon is not a valid external type for schema of struct<513:int,549:int>java.lang.RuntimeException:scala.collection.immutable.$colon$colon 不是 struct<513:int,549:int> 架构的有效外部类型
【发布时间】:2019-12-21 18:45:28
【问题描述】:

我想用这个架构创建一个数据框:

 |-- Col1 : string (nullable = true)
 |-- Col2 : string (nullable = true)
 |-- Col3 : struct (nullable = true)
 |    |-- 513: long (nullable = true)
 |    |-- 549: long (nullable = true)

代码:

val someData = Seq(
      Row("AAAAAAAAA", "BBBBB", Seq(513, 549))
    )

val col3Fields = Seq[StructField](StructField.apply("513",IntegerType, true), StructField.apply("549",IntegerType, true))

val someSchema = List(
  StructField("Col1", StringType, true),
  StructField("Col2", StringType, true),
  StructField("Col3", StructType.apply(col3Fields), true)
)

val someDF = spark.createDataFrame(
  spark.sparkContext.parallelize(someData),
  StructType(someSchema)
)

someDF.show

但是someDF.show 抛出:

错误执行程序:阶段 0.0 (TID 0) 中任务 0.0 中的异常 java.lang.RuntimeException:编码时出错: java.lang.RuntimeException: scala.collection.immutable.$colon$colon 是 不是 struct 架构的有效外部类型 if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null 否则 staticinvoke(类 org.apache.spark.unsafe.types.UTF8String, 字符串类型,来自字符串, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 0, Col1), StringType), true, false) AS Col1#0 if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else staticinvoke(class org.apache.spark.unsafe.types.UTF8String,StringType,fromString, validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 1, Col2), StringType), true, false) AS Col2#1 if (assertnotnull(input[0, org.apache.spark.sql.Row, true]).isNullAt) null else named_struct(513, if (validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549,IntegerType,true)).isNullAt) null else validateexternaltype(getexternalrowfield(validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549,IntegerType,true)), 0, 513), IntegerType), 549, 如果 (validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549,IntegerType,true)).isNullAt) null else validateexternaltype(getexternalrowfield(validateexternaltype(getexternalrowfield(assertnotnull(input[0, org.apache.spark.sql.Row, true]), 2, Col3), StructField(513,IntegerType,true), StructField(549,IntegerType,true)), 1, 549), IntegerType)) AS Col3#2 在 org.apache.spark.sql.catalyst.encoders.ExpressionEncoder.toRow(ExpressionEncoder.scala:291)

编辑:

513 和 549 应该是子列名而不是值。这是我期望的输出示例:

someDF.select("Col1","Col2","Col3.*").show

+-----------+--------+------+------+
|       Col1|    Col1|   513|   549|
+-----------+--------+------+------+
| AAAAAAAAA |  BBBBB |    39|    38|
+-----------+--------+------+------+

【问题讨论】:

    标签: java scala apache-spark


    【解决方案1】:

    你拥有的数据和你拥有的Schema不一样, 您要创建的架构在这里是您如何创建的

    val schema = StructType(
      Seq(
        StructField("col1", StringType, true),
        StructField("col2", StringType, true),
        StructField("col3", StructType(
          Seq(
            StructField("513", LongType, true),
            StructField("549", LongType, true)
          ))
        )
      )
    )
    

    架构:

    root
     |-- col1: string (nullable = true)
     |-- col2: string (nullable = true)
     |-- col3: struct (nullable = true)
     |    |-- 513: long (nullable = true)
     |    |-- 549: long (nullable = true)
    

    这会给你你想要的架构

    您可以获取如下数据并应用架构

    val someData = Seq(
      Row("AAAAAAAAA", "  BBBBB", Row(39l, 38l))
    )
    
    val someDF = spark.createDataFrame(
      spark.sparkContext.parallelize(someData), schema
    )
    
    df.select("Col1","Col2","Col3.*").show 
    

    输出:

    +---------+-------+---+---+
    |     Col1|   Col2|513|549|
    +---------+-------+---+---+
    |AAAAAAAAA|  BBBBB| 39| 38|
    +---------+-------+---+---+
    

    【讨论】:

    • 非常感谢您的帮助。虽然,这不是我所需要的。我已经编辑了我的问题以更好地解释我的情况。请看一看。
    猜你喜欢
    • 1970-01-01
    • 1970-01-01
    • 1970-01-01
    • 2019-03-22
    • 1970-01-01
    • 2013-01-24
    • 1970-01-01
    • 1970-01-01
    • 2014-10-20
    相关资源
    最近更新 更多