【Question title】:Defining a schema programmatically in Java returns nulls?
【Posted】:2018-04-11 10:50:05
【Question】:

I have a JSON data file, and I want to apply a schema to its columns programmatically.

pets.json

{"id":"311","species":"canine","color":"golden","weight":"75","name":"Captain"}
{"id":"928","species":"feline","color":"gray","weight":"8","name":"Oscar"}


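import static org.apache.spark.sql.functions.col;

import org.apache.spark.sql.DataFrameReader;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkSession session = SparkSession.builder().appName("SparkSQLTests").master("local[*]").getOrCreate();
DataFrameReader dataFrameReader = session.read();

// Create the DataFrame, applying the schema returned by buildSchema() (not shown here)
Dataset<Row> pets = dataFrameReader.schema(buildSchema()).json("input/pets.json");

// Print the schema and the first 10 rows
pets.printSchema();
pets.show(10);

// SELECT *
// FROM pets
// WHERE species='canine'
System.out.println("=== Display Canines ===");
pets.filter(col("species").equalTo("canine")).show();

session.stop();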

When I run the program, every column comes back null. What am I doing wrong? Thanks.

root
 |-- id: integer (nullable = true)
 |-- species: string (nullable = true)
 |-- color: string (nullable = true)
 |-- weight: double (nullable = true)
 |-- name: string (nullable = true)

+----+-------+-----+------+----+
|  id|species|color|weight|name|
+----+-------+-----+------+----+
|null|   null| null|  null|null|
|null|   null| null|  null|null|
+----+-------+-----+------+----+

=== Display Canines ===
+---+-------+-----+------+----+
| id|species|color|weight|name|
+---+-------+-----+------+----+
+---+-------+-----+------+----+

【Comments】:

    Tags: apache-spark


    【Solution 1】:

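    It turned out that the numeric values in my JSON data were wrapped in quotes, which broke things. The schema declares id as an integer and weight as a double, so the quoted (string) values did not match those types and the rows came back as nulls. The problem went away when I changed the data to:

    {"id":311,"species":"canine","color":"golden","weight":75,"name":"Captain"}
    {"id":928,"species":"feline","color":"gray","weight":8,"name":"Oscar"}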
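If the file is too large to fix by hand, the same change can be scripted. Below is a minimal sketch that uses only the JDK. It is not from the original answer: the class name `UnquoteNumbers` and the method `unquote` are made up for illustration, and it assumes the numeric fields are `id` and `weight` (taken from the schema printed in the question), that each record sits on one line, and that no string value contains text that looks like one of those key/value pairs.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnquoteNumbers {
    // Matches a listed key followed by a quoted plain JSON number, e.g. "id":"311" or "weight":"7.5".
    // The key list (id, weight) is an assumption based on the schema shown in the question.
    private static final Pattern QUOTED_NUMBER =
            Pattern.compile("\"(id|weight)\"\\s*:\\s*\"(-?(?:0|[1-9]\\d*)(?:\\.\\d+)?)\"");

    /** Returns the line with the quotes removed around the numeric values of the listed keys. */
    public static String unquote(String jsonLine) {
        Matcher m = QUOTED_NUMBER.matcher(jsonLine);
        // $1 is the key, $2 is the number without its surrounding quotes.
        return m.replaceAll("\"$1\":$2");
    }

    public static void main(String[] args) {
        String line = "{\"id\":\"311\",\"species\":\"canine\",\"color\":\"golden\","
                + "\"weight\":\"75\",\"name\":\"Captain\"}";
        // prints {"id":311,"species":"canine","color":"golden","weight":75,"name":"Captain"}
        System.out.println(unquote(line));
    }
}
```

Values that are not plain numbers (for example "007" or "n/a") are left quoted, so they can be reviewed separately rather than being turned into invalid JSON.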

    【Comments】:
