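【Question title】: Unable to cast columns in Hive
【Posted】: 2018-09-13 23:45:35
【Question description】:

I loaded a CSV file into a Hive table using a SerDe. As usual, it created every column as type string. But when I try to cast the columns to their proper data types, it throws an error, specifically when casting a string column to an array type.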

describe table ted; 
comments string from deserializer
description string from deserializer
duration string from deserializer
speaker string from deserializer
occupation string from deserializer
tags string from deserializer
views string from deserializer

create table tedx as select
cast(comments as int) as comments,
cast(description as string) as desc,
cast(duration as int) as duration,
cast(speaker as string) as speaker,
cast(occupation as string) as occupation,
cast(tags as array) as tags,
cast(views as int) as views,
from ted;

FAILED: ParseException line 7:13 cannot recognize input near 'array'

How can I convert the tags column from string type to array type?

【Comments】:

    Tags: csv hadoop hive hive-serde


    【Solution 1】:

    To convert a string to an array, use split(string str, string pat), which splits str around pat (pat is a regular expression).

    Demo:

    hive> select split('1,2,3',',');
    OK
    ["1","2","3"]
    Time taken: 4.691 seconds, Fetched: 1 row(s)
    

    Documentation: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF
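
    Applied to the table from the question, the statement might look like the sketch below. It assumes the tags column holds a plain comma-separated string; the question does not show the actual data, so adjust the pattern if the values use a different delimiter or carry surrounding brackets or quotes. The trailing comma before from is also dropped.

    ```sql
    -- Sketch only: assumes tags is a comma-separated string such as 'a,b,c'.
    create table tedx as
    select
      cast(comments as int) as comments,
      description,                    -- already a string, no cast needed
      cast(duration as int) as duration,
      speaker,                        -- already a string
      occupation,                     -- already a string
      split(tags, ',') as tags,       -- string -> array<string>
      cast(views as int) as views     -- no trailing comma before from
    from ted;
    ```

    Running describe on tedx afterwards should show tags as array<string>.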

    【Discussion】:
