【Question title】: How to convert the below tuple set into a formatted json using pig?
【Posted】: 2017-03-24 01:34:54
【Question description】:

I have the Pig bags below, which hold two tuples.

{(1,'Kumar','23000','Kandy')}
{(2,'Mahela','34000','Colombo')}

How can I use Pig to convert the above into the following JSON key-value format?

{
   1: 'kumar,23000,kandy',
   2: 'mahela,34000,colombo'
}

Any help would be appreciated. Please note that I need to store the result in a JSON file.

Update, after trying Oliver's answer:

{"fields":[{"name":"group","type":10,"description":"autogenerated from Pig Field Schema","schema":null},{"name":"test_table_columns_needed","type":120,"description":"autogenerated from Pig Field Schema","schema":{"fields":[{"name":null,"type":110,"description":"autogenerated from Pig Field Schema","schema":{"fields":[{"name":"id","type":10,"description":"autogenerated from Pig Field Schema","schema":null},{"name":"name","type":55,"description":"autogenerated from Pig Field Schema","schema":null},{"name":"salary","type":55,"description":"autogenerated from Pig Field Schema","schema":null},{"name":"destination","type":55,"description":"autogenerated from Pig Field Schema","schema":null}],"version":0,"sortKeys":[],"sortKeyOrders":[]}}],"version":0,"sortKeys":[],"sortKeyOrders":[]}}],"version":0,"sortKeys":[],"sortKeyOrders":[]}
{"group":1,"test_table_columns_needed":[{"id":1,"name":"'Kumar'","salary":"'23000'","destination":"'Kandy'"}]}
{"group":2,"test_table_columns_needed":[{"id":2,"name":"'Mahela'","salary":"'34000'","destination":"'Colombo'"}]}

【Comments】:

    Tags: json bash shell apache-pig bigdata


    【Solution 1】:

    Here is one way to do it, based on this input data:

    {(1,'Kumar','23000','Kandy')}
    {(2,'Mahela','34000','Colombo')}
    
    -- Load each line as a single tuple of four chararray fields.
    A = LOAD '/user/root/test1110_1.txt' AS (t1:(ID:chararray, Name:chararray, Sal:chararray, Loc:chararray));
    -- Join Name, Sal and Loc with commas, then strip quotes and parentheses.
    B = FOREACH A GENERATE t1.ID AS ID, REPLACE(CONCAT(t1.Name, ',', t1.Sal, ',', t1.Loc), '[\\\'\\(\\)]', '') AS Val:chararray;
    STORE B INTO 'tuple_table.json' USING JsonStorage();
    
    ([1#Kumar,23000,Kandy])
    ([2#Mahela,34000,Colombo])
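
    The REPLACE call above relies on a small character class that matches single quotes and parentheses. As a rough illustration only, the same class can be tried out in Python; Pig uses Java regular expressions, but for a class this simple the two engines should behave the same. The helper name `strip_tuple_chars` is hypothetical and not part of the answer.

```python
import re


def strip_tuple_chars(text):
    """Remove single quotes and parentheses, like the Pig REPLACE call."""
    # ['()] is the unescaped form of the class used in the Pig script.
    return re.sub(r"['()]", "", text)


cleaned = strip_tuple_chars("('Kumar','23000','Kandy')")
print(cleaned)
```

    The commas between the fields are left untouched, which is why the concatenated value keeps its comma-separated shape.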
    

    【Comments】:

      【Solution 2】:
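      -- PigStorage() splits fields on tabs by default; pass a delimiter if your data differs.
      tuple_entry = LOAD 'tuple_array'
          USING PigStorage()
          AS (ref:int, name:chararray, amount:int, surname:chararray);

      -- Write each record as a JSON object keyed by field name.
      STORE tuple_entry
          INTO 'tuple_table.json'
          USING JsonStorage();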
      

      Modify the .pig_schema file to match your tuple input.

      【Comments】:

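      • JsonStorage stores JSON with the column names as keys and the data as values. But I want the output JSON to look like the one above. Could you explain how to do that? I'm new to this, so I'm not very familiar with it.
      • You just need to run 'DUMP tuple_entry;'
      • I edited the question with the output I got after using JsonStorage; it does include the column names and meta information :(
      • Where can I find that file? Could you help with an example? :( With the schema I can only specify the data types, no? Not the output format.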
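
      As the comments note, JsonStorage writes one JSON object per record, keyed by field name, so the single id-to-string object asked for in the question may be easier to build in a post-processing step outside Pig. The sketch below is one possible approach in Python and is not part of either answer: the function name `collapse_rows` is hypothetical, and the sample rows and the `test_table_columns_needed` field name are copied from the output shown in the question. JSON object keys must be strings, so the ids are written as "1" and "2".

```python
import json

# Data rows copied from the JsonStorage output shown in the question.
SAMPLE = """
{"group":1,"test_table_columns_needed":[{"id":1,"name":"'Kumar'","salary":"'23000'","destination":"'Kandy'"}]}
{"group":2,"test_table_columns_needed":[{"id":2,"name":"'Mahela'","salary":"'34000'","destination":"'Colombo'"}]}
"""


def collapse_rows(lines, bag_field="test_table_columns_needed", key_field="id"):
    """Merge JsonStorage rows into one {id: "name,salary,destination"} dict."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        row = json.loads(line)
        # Each row carries a bag (a JSON array) of tuples under bag_field.
        for record in row[bag_field]:
            # Keep every field except the key, dropping the literal quotes.
            values = [str(v).strip("'") for k, v in record.items() if k != key_field]
            result[str(record[key_field])] = ",".join(values)
    return result


merged = collapse_rows(SAMPLE.splitlines())
print(json.dumps(merged, indent=3))
```

      In practice the lines would come from the part files that Pig writes under the output directory rather than from an inline string. Writing `json.dumps(merged)` to a file then gives the single JSON document the question asks for.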