【Question title】: Custom JSON output in Pig
【Posted】: 2016-05-27 21:01:58
【Question】:

I need custom JSON output.

Source file (abc.txt):

100002030,Tom,peter,eng,block 3, lane 5,california,10021

100003031,Tom,john,doc,block 2, lane 2,california,10021

100004032,Tom,jim,eng,block 1, lane 1,california,10021

100005033,Tom,trek,doc,block 2, lane 2,california,10021

100006034,Tom,peter,eng,block 6, lane 6,california,10021

I did some parsing, and the JSON output is shown below.

Output:

{"rollno":"100002","firstname":"Tom","lastname":"peter","qualification":"eng","Address":"block 3 lane 5 california","zipcode":"10021"}

{"rollno":"100002","firstname":"Tom","lastname":"john","qualification":"doc","Address":"block 2 lane 2 california","zipcode":"10021"}

{"rollno":"100004","firstname":"Tom","lastname":"jim","qualification":"eng","Address":"block 1 lane 1 california","zipcode":"10021"}

{"rollno":"100005","firstname":"Tom","lastname":"trek","qualification":"doc","Address":"block 2 lane 2 california","zipcode":"10021"}

{"rollno":"100006","firstname":"Tom","lastname":"peter","qualification":"eng","Address":"block 6 lane 6 california","zipcode":"10021"}

Now I need to add the file name to each record and append the source name to each field.

Expected output (source = social):

"file":"abc", "data": {"rollno.social":"100002","firstname.social":"Tom","lastname.social":"peter","qualification.social":"eng","Address.social":"block 3 lane 5 california","zipcode.social":"10021"}
"file":"abc", "data": {"rollno.social":"100002","firstname.social":"Tom","lastname.social":"john","qualification.social":"doc","Address.social":"block 2 lane 2 california","zipcode.social":"10021"}
"file":"abc", "data": {"rollno.social":"100004","firstname.social":"Tom","lastname.social":"jim","qualification.social":"eng","Address.social":"block 1 lane 1 california","zipcode.social":"10021"}

-------

Please advise how to achieve this, specifically how to append the source name to every field in the JSON output.

【Question comments】:

    Tags: json apache-pig bigdata


    【Solution 1】:

    If I were you, I would write a UDF that constructs the expected JSON and writes it to disk. https://pig.apache.org/docs/r0.14.0/udf.html#load-store-functions
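The record-building step such a UDF would need can be sketched in plain Python. This is only an illustration of the logic, not Pig's UDF or StoreFunc API: the function name `build_record` and the `FIELD_NAMES` list are hypothetical, the field names are copied from the question's output, and the result is wrapped in outer braces so that each line is valid JSON (the expected output in the question omits them).

```python
import json

# Field names copied from the question's JSON output (assumed fixed order).
FIELD_NAMES = ["rollno", "firstname", "lastname",
               "qualification", "Address", "zipcode"]


def build_record(values, file_name, source):
    """Return one JSON line: {"file": ..., "data": {"<field>.<source>": value}}."""
    if len(values) != len(FIELD_NAMES):
        raise ValueError("expected %d values, got %d"
                         % (len(FIELD_NAMES), len(values)))
    # Suffix every field name with the source, e.g. "rollno" -> "rollno.social".
    data = {}
    for name, value in zip(FIELD_NAMES, values):
        data["%s.%s" % (name, source)] = value
    return json.dumps({"file": file_name, "data": data})


row = ["100002", "Tom", "peter", "eng", "block 3 lane 5 california", "10021"]
print(build_record(row, "abc", "social"))
```

Because the suffixed names are built as plain dictionary keys, the sketch never needs a Pig field alias that contains a dot.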

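    【Comments】:

    • Please suggest an alternative solution that does not use a UDF.
    • I'm not sure whether you can project field names containing '.', such as "rollno.social", because '.' is used to access fields inside a tuple. If you can't project them under that name, I don't know how the standard JSON storage could do what you want. How about post-processing the JSON after it has been stored?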
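The post-processing idea from the last comment could look roughly like the following. It is a minimal sketch that assumes the stored output holds one JSON object per line, as in the question; the helper names `file_label` and `tag_lines` are hypothetical, and deriving "abc" from "abc.txt" by dropping the extension is an assumption about what the asker wants.

```python
import json
import os


def file_label(path):
    """Derive the "file" value from a path, e.g. "/data/abc.txt" -> "abc"."""
    return os.path.splitext(os.path.basename(path))[0]


def tag_lines(lines, source_path, source):
    """Rewrite stored JSON lines, suffixing each key with ".<source>"."""
    label = file_label(source_path)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        data = {}
        for key, value in record.items():
            data[key + "." + source] = value
        yield json.dumps({"file": label, "data": data})


stored = ['{"rollno":"100002","firstname":"Tom","lastname":"peter"}']
for tagged in tag_lines(stored, "abc.txt", "social"):
    print(tagged)
```

Since `tag_lines` accepts any iterable of strings, it can be fed an open file handle for a copy of the stored output as well as an in-memory list.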