【Question title】: How to merge arrays from two files into one array with jq?
【Posted】: 2018-02-28 20:25:52
【Question】:

I want to merge two files that contain JSON. Each of them contains an array of JSON objects.

registration.json

[
    { "name": "User1", "registration": "2009-04-18T21:55:40Z" },
    { "name": "User2", "registration": "2010-11-17T15:09:43Z" }
]

useredits.json

[
    { "name": "User1", "editcount": 164 },
    { "name": "User2", "editcount": 150 },
    { "name": "User3", "editcount": 10 }
]

Ideally, the merge would produce the following result:

[
    { "name": "User1", "editcount": 164, "registration": "2009-04-18T21:55:40Z" },
    { "name": "User2", "editcount": 150, "registration": "2010-11-17T15:09:43Z" }
]

I found https://github.com/stedolan/jq/issues/1247#issuecomment-348817802, but when I try it I get

jq: error: module not found: jq

【Comments】:

    Tags: json join jq


    【Solution 1】:

    A jq solution:

    jq -s '[ .[0] + .[1] | group_by(.name)[] 
              | select(length > 1) | add ]' registration.json useredits.json
    

    Output:

    [
      {
        "name": "User1",
        "registration": "2009-04-18T21:55:40Z",
        "editcount": 164
      },
      {
        "name": "User2",
        "registration": "2010-11-17T15:09:43Z",
        "editcount": 150
      }
    ]
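To make the steps of that filter easier to check, here is a minimal Python sketch of the same logic: concatenate the arrays, group by `name`, keep only groups with more than one member, and shallow-merge each group. The function name `inner_merge` and its `key` parameter are illustrative and not part of jq or of the answer. The sketch assumes every object has the key.

```python
from itertools import groupby


def inner_merge(*arrays, key="name"):
    """Rough Python equivalent of:
    jq -s '[ .[0] + .[1] | group_by(.name)[] | select(length > 1) | add ]'
    """
    # Like `.[0] + .[1]`: concatenate the input arrays.
    combined = [obj for array in arrays for obj in array]
    # group_by sorts by the key first; sorted() is stable, so objects from
    # the first file stay ahead of objects from the second within a group.
    combined = sorted(combined, key=lambda obj: obj[key])
    merged = []
    for _, group in groupby(combined, key=lambda obj: obj[key]):
        group = list(group)
        if len(group) > 1:           # select(length > 1)
            result = {}
            for obj in group:        # add: on duplicate keys, later wins
                result.update(obj)
            merged.append(result)
    return merged


registration = [
    {"name": "User1", "registration": "2009-04-18T21:55:40Z"},
    {"name": "User2", "registration": "2010-11-17T15:09:43Z"},
]
useredits = [
    {"name": "User1", "editcount": 164},
    {"name": "User2", "editcount": 150},
    {"name": "User3", "editcount": 10},
]
print(inner_merge(registration, useredits))
```

One consequence of the `length > 1` test, in the jq filter as well as in this sketch: a name that appears twice in the same file is also kept, even if it is missing from the other file.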
    

    【Comments】:

      【Solution 2】:

      Although it does not strictly answer the question, the following command

      jq -s 'flatten | group_by(.name) | map(reduce .[] as $x ({}; . * $x))' \
            registration.json useredits.json
      

      produces this output:

      [
          { "name": "User1", "editcount": 164, "registration": "2009-04-18T21:55:40Z" },
          { "name": "User2", "editcount": 150, "registration": "2010-11-17T15:09:43Z" },
          { "name": "User3", "editcount": 10 }
      ]
      

      Source: jq - error when merging two JSON files "cannot be multiplied"
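The difference from Solution 1 is that no group is filtered out, and each group is folded with jq's `*` operator, which merges objects recursively. A minimal Python sketch of that behaviour follows. `deep_merge` and `outer_merge` are hypothetical names chosen for illustration, and the sketch flattens only one level of nesting, which is all that two arrays of objects need.

```python
from itertools import groupby


def deep_merge(left, right):
    """Recursively merge two dicts, roughly like jq's `*` on objects."""
    result = dict(left)
    for k, v in right.items():
        if isinstance(result.get(k), dict) and isinstance(v, dict):
            result[k] = deep_merge(result[k], v)  # merge nested objects
        else:
            result[k] = v                         # otherwise the right side wins
    return result


def outer_merge(arrays, key="name"):
    """Rough equivalent of:
    flatten | group_by(.name) | map(reduce .[] as $x ({}; . * $x))
    """
    # One-level flatten, then a stable sort by the key, as group_by would do.
    flat = sorted((obj for array in arrays for obj in array),
                  key=lambda obj: obj[key])
    merged = []
    for _, group in groupby(flat, key=lambda obj: obj[key]):
        acc = {}
        for obj in group:            # reduce ... ({}; . * $x)
            acc = deep_merge(acc, obj)
        merged.append(acc)           # every group is kept, even singletons
    return merged


registration = [
    {"name": "User1", "registration": "2009-04-18T21:55:40Z"},
    {"name": "User2", "registration": "2010-11-17T15:09:43Z"},
]
useredits = [
    {"name": "User1", "editcount": 164},
    {"name": "User2", "editcount": 150},
    {"name": "User3", "editcount": 10},
]
print(outer_merge([registration, useredits]))
```

Because singleton groups survive, User3 appears in the result, which is why this behaves like a full outer join rather than the inner join the question asks for.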

      【Comments】:

        【Solution 3】:

        The following assumes you have jq 1.5 or later, and that:

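        • joins.jq, as shown below, is in the ~/.jq/ directory or the ~/.jq/joins/ directory
        • there is no file named joins.jq in the current working directory (pwd)
        • registration.json has been fixed so that it is valid JSON (incidentally, this can be done by jq itself).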

        The invocation to use would be:

        jq -s 'include "joins"; joins(.name)' registration.json useredits.json
        

        joins.jq

        # joins.jq Version 1 (12-12-2017)
        
        def distinct(s):
          reduce s as $x ({}; .[$x | (type[0:1] + tostring)] = $x)
          |.[];
        
        # Relational Join
        # joins/6 provides similar functionality to the SQL INNER JOIN statement:
        #   SELECT (Table1|p1), (Table2|p2)
        #     FROM Table1
        #     INNER JOIN Table2 ON (Table1|filter1) = (Table2|filter2)
        # where filter1, filter2, p1 and p2 are filters.
        
        # joins(s1; s2; filter1; filter2; p1; p2)
        # s1 and s2 are streams of objects corresponding to rows in Table1 and Table2;
        # filter1 and filter2 determine the join criteria;
        # p1 and p2 are filters determining the final results.
        # Input: ignored
        # Output: a stream of distinct pairs [p1, p2]
        # Note: items in s1 for which filter1 == null are ignored, otherwise all rows are considered.
        #
        def joins(s1; s2; filter1; filter2; p1; p2):
          def it: type[0:1] + tostring;
          def ix(s;f):
            reduce s as $x ({};  ($x|f) as $y | if $y == null then . else .[$y|it] += [$x] end);
          # combine two dictionaries using the cartesian product of distinct elements
          def merge:
            .[0] as $d1 | .[1] as $d2
            | ($d1|keys_unsorted[]) as $k
            | if $d2[$k] then distinct($d1[$k][]|p1) as $a | distinct($d2[$k][]|p2) as $b | [$a,$b]
              else empty end;
        
           [ix(s1; filter1), ix(s2; filter2)] | merge;
        
        def joins(s1; s2; filter1; filter2):
          joins(s1; s2; filter1; filter2; .; .) | add ;
        
        # Input: an array of two arrays of objects
        # Output: a stream of the joined objects
        def joins(filter1; filter2):
          joins(.[0][]; .[1][]; filter1; filter2);
        
        # Input: an array of arrays of objects.
        # Output: a stream of the joined objects where f defines the join criterion.
        def joins(f):
          # j/0 is defined so TCO is applicable
          def j:
            if length < 2 then .[][]
            else [[ joins(.[0][]; .[1][]; f; f)]] + .[2:] | j
            end;
           j ;
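For readers who find the jq module dense, the core idea of `joins/6` (index each table by its join key, then pair up rows that share a key) can be sketched in Python. This is a simplified illustration, not a port. The names `index_by` and `inner_join` are hypothetical, the join key is assumed to be hashable, and the `distinct` de-duplication step of joins.jq is omitted.

```python
def index_by(rows, key):
    """Group rows by key(row), skipping rows whose key is None.

    This plays the role of `ix` in joins.jq, which ignores null keys.
    """
    index = {}
    for row in rows:
        k = key(row)
        if k is not None:
            index.setdefault(k, []).append(row)
    return index


def inner_join(left, right, key):
    """Yield one merged dict per (left row, right row) pair sharing a key."""
    left_index = index_by(left, key)
    right_index = index_by(right, key)
    for k, left_rows in left_index.items():
        # Keys missing on the right side produce nothing (inner join).
        for a in left_rows:
            for b in right_index.get(k, []):
                yield {**a, **b}     # like `add`: right side wins on clashes


registration = [
    {"name": "User1", "registration": "2009-04-18T21:55:40Z"},
    {"name": "User2", "registration": "2010-11-17T15:09:43Z"},
]
useredits = [
    {"name": "User1", "editcount": 164},
    {"name": "User2", "editcount": 150},
    {"name": "User3", "editcount": 10},
]
for row in inner_join(registration, useredits, key=lambda r: r.get("name")):
    print(row)
```

Building the two indexes first means each table is scanned only once, so the cost is dominated by the number of matching pairs rather than by comparing every row of one table against every row of the other.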
        

        【Comments】:

        • For better portability and readability, I chose to use RomanPerekhrest's answer. Thanks for answering my question!