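Here's the solution I use. It is a custom frequency function that:
- buckets/bins an array of JSON values/objects by a jq expression (the bucket key)
- provides the bucket count (frequency)
- provides the percent of items in each bucket (rounded to 2 decimal places)
- provides the original items that were binned into the bucket, and
- sorts the buckets by count in descending order.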
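# Frequency table: bucket the input array by expr, largest buckets first.
def freq(expr):
  length as $total_count
  | group_by(expr)
  | map({
      key: (.[0] | expr),
      count: length,
      # share of all items, as a percentage rounded to 2 decimal places
      percent: (((length / $total_count * 10000 + 0.5) | floor) / 100),
      items: .
    })
  | sort_by(-.count)
;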
For example, with the above defined in my $HOME/.jq, the query:
jq -n '
  [
    {"complianceState": "a", "other": 0.5},
    {"complianceState": "b", "other": 1.2},
    {"complianceState": "a", "other": 1.7},
    {"complianceState": "c", "other": 5.3},
    {"complianceState": "b", "other": 1.5},
    {"complianceState": "e", "other": 0.6},
    {"complianceState": "c", "other": 3.4},
    {"complianceState": "c", "other": 5.9}
  ] | freq(.complianceState)'
produces:
[
  {
    "key": "c",
    "count": 3,
    "percent": 37.5,
    "items": [
      {"complianceState": "c", "other": 5.3},
      {"complianceState": "c", "other": 3.4},
      {"complianceState": "c", "other": 5.9}
    ]
  },
  {
    "key": "a",
    "count": 2,
    "percent": 25,
    "items": [
      {"complianceState": "a", "other": 0.5},
      {"complianceState": "a", "other": 1.7}
    ]
  },
  {
    "key": "b",
    "count": 2,
    "percent": 25,
    "items": [
      {"complianceState": "b", "other": 1.2},
      {"complianceState": "b", "other": 1.5}
    ]
  },
  {
    "key": "e",
    "count": 1,
    "percent": 12.5,
    "items": [
      {"complianceState": "e", "other": 0.6}
    ]
  }
]
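In your case, you'll need to use -s to slurp the inputs into a single JSON array. From there, you can transform the output into whatever format you need. For example: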
jq -s 'freq(.complianceState)
  | map({key, value: .count})
  | from_entries
' all.json
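As a rough illustration of what that pipeline yields, the sketch below inlines freq and pipes a few made-up records to jq on stdin in place of all.json. The sample objects and the variable name counts are assumptions for the demo, not the asker's data.

```shell
# Hypothetical sample input standing in for all.json: one JSON object per line.
# freq is inlined so the snippet does not depend on ~/.jq.
counts=$(printf '%s\n' \
  '{"complianceState": "a"}' \
  '{"complianceState": "b"}' \
  '{"complianceState": "a"}' \
  '{"complianceState": "c"}' \
  '{"complianceState": "a"}' \
| jq -s -c '
  def freq(expr):
    length as $total_count
    | group_by(expr)
    | map({
        key: (.[0] | expr),
        count: length,
        percent: (((length / $total_count * 10000 + 0.5) | floor) / 100),
        items: .
      })
    | sort_by(-.count);
  # keep only key and count, then fold the pairs into one object
  freq(.complianceState)
  | map({key, value: .count})
  | from_entries
')
echo "$counts"
# prints {"a":3,"b":1,"c":1}
```

Since from_entries builds an object, this form works best when the bucket keys are strings, as they are here.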
Note that with the freq function you can group by any expression, for example freq((.other / 1.5) | floor) if you want histogram-like binning.
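To make the histogram idea concrete, here is a minimal sketch that bins the "other" values from the earlier example into buckets of width 1.5. The output shape (bin, count, percent, with items dropped) and the variable name hist are choices made for this illustration only.

```shell
# freq is inlined so the snippet does not depend on ~/.jq.
hist=$(jq -n -c '
  def freq(expr):
    length as $total_count
    | group_by(expr)
    | map({
        key: (.[0] | expr),
        count: length,
        percent: (((length / $total_count * 10000 + 0.5) | floor) / 100),
        items: .
      })
    | sort_by(-.count);
  # the "other" values from the example above, wrapped back into objects
  [0.5, 1.2, 1.7, 5.3, 1.5, 0.6, 3.4, 5.9]
  | map({other: .})
  # bin index n covers the half-open interval [1.5 * n, 1.5 * (n + 1))
  | freq((.other / 1.5) | floor)
  # drop the items to get a compact histogram
  | map({bin: .key, count, percent})
')
echo "$hist"
# prints [{"bin":0,"count":3,"percent":37.5},{"bin":1,"count":2,"percent":25},{"bin":3,"count":2,"percent":25},{"bin":2,"count":1,"percent":12.5}]
```

The bins come back ordered by count rather than by position, so append sort_by(.bin) if you want them in axis order.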