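【Posted】: 2017-07-12 09:19:29
【Question】:
I want to grep a combination of two words from a huge log file. The words are scattered across each record and appear in no particular order.
Sample log: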
{"1a":"2017-01-28 00:00:00","2a":"sample","a":"12345","b":"2017-02-06","c":"2017-02-06T17:51:02.454-08:00","d":"Mozilla/5.0
; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1","e":"2017-02-06
","f":"03","g":"example","h":"logA","i":"IFX","j":"a85","k":"12345678"},
{"1a":"2017-01-28 00:00:11","2a":"sample","a":"12345","b":"2017-02-06","c":"2017-02-06T17:51:02.454-08:00","d":"Mozilla/5.0
; en-US; rv:1.9.0.1) Gecko/2008070208 Firefox/3.0.1","e":"2017-02-06
","f":"03","g":"example","h":"logB","i":"IFX","j":"a85","k":"12345678"}
From this file, I want to grep "1a":"<value>" and "h":"<value of logA or logB>", and the output should contain no duplicates.
Expected output:
"1a":"2017-01-28 00:00:00" "h":"logA"
"1a":"2017-01-28 00:00:11" "h":"logB"
I tried using egrep like this, but it returns the whole line:
egrep -oE '1a\|"h"' but this does not give the required output.
awk /pattern1/ && /pattern2/ filename #no use
Thanks for your help.
【Comments】:
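-
Do NOT use text/stream processors or editors to parse JSON; use a proper parser such as jq.
-
Format your text input as valid JSON and install jq.
-
Also, regular expressions are not design patterns. Tag removed.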
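-
To illustrate the parser-based approach suggested above, here is a minimal Python sketch rather than a jq command. It rests on several assumptions that are not in the question: the log is a sequence of JSON objects separated by commas and newlines, string values may contain raw line breaks (as in the sample), and the whole input fits in memory. The function name `extract_pairs` and the shortened `sample` text are hypothetical, made up for this example. For a truly huge file the input would have to be read in chunks instead.

```python
import json


def extract_pairs(text, keys=("1a", "h")):
    """Return unique '"key":"value"' lines for the requested keys.

    Assumes `text` holds JSON objects separated by commas/whitespace,
    not a single well-formed JSON document.
    """
    # strict=False lets the decoder accept raw newlines inside strings,
    # which the sample log contains.
    decoder = json.JSONDecoder(strict=False)
    seen = set()
    lines = []
    pos, end = 0, len(text)
    while pos < end:
        # raw_decode does not skip leading separators, so do it here.
        while pos < end and text[pos] in " \t\r\n,":
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(text, pos)
        # Skip anything that is not an object carrying every wanted key.
        if not isinstance(obj, dict) or not all(k in obj for k in keys):
            continue
        values = tuple(obj[k] for k in keys)
        if values in seen:  # drop duplicate combinations
            continue
        seen.add(values)
        lines.append(
            " ".join(f"{json.dumps(k)}:{json.dumps(obj[k])}" for k in keys)
        )
    return lines


# Shortened, hypothetical sample in the same shape as the log in the question.
sample = '''{"1a":"2017-01-28 00:00:00","2a":"sample","d":"Mozilla/5.0
; en-US","h":"logA","k":"12345678"},
{"h":"logB","1a":"2017-01-28 00:00:11","2a":"sample"},
{"1a":"2017-01-28 00:00:00","h":"logA"}
'''

for line in extract_pairs(sample):
    print(line)
```

Because the keys are looked up by name after parsing, their position within a record does not matter, and the `seen` set removes repeated combinations.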