【Question title】: Extract only patterns using regex, bash, multiline
【Posted】: 2017-03-13 13:58:19
【Question】:

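I have some logs (which are very large; some may be 1 GB or more), and I want to write a script that extracts every log entry matching a given regular expression.

Sample log: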

27 Oct 2016 18:04:05,215 DEBUG org.apache.beehive.netui.pageflow.PageFlowPageFilter[[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)'] - Filtering request for path /framework/skeletons/application/setVariables.jsp
27 Oct 2016 18:04:05,215 DEBUG org.apache.beehive.netui.pageflow.DefaultPageFlowEventReporter[[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)'] - beginPageRequest: Request=com.bea.p13n.servlets.PortalServletFilter$ParamFilteredRequest@18449c85, Response=org.apache.taglibs.standard.tag
.common.core.ImportSupport$ImportResponseWrapper@18378bd8
<Oct 27, 2016 6:04:05 PM EEST> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "185" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 185181 ms
[
GET /application/appmanager/portal/myportal HTTP/1.1
.......
]
>
27 Oct 2016 18:04:05,215 DEBUG org.apache.beehive.netui.pageflow.PageFlowPageFilter[[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)'] - Filtering request for path /framework/skeletons/application/setVariables.jsp
27 Oct 2016 18:04:05,215 DEBUG org.apache.beehive.netui.pageflow.DefaultPageFlowEventReporter[[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)'] - beginPageRequest: Request=com.bea.p13n.servlets.PortalServletFilter$ParamFilteredRequest@18449c85, Response=org.apache.taglibs.standard.tag
.common.core.ImportSupport$ImportResponseWrapper@18378bd8
<Oct 27, 2016 6:04:05 PM EEST> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "185" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 185181 ms
[
GET /application/appmanager/portal/myportal HTTP/1.1
.......
]
>
<Oct 27, 2016 6:04:05 PM EEST> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "185" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 185181 ms
[
GET /application/appmanager/portal/myportal HTTP/1.1
.......
]
>
27 Oct 2016 18:04:05,215 DEBUG org.apache.beehive.netui.pageflow.PageFlowPageFilter[[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)'] - Filtering request for path /framework/skeletons/application/setVariables.jsp
27 Oct 2016 18:04:05,215 DEBUG org.apache.beehive.netui.pageflow.DefaultPageFlowEventReporter[[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)'] - beginPageRequest: Request=com.bea.p13n.servlets.PortalServletFilter$ParamFilteredRequest@18449c85, Response=org.apache.taglibs.standard.tag
.common.core.ImportSupport$ImportResponseWrapper@18378bd8
<Oct 27, 2016 6:04:05 PM EEST> <Error> <WebLogicServer> <BEA-000337> <[STUCK] ExecuteThread: '120' for queue: 'weblogic.kernel.Default (self-tuning)' has been busy for "185" seconds working on the request "Workmanager: default, Version: 0, Scheduled=true, Started=true, Started time: 185181 ms
[
GET /application/appmanager/portal/myportal HTTP/1.1
.......
]

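I want the pattern to be configurable, for example so that I can search for a particular type of error.

So far I know that this regex matches every log entry I need (although the last entry in the file is a problem, because no further entry header follows it):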

.+?(?=(^\d{2}\s[A-Za-z]{3}\s\d{4}.)|(^<[A-Za-z]{3}\s\d{2},\s\d{4}\s))

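But I don't know how to tell it to match only the error entries. The regex above returns a complete log entry (which is good), but I only want the entries that contain some character sequence, such as "Error".

Thanks

【Comments】: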

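  • Why not look for 'ERROR' with a second pattern match?
  • I'm not sure how to extract it or which command I should use. I want the matches written to a file.
  • It's unclear how you can know how to use a complex regex but not how to filter for a plain string. Does grep 'fancyregex' file | grep -i error give you any ideas? Good luck.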
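
The two-stage pipeline suggested in the last comment could be sketched roughly as follows. This is only an illustration: the function name header_lines_with is hypothetical, and the header pattern is an assumption derived from the two timestamp formats in the sample log. Note that it keeps only the first line of each entry, not the whole multi-line entry.

```shell
#!/usr/bin/env bash
# header_lines_with KEYWORD FILE   (hypothetical helper name)
# Stage 1: keep only lines that start a log entry.
# Stage 2: keep those that contain KEYWORD, ignoring case.
header_lines_with() {
  local keyword=$1 file=$2
  # Assumed entry headers: "27 Oct 2016 ..." or "<Oct 27, 2016 ..."
  local header='^([0-9]{2} [A-Za-z]{3} [0-9]{4}|<[A-Za-z]{3} [0-9]{2}, [0-9]{4})'
  # -F treats the keyword as a fixed string, so it needs no regex escaping.
  grep -E "$header" "$file" | grep -i -F -- "$keyword"
}
```

To send the result to a file, redirect the output, for example header_lines_with error file.log > errors.txt (errors.txt is a placeholder name).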

Tags: regex bash pattern-matching extract multiline


【Solution 1】:

Do you want to match lines that begin with a date and also contain the keyword Error?

grep -E '(^[[:digit:]]{2} [[:alpha:]]{3} [[:digit:]]{4}|^<[[:alpha:]]{3} [[:digit:]]{2}, [[:digit:]]{4}).*Error' file.log

You may want to add -i to make the match case-insensitive:

grep -iE '...' file.log
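
The grep commands above print only the matching line, while the question asks for the complete multi-line entry. One possible way to get that, shown here as a minimal sketch rather than the answerer's method, is to let awk buffer lines until the next entry header and print the buffer only if it contains the keyword. The function name extract_entries is hypothetical, and the header patterns are an assumption based on the two timestamp formats in the sample log. Because awk reads line by line, it should also cope with the 1 GB+ files mentioned in the question, and the END rule handles the last entry in the file.

```shell
#!/usr/bin/env bash
# extract_entries KEYWORD FILE   (hypothetical helper name)
# Prints every complete, possibly multi-line, log entry containing KEYWORD.
extract_entries() {
  local keyword=$1 file=$2
  awk -v kw="$keyword" '
    # Print the buffered entry if it contains the keyword (plain substring test).
    function emit() {
      if (entry != "" && index(entry, kw) > 0) print entry
      entry = ""
    }
    # Assumed entry headers: "27 Oct 2016 ..." or "<Oct 27, 2016 ...".
    /^[0-9][0-9] [A-Za-z][A-Za-z][A-Za-z] [0-9][0-9][0-9][0-9]/ ||
    /^<[A-Za-z][A-Za-z][A-Za-z] [0-9][0-9], [0-9][0-9][0-9][0-9]/ {
      emit()          # the previous entry is complete
      entry = $0      # start buffering the new one
      next
    }
    # Any other line is a continuation of the current entry.
    { entry = (entry == "" ? $0 : entry "\n" $0) }
    # The last entry has no header after it, so emit it at end of input.
    END { emit() }
  ' "$file"
}
```

Usage might look like extract_entries Error file.log > errors.log, where errors.log is a placeholder name. The keyword match is case-sensitive in this sketch.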
