【Question title】: Regex to handle all multiline exceptions in rubular fluentd
【Posted】: 2021-05-21 13:32:23
【Question description】:

I wrote the following regex to match every multi-line exception or warning message field for the fluentd parser:

(SLF4J:\s.*|[a-zA-z_]*\..*\.*\s.*\s.*|Caused\sby:\s|\s+at\s.*|\s+\.\.\. (\d)+ more)

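It also matches fields that it should not.

I want to match the start of every multi-line exception or warning. In short: the multi-line block is read from the beginning of the file. JSON records always start with {", so reading continues onto the next line until a line starting with {" appears, at which point the multi-line block ends.

Either one regex covering both cases or two regexes, one per case, is fine.

Demo links

Regex at: https://rubular.com/r/O26Wm6mc7z51re

Regex at: https://rubular.com/r/v6Q7iwZqmNDAAx

The test string is: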

java.lang.InterruptedException: Timeout while waiting for epoch from quorum
        at org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1227)
        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1284)
        ... 19 more
{"log_timestamp": "2021-02-18T11:33:23.114+0000", "log_level": "WARN", "process_id": "zookeeper#2", "process_name": "zookeeper", "thread_id": 1, "thread_name": "QuorumPeer[myid=2](plain=/0.0.0.0:2181)(secure=disabled)", "action_name": "org.apache.zookeeper.server.quorum.QuorumPeer", "log_message": "PeerState set to LOOKING"}
{"log_timestamp": "2021-02-18T11:33:23.115+0000", "log_level": "WARN", "process_id": "zookeeper#2", "process_name": "zookeeper", "thread_id": 1, "thread_name": "WorkerSender[myid=2]", "action_name": "org.apache.zookeeper.server.quorum.QuorumPeer", "log_message": "Failed to resolve address: zk-2.zk-headless.intam.svc.cluster.local"}
java.net.UnknownHostException: zk-2.zk-headless.intam.svc.cluster.local
        at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
        at java.net.InetAddress.getAllByName(InetAddress.java:1193)
        at java.net.InetAddress.getAllByName(InetAddress.java:1127)
        at java.net.InetAddress.getByName(InetAddress.java:1077)
        at org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer.recreateSocketAddresses(QuorumPeer.java:194)
        at org.apache.zookeeper.server.quorum.QuorumPeer.recreateSocketAddresses(QuorumPeer.java:764)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:699)
        at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:618)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:477)
        at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:456)
        at java.lang.Thread.run(Thread.java:748)
{"log_timestamp": "2021-02-18T11:33:23.115+0000", "log_level": "WARN", "process_id": "zookeeper#2", "process_name": "zookeeper", "thread_id": 1, "thread_name": "WorkerSender[myid=2]", "action_name": "org.apache.zookeeper.server.quorum.QuorumPeer", "log_message": "Failed to resolve address: zk-2.zk-headless.sxc.svc.cluster.local"}

Expected match for demo1: https://rubular.com/r/O26Wm6mc7z51re

java.lang.InterruptedException: Timeout while waiting for epoch from quorum
        at org.apache.zookeeper.server.quorum.Leader.getEpochToPropose(Leader.java:1227)
        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)
        at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1284)
        ... 19 more

For demo2: https://rubular.com/r/v6Q7iwZqmNDAAx

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/spark/jars/logback-classic-1.2.3.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/jars/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type 

【Question discussion】:

    Tags: java regex regex-lookarounds regex-negation rubular


    【Solution 1】:

    You can use a single pattern with a capture group and a backreference to get both parts

    ^(SLF4J:|java\.lang\.InterruptedException:).*(?:\R(?!\1|{).*)*
    

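    The pattern matches:

    • ^ Start of the string
    • (SLF4J:|java\.lang\.InterruptedException:).* Capture either alternative in group 1, then match the rest of the line
    • (?: Non-capture group
      • \R(?!\1|{).* Match a newline and assert that the next line does not start with what was captured in group 1 or with {, then match the rest of that line
    • )* Close the group and optionally repeat it to match all following lines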

    Regex demo

    See the regex matches for the first part and the second part.

    Note that the backslashes must be doubled in Java

    String regex = "^(SLF4J:|java\\.lang\\.InterruptedException:).*(?:\\R(?!\\1|\\{).*)*";
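
As a rough illustration of how this string could be used, the sketch below runs the pattern over a shortened version of the test string. The class name `StackTraceBlocks`, the helper `findBlocks`, the abbreviated JSON line, and the `Pattern.MULTILINE` flag are assumptions added for the example; the flag lets `^` match at line starts inside a larger log, as it does on rubular.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class StackTraceBlocks {
    // Pattern string taken from the answer. MULTILINE is an assumption of this
    // sketch so that ^ also matches at the start of each line, not only at the
    // start of the whole input.
    static final Pattern BLOCK = Pattern.compile(
        "^(SLF4J:|java\\.lang\\.InterruptedException:).*(?:\\R(?!\\1|\\{).*)*",
        Pattern.MULTILINE);

    // Hypothetical helper: collects every matched block as one string.
    static List<String> findBlocks(String log) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(log);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened sample based on the test string in the question.
        String log = String.join("\n",
            "java.lang.InterruptedException: Timeout while waiting for epoch from quorum",
            "        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)",
            "        ... 19 more",
            "{\"log_level\": \"WARN\", \"log_message\": \"PeerState set to LOOKING\"}");
        for (String block : findBlocks(log)) {
            System.out.println("--- block ---");
            System.out.println(block);
        }
    }
}
```

Because the lookahead also rejects lines that start with the text captured in group 1, consecutive SLF4J: lines end up as separate matches rather than one block.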
    

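    To avoid crossing into another SLF4J line or into a different type of exception, where an exception is recognized as a dot-separated string at the start of the line: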

    ^(?:SLF4J:|\w+(?:\.\w+)+).*(?:\R(?!(?:SLF4J:|\w+(?:\.\w+)+)|{).*)*
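
A minimal Java sketch of the generic pattern follows, on the assumption that it is compiled with `Pattern.MULTILINE` and that `{` is escaped as `\\{`, as in the Java string earlier in this answer. The class name `GenericTraceBlocks`, the helper `findBlocks`, and the shortened sample log are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GenericTraceBlocks {
    // Generic pattern from the answer, with backslashes doubled for Java and
    // the brace escaped. MULTILINE is an assumption of this sketch.
    static final Pattern BLOCK = Pattern.compile(
        "^(?:SLF4J:|\\w+(?:\\.\\w+)+).*"
            + "(?:\\R(?!(?:SLF4J:|\\w+(?:\\.\\w+)+)|\\{).*)*",
        Pattern.MULTILINE);

    // Hypothetical helper: returns each exception or SLF4J block found in the log.
    static List<String> findBlocks(String log) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(log);
        while (m.find()) {
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Shortened sample: two different exception types separated by JSON lines.
        String log = String.join("\n",
            "java.lang.InterruptedException: Timeout while waiting for epoch from quorum",
            "        at org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:482)",
            "        ... 19 more",
            "{\"log_level\": \"WARN\", \"log_message\": \"PeerState set to LOOKING\"}",
            "java.net.UnknownHostException: zk-2.zk-headless.intam.svc.cluster.local",
            "        at java.net.InetAddress.getByName(InetAddress.java:1077)",
            "        at java.lang.Thread.run(Thread.java:748)",
            "{\"log_level\": \"WARN\", \"log_message\": \"Failed to resolve address\"}");
        List<String> blocks = findBlocks(log);
        for (int i = 0; i < blocks.size(); i++) {
            System.out.println("--- block " + (i + 1) + " ---");
            System.out.println(blocks.get(i));
        }
    }
}
```

On this sample the helper should return two blocks, one per exception, with the JSON lines left out. The indented "at ..." lines are kept as continuation lines because they start with whitespace rather than a dotted name.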
    

    Regex demo

    【Discussion】:

    • An exception can start with any name. In our example it is java.lang.InterruptedException, but it could be xxx.yyy.zzz or org.apache.xxx. Can we make the hard-coded value generic?
    • With org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:699) dorg.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManorg.apache.zookeeper.server. quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:699) ddddddddd the match does not stop at 699) and also takes in the next line, ddd. For an exception, can we stop at 699)?
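    • I only want the first occurrence of each multi-line block. The result with your regex is at: rubular.com/r/xNbOmFdTFV8Hlx
    • @SKumar Like this? rubular.com/r/d6uRi8GsDFPwk1
    • For example, in the above I only want this much, not all of it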