【Question title】: Extract Regex until sequence from log file
【Posted】: 2017-01-22 10:33:27
【Question】:

I have the following log file, and I need to define the log format with a regular expression so that I can use it to extract the log entries.

_20131005_022047874 ALEPO@ALEPO3 **Exception ServiceConnection / createService methord javax.xml.ws.WebServiceException: Failed to access the WSDL at: http://212.118.158.21:8080/tunnel-web/axis/Portlet_ase_FunctionalDomainService?wsdl. It failed with: 
    Connection refused.
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.tryWithMex(RuntimeWSDLParser.java:151)
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.parse(RuntimeWSDLParser.java:133)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.parseWSDL(WSServiceDelegate.java:254)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.<init>(WSServiceDelegate.java:217)
    at com.sun.xml.internal.ws.client.WSServiceDelegate.<init>(WSServiceDelegate.java:165)
    at com.sun.xml.internal.ws.spi.ProviderImpl.createServiceDelegate(ProviderImpl.java:93)
    at javax.xml.ws.Service.<init>(Service.java:56)
    at javax.xml.ws.Service.create(Service.java:680)
    at com.stc.alepo.client.ServiceConnection.createService(ServiceConnection.java:75)
    at com.stc.alepo.client.WSSoapHandler.<init>(WSSoapHandler.java:73)
    at com.stc.alepo.client.WSProcessManager.<init>(WSProcessManager.java:114)
    at com.stc.alepo.client.IcmsAlepoRealTime.start(IcmsAlepoRealTime.java:439)
    at com.stc.alepo.client.IcmsAlepoRealTime.main(IcmsAlepoRealTime.java:97)
Caused by: java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
    at java.net.Socket.connect(Socket.java:529)
    at java.net.Socket.connect(Socket.java:478)
    at sun.net.NetworkClient.doConnect(NetworkClient.java:163)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:394)
    at sun.net.www.http.HttpClient.openServer(HttpClient.java:529)
    at sun.net.www.http.HttpClient.<init>(HttpClient.java:233)
    at sun.net.www.http.HttpClient.New(HttpClient.java:306)
    at sun.net.www.http.HttpClient.New(HttpClient.java:323)
    at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:970)
    at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:911)
    at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:836)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1172)
    at java.net.URL.openStream(URL.java:1010)
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.createReader(RuntimeWSDLParser.java:793)
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.resolveWSDL(RuntimeWSDLParser.java:251)
    at com.sun.xml.internal.ws.wsdl.parser.RuntimeWSDLParser.parse(RuntimeWSDLParser.java:118)
    ... 11 more

_20131005_022047874 ALEPO@ALEPO3 **Exception DCPSoapHandler / constructor methord [Ljava.lang.StackTraceElement;@25b65b7f
_20131005_022047875 ALEPO@ALEPO3 WS17249866 **Exception DCPSoapHandler / invokeSOAPMessage methord java.lang.NullPointerException
    at com.stc.alepo.client.WSSoapHandler.invokeSOAPMessage(WSSoapHandler.java:110)
    at com.stc.alepo.client.WSProcessManager.getWSReply(WSProcessManager.java:174)
    at com.stc.alepo.client.IcmsAlepoRealTime.start(IcmsAlepoRealTime.java:441)
    at com.stc.alepo.client.IcmsAlepoRealTime.main(IcmsAlepoRealTime.java:97)

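I have defined the regular expression below to match the timestamp at the start of the first line of each entry, but I need the second group to contain the rest of the message, including any continuation lines:

(_\d{1,8}_\w+) (.*)

How can I make the second group capture every character up to the next occurrence of the first group, or what is the best practice for this use case? I have many logs and need to define the second group the same way for each, but the timestamp format may differ from log to log.

Thanks in advance.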

【Comments】:

  • Java or JavaScript?
  • You would be better off using a log parser.
  • This works fine: ^(_\d{1,8}_\w+)\s*(.*(?:\r?\n(?!_\d{1,8}_\w+).*)*)

Tags: javascript java regex logging


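【Solution 1】:

You can use a regular expression that captures the timestamp into Group 1 and every following line that does not start with the timestamp pattern into Group 2: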

/^(_\d{1,8}_\w+)\s*(.*(?:\r?\n(?!_\d{1,8}_\w+).*)*)/gm

See the regex demo.
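As a rough sketch of how the pattern could be applied in JavaScript, the snippet below runs it over a shortened copy of the sample log with `String.prototype.matchAll`. The helper name `parseEntries` and the `timestamp` / `message` field names are made up for illustration and are not part of the original answer.

```javascript
// Shortened copy of the sample log from the question (two entries,
// the second one with two stack-trace lines).
const log = [
  "_20131005_022047874 ALEPO@ALEPO3 **Exception DCPSoapHandler / constructor methord [Ljava.lang.StackTraceElement;@25b65b7f",
  "_20131005_022047875 ALEPO@ALEPO3 WS17249866 **Exception DCPSoapHandler / invokeSOAPMessage methord java.lang.NullPointerException",
  "    at com.stc.alepo.client.WSSoapHandler.invokeSOAPMessage(WSSoapHandler.java:110)",
  "    at com.stc.alepo.client.WSProcessManager.getWSReply(WSProcessManager.java:174)",
].join("\n");

// The pattern from the answer: g finds every entry, m makes ^ match at each line start.
const entryRe = /^(_\d{1,8}_\w+)\s*(.*(?:\r?\n(?!_\d{1,8}_\w+).*)*)/gm;

// Hypothetical helper: turns the raw text into { timestamp, message } objects.
function parseEntries(text) {
  return Array.from(text.matchAll(entryRe), (m) => ({
    timestamp: m[1], // Group 1: the timestamp
    message: m[2],   // Group 2: rest of the first line plus continuation lines
  }));
}

const entries = parseEntries(log);
console.log(entries.length);       // 2
console.log(entries[1].timestamp); // _20131005_022047875
console.log(entries[1].message);   // first line remainder and both "at ..." lines
```

Blank lines between entries end up at the tail of Group 2, so calling `trimEnd()` on `message` may be useful if they are not wanted.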

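Details

  • ^ - start of a line (the m flag makes ^ match at every line start)
  • (_\d{1,8}_\w+) - Group 1 (the timestamp): _, 1 to 8 digits, _, and 1+ word characters
  • \s* - 0+ whitespace characters
  • (.*(?:\r?\n(?!_\d{1,8}_\w+).*)*) - Group 2 (everything up to the next timestamp):
    • .* - any 0+ characters other than line break characters
    • (?:\r?\n(?!_\d{1,8}_\w+).*)* - 0+ sequences of:
      • \r?\n(?!_\d{1,8}_\w+) - a line break that is not followed by the timestamp pattern
      • .* - any 0+ characters other than line break characters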
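Since the timestamp format may differ between logs, one possible approach is to build the same expression around a timestamp sub-pattern passed in as a string. This is only a sketch: `buildEntryRegex` is a hypothetical helper, and the second timestamp format below is an assumed example that does not come from the question. The sub-pattern should not contain capturing groups of its own, otherwise the group numbers shift.

```javascript
// Hypothetical helper: wraps any timestamp sub-pattern (regex source as a string)
// in the structure described above: ^(ts)\s*(.*(?:\r?\n(?!ts).*)*)
function buildEntryRegex(tsPattern) {
  return new RegExp(
    "^(" + tsPattern + ")\\s*(.*(?:\\r?\\n(?!" + tsPattern + ").*)*)",
    "gm"
  );
}

// Same timestamp shape as in the question's log.
const underscoreStyle = buildEntryRegex("_\\d{1,8}_\\w+");

// Assumed alternative shape such as "2013-10-05 02:20:47,874" (illustration only).
const dashStyle = buildEntryRegex("\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}");

const sample =
  "2013-10-05 02:20:47,874 first\n" +
  "    at a.b.C(C.java:1)\n" +
  "2013-10-05 02:20:47,875 second";

// Each result is [timestamp, message].
const parsed = Array.from(sample.matchAll(dashStyle), (m) => [m[1], m[2]]);
console.log(parsed.length); // 2
console.log(parsed[0][1]);  // "first" followed by the indented "at ..." line
```

Only the sub-pattern changes per log; the logic for Group 2 stays the same because the negative lookahead reuses whatever timestamp pattern was supplied.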
