【Question title】:How can I extract the gif files requested by a GET request with HTTP response 200 from a log?
【Posted】:2020-05-02 05:28:37
【Question】:

I have the following log file, and I need to extract the gif files that were requested with a GET request and returned status 200.

unicomp6.unicomp.net - - [01/Jul/1995:00:00:06 -0400] "GET /shuttle/countdown/ HTTP/1.0" 200 3985
burger.letters.com - - [01/Jul/1995:00:00:11 -0400] "GET /shuttle/countdown/liftoff.html HTTP/1.0" 304 0
burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 304 0
burger.letters.com - - [01/Jul/1995:00:00:12 -0400] "GET/shuttle/countdown/video/livevideo.gif HTTP/1.0" 200 0
d104.aa.net - - [01/Jul/1995:00:00:13 -0400] "GET /shuttle/countdown/HTTP/1.0" 200 3985
unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET/shuttle/countdown/count.gif HTTP/1.0" 200 40310
unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 200 786
unicomp6.unicomp.net - - [01/Jul/1995:00:00:14 -0400] "GET /images/KSC-logosmall.gif HTTP/1.0" 200 1204
d104.aa.net - - [01/Jul/1995:00:00:15 -0400] "GET/shuttle/countdown/count.gif HTTP/1.0" 200 40310
d104.aa.net - - [01/Jul/1995:00:00:15 -0400] "GET /images/NASA-logosmall.gif HTTP/1.0" 200 786

For the example above, the output must be:

livevideo.gif
count.gif
NASA-logosmall.gif
KSC-logosmall.gif

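As you can see, the output contains no duplicates. For example, line 6 of the log records a GET request for count.gif with status 200, and line 9 records the same request again, but the output contains only one count.gif entry.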

【Comments】:

    Tags: java string collections


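    【Solution 1】:

    Read the file one line at a time and extract the file name from each line. A regular expression is useful here.

    Store the gif file names in a Set so that duplicates are eliminated automatically.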
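
    The two steps above could be sketched as follows. This is a minimal sketch, not a definitive implementation: the class name GifExtractor, the method extractGifNames, and the pattern GIF_OK are hypothetical names chosen for illustration, and the regular expression is an assumption about the log format based only on the sample lines in the question. It allows the space after GET to be missing, as it is in some of the sample lines, and it uses a LinkedHashSet so that names come out in the order they were first seen.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GifExtractor {

    // Assumed shape of the request part of a line: "GET <path>/<name>.gif HTTP/x.y" 200
    // \s* after GET tolerates a missing space; group 1 captures the last
    // path segment (no '/' allowed), i.e. the bare file name.
    private static final Pattern GIF_OK = Pattern.compile(
            "\"GET\\s*\\S*?([^/\\s\"]+\\.gif)\\s+HTTP/[\\d.]+\"\\s+200\\b");

    // Returns the distinct gif file names requested via GET with status 200,
    // in order of first appearance.
    static Set<String> extractGifNames(Iterable<String> lines) {
        Set<String> names = new LinkedHashSet<>();
        for (String line : lines) {
            Matcher m = GIF_OK.matcher(line);
            if (m.find()) {
                names.add(m.group(1)); // the Set silently ignores duplicates
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java GifExtractor <log-file>");
            return;
        }
        // Reads the whole file as UTF-8; for very large logs a
        // BufferedReader loop would avoid holding every line in memory.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String name : extractGifNames(lines)) {
            System.out.println(name);
        }
    }
}
```

    Requests that return another status (such as 304) or that do not end in .gif are skipped because the pattern does not match them. If the output order does not matter, a plain HashSet works as well.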

    【Discussion】:
