【Question title】: How to optimize a script for colorizing log files
【Posted】: 2016-05-20 21:28:38
【Question】:

The following script colorizes SQL commands and a few other "tags" buried in a log file:

red="\x1b[31m"
green="\x1b[32m"
yellow="\x1b[33m"
blue="\x1b[34m"
white="\x1b[37m"

BLACK="\x1b[30;1m"
RED="\x1b[31;1m"
GREEN="\x1b[32;1m"
YELLOW="\x1b[33;1m"
BLUE="\x1b[34;1m"
CYAN="\x1b[36;1m"
WHITE="\x1b[37;1m"

onred="\x1b[41m"

lblack="\x1b[90m"
lred="\x1b[91m"
lgreen="\x1b[92m"
lyellow="\x1b[93m"
lblue="\x1b[94m"
lmagenta="\x1b[95m"
lcyan="\x1b[96m"
lwhite="\x1b[97m"

reset_color="\x1b[0m"

sed -r "s/'[^']*'/${CYAN}&${reset_color}/g;
        s/[a-z_]*_id/${white}&${reset_color}/g;

        s/(.*\[)(AbstractApplicationContext)(\].*)/${BLACK}\\1${reset_color}${yellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ActionService)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(CascadeHandlerImpl)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ClasspathHacker)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ConfigManagerImpl|ConfigManagerLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ContextFilter|ContextImpl|ContextLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DataSourceRestrictionConverter)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DatabaseLoader)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DefaultListableBeanFactory)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(DispatchFilter)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(FileHelper|FileIndex)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(LicenseManagerImpl)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(LocalizedStringsLoader)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(LoggingPropertyPlaceholderConfigurer)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(PooledDbDriverImpl)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(ProjectLoader)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(PropertiesLoaderSupport)(\].*)/${BLACK}\\1${reset_color}${lcyan}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(RenderingFilter)(\].*)/${BLACK}\\1${reset_color}${lwhite}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(SecurityControllerImpl|SecurityServiceImpl)(\].*)/${BLACK}\\1${reset_color}${lblue}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(WorkflowRulesContainerImpl|WorkflowRulesContainerLoader)(\].*)/${BLACK}\\1${reset_color}${lred}\\2${reset_color}${BLACK}\\3${reset_color}/g;
        s/(.*\[)(XmlBeanDefinitionReader)(\].*)/${BLACK}\\1${reset_color}${lgreen}\\2${reset_color}${BLACK}\\3${reset_color}/g;

        s/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/${yellow}\\1${reset_color}\\2${yellow}\\4${reset_color} ${YELLOW}\\5${reset_color}\\6${yellow}\\7${reset_color}/g;
        s/LEFT OUTER JOIN [^ ]* ON/${yellow}&${reset_color}/g;
        s/ORDER BY/${yellow}&${reset_color}/g;
        s/(ASC|DESC)/${yellow}&${reset_color}/g;
        s/GROUP BY/${yellow}&${reset_color}/g;
        s/(INSERT INTO) ([^ ]*)(.*)(VALUES)/${green}\\1${reset_color} ${GREEN}\\2${reset_color}\\3${green}\\4${reset_color}/g;
        s/(INSERT INTO) ([^ ]*)(.*)/${green}\\1${reset_color} ${GREEN}\\2${reset_color}\\3${reset_color}/g;
        s/(UPDATE) *([^ ]*) (SET|set)/${blue}\\1${reset_color} ${BLUE}\\2${reset_color} ${blue}\\3${reset_color}/g;
        s/DELETE FROM *[^ ]* WHERE/${RED}&${reset_color}/g;

        s/\*\*\*ROLLBACK\*\*\*/${white}${onred}&${reset_color}/g;
        s/\[ERROR\]/${WHITE}${onred}&${reset_color}/g;
        s/SQLServerException:/${WHITE}${onred}&${reset_color}/g"

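However, when run on a 1.9 MB log file, it takes 1:41 (1 minute 41 seconds) to execute, whereas a plain cat finishes in under 0:01.

If I remove the block between "AbstractApplicationContext" and "XmlBeanDefinitionReader", the script is just as fast as cat (under 1 second).

I don't understand why that particular block makes such a big difference.

Is there a way to optimize a colorizing script like this?

Sample file extract (duplicate it to make a large file):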

[INFO ][2016-05-20 16:17:51,346][ContextLoader] - [Root WebApplicationContext: initialization started]
[INFO ][2016-05-20 16:17:51,505][XmlBeanDefinitionReader] - [Loading XML bean definitions from ServletContext resource [/WEB-INF/config/context/appContext.xml]]
[INFO ][2016-05-20 16:17:52,986][PropertiesLoaderSupport] - [Loading properties file from class path resource [config/mail.properties]]
[INFO ][2016-05-20 16:17:55,900][ConfigManagerLoader] - [Reading XML config]
[INFO ][2016-05-20 16:17:55,991][ConfigManagerLoader] - [Reading XML config: OK]
[WARN ][2016-05-20 16:17:56,384][ConfigManagerLoader] - [Low max memory=477102080. Java max memory=1000 MB is recommended for production use, as a minimum.]
[INFO ][2016-05-20 16:17:58,309][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]]
[INFO ][2016-05-20 16:17:58,337][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]: OK, strings:759]
[INFO ][2016-05-20 16:17:58,641][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]]
[INFO ][2016-05-20 16:17:58,768][LocalizedStringsLoader] - [Loading localized strings for locale=[fr_FR]: OK, strings:46436]
[INFO ][2016-05-20 16:17:58,830][LocalizedStringsLoader] - [Loading localized strings for locale=[nl_NL]]
[INFO ][2016-05-20 16:17:58,946][LocalizedStringsLoader] - [Loading localized strings for locale=[nl_NL]: OK, strings:46436]
[INFO ][2016-05-20 16:17:59,434][PropertiesLoaderSupport] - [Loading properties file from class path resource [config/mail.properties]]
[INFO ][2016-05-20 16:18:00,476][XmlBeanDefinitionReader] - [Loading XML bean definitions from class path resource [project-child-context.xml]]
[DEBUG][2016-05-20 16:18:01,259][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:01,340][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:01,363][DatabaseImpl] - [Database loaded: data]
[INFO ][2016-05-20 16:18:01,379][DatabaseLoader] - [Loading Database [data]: OK]
[INFO ][2016-05-20 16:18:01,393][DatabaseLoader] - [Loading Database [schema]]
[DEBUG][2016-05-20 16:18:01,865][DbConnectionImpl] - [SELECT column FROM table WHERE table_name = 'sample']
[DEBUG][2016-05-20 16:18:01,894][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:01,898][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:06,241][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[checkRequestDuplicates]]
[INFO ][2016-05-20 16:18:06,384][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[getStatistic]]
[INFO ][2016-05-20 16:18:06,971][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[saveRecord]]
[INFO ][2016-05-20 16:18:07,126][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[EmailService]]
[INFO ][2016-05-20 16:18:07,542][WorkflowRulesContainerLoader] - [Loading Workflow Rule, ruleId=[LocalizationRead]]
[INFO ][2016-05-20 16:18:09,578][FileIndex$1] - [File index loading started]
[DEBUG][2016-05-20 16:18:19,406][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:18:19,410][DbConnectionImpl] - [Updated: -1 records]
[INFO ][2016-05-20 16:18:22,201][LicenseManagerImpl] - [Checkout concurrent license=[Main]]
[DEBUG][2016-05-20 16:18:22,209][SecurityControllerImpl] - [Determine next request]
[DEBUG][2016-05-20 16:18:22,239][RenderingFilter] - [Rendering mode]
[DEBUG][2016-05-20 16:18:22,253][Authenticated] - [User is authenticated]
[DEBUG][2016-05-20 16:18:22,306][FileHelper] - [Find file=[hovertip.js]]
[DEBUG][2016-05-20 16:18:22,399][FileHelper] - [Find file=[hovertip.css]]
[DEBUG][2016-05-20 16:22:18,263][DbConnectionImpl] - [INSERT INTO notifications_log (columns, columns) VALUES ('2016-05-20', 'ERROR')]
[DEBUG][2016-05-20 16:22:18,334][DbConnectionImpl] - [Updated: 1 records]
[DEBUG][2016-05-20 16:22:18,393][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 16:22:18,549][DbConnectionImpl] - [***ROLLBACK***]
[DEBUG][2016-05-20 16:23:37,659][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:23:37,662][DbConnectionImpl] - [Updated: -1 records]
[DEBUG][2016-05-20 16:23:37,886][DataSourceImpl] - [SELECT col_id FROM table_1 LEFT OUTER JOIN table_2 ON table_1.col_id=table_2.col_id]
[DEBUG][2016-05-20 16:23:37,926][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 16:23:37,930][ContextFilter] - [---------- Request: processing finished]
[DEBUG][2016-05-20 16:38:38,033][DbConnectionImpl] - [UPDATE users SET pwd = NULL WHERE user_name = 'me']
[DEBUG][2016-05-20 16:38:38,051][DbConnectionImpl] - [Updated: 1 records]
[DEBUG][2016-05-20 16:38:38,058][DbConnectionImpl] - [SET CONCAT_NULL_YIELDS_NULL OFF]
[DEBUG][2016-05-20 16:38:38,063][DbConnectionImpl] - [Updated: -1 records]
[DEBUG][2016-05-20 17:43:25,087][DbConnectionImpl] - [***COMMIT***]
[DEBUG][2016-05-20 17:43:25,096][ContextFilter] - [---------- Request: processing finished]

【Question comments】:

  • Interesting question. Could you share a sample input file for us to test locally? The bigger the better...

Tags: bash shell sed colors


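【Answer 1】:

It takes so long because you are comparing every line of input against each of some three dozen regular expressions! Use awk and test each line just once, e.g.: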

$ cat file
[INFO ][ts 1][ContextLoader] - [rest 1]
[INFO ][ts 2][XmlBeanDefinitionReader] - [rest 2]
[INFO ][ts 3][PropertiesLoaderSupport] - [rest 3]
$
$ cat tst.awk
BEGIN {
    red    = "<red>"            # "\x1b[31m"
    green  = "<green>"          # "\x1b[32m"
    yellow = "<yellow>"         # "\x1b[33m"
    black  = "<black>"          # "\x1b[30;1m"
    reset  = "<reset>"          # "\x1b[0m"

    color["ContextLoader"] = red
    color["XmlBeanDefinitionReader"] = green
    color["PropertiesLoaderSupport"] = yellow
}
match($0,/((\[[^]]+\]){2}\[)([^]]+)(.*)/,a) {
    print black a[1] reset color[a[3]] a[3] reset black a[4] reset
}

$ awk -f tst.awk file
<black>[INFO ][ts 1][<reset><red>ContextLoader<reset><black>] - [rest 1]<reset>
<black>[INFO ][ts 2][<reset><green>XmlBeanDefinitionReader<reset><black>] - [rest 2]<reset>
<black>[INFO ][ts 3][<reset><yellow>PropertiesLoaderSupport<reset><black>] - [rest 3]<reset>

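The above isolates whatever string appears in the part of each line you care about, then looks up that string's color in a hash table when printing. It uses GNU awk for the third argument to match(); adapting it to other awks is a simple tweak.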
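As a rough illustration of that tweak, here is a minimal sketch that avoids the GNU-only third argument to match() and relies on RLENGTH, substr() and index() instead. It is not the answerer's code: the function name colorize_tags and the variables head, rest, pos and tag are made up for this example, and the regexp spells out the two leading bracket groups rather than using {2}, on the assumption that some older awks lack interval expressions. Unlike the script above, it also passes non-matching lines through unchanged.

```shell
#!/bin/sh
# Portable sketch: POSIX match() sets RSTART/RLENGTH; substr() and index() do the rest.
colorize_tags() {
    awk '
    BEGIN {
        # visible placeholders, as in the answer; swap in real escape codes later
        red = "<red>"; green = "<green>"; yellow = "<yellow>"
        black = "<black>"; reset = "<reset>"
        color["ContextLoader"] = red
        color["XmlBeanDefinitionReader"] = green
        color["PropertiesLoaderSupport"] = yellow
    }
    # two leading [..] groups, then the opening bracket of the tag
    match($0, /^\[[^]]*\]\[[^]]*\]\[/) {
        head = substr($0, 1, RLENGTH)
        rest = substr($0, RLENGTH + 1)
        pos  = index(rest, "]")          # closing bracket of the tag
        if (pos > 0) {
            tag = substr(rest, 1, pos - 1)
            print black head reset color[tag] tag reset black substr(rest, pos) reset
            next
        }
    }
    { print }   # lines that do not fit the layout pass through unchanged
    ' "$@"
}

printf '%s\n' '[INFO ][ts 1][ContextLoader] - [rest 1]' | colorize_tags
```

A tag that has no entry in color[] simply gets an empty color prefix, so unknown class names are printed uncolored between the two black segments.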

I used color names like <red> so that you can see the output. On your real system you would just write red = "\x1b[31m" instead of red = "<red>" # "\x1b[31m", and so on.
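For the real escape codes, a small sketch may help. It spells ESC as the octal escape "\033" rather than "\x1b", on the assumption that hexadecimal escapes are an extension that not every awk supports, whereas octal escapes are part of POSIX awk. The helper name paint and the code[] array are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Sketch: build the escape sequences once in BEGIN, using octal "\033" for ESC.
paint() {
    awk -v name="$1" '
    BEGIN {
        esc = "\033"
        code["red"]    = esc "[31m"
        code["green"]  = esc "[32m"
        code["yellow"] = esc "[33m"
        reset = esc "[0m"
    }
    { print code[name] $0 reset }
    '
}

echo 'ContextLoader' | paint red           # shows up red in a terminal
echo 'ContextLoader' | paint red | od -c   # makes the ESC bytes visible
```

Piping through od -c is a quick way to confirm that the bytes are really there when the output is not going to a terminal.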

To update the above to handle SQL statements, mimicking the logic in your sed script:

s/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/${yellow}\\1${reset_color}\\2${yellow}\\4${reset_color} ${YELLOW}\\5${reset_color}\\6${yellow}\\7${reset_color}/g;
s/LEFT OUTER JOIN [^ ]* ON/${yellow}&${reset_color}/g;

you would write something like this (the first block is the existing awk script from this answer):

....
match($0,/((\[[^]]+\]){2}\[)([^]]+)(.*)/,a) {
    print black a[1] reset color[a[3]] a[3] reset black a[4] reset
    next
}
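match($0,/(SELECT|select)(.*)((FROM|from) ([^ )]*))([^\]]*)(WHERE|where)?/,a) {
    # keep the text before and after the match, as sed's s/// does
    print substr($0,1,RSTART-1) yellow a[1] reset a[2] yellow a[4] reset " " yellow a[5] reset a[6] yellow a[7] reset substr($0,RSTART+RLENGTH)
    next
}
match($0,/LEFT OUTER JOIN [^ ]* ON/,a) {
    print substr($0,1,RSTART-1) yellow a[0] reset substr($0,RSTART+RLENGTH)
    next
}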
....

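Note the "next" statement at the end of each block: it tells awk to stop processing the current input line and go back to the start of its implicit while-read-line loop. We use it for efficiency, to stop awk from testing the line against further regular expressions once one has already matched.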
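The effect of "next" is easy to see in isolation. The sketch below is not part of the answer: first_match_only shows that a line handled by one rule never reaches the later ones, while all_rules shows the opposite arrangement, assumed here as one possible way to apply several highlights to the same line, where each gsub() edits $0 in turn and the line is printed once at the end. Both function names and the placeholder colors are invented for this example.

```shell
#!/bin/sh
# With "next", only the first matching rule ever sees a line.
first_match_only() {
    awk '
    /SELECT/ { print "select-rule: " $0; next }
    /JOIN/   { print "join-rule: " $0; next }
    { print "no-rule: " $0 }
    '
}

# Without "next", every substitution is applied to the same line in turn.
all_rules() {
    awk '
    BEGIN { y = "<yellow>"; w = "<white>"; r = "<reset>" }
    {
        gsub(/SELECT|FROM|LEFT OUTER JOIN/, y "&" r)   # SQL keywords
        gsub(/[a-z_]*_id/, w "&" r)                    # column names ending in _id
        print
    }
    '
}

echo 'SELECT a FROM t JOIN u' | first_match_only
echo 'SELECT col_id FROM table_1 LEFT OUTER JOIN table_2 ON table_1.col_id=table_2.col_id' | all_rules
```

The trade-off is the one the answer describes: "next" saves work when one match per line is enough, and it has to go when a line needs more than one kind of highlighting.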

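【Comments】:

  • FYI, doing it this way takes 5 to 6 seconds to display a large file, versus 4 seconds without highlighting (just cat). So this becomes bearable...
  • Should I put everything in awk? Or, for example, keep the line highlighting in awk (as you did) and highlight the SQL keywords in sed, with a pipe between them?
  • Do everything in awk. You don't need sed when you're using awk. Use sed for simple substitutions on individual lines and awk for anything more complicated; you never use both together. If you had included some SQL statements in your sample input I would have shown you how to handle those too, but you can probably figure it out (a separate match() { action } block). If you need a reference for awk, get the book Effective Awk Programming, 4th Edition, by Arnold Robbins (all other awk books are outdated).
  • I added some code at the bottom of the answer to handle SQL statements. As you can see, it follows pretty directly from your existing sed script.
  • Thanks for improving the example. However, it doesn't really work because of your next commands. For example, I need to colorize both SQL keywords and SQL column names ending in _id. That can't be done in a single match, because INSERT INTO may be followed by SELECT, a SELECT may include one, two or more JOINs, many _id columns can appear anywhere, there can be sub-SELECTs, and so on. I have put together a sample file with all the complex cases. I also want the script to be no more than 20% slower than plain cat, so that I can really use it on the fly, piping the output of tail -f through it.

【Answer 2】: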

Reduce similar lines such as

s/(.*\[)(ActionService)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;
s/(.*\[)(Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;

and so forth

into a single line

s/(.*\[)(ActionService|Authenticated)(\].*)/${BLACK}\\1${reset_color}${lyellow}\\2${reset_color}${BLACK}\\3${reset_color}/g;

for better performance.
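Taken a step further, every class name that shares a color can go into the same alternation, so the sed script needs one substitution per color rather than one per class. The sketch below assumes that grouping and uses visible placeholders instead of escape codes. The function name colorize_merged is hypothetical, and -E is used in place of the question's -r because, as far as I know, it is accepted by both GNU and BSD sed. How much time this saves on a real log is not something the example measures.

```shell
#!/bin/sh
# Placeholders stand in for the escape codes of the original script.
BLACK='<BLACK>'; lyellow='<lyellow>'; lred='<lred>'; reset_color='<reset>'

# One s/// per colour: class names sharing a colour share an alternation.
colorize_merged() {
    sed -E \
        -e "s/(.*\[)(ActionService|Authenticated|FileHelper|FileIndex)(\].*)/${BLACK}\1${reset_color}${lyellow}\2${reset_color}${BLACK}\3${reset_color}/" \
        -e "s/(.*\[)(CascadeHandlerImpl|ClasspathHacker|ConfigManagerImpl|ConfigManagerLoader)(\].*)/${BLACK}\1${reset_color}${lred}\2${reset_color}${BLACK}\3${reset_color}/"
}

echo '[DEBUG][2016-05-20 16:18:22,306][FileHelper] - [Find file=[hovertip.js]]' | colorize_merged
```

The g flag is left out here because each pattern starts and ends with .* and therefore consumes the whole line, so it can match at most once per line anyway.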

【Comments】:
