【Question title】: Building a connection log system
【Posted】: 2018-06-06 01:43:17
【Question description】:

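I'm building a "smart" log system in which I can monitor customer connections, for example the times at which a connection to the server starts and stops.

Raw log: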

Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: terminating... - peer is not responding
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info,account customer1 logged out, 4486 1009521 23444247 12573 18159
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: disconnected
Dec 19 00:00:07 172.16.20.24 pppoe,info PPPoE connection established from 60:E3:27:A2:60:09
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info,account customer2 logged in, 10.171.3.185
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 00:00:13 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 00:00:14 172.16.20.24 pppoe,ppp,error <ccfa>: user customer3 authentication failed
Dec 19 00:00:32 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 00:00:36 172.16.20.24 pppoe,ppp,error <ccfb>: user customer3 authentication failed
Dec 19 00:01:06 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41

What matters to me: capturing the lines that contain the strings connected and disconnected.

This is what I have so far:

import os
import re
import sys

f = open('log.log','r')
log = []
for line in f:
 if re.search(r': connected|: disconnected',line):
  ob = dict()
  ob['USER'] = re.search(r'<pppoe(.*?)>',line).group(0).replace("<pppoe-","").replace(">","")
  ob['DATA'] = re.search(r'^\w{3} \d{2} \d{2}:\d{2}:\d{2}',line).group(0)
  ob['CONNECTION'] = re.search(r': .*',line).group(0).replace(": ", "")
  log.append(ob)

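I'm still learning, so these aren't the most brilliant regular expressions, but that's fine! Now I need to refine this log list, and I'd like to end up with something like this sample: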

{"connection" : [{
"start" : "Dec 19 10:12:58",
"username" : "customer2"}]}

{"connection" : [{
"start" : "Dec 20 10:12:58",
"username" : "customer1"}]}

{"connection" : [{
"start" : "Dec 19 10:12:58",
"stop" : "Dec 22 10:04:35",
"username" : "customer4"}]}

{"connection" : [{
"start" : "Dec 19 10:12:58",
"stop" : "Dec 24 10:04:35",
"username" : "customer3"}]}

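My obstacles:

  • The raw log is generated continuously, so I need to determine whether a given user already exists. If so, update the connection (customer2 dropped his connection, and that needs to be recorded!). But what happens if he keeps dropping connections?

For example: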

Dec 19 10:20:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected    
Dec 19 01:00:36 172.16.20.24 pppoe,ppp,error <ccfb>: user customer3 authentication failed
Dec 19 01:01:06 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 10:21:38 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 10:21:48 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 10:22:38 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: authenticated
Dec 19 10:22:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected  

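The first disconnect is simple to add.

{"connection" : [{
"start" : "Dec 19 10:12:58",
"stop" : "Dec 19 10:20:58",
"username" : "customer2"}]}

On the next authentication, I need to look up this specific user, insert the new "start" connection time, and delete the "stop". And so on.

{"connection" : [{
"start" : "Dec 19 10:21:48",
"username" : "customer2"}]}

  • My next challenge is creating this new refined list.

I tried this, but it doesn't work!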

conn = []
for l in log:
 obcon = dict()
 if not obcon:
    obcon['USER'] = l['USER']
    if l['DATA'] == 'connected':
        obcon['START'] = l['DATA']      
        obcon['STOP'] = ""
    else:
        obcon['STOP'] = l['DATA']
 conn.append(obcon)

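Before adding to the new list, I need to check whether the user already exists, and if not, create it! I use ['CONNECTION'] to identify whether a connection started or stopped: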

Disconnected -> STOP
Connected -> START

I don't know whether I need to be more specific. I need ideas, please!

【Discussion】:

    Tags: python regex list dictionary logging


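    【Solution 1】:

    In my opinion, the variable log should be a dict, because that makes it easier to look up an existing user's data.
    Next, you used re.search(...).group(0) everywhere, which is the entire matching string. For example, when extracting the username you wrote '<pppoe(.*?)>', but the text you want is in group(1) (in a regular expression, parentheses mark the part of the match to extract).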
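
    To make the difference between group(0) and group(1) concrete, here is a minimal sketch that uses one line from the raw log above. The variable names whole and user are only illustrative.

```python
import re

# One line taken from the raw log in the question
line = "Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected"

match = re.search(r'<pppoe-(.*?)>', line)

whole = match.group(0)  # the entire matched text, including '<pppoe-' and '>'
user = match.group(1)   # only the text captured by the first (...) group

print(whole)
print(user)
```

    With group(1) there is no need for the chained replace() calls that strip the delimiters afterwards.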

    My suggestion is (note: I removed the sys and os imports because they were not used):

    import re

    log = dict()
    # the with block closes the file once the loop finishes
    with open('log.log', 'r') as f:
        for line in f:
            # matches ': connected' or ': disconnected'; group(1) is the word itself
            reg = re.search(r': ((?:dis)?connected)', line)
            if reg is not None:
                user = re.search(r'<pppoe-(.*?)>', line).group(1)
                # fetch the user's record, creating it with only 'USER' the first time
                ob = log.setdefault(user, {'USER': user})
                ob['CONNECTION'] = reg.group(1)
                time = re.search(r'^\w{3} \d{2} \d{2}:\d{2}:\d{2}', line).group(0)
                if ob['CONNECTION'].startswith('dis'):
                    ob['END'] = time
                else:
                    ob['START'] = time
                    # a new connection makes any earlier END obsolete
                    ob.pop('END', None)
    

    If the log file is:

    Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: terminating... - peer is not responding
    Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info,account customer1 logged out, 4486 1009521 23444247 12573 18159
    Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: disconnected
    Dec 19 00:00:07 172.16.20.24 pppoe,info PPPoE connection established from 00:00:00:00:00:00
    Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info,account customer2 logged in, 127.0.0.1
    Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
    Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
    Dec 19 00:00:13 172.16.20.24 pppoe,info PPPoE connection established from 00:00:00:00:00:00
    Dec 19 00:00:14 172.16.20.24 pppoe,ppp,error <ccfa>: user customer3 authentication failed
    Dec 19 00:02:03 172.16.20.24 pppoe,ppp,info,account customer2 logged out, 4486 1009521 23444247 12573 18159
    Dec 19 00:02:03 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected
    Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info,account customer3 logged in, 127.0.0.1
    Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: authenticated
    Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: connected
    

    The value of log will be:

    {
        'customer1': {
            'CONNECTION': 'disconnected',
            'END': 'Dec 19 00:00:03',
            'USER': 'customer1'
        }, 
        'customer3': {
            'START': 'Dec 19 00:02:08',
            'CONNECTION': 'connected',
            'USER': 'customer3'
        }, 
        'customer2': {
            'START': 'Dec 19 00:00:08',
            'CONNECTION': 'disconnected',
            'END': 'Dec 19 00:02:03', 
            'USER': 'customer2'
        }
    }
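
    The dict above keeps only the latest state per user. If a customer drops the connection repeatedly and you want to keep every start/stop pair, one possible variation is to store a list of sessions per user. This is only a sketch under assumptions: the names EVENT_RE, update_sessions and sessions are hypothetical, and the first sample line is made up to match the start time given in the question.

```python
import re

# Captures timestamp, username and the event word from lines such as
# "Dec 19 10:20:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected"
EVENT_RE = re.compile(
    r'^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) .*<pppoe-(.*?)>: ((?:dis)?connected)'
)

def update_sessions(sessions, line):
    """Update sessions (user -> list of session dicts) from a single log line."""
    m = EVENT_RE.search(line)
    if m is None:
        return sessions  # not a connected/disconnected line, ignore it
    stamp, user, event = m.groups()
    history = sessions.setdefault(user, [])
    if event == 'connected':
        history.append({'START': stamp})   # open a new session
    elif history and 'END' not in history[-1]:
        history[-1]['END'] = stamp         # close the session that is still open
    else:
        history.append({'END': stamp})     # disconnect with no known start
    return sessions

lines = [
    "Dec 19 10:12:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected",
    "Dec 19 10:20:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected",
    "Dec 19 10:21:38 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated",
    "Dec 19 10:21:48 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected",
    "Dec 19 10:22:38 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: authenticated",
    "Dec 19 10:22:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected",
]

sessions = {}
for line in lines:
    update_sessions(sessions, line)

print(sessions)
```

    Because update_sessions handles one line at a time, it could also be called whenever a new line arrives, with a separate dict for each log file.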
    

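    【Comments】:

    • This is great! But I have one more question. I have 11 similar logs; what do you think? Do I need to put this into a function and "instantiate" it for each log? I mean, every time a line is added to a log, will it run the for loop again? Does that make sense, or am I being confusing?
    • @TMoraes Hmm, I can't fully understand what you need. Please try explaining it again.
    • No problem, I'll try another way! Thanks again, bro :)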