【Question title】:Duplicated keys on Pyparsing Dict (dhcpd.conf)
【Posted】:2021-10-22 19:12:36
【Question description】:

I am trying to parse a dhcpd.conf file into a Python dict using pyparsing. Here is a sample of the syntax:

ddns-update-style none;
log-facility local7;
deny unknown-clients;
deny booting;

subnet 10.42.200.0 netmask 255.255.255.0 {
    
    max-lease-time 28800;
    option routers 10.42.200.1;
    option domain-name-servers 8.8.8.8, 8.8.4.4;
    range 10.42.200.10 10.42.200.200;

}

shared-network 224-29 {
    subnet 10.17.224.0 netmask 255.255.255.0 {
        option routers rtr-224.example.org;
    }
    subnet 10.0.29.0 netmask 255.255.255.0 {
        option routers rtr-29.example.org;
    }
    pool {
        allow members of "foo";
        range 10.17.224.10 10.17.224.250;
    }
    pool {
        deny members of "foo";
        range 10.0.29.10 10.0.29.230;
    }
}

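As you can see, a dhcpd.conf file can define multiple "pool" blocks. If I parse them into a dict, however, one overwrites the other because they share the same key ("pool"). Likewise, several "deny" statements can occur, and they overwrite each other when parsed, leaving only one "deny" statement per nesting level.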
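The overwriting described here is ordinary dict behavior and can be reproduced without pyparsing. The following is a minimal sketch in plain Python; the `pairs` data and the `collect` helper are hypothetical illustrations and not part of pyparsing.

```python
from collections import defaultdict

# Hypothetical (key, value) pairs, as a parser might emit them for one block.
pairs = [
    ("deny", "unknown-clients"),
    ("deny", "booting"),
    ("log-facility", "local7"),
]

# A plain dict keeps only the last value seen for each key.
overwritten = dict(pairs)


def collect(pairs):
    """Group values by key so repeated keys accumulate instead of overwriting."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


collected = collect(pairs)
print(overwritten)
print(collected)
```

In this sketch every key maps to a list, even when it occurs only once. Whether single values should be unwrapped is a design choice that depends on how the result is consumed.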

The ideal parsed dict would look like this:

{
    'ddns-update-style': 'none',
    'log-facility': 'local7',
    'deny': [
        'booting',
        'unknown-clients'
    ],
    'shared-networks': { 
        '224-29': {
            'subnet 10.0.29.0 netmask 255.255.255.0': {
                'options': { 
                    'routers': 'rtr-29.example.org'
                },
            },
            'subnet 10.17.224.0 netmask 255.255.255.0': {
                'options': {
                    'routers': 'rtr-224.example.org'
                }
            },
            'pools': [
                {
                    'allow': ['members of "foo"'],
                    'range': '10.17.224.10 10.17.224.250',
                },
                {
                    'deny': ['members of "foo"'],
                    'range': '10.0.29.10 10.0.29.230'
                }
                
            ]
        
            
        }
    },
    'subnet 10.42.200.0 netmask 255.255.255.0': {
        'max-lease-time': '28800',
        'options': { 
            'domain-name-servers': ['8.8.8.8', '8.8.4.4'],
            'routers': '10.42.200.1',
        },
        'range': '10.42.200.10 10.42.200.200'
    }
}

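So far I have been able to work around this by combining named sections with their names (as with option and shared-network), but that is not possible for pool blocks or allow/deny statements.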

Here are my current output and code:

{
    'ddns-update-style': 'none',
    'deny': 'booting',
    'log-facility': 'local7',
    'shared-network 224-29': {
        'pool': {
            'deny': 'members of "foo"',
            'range': '10.0.29.10 10.0.29.230'
        },
        'subnet 10.0.29.0 netmask 255.255.255.0': {'option routers': 'rtr-29.example.org'},
        'subnet 10.17.224.0 netmask 255.255.255.0': {'option routers': 'rtr-224.example.org'}
    },
    'subnet 10.42.200.0 netmask 255.255.255.0': {
        'max-lease-time': '28800',
        'option domain-name-servers': ['8.8.8.8', '8.8.4.4'],
        'option routers': '10.42.200.1',
        'range': '10.42.200.10 10.42.200.200'
    }
}

Code:

from pprint import pprint
from pyparsing import (
    CharsNotIn,
    Literal,
    White,
    Word,
    Combine,
    ZeroOrMore,
    OneOrMore,
    Forward,
    Dict,
    ParseException,
    Group,
    Suppress,
    delimitedList,
    nums,
    restOfLine,
    alphanums,
    pyparsing_common,
)
prop = delimitedList(Word(alphanums + "-_.!@#$%^&*"))
ipAddress = pyparsing_common.ipv4_address | pyparsing_common.ipv6_address
macAddress = Combine(
    Word(alphanums, exact=2)
    + ":"
    + Word(alphanums, exact=2)
    + ":"
    + Word(alphanums, exact=2)
    + ":"
    + Word(alphanums, exact=2)
    + ":"
    + Word(alphanums, exact=2)
    + ":"
    + Word(alphanums, exact=2)
)

list_ipAddress = delimitedList(ipAddress)
name = Word(alphanums + "-_")
value = list_ipAddress | macAddress | prop | CharsNotIn(";")
comment = "#" + restOfLine

LBRACE, RBRACE, WHITE_SPACE = map(Suppress, "{} ")

struct = Forward()

hw_eth = Group(
    Combine(Literal("hardware") + White(" ") + name) + macAddress
)

name_value = Group(name + value)

subnet = Group(
    Combine(
        "subnet"
        + White(" ")
        + ipAddress
        + White(" ")
        + "netmask"
        + White(" ")
        + ipAddress
    )
    + struct
)

option = Group(Combine("option" + White(" ") + name) + value)

host = Group(Combine("host" + White(" ") + name) + struct)

ip_range = Group(
    Literal("range") + Combine(ipAddress + White(" ") + ipAddress)
)

shared_network = Group(
    Combine(Literal("shared-network") + White(" ") + name) + struct
)

allow_deny = Group(
    (Literal("allow") | Literal("deny"))
    + ZeroOrMore(WHITE_SPACE)
    + CharsNotIn(";")
)

named_struct = Group(name + struct)

struct << Dict(
    LBRACE
    + ZeroOrMore(
        subnet
        | host
        | shared_network
        | ip_range
        | option
        | hw_eth
        | named_struct
        | allow_deny
        | name_value
    )
    + RBRACE
)

parser = Dict(
    OneOrMore(
        subnet
        | host
        | shared_network
        | ip_range
        | option
        | hw_eth
        | named_struct
        | allow_deny
        | name_value
    )
)
parser.ignore(comment)
parser.ignore(";")

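# `data` is assumed to hold the dhcpd.conf text shown above
result = None
try:
    result = parser.parseString(data)
except ParseException as pe:
    # tolerate only errors reported at the end of the input
    if "found end of text" not in str(pe):
        raise
if result is not None:
    pprint(result.asDict())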

【Question discussion】:

    Tags: python dictionary parsing pyparsing


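    【Solution 1】:

    Here is an abbreviated syntax that uses a MultiDict in place of Dict. You may be able to use MultiDict, or something like it, to solve your problem. (Also, mac_address can be written more simply as Combine(Word(hexnums, exact=2) + (":" + Word(hexnums, exact=2))*5), or you can just use the one in pyparsing_common.)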

    import pyparsing as pp
    
    ppc = pp.common
    LBRACE, RBRACE = map(pp.Suppress, "{}")
    
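    # two equivalent ways to match a prefix length from 1 to 32;
    # the Regex version rebinds the name and is the one actually used
    _1to32 = pp.oneOf([str(i) for i in range(1, 32+1)])
    _1to32 = pp.Regex(r"[12]\d|3[012]|[1-9]")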
    cidr = pp.Combine(ppc.ipv4_address + '/' + _1to32)
    
    ALLOW, DENY, POOL, RANGE = map(pp.Keyword, "allow deny pool range".split())
    range_expr = pp.Group(RANGE + (cidr | ppc.ipv4_address*2))
    
    allow_expr = pp.Group(ALLOW - ppc.ipv4_address)
    deny_expr = pp.Group(DENY - ppc.ipv4_address)
    
    
    # kind of like a defaultdict version of Dict
    class MultiDict(pp.ParseElementEnhance):
        def postParse(self, instring, loc, tokenlist):
            ret = pp.ParseResults([])
    
            for gp in tokenlist:
                key = gp[0]
                if key not in ret:
                    ret[key] = pp.ParseResults([])
                ret[key].append(gp[1:])
    
            # flatten lists of single-item lists
            for key in ret.keys():
                if all(len(value) == 1 for value in ret[key]):
                    ret[key] = pp.ParseResults([x[0] for x in ret[key]])
                else:
                    ret[key] = pp.ParseResults(ret[key])
            return ret
    
    # using the '&' operator creates an Each expression, which 
    # will parse these expressions in any order, including if they
    # are not all grouped
    pool_def = LBRACE + MultiDict(
        allow_expr[...]
        & deny_expr[...]
        & range_expr
    ) + RBRACE
    
    pool_expr = pp.Group(POOL - pp.Group(pool_def))
    parser = MultiDict(pool_expr[...])
    
    data = """\
    pool { 
       allow 1.1.1.1
       range 3.3.3.3/24
       deny 4.4.4.4
       allow 2.2.2.2
    }
    pool {
       deny 4.4.4.4
       deny 5.5.5.5
       range 6.6.6.6 7.7.7.7
    }
    """
    
    result = parser.parse_string(data, parse_all=True)
    
    import pprint
    pprint.pprint(result.as_dict())
    

    Prints:

    {'pool': [{'allow': ['1.1.1.1', '2.2.2.2'],
               'deny': ['4.4.4.4'],
               'range': ['3.3.3.3/24']},
              {'deny': ['4.4.4.4', '5.5.5.5'], 'range': [['6.6.6.6', '7.7.7.7']]}]}
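The parenthetical remark about `mac_address` in this answer can be checked with a short sketch. It assumes pyparsing 3.x (for `parse_string`). The names `hex_pair` and `sample` are introduced here for illustration only.

```python
import pyparsing as pp

# Six two-digit hex fields separated by colons, merged into a single token.
hex_pair = pp.Word(pp.hexnums, exact=2)
mac_address = pp.Combine(hex_pair + (":" + hex_pair) * 5)

# pyparsing also ships a ready-made expression for the same purpose.
builtin_mac = pp.pyparsing_common.mac_address

sample = "00:1a:2B:3c:4D:5e"
parsed = mac_address.parse_string(sample, parse_all=True)[0]
parsed_builtin = builtin_mac.parse_string(sample, parse_all=True)[0]
print(parsed, parsed_builtin)
```

Unlike the `Word(alphanums, exact=2)` version in the question, `hexnums` rejects non-hex characters such as `g` or `z`.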
    
    

    【Discussion】:

    • I couldn't get this to work. Could you try modifying the code I provided so that it works?