[Posted]: 2019-11-28 12:35:22
[Question]:
I am trying to match all of the following cases except case 6 and case 8:
case 1 - deliverto should match
case 2 - deliveryto : should match
case 3 - deliveryto: should match
case 4 - delivery to : should match
case 5 - delivery address : should match
case 6 - delivery order : should NOT match
case 7 - ship to: should match
case 8 - delivery inst : should NOT match
case 9 - delivery should match
case 10 - remit to : should match
case 11 - send to: should match
case 12 - remitto: should match
case 13 - delivery: should match
case 14 - deliver: should match
case 15 - delv. : should match
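My logic is: match the first block [ship, send, remit, deliver, delivery, or delv. (the dot is optional)] when the second block [to or address] follows it, and also when no second block follows at all. However, if the third block [order or inst] follows the first block, the first block [ship or ...] should not be matched.
I used a negative lookahead for the third block, followed by an optional positive lookahead for the second block. This is the regex I have been trying: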
pattern = r"(send|remit|ship|delivery|deliver|delv\.?)\s?(?!(Order|inst))(?=(to|address)?)\:?"
The first problem I am facing: the regex still matches even when the first block is followed by the third block.
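A likely explanation for the first problem, sketched with Python's standard re module: because \s? is optional, the engine can backtrack to "zero whitespace", so the negative lookahead ends up being tested at the space instead of at "order", and there it succeeds. The sample string and the helper names (text, blocked, at_space, at_word) are illustrative only.

```python
import re

pattern = r"(send|remit|ship|delivery|deliver|delv\.?)\s?(?!(Order|inst))(?=(to|address)?)\:?"
text = "delivery order :"

# \s? first consumes the space, the lookahead then sees "order" and fails,
# so the engine backtracks: \s? matches nothing and the lookahead is
# re-tested at the space, where "Order|inst" cannot match.
m = re.search(pattern, text, re.IGNORECASE)
print(repr(m.group()), m.end())

# Testing the negative lookahead on its own at both positions:
blocked = re.compile(r"(?!(Order|inst))", re.IGNORECASE)
at_space = blocked.match(text, 8) is not None  # index 8 is the space after "delivery"
at_word = blocked.match(text, 9) is not None   # index 9 is the "o" of "order"
print(at_space, at_word)
```

The lookahead only rejects the match when it is evaluated exactly where "order" starts; any optional element before it gives the engine a position from which the lookahead passes.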
The second problem: when I put the possible cases in a list and run re.finditer() over them, the optional second block is not included in the match:
import re
l = ['case 1 - deliverto', 'case 2 - deliveryto :', 'case 3 - deliveryto: ', 'case 4 - delivery to :', 'case 5 - delivery address :', 'case 6 - delivery order :', 'case 7 - ship to:', 'case 8 - delivery inst :', 'case 9 - delivery ', 'case 10 - remit to :', 'case 11 - send to:', 'case 12 - remitto:', 'case 13 - delivery: ', 'case 14 - deliver: ', 'case 15 - delv. :']
for i in l:
    # print every match found in the current test string
    print([m.group() for m in re.finditer(pattern, i, re.IGNORECASE)])
This gives:
['deliver']
['delivery']
['delivery']
['delivery ']
['delivery ']
['delivery']
['ship ']
['delivery']
['delivery ']
['remit ']
['send ']
['remit']
['delivery:']
['deliver:']
['delv. :']
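The missing "to" in the output above is consistent with how lookaheads behave: they are zero-width, so whatever (?=(to|address)?) sees is never part of m.group(). A minimal sketch, using the pattern from the question and one of the sample strings; the names whole and peeked are illustrative.

```python
import re

pattern = r"(send|remit|ship|delivery|deliver|delv\.?)\s?(?!(Order|inst))(?=(to|address)?)\:?"
m = re.search(pattern, "case 7 - ship to:", re.IGNORECASE)

whole = m.group()    # text actually consumed by the match
peeked = m.group(3)  # text captured inside the positive lookahead (third group)
print(repr(whole), repr(peeked))  # 'ship ' 'to'
```

The lookahead does find "to", but only as a capture group; the overall match ends before it, and the trailing \:? then has no colon to consume at that position.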
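I need the match to include the optional to or address block when it is present. What am I doing wrong in my regex?
For details of the implementation, please see this regex101 demo. Thanks.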
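One possible rewrite, offered as a sketch rather than a definitive answer: consume the optional second block instead of looking ahead for it, and place the negative lookahead before the first block so that backtracking from "delivery" to "deliver" cannot bypass it. The name PATTERN and the exact shape of the lookahead are assumptions of this sketch, not part of the original attempt.

```python
import re

PATTERN = re.compile(
    r"""
    \b
    (?!\w+\.?\s*(?:order|inst))   # reject a word that is followed by order/inst
    (?:send|remit|ship|delivery|deliver|delv\.?)
    (?:\s*(?:to|address))?        # consume the optional second block
    (?:\s*:)?                     # consume an optional colon
    """,
    re.IGNORECASE | re.VERBOSE,
)

cases = ['case 1 - deliverto', 'case 2 - deliveryto :', 'case 3 - deliveryto: ',
         'case 4 - delivery to :', 'case 5 - delivery address :',
         'case 6 - delivery order :', 'case 7 - ship to:',
         'case 8 - delivery inst :', 'case 9 - delivery ', 'case 10 - remit to :',
         'case 11 - send to:', 'case 12 - remitto:', 'case 13 - delivery: ',
         'case 14 - deliver: ', 'case 15 - delv. :']

# One list of matched strings per test case; cases 6 and 8 should come back empty.
results = [[m.group() for m in PATTERN.finditer(c)] for c in cases]
for case, found in zip(cases, results):
    print(f"{case!r} -> {found}")
```

With this sketch, cases 6 and 8 produce no match, while the others return the first block together with any to/address block and colon that follow. It would also reject strings such as "delivery instructions", since inst is matched as a prefix.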
[Comments]:
Tags: python regex python-3.x regex-lookarounds