【Posted】: 2021-01-14 14:13:18
【Question】:
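Below is a text from which I want to extract the month (July, in this example).
word_pattern checks that the text contains certain words,
and month_pattern extracts the month. So I first verify that the passage
contains those words and, if it does, I try to extract the month.
Each pattern matches when used on its own, but when I combine them I get no match. I don't know what I'm doing wrong.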
import re
text = ''' The number of shares of the
registrant’s common stock outstanding as
of July 31, 2017 was 52,833,429.'''
# patterns
word_pattern = r'(?=.*outstanding[.,]?)(?=.*common)(?=.*shares)'
month_pattern = r'(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)'
pattern = word_pattern + month_pattern
print(re.search(pattern, text, flags = re.IGNORECASE|re.DOTALL))
Expected result:
【Discussion】:
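A likely cause: lookaheads are zero-width, so all three in word_pattern are checked at the same position where month_pattern has to begin. At the start of the text the lookaheads succeed but no month starts there, and at "July" the lookaheads fail because "shares", "common" and "outstanding" all come before it. Below is a minimal sketch of two possible fixes, not the only way to do it. The names MONTH, extract_month_combined and extract_month_two_step are made up for this example, and the \b word boundaries and the non-capturing (?:Nov|Dec) are my additions to the original month pattern.

```python
import re

text = ''' The number of shares of the
registrant’s common stock outstanding as
of July 31, 2017 was 52,833,429.'''

# Same lookaheads as in the question.
word_pattern = r'(?=.*outstanding[.,]?)(?=.*common)(?=.*shares)'

# Month alternatives; (?:Nov|Dec) is non-capturing so group 1 is the only group.
MONTH = (r'(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?'
         r'|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(?:Nov|Dec)(?:ember)?)')

FLAGS = re.IGNORECASE | re.DOTALL


def extract_month_combined(s):
    """Single regex: check the lookaheads at the start, then skip ahead to a month."""
    # \A pins the lookaheads to the beginning of the string; the lazy .*?
    # then consumes text up to the first whole-word month name.
    pattern = r'\A' + word_pattern + r'.*?\b' + MONTH + r'\b'
    m = re.search(pattern, s, flags=FLAGS)
    return m.group(1) if m else None


def extract_month_two_step(s):
    """Two steps: verify the required words first, then search for the month."""
    required = ('outstanding', 'common', 'shares')
    if not all(re.search(word, s, flags=re.IGNORECASE) for word in required):
        return None
    m = re.search(r'\b' + MONTH + r'\b', s, flags=re.IGNORECASE)
    return m.group(1) if m else None


print(extract_month_combined(text))
print(extract_month_two_step(text))
```

The two-step version is usually easier to read and debug. The single-regex version is handy when one pattern has to be passed to some other API. Without the word boundaries, a fragment such as "Mar" inside "market" would also count as a month.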
Tags: python regex regex-lookarounds