【Posted】:2019-10-03 02:42:17
【Question】:
import re
s = '99year old 93yo 100 yo 97y.o. and his wife is 93 y.o. 20 y.o 90old 23 year old 29 years old but not 25-year-old and 91year old cousin is 99 now and 90-year-old or 102 year old'
reg = r'(?:9\d|1\d{2})(?:\s|-)?years?(?:\s|-)?old'
r1 = re.findall(reg,s)
r1
['99year old', '91year old', '90-year-old', '102 year old']
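The code above works well; it is taken from the question "extracting age variations using regex".
My goal is to extract the elements listed in r1, plus any number of 90 or above that is followed by y.o. or yo. The output I want is
['99year old', '93yo', '100 yo', '97y.o.', '93 y.o.', '91year old', '90-year-old', '102 year old']
I tried changing reg as follows, but it doesn't quite work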
reg = r'(?:9\d|1\d{2})(?:\s|-)?years?(?:\s|-)?old(?:9\d|1\d{2})y.o.|(?:9\d|1\d{2})yo'
How can I change reg to get the output I want?
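For reference, one possible direction, offered as a sketch rather than a definitive answer. The attempted pattern has no `|` between `old` and the second age group, so its first alternative only matches when "old" is immediately followed by another number. Its dots are unescaped, and it allows no optional space before the suffix, so only the bare `yo` form without a space can match. The sketch below keeps the original age prefix and puts all suffix variants in a single alternation. The name `matches` is introduced here only for illustration.

```python
import re

s = ('99year old 93yo 100 yo 97y.o. and his wife is 93 y.o. 20 y.o 90old '
     '23 year old 29 years old but not 25-year-old and 91year old cousin '
     'is 99 now and 90-year-old or 102 year old')

reg = (
    r'(?:9\d|1\d{2})'                 # age prefix from the question: 90-99 or 100-199
    r'(?:'
    r'(?:\s|-)?years?(?:\s|-)?old'    # "year old" / "years old" / "-year-old" variants
    r'|'
    r'\s?y\.?o\.?'                    # "yo", "y.o" or "y.o.", with an optional leading space
    r')'
)

# All groups are non-capturing, so findall returns the full matched text.
matches = re.findall(reg, s)
print(matches)
# ['99year old', '93yo', '100 yo', '97y.o.', '93 y.o.', '91year old', '90-year-old', '102 year old']
```

Note that `y\.?o\.?` would also match the start of a word such as "you" after a qualifying number. If that matters for the real data, a guard such as `(?!\w)` after the suffix may help.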
【Comments】:
Tags: regex python-3.x string text