【Posted】: 2019-12-28 05:36:13
【Question】:
import pandas as pd

dataframe = pd.DataFrame({
    'Data': ['A 90-year-old or 96-year-old and 110-year-old is 90 days ',
             'For all 82-year-old is the 94-year-old why 28A ',
             'But the fact is 101-year-old 109-year-old cool 100'],
    'ID': [1, 2, 3],
})

# tried this regex (regex=True is required on recent pandas versions)
dataframe['New'] = dataframe['Data'].str.replace(r'\d+(-year-old)', r'>90', regex=True)
dataframe
   Data                                                      ID  New
0  A 90-year-old or 96-year-old and 110-year-old is 90 days  1   A >90 or >90 and >90 is 90 days
1  For all 82-year-old is the 94-year-old why 28A            2   For all >90 is the >90 why 28A
2  But the fact is 101-year-old 109-year-old cool 100        3   But the fact is >90 >90 cool 100
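I am trying to use a regex to replace every age of 90 or above. For example, 90-year-old should become >90, but 82-year-old, or any other age below 90, should be left unchanged. As the output above shows, I am close to what I want, but 82-year-old is still changed to >90 when it should not be.
How can I change the regex in this line of code
dataframe['New'] = dataframe['Data'].str.replace(r'\d+(-year-old)', r'>90', regex=True)
so that only 90-year-old and above (e.g. 91-year-old, 98-year-old, 105-year-old) is changed to >90?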
【Discussion】:
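One possible approach, offered as a sketch rather than the definitive answer: `\d+` matches any number, so the pattern has to be narrowed to numbers of 90 or more. Assuming ages are whole numbers written without leading zeros, `9\d` covers 90 to 99 and `[1-9]\d{2,}` covers 100 and up. An alternative is to capture the number and let a replacement function decide. The names `AGE_90_PLUS`, `cap_age` and the `New_callable` column are made up for this illustration.

```python
import pandas as pd

dataframe = pd.DataFrame({
    'Data': ['A 90-year-old or 96-year-old and 110-year-old is 90 days ',
             'For all 82-year-old is the 94-year-old why 28A ',
             'But the fact is 101-year-old 109-year-old cool 100'],
    'ID': [1, 2, 3],
})

# Option 1: encode "90 or more" in the pattern itself.
#   9\d          -> 90..99
#   [1-9]\d{2,}  -> 100 and above (assumes no leading zeros)
# \b keeps the match from starting in the middle of a longer number.
AGE_90_PLUS = r'\b(?:9\d|[1-9]\d{2,})-year-old\b'
dataframe['New'] = dataframe['Data'].str.replace(AGE_90_PLUS, '>90', regex=True)

# Option 2: capture the number and compare it numerically in a callable.
def cap_age(match):
    """Return '>90' for ages of 90 or more, otherwise the original text."""
    age = int(match.group(1))
    return '>90' if age >= 90 else match.group(0)

dataframe['New_callable'] = dataframe['Data'].str.replace(
    r'\b(\d+)-year-old\b', cap_age, regex=True
)

print(dataframe['New'].tolist())
# Row 1 becomes 'For all 82-year-old is the >90 why 28A ' (82 is left alone).
```

The pure-regex version is compact, while the callable version is easier to adapt if the threshold changes, since the comparison is ordinary Python rather than digit patterns.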
Tags: python regex string pandas replace