【Question Title】: Changing age using regex in pandas
【Posted】: 2019-12-28 05:36:13
【Question】:
    import pandas as pd

    dataframe = pd.DataFrame({
        'Data': ['A 90-year-old or 96-year-old and 110-year-old is 90 days  ',
                 'For all 82-year-old is the 94-year-old why 28A ',
                 'But the fact is 101-year-old 109-year-old cool 100'],
        'ID': [1, 2, 3],
    })

    # Attempted regex: replaces every age, not only those of 90 and above.
    # regex=True is passed explicitly because pandas 2.0+ defaults to regex=False.
    dataframe['New'] = dataframe['Data'].str.replace(r'\d+(-year-old)', r'>90', regex=True)

    dataframe
        Data                                                      ID  New
    0   A 90-year-old or 96-year-old and 110-year-old is 90 days  1   A >90 or >90 and >90 is 90 days
    1   For all 82-year-old is the 94-year-old why 28A            2   For all >90 is the >90 why 28A
    2   But the fact is 101-year-old 109-year-old cool 100        3   But the fact is >90 >90 cool 100

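I am trying to use a regex to change every age of 90 and above. For example, 90-year-old should become >90, but 82-year-old, or any other age below 90, should be left alone. As shown above, I am close to what I want, but 82-year-old is still changed to >90 when it should not be.

How can I change the regex in this line of code

    dataframe['New'] = dataframe['Data'].str.replace(r'\d+(-year-old)', r'>90', regex=True)

so that 90-year-old and above (for example 91-year-old, 98-year-old, 105-year-old, and so on) is changed to >90?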

【Comments】:

    Tags: python regex string pandas replace


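    【Solution 1】:

    You can express this with a regex that covers two cases, 9[1-9] and \d{3,} (regex=True is passed explicitly because pandas 2.0 and later default to regex=False):

    dataframe['New'] = dataframe['Data'].str.replace(r'(9[1-9]|\d{3,})(-year-old)', r'>90', regex=True)

    The first part, 9[1-9], matches every value from 91 to 99. The second part, \d{3,}, matches any number with three or more digits (a value like 1234 is of course very unlikely as an age).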
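To make the two branches concrete, here is a minimal sketch that checks a few ages against the pattern using only the standard `re` module. The `samples` list and the `matched` dictionary are illustrative names, not part of the original answer.

```python
import re

# Pattern from the answer: 91-99, or any number with three or more digits.
pattern = re.compile(r'(9[1-9]|\d{3,})(-year-old)')

# Hypothetical sample ages chosen to sit on both sides of each branch.
samples = ['82-year-old', '90-year-old', '91-year-old',
           '99-year-old', '110-year-old']

# fullmatch requires the whole string to match, so each sample is
# classified by whether its number falls in one of the two branches.
matched = {s: bool(pattern.fullmatch(s)) for s in samples}
print(matched)
```

Note that 90 itself is not matched by this version of the pattern, which is why the answer goes on to offer a variant that includes it.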

    For the given sample data, we get:

    >>> dataframe['Data'].str.replace(r'(9[1-9]|\d{3,})(-year-old)', r'>90', regex=True)
    0    A 90-year-old or >90 and >90 is 90 days  
    1      For all 82-year-old is the >90 why 28A 
    2             But the fact is >90 >90 cool 100
    Name: Data, dtype: object
    

    If you want to include 90 as well, change the regex to:

    >>> dataframe['Data'].str.replace(r'(9\d|\d{3,})(-year-old)', r'>90', regex=True)
    0          A >90 or >90 and >90 is 90 days  
    1    For all 82-year-old is the >90 why 28A 
    2           But the fact is >90 >90 cool 100
    Name: Data, dtype: object
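As an alternative to encoding the threshold in the pattern itself, the number can be captured and compared numerically in a replacement function. The sketch below uses plain `re.sub` on the sample strings. The helper name `cap_ages`, its `threshold` parameter, and the `\b` word boundary are assumptions added for illustration and are not part of the original answer.

```python
import re

def cap_ages(text, threshold=90):
    """Replace 'N-year-old' with '>threshold' when N >= threshold (hypothetical helper)."""
    def repl(match):
        age = int(match.group(1))
        # Keep the original text unchanged for ages below the threshold.
        return f'>{threshold}' if age >= threshold else match.group(0)
    # \b stops the match from starting in the middle of a longer token.
    return re.sub(r'\b(\d+)-year-old', repl, text)

rows = ['A 90-year-old or 96-year-old and 110-year-old is 90 days  ',
        'For all 82-year-old is the 94-year-old why 28A ',
        'But the fact is 101-year-old 109-year-old cool 100']

result = [cap_ages(row) for row in rows]
for line in result:
    print(line)
```

pandas `Series.str.replace` also accepts a callable as `repl` when `regex=True`, so a function like the inner `repl` could be passed there directly. The trade-off is that a callable is usually slower than a pure pattern on large columns, but the threshold is easier to change.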
    

    【Comments】:
