【Question title】: Change True/False value to discrete value in pandas dataframe with np.where()
【Posted】: 2016-12-02 00:38:12
【Question】:

I am trying to assign state names to a list of college names:

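import numpy as np
import pandas as pd
df = pd.DataFrame({'College': pd.Series(['University of Michigan', 'University of Florida', 'Iowa State'])})
State = ['Michigan', 'Iowa']
# Marks rows whose college name contains any state name, but only writes the literal 'state'
df['State'] = np.where(df['College'].str.contains('|'.join(State)),
    'state','--')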

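I want to replace the 'state' value that appears on a match with the actual name of the state. Example: University of Michigan -> Michigan (instead of 'state'). Eventually, State will contain all 50 states, so I can't write 50 separate np.where statements, one per state name.

Thanks for your help.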

【Comments】:

    Tags: python pandas contains assign extraction


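    【Solution 1】:

    You can use str.extract here instead of np.where. It returns the text matched by the capture group, or NaN where nothing matches: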

    In [290]: df['State'] = df['College'].str.extract('({})'.format('|'.join(State)), expand=True)
    
    In [291]: df
    Out[291]: 
                      College     State
    0  University of Michigan  Michigan
    1   University of Florida       NaN
    2              Iowa State      Iowa
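
The same idea as a self-contained script, offered as a minimal sketch rather than the answerer's exact code. Three details are assumptions added here: the list is named `states`, `re.escape` is applied in case a name ever contains a regex metacharacter, and `fillna('--')` restores the placeholder that the question used for non-matching rows.

```python
import re

import pandas as pd

df = pd.DataFrame({'College': ['University of Michigan',
                               'University of Florida',
                               'Iowa State']})
states = ['Michigan', 'Iowa']  # shortened list, as in the question

# One alternation pattern inside a single capture group.
# re.escape is a precaution (an assumption, not part of the original answer).
pattern = '({})'.format('|'.join(re.escape(s) for s in states))

# expand=False returns a Series for a single capture group;
# rows without a match come back as NaN and are filled with '--'.
df['State'] = df['College'].str.extract(pattern, expand=False).fillna('--')

print(df)
```

With `expand=False` the result is a Series, which makes the column assignment explicit; `expand=True` returns a one-column DataFrame instead.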
    

    【Discussion】:

      【Solution 2】:
      # Commas are required: adjacent string literals without them are concatenated into one string
      States = [
                  'Washington', 'Wisconsin', 'West Virginia', 'Florida', 'Wyoming',
                  'New Hampshire', 'New Jersey', 'New Mexico', 'National', 'North Carolina',
                  'North Dakota', 'Nebraska', 'New York', 'Rhode Island', 'Nevada', 'Guam',
                  'Colorado', 'California', 'Georgia', 'Connecticut', 'Oklahoma', 'Ohio', 'Kansas',
                  'South Carolina', 'Kentucky', 'Oregon', 'South Dakota', 'Delaware',
                  'District of Columbia', 'Hawaii', 'Puerto Rico', 'Texas', 'Louisiana',
                  'Tennessee', 'Pennsylvania', 'Virginia', 'Virgin Islands', 'Alaska', 'Alabama',
                  'American Samoa', 'Arkansas', 'Vermont', 'Illinois', 'Indiana', 'Iowa',
                  'Arizona', 'Idaho', 'Maine', 'Maryland', 'Massachusetts', 'Utah', 'Missouri',
                  'Minnesota', 'Michigan', 'Montana', 'Northern Mariana Islands', 'Mississippi',
      ]
      
      state_str = '|'.join(States)
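      # update() only overwrites columns that already exist, so df needs a 'State' column beforehand;
      # rows with no match (NaN) keep their previous value
      df.update(df.College.str.extract(r'(?P<State>{})'.format(state_str), expand=True))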
      
      df
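
How `df.update` behaves here may not be obvious, so the following is a small sketch under stated assumptions: a shortened `states` list and a `'--'` placeholder column created up front, neither of which comes from the answer itself. The named group `(?P<State>...)` makes the extracted frame's column come out as `State`, which is what lets `update` align it with the existing column.

```python
import pandas as pd

df = pd.DataFrame({'College': ['University of Michigan',
                               'University of Florida',
                               'Iowa State']})
df['State'] = '--'  # assumption: the target column exists before update()

states = ['Michigan', 'Iowa']  # shortened list for illustration
state_str = '|'.join(states)

# The named group gives the resulting DataFrame a column called 'State'.
extracted = df['College'].str.extract(r'(?P<State>{})'.format(state_str), expand=True)

# update() modifies df in place, aligning on index and column names,
# and skips NaN values, so the non-matching row keeps '--'.
df.update(extracted)

print(df)
```

If `df` had no `State` column, `update` would leave it unchanged, because it never adds columns; a plain assignment such as `df['State'] = ...` would be needed in that case.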
      

      【Discussion】:
