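【Question title】: Edit column names that include duplicate special characters
【Posted】: 2021-05-26 12:21:55
【Question description】:

I have some column names that contain two question marks at different positions, e.g. 'how old were you? when you started university?'. I need to identify which columns have two question marks in their names. Any tips are welcome! Thanks

Data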

df = pd.DataFrame(data={'id': [1, 2, 3, 4, 5], 'how old were you? when you started university?': [1,2,3,4,5], 'how old were you when you finished university?': [1,2,3,4,5], 'at what age? did you start your first job?': [1,2,3,4,5]})

Expected output

df1 = pd.DataFrame(data={'id': [1, 2, 3, 4, 5], 'how old were you when you finished university?': [1,2,3,4,5]})

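【Comments】:

    Tags: python pandas duplicates columnname drop


    【Solution 1】:

    One idea is a list comprehension that keeps only the columns whose name contains fewer than two question marks: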

    df = df[[c for c in df.columns if c.count("?") < 2]]
    print(df)
       id  how old were you when you finished university?
    0   1                                               1
    1   2                                               2
    2   3                                               3
    3   4                                               4
    4   5                                               5
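
Since the question asks to identify the affected columns, it may help to look at the per-column counts before filtering. The following is a minimal, self-contained sketch built on the question's sample data; the names `question_marks` and `kept` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "how old were you? when you started university?": [1, 2, 3, 4, 5],
    "how old were you when you finished university?": [1, 2, 3, 4, 5],
    "at what age? did you start your first job?": [1, 2, 3, 4, 5],
})

# str.count returns the number of non-overlapping occurrences of "?".
question_marks = {col: col.count("?") for col in df.columns}

# Keep columns with fewer than two question marks; df itself is unchanged.
kept = df[[col for col, n in question_marks.items() if n < 2]]

print(question_marks)
print(kept.columns.tolist())
```

Selecting into a new variable instead of reassigning `df` leaves the original frame available for comparison.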
    

    【Comments】:

      【Solution 2】:

      You can use boolean indexing on the column labels:

      x = df.loc[:, df.columns.str.count(r"\?") < 2]
      print(x)
      

      Prints:

         id  how old were you when you finished university?
      0   1                                               1
      1   2                                               2
      2   3                                               3
      3   4                                               4
      4   5                                               5
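
The backslash in `r"\?"` matters because the pandas string method `str.count` treats its argument as a regular expression, where a bare `?` is a quantifier rather than a literal character. A hedged sketch of the same boolean mask, using `re.escape` so that any literal character can be counted safely; the variable names `pattern`, `counts`, `mask` and `filtered` are illustrative only.

```python
import re

import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "how old were you? when you started university?": [1, 2, 3, 4, 5],
    "how old were you when you finished university?": [1, 2, 3, 4, 5],
    "at what age? did you start your first job?": [1, 2, 3, 4, 5],
})

# re.escape turns the literal "?" into a regex that matches exactly that character.
pattern = re.escape("?")

# One count per column label, in column order.
counts = df.columns.str.count(pattern)

# Boolean mask: True for columns with fewer than two question marks.
mask = counts < 2
filtered = df.loc[:, mask]

print(list(counts))
print(filtered.columns.tolist())
```

Inverting the mask (`df.loc[:, ~mask]`) would instead select the columns that contain two or more question marks.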
      

      【Comments】:

        【Solution 3】:

        If you want to get all columns whose name contains more than one question mark, you can use the following:

        [c for c in df.columns if c.count("?") > 1]

        Edit: if you want to remove the extra "?" characters but keep the trailing "?", use this:

        # rename returns a new DataFrame, so assign the result; '"?" in c' also catches a leading "?"
        df = df.rename(columns={c: c.replace("?", "") + "?" for c in df.columns if "?" in c})
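
The one-liner above appends a "?" to every matched name, including names that did not end in one. A slightly more conservative sketch, assuming you only want to touch names with more than one question mark and keep a trailing "?" only where the original had it; `collapse_question_marks` is a hypothetical helper, not a pandas function.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "how old were you? when you started university?": [1, 2, 3, 4, 5],
    "how old were you when you finished university?": [1, 2, 3, 4, 5],
    "at what age? did you start your first job?": [1, 2, 3, 4, 5],
})


def collapse_question_marks(name: str) -> str:
    """Drop every "?" except a trailing one (hypothetical helper)."""
    stripped = name.replace("?", "")
    return stripped + "?" if name.endswith("?") else stripped


# Only names with more than one "?" get an entry in the mapping.
mapping = {c: collapse_question_marks(c) for c in df.columns if c.count("?") > 1}

# rename returns a new DataFrame; columns missing from the mapping keep their name.
renamed = df.rename(columns=mapping)

print(renamed.columns.tolist())
```

If two different names collapse to the same string, `rename` would produce duplicate column labels, so it may be worth checking `renamed.columns.is_unique` afterwards.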

        【Comments】:

          【Solution 4】:
          df = df.drop([col for col in df.columns if col.count("?")>1], axis=1)
          

          【Comments】:
