【Question Title】: Changing the Contents of a Pandas Column Based on Another Column
【Posted】: 2020-10-29 09:27:51
【Question】:

I have a pandas DataFrame that looks something like this:

Neighborhood      High School      ...
WOODLEY           LIBERTY
WOODLEY 
COUNTRY CLUB  
COUNTRY CLUB      HERITAGE
COUNTRY CLUB      HERITAGE
COUNTRY CLUB      TUSCORORA
...

As you can see, some entries are blank or incorrect, so I am trying to fix them. I started by writing the function below.

def cleanHS(dat):
    if dat.Neighborhood == "WOODLEY":
        dat["High School"] == "LIBERTY"
    elif dat.Neighborhood == "COUNTRY CLUB":
        dat["High School"] == "HERITAGE"
    ...

    return dat

Then I call the function.

dirty["High School"] = dirty["High School"].map(cleanHS)

This is where I get an attribute error: AttributeError: 'str' object has no attribute 'Neighborhood'

How can I fix this?

【Comments】:

    Tags: python python-3.x pandas dataframe mapping


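    【Solution 1】:

    No loop is needed here. You can create a dictionary of key-value pairs that maps each Neighborhood to the corrected High School value, then map the Neighborhood column through it.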

    d = {"WOODLEY": "LIBERTY", "COUNTRY CLUB": "HERITAGE"}
    dirty['High School'] = dirty['Neighborhood'].map(d)
    

    Output

    Neighborhood      High School
    WOODLEY           LIBERTY
    WOODLEY           LIBERTY
    COUNTRY CLUB      HERITAGE
    COUNTRY CLUB      HERITAGE
    COUNTRY CLUB      HERITAGE
    COUNTRY CLUB      HERITAGE
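
One caveat worth sketching: `Series.map` with a dictionary yields NaN for any key that is not in the dictionary, so a neighborhood missing from `d` would lose its existing High School value. The example below is a minimal sketch under assumptions: the sample frame is modeled on the question, blanks are represented as `None`, and the `RIVERSIDE` / `CENTRAL` row is hypothetical, added only to show the fallback.

```python
import pandas as pd

# Sample data modeled on the question; the last row is hypothetical
# and exists only to show a neighborhood that is absent from the dictionary.
dirty = pd.DataFrame({
    "Neighborhood": ["WOODLEY", "WOODLEY", "COUNTRY CLUB", "COUNTRY CLUB",
                     "COUNTRY CLUB", "COUNTRY CLUB", "RIVERSIDE"],
    "High School": ["LIBERTY", None, None, "HERITAGE",
                    "HERITAGE", "TUSCORORA", "CENTRAL"],
})

d = {"WOODLEY": "LIBERTY", "COUNTRY CLUB": "HERITAGE"}

# map() gives NaN where the neighborhood is not a key of d,
# so fall back to the value that was already in the column.
dirty["High School"] = dirty["Neighborhood"].map(d).fillna(dirty["High School"])

print(dirty)
```

If every neighborhood in the data is covered by the dictionary, the `fillna` step changes nothing and the plain `map` call is enough.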
    

    【Comments】:

      【Solution 2】:

      Here is the correct approach. Mapping with a dictionary is straightforward (as the other answer shows).

      cleanHS = {"WOODLEY": "LIBERTY", "COUNTRY CLUB": "HERITAGE", ...}
      

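      However, to map the two columns correctly, the Neighborhood column has to be involved. Your code maps the values of High School to other values, but the column the mapping starts from should be Neighborhood.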

      dirty["High School"] = dirty["Neighborhood"].map(cleanHS)
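
To make the cause of the AttributeError concrete: `Series.map` calls the function once per cell, so in the question the function received a plain string rather than a row. The sketch below shows this, and also shows a row-wise alternative with `DataFrame.apply(axis=1)` for cases where a function is preferred over a dictionary. It is an illustration under assumptions, not the answerers' method: the two-row frame and the name `clean_hs` are hypothetical. Note also that the original function used `==` (comparison) where an assignment was intended, so it would not have changed anything even with a row as input.

```python
import pandas as pd

# Small hypothetical frame modeled on the question's data.
dirty = pd.DataFrame({
    "Neighborhood": ["WOODLEY", "COUNTRY CLUB"],
    "High School": ["", "TUSCORORA"],
})

# Series.map passes each cell value to the function, one at a time.
# Every value here is a str, which has no .Neighborhood attribute.
seen_types = dirty["High School"].map(lambda value: type(value).__name__).tolist()
print(seen_types)

# Row-wise alternative: with axis=1, apply passes each row as a Series,
# so both columns are available inside the function.
def clean_hs(row):
    if row["Neighborhood"] == "WOODLEY":
        return "LIBERTY"
    if row["Neighborhood"] == "COUNTRY CLUB":
        return "HERITAGE"
    return row["High School"]  # leave other neighborhoods unchanged

dirty["High School"] = dirty.apply(clean_hs, axis=1)
print(dirty)
```

For a simple lookup like this one, the dictionary with `map` is usually shorter and faster than a row-wise `apply`; the function form is mainly useful when the rule depends on more than one column.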
      

      【Comments】:
