[Question title]: Alter a DataFrame column by matching another column against a dictionary
[Posted]: 2019-02-24 20:11:29
[Question]:

I have the following situation:

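import json
from collections import OrderedDict

import pandas as pd

postStr = """{
                     "zoneId":"0",
                     "id":["a","b","c","d","f","g"],
                     "currencycode":["USD"]
                }"""

# Parse the JSON text into an OrderedDict (a trailing comma after the
# last entry would make json.loads raise an error, so it is removed here)
postData = json.loads(postStr, object_pairs_hook=OrderedDict)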

I have a DataFrame:

df = {
'id':['a','b','c','d','f','g','h','i','j','k'],
'B':['c','d','e','d','d','c','s','e','s','q'],
'S':['f','g','h','j','e','j','t','r','p','p']
}
df1 = pd.DataFrame(df)

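Now I want a DataFrame in which the value in column B is replaced by XX whenever the row's id appears in the dictionary's id list.

Expected output: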

df = {
'id':['a','b','c','d','f','g','h','i','j','k'],
'B' :['XX','XX','XX','XX','XX','XX','s','e','s','q'],
'S' :['f','g','h','j','e','j','t','r','p','p']
}
df1 = pd.DataFrame(df)

Please help.

[Comments on the question]:

    Tags: python pandas dataframe anaconda data-science


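    [Answer 1]:

    I think you need isin with loc. Note that the example below writes 'XX' into the id column; to change B, as the question asks, pass 'B' instead of 'id' as the column label in loc.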

    df1.loc[df1['id'].isin(postData['id']), 'id'] = 'XX'
    print (df1)
       id  B  S
    0  XX  c  f
    1  XX  d  g
    2  XX  e  h
    3  XX  d  j
    4  XX  d  e
    5  XX  c  j
    6   h  s  t
    7   i  e  r
    8   j  s  p
    9   k  q  p
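
The question asks for column B to change rather than id. A minimal sketch of that variant, assuming the same data as in the question (post_str and mask are names chosen here for illustration): the row mask is still built from id, and only the column label passed to loc differs. Since g is listed under "id" in the dictionary, its B value is replaced as well.

```python
import json
from collections import OrderedDict

import pandas as pd

# Same JSON payload as in the question (without a trailing comma).
post_str = """{
    "zoneId": "0",
    "id": ["a", "b", "c", "d", "f", "g"],
    "currencycode": ["USD"]
}"""
postData = json.loads(post_str, object_pairs_hook=OrderedDict)

df1 = pd.DataFrame({
    'id': ['a', 'b', 'c', 'd', 'f', 'g', 'h', 'i', 'j', 'k'],
    'B': ['c', 'd', 'e', 'd', 'd', 'c', 's', 'e', 's', 'q'],
    'S': ['f', 'g', 'h', 'j', 'e', 'j', 't', 'r', 'p', 'p'],
})

# Boolean mask: True where the row's id appears in postData['id'].
mask = df1['id'].isin(postData['id'])

# Select rows by the mask, but assign into column B; id stays untouched.
df1.loc[mask, 'B'] = 'XX'
print(df1)
```

The mask and the target column are independent, so any column can be used to pick the rows while a different one receives the new value.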
    

    If you want a more dynamic solution, use intersection between the DataFrame's column names and the dictionary's keys, then set the values in a loop:

    postStr = """{
                         "S":["f","h"],
                         "id":["a","b","c","d","f","g"],
                         "currencycode":["USD"]
    
                    }"""
    
    postData = json.loads(postStr, object_pairs_hook=OrderedDict)
    print (postData)
    OrderedDict([('S', ['f', 'h']), 
                 ('id', ['a', 'b', 'c', 'd', 'f', 'g']), 
                 ('currencycode', ['USD'])])
    
    df = {
    'id':['a','b','c','d','f','g','h','i','j','k'],
    'B':['c','d','e','d','d','c','s','e','s','q'],
    'S':['f','g','h','j','e','j','t','r','p','p']
    }
    df1 = pd.DataFrame(df)
    
    for col in df1.columns.intersection(postData.keys()):
        df1.loc[df1[col].isin(postData[col]), col] = 'XX'
    print (df1)
       id  B   S
    0  XX  c  XX
    1  XX  d   g
    2  XX  e  XX
    3  XX  d   j
    4  XX  d   e
    5  XX  c   j
    6   h  s   t
    7   i  e   r
    8   j  s   p
    9   k  q   p
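
The loop above can also be wrapped in a small helper that returns a modified copy instead of changing df1 in place. This is only a sketch: mask_matching_values, lookup and fill are hypothetical names, not part of pandas. It also wraps a bare string value (such as "zoneId": "0" in the question's payload) in a list, because isin() only accepts list-like arguments.

```python
import pandas as pd


def mask_matching_values(df, lookup, fill='XX'):
    """Return a copy of df in which, for every column that is also a key
    of lookup, the values listed under that key are replaced by fill."""
    out = df.copy()
    # Only columns present in both the DataFrame and the dictionary.
    for col in out.columns.intersection(list(lookup)):
        allowed = lookup[col]
        # isin() rejects a bare string, so wrap scalars in a list.
        if isinstance(allowed, str) or not hasattr(allowed, '__iter__'):
            allowed = [allowed]
        out.loc[out[col].isin(allowed), col] = fill
    return out


df1 = pd.DataFrame({
    'id': ['a', 'b', 'c', 'd', 'f', 'g', 'h', 'i', 'j', 'k'],
    'B': ['c', 'd', 'e', 'd', 'd', 'c', 's', 'e', 's', 'q'],
    'S': ['f', 'g', 'h', 'j', 'e', 'j', 't', 'r', 'p', 'p'],
})
post_data = {
    'S': ['f', 'h'],
    'id': ['a', 'b', 'c', 'd', 'f', 'g'],
    'currencycode': ['USD'],  # no such column, so it is skipped
}

result = mask_matching_values(df1, post_data)
print(result)
```

Returning a copy keeps the original DataFrame available for comparison, at the cost of extra memory for large frames.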
    

    [Comments]:

    • I did this but got: TypeError: string indices must be integers
    • self.resp_df.loc[self.resp_df['id'].isin(post_req['id']), 'returnType'] = 'CTR'
    • @ayushgupta - How about casting to a list, like self.resp_df.loc[self.resp_df['id'].isin(list(post_req['id'])), 'returnType'] = 'CTR'?
    • self.resp_df.loc[self.resp_df['id'].isin(list(post_req['id'])), 'returnType'] = 'CTR' still gives TypeError: string indices must be integers
    • @ayushgupta - What does print (type(postData['id'])) show?
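
One possible cause of that TypeError, which the thread does not confirm, is that post_req is still the raw JSON text (a str) rather than the parsed dictionary. Indexing a str with 'id' raises exactly this error, and wrapping the expression in list() does not help because the indexing fails first. The sketch below uses a hypothetical request body named raw to reproduce the error and show that parsing with json.loads resolves it.

```python
import json

# Hypothetical request body that has not been parsed yet: still a str.
raw = '{"id": ["a", "b"]}'

try:
    raw['id']  # indexing a str with a str key
except TypeError as exc:
    message = str(exc)
    print(message)

# After parsing, the same lookup returns a list usable with isin().
post_req = json.loads(raw)
ids = post_req['id']
print(type(post_req), type(ids))
```

Checking type(post_req) before the isin call, as the last comment suggests for postData['id'], is a quick way to tell the two cases apart.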