【Question title】: How to replace particular values in dataframe column from a dictionary?
【Posted】: 2020-02-12 19:01:55
【Question description】:

So, I have a table like the following:

Col1     Col2
ABS      45
CDC      23
POP      15

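Now, I have a dictionary aa = {'A':'AD','P':'PL','C':'LC'}. Wherever a character in Col1 matches a dictionary key, I want it replaced by the corresponding value. Characters that do not match any key should stay unchanged.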

The final table should look like this:

Col1     Col2
ADBS     45
LCDLC    23
PLOPL    15

I am trying the following code, but it does not work.

df['Col1'].str.extract[r'([A-Z]+)'].map(aa)

【Comments】:

Tags: python regex pandas dictionary

【Solution 1】:

Solution

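import pandas as pd

aa = {'A': 'AD', 'P': 'PL', 'C': 'LC'}
df = pd.DataFrame({'Col1': ['ABS', 'CDC', 'POP'],
                   'Col2': [45, 23, 15],
                  })

keys = aa.keys()
# For each string in Col1, swap every character that is a key in aa
df.Col1 = [''.join([aa.get(e) if (e in keys) else e for e in list(ee)]) for ee in df.Col1.tolist()]
df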

Output:

    Col1  Col2
0   ADBS    45
1  LCDLC    23
2  PLOPL    15

Unpacking the condensed list comprehension

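Let us write the list comprehension in a more readable form. We create a function do_something to see what happens in the first part of the list comprehension. The second part (for ee in df.Col1.tolist()) essentially iterates over every row of the column 'Col1' of the dataframe df.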

def do_something(x):
    # here x is like 'ABS'
    xx = ''.join([aa.get(e) if (e in keys) else e for e in list(x)])
    return xx
df.Col1 = [do_something(ee) for ee in df.Col1.tolist()]

Unpacking `do_something(x)`

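The function do_something(x) does the following. It is easier to follow if you try it with x = 'ABS'. The ''.join(some_list) inside do_something joins the resulting list into a single string. The following code block illustrates this.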

x = 'ABS'
print(do_something(x))
[aa.get(e) if (e in keys) else e for e in list(x)]

Output:

ADBS
['AD', 'B', 'S']

So what is the core logic?

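The following code block shows step by step how the logic works. The list comprehension introduced at the start of the solution compresses these nested for loops into a single line, so it should be preferred over the following.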

keys = aa.keys()
packlist = list()
for ee in df.Col1.tolist():
    # Here we iterate over each element of 
    # the dataframe's column (df.Col1)

    # make a temporary list
    templist = list()
    for e in list(ee):
        # here e is a single character of the string ee
        # example: list('ABS') = ['A', 'B', 'S']
        if e in keys:
            # if e is one of the keys in the dict aa
            # append the corresponding value to templist
            templist.append(aa.get(e))
        else:
            # if e is not a key in the dict aa
            # append e itself to templist
            templist.append(e)
    # join templist into one string (as ''.join does in the
    # comprehension) and append it to packlist
    packlist.append(''.join(templist))

# Finally assign the list: packlist to df.Col1 
# to update the column values
df.Col1 = packlist
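
For comparison, the membership test plus lookup in the nested loop above can be folded into a single `dict.get(e, e)` call. The following is only a minimal sketch in plain Python, with a list standing in for the column; the helper name `replace_chars` is hypothetical and not part of the original answer. It also shows `str.translate` as a different technique that should give the same result when every key is a single character.

```python
aa = {'A': 'AD', 'P': 'PL', 'C': 'LC'}

def replace_chars(text, mapping):
    """Replace each character that is a key in `mapping`; keep the rest."""
    # mapping.get(ch, ch) returns the mapped value, or ch itself if absent
    return ''.join(mapping.get(ch, ch) for ch in text)

col1 = ['ABS', 'CDC', 'POP']  # stand-in for df.Col1.tolist()

# Variant 1: comprehension with dict.get
result = [replace_chars(s, aa) for s in col1]
print(result)  # ['ADBS', 'LCDLC', 'PLOPL']

# Variant 2 (a different technique): a translation table.
# str.maketrans accepts a dict whose keys are single characters.
table = str.maketrans(aa)
translated = [s.translate(table) for s in col1]
print(translated)  # ['ADBS', 'LCDLC', 'PLOPL']
```

On the dataframe itself, the helper could be applied with df['Col1'] = df['Col1'].map(lambda s: replace_chars(s, aa)).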

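References

List and dict comprehensions are very powerful tools that any Python programmer will find useful when coding. They can neatly compress an otherwise complex block of code into one or two lines. I recommend reading up on them.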
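
The answer shows a list comprehension but no dict comprehension, so here is a small illustrative sketch of one. The names `pairs` and `vowel_keys` are made up for this example. It builds the mapping from a list of pairs and then filters it, each in a single line.

```python
# Hypothetical input: (key, replacement) pairs, e.g. read from a file
pairs = [('A', 'AD'), ('P', 'PL'), ('C', 'LC')]

# Dict comprehension: build the mapping in one line
aa = {key: value for key, value in pairs}
print(aa)  # {'A': 'AD', 'P': 'PL', 'C': 'LC'}

# Dict comprehension with a condition: keep only vowel keys
vowel_keys = {k: v for k, v in aa.items() if k in 'AEIOU'}
print(vowel_keys)  # {'A': 'AD'}

# Equivalent explicit loop for the first comprehension
aa_loop = {}
for key, value in pairs:
    aa_loop[key] = value
```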

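【Comments】:

OK. I will add some explanation to the solution. Is your question about the list comprehension? Would it help to see what is happening if I wrote the same thing as nested for loops?
Yes, that would be very helpful. Thank you very much.
@SayantanGhosh Could you please also upvote the answer? Thank you. P.S. Added extra comments to make the list comprehension clearer.
Thank you very much. This is really helpful.
@SayantanGhosh Added a references section with links on list and dict comprehensions. You may want to take a look. Learning list comprehensions has been very handy for me.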

【Solution 2】:

You can do this with replace, as below:

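import pandas as pd

df = pd.DataFrame([['ABS', '45'], ['CDC', '23'], ['POP', '15']], columns=('Col1', 'Col2'))
aa = {'A':'AD','P':'PL','C':'LC'}
pat = "|".join(aa.keys())
# regex=True is required for a callable replacement in recent pandas versions
df["Col1"] = df["Col1"].str.replace(pat, lambda x: aa.get(x[0], x[0]), regex=True)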
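
One caveat when joining the keys directly: a key that contains a regex metacharacter (such as `.` or `+`) would change the meaning of the pattern. Below is a sketch of a more defensive variant, using plain `re` on a list that stands in for the column. The names `pattern` and `replace_match` are illustrative. Because the pattern is built from the dictionary keys, nothing has to be typed out by hand, however large the dictionary is.

```python
import re

aa = {'A': 'AD', 'P': 'PL', 'C': 'LC'}

# Escape each key so regex metacharacters are matched literally;
# put longer keys first so a longer key wins over its own prefix.
keys = sorted(aa, key=len, reverse=True)
pattern = re.compile('|'.join(re.escape(k) for k in keys))

def replace_match(match):
    # match.group(0) is the matched key; look up its replacement
    return aa[match.group(0)]

col1 = ['ABS', 'CDC', 'POP']  # stand-in for df["Col1"]
result = [pattern.sub(replace_match, s) for s in col1]
print(result)  # ['ADBS', 'LCDLC', 'PLOPL']
```

The same compiled pattern and function should also work on the dataframe via df["Col1"].str.replace(pattern, replace_match, regex=True).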

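【Comments】:

This is just an example I gave; the dictionary I am using is large, and it is not possible to write all the values in the brackets.
Do you mean you cannot get the aa dictionary as above?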
