How to concatenate two values from different columns into a single column

【Question title】: How to concatenate two values from different columns into a single column
【Posted】: 2019-05-22 20:05:47
【Question description】:

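I have a DataFrame with two string columns that need to be combined into a single column.

The two columns contain three kinds of values.

1. Column Comment_vol contains blank, Pass, and VolA

2. Column Comment_wt contains wtA and Pass

Now I need a column such that:

When Comment_vol is blank and Comment_wt has any value, the Comment_wt value is taken, and vice versa
When both columns contain Pass, the result should be Pass
When VolA and wtA are both present, both should be kept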

Input:

  Comment_vol    Comment_wt     
  Pass           wtA            
                 Pass            
  VolA           Pass           
  Pass           Pass           
                 wtA            
  VolA           wtA

Output:

  Comment_vol    Comment_wt     Comment_final
  Pass           wtA            wtA
                 Pass           Pass 
  VolA           Pass           VolA
  Pass           Pass           Pass
                 wtA            wtA
  VolA           wtA            VolA, wtA

Code:

 df['Comment'] = df['Comment_vol'].str.cat(df['Comment_wt'], sep=" ")
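
For reference, here is a small sketch of what an unconditional str.cat call produces on the sample input. It assumes blanks are stored as empty strings; the question does not say whether they are '' or NaN.

```python
import pandas as pd

# Sample input from the question; blanks are assumed to be empty strings
df = pd.DataFrame({
    'Comment_vol': ['Pass', '', 'VolA', 'Pass', '', 'VolA'],
    'Comment_wt': ['wtA', 'Pass', 'Pass', 'Pass', 'wtA', 'wtA'],
})

# Unconditional concatenation: every row gets both values joined by a space
joined = df['Comment_vol'].str.cat(df['Comment_wt'], sep=' ')
print(joined.tolist())
# ['Pass wtA', ' Pass', 'VolA Pass', 'Pass Pass', ' wtA', 'VolA wtA']
```

Every row keeps both values, so 'Pass' and blanks are never dropped. The rules above need a per-row condition, not a plain join.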

【Question discussion】:

Tags: python-3.x pandas dataframe concatenation

【Solution 1】:

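import pandas as pd

def concatcolumns(x):
    # Treat NaN and whitespace-only strings as blank
    vol = '' if pd.isna(x.iloc[0]) else str(x.iloc[0]).strip()
    wt = '' if pd.isna(x.iloc[1]) else str(x.iloc[1]).strip()
    if vol in ['', 'Pass']:
        # Comment_vol adds nothing; fall back to it only when Comment_wt is blank
        return wt or vol
    elif wt in ['', 'Pass']:
        return vol
    else:
        # Both columns hold a real comment, so keep both
        return ", ".join([vol, wt])

df['Comment'] = df[['Comment_vol', 'Comment_wt']].apply(concatcolumns, axis=1)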
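
To check this row-wise approach against the expected output in the question, here is a self-contained sketch. The helper name combine_comments and the choice to store blanks as NaN are assumptions for illustration. Positional access goes through .iloc because integer indexing such as x[0] on a label-indexed row is deprecated in recent pandas versions.

```python
import numpy as np
import pandas as pd

def combine_comments(row):
    # Hypothetical helper: NaN and whitespace-only strings count as blank
    vol, wt = ('' if pd.isna(v) else str(v).strip() for v in (row.iloc[0], row.iloc[1]))
    if vol in ('', 'Pass'):
        return wt or vol          # fall back to vol only when wt is blank
    if wt in ('', 'Pass'):
        return vol
    return ', '.join([vol, wt])   # both columns hold a real comment

# Sample input from the question; blanks are assumed to be NaN here
df = pd.DataFrame({
    'Comment_vol': ['Pass', np.nan, 'VolA', 'Pass', np.nan, 'VolA'],
    'Comment_wt': ['wtA', 'Pass', 'Pass', 'Pass', 'wtA', 'wtA'],
})
df['Comment'] = df[['Comment_vol', 'Comment_wt']].apply(combine_comments, axis=1)
print(df['Comment'].tolist())
# ['wtA', 'Pass', 'VolA', 'Pass', 'wtA', 'VolA, wtA']
```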

【Discussion】:

【Solution 2】:

Edit: added an explanation

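df.Comment_vol.str.strip().isin(['Pass', '']) strips any leading and trailing whitespace and uses isin to check whether each value in column Comment_vol is 'Pass' or ''. I use strip so that it still works if your data contains values such as ' Pass ' or ' VolA ' (note the leading and trailing spaces). This returns a boolean Series that is True where the value is 'Pass' or '' and False otherwise. Assign it to n

df.Comment_wt.str.strip().isin(['Pass', '']) does the same for column Comment_wt; assign it to m

'~' is the negation operator, so ~n marks the rows where the value in Comment_vol is neither 'Pass' nor ''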
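
As a concrete illustration of the masks described above, here is a minimal sketch on the sample input from the question. It assumes blanks are stored as empty strings.

```python
import pandas as pd

# Sample input from the question; blanks are assumed to be empty strings
df = pd.DataFrame({
    'Comment_vol': ['Pass', '', 'VolA', 'Pass', '', 'VolA'],
    'Comment_wt': ['wtA', 'Pass', 'Pass', 'Pass', 'wtA', 'wtA'],
})

n = df.Comment_vol.str.strip().isin(['Pass', ''])  # Comment_vol is 'Pass' or blank
m = df.Comment_wt.str.strip().isin(['Pass', ''])   # Comment_wt is 'Pass' or blank

print(n.tolist())         # [True, True, False, True, True, False]
print(m.tolist())         # [False, True, True, True, False, False]
print((~n & m).tolist())  # [False, False, True, False, False, False]
```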

np.select([n, ~n & m], [df.Comment_wt, df.Comment_vol], df.Comment_vol.str.cat(df.Comment_wt, sep=', ')) is equivalent to the following logic, applied row by row

if n:
    df.Comment_wt
elif ~n & m:  # Comment_vol is neither 'Pass' nor '', and Comment_wt is 'Pass' or ''
    df.Comment_vol
else:
    df.Comment_vol.str.cat(df.Comment_wt, sep=', ')  # concatenate both columns using ', '

This np.select call returns the following array:

np.select([n, ~n & m], [df.Comment_wt, df.Comment_vol], df.Comment_vol.str.cat(df.Comment_wt, sep=', '))

Out[350]: array(['wtA', 'Pass', 'VolA', 'Pass', 'wtA', 'VolA, wtA'], dtype=object)

This array is used to create the Comment_final column of df

You can read the np.select documentation for more information: https://docs.scipy.org/doc/numpy/reference/generated/numpy.select.html
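
To show how np.select picks among its conditions, here is a small standalone sketch with made-up numbers. The array x and the labels are illustrative only. For each element, the first condition that is True decides the result, and the default is used where none match.

```python
import numpy as np

# Illustrative data, not from the question
x = np.array([1, 5, 10, 15])

# Conditions are checked in order; the first True one wins for each element
labels = np.select([x < 5, x < 12], ['small', 'medium'], default='large')
print(labels.tolist())
# ['small', 'medium', 'medium', 'large']
```

This ordering is why a first condition n followed by ~n & m behaves like an if / elif chain.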

Original answer:
If I understand your description and output correctly, this is a classic case for np.select

import numpy as np

n = df.Comment_vol.str.strip().isin(['Pass', ''])
m = df.Comment_wt.str.strip().isin(['Pass', ''])

df['Comment_final'] = np.select([n, ~n & m], [df.Comment_wt, df.Comment_vol], df.Comment_vol.str.cat(df.Comment_wt, sep=', '))


Out[591]:
  Comment_vol Comment_wt Comment_final
0        Pass        wtA           wtA
1                   Pass          Pass
2        VolA       Pass          VolA
3        Pass       Pass          Pass
4                    wtA           wtA
5        VolA        wtA     VolA, wtA
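
The question does not say whether the blank cells are empty strings or NaN. If they are NaN, isin(['Pass', '']) will not match them, so one possible adjustment is to fill them with '' first. This is a sketch under that assumption; the intermediate names vol and wt are introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample input from the question; blanks are assumed to be NaN here
df = pd.DataFrame({
    'Comment_vol': ['Pass', np.nan, 'VolA', 'Pass', np.nan, 'VolA'],
    'Comment_wt': ['wtA', 'Pass', 'Pass', 'Pass', 'wtA', 'wtA'],
})

# Normalize: NaN becomes '', surrounding whitespace is removed
vol = df.Comment_vol.fillna('').str.strip()
wt = df.Comment_wt.fillna('').str.strip()

n = vol.isin(['Pass', ''])
m = wt.isin(['Pass', ''])

df['Comment_final'] = np.select([n, ~n & m], [wt, vol], vol.str.cat(wt, sep=', '))
print(df['Comment_final'].tolist())
# ['wtA', 'Pass', 'VolA', 'Pass', 'wtA', 'VolA, wtA']
```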

【Discussion】:

Could you explain the code? I am new to this.
Thank you for the explanation.