Python Pandas add column for row-wise max value of selected columns [duplicate]: Answers

【Question Title】: Python Pandas add column for row-wise max value of selected columns [duplicate]
【Posted】: 2013-11-30 17:41:14
【Question】:

import pandas as pd

data = {'name': ['bill', 'joe', 'steve'],
        'test1': [85, 75, 85],
        'test2': [35, 45, 83],
        'test3': [51, 61, 45]}
frame = pd.DataFrame(data)

I want to add a new column that shows the maximum value of each row.

Desired output:

 name   test1  test2  test3  HighScore
 bill   85     35     51     85
 joe    75     45     61     75
 steve  85     83     45     85

Sometimes

frame['HighScore'] = max(data['test1'], data['test2'], data['test3'])

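works, but most of the time it raises this error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Why does it only work sometimes? Is there another way to do this?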

【Question Comments】:

Faster solutions, along with a performance comparison for this particular operation, can be found in this answer.

Tags: python python-2.7 pandas max

【Solution 1】:

>>> frame['HighScore'] = frame[['test1','test2','test3']].max(axis=1)
>>> frame
    name  test1  test2  test3  HighScore
0   bill     85     35     51         85
1    joe     75     45     61         75
2  steve     85     83     45         85
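
As for why the built-in max only works sometimes: it compares its arguments as whole objects rather than element by element. The sketch below is one way to see this, assuming the scores are held in NumPy arrays (the variable names are illustrative, not from the question). With plain lists the comparison is lexicographic, so no error is raised but the result is just one of the input lists; with multi-element arrays the comparison yields a boolean array whose truth value is ambiguous, which triggers the ValueError. With pandas Series the error message is worded slightly differently.

```python
import numpy as np

test1 = np.array([85, 75, 85])
test2 = np.array([35, 45, 83])
test3 = np.array([51, 61, 45])

# Plain lists: built-in max compares them lexicographically and
# returns one whole list, not the per-row maximum.
list_result = max(test1.tolist(), test2.tolist(), test3.tolist())

# NumPy arrays: `test2 > test1` is a boolean array, and converting
# it to a single True/False raises ValueError.
try:
    max(test1, test2, test3)
    raised = False
except ValueError:
    raised = True

# Element-wise maximum across the three arrays.
elementwise = np.maximum.reduce([test1, test2, test3])

print(list_result, raised, elementwise)
```

In this data the lexicographic result happens to equal the element-wise maximum, which is the kind of coincidence that can make the built-in call look correct.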

【Comments】:

I don't understand what (axis=1) does?
@RanjanR.Lamichhane In short, max(axis=1) returns the maximum of each row, while max(axis=0) returns the maximum of each column. See pandas.pydata.org/pandas-docs/stable/generated/…
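
To make the axis distinction concrete, here is a small sketch using the question's data. The names scores, row_max and col_max are introduced only for illustration.

```python
import pandas as pd

frame = pd.DataFrame({
    'name': ['bill', 'joe', 'steve'],
    'test1': [85, 75, 85],
    'test2': [35, 45, 83],
    'test3': [51, 61, 45],
})
scores = frame[['test1', 'test2', 'test3']]

# axis=1: reduce across the columns, giving one value per row.
row_max = scores.max(axis=1)

# axis=0: reduce down the rows, giving one value per column.
col_max = scores.max(axis=0)

print(row_max)
print(col_max)
```

row_max is indexed like the rows of frame (0, 1, 2), while col_max is indexed by the column names test1, test2 and test3.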

【Solution 2】:

>>> frame['HighScore'] = frame[['test1','test2','test3']].apply(max, axis=1)
>>> frame
    name  test1  test2  test3  HighScore
0   bill     85     35     51         85
1    joe     75     45     61         75
2  steve     85     83     45         85

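【Comments】:

NA values are ignored by default when computing the max, so this method works better.
Is there a way to get the column name as well? For example, in the first row HighScore = 85, and the column holding that high score is test1.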
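
One possible way to address both comments above is sketched here. It assumes a variant of the question's data in which joe's test1 score is missing (that NaN is made up for illustration), and the BestTest column name is hypothetical. DataFrame.max skips NaN by default (skipna=True), and DataFrame.idxmax(axis=1) returns the label of the column holding each row's maximum.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    'name': ['bill', 'joe', 'steve'],
    'test1': [85, np.nan, 85],   # joe's test1 score is missing (illustrative)
    'test2': [35, 45, 83],
    'test3': [51, 61, 45],
})
scores = frame[['test1', 'test2', 'test3']]

# Row-wise maximum; NaN is skipped by default.
frame['HighScore'] = scores.max(axis=1)

# Name of the column where each row's maximum occurs.
frame['BestTest'] = scores.idxmax(axis=1)

print(frame)
```

If two columns tie for the maximum, idxmax reports the first one in column order.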

【Solution 3】:

To find the row-wise max across several columns of df (or the min, with np.min in place of np.max), use:

df['Z'] = df[['A', 'B', 'C']].apply(np.max, axis=1)
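
A self-contained version of the line above might look like the following. The values in df and the Zmin and Zvec column names are invented for the example; the last line shows the built-in DataFrame.max method from Solution 1 giving the same result without apply.

```python
import numpy as np
import pandas as pd

# Made-up sample values for columns A, B and C.
df = pd.DataFrame({'A': [1, 9, 4], 'B': [7, 2, 4], 'C': [3, 5, 8]})
cols = ['A', 'B', 'C']

# Row-wise max and min by applying NumPy reductions to each row.
df['Z'] = df[cols].apply(np.max, axis=1)
df['Zmin'] = df[cols].apply(np.min, axis=1)

# Same maximum using the DataFrame method directly.
df['Zvec'] = df[cols].max(axis=1)

print(df)
```

Selecting through cols each time keeps the newly added result columns out of later reductions.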

【Comments】: