Get column value corresponding to the idxmax from another column per group: answers

【Question title】: Get column value corresponding to the idxmax from another column per group
【Posted】: 2018-12-11 13:42:03
【Question description】:

I have a dataframe consisting of 3 columns and n rows.

Before grouping, my dataframe looks like this:

Index    Max_Mass (kg/m)    Max_Diameter (m)
1             10                   1
2             20                   2
3             30                   3

200           5                    4
201           60                   3
202           20                   2

300           90                   1
301           3                    1
302           10                   1

400           100                  1
401           10                   1
402           10                   1

I group the dataframe by cutting it every 100 rows, so that I can find the maximum of a particular column within each block of 100 rows:

groups = output_df.groupby(pd.cut(output_df.index, range(0,len(output_df), 100)))

I am using the following to find the maximum of the 'Max Mass (kg/m)' column:

groups.max()['Max Mass (kg/m)']

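I now want to create another df that contains each maximum found together with the index of that value. How do I retrieve the index? I tried the following, but as far as I understand it only works on a single value, whereas the line above returns a whole column of maxima.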

(groups.max()['Max Mass (kg/m)']).getidx()

My expected output (for the DataFrame above), i.e. the new dataframe I want to create, should look like this:

Index    Max_Mass (kg/m)    Max_Diameter (m)
3             30                   3
201           60                   3
300           90                   1
400           100                  1

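【Question comments】:

It is unclear what your data or expected output should look like. Could you provide a minimal reproducible example?
@coldspeed Updated.
In your sample data, you are cutting every 3 rows.
@coldspeed The sample data has more than 402 rows; in my original edit I put a "..." after every three rows to indicate that the rows continue up to the next group (row 200).
Yes, I understand. Still, if you group every 400 rows, you should end up with only two groups: the first between rows 0 and 399, the second between rows 400 and 402 (as far as I can tell from your post). Could you check again?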

Tags: python pandas dataframe group-by pandas-groupby

【Solution 1】:

Comments inline.

# Initialise the grouper.
grouper = df.Index // 100
# Get list of indices corresponding to the max using `apply`.
idx = df.groupby(grouper).apply(
          lambda x: x.set_index('Index')['Max_Mass (kg/m)'].idxmax())
# Compute the max and update the other columns based on `idx` computed previously.
v = df.groupby(grouper, as_index=False)['Max_Mass (kg/m)'].max()
v['Index'] = idx.values
v['Max_Diameter (m)'] = df.loc[df.Index.isin(v.Index), 'Max_Diameter (m)'].values

print(v)
   Max_Mass (kg/m)  Index  Max_Diameter (m)
0               30      3                 3
1               60    201                 3
2               90    300                 1
3              100    400                 1
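
For comparison, here is a minimal sketch (not the answerer's code) that selects whole rows through the labels returned by `idxmax`, so the `Max_Diameter (m)` values do not depend on row order. The sample frame is reconstructed from the question, and the names `rows` and `result` are introduced here purely for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question; `Index` is an ordinary column here.
df = pd.DataFrame({
    'Index': [1, 2, 3, 200, 201, 202, 300, 301, 302, 400, 401, 402],
    'Max_Mass (kg/m)': [10, 20, 30, 5, 60, 20, 90, 3, 10, 100, 10, 10],
    'Max_Diameter (m)': [1, 2, 3, 4, 3, 2, 1, 1, 1, 1, 1, 1],
})

# Assign each row to a block of 100 by integer-dividing the Index column.
grouper = df['Index'] // 100

# Per group, idxmax returns the row label (in df's default RangeIndex)
# of the first row holding the maximum mass.
rows = df.groupby(grouper)['Max_Mass (kg/m)'].idxmax()

# Select the complete rows, so every column stays aligned with its maximum.
result = df.loc[rows.to_numpy()].reset_index(drop=True)
print(result)
```

If several rows in a group share the maximum, `idxmax` keeps only the first of them.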

【Comments】:

【Solution 2】:

Instead of groups.max(), you can use groups.idxmax(). Then use the returned indices to look up the maximum values. That gives you everything you need.
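
This idea could look roughly like the sketch below. It assumes the row labels live in the DataFrame's index (as in the question) and, as a simplification, swaps the question's `pd.cut` grouper for integer division of the index by 100; `idx` and `new_df` are hypothetical names.

```python
import pandas as pd

# Sample data from the question, with the row labels stored in the index.
output_df = pd.DataFrame(
    {
        'Max_Mass (kg/m)': [10, 20, 30, 5, 60, 20, 90, 3, 10, 100, 10, 10],
        'Max_Diameter (m)': [1, 2, 3, 4, 3, 2, 1, 1, 1, 1, 1, 1],
    },
    index=pd.Index([1, 2, 3, 200, 201, 202, 300, 301, 302, 400, 401, 402],
                   name='Index'),
)

# Assumption: group by blocks of 100 index labels instead of using pd.cut.
groups = output_df.groupby(output_df.index // 100)

# idxmax gives, per group, the index label at which the mass is largest.
idx = groups['Max_Mass (kg/m)'].idxmax()

# Look those labels up to get the maxima together with the other columns.
new_df = output_df.loc[idx.to_numpy()]
print(new_df)
```

Because `idxmax` returns labels rather than values, a single `.loc` lookup recovers both the maximum and the matching `Max_Diameter (m)`.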

【Comments】:

I would suggest answering with more code than sentences... it is the best way to earn reputation in this tag.