How can I find the percentage of something in a column of a dataframe in Python? Answers

[Question title]: How can I find the percentage of something in a column of dataframe in Python?
[Posted]: 2020-12-12 23:48:05
[Question description]:

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'Volcano Name': ['a', 'b', 'a', 'c', 'b', 'b', 'e', 'd', 'b', 'e', 'e'],
                   'Start Year': [1960, 1962, 1961, 1961, 1961, 1960, 1959, 1959, 1958, 1960, 1958],
                   'VEI': [0.0, 3.0,3.0,2.0, 3.0, 1.0, 1.0, 0.0, 2.0, 1.0, 2.0],
                   'Lat': [31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31]})

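How can I find the percentage of each volcano by VEI? There is a similar question here, but I could not figure out how to apply it to my case.

I think I should start with something like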

df.groupby('VEI').count()

or

df.pivot_table( index=['Volcano Name','VEI'], columns='Volcano Name')

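Thanks

[Question comments]:

What is your expected output?
How does VEI map to such a percentage? It is not clear what the exact formula is...
For the percentage of Volcano Name grouped by VEI: df.groupby('VEI')['Volcano Name'].apply(pd.Series.value_counts, {'normalize':True})? But it is hard to tell whether that is what you want.
@Michael Szczesny Thank you, that is almost what I need. My data is a sample of a larger dataset. Your code does work, but I cannot get the information for a specific 'Volcano Name'. It returns this error: grouped.get_group('a') >> AttributeError: 'Series' object has no attribute 'get_group'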
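
The error in the last comment is what you would expect if `grouped` holds the result of the `apply` call rather than the groupby object itself: that result is an ordinary Series indexed by (VEI, Volcano Name), and only groupby objects have `get_group`. A minimal sketch of one way to pull out a single volcano from such a result, assuming the sample dataframe from the question; the names `shares` and `shares_a` are made up for illustration, and `value_counts(normalize=True)` is used here in place of the `apply` form from the comment:

```python
import pandas as pd

df = pd.DataFrame({'Volcano Name': ['a', 'b', 'a', 'c', 'b', 'b', 'e', 'd', 'b', 'e', 'e'],
                   'Start Year': [1960, 1962, 1961, 1961, 1961, 1960, 1959, 1959, 1958, 1960, 1958],
                   'VEI': [0.0, 3.0, 3.0, 2.0, 3.0, 1.0, 1.0, 0.0, 2.0, 1.0, 2.0],
                   'Lat': [31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31]})

# Percentage of each volcano within each VEI group (sums to 100 per VEI value).
shares = df.groupby('VEI')['Volcano Name'].value_counts(normalize=True) * 100

# The result is a Series with a two-level index: level 0 is VEI, level 1 is the
# volcano name. Take a cross-section on level 1 to get one volcano's rows.
shares_a = shares.xs('a', level=1)
print(shares_a)
```

With the sample data, volcano 'a' appears in the VEI 0.0 group (one of two eruptions) and in the VEI 3.0 group (one of three), so `shares_a` has two entries.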

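Tags: python pandas group-by pivot

[Solution 1]:

This snippet groups the rows by volcano name, sums the VEI for each volcano, and expresses that sum as a percentage of the total of all VEI values. This may not be what you want (see the comments on your question), but the approach should be easy to adapt to your needs.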

# Total VEI across all rows
sum_vei = df["VEI"].sum()
# Each volcano's summed VEI as a percentage of that total
result = 100 * df.groupby('Volcano Name')["VEI"].sum() / sum_vei
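
For reference, here is a self-contained sketch of the same calculation run against the sample dataframe from the question. It also shows a count-based variant (share of eruptions per volcano), which is only one possible alternative reading of the question; the name `count_pct` is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Volcano Name': ['a', 'b', 'a', 'c', 'b', 'b', 'e', 'd', 'b', 'e', 'e'],
                   'Start Year': [1960, 1962, 1961, 1961, 1961, 1960, 1959, 1959, 1958, 1960, 1958],
                   'VEI': [0.0, 3.0, 3.0, 2.0, 3.0, 1.0, 1.0, 0.0, 2.0, 1.0, 2.0],
                   'Lat': [31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31]})

# VEI-weighted percentage: each volcano's summed VEI over the grand total.
sum_vei = df['VEI'].sum()
result = 100 * df.groupby('Volcano Name')['VEI'].sum() / sum_vei
print(result)  # 'b' accounts for 9 of the 18 total VEI points, i.e. 50%

# Alternative (an assumption about what may be wanted): percentage of rows
# (eruptions) that belong to each volcano, ignoring the VEI values.
count_pct = df['Volcano Name'].value_counts(normalize=True) * 100
print(count_pct)
```

Both Series are indexed by volcano name, so a single volcano can be looked up directly, for example `result['b']`.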

[Discussion]: