Appending Column With value_counts() in pandas: Answers

【Question title】: Appending Column With value_counts() in pandas
【Posted】: 2018-11-20 03:21:00
【Question】:

I have a DataFrame named output that looks like this:

   created_at
0  1/8/2017 0:00
1  1/8/2017 0:00
2  1/8/2017 0:00
3  1/8/2017 0:00
4  1/8/2017 0:00
5  1/8/2017 1:00
6  1/8/2017 2:00
7  1/8/2017 3:00

I want to count how many times each timestamp occurs and store the counts in df3. The result looks like this:

1/8/2017 0:00    5
1/8/2017 1:00    1
1/8/2017 3:00    1
1/8/2017 2:00    1

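What I want is to give df3 two column headers named created_at and count.

First, I dropped the duplicates from the output DataFrame and sorted the values, which gave the following: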

   created_at
0  1/8/2017 0:00
5  1/8/2017 1:00
6  1/8/2017 2:00
7  1/8/2017 3:00

Then I added a count column to the output DataFrame, but the result was:

   created_at        count
0  1/8/2017 0:00     NaN
5  1/8/2017 1:00     NaN
6  1/8/2017 2:00     NaN
7  1/8/2017 3:00     NaN

What I want to end up with is a DataFrame named result that looks like this:

   created_at        count
0  1/8/2017 0:00     5
5  1/8/2017 1:00     1
6  1/8/2017 2:00     1
7  1/8/2017 3:00     1

How can I do this? My code is below:

import pandas as pd

df1 = pd.read_csv(path1)
df2 = pd.read_csv(path2)
output = pd.merge(df1, df2, how="inner", on="created_at")
df3 = output.created_at.value_counts()

output = output.drop_duplicates()
output = output.sort_values(by=['created_at'])
output['count'] = df3


print(output,'\n\n')

Any and all help is appreciated.

Thanks

【Question comments】:

The result DataFrame is not what I currently have; it is the output I want.

Tags: python pandas dataframe append

【Solution 1】:

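After calling value_counts, use rename_axis and reset_index. Here df stands for the merged frame from the question, that is, output before drop_duplicates is applied.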

df.created_at.value_counts().rename_axis('created_at').reset_index(name='count')

      created_at  count
0  1/8/2017 0:00      5
1  1/8/2017 2:00      1
2  1/8/2017 1:00      1
3  1/8/2017 3:00      1
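
For anyone who wants to try this without the original CSV files, here is a minimal self-contained sketch of the same chain. The sample data is made up to mirror the question, and the final sort is an optional extra (not part of the answer above) that orders rows by the timestamp string rather than by count.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's merged frame.
df = pd.DataFrame({
    "created_at": ["1/8/2017 0:00"] * 5
    + ["1/8/2017 1:00", "1/8/2017 2:00", "1/8/2017 3:00"]
})

result = (
    df["created_at"]
    .value_counts()              # Series: index = timestamp, values = occurrences
    .rename_axis("created_at")   # name the index so it becomes a 'created_at' column
    .reset_index(name="count")   # move the index into a column, call the values 'count'
    .sort_values("created_at")   # optional: plain string sort, not a chronological one
    .reset_index(drop=True)      # renumber the rows 0..n-1 after sorting
)

print(result)
```

Note that value_counts orders rows by count, so the relative order of timestamps with equal counts should not be relied on; that is why the sketch sorts explicitly.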

Alternatively, use groupby + agg:

df.groupby('created_at').created_at.agg([('count', 'count')]).reset_index()

      created_at  count
0  1/8/2017 0:00      5
1  1/8/2017 1:00      1
2  1/8/2017 2:00      1
3  1/8/2017 3:00      1
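
As a side note on the NaN column in the question: assigning a Series to a DataFrame column aligns on index labels. df3 is indexed by the timestamp strings, while the deduplicated output keeps the integer labels 0, 5, 6, 7, so no label matches. The sketch below reproduces that with made-up sample data and shows one possible fix using Series.map, which keeps the question's deduplicated frame and its original row labels. This is an illustration under those assumptions, not the answerer's method.

```python
import pandas as pd

# Hypothetical stand-in for the merged `output` frame from the question.
output = pd.DataFrame({
    "created_at": ["1/8/2017 0:00"] * 5
    + ["1/8/2017 1:00", "1/8/2017 2:00", "1/8/2017 3:00"]
})

df3 = output["created_at"].value_counts()  # indexed by timestamp string

output = output.drop_duplicates().sort_values(by=["created_at"])

# Direct assignment aligns df3's timestamp labels against output's integer
# labels (0, 5, 6, 7); nothing matches, so every value is NaN.
misaligned = output.assign(count=df3)

# Looking each timestamp up in df3 sidesteps the index alignment.
result = output.assign(count=output["created_at"].map(df3))

print(misaligned)
print(result)
```

Use result["count"] rather than result.count to read the new column, since count is also a DataFrame method.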

【Comments】: