Answers: pandas / numpy arithmetic mean in csv file

【Question title】: pandas / numpy arithmetic mean in csv file
【Posted】: 2015-12-01 08:44:07
【Question】:

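I have a csv file with 3000 rows and 5 columns, and more rows are appended to it every week.

What I want to do each week is find the arithmetic mean of the last column over the last 1000 rows. (So when new rows are added each week, it only takes the mean of the most recent 1000 rows.)

How should I set up the pandas or numpy array to achieve this?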

import pandas as pd

df = pd.read_csv("fds.csv", index_col=False, header=0)
df_1 = df['Results']
# How should I write the next line to get the average of the most recent 1000 rows?

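【Comments】:

Based on the documentation, you should use pandas' tail function to get the last 1000 rows.
Thanks! I knew about head and tail, but it didn't occur to me to use them this way. Thanks, Matt.
Cool. If the answer below helped you, be sure to accept it as the answer (or a better one, if someone posts it).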

Tags: numpy pandas average mean

【Solution 1】:

I'm on a different machine from the one with my pandas installation, so I'm going from memory, but I think what you want to do is...

import numpy as np
import pandas as pd

df = pd.read_csv("fds.csv", index_col=False, header=0)
# Assumes your 5th (last) column has the header `Results`
last_thousand = df['Results'].tail(1000)
np.mean(last_thousand)
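
To check the tail-then-mean idea without the real file, here is a minimal sketch. The CSV contents, the column names other than `Results`, and the window size `n = 3` are all made up for illustration; the question itself uses the last 1000 rows of `fds.csv`.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for fds.csv: 5 columns, the last one named "Results".
csv_text = """A,B,C,D,Results
1,2,3,4,10
1,2,3,4,20
1,2,3,4,30
1,2,3,4,40
1,2,3,4,50
"""

df = pd.read_csv(io.StringIO(csv_text), header=0)

n = 3  # the question uses 1000; 3 keeps the example easy to verify by hand
last_n = df["Results"].tail(n)  # Series holding the last n values: 30, 40, 50
mean_last_n = np.mean(last_n)   # same value as last_n.mean()

print(mean_last_n)  # 40.0
```

Selecting the column first and then calling `tail` yields a Series, so the mean is taken directly on it rather than through a second column lookup.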

【Discussion】:

【Solution 2】:

Using mean() is a bit quicker:

import pandas as pd

df = pd.read_csv("fds.csv", header=0)
results = df.tail(1000).mean()

results will contain the mean of every column over the last 1000 rows. If you want more statistics, you can also use describe():

results = df.tail(1000).describe().unstack()
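
As a rough illustration of what these two calls return, the sketch below runs them on a small made-up table (the data, the column names other than `Results`, and `n = 3` are assumptions, not the asker's file). It passes `numeric_only=True` to `mean()` as a precaution, since recent pandas versions raise an error when a frame contains non-numeric columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for fds.csv.
csv_text = """A,B,C,D,Results
1,0,0,0,10
2,0,0,0,20
3,0,0,0,30
4,0,0,0,40
5,0,0,0,50
"""

df = pd.read_csv(io.StringIO(csv_text), header=0)
n = 3  # 1000 in the original question

# One mean per column, returned as a Series indexed by column name.
col_means = df.tail(n).mean(numeric_only=True)
last_col_mean = col_means["Results"]

# describe() gives count/mean/std/min/quartiles/max per column;
# unstack() flattens it into a Series keyed by (column, statistic).
stats = df.tail(n).describe()
stats_long = stats.unstack()

print(col_means)
print(stats_long.loc[("Results", "mean")])
```

If only the last column matters, picking it out of `col_means` by name (or using `df.iloc[:, -1].tail(n).mean()`) avoids computing statistics that are never read.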

【Discussion】:

【Solution 3】:

So basically I needed to use the pandas tail function. My code below works.

import numpy
import pandas as pd

df = pd.read_csv("fds.csv", index_col=False, header=0)
df_1 = df['Results']
numpy.average(df_1.tail(1000))
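
Since the file grows every week, the same steps could be wrapped in a small function and re-run after each append. This is only a sketch: `mean_of_last_rows` is a hypothetical helper name, the sample data is invented, and reading a single column through `usecols` is an optional choice, not something the original code does.

```python
import io

import numpy as np
import pandas as pd


def mean_of_last_rows(source, column="Results", n=1000):
    """Return the arithmetic mean of the last n values in one CSV column.

    `source` is a file path or file-like object. If the file has fewer
    than n data rows, tail() returns all of them and the mean covers those.
    """
    values = pd.read_csv(source, header=0, usecols=[column])[column]
    return float(np.average(values.tail(n)))


# Hypothetical contents of the file in two consecutive weeks.
week1 = "A,B,C,D,Results\n1,2,3,4,10\n1,2,3,4,20\n1,2,3,4,30\n1,2,3,4,40\n1,2,3,4,50\n"
week2 = week1 + "1,2,3,4,60\n"

mean_week1 = mean_of_last_rows(io.StringIO(week1), n=3)  # rows 30, 40, 50
mean_week2 = mean_of_last_rows(io.StringIO(week2), n=3)  # rows 40, 50, 60
mean_short = mean_of_last_rows(io.StringIO(week1))       # fewer than 1000 rows

print(mean_week1, mean_week2, mean_short)
```

With the real file the call would be `mean_of_last_rows("fds.csv")`, assuming the last column is named `Results` as in the question.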

【Discussion】: