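【Posted】:2017-08-25 01:06:05
【Question】:
I want to get the most frequent value per week from each column of a DataFrame indexed by datetime. I know this can be done when the DataFrame's entries are all int or float, but I am looking for a general approach that does not rely on the int or float data types.
Here is an example in which every entry of the DataFrame is a tuple: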
2015-11-15 00:00:00 (3, 10.0, 0) nan
2015-11-16 00:00:00 nan nan
2015-11-17 00:00:00 nan nan
2015-11-18 00:00:00 (3, 10.0, 0) nan
2015-11-19 00:00:00 (3, 10.0, 0) nan
2015-11-20 00:00:00 (4, 8.2, 0) nan
2015-11-21 00:00:00 (4, 8.2, 0) nan
2015-11-22 00:00:00 (4, 8.2, 0) (1, 1.4, 1)
2015-11-23 00:00:00 (3, 18.0, 1) (3, 10.0, 0)
2015-11-26 00:00:00 (4, 8.2, 0) (1, 1.4, 1)
2015-11-27 00:00:00 (4, 8.2, 0) (3, 10.0, 0)
2015-11-28 00:00:00 nan (1, 1.4, 1)
2015-11-29 00:00:00 (4, 8.2, 0) (3, 10.0, 0)
2015-11-30 00:00:00 (4, 8.2, 0) (1, 1.4, 1)
This should be converted into a DataFrame made up of the most frequent tuple within each week, like this:
2015-11-15 00:00:00 (3, 10.0, 0) nan
2015-11-22 00:00:00 (4, 8.2, 0) (1, 1.4, 1)
My priority is efficiency; speed really matters in my application.
Edit
3046920017503 3046920017541
index
2015-11-15 NaN NaN
2015-11-16 NaN NaN
2015-11-17 NaN NaN
2015-11-18 NaN NaN
2015-11-19 NaN NaN
2015-11-20 NaN NaN
2015-11-21 NaN NaN
2015-11-22 NaN NaN
2015-11-23 NaN NaN
2015-11-24 NaN NaN
2015-11-25 NaN NaN
2015-11-26 NaN NaN
2015-11-27 NaN NaN
2015-11-28 NaN NaN
2015-11-29 NaN NaN
2015-11-30 NaN NaN
2015-12-01 (3, 10.0, 0) (3, 10.0, 0)
2015-12-02 (3, 10.0, 0) (3, 10.0, 0)
2015-12-03 (3, 10.0, 0) (3, 10.0, 0)
2015-12-04 (3, 10.0, 0) (3, 10.0, 0)
2015-12-05 (3, 10.0, 0) (3, 10.0, 0)
2015-12-06 (3, 10.0, 0) (3, 10.0, 0)
should be converted to:
2015-11-15 NaN NaN
2015-11-22 NaN NaN
2015-11-29 (3, 10.0, 0) (3, 10.0, 0)
but the suggested approach produces:
3046920017503 3046920017541
index
2015-12-05 (3, 10.0, 0) (3, 10.0, 0)
2015-12-12 (3, 10.0, 0) (3, 10.0, 0)
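For reference, here is a minimal sketch of the behaviour described above. It is not the suggested approach from the answers. It assumes weeks run Sunday to Saturday and are labelled by their starting Sunday (which is what the expected output implies), that all-NaN weeks are kept as NaN, and that ties go to the value seen first. The names `most_frequent` and `weekly_mode` and the column names `a` and `b` are made up for illustration. It counts values with `collections.Counter`, so it only needs the entries to be hashable, and it favours clarity over speed.

```python
from collections import Counter

import numpy as np
import pandas as pd


def most_frequent(series):
    """Return the most common non-NaN value in `series`, or NaN if there is none."""
    counts = Counter(series.dropna().tolist())
    if not counts:
        return np.nan
    return counts.most_common(1)[0][0]


def weekly_mode(df):
    """Most frequent value per column for each Sunday-to-Saturday week."""
    days = df.index.normalize()
    # dayofweek is 0 for Monday and 6 for Sunday, so this maps each date
    # back to the Sunday that starts its week.
    week_start = days - pd.to_timedelta((days.dayofweek + 1) % 7, unit="D")
    columns = {}
    for col in df.columns:
        labels, values = [], []
        for start, block in df[col].groupby(week_start):
            labels.append(start)
            values.append(most_frequent(block))
        # dtype=object keeps each tuple as a single cell.
        columns[col] = pd.Series(values, index=pd.DatetimeIndex(labels), dtype=object)
    return pd.DataFrame(columns)


# The first sample from the question (column names are hypothetical).
A, B, C, D = (3, 10.0, 0), (4, 8.2, 0), (3, 18.0, 1), (1, 1.4, 1)
nan = np.nan
dates = pd.date_range("2015-11-15", "2015-11-23").append(
    pd.date_range("2015-11-26", "2015-11-30")
)
df = pd.DataFrame(
    {
        "a": pd.Series([A, nan, nan, A, A, B, B, B, C, B, B, nan, B, B], index=dates, dtype=object),
        "b": pd.Series([nan] * 7 + [D, A, D, A, D, A, D], index=dates, dtype=object),
    }
)
weekly = weekly_mode(df)
print(weekly)
```

On this sample the rows labelled 2015-11-15 and 2015-11-22 match the expected output above. Because the sample also contains 2015-11-29 and 2015-11-30, the sketch returns a third row for the week starting 2015-11-29.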
【Discussion】:
Tags: python-3.x pandas datetime dataframe apply