【Posted】: 2016-08-05 14:01:47
【Question】:
I have weather data from the Met Office. A sample is shown below:
OB_END_TIME,ID,ID_TYPE,OB_HOUR_COUNT,VERSION_NUM,MET_DOMAIN_NAME,SRC_ID,REC_ST_IND,PRCP_AMT,PRCP_DUR,PRCP_AMT_Q,PRCP_DUR_Q,METO_STMP_TIME,MIDAS_STMP_ETIME,PRCP_AMT_J
2016-01-01 00:00, 001586, RAIN, 1, 1, SREW, 315, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 005349, RAIN, 1, 1, AWSHRLY, 310, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 008888, RAIN, 1, 1, AWSHRLY, 289, 1011, 0, , 1, , 2016-01-01 02:04, 0, ,
2016-01-01 00:00, 009182, CLBR, 1, 1, AWSHRLY, 57254, 1011, 0, , 1, , 2016-01-01 04:20, 0, ,
2016-01-01 00:00, 010990, RAIN, 1, 1, SREW, 16589, 1011, 0, , 1, , 2015-12-31 23:52, 0, ,
2016-01-01 00:00, 017221, RAIN, 1, 1, SREW, 30523, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 024726, RAIN, 1, 1, AWSHRLY, 326, 1011, 0, , 1, , 2016-01-01 02:04, 0, ,
2016-01-01 00:00, 034592, RAIN, 1, 1, SREW, 17344, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 036375, RAIN, 1, 1, SREW, 358, 1011, 0, , 1, , 2015-12-31 23:53, 0, ,
2016-01-01 00:00, 037240, RAIN, 1, 1, AWSHRLY, 360, 1011, 0, , 1, , 2015-12-31 23:53, 0, ,
2016-01-01 00:00, 038131, RAIN, 1, 1, SREW, 373, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 038180, RAIN, 1, 1, AWSHRLY, 367, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 042647, RAIN, 1, 1, AWSHRLY, 56986, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 044287, RAIN, 1, 1, SREW, 370, 1011, 0, , 1, , 2015-12-31 23:52, 0, ,
2016-01-01 00:00, 047714, RAIN, 1, 1, AWSHRLY, 339, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 052426, RAIN, 1, 1, AWSHRLY, 30750, 1011, 0, , 1, , 2016-01-01 01:59, 0, ,
2016-01-01 00:00, 053903, RAIN, 1, 1, SREW, 17314, 1011, 0, , 1, , 2016-01-01 01:36, 0, ,
2016-01-01 00:00, 056005, RAIN, 1, 1, SREW, 16596, 1011, 0, , 1, , 2015-12-31 23:51, 0, ,
2016-01-01 00:00, 056349, RAIN, 1, 1, SREW, 342, 1011, 0, , 1, , 2016-01-01 01:49, 0, ,
2016-01-01 00:00, 057294, RAIN, 1, 1, SREW, 346, 1011, 0, , 1, , 2016-01-01 01:36, 0, ,
2016-01-01 00:00, 057936, RAIN, 1, 1, AWSHRLY, 25726, 1011, 0, , 1, , 2016-01-01 02:04, 0, ,
2016-01-01 00:00, 064286, RAIN, 1, 1, AWSHRLY, 534, 1011, 0, , 1, , 2015-12-31 23:55, 0, ,
2016-01-01 00:00, 075266, RAIN, 1, 1, SREW, 513, 1011, 0, , 10001, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 075633, RAIN, 1, 1, AWSHRLY, 516, 1011, 0, , 1, , 2016-01-01 02:02, 0, ,
2016-01-01 00:00, 082584, RAIN, 1, 1, AWSHRLY, 525, 1011, 0, , 1, , 2016-01-01 02:05, 0, ,
2016-01-01 00:00, 085160, RAIN, 1, 1, SREW, 56958, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 087398, RAIN, 1, 1, AWSHRLY, 529, 1011, 0, , 1, , 2016-01-01 02:00, 0, ,
2016-01-01 00:00, 089543, RAIN, 1, 1, AWSHRLY, 622, 1011, 0, , 1, , 2015-12-31 23:57, 0, ,
2016-01-01 00:00, 095456, RAIN, 1, 1, AWSHRLY, 56424, 1011, 0, , 1, , 2015-12-31 23:58, 0, ,
2016-01-01 00:00, 096514, RAIN, 1, 1, SREW, 19187, 1011, 0, , 1, , 2015-12-31 23:52, 0, ,
2016-01-01 00:00, 099173, RAIN, 1, 1, AWSHRLY, 30529, 1011, 0, , 1, , 2015-12-31 23:59, 0, ,
2016-01-01 00:00, 100789, RAIN, 1, 1, AWSHRLY, 542, 1011, 0, , 1, , 2015-12-31 23:59, 0, ,
2016-01-01 00:00, 102844, RAIN, 1, 1, SREW, 30690, 1011, 0, , 1, , 2015-12-31 23:56, 0, ,
2016-01-01 00:00, 107737, CLBR, 1, 1, AWSHRLY, 61973, 1011, 0, , 1, , 2016-01-01 09:38, 13, ,
2016-01-01 00:00, 115901, RAIN, 1, 1, AWSHRLY, 554, 1011, 0, , 1, , 2016-01-01 02:06, 0, ,
2016-01-01 00:00, 117629, RAIN, 1, 1, SREW, 556, 1011, 0, , 1, , 2015-12-31 23:51, 0, ,
2016-01-01 00:00, 124664, RAIN, 1, 1, AWSHRLY, 19204, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 130062, RAIN, 1, 1, AWSHRLY, 382, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 138518, RAIN, 1, 1, SREW, 16725, 1011, 0, , 1, , 2015-12-31 23:52, 0, ,
2016-01-01 00:00, 142002, RAIN, 1, 1, SREW, 384, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 142508, RAIN, 1, 1, SREW, 381, 1011, 0, , 1, , 2015-12-31 23:51, 0, ,
2016-01-01 00:00, 146128, RAIN, 1, 1, SREW, 393, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 146453, RAIN, 1, 1, SREW, 386, 1011, 0, , 1, , 2016-01-01 01:36, 0, ,
2016-01-01 00:00, 157291, RAIN, 1, 1, SREW, 395, 1011, 0, , 1, , 2015-12-31 23:53, 0, ,
2016-01-01 00:00, 160108, RAIN, 1, 1, AWSHRLY, 578, 1011, 0, , 1, , 2016-01-01 02:05, 0, ,
2016-01-01 00:00, 164016, RAIN, 1, 1, SREW, 583, 1011, 0, , 1, , 2015-12-31 23:51, 0, ,
2016-01-01 00:00, 169350, RAIN, 1, 1, AWSHRLY, 57093, 1011, 0, , 1, , 2016-01-01 02:06, 0, ,
2016-01-01 00:00, 172604, RAIN, 1, 1, AWSHRLY, 458, 1011, 0, , 1, , 2015-12-31 23:53, 0, ,
2016-01-01 00:00, 174063, RAIN, 1, 1, SREW, 461, 1011, 0, , 1, , 2015-12-31 23:53, 0, ,
2016-01-01 00:00, 183798, RAIN, 1, 1, AWSHRLY, 455, 1011, 0, , 1, , 2015-12-31 23:59, 0, ,
2016-01-01 00:00, 186331, RAIN, 1, 1, AWSHRLY, 435, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 190541, RAIN, 1, 1, AWSHRLY, 413, 1011, 0, , 1, , 2016-01-01 02:00, 0, ,
2016-01-01 00:00, 193361, RAIN, 1, 1, SREW, 409, 1011, 0, , 1, , 2016-01-01 01:36, 0, ,
2016-01-01 00:00, 196391, RAIN, 1, 1, AWSHRLY, 456, 1011, 0, , 1, , 2016-01-01 02:06, 0, ,
2016-01-01 00:00, 200640, CLBR, 1, 1, AWSHRLY, 61843, 1011, , , 7000, , 2016-01-01 09:38, 13, ,
2016-01-01 00:00, 201424, RAIN, 1, 1, AWSHRLY, 30476, 1011, 0, , 1, , 2015-12-31 23:59, 0, ,
2016-01-01 00:00, 203770, RAIN, 1, 1, SREW, 421, 1011, 0, , 1, , 2015-12-31 23:59, 0, ,
2016-01-01 00:00, 203879, RAIN, 1, 1, AWSHRLY, 426, 1011, 0, , 1, , 2015-12-31 23:59, 0, ,
2016-01-01 00:00, 206102, RAIN, 1, 1, AWSHRLY, 61986, 1011, 0, , 1, , 2016-01-01 02:04, 0, ,
2016-01-01 00:00, 214288, RAIN, 1, 1, AWSHRLY, 4911, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 214511, CLBR, 1, 1, AWSHRLY, 61915, 1011, 0, , 1, , 2016-01-01 09:38, 13, ,
2016-01-01 00:00, 220392, RAIN, 1, 1, AWSHRLY, 442, 1011, 0, , 1, , 2016-01-01 02:01, 0, ,
2016-01-01 00:00, 221993, RAIN, 1, 1, SREW, 440, 1011, 0, , 1, , 2016-01-01 01:49, 0, ,
2016-01-01 00:00, 224244, RAIN, 1, 1, AWSHRLY, 436, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,
2016-01-01 00:00, 232673, RAIN, 1, 1, AWSHRLY, 487, 1011, 0, , 1, , 2015-12-31 23:59, 0, ,
2016-01-01 00:00, 233754, RAIN, 1, 1, SREW, 19188, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,
2016-01-01 00:00, 236429, RAIN, 1, 1, SREW, 498, 1011, 0, , 1, , 2015-12-31 23:52, 0, ,
The data is hourly and runs from 2016-01-01 to 2016-03-31. I have tried to count the unique SRC_ID values by the hour of OB_END_TIME. The relevant code is as follows:
import csv, collections
from datetime import datetime
# create a dictionary to store the counter values
dd = collections.defaultdict((lambda: collections.defaultdict(int)))
with open('filename.csv') as fd:
    rd = csv.reader(fd)
    next(rd)
    for row in rd:
        hour = row[0][11:]
        src_id = row[6]
        dd[hour][src_id] += 1
for hour in sorted(dd.keys()):
    for src_id in sorted(dd[hour].keys()):
        print hour, src_id, dd[hour][src_id]
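There are three things I would like to do here:
- For the hour variable, I would like to use Python's datetime module to read the hour value from the OB_END_TIME column. At the moment I use string indexing to get the hour, but I would rather use the module's functions.
- For the src_id variable, I would like to read the column by name rather than by position. That is, I want src_id to read the SRC_ID column without mentioning the [6] index.
- For dd = collections.defaultdict((lambda: collections.defaultdict(int))), is there a more efficient way to do this? Can we initialize an ordinary variable and do the same thing?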
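I hope I have made my doubts clear. If anything is unclear, please let me know in the comments. I fully understand that some of my requirements may be tricky, but I just want efficient code.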
PS: I think csv.DictReader could solve my second point, but I do not know how to apply it in this part of the code.
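On the PS, here is a minimal sketch of how csv.DictReader might be used to read SRC_ID by name. The sample_lines list is an assumption that stands in for the open file (a header and two rows copied from the sample above), and skipinitialspace=True is assumed to be wanted because every value in the data rows follows a space.

```python
import csv

# Header plus two rows copied from the sample above. A list of lines stands in
# for the open file here; csv accepts any iterable that yields strings.
sample_lines = [
    "OB_END_TIME,ID,ID_TYPE,OB_HOUR_COUNT,VERSION_NUM,MET_DOMAIN_NAME,SRC_ID,"
    "REC_ST_IND,PRCP_AMT,PRCP_DUR,PRCP_AMT_Q,PRCP_DUR_Q,METO_STMP_TIME,"
    "MIDAS_STMP_ETIME,PRCP_AMT_J",
    "2016-01-01 00:00, 001586, RAIN, 1, 1, SREW, 315, 1011, 0, , 1, , 2015-12-31 23:54, 0, ,",
    "2016-01-01 00:00, 005349, RAIN, 1, 1, AWSHRLY, 310, 1011, 0, , 1, , 2016-01-01 00:00, 0, ,",
]

# skipinitialspace=True drops the blank after each comma, so the value comes
# back as "315" rather than " 315".
reader = csv.DictReader(sample_lines, skipinitialspace=True)

# Each row is a mapping keyed by the header names, so no positional index is needed.
src_ids = [row["SRC_ID"] for row in reader]
print(src_ids)
```

With the real file, the same reader would presumably wrap the handle returned by open('filename.csv'). Note that the data rows end with a trailing comma, so each has one more field than the header; by default DictReader collects such extra fields under the key None.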
Note: I cannot use external modules such as Pandas, Numpy, SQLite, etc. I can only use the modules that ship with the Python installer. My editor is Python IDLE.
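Staying within the standard library, the first and third points might be sketched as below. This is one possible approach, not necessarily the fastest: datetime.strptime parses OB_END_TIME, and a single collections.Counter keyed by (hour, src_id) tuples replaces the nested defaultdict. The records list is hypothetical; the last two entries are made up only to show that rows from different days fall into the same hour bucket.

```python
import collections
from datetime import datetime

# Hypothetical (OB_END_TIME, SRC_ID) pairs, as they might come out of the csv reader.
records = [
    ("2016-01-01 00:00", "315"),
    ("2016-01-01 00:00", "310"),
    ("2016-01-02 00:00", "315"),  # made-up row: same hour of day, next day
    ("2016-01-01 01:00", "315"),  # made-up row: a different hour
]

# One flat Counter keyed by (hour, src_id); missing keys start at 0 automatically.
counts = collections.Counter()
for ob_end_time, src_id in records:
    # Parse the timestamp and keep only the hour of day as an int (0-23).
    hour = datetime.strptime(ob_end_time, "%Y-%m-%d %H:%M").hour
    counts[(hour, src_id)] += 1

# Tuples sort by hour first, then by src_id (compared as strings).
for (hour, src_id), n in sorted(counts.items()):
    print("%02d %s %d" % (hour, src_id, n))
```

An ordinary dict would also work, with counts[key] = counts.get(key, 0) + 1 in place of the += line. Unlike row[0][11:], which yields a string such as "00:00", the parsed .hour is an integer and ignores the minutes.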
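【Comments】:
- Why not use pandas here for efficiency?
- @Anvesh I have some system restrictions. I cannot use Pandas or Numpy for this; I have to do it the plain, basic Python way. I hope that clears up the situation.
- Import the data into an in-memory SQLite database and use the sqlite3 module for the analysis.
- Let me edit the question for further reference. @LutzHorn I cannot use any explicit modules to get the job done; I have to use basic Python only. Only datetime and similar modules can be used. SQLite, Pandas, Numpy, etc. cannot be used.
- sqlite3 is a module included in the Python standard library.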
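If sqlite3 does turn out to be allowed, the suggestion in the comments could look roughly like the sketch below. The table name obs and the rows list are assumptions for illustration only; in practice the rows would come from the CSV file.

```python
import sqlite3

# Hypothetical (OB_END_TIME, SRC_ID) pairs; the last two rows are made up.
rows = [
    ("2016-01-01 00:00", "315"),
    ("2016-01-01 00:00", "310"),
    ("2016-01-02 00:00", "315"),
    ("2016-01-01 01:00", "315"),
]

# ":memory:" creates a throwaway database that lives only in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE obs (ob_end_time TEXT, src_id TEXT)")
conn.executemany("INSERT INTO obs VALUES (?, ?)", rows)

# SQLite's strftime('%H', ...) extracts the two-digit hour from the timestamp text.
query = """
    SELECT strftime('%H', ob_end_time) AS hour, src_id, COUNT(*) AS n
    FROM obs
    GROUP BY hour, src_id
    ORDER BY hour, src_id
"""
result = conn.execute(query).fetchall()
conn.close()

for hour, src_id, n in result:
    print("%s %s %d" % (hour, src_id, n))
```

This moves the grouping and sorting into SQL, at the cost of loading every row into the database first.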
Tags: python python-2.7 csv datetime