【Posted】: 2015-03-14 23:21:30
【Question】:
I have a csv file named "filename" and want to read its data as float64, except for the "hour" column. I got this working with pd.read_csv and its converters argument.
df = pd.read_csv("../data/filename.csv",
                 delimiter=';',
                 date_parser=['hour'],
                 skiprows=1,
                 converters={'column1': lambda x: float(x.replace('.', '').replace(',', '.'))})
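Now I have two questions:
First:
The delimiter works with ';', but when I look at my data in Notepad it contains ',' and not ';'. If I choose ',' instead, I get: 'pandas.parser.CParserError: Error tokenizing data. C error: Expected 7 fields in line 13, saw 9'
Second:
If I want to use the converter for all columns, how can I do that? What is the right term for it? I tried dtype = float in the read function, but I get 'AttributeError: 'NoneType' object has no attribute 'dtype''. What is going on? That is why I wanted to handle it with converters.
Data: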
,hour,PV,Wind onshore,Wind offshore,PV.1,Wind onshore.1,Wind offshore.1,PV.2,Wind onshore.2,Wind offshore.2
0,1,0.0,"12,985.0","9,614.0",0.0,"32,825.5","9,495.7",0.0,"13,110.3","10,855.5"
1,2,0.0,"12,908.9","9,290.8",0.0,"36,052.3","9,589.1",0.0,"13,670.2","10,828.6"
2,3,0.0,"12,740.9","8,886.9",0.0,"38,540.9","10,087.3",0.0,"14,610.8","10,828.6"
3,4,0.0,"12,485.3","8,644.5",0.0,"40,734.0","10,087.3",0.0,"15,638.3","10,343.7"
4,5,0.0,"11,188.5","8,079.0",0.0,"42,688.0","10,087.3",0.0,"16,809.4","10,343.7"
5,6,0.0,"11,219.0","7,594.2",0.0,"43,333.5","10,025.0",0.0,"18,266.9","10,343.7"
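A minimal sketch of one way to read data shaped like the sample above, assuming the real file matches it: comma-delimited, with quoted values that use ',' as the thousands separator and '.' as the decimal point. The `thousands` parameter of `pd.read_csv` removes the grouping commas while parsing, so no per-column converter is needed in that case. The first rows of the sample are embedded through `io.StringIO` purely for illustration, and the column names are English renderings of the sample header.

```python
import io

import pandas as pd

# First three rows of the sample, embedded so the sketch is self-contained.
SAMPLE = '''\
,hour,PV,Wind onshore,Wind offshore,PV.1,Wind onshore.1,Wind offshore.1,PV.2,Wind onshore.2,Wind offshore.2
0,1,0.0,"12,985.0","9,614.0",0.0,"32,825.5","9,495.7",0.0,"13,110.3","10,855.5"
1,2,0.0,"12,908.9","9,290.8",0.0,"36,052.3","9,589.1",0.0,"13,670.2","10,828.6"
2,3,0.0,"12,740.9","8,886.9",0.0,"38,540.9","10,087.3",0.0,"14,610.8","10,828.6"
'''

df = pd.read_csv(
    io.StringIO(SAMPLE),
    sep=",",        # fields are separated by commas
    index_col=0,    # the unnamed first column holds the row index
    thousands=",",  # "12,985.0" is parsed as 12985.0; quotes protect the inner comma
)

# 'hour' stays an integer column, every other column becomes float64.
print(df.dtypes)
```

Note that the converter in the question removes '.' and turns ',' into '.', which suits values written like 12.985,0. Applied to "12,985.0" it would produce 12.985 instead of 12985.0.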
【Comments】:
-
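Can you post sample data? Whether all rows have the same format is another question. If read_csv cannot do the conversion, it is best to convert after reading the values in as strings.
-
OK. In Notepad, the data looks like this: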
-
,hour,PV,Wind onshore,Wind offshore,PV.1,Wind onshore.1,Wind offshore.1,PV.2,Wind onshore.2,Wind offshore.2
0,1,0.0,"12,985.0","9,614.0",0.0,"32,825.5","9,495.7",0.0,"13,110.3","10,855.5"
1,2,0.0,"12,908.9","9,290.8",0.0,"36,052.3","9,589.1",0.0,"13,670.2","10,828.6"
2,3,0.0,"12,740.9","8,886.9",0.0,"38,540.9","10,087.3",0.0,"14,610.8","10,828.6"
3,4,0.0,"12,485.3","8,644.5",0.0,"40,734.0","10,087.3",0.0,"15,638.3","10,343.7"
4,5,0.0,"11,188.5","8,079.0",0.0,"42,688.0","10,087.3",0.0,"16,809.4","10,343.7"
5,6,0.0,"11,219.0","7,594.2",0.0,"43,333.5","10,025.0",0.0,"18,266.9","10,343.7"
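The suggestion in the comments to read everything as strings and convert afterwards could look like the following sketch. It assumes comma-delimited input shaped like the sample, no missing values, and a hypothetical helper `to_float` that is not part of pandas. It also shows one way to apply the same conversion to every column except 'hour'.

```python
import io

import pandas as pd

# First three rows of the sample, embedded so the sketch is self-contained.
SAMPLE = '''\
,hour,PV,Wind onshore,Wind offshore,PV.1,Wind onshore.1,Wind offshore.1,PV.2,Wind onshore.2,Wind offshore.2
0,1,0.0,"12,985.0","9,614.0",0.0,"32,825.5","9,495.7",0.0,"13,110.3","10,855.5"
1,2,0.0,"12,908.9","9,290.8",0.0,"36,052.3","9,589.1",0.0,"13,670.2","10,828.6"
2,3,0.0,"12,740.9","8,886.9",0.0,"38,540.9","10,087.3",0.0,"14,610.8","10,828.6"
'''


def to_float(text: str) -> float:
    """Hypothetical helper: drop thousands commas and stray spaces, then parse."""
    return float(text.replace(",", "").strip())


# Step 1: read every field as a plain string so nothing is guessed by the parser.
raw = pd.read_csv(io.StringIO(SAMPLE), sep=",", index_col=0, dtype=str)

# Step 2: convert all columns except 'hour' with the same function.
# This assumes no empty fields; a missing value would not be a string.
values = raw.drop(columns="hour").apply(lambda col: col.map(to_float))

# Step 3: put 'hour' back in front as an integer column.
df = pd.concat([raw["hour"].astype(int), values], axis=1)

print(df.dtypes)
```

Converting after the read keeps the parsing step simple and makes it easy to inspect the raw strings if a row turns out to be malformed.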