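【Posted】:2016-04-08 16:40:01
【Question】:
Please look at the first 2 lines of the csv file below. The first line holds the field names, and the second is the first row of actual data.
I am trying to iterate over the first row and then store its values in an array in their original order.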
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country label
59 Private 307423 9th 5 Never-married Other-service Not-in-family Black Male 0 0 50 United-States 0
reader = csv.DictReader(csvfile)
train_x = []
train_y = []
dic = {}
for row in reader:
    row_x = []
    for title in row.keys():
        l = ['workclass','education','marital-status','occupation', 'relationship', 'race', 'sex', 'native-country']
        if title in l:
            value = get_dict[title][row[title]]
            row_x.append(value)
        elif title == 'label':
            train_y.append(float(row['label']))
        else:
            row_x.append(float(row[title]))
    train_x.append(row_x)
This is what I get for the first row:
[3, 5, 59.0, 0.0, 0, 50.0, 4, 35, 5.0, 0.0, 8, 307423.0, 4, 3]
As you can see, the fields are in the wrong order. (Note that United-States is 35, Private is 3...)
For convenience, here are the csv lines again:
age workclass fnlwgt education education-num marital-status occupation relationship race sex capital-gain capital-loss hours-per-week native-country label
59 Private 307423 9th 5 Never-married Other-service Not-in-family Black Male 0 0 50 United-States 0
【Discussion】:
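A likely cause of the scrambled order is that the loop walks `row.keys()`. In Python versions before 3.6, `csv.DictReader` returned each row as a plain `dict`, which did not guarantee key order. One way to keep the header order is to loop over `reader.fieldnames`, which is a list in the order the columns appear in the file. The following is only a minimal sketch under stated assumptions: the file is assumed to be comma-separated, the sample uses just a few of the columns, the function name `load_rows` is hypothetical, and the `get_dict` contents are made up for illustration (only the codes 3 for Private and 35 for United-States come from the question).

```python
import csv
import io

# Columns that are encoded through a lookup table instead of float().
CATEGORICAL = {'workclass', 'education', 'marital-status', 'occupation',
               'relationship', 'race', 'sex', 'native-country'}


def load_rows(csvfile, get_dict):
    """Return (train_x, train_y) with features in header order.

    `get_dict` is assumed to map column name -> {raw value -> code},
    as the question's code implies; its real contents are not shown there.
    """
    reader = csv.DictReader(csvfile)
    train_x, train_y = [], []
    for row in reader:
        row_x = []
        # reader.fieldnames is a list, so it preserves the header order
        # regardless of how the row mapping orders its keys.
        for title in reader.fieldnames:
            if title == 'label':
                train_y.append(float(row[title]))
            elif title in CATEGORICAL:
                row_x.append(get_dict[title][row[title]])
            else:
                row_x.append(float(row[title]))
        train_x.append(row_x)
    return train_x, train_y


# Hypothetical, shortened sample (comma-separated by assumption).
sample = io.StringIO(
    "age,workclass,fnlwgt,native-country,label\n"
    "59,Private,307423,United-States,0\n"
)
get_dict = {
    'workclass': {'Private': 3},
    'native-country': {'United-States': 35},
}

train_x, train_y = load_rows(sample, get_dict)
print(train_x)  # [[59.0, 3, 307423.0, 35]]
print(train_y)  # [0.0]
```

With this loop the features come out as age, workclass, fnlwgt, native-country, matching the header. If the real file is separated by tabs or spaces rather than commas, a matching `delimiter` argument would have to be passed to `csv.DictReader`.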