[Posted]: 2019-12-11 04:41:03
[Question]:
I am currently learning how to use Python and Jupyter Notebook, and I want to create my own dataset. The code is below (taken from this site: How to create my own datasets using in scikit-learn?):
import csv

import numpy as np
from sklearn.utils import Bunch  # sklearn.datasets.base is not importable in newer releases


def load_movies_dataset():
    with open('Documents/movies_dataset.csv') as csv_file:
        data_file = csv.reader(csv_file)
        # The first row holds the sample count and the feature count
        temp = next(data_file)
        n_samples = int(temp[0])
        n_features = int(temp[1])
        data = np.empty((n_samples, n_features))
        target = np.empty((n_samples,), dtype=int)  # np.int was removed in NumPy 1.24
        # Each remaining row: feature values first, class label last
        for i, sample in enumerate(data_file):
            data[i] = np.asarray(sample[:-1], dtype=int)
            target[i] = np.asarray(sample[-1], dtype=int)
    return Bunch(data=data, target=target)
This is the CSV file I am using:
"6","2","numKicks","numKisses"
"3","104","0"
"2","100","0"
"1","81","0"
"101","10","1"
"99","5","1"
"98","2","1"
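This example determines whether a movie is a romance (0) or an action film (1) based on the number of kicks and the number of kisses.
This is the code I use to test the dataset creation: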
md = load_movies_dataset()
X = md.data
y = md.target
X
This is the output:
array([[  3., 104.],
       [  2., 100.],
       [  1.,  81.],
       [101.,  10.],
       [ 99.,   5.],
       [ 98.,   2.]])
My question is: why do the numbers in the displayed array have periods after them?
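For context, a minimal sketch of what is most likely happening, assuming only NumPy's documented defaults: np.empty allocates a float64 array when no dtype is given, so integers assigned into it are stored as floats and displayed with a trailing period. The names data_float and data_int are illustrative and do not appear in the code above.

```python
import numpy as np

# No dtype given: np.empty falls back to float64,
# so assigned integers are stored (and printed) as floats.
data_float = np.empty((2, 2))
data_float[:] = [[3, 104], [2, 100]]
print(data_float.dtype)
print(data_float)

# Passing dtype=int keeps the values as integers,
# and the printed array has no trailing periods.
data_int = np.empty((2, 2), dtype=int)
data_int[:] = [[3, 104], [2, 100]]
print(data_int.dtype)
print(data_int)
```

If integer features are what you want, passing dtype=int when allocating data in load_movies_dataset should have the same effect; the values themselves are unchanged either way.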
[Comments]:
Tags: python scikit-learn jupyter-notebook