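【Posted】: 2019-12-10 17:18:31
【Question】:
Can someone help me understand what this function does?
I understand the print lines, but after that I'm a bit lost, starting from train_data.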
from sklearn.model_selection import StratifiedShuffleSplit

# df, datafile and save_datarows are defined elsewhere in the script (not shown).
def stratifiedShuffleSplit_data(X, y):
    sss = StratifiedShuffleSplit(n_splits=5, test_size=0.5, random_state=0)
    for train_index, test_index in sss.split(X, y):
        print("len(TRAIN):", len(train_index), "len(TEST):", len(test_index))
        print("TRAIN:", train_index, "TEST:", test_index)
        train_data = [df.loc[ind] for ind in train_index]
        test_data = [df.loc[ind] for ind in test_index]
        save_datarows(train_data, datafile+".train")
        save_datarows(test_data, datafile+".test")
【Comments】:
-
So your main doubt is about the line "train_data = [df.loc[ind] for ind in train_index]", right?
-
Yes, the last two
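A minimal sketch of what the two list-comprehension lines appear to do, on a small made-up DataFrame (the toy `df`, its column names, and the `splits` list are assumptions for illustration; they are not from the question). `sss.split(X, y)` yields pairs of NumPy arrays holding row positions, and `[df.loc[ind] for ind in train_index]` looks each of those rows up in `df` and collects them in a Python list. Because `df.loc` selects by label, this only matches the intended rows when `df` has the default 0..n-1 index; `df.iloc[train_index]` selects by position instead. `save_datarows` is not shown in the question, so how it writes the rows is unknown and it is left out here.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedShuffleSplit

# Hypothetical toy data: 8 rows, two balanced classes.
df = pd.DataFrame({"feature": np.arange(8) * 10,
                   "label": [0, 0, 0, 0, 1, 1, 1, 1]})
X = df[["feature"]]
y = df["label"]

# Same splitter settings as in the question: 5 random 50/50 splits.
sss = StratifiedShuffleSplit(n_splits=5, test_size=0.5, random_state=0)

splits = []
for train_index, test_index in sss.split(X, y):
    # train_index / test_index are arrays of row positions (0..n-1).
    # The comprehension fetches one row (a pandas Series) per entry
    # and collects them in a plain Python list.
    train_rows = [df.loc[ind] for ind in train_index]

    # Position-based alternative: returns a DataFrame directly and
    # does not depend on df having the default 0..n-1 index.
    train_df = df.iloc[train_index]
    test_df = df.iloc[test_index]
    splits.append((train_df, test_df))

# Stratification keeps the class ratio of y in every train and test part.
last_train, last_test = splits[-1]
print(last_train["label"].value_counts().to_dict())
print(last_test["label"].value_counts().to_dict())
```

With `n_splits=5` the loop body runs five times, once per random split, so `train_rows` is rebuilt on every pass and after the loop it holds only the rows of the final split.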
Tags: python scikit-learn training-data k-fold