【Question title】: What does it mean to put two variables in a for-in loop in Python?
【Posted】: 2017-06-27 02:21:31
【Question】:

I am reading Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. In one of the examples I came across this syntax in a for loop.

from sklearn.model_selection import StratifiedShuffleSplit

split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
for train_index, test_index in split.split(housing, housing["income_cat"]):
    strat_train_set = housing.loc[train_index]
    strat_test_set = housing.loc[test_index]

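I printed train_index and test_index, and both are arrays of indices. What does this for loop mean? train_index and test_index have different numbers of elements, so how does the iteration work? Is this code equivalent to the code below?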

from sklearn.model_selection import StratifiedShuffleSplit

split = StratifiedShuffleSplit(n_splits=1, test_size=0.2, random_state=42)
train_index, test_index = split.split(housing, housing["income_cat"])
strat_train_set = housing.loc[train_index]
strat_test_set = housing.loc[test_index]

【Comments】:

  • My guess is that split.split(housing, housing["income_cat"]) returns 2-tuples, and writing train_index, test_index in the for loop unpacks the two values of each tuple into the two variables.

Tags: python pandas numpy


【Answer 1】:

Here is a simple case of a for loop with 2 variables:

In [173]: for a,b in [[0,1],[10,12]]:
     ...:     print(a,b)
     ...:     
0 1
10 12

It works for the same reason that this does:

In [174]: a,b = [10,12]

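Each iteration yields some kind of tuple or list, and a,b in ... unpacks its values into the matching number of variables. The lengths of the two values themselves do not matter; only the number of values per item has to match the number of variables.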

for i, v in enumerate(['a','b','c']):
    print(i,v)

is another common use of unpacking in a loop.
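As a minimal sketch of the point above, the loop below unpacks in the header and then again in the body, and checks that both forms give the same result. The `pairs` list is made-up illustration data (not output from scikit-learn); its inner lists deliberately have different lengths, like train_index and test_index in the question.

```python
# Each element of `pairs` is one (train, test) tuple.
# The two inner lists differ in length; unpacking only cares
# that every element contains exactly two values.
pairs = [([0, 1, 2, 3], [4]), ([1, 2, 3, 4], [0])]

# Form 1: unpack directly in the loop header.
unpacked_in_header = []
for train_index, test_index in pairs:
    unpacked_in_header.append((train_index, test_index))

# Form 2: take the whole element, then unpack it in the body.
unpacked_in_body = []
for item in pairs:
    train_index, test_index = item
    unpacked_in_body.append((train_index, test_index))

print(unpacked_in_header == unpacked_in_body)  # True
```

The loop iterates over the pairs, not over the elements inside train_index or test_index, which is why their differing lengths cause no problem.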


    【Answer 2】:

    The following code is quoted from the sklearn manual:

    >>> import numpy as np
    >>> from sklearn.model_selection import StratifiedShuffleSplit
    >>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4], [1, 2], [3, 4]])
    >>> y = np.array([0, 0, 0, 1, 1, 1])
    >>> sss = StratifiedShuffleSplit(n_splits=5, test_size=0.5, random_state=0)
    >>> sss.get_n_splits(X, y)
    5
    >>> print(sss)
    StratifiedShuffleSplit(n_splits=5, random_state=0, ...)
    >>> for train_index, test_index in sss.split(X, y):
    ...     print("TRAIN:", train_index, "TEST:", test_index)
    ...     X_train, X_test = X[train_index], X[test_index]
    ...     y_train, y_test = y[train_index], y[test_index]
    TRAIN: [5 2 3] TEST: [4 1 0]
    TRAIN: [5 1 4] TEST: [0 2 3]
    TRAIN: [5 0 2] TEST: [4 3 1]
    TRAIN: [4 1 0] TEST: [2 3 5]
    TRAIN: [0 5 1] TEST: [3 4 2]
    

    The for loop runs n_splits times, once for each (train_index, test_index) pair that split() yields.
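To address the question's second snippet, here is a rough sketch of why plain assignment is not a drop-in replacement for the loop. It assumes, as in scikit-learn, that split() is a generator yielding one (train, test) pair per split. `toy_split` is a hypothetical stand-in written for this illustration so the example runs without scikit-learn; it does no stratification and is not the library's implementation.

```python
import random


def toy_split(n_samples, n_splits=1, test_size=0.2, seed=42):
    """Hypothetical stand-in for a splitter's split() method.

    Yields one (train_indices, test_indices) pair per split.
    """
    rng = random.Random(seed)
    n_test = max(1, int(n_samples * test_size))
    for _ in range(n_splits):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        yield indices[n_test:], indices[:n_test]


# Loop form from the book: with n_splits=1 the body runs exactly once.
for train_index, test_index in toy_split(10):
    loop_result = (train_index, test_index)

# Loop-free form: next() pulls the single pair out of the generator.
train_index, test_index = next(toy_split(10))
direct_result = (train_index, test_index)

# Unpacking the generator itself fails: it yields ONE item (a pair),
# but two targets ask for two items.
try:
    train_index, test_index = toy_split(10)
    unpack_error = None
except ValueError as exc:
    unpack_error = str(exc)

print(loop_result == direct_result)  # True
print(unpack_error)
```

Under the same assumption, the loop-free version of the book's code would be train_index, test_index = next(split.split(housing, housing["income_cat"])), which only makes sense when n_splits=1.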
