【Question title】: Python child class inherit and override from arbitrary parent
【Posted】: 2021-07-29 02:19:31
【Question description】:

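I'm working with sklearn estimators, which inherit from sklearn.base.BaseEstimator and have a fairly standard interface. As an example of what I want to do, I'd like to override the .fit() and .predict() methods to regress on a log-transformed target, like this: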

import numpy as np

# Placeholder: stands for whichever sklearn regression estimator class is used
Estimator = sklearn.some_regression_estimator

class LogFit(Estimator):
    """subclass the sklearn regression estimator to fit and predict using 
       log-transformed target variable
    """

    def __init__(self, **kwargs):
        super().__init__(**kwargs)  # unpack the keyword arguments

    def fit(self, X, y=None, **kwargs):
        super().fit(X, np.log(y), **kwargs)
        return self

    def predict(self, X):
        return np.exp(super().predict(X))

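I don't necessarily know in advance which estimator will be used, only that it will behave according to sklearn estimator conventions. I also don't want to rewrite the subclass above for every possible estimator, and multiple inheritance doesn't seem right, since each instance of LogFit inherits from only a single parent.

I know I could write a wrapper class (and then subclass it with overridden fit() and predict() methods), for example: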

class EstimatorWrapper():
    """Wrapper class that has an estimator as a property"""

    def __init__(self, estimator_instance):
        self.estimator = estimator_instance

    def fit(self, X, y=None, **kwargs):
        # Delegate to the wrapped estimator (calling self.fit would recurse forever)
        self.estimator.fit(X, y, **kwargs)
        return self
    ...

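But at that point I become responsible for making sure the EstimatorWrapper class behaves just like the base estimator class, so that I can use the LogFit versions of fit() and predict() without the rest of the sklearn machinery knowing the difference. And again, if I don't know which specific methods and attributes exist on each possible estimator_instance, it seems I'd have to hack EstimatorWrapper() somehow to define its attributes dynamically, when all I really want to do is tweak the behavior of the fit() and predict() functions.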
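For illustration, the "define its attributes dynamically" idea mentioned above could be sketched with `__getattr__`, which Python calls only when normal attribute lookup fails. This is a minimal sketch, not a drop-in sklearn solution: `ToyMeanRegressor` and `LogFitWrapper` are hypothetical names, and the toy estimator uses plain lists and `math` so the block runs without any third-party package.

```python
import math


class ToyMeanRegressor:
    """Hypothetical stand-in for an estimator: predicts the mean of y."""

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


class LogFitWrapper:
    """Wrap any object with fit/predict and work on log(y) internally."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y, **kwargs):
        self.estimator.fit(X, [math.log(v) for v in y], **kwargs)
        return self

    def predict(self, X):
        return [math.exp(v) for v in self.estimator.predict(X)]

    def __getattr__(self, name):
        # Reached only when normal lookup fails, so fit/predict above take priority.
        # The guard avoids infinite recursion if self.estimator is not set yet.
        if name == "estimator":
            raise AttributeError(name)
        return getattr(self.estimator, name)


model = LogFitWrapper(ToyMeanRegressor()).fit([[0], [1]], [1.0, 100.0])
print(model.predict([[2]]))  # roughly the geometric mean of the targets
print(model.mean_)           # forwarded to the wrapped estimator
```

Attribute forwarding covers methods and fitted attributes without listing them, but code that inspects the class itself (for example, rebuilding an object from `get_params()`) may still need extra work on the wrapper.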

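Am I missing a simple way to write a subclass that doesn't know its parent class until instantiation, or is that simply not allowed? I couldn't find any examples of how to do the former.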
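One possible reading of "a subclass that doesn't know its parent until run time" is a class factory: a function that takes the parent class as an argument and returns a new subclass. The sketch below is only an illustration under stated assumptions: `make_log_fit` and `ToyMeanRegressor` are hypothetical names, and the toy parent works on plain lists so the block runs with the standard library alone. With a real sklearn estimator, `np.log` and `np.exp` would replace the list comprehensions.

```python
import math


def make_log_fit(base_cls):
    """Build a subclass of base_cls that fits on log(y) and predicts exp(...)."""

    class LogFit(base_cls):
        # No __init__ is defined, so the parent's constructor is inherited as-is.

        def fit(self, X, y, **kwargs):
            super().fit(X, [math.log(v) for v in y], **kwargs)
            return self

        def predict(self, X):
            return [math.exp(v) for v in super().predict(X)]

    LogFit.__name__ = LogFit.__qualname__ = f"LogFit{base_cls.__name__}"
    return LogFit


class ToyMeanRegressor:
    """Hypothetical stand-in for an estimator class: predicts mean(y) + offset."""

    def __init__(self, offset=0.0):
        self.offset = offset

    def fit(self, X, y):
        self.mean_ = sum(y) / len(y)
        return self

    def predict(self, X):
        return [self.mean_ + self.offset for _ in X]


LogMean = make_log_fit(ToyMeanRegressor)  # parent chosen at run time
model = LogMean(offset=0.0).fit([[0], [1]], [1.0, 100.0])
print(LogMean.__name__, isinstance(model, ToyMeanRegressor))
print(model.predict([[2]]))  # roughly the geometric mean of the targets
```

Leaving out `__init__` is a deliberate choice in this sketch: as far as I understand, sklearn's `get_params()` reads parameter names from the constructor signature, so a `**kwargs`-only constructor would hide them. Classes created this way may also need care with pickling, since they are not importable by name.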

【Discussion】:

    Tags: python class inheritance scikit-learn subclass


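    【Solution 1】:

    This sidesteps the question, but here is a third option that may be worth exploring: TransformedTargetRegressor, which wraps a regressor and lets you define the forward and inverse transformation functions applied to the target variable.

    Here is an example using np.log1p and its inverse np.expm1: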

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVR
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.neural_network import MLPRegressor
    from sklearn.compose import TransformedTargetRegressor
    
    X = np.array([-0.890,-0.798,-0.610,-0.502,-0.450,-0.382,-0.342,-0.282,-0.226,-0.158,-0.0500,0.106,0.202,0.322,0.418,0.482,0.530,0.590,0.666,0.754,0.834,0.914,0.970,-0.694,-0.762,-0.782,-0.854,-0.850,-0.658,-0.666,-0.482,-0.506,-0.398,-0.478,-0.350,-0.410,-0.334,-0.378,-0.314,-0.346,-0.230,-0.250,-0.150,-0.146,-0.0500,-0.0700,0.0180,0.0300,0.110,0.178,0.214,0.290,0.318,0.406,0.422,0.482,0.478,0.486,0.546,0.566,0.674,0.658,0.702,0.726,0.838,0.882,0.950,0.934,0.882,0.786,0.786,0.718,0.682,0.570,0.502,0.426,0.234,-0.470,-0.430,-0.390,-0.354]).reshape(-1, 1)
    y = np.array([0.663,0.679,0.675,0.627,0.535,0.331,0.171,0.0111,-0.169,-0.273,-0.421,-0.497,-0.505,-0.457,-0.361,-0.237,-0.141,-0.0129,0.0751,0.127,0.143,0.147,0.143,0.691,0.723,0.671,0.707,0.611,0.727,0.595,0.683,0.567,0.571,0.375,0.391,0.255,0.255,0.0831,0.0991,-0.0689,-0.0729,-0.225,-0.217,-0.309,-0.293,-0.453,-0.417,-0.533,-0.433,-0.577,-0.393,-0.489,-0.341,-0.421,-0.313,-0.353,-0.109,-0.117,-0.169,-0.0369,-0.141,0.159,0.0991,0.139,0.155,0.0511,0.231,0.123,0.187,0.00707,0.139,-0.0249,-0.00893,-0.205,-0.121,-0.261,-0.417,0.527,0.491,0.391,0.0591]).ravel()
    plt.scatter(X, y)
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    
    
    for regressor in [DecisionTreeRegressor(max_depth=3), MLPRegressor(max_iter=500), SVR()]:
    
        regr = TransformedTargetRegressor(
            regressor=regressor,
            func=np.log1p,
            inverse_func=np.expm1,
        )
        regr.fit(X_train, y_train)
        score = round(regr.score(X_test, y_test), 3)
    
        # Some visualization
        data = np.linspace(X.min(), X.max(), num=250).reshape(-1, 1)
        reg_line = regr.predict(data)
        plt.plot(data, reg_line, label=f"R2={score}, {regr.regressor.__class__.__name__}")
    
    plt.legend()
    plt.show()
    

    Result: [plot of the data with the fitted line of each of the three regressors; image not preserved]
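To make the connection to the original LogFit idea concrete, the following sketch checks that TransformedTargetRegressor with `func=np.log` and `inverse_func=np.exp` gives the same predictions as fitting on `np.log(y)` by hand and exponentiating the output. The synthetic data and the choice of LinearRegression are assumptions made for this illustration only; they are not part of the answer above.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression

# Synthetic, strictly positive target so that np.log is defined everywhere
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(50, 1))
y = np.exp(1.0 + 2.0 * X.ravel())

# Wrapped version: the target is transformed inside fit() and predict()
wrapped = TransformedTargetRegressor(
    regressor=LinearRegression(),
    func=np.log,
    inverse_func=np.exp,
).fit(X, y)

# Manual version: the same steps written out explicitly
manual = LinearRegression().fit(X, np.log(y))
manual_pred = np.exp(manual.predict(X))

# The two sets of predictions should agree up to floating-point error
print(np.allclose(wrapped.predict(X), manual_pred))
```

Because the wrapper accepts any regressor instance, the same three lines of configuration apply regardless of which estimator class is chosen.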

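    【Discussion】:

    • Ah, that's interesting, thanks! I hadn't seen that. It does solve the example problem, and in an sklearn-native format to boot.
    • Hope it helps! It does skirt around the question, but I'm also not sure what a good solution to it would look like, especially since most regressors take different parameters in their init methods.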