【Posted】:2018-09-21 02:09:08
【Question】:
I am working on a small project and trying to apply SMOTE (Synthetic Minority Over-sampling Technique) because my data is imbalanced.
I created a custom transformer based on TransformerMixin to wrap SMOTE:
from imblearn.over_sampling import SMOTE
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

class smote(BaseEstimator, TransformerMixin):
    def fit(self, X, y=None):
        print(X.shape, ' ', type(X))  # (57, 28) <class 'numpy.ndarray'>
        print(len(y), ' ', type(y))   # 57 <class 'list'>
        smote = SMOTE(kind='regular', n_jobs=-1)
        X, y = smote.fit_sample(X, y)
        return X

    def transform(self, X):
        return X

model = Pipeline([
    ('posFeat1', featureVECTOR()),
    ('sca1', StandardScaler()),
    ('smote', smote()),
    ('classification', SGDClassifier(loss='hinge', max_iter=1, random_state=38, tol=None))
])
model.fit(train_df, train_df['label'].values.tolist())
predicted = model.predict(test_df)
I put SMOTE inside the fit method because I do not want it applied to the test data.
Unfortunately, I get this error:
    model.fit(train_df, train_df['label'].values.tolist())
  File "C:\Python35\lib\site-packages\sklearn\pipeline.py", line 248, in fit
    Xt, fit_params = self._fit(X, y, **fit_params)
  File "C:\Python35\lib\site-packages\sklearn\pipeline.py", line 213, in _fit
    **fit_params_steps[name])
  File "C:\Python35\lib\site-packages\sklearn\externals\joblib\memory.py", line 362, in __call__
    return self.func(*args, **kwargs)
  File "C:\Python35\lib\site-packages\sklearn\pipeline.py", line 581, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)
  File "C:\Python35\lib\site-packages\sklearn\base.py", line 520, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)
AttributeError: 'numpy.ndarray' object has no attribute 'transform'
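The last frame of the traceback shows `fit_transform` chaining `self.fit(X, y, **fit_params).transform(X)`. Since the custom `fit` returns the resampled array instead of the estimator, `.transform` ends up being looked up on a `numpy.ndarray`. A minimal sketch of that behaviour follows; `ReturnsArray` and `ReturnsSelf` are hypothetical toy transformers made up for illustration, and no SMOTE is involved:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin


class ReturnsArray(TransformerMixin, BaseEstimator):
    """Hypothetical transformer repeating the mistake: fit returns data."""

    def fit(self, X, y=None):
        return X  # wrong: fit_transform will call .transform on this array

    def transform(self, X):
        return X


class ReturnsSelf(TransformerMixin, BaseEstimator):
    """Same transformer, with fit following the scikit-learn convention."""

    def fit(self, X, y=None):
        return self  # fit_transform can now chain .transform(X)

    def transform(self, X):
        return X


# Small made-up input, only used to trigger both code paths.
X = np.arange(6, dtype=float).reshape(3, 2)
y = [0, 1, 0]

try:
    ReturnsArray().fit_transform(X, y)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

result = ReturnsSelf().fit_transform(X, y)
print(result.shape)
```

Returning `self` removes the `AttributeError`, but it does not make the resampling work by itself: a transformer step in a scikit-learn `Pipeline` can only hand a new `X` to the next step, not a new `y`, so oversampled rows would no longer line up with the labels.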
【Discussion】:
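The goal stated in the question, oversampling the training data while leaving the test data untouched, can be approached by doing the resampling inside the `fit` of a wrapper around the classifier, so that `predict` never sees it. The sketch below is one possible way, not the asker's code: `ResampledClassifier` and `random_oversample` are hypothetical names, and plain random duplication of minority rows stands in for SMOTE so that the example needs only NumPy and scikit-learn. The synthetic data (50 rows against 7) is made up for illustration.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin, clone
from sklearn.linear_model import SGDClassifier


def random_oversample(X, y, random_state=0):
    """Stand-in for SMOTE: duplicate minority rows until classes are balanced."""
    rng = np.random.RandomState(random_state)
    X, y = np.asarray(X), np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = [np.arange(len(y))]  # start with every original row
    for label, count in zip(classes, counts):
        if count < target:
            members = np.flatnonzero(y == label)
            keep.append(rng.choice(members, size=target - count, replace=True))
    idx = np.concatenate(keep)
    return X[idx], y[idx]


class ResampledClassifier(ClassifierMixin, BaseEstimator):
    """Hypothetical wrapper: resample inside fit, never inside predict."""

    def __init__(self, classifier, random_state=0):
        self.classifier = classifier
        self.random_state = random_state

    def fit(self, X, y):
        X_res, y_res = random_oversample(X, y, self.random_state)
        self.n_train_rows_ = len(y_res)  # kept only to inspect the effect
        self.classifier_ = clone(self.classifier).fit(X_res, y_res)
        self.classes_ = self.classifier_.classes_
        return self

    def predict(self, X):
        # Test data goes straight to the fitted classifier, unmodified.
        return self.classifier_.predict(X)


# Made-up imbalanced training set: 50 majority rows, 7 minority rows.
rng = np.random.RandomState(38)
X_train = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (7, 4))])
y_train = [0] * 50 + [1] * 7
X_test = rng.normal(0, 1, (10, 4))

model = ResampledClassifier(SGDClassifier(loss="hinge", random_state=38))
model.fit(X_train, y_train)
predicted = model.predict(X_test)
print(model.n_train_rows_, predicted.shape)
```

Such a wrapper could take the place of the final `('classification', ...)` step, with the custom `smote` step removed. imbalanced-learn also ships its own `imblearn.pipeline.Pipeline`, which accepts samplers such as `SMOTE` as steps and applies them only during `fit`; that is likely the more direct route when the library is available.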
Tags: python scikit-learn pipeline