【Posted】: 2020-02-17 03:38:35
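【Question】:
I am building a banana detector with an SVM classifier. I have 358 image samples for training, and I split them into training and test sets with test_size=0.2 and random_state=42.
I labeled each image with 0 or 1 as a postfix of its file name. However, classification_report(...) always returns: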
Accuracy: 0.7352941176470589
UndefinedMetricWarning: Precision and F-score are ill-defined and being set to 0.0 in labels with no predicted samples.
'precision', 'predicted', average, warn_for)
              precision    recall  f1-score   support

           0       0.74      1.00      0.85        50
           1       0.00      0.00      0.00        18

    accuracy                           0.74        68
   macro avg       0.37      0.50      0.42        68
weighted avg       0.54      0.74      0.62        68
Class 1 always shows 0.00 in the report summary.
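For reference, the numbers in the report above are exactly what scikit-learn produces when every one of the 68 test samples is predicted as class 0. The sketch below rebuilds that situation from the support counts (50 and 18). The all-zero `y_pred` array is an illustration and is not taken from the original run, and `zero_division=0` assumes scikit-learn 0.22 or newer.

```python
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

# Support values copied from the report: 50 samples of class 0, 18 of class 1.
y_test = np.array([0] * 50 + [1] * 18)
# Illustration only: a classifier that answers 0 for every sample.
y_pred = np.zeros_like(y_test)

accuracy = accuracy_score(y_test, y_pred)  # 50 correct out of 68
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
# zero_division=0 reports 0.0 for class 1 without emitting UndefinedMetricWarning.
report = classification_report(y_test, y_pred, zero_division=0, output_dict=True)

print("Accuracy:", accuracy)
print("Confusion matrix (rows = true, columns = predicted):")
print(cm)
print(classification_report(y_test, y_pred, zero_division=0))
```

The second column of the confusion matrix is empty, so precision for class 1 is 0/0. That is what the UndefinedMetricWarning is pointing out.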
My full source code:
import os
import zipfile
import numpy as np
from sklearn import svm
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23
import cv2
zip_ref = zipfile.ZipFile("dataset.zip", "r")
zip_ref.extractall()
zip_ref.close()
path = "bananas_dataset"
img_files = [os.path.join(root, name)
             for root, dirs, files in os.walk(path)
             for name in files if name.endswith(".jpg")]
winSize = (32, 32)
blockSize = (16, 16)
blockStride = (8, 8)
cellSize = (8, 8)
nbins = 9
derivAperture = 1
winSigma = -1.
histogramNormType = 0
L2HysThreshold = 0.2
gammaCorrection = 1
nlevels = 64
useSignedGradients = True
hog = cv2.HOGDescriptor(winSize, blockSize, blockStride,
                        cellSize, nbins, derivAperture, winSigma, histogramNormType,
                        L2HysThreshold, gammaCorrection, nlevels, useSignedGradients)
features = np.zeros((1, 324), np.float32)
labels = np.zeros(1, np.int64)
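for i in img_files:
    img = cv2.imread(i)
    resized_img = cv2.resize(img, winSize)
    # HOG descriptor of the resized image (324 values with these settings),
    # transposed so that it stacks as one row of the feature matrix
    descriptor = np.transpose(hog.compute(resized_img))
    features = np.vstack((features, descriptor))
    # The label (0 or 1) is the last character before the ".jpg" extension
    labels = np.vstack((labels, int(i[-5])))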
features = np.delete(features, (0), axis=0)
labels = np.delete(labels, (0), axis=0).ravel()
X_train, X_test, y_train, y_test = train_test_split(features,
                                                    labels,
                                                    test_size=0.2,
                                                    random_state=42)
print("X_train: {}, y_train: {}".format(X_train.shape, y_train.shape))
print("X_test: {}, y_test: {}".format(X_test.shape, y_test.shape))
clf = svm.SVC()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy: {}".format(accuracy_score(y_test, y_pred)))
print("Classification report:")
print(classification_report(y_test, y_pred))
joblib.dump(clf, "banana_hog_svm_clf.pkl")
As a result, my prediction step always returns class 0. Why is this happening?
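One way to investigate, offered as a sketch and not as a confirmed diagnosis: check how many samples each class has, keep the class ratio the same in both splits, scale the features, and let the SVM weight the minority class more heavily. Everything below that is not in the code above is an assumption: the synthetic data from `make_classification` (a stand-in for the 358 HOG vectors with 324 features, since the real dataset is not available here), `stratify=y`, `StandardScaler`, and `class_weight="balanced"`. Whether any of these changes the result on the real images has to be checked on the real data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the HOG features (assumption): 358 samples,
# 324 features, roughly 26% of them in class 1.
X, y = make_classification(n_samples=358, n_features=324, n_informative=20,
                           weights=[0.74, 0.26], random_state=42)

# Step 1: look at the class balance before training.
print("class counts:", np.bincount(y))

# Step 2: stratify so the train and test sets keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Step 3: scale the features and weight classes inversely to their frequency.
clf = make_pipeline(StandardScaler(), SVC(class_weight="balanced"))
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Step 4: check whether class 1 is predicted at all, and how often correctly.
print("predicted counts:", np.bincount(y_pred, minlength=2))
print("recall for class 1:", recall_score(y_test, y_pred, pos_label=1))
```

To try this on the real data, replace `X` and `y` with the `features` and `labels` arrays built in the code above.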
【Comments】:
-
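I don't think an SVM is recommended for this kind of task. Computer vision problems usually call for a convolutional neural network (to extract features).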
-
I followed the approach from github.com/lmzh123/ships_detection because it is similar to my task. It works with an SVM.
Tags: python machine-learning svm