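[Posted]: 2017-06-12 13:06:53
[Question]:
I'm using Keras and Python for classification, followed by object detection. I've classified cats/dogs with over 80% accuracy, and I'm OK with the results so far. My question is: how do I detect the cat or dog in an input image? I'm completely confused. I want to use my own weights, not pretrained ones from the internet.
Here is my code so far: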
import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
#########################################################################################################
#VALUES
# dimensions of our images.
img_width, img_height = 150, 150
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'
nb_train_samples = 2000 #1000 cats/dogs
nb_validation_samples = 800 #400cats/dogs
nb_epoch = 50
#########################################################################################################
#MODEL
model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(3, img_width, img_height)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
##########################################################################################################
#TEST AUGMENTATION
img = load_img('data/train/cats/cat.0.jpg') # this is a PIL image
x = img_to_array(img) # this is a Numpy array with shape (3, 150, 150)
x = x.reshape((1,) + x.shape) # this is a Numpy array with shape (1, 3, 150, 150)
# the .flow() command below generates batches of randomly transformed images
# and saves the results to the `data/TEST AUGMENTATION` directory
i = 0
for batch in train_datagen.flow(x, batch_size=1,
                                save_to_dir='data/TEST AUGMENTATION',
                                save_prefix='cat', save_format='jpeg'):
    i += 1
    if i > 20:
        break  # otherwise the generator would loop indefinitely
##########################################################################################################
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
#PREPARE TRAINING DATA
train_generator = train_datagen.flow_from_directory(
    train_data_dir,  # data/train
    target_size=(img_width, img_height),  # resize to 150x150
    batch_size=32,
    class_mode='binary')  # binary_crossentropy needs binary labels
#PREPARE VALIDATION DATA
validation_generator = test_datagen.flow_from_directory(
    validation_data_dir,  # data/validation
    target_size=(img_width, img_height),  # resize to 150x150
    batch_size=32,
    class_mode='binary')
#START model.fit
history = model.fit_generator(
    train_generator,  # train data
    samples_per_epoch=nb_train_samples,
    nb_epoch=nb_epoch,
    validation_data=validation_generator,  # validation data
    nb_val_samples=nb_validation_samples)
model.save_weights('savedweights.h5')
# list all data in history
print(history.history.keys())
#ACC VS VAL_ACC
plt.plot(history.history['acc'])
plt.plot(history.history['val_acc'])
plt.title('model accuracy ACC VS VAL_ACC')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
# summarize history for loss
#LOSS VS VAL_LOSS
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss LOSS vs VAL_LOSS')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
model.load_weights('first_try.h5')
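So now that I've classified cats and dogs, what do I need to do, and how, to feed in an image and find the cat or dog in it with a bounding box? I'm completely new to this and not even sure I'm approaching the problem the right way. Thank you.
UPDATE: Hi, sorry for posting the results so late; I couldn't work on this for a few days. I'm importing an image and reshaping it to shape (1, 3, 150, 150), because a (150, 150) shape raises this error: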
Exception: Error when checking : expected convolution2d_input_1 to have 4 dimensions, but got array with shape (150L, 150L)
Loading the image:
#load test image
img=load_img('data/prediction/cat.155.jpg')
#reshape to 1,3,150,150
img = np.arange(1* 150 * 150).reshape((1,3,150, 150))
#check shape
print(img.shape)
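A note on the snippet above: `np.arange(...)` replaces the loaded image with a run of integers instead of converting it, and `1 * 150 * 150` elements cannot fill a (1, 3, 150, 150) array. As a rough sketch of the intended conversion, the hypothetical helper `to_model_input` below turns an image array into the 4-D batch the network expects. It assumes a height x width x channels array with values 0-255, the channels-first layout implied by `input_shape=(3, img_width, img_height)`, and the same 1/255 rescaling used in training. In Keras itself, `load_img` with `target_size=(150, 150)` followed by `img_to_array` would normally supply the array; plain NumPy with a random stand-in image is used here only to keep the example self-contained.

```python
import numpy as np

def to_model_input(image_hwc):
    """Turn an (H, W, 3) image array with values 0-255 into a (1, 3, H, W) float batch."""
    x = np.asarray(image_hwc, dtype=np.float32) / 255.0  # mirror rescale=1./255 from training
    x = np.transpose(x, (2, 0, 1))  # move channels first: (3, H, W)
    return x[np.newaxis, ...]  # add the leading batch dimension

# Random stand-in for a 150x150 RGB image (assumption: not a real photo).
fake_image = np.random.randint(0, 256, size=(150, 150, 3), dtype=np.uint8)
batch = to_model_input(fake_image)
print(batch.shape)  # (1, 3, 150, 150)
```

The resulting `batch` is the kind of array one would pass to `model.predict`, assuming the model was built with the channels-first input shape shown earlier.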
Then I changed def predict_function(x) to:
def predict_function(x):
    # example of prediction function for simplicity, you
    # should probably use `return model.predict(x)`
    # random.seed(x[0][0])
    # return random.random()
    return model.predict(img)
Now when I run:
best_box = get_best_bounding_box(img, predict_function)
print('best bounding box %r' % (best_box, ))
The output I get is: best bounding box None
So I just ran:
model.predict(img)
and got the following:
model.predict(img)
Out[54]: array([[ 0.]], dtype=float32)
So it isn't checking whether it's a cat or a dog at all... any ideas?
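For reading a single sigmoid output like the one above: with `class_mode='binary'`, `flow_from_directory` assigns class indices from the sorted subdirectory names, so assuming folders named `cats` and `dogs` the mapping would be cats = 0 and dogs = 1 (the generator's `class_indices` attribute shows the actual mapping). The sketch below uses a hypothetical helper, `label_from_probability`, and an assumed 0.5 threshold. Under that mapping an output of 0. reads as "cats", although a fully saturated value can also be a sign that the input was not preprocessed like the training data.

```python
def label_from_probability(prob, class_indices, threshold=0.5):
    """Map a sigmoid output in [0, 1] to a class name.

    class_indices is a dict like {'cats': 0, 'dogs': 1}; the folder names
    here are assumptions, not taken from a real generator.
    """
    index_to_name = {index: name for name, index in class_indices.items()}
    predicted_index = 1 if prob >= threshold else 0
    return index_to_name[predicted_index]

assumed_indices = {'cats': 0, 'dogs': 1}
print(label_from_probability(0.0, assumed_indices))  # cats
print(label_from_probability(0.9, assumed_indices))  # dogs
```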
Note: when predict_function(x) instead uses:
    random.seed(x[0][0])
    return random.random()
I do get output: it checks the boxes and returns the best one.
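The `get_best_bounding_box` helper is not shown in this post. For readers following along, here is a minimal sketch of what a sliding-window search of that kind might look like; it is not the original helper. The name `sliding_window_best_box`, the window size, the stride, and the 0.5 threshold are all assumptions, and the toy scoring function stands in for the trained model. With the real network, each crop would also need resizing to 150x150 before `model.predict`. Note that the scoring function receives the crop as its argument; a search like this cannot distinguish windows if the predictor ignores that argument.

```python
import numpy as np

def sliding_window_best_box(image_chw, predict_fn, window=(64, 64), stride=32, threshold=0.5):
    """Score every window of a (C, H, W) image and return (box, score) for the best one.

    box is (top, left, bottom, right); (None, None) is returned when no
    window scores above the threshold.
    """
    _, height, width = image_chw.shape
    win_h, win_w = window
    best_box, best_score = None, None
    for top in range(0, height - win_h + 1, stride):
        for left in range(0, width - win_w + 1, stride):
            crop = image_chw[:, top:top + win_h, left:left + win_w]
            # The predictor sees a batch of one crop, like model.predict would.
            score = float(np.ravel(predict_fn(crop[np.newaxis, ...]))[0])
            if score > threshold and (best_score is None or score > best_score):
                best_box, best_score = (top, left, top + win_h, left + win_w), score
    return best_box, best_score

# Toy data: a black image with one bright square, scored by mean brightness.
toy_image = np.zeros((3, 128, 128), dtype=np.float32)
toy_image[:, 32:96, 64:128] = 1.0
toy_predict = lambda batch: batch.mean()

box, score = sliding_window_best_box(toy_image, toy_predict)
print(box, score)
```

In a search written this way, a result of `None` simply means no window scored above the threshold.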
[Comments]:
Tags: python classification keras object-detection