The accuracy problem of hand sign gestures recognition with using CNN in Python: answers

【Question title】: The accuracy problem of hand sign gestures recognition with using CNN in Python
【Posted】: 2020-06-08 17:05:59
【Question】:

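I am doing my senior project at my university and have only 2 days to fix this problem. I built a hand sign gesture recognizer using a CNN in Python, trained on 78000 images of 50x50 px. I am stuck on the last part of my model: I cannot increase the accuracy. When I train for 100 epochs, the first 15 epochs show an accuracy of 0.039, which is terrible, so I do not wait for training to finish. Maybe this is caused by the values for Conv2D or pooling, because I don't know how to choose the correct values for Conv2D, pooling, etc.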

I am new to this and could not solve the problem. I would appreciate any help.

The code I wrote is below:

from keras.models import Sequential
from keras.layers import Convolution2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.preprocessing.image import ImageDataGenerator
import tensorflow as tf
import pickle
import cv2
import os
import matplotlib.pyplot as plt
import numpy as np
from tqdm import  tqdm
from sklearn.model_selection import train_test_split
from PIL import Image
from numpy import asarray






DATADIR = "asl_alphabet_train"

CATEGORIES = ["A","B","C","D","E","F","G","H","I","J","K","L","M","N","O","P","Q","R","S","T","U","V","W","X","Y","Z"]


X_train = []
y_train = []
X_test=[]
y_test=[]

IMG_SIZE=50
def create_training_data():
    for category in CATEGORIES:

        path = os.path.join(DATADIR,category)  # folder holding this letter's images
        class_num = CATEGORIES.index(category)  # class index of the letter (0-25)

        for img in tqdm(os.listdir(path)):  # iterate over every image of this letter
            try:
                img_array = cv2.imread(os.path.join(path,img))  # read the image as an array
                #new_array = cv2.resize(img_array, (28, 50 ))  # resize to normalize data size
                X_train.append(img_array)  # add the image to X_train
                y_train.append(class_num)  # add the matching label to y_train

            except Exception as e:  # in the interest in keeping the output clean...
                pass


create_training_data()
X_train = asarray(X_train)
y_train = asarray(y_train)




"""
nsamples, nx, ny = X_train.shape
X_train = X_train.reshape((nsamples,nx*ny))
"""




X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, test_size=0.2,random_state=0)


N = y_train.size
M = y_train.max()+1

resultArray = np.zeros((N,M),int)
idx =  (np.arange(N)*M) + y_train
resultArray.ravel()[idx] = 1
y_train=resultArray




classifier=Sequential()
#convolution step
classifier.add(Convolution2D(filters=96, input_shape=(50,50,3), kernel_size=(11,11), padding='valid',activation="relu"))
#pooling step
classifier.add(MaxPooling2D(pool_size=(2,2)))

#convolution step
classifier.add(Convolution2D(filters=256,kernel_size=(11,11),padding="valid",activation="relu"))
#pooling step
classifier.add(MaxPooling2D(pool_size=(2,2)))

classifier.add(Convolution2D(filters=384,kernel_size=(3,3),padding="valid",activation="relu"))
classifier.add(MaxPooling2D(pool_size=(2,2)))


#flatten step
classifier.add(Flatten())
#Dense(Fully connected step)
classifier.add(Dense(units=128,activation="relu"))
#Dropout to decrease the possibility of overfitting
classifier.add(Dropout(0.5))
#Dense to determine the output 
classifier.add(Dense(units=26,activation="softmax"))

#compile step
classifier.compile(optimizer="adam",loss="categorical_crossentropy",metrics=["accuracy"])


classifier.fit(X_train,y_train,epochs=100,batch_size=32)
filename="CNN_TEST.sav"
pickle.dump(classifier, open(filename, 'wb'))


y_pred=classifier.predict(X_test)
print(y_pred)

【Comments】:

This seems rather broad. Are you asking for help with the programming itself, or is this more of a theoretical question?

Tags: python tensorflow keras conv-neural-network

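【Solution 1】:

I recommend the following:

1) Reduce the kernel size of the first two convolutional layers of the model (currently 11x11 on 50x50 inputs).

2) I don't believe a MaxPooling layer is necessary after every convolutional layer. Do verify this.

3) A Dropout of 0.5 may discard a large share of essential neurons; you may want to lower it.

4) Vary the number of epochs and see how your model performs each time.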
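
Suggestions 1 and 2 can be checked with simple arithmetic before any training. The sketch below is not part of the original answer. It traces the feature-map width through the layers from the question, using the usual size rules for unpadded ("valid") convolutions with stride 1 and for pooling whose stride equals the pool size. The `smaller` stack is a hypothetical alternative chosen only for illustration, and the helper names are made up.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Output width of max pooling whose stride equals the pool size."""
    return size // pool


def trace(size, layers):
    """Return the feature-map width after each (kind, value) layer."""
    sizes = [size]
    for kind, value in layers:
        size = conv_out(size, value) if kind == "conv" else pool_out(size, value)
        if size < 1:
            raise ValueError(f"{kind} layer with size {value} leaves no pixels")
        sizes.append(size)
    return sizes


# Layer stack from the question: 11x11, 11x11 and 3x3 kernels, each followed by 2x2 pooling.
original = [("conv", 11), ("pool", 2), ("conv", 11), ("pool", 2), ("conv", 3), ("pool", 2)]

# Hypothetical alternative (an assumption, not the answerer's model):
# 3x3 kernels everywhere and no pooling after the first convolution.
smaller = [("conv", 3), ("conv", 3), ("pool", 2), ("conv", 3), ("pool", 2)]

print(trace(50, original))  # [50, 40, 20, 10, 5, 3, 1]
print(trace(50, smaller))   # [50, 48, 46, 23, 21, 10]
```

With the question's layers the 50x50 input has shrunk to 1x1 by the time it reaches Flatten, so the dense layers see only 384 values per image. The hypothetical 3x3 stack still has a 10x10 map at that point.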

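On each attempt, plot "training accuracy vs. validation accuracy" and "training loss vs. validation loss", and check whether your model is overfitting or underfitting.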
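
One possible way to draw and read these curves is sketched below; it is not from the original answer. It assumes a dictionary shaped like the `history` attribute of the object returned by Keras `fit`, with the keys `accuracy`, `val_accuracy`, `loss` and `val_loss` (older Keras versions name the first two `acc` and `val_acc`). The function names and the thresholds in `summarize_history` are arbitrary assumptions for illustration, not established rules.

```python
def summarize_history(history, num_classes=26, gap=0.10):
    """Give a rough reading of the final-epoch accuracies.

    The thresholds are illustrative assumptions: below twice the
    random-guess accuracy counts as "not learning", and a training
    accuracy more than `gap` above validation counts as "overfitting".
    """
    train_acc = history["accuracy"][-1]
    val_acc = history["val_accuracy"][-1]
    chance = 1.0 / num_classes  # about 0.038 for the 26 letters
    if train_acc < 2 * chance:
        return "not learning"
    if train_acc - val_acc > gap:
        return "overfitting"
    return "no large gap"


def plot_history(history):
    """Plot training vs. validation accuracy and loss side by side."""
    # Imported here so summarize_history works without matplotlib installed.
    import matplotlib.pyplot as plt

    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
    for ax, key in ((ax_acc, "accuracy"), (ax_loss, "loss")):
        ax.plot(history[key], label="training")
        ax.plot(history["val_" + key], label="validation")
        ax.set_xlabel("epoch")
        ax.set_ylabel(key)
        ax.legend()
    fig.tight_layout()
    return fig
```

To get validation curves from the question's code, `fit` would need validation data, for example `history = classifier.fit(X_train, y_train, epochs=100, batch_size=32, validation_split=0.2).history`, followed by `plot_history(history)`. Note that the 0.039 accuracy reported in the question is about 1/26, the accuracy of random guessing over 26 classes, which `summarize_history` would label "not learning".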
