Create an empty array with 0 rows and 4096 columns, where the number comes from the extracted features

【Question title】: Create an empty array with 0 rows and 4096 columns, where the number comes from the extracted features
【Posted】: 2020-06-05 19:16:59
【Question description】:

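I am still a beginner in ML. Here I read images from a local folder of breast cancer images and then compute features for each image: the image pixels are converted to a NumPy array and prepared for the VGG model.

Here I create an empty array with 0 rows and 4096 columns; that number comes from the feature extraction.

I need it to have 2048 columns instead of 4096: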

list_features1 = np.empty((0,4096), float)

When I change the number to 2048, it raises:

ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2048 and the array at index 1 has size 4096
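
The error can be reproduced without any model, because `np.append(..., axis=0)` requires both arrays to have the same number of columns. The following is a minimal sketch, assuming a zero-filled stand-in (`fake_features`, a hypothetical name) for the `(1, 4096)` array that the model returns:

```python
import numpy as np

# Stand-in for one model.predict() result: 1 row, 4096 feature columns.
fake_features = np.zeros((1, 4096))

# Accumulator with the wrong width (2048 instead of 4096).
accumulator = np.empty((0, 2048), float)

try:
    np.append(accumulator, fake_features, axis=0)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc  # column counts differ: 2048 vs 4096

# With matching widths the append succeeds and adds one row.
accumulator_ok = np.empty((0, 4096), float)
accumulator_ok = np.append(accumulator_ok, fake_features, axis=0)
print(accumulator_ok.shape)  # (1, 4096)
```

So changing only the width of the empty array cannot work; the rows being appended have to be 2048 wide as well.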

Code:

# NOTE: the original post does not show its imports; these are assumed
import os

import numpy as np
import pandas as pd
from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import img_to_array, load_img


def read_images(folder_path, classlbl):
    """Load every image in folder_path and pair each one with classlbl."""
    images = []
    class1 = []
    # input size expected by VGG16
    img_width, img_height = 224, 224
    for name in os.listdir(folder_path):
        path = os.path.join(folder_path, name)
        # target_size is given as (height, width)
        img = load_img(path, target_size=(img_height, img_width))
        images.append(img)
        class1.append(classlbl)
    return images, class1


def computefeatures(model, image):
    """Return the model's features for one image, shape (1, n_features)."""
    # convert the image pixels to a numpy array
    image = img_to_array(image)
    # add a leading batch dimension for the model
    image = np.expand_dims(image, axis=0)
    # prepare the image for the VGG model
    image = preprocess_input(image)
    # get extracted features
    return model.predict(image)
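# load model
model = VGG16()
# drop the output layer by taking the output of the second-to-last layer
# (fc2, 4096 features); model.layers.pop() does not change the model
# in recent Keras versions, so it is not used here
model = Model(inputs=model.inputs, outputs=model.layers[-2].output)

# read the images
folder_path = 'C:path'

classlbl = 5

images, class1 = read_images(folder_path, classlbl)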
# call the function to compute the features for each image
list_features1 = []  # immediately replaced by the array below
# create an empty array with 0 rows and 4096 columns; 4096 is the size of
# the feature vector extracted from VGG16
list_features1 = np.empty((0, 4096), float)
for img in images:
    f2 = computefeatures(model, img)
    list_features1 = np.append(list_features1, f2, axis=0)

# one class label per feature row; read_images already built this list
classes1 = class1
print(len(classes1))

# column names 1..n follow the width of the feature array
n_features = list_features1.shape[1]
df1 = pd.DataFrame(list_features1, columns=list(range(1, n_features + 1)))
df1['class'] = classes1
dfs = {'40X': df1}
# one sheet per entry in dfs; each row of list_features1 is one image
with pd.ExcelWriter('multi-classLCResNet.xlsx') as writer:
    for sheet_name, df in dfs.items():
        df.to_excel(writer, sheet_name=sheet_name, index=False)

Current output:

1       2       3       4       4096
0.12    0.23    0.345   0.5372  0.21111
0.2313  0.321   0.214   0.3542  0.46756
.
.

Desired output:

1       2       3       4       2048
0.12    0.23    0.345   0.5372  0.21111
0.2313  0.321   0.214   0.3542  0.46756
.
.

【Question comments】:

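Don't use np.append. Stick with the list append that you use elsewhere. Repeated np.append is less efficient.
@hpaulj, I will, but can you tell me how to solve this problem?
What do you mean by "empty"? The function np.empty creates a tensor with uninitialized elements, not a tensor with no elements.
What problem are you talking about? I can't see where 4096 comes from (or half of it). np.empty((0,2048), float). If you stick with the list_features1=[] that you defined first, you don't need to create this (0,n) array.
@tstanisl If I set it to np.empty((0,2048), float), it returns the error: ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2048 and the array at index 1 has size 4096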
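
On the np.empty point raised in the comments: whether the result has elements depends only on the shape that is passed in. A small sketch to illustrate (the variable names are chosen here and do not come from the post):

```python
import numpy as np

# Shape (0, 2048): zero rows, so the array holds no elements at all.
no_rows = np.empty((0, 2048), float)
print(no_rows.shape, no_rows.size)  # (0, 2048) 0

# Shape (2, 3): six elements exist, but their values are arbitrary
# until they are assigned.
uninitialized = np.empty((2, 3), float)
print(uninitialized.shape, uninitialized.size)  # (2, 3) 6
```

With a zero-length first axis, as in the question, the array works as a starting point for appending rows along axis 0.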

Tags: python-3.x numpy scipy

【Solution 1】:

You have to somehow change the size of f2 from 4096 to 2048. Depending on your application, you could skip every second element:

Change:

list_features1 = np.empty((0,4096), float)
for img in range(len(images)):
    f2=computefeatures(model,images[img])
    list_features1 = np.append(list_features1, f2, axis=0)

to

list_features1 = np.empty((0,2048), float)
for img in range(len(images)):
    f2=computefeatures(model,images[img])
    f2 = f2[...,::2] # skip every 2-nd element along the last dimension
    list_features1 = np.append(list_features1, f2, axis=0)
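
To see what the slice does, here is a small sketch that assumes a stand-in `f2` filled with the values 0 to 4095 instead of real model features:

```python
import numpy as np

# Stand-in for one feature vector of shape (1, 4096).
f2 = np.arange(4096, dtype=float).reshape(1, 4096)

halved = f2[..., ::2]  # keep columns 0, 2, 4, ... of the last axis
print(halved.shape)    # (1, 2048)
print(halved[0, :3])   # [0. 2. 4.]
```

Note that this simply discards half of the features; whether that is acceptable depends on the application.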

Using np.append is a very suboptimal solution, but that is a separate issue.
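
As a sketch of the alternative suggested in the comments, the rows can be collected in a plain Python list and joined once at the end. `fake_compute_features` is a hypothetical stand-in for `computefeatures(model, image)`, so the example runs without a model:

```python
import numpy as np


def fake_compute_features(index):
    """Hypothetical stand-in for computefeatures(model, image)."""
    return np.full((1, 4096), float(index))


feature_rows = []  # plain Python list, cheap to append to
for index in range(5):
    f2 = fake_compute_features(index)
    feature_rows.append(f2[..., ::2])  # optional: halve to 2048 columns

# One concatenation at the end instead of one array copy per iteration.
list_features1 = np.concatenate(feature_rows, axis=0)
print(list_features1.shape)  # (5, 2048)
```

This avoids both the `(0, n)` starting array and the repeated copying that `np.append` performs on every call.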

【Discussion】: