【Posted】: 2020-06-05 19:16:59
【Question】:
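I'm still a beginner in ML. Here I read images from a local folder containing breast cancer images, then compute features for each image: I convert the image pixels to a numpy array and prepare the image for the VGG model.
I create an empty array with 0 rows and 4096 columns; 4096 is the number of features that come out of the VGG16 feature extraction.
I need it to be 2048 instead of 4096: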
list_features1 = np.empty((0,4096), float)
When I change the number to 2048, it raises:
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2048 and the array at index 1 has size 4096
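The error can be reproduced without Keras. The following is a minimal sketch, assuming only NumPy; the zero-filled `row` is a stand-in for what `model.predict` returns for one image (shape `(1, 4096)`), not real features.

```python
import numpy as np

# Stand-in for the (1, 4096) array that model.predict returns for one image.
row = np.zeros((1, 4096))

# Accumulator whose width matches the feature width: appending works.
ok = np.append(np.empty((0, 4096), float), row, axis=0)
print(ok.shape)  # (1, 4096)

# Accumulator declared with 2048 columns: the widths differ, so NumPy refuses.
try:
    np.append(np.empty((0, 2048), float), row, axis=0)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```

Changing the number passed to `np.empty` only changes the container; the width of each appended row is still set by the layer the model outputs.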
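Code:
import os

import numpy as np
import pandas as pd
# The Keras import paths below are assumed; the original post did not show its imports.
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.models import Model
from keras.preprocessing.image import img_to_array, load_img


def read_images(folder_path, classlbl):
    # load all images into a list
    images = []
    # dimensions of images
    img_width, img_height = 224, 224
    class1 = []
    for img in os.listdir(folder_path):
        img = os.path.join(folder_path, img)
        img = load_img(img, target_size=(img_width, img_height))
        class1.append(classlbl)  # class one.
        # img = image.img_to_array(img)
        # img = np.expand_dims(img, axis=0)
        images.append(img)
    return images, class1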
# compute features for each image.
def computefeatures(model, image):
    # convert the image pixels to a numpy array
    image = img_to_array(image)
    # reshape data for the model
    image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
    # prepare the image for the VGG model
    image = preprocess_input(image)
    # get extracted features
    features = model.predict(image)
    return features
# load model
model = VGG16()
# remove the output layer
model.layers.pop()
model = Model(inputs=model.inputs, outputs=model.layers[-1].output)
# read the images and their class labels
folder_path = 'C:path'
classlbl = 5
images, class1 = read_images(folder_path, classlbl)
# call the function to compute the features for each image.
list_features1 = []
# create an empty array with 0 rows and 4096 columns; this number comes from
# the feature extraction from VGG16
list_features1 = np.empty((0, 4096), float)
for img in range(len(images)):
    f2 = computefeatures(model, images[img])
    # list_features1 = np.append(list_features1, f2, axis=1)
    list_features1 = np.append(list_features1, f2, axis=0)
classes1 = []
count = 0
for i in range(156):
    if count >= 0 and count <= 156:
        classes1.append(5)
    count = count + 1
print(len(classes1))
df1 = pd.DataFrame(list_features1, columns=list(range(1, 4097)))
df1['class'] = classes1
dfs = {'40X': df1}
# list_features1: one row per image
writer = pd.ExcelWriter('multi-classLCResNet.xlsx')
for sheet_name in dfs.keys():
    dfs[sheet_name].to_excel(writer, sheet_name=sheet_name, index=False)
# close() writes the file; it also works on pandas versions without save()
writer.close()
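One way to avoid hard-coding 4096 (and 156) in several places is to collect the per-image arrays in a plain list, stack them once, and read the sizes off the result. This is a minimal sketch, assuming NumPy only; `fake_features` is a hypothetical stand-in for `computefeatures(model, image)`, and the 3 images are made up.

```python
import numpy as np


def fake_features(width=4096):
    # Hypothetical stand-in for computefeatures(model, image): one row of features.
    return np.zeros((1, width))


rows = []
for _ in range(3):  # pretend there are 3 images
    rows.append(fake_features())

features = np.vstack(rows)  # shape (n_images, n_features)
n_images, n_features = features.shape
columns = list(range(1, n_features + 1))  # 1-based column labels, as in the post
classes = [5] * n_images  # one label per row instead of a fixed count
print(features.shape, columns[0], columns[-1], len(classes))
```

With this layout the column labels and the class list follow whatever width and row count the extracted features actually have, so they cannot drift out of sync with the model.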
Current output:
1       2      3      4       ...  4096
0.12    0.23   0.345  0.5372  ...  0.21111
0.2313  0.321  0.214  0.3542  ...  0.46756
.
.
Desired output:
1       2      3      4       ...  2048
0.12    0.23   0.345  0.5372  ...  0.21111
0.2313  0.321  0.214  0.3542  ...  0.46756
.
.
【Comments】:
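-
Don't use np.append. Stick with the list append that you use elsewhere. Repeated np.append is less efficient.
-
@hpaulj, I will, but can you tell me how to solve this problem?
-
What do you mean by "empty"? The function np.empty creates a tensor with uninitialized elements, not a tensor with no elements.
-
What problem are you talking about? I can't see where the 4096 comes from (nor half of it). np.empty((0,2048), float). If you stick with the list_features1=[] that you defined first, there is no need to create this (0,n) array.
-
@tstanisl If I set it to np.empty((0,2048), float), it returns the error: ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 2048 and the array at index 1 has size 4096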
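To illustrate the `np.empty` point raised in the comments, here is a small sketch assuming NumPy only: a shape of `(0, n)` really does hold no elements, whereas any non-zero shape holds arbitrary, uninitialized values.

```python
import numpy as np

no_rows = np.empty((0, 2048), float)
print(no_rows.shape, no_rows.size)  # (0, 2048) 0

# With a non-zero shape the contents are whatever was in memory; do not rely on them.
scratch = np.empty((2, 3), float)
print(scratch.shape, scratch.size)  # (2, 3) 6

# Appending one row along axis 0 requires the same number of columns.
grown = np.append(no_rows, np.ones((1, 2048)), axis=0)
print(grown.shape)  # (1, 2048)
```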
Tags: python-3.x numpy scipy