【发布时间】:2016-09-19 19:45:23
【问题描述】:
我正在尝试运行链接中给出的代码
https://github.com/bd622/DiscretHashing
离散散列是一种用于近似最近邻搜索的降维方法。我想在http://yann.lecun.com/exdb/mnist/ 中提供的MNIST 数据库上加载实现。我已经从压缩的 gz 格式中提取了这些文件。
问题 1:
使用Reading MNIST Image Database binary file in MATLAB提供的方法读取MNIST数据库
我收到以下错误:
Error using fread
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in Reading (line 7)
A = fread(fid, 1, 'uint32');
代码如下:
clear all;
close all;
%//Open file
fid = fopen('t10k-images-idx3-ubyte', 'r');
A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));
%//For each image, store into an individual cell
imageCellArray = cell(1, totalImages);
for k = 1 : totalImages
%//Read in numRows*numCols pixels at a time
A = fread(fid, numRows*numCols, 'uint8');
%//Reshape so that it becomes a matrix
%//We are actually reading this in column major format
%//so we need to transpose this at the end
imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end
%//Close the file
fclose(fid);
更新:问题1解决了,修改后的代码是
clear all;
close all;
%//Open file
fid = fopen('t10k-images.idx3-ubyte', 'r');
A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));
%//Read in total number of images
%//A = fread(fid, 4, 'uint8');
%//totalImages = sum(bitshift(A', [24 16 8 0]));
%//OR
A = fread(fid, 1, 'uint32');
totalImages = swapbytes(uint32(A));
%//Read in number of rows
%//A = fread(fid, 4, 'uint8');
%//numRows = sum(bitshift(A', [24 16 8 0]));
%//OR
A = fread(fid, 1, 'uint32');
numRows = swapbytes(uint32(A));
%//Read in number of columns
%//A = fread(fid, 4, 'uint8');
%//numCols = sum(bitshift(A', [24 16 8 0]));
%// OR
A = fread(fid, 1, 'uint32');
numCols = swapbytes(uint32(A));
for k = 1 : totalImages
%//Read in numRows*numCols pixels at a time
A = fread(fid, numRows*numCols, 'uint8');
%//Reshape so that it becomes a matrix
%//We are actually reading this in column major format
%//so we need to transpose this at the end
imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end
%//Close the file
fclose(fid);
问题 2:
我不明白如何在代码中应用 MNIST 的 4 个文件。代码包含变量
traindata = double(traindata);
testdata = double(testdata);
如何准备 MNIST 数据库以便我可以应用到实施中?
更新:我实施了解决方案,但我不断收到此错误
Error using fread
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in mnist_parse (line 11)
A = fread(fid1, 1, 'uint32');
这些是文件
demo.m % 这是调用函数读取MNIST数据的主文件
clear all
clc
[Trainimages, Trainlabels] = mnist_parse('C:\Users\Desktop\MNIST\train-images-idx3-ubyte', 'C:\Users\Desktop\MNIST\train-labels-idx1-ubyte');
[Testimages, Testlabels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
k=5;
digit = images(:,:,k);
lbl = label(k);
function [images, labels] = mnist_parse(path_to_digits, path_to_labels)
% Open files
fid1 = fopen(path_to_digits, 'r');
% The labels file
fid2 = fopen(path_to_labels, 'r');
% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);
A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);
% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);
% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));
% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));
fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);
% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
% Read in numRows*numCols pixels at a time
A = fread(fid1, numRows*numCols, 'uint8');
% Reshape so that it becomes a matrix
% We are actually reading this in column major format
% so we need to transpose this at the end
images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end
% Read in the labels
labels = fread(fid2, totalImages, 'uint8');
% Close the files
fclose(fid1);
fclose(fid2);
end
【问题讨论】:
-
这个错误是不言自明的,你在一个无效的文件名上使用了
fopen。确保't10k-images-idx3-ubyte'是文件的完整 名称,并且它位于您当前的 MATLAB 路径中。否则请确保它是您要打开的文件的完整绝对路径。 -
@excaza:第一个问题和文件读取操作的错误解决了。文件名确实有问题。但是现在我不知道如何使用数据库,我无法理解如何使用这4个文件。我相信 traindata 变量将包含文件 train-images.idx3-ubyte 。那么哪一个是测试数据,然后我应该如何使用 2 个标签数据库文件?请帮忙
-
@rayryeng:您能否告诉我为什么在执行您的答案时由于文件读取操作而出现错误?我已经在问题中提出了新的更新。感谢您的时间和精力。
标签: image matlab image-processing mnist