How do I load in the MNIST digits and label data in MATLAB?

【Question title】: How do I load in the MNIST digits and label data in MATLAB?
【Posted】: 2016-09-19 19:45:23
【Question】:

I am trying to run the code given at this link:

https://github.com/bd622/DiscretHashing

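Discrete hashing is a dimensionality-reduction method for approximate nearest neighbor search. I want to run the implementation on the MNIST database available at http://yann.lecun.com/exdb/mnist/. I have already extracted the files from the compressed gz format.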

Problem 1:

When I read the MNIST database using the method given in "Reading MNIST Image Database binary file in MATLAB",

I get the following error:

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in Reading (line 7)
A = fread(fid, 1, 'uint32');

The code is as follows:

clear all;
close all;

%//Open file
fid = fopen('t10k-images-idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//For each image, store into an individual cell
imageCellArray = cell(1, totalImages);
for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);

Update: Problem 1 is solved. The revised code is:

clear all;
close all;

%//Open file
fid = fopen('t10k-images.idx3-ubyte', 'r');

A = fread(fid, 1, 'uint32');
magicNumber = swapbytes(uint32(A));

%//Read in total number of images
%//A = fread(fid, 4, 'uint8');
%//totalImages = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
totalImages = swapbytes(uint32(A));

%//Read in number of rows
%//A = fread(fid, 4, 'uint8');
%//numRows = sum(bitshift(A', [24 16 8 0]));

%//OR
A = fread(fid, 1, 'uint32');
numRows = swapbytes(uint32(A));

%//Read in number of columns
%//A = fread(fid, 4, 'uint8');
%//numCols = sum(bitshift(A', [24 16 8 0]));

%// OR
A = fread(fid, 1, 'uint32');
numCols = swapbytes(uint32(A));

for k = 1 : totalImages
    %//Read in numRows*numCols pixels at a time
    A = fread(fid, numRows*numCols, 'uint8');
    %//Reshape so that it becomes a matrix
    %//We are actually reading this in column major format
    %//so we need to transpose this at the end
    imageCellArray{k} = reshape(uint8(A), numCols, numRows)';
end

%//Close the file
fclose(fid);

Problem 2:

I don't understand how to use the 4 MNIST files in the code. The code contains the variables

traindata = double(traindata);
testdata = double(testdata);

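How do I prepare the MNIST database so that I can apply it to the implementation?

Update: I implemented the solution, but I keep getting this error: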

Error using fread
Invalid file identifier.  Use fopen to generate a valid file identifier.

Error in mnist_parse (line 11)
A = fread(fid1, 1, 'uint32');

These are the files:

demo.m % This is the main file that calls the function to read the MNIST data

clear all
clc
[Trainimages, Trainlabels] = mnist_parse('C:\Users\Desktop\MNIST\train-images-idx3-ubyte', 'C:\Users\Desktop\MNIST\train-labels-idx1-ubyte');

[Testimages, Testlabels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');

k=5;
digit = images(:,:,k);
lbl = label(k);

function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % We are actually reading this in column major format
    % so we need to transpose this at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end

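【Comments】:

The error is self-explanatory: you used fopen on an invalid filename. Make sure 't10k-images-idx3-ubyte' is the full name of the file and that it is on your current MATLAB path. Otherwise, make sure it is the full absolute path to the file you want to open.
@excaza: The first problem, the error in the file read operation, is solved. There was indeed a problem with the filename. But now I don't know how to use the database; I can't understand how to use the 4 files. I believe the traindata variable would contain the file train-images.idx3-ubyte. Then which one is the test data, and how should I use the 2 label files? Please help.
@rayryeng: Could you tell me why I get an error from the file read operation when running your answer? I have added a new update to the question. Thank you for your time and effort.

Tags: image matlab image-processing mnist

【Solution 1】: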

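I am the original author of Method #1 that you mention. The process of reading in the training data and test labels is quite simple. As far as reading the images goes, the code you showed above reads the file correctly and produces a cell array. However, you are missing the steps that read the number of images, rows and columns stored in the file. Note that the MNIST format for the image file is laid out as follows. The left column is the byte offset relative to the beginning of the file: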

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
0016     unsigned byte   ??               pixel
0017     unsigned byte   ??               pixel
........
xxxx     unsigned byte   ??               pixel

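The first four bytes are a magic number, 2051, which lets you check that you are reading the file properly. The next four bytes give the total number of images, the next four bytes the number of rows, and the last four bytes the number of columns. There should be 60000 images of size 28 rows x 28 columns. After this header, the pixels are stored in row-major order, so you have to loop over successive runs of 28 x 28 pixels and store them. In this case, I stored them in a cell array, where each element is one digit. The same format applies to the test data, but with 10000 images.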
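To make the row-major point concrete, here is a minimal sketch on a made-up 2 x 3 "image" (the sizes and pixel values are assumptions chosen for illustration, not MNIST data). Because MATLAB's reshape fills columns first, the stream has to be reshaped to numCols x numRows and then transposed:

```matlab
% Made-up 2 x 3 image stored row-major: row 1 first, then row 2.
numRows = 2;
numCols = 3;
stream = uint8([1 2 3 4 5 6]);

% reshape fills column by column, so using the final shape directly
% scrambles the pixels: this gives [1 3 5; 2 4 6].
wrong = reshape(stream, numRows, numCols);

% Reshaping to numCols x numRows and transposing restores the rows:
% this gives [1 2 3; 4 5 6].
right = reshape(stream, numCols, numRows).';

disp(right);
```

This is the same reshape-then-transpose step that appears inside the reading loop, just on a size small enough to check by eye.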

As for the actual labels, the format is roughly the same, with slight differences:

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000801(2049) magic number (MSB first)
0004     32 bit integer  60000            number of items
0008     unsigned byte   ??               label
0009     unsigned byte   ??               label
........
xxxx     unsigned byte   ??               label

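The first four bytes are a magic number, 2049. The second set of four bytes tells you how many labels there are, and after that there is exactly 1 byte for each corresponding digit in the dataset. The test data has the same format, but with 10000 labels. Therefore, once you have read the header of the label file, a single fread call that reads unsigned 8-bit integers is enough to read all of the labels.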
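As a rough sketch of the label layout, the snippet below writes a made-up three-label file with that header and reads it back. The file name, the label values and the variable names are assumptions for illustration only; the reads are pinned to little-endian so the swapbytes step behaves the same on any machine:

```matlab
% Write a made-up labels file: magic number 2049 (0x00000801),
% item count 3, then one byte per label. All bytes are written raw.
tmpFile = [tempname '.bin'];
fid = fopen(tmpFile, 'w');
fwrite(fid, [0 0 8 1, 0 0 0 3, 5 0 4], 'uint8');
fclose(fid);

% Read the two 32-bit header fields, swapping bytes after each read.
fid = fopen(tmpFile, 'r', 'ieee-le');
magic = swapbytes(uint32(fread(fid, 1, 'uint32')));
count = swapbytes(uint32(fread(fid, 1, 'uint32')));

% One fread call returns every label as an unsigned 8-bit integer.
demoLabels = fread(fid, double(count), 'uint8=>uint8');
fclose(fid);
delete(tmpFile);

fprintf('magic = %d, count = %d\n', magic, count);
```

The 'uint8=>uint8' precision keeps the labels as uint8 instead of the default double; drop the "=>uint8" part if double labels are preferred.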

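The reason you have to use swapbytes is that the 32-bit header fields are stored MSB first (big-endian), whereas MATLAB on a little-endian machine reads multi-byte values with the least significant byte first. Calling swapbytes on each value after reading it puts the bytes back in the intended order.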
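An alternative worth knowing, sketched here under the assumption that only the byte order is at issue: fopen accepts a machine-format argument, and 'ieee-be' makes fread interpret multi-byte values as big-endian, so no swapbytes call is needed. The temporary file below is a made-up four-byte stand-in for the start of an image file:

```matlab
% Made-up file holding only the magic number 2051, stored MSB first.
tmpFile = [tempname '.bin'];
fid = fopen(tmpFile, 'w');
fwrite(fid, [0 0 8 3], 'uint8');   % 0x00000803 = 2051
fclose(fid);

% Option 1: little-endian read followed by a byte swap.
fid = fopen(tmpFile, 'r', 'ieee-le');
magicSwapped = swapbytes(uint32(fread(fid, 1, 'uint32')));
fclose(fid);

% Option 2: declare the file big-endian when opening it.
fid = fopen(tmpFile, 'r', 'ieee-be');
magicDirect = fread(fid, 1, 'uint32');
fclose(fid);

delete(tmpFile);
fprintf('%d %d\n', magicSwapped, magicDirect);
```

Both options give the same value; the second one also keeps the code correct if it is ever run on a big-endian machine.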

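I have therefore modified this code for you so that it is an actual function that takes two strings: the full path to the digit image file and the full path to the labels file. I have also changed the code so that the images are stored in a 3D numeric matrix rather than a cell array, for faster processing. Note that once you start reading the actual image data, each pixel is an unsigned 8-bit integer, so no byte swapping is needed. Swapping is only required when a single value read by fread spans multiple bytes: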

function [images, labels] = mnist_parse(path_to_digits, path_to_labels)

% Open files
fid1 = fopen(path_to_digits, 'r');

% The labels file
fid2 = fopen(path_to_labels, 'r');

% Read in magic numbers for both files
A = fread(fid1, 1, 'uint32');
magicNumber1 = swapbytes(uint32(A)); % Should be 2051
fprintf('Magic Number - Images: %d\n', magicNumber1);

A = fread(fid2, 1, 'uint32');
magicNumber2 = swapbytes(uint32(A)); % Should be 2049
fprintf('Magic Number - Labels: %d\n', magicNumber2);

% Read in total number of images
% Ensure that this number matches with the labels file
A = fread(fid1, 1, 'uint32');
totalImages = swapbytes(uint32(A));
A = fread(fid2, 1, 'uint32');
if totalImages ~= swapbytes(uint32(A))
    error('Total number of images read from images and labels files are not the same');
end
fprintf('Total number of images: %d\n', totalImages);

% Read in number of rows
A = fread(fid1, 1, 'uint32');
numRows = swapbytes(uint32(A));

% Read in number of columns
A = fread(fid1, 1, 'uint32');
numCols = swapbytes(uint32(A));

fprintf('Dimensions of each digit: %d x %d\n', numRows, numCols);

% For each image, store into an individual slice
images = zeros(numRows, numCols, totalImages, 'uint8');
for k = 1 : totalImages
    % Read in numRows*numCols pixels at a time
    A = fread(fid1, numRows*numCols, 'uint8');

    % Reshape so that it becomes a matrix
    % We are actually reading this in column major format
    % so we need to transpose this at the end
    images(:,:,k) = reshape(uint8(A), numCols, numRows).';
end

% Read in the labels
labels = fread(fid2, totalImages, 'uint8');

% Close the files
fclose(fid1);
fclose(fid2);

end
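Because fread fails with "Invalid file identifier" whenever fopen could not open the file, a small guard can report the offending path before any read is attempted. A minimal sketch, assuming a made-up path that does not exist; this check is not part of the function above:

```matlab
% A path that should not exist (made up for illustration).
badPath = fullfile(tempdir, 'no-such-mnist-file-idx3-ubyte');

% fopen returns -1 on failure; the second output is a system message.
[fid, msg] = fopen(badPath, 'r');
if fid == -1
    fprintf('Could not open "%s": %s\n', badPath, msg);
else
    fclose(fid);
end
```

Inside mnist_parse, the fprintf could be replaced with a call to error so that the function stops before the first fread.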

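To call this function, simply specify the paths to the image file and the labels file. Assuming you are running it from the same directory that contains the files, you would do the following for the training images: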

[images, labels] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');

Likewise, you would do the following for the test images:

[images, labels] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');

To access the kth digit, you simply do:

digit = images(:,:,k);

The corresponding label for the kth digit is:

lbl = labels(k);

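Finally, to get this data into a format that the code I saw on Github accepts: that code assumes that rows correspond to training examples and columns correspond to features. If you want this format, simply reshape the data so that the pixels of each image are spread out across the columns.

Therefore, just do this: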

[trainingdata, traingnd] = mnist_parse('train-images-idx3-ubyte', 'train-labels-idx1-ubyte');
trainingdata = double(reshape(trainingdata, size(trainingdata,1)*size(trainingdata,2), []).');
traingnd = double(traingnd);

[testdata, testgnd] = mnist_parse('t10k-images-idx3-ubyte', 't10k-labels-idx1-ubyte');
testdata = double(reshape(testdata, size(testdata,1)*size(testdata,2), []).');
testgnd = double(testgnd);

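The above uses the same variables as in the script, so you should be able to plug it in and have it work. The second line reshapes the matrix so that each digit is in a column, but we need to transpose it so that each digit ends up in a row. We also need to cast to double, since that is what the Github code does. The same logic applies to the test data. Also note that I have explicitly cast the training and test labels to double to ensure maximum compatibility with whatever algorithm you decide to use on this data.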
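The reshape-and-transpose step can be checked on a toy array. The sketch below uses a made-up 2 x 3 x 2 array as a stand-in for the images output (the sizes and values are assumptions for illustration):

```matlab
% Toy stand-in for the images array: 2 rows x 3 columns x 2 "digits".
toy = uint8(reshape(1:12, 2, 3, 2));

% Flatten each slice into a column, then transpose so that each row
% holds one digit and each column holds one pixel position.
flat = double(reshape(toy, size(toy,1)*size(toy,2), []).');

disp(size(flat));
```

Here flat has one row per digit and six columns; for the real training set the same line yields a 60000 x 784 matrix.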

Happy digit hacking!

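【Comments】:

Thank you so much for the detailed explanation. Because of some glitch I could not log in to my Stackoverflow account, which is why I could not check your answer. I ran your code step by step, but Matlab throws the error: Error using fread Invalid file identifier. Use fopen to generate a valid file identifier. Error in mnist_parse (line 11) A = fread(fid1, 1, 'uint32'); Error in demo (line 3) [Trainimages, Trainlabels] = mnist_parse('C:\Users\Desktop\MNIST\train-images-idx3-ubyte', 'C:\Users\Desktop\MNIST\train-labels-idx
Following your observation that I had missed reading the total number of images, rows and columns, I have corrected that part in my question. However, I cannot get rid of this new error that appears when I run your solution. I have not used the last piece of code, the one I should apply to the program from GitHub. Please let me know what I should do to remove the error.
It doesn't work because your path is incorrect. Your username should appear between Users and Desktop in the path; no such directory exists. fopen works when it is given a valid path to a file, which you have not done. Please make sure your file path is exactly right, or put the script in the same directory as the MNIST data and use local paths.
The filenames after extraction are not the same as in your answer. :) Looking closely at the filenames, what I see is train-images.idx3-ubyte, and likewise for the others. So I no longer get that error. However, there is a new problem: the GitHub code uses cateTrainTest, which is available for the cifar_10_gist database. It is used in the line [Pre, Rec] = evaluate_macro(cateTrainTest, Ret).
Basically, the elements denote similarity: 0 if two data points are similar and 1 if they are dissimilar, if I remember correctly. Do you know where I can find this data file for the MNIST database? Or could you help with some other hack so that the MNIST database can be used?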