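A few things to note:
1. The original code only supports square images (width == height). I have not found a clean solution for this yet, but I tried a small change (marked in code 2 below) so that non-square images can be used. Hopefully it works for you too.
2. Images are converted to grayscale by default.
3. Based on the code at https://github.com/gskielian/JPG-PNG-to-MNIST-NN-Format on GitHub; debugged and run successfully!
4. Data layout: the following sit in the same directory as code 1 and code 2:
The two image folders (training-images and test-images) each contain one subfolder per class (for example 0/1/2/3/4/5/6), and each class folder holds multiple images.
batches.meta.txt lists the corresponding class names.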
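The layout described above can be sanity-checked before running anything. The following is a minimal sketch, assuming the two folders are named training-images and test-images as in the scripts; `summarize_layout` is a hypothetical helper and not part of the original repository.

```python
import os

def summarize_layout(root):
    """Return {class_folder_name: number_of_png_or_jpg_files} for one image folder."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files such as .DS_Store
        images = [f for f in os.listdir(class_dir)
                  if f.lower().endswith(('.png', '.jpg'))]
        counts[entry] = len(images)
    return counts

if __name__ == '__main__':
    # Folder names are taken from the scripts in this post.
    for root in ('training-images', 'test-images'):
        if os.path.isdir(root):
            print(root, summarize_layout(root))
```

An empty class or a class folder whose name is not an integer is worth fixing at this point, since the conversion script turns the folder name into the label with `int(...)`.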
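Code 1: script that resizes the images and converts them to grayscale. (The original default size is 28x28; the copy below is set to 64x128. Change it to your own size. The original code requires width == height.)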
resize-script.sh
#!/bin/bash
#simple script for resizing images in all class directories
#also reformats everything from whatever to png
if [ `ls test-images/*/*.jpg 2> /dev/null | wc -l ` -gt 0 ]; then
  echo "Converting JPG files in test-images"
  for file in test-images/*/*.jpg; do
    # resize and convert to grayscale in one pass, so the gray PNG keeps the new size
    convert "$file" -resize 64x128\! -set colorspace Gray -separate -average "${file%.*}.png"
    file "$file" # prints the detected file type; comment out when not testing
    rm "$file"
  done
fi
if [ `ls test-images/*/*.png 2> /dev/null | wc -l ` -gt 0 ]; then
  echo "Converting PNG files in test-images"
  for file in test-images/*/*.png; do
    # resize and convert to grayscale in one pass (the PNG is overwritten in place)
    convert "$file" -resize 64x128\! -set colorspace Gray -separate -average "${file%.*}.png"
    file "$file" # prints the detected file type; comment out when not testing
  done
fi
if [ `ls training-images/*/*.jpg 2> /dev/null | wc -l ` -gt 0 ]; then
  echo "Converting JPG files in training-images"
  for file in training-images/*/*.jpg; do
    # resize and convert to grayscale in one pass, so the gray PNG keeps the new size
    convert "$file" -resize 64x128\! -set colorspace Gray -separate -average "${file%.*}.png"
    file "$file" # prints the detected file type; comment out when not testing
    rm "$file"
  done
fi
if [ `ls training-images/*/*.png 2> /dev/null | wc -l ` -gt 0 ]; then
  echo "Converting PNG files in training-images"
  for file in training-images/*/*.png; do
    # resize and convert to grayscale in one pass (the PNG is overwritten in place)
    convert "$file" -resize 64x128\! -set colorspace Gray -separate -average "${file%.*}.png"
    file "$file" # prints the detected file type; comment out when not testing
  done
fi
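The shell script repeats the same loop four times (two folders, two extensions). As an illustration only, the same ImageMagick call can be driven from Python with one loop. This is a sketch that assumes `convert` is on the PATH and reuses only the options that appear in the script above; `build_convert_command` and `convert_folder` are hypothetical names.

```python
import glob
import os
import subprocess

def build_convert_command(src, size='64x128'):
    """Return the ImageMagick argv that resizes `src` and writes a gray PNG next to it."""
    dst = os.path.splitext(src)[0] + '.png'  # same idea as ${file%.*}.png in the shell script
    # No shell is involved here, so the "!" (force exact size) needs no backslash.
    return ['convert', src, '-resize', size + '!',
            '-set', 'colorspace', 'Gray', '-separate', '-average', dst]

def convert_folder(root, size='64x128'):
    """Convert every JPG/PNG under root/<class>/ and delete the source JPGs."""
    for ext in ('jpg', 'png'):
        for src in sorted(glob.glob(os.path.join(root, '*', '*.' + ext))):
            subprocess.run(build_convert_command(src, size), check=True)
            if ext == 'jpg':
                os.remove(src)  # mirrors the rm "$file" step for JPGs

if __name__ == '__main__':
    for root in ('test-images', 'training-images'):
        convert_folder(root)
```

Resizing and graying in a single `convert` call matters for JPG input: if the two steps are run separately on the original JPG, the second call overwrites the resized PNG with an unresized one.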
Code 2: convert the JPG/PNG images into an MNIST-format dataset:
convert-images-to-mnist-format.py
import os
from PIL import Image
from array import array
from random import shuffle

# Load from and save to
Names = [['./training-images','train'], ['./test-images','test']]

for name in Names:
    data_image = array('B')
    data_label = array('B')

    FileList = []
    for dirname in sorted(os.listdir(name[0])):
        path = os.path.join(name[0],dirname)
        if not os.path.isdir(path): # skips stray files such as .DS_Store from Mac OS
            continue
        for filename in os.listdir(path):
            if filename.endswith(".png"):
                FileList.append(os.path.join(name[0],dirname,filename))

    shuffle(FileList) # Useful for further segmenting the validation set
    for filename in FileList:
        # the class label is the name of the folder that holds the image
        label = int(os.path.basename(os.path.dirname(filename)))

        Im = Image.open(filename).convert('L') # force 8-bit grayscale so each pixel is one byte
        width, height = Im.size

        # the IDX format stores pixels row by row: rows (y) in the outer loop, columns (x) in the inner loop
        for y in range(0,height):
            for x in range(0,width):
                data_image.append(Im.getpixel((x,y))) # the modified line: getpixel((x,y)) also works for non-square images

        data_label.append(label) # labels start (one unsigned byte each)

    hexval = "{0:#0{1}x}".format(len(FileList),6) # number of files in HEX (fits up to 65535 files)

    # header for label array
    header = array('B')
    header.extend([0,0,8,1,0,0])
    header.append(int('0x'+hexval[2:][:2],16))
    header.append(int('0x'+hexval[2:][2:],16))

    data_label = header + data_label

    # additional header for images array: number of rows (height), then number of columns (width)
    if max([width,height]) < 256:
        header.extend([0,0,0,height,0,0,0,width])
    else:
        raise ValueError('Image exceeds maximum size: 255x255 pixels')

    header[3] = 3 # Changing MSB for image data (0x00000803)

    data_image = header + data_image
    output_file = open(name[1]+'-images-idx3-ubyte', 'wb')
    data_image.tofile(output_file)
    output_file.close()

    output_file = open(name[1]+'-labels-idx1-ubyte', 'wb')
    data_label.tofile(output_file)
    output_file.close()

# gzip resulting files
for name in Names:
    os.system('gzip '+name[1]+'-images-idx3-ubyte')
    os.system('gzip '+name[1]+'-labels-idx1-ubyte')
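The script assembles the IDX headers byte by byte, which is why the file count goes through a hex string and each image dimension has to fit in a single byte. As a sketch of the same header layout (magic number, item count, then rows and columns, all big-endian 32-bit integers), `struct.pack` can write the fields directly. `image_header` and `label_header` are hypothetical helpers, not part of the original code.

```python
import struct

IMAGE_MAGIC = 0x00000803  # unsigned-byte data, 3 dimensions (count, rows, cols)
LABEL_MAGIC = 0x00000801  # unsigned-byte data, 1 dimension (count)

def image_header(count, rows, cols):
    """16-byte header for an images file; rows is the height, cols is the width."""
    return struct.pack('>IIII', IMAGE_MAGIC, count, rows, cols)

def label_header(count):
    """8-byte header for a labels file."""
    return struct.pack('>II', LABEL_MAGIC, count)
```

Written this way, the count and the dimensions are each full 32-bit fields, so the 65535-file and one-byte-per-dimension limits of the hand-built header no longer apply.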
How to run: first run resize-script.sh, then run convert-images-to-mnist-format.py.
Success!
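To confirm that the generated files are readable, the headers can be parsed back and compared with the payload size. The following is a minimal sketch that uses only the standard library and assumes the output file names produced by the script above (with the .gz suffix added by gzip); `read_idx` is a hypothetical helper.

```python
import gzip
import os
import struct

def read_idx(path):
    """Return (dims, payload_bytes) for an unsigned-byte IDX file, gzipped or not."""
    opener = gzip.open if path.endswith('.gz') else open
    with opener(path, 'rb') as f:
        data = f.read()
    # Magic number: two zero bytes, a data-type byte (0x08 = unsigned byte), a dimension count.
    zero, dtype, ndim = struct.unpack('>HBB', data[:4])
    if zero != 0 or dtype != 0x08:
        raise ValueError('not an unsigned-byte IDX file: ' + path)
    dims = struct.unpack('>' + 'I' * ndim, data[4:4 + 4 * ndim])
    payload = data[4 + 4 * ndim:]
    expected = 1
    for d in dims:
        expected *= d
    if len(payload) != expected:
        raise ValueError('payload has %d bytes, header implies %d' % (len(payload), expected))
    return dims, payload

if __name__ == '__main__':
    for path in ('train-images-idx3-ubyte.gz', 'train-labels-idx1-ubyte.gz'):
        if os.path.exists(path):
            dims, _ = read_idx(path)
            print(path, dims)  # images: (count, rows, cols); labels: (count,)
```

If the image dimensions print in the wrong order or the size check fails, the pixel loop and the header in code 2 disagree about rows and columns.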