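A few things to note:

1. The original code only supports square images (width == height). I have not found a good solution for this yet; I tried modifying the code to handle other sizes, and I hope it works for you.

2. Images are converted to grayscale by default.

3. This is based on the code at https://github.com/gskielian/JPG-PNG-to-MNIST-NN-Format on GitHub; I debugged it and it runs successfully.

4. Data layout: the following sit in the same directory as code 1 and code 2: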


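Each of the two image folders (training-images and test-images) contains one subfolder per class (for example 0/1/2/3/4/5/6), and each class folder holds multiple images.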


batches.meta.txt holds the names of the corresponding classes.
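
Before running the scripts, it can help to confirm that the folders really follow this layout. The snippet below is only a minimal sketch and is not part of the original repository: the helper name `count_images_per_class` is hypothetical, and it assumes the images carry a lowercase `.png` or `.jpg` extension, as the scripts in this post expect.

```python
import os

IMAGE_EXTS = (".png", ".jpg")  # assumption: the extensions handled by the scripts in this post


def count_images_per_class(root):
    """Return {class_folder_name: number_of_image_files} for each class folder under root."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if not os.path.isdir(class_dir):
            continue  # skip stray files such as .DS_Store
        counts[entry] = sum(
            1 for f in os.listdir(class_dir) if f.endswith(IMAGE_EXTS)
        )
    return counts


if __name__ == "__main__":
    for root in ("training-images", "test-images"):
        if os.path.isdir(root):
            print(root, count_images_per_class(root))
```

A class folder that reports 0 images, or a folder name that is not an integer, is worth fixing first, because the conversion code uses the folder name as the label.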


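Code 1: resize the images and convert them to grayscale. (The size defaults to 28x28 in the original code; the script below is set to 64x128. Change it to your own size. The original conversion code requires width == height.)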

resize-script.sh

#!/bin/bash

# Simple script for resizing the images in all class directories.
# It also converts everything to grayscale PNG. Requires ImageMagick (convert).

SIZE="64x128"  # target WIDTHxHEIGHT; the trailing ! below forces this exact size

shopt -s nullglob  # a glob with no matches expands to nothing, so empty folders are skipped

for dir in test-images training-images; do
  # JPG: write a resized PNG next to the file, convert that PNG to gray, then delete the JPG
  for file in "$dir"/*/*.jpg; do
    png="${file%.*}.png"
    convert "$file" -resize "$SIZE"\! "$png"
    convert "$png" -set colorspace Gray -separate -average "$png"
    # file "$png"  # uncomment for testing
    rm "$file"
  done

  # PNG: resize and convert to gray in place
  for file in "$dir"/*/*.png; do
    convert "$file" -resize "$SIZE"\! "$file"
    convert "$file" -set colorspace Gray -separate -average "$file"
    # file "$file"  # uncomment for testing
  done
done
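
If ImageMagick is not available, the same preprocessing can be approximated with Pillow, which code 2 already depends on. This is a sketch of an alternative, not the script the post uses: the function names `to_gray_resized` and `convert_tree` are hypothetical, and Pillow's `convert("L")` uses a weighted luma formula, so the gray values will not match the channel average produced by the shell script exactly.

```python
import os

from PIL import Image


def to_gray_resized(img, size=(64, 128)):
    """Return an 8-bit grayscale copy of img, resized to exactly size = (width, height)."""
    # resize() ignores the aspect ratio, like ImageMagick's `-resize WxH!`
    return img.convert("L").resize(size)


def convert_tree(root, size=(64, 128)):
    """Convert every .jpg/.png under root/<class>/ to a resized grayscale PNG."""
    for class_name in os.listdir(root):
        class_dir = os.path.join(root, class_name)
        if not os.path.isdir(class_dir):
            continue
        for fname in os.listdir(class_dir):
            base, ext = os.path.splitext(fname)
            if ext not in (".jpg", ".png"):
                continue
            src = os.path.join(class_dir, fname)
            with Image.open(src) as img:
                out = to_gray_resized(img, size)
            out.save(os.path.join(class_dir, base + ".png"))
            if ext == ".jpg":
                os.remove(src)  # mirror the shell script: the JPG is replaced by the PNG


if __name__ == "__main__":
    for root in ("test-images", "training-images"):
        if os.path.isdir(root):
            convert_tree(root)
```

Passing a square size such as `(28, 28)` keeps the output compatible with loaders that assume square MNIST-style images.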

Code 2: convert the JPG/PNG images to an MNIST-format dataset

convert-images-to-mnist-format.py

import os
from array import array
from random import shuffle

from PIL import Image

# Load from and save to
Names = [['./training-images', 'train'], ['./test-images', 'test']]

for name in Names:

    data_image = array('B')
    data_label = array('B')

    # Collect (path, label) pairs; the class folder name is the integer label
    FileList = []
    for dirname in os.listdir(name[0]):
        path = os.path.join(name[0], dirname)
        if not os.path.isdir(path):  # skips stray files such as .DS_Store on macOS
            continue
        for filename in os.listdir(path):
            if filename.endswith(".png"):
                FileList.append((os.path.join(path, filename), int(dirname)))

    if not FileList:
        raise ValueError('No PNG files found under ' + name[0])

    shuffle(FileList)  # Useful for further segmenting the validation set

    for filename, label in FileList:

        Im = Image.open(filename).convert('L')  # force 8-bit grayscale, one value per pixel

        width, height = Im.size  # all images are assumed to share one size

        # IDX stores pixels row by row: y (row) is the outer loop, x (column) the inner one.
        # This also works when width != height.
        for y in range(height):
            for x in range(width):
                data_image.append(Im.getpixel((x, y)))

        data_label.append(label)  # labels start (one unsigned byte each)

    # header for label array: magic number 0x00000801, then the file count (4 bytes, big-endian)

    header = array('B', [0, 0, 8, 1])
    header.extend(len(FileList).to_bytes(4, 'big'))

    data_label = header + data_label

    # additional header for images array: number of rows (height), then columns (width)

    if max([width, height]) <= 255:  # each dimension is written into a single byte here
        header.extend([0, 0, 0, height, 0, 0, 0, width])
    else:
        raise ValueError('Image exceeds maximum size: 255x255 pixels')

    header[3] = 3  # Changing the last byte of the magic number for image data (0x00000803)

    data_image = header + data_image

    with open(name[1] + '-images-idx3-ubyte', 'wb') as output_file:
        data_image.tofile(output_file)

    with open(name[1] + '-labels-idx1-ubyte', 'wb') as output_file:
        data_label.tofile(output_file)

# gzip resulting files

for name in Names:
    os.system('gzip ' + name[1] + '-images-idx3-ubyte')
    os.system('gzip ' + name[1] + '-labels-idx1-ubyte')
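
The headers that the script assembles byte by byte are just big-endian 32-bit integers: a magic number (0x00000801 for labels, 0x00000803 for images), the item count, and for images the number of rows and columns. As an illustration only, the same bytes can be produced with `struct`; the helper names `idx1_header` and `idx3_header` are hypothetical and do not appear in the original code.

```python
import struct


def idx1_header(count):
    """8-byte header of a labels file: magic 0x00000801, then the number of labels."""
    return struct.pack(">II", 0x00000801, count)


def idx3_header(count, rows, cols):
    """16-byte header of an images file: magic 0x00000803, count, rows (height), cols (width)."""
    return struct.pack(">IIII", 0x00000803, count, rows, cols)


if __name__ == "__main__":
    # Three images that are 64 pixels wide and 128 pixels high
    print(list(idx3_header(3, 128, 64)))
```

Because every field is a full 4-byte integer, this form is not limited to dimensions that fit in one byte.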

 

How to run: run resize-script.sh first, then convert-images-to-mnist-format.py.
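
After both steps have finished, the gzipped outputs (for example train-images-idx3-ubyte.gz and train-labels-idx1-ubyte.gz) can be read back to check that the counts and dimensions are what you expect. The following is a minimal sketch using only the standard library; the function name `load_idx_pair` is hypothetical.

```python
import gzip
import struct


def load_idx_pair(images_path, labels_path):
    """Read gzipped idx3 images and idx1 labels; return (images, labels, rows, cols)."""
    with gzip.open(images_path, "rb") as f:
        magic, count, rows, cols = struct.unpack(">IIII", f.read(16))
        if magic != 0x00000803:
            raise ValueError("not an idx3-ubyte images file")
        pixels = f.read()
    with gzip.open(labels_path, "rb") as f:
        magic, label_count = struct.unpack(">II", f.read(8))
        if magic != 0x00000801:
            raise ValueError("not an idx1-ubyte labels file")
        labels = list(f.read())
    if label_count != count or len(labels) != count:
        raise ValueError("image count and label count differ")
    size = rows * cols
    if len(pixels) != count * size:
        raise ValueError("pixel data does not match the header dimensions")
    # One bytes object per image, stored row by row
    images = [pixels[i * size:(i + 1) * size] for i in range(count)]
    return images, labels, rows, cols


if __name__ == "__main__":
    images, labels, rows, cols = load_idx_pair(
        "train-images-idx3-ubyte.gz", "train-labels-idx1-ubyte.gz"
    )
    print(len(images), "images of", rows, "rows x", cols, "columns; first label:", labels[0])
```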

 

Success!
