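【Question title】: How to check dimensions of all images in a directory using python?
【Posted】: 2010-12-03 04:28:56
【Question】:

I need to check the dimensions of the images in a directory. It currently holds about 700 images. I only need to check the dimensions; any image whose dimensions do not match a given size should be moved to a different folder. How should I get started?

【Question comments】:

    Tags: python image directory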


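    【Solution 1】:

    If you don't need the rest of PIL and only want the image dimensions of PNG, JPEG and GIF files, then this small function (BSD license) does the job nicely. Note that it targets Python 2 (it imports the StringIO module and works on str data):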

    http://code.google.com/p/bfg-pages/source/browse/trunk/pages/getimageinfo.py

    import StringIO
    import struct
    
    def getImageInfo(data):
        data = str(data)
        size = len(data)
        height = -1
        width = -1
        content_type = ''
    
        # handle GIFs
        if (size >= 10) and data[:6] in ('GIF87a', 'GIF89a'):
            # Check to see if content_type is correct
            content_type = 'image/gif'
            w, h = struct.unpack("<HH", data[6:10])
            width = int(w)
            height = int(h)
    
        # See PNG 2. Edition spec (http://www.w3.org/TR/PNG/)
        # Bytes 0-7 are below, 4-byte chunk length, then 'IHDR'
        # and finally the 4-byte width, height
        elif ((size >= 24) and data.startswith('\211PNG\r\n\032\n')
              and (data[12:16] == 'IHDR')):
            content_type = 'image/png'
            w, h = struct.unpack(">LL", data[16:24])
            width = int(w)
            height = int(h)
    
        # Maybe this is for an older PNG version.
        elif (size >= 16) and data.startswith('\211PNG\r\n\032\n'):
            # Check to see if we have the right content type
            content_type = 'image/png'
            w, h = struct.unpack(">LL", data[8:16])
            width = int(w)
            height = int(h)
    
        # handle JPEGs
        elif (size >= 2) and data.startswith('\377\330'):
            content_type = 'image/jpeg'
            jpeg = StringIO.StringIO(data)
            jpeg.read(2)
            b = jpeg.read(1)
            try:
                while (b and ord(b) != 0xDA):
                    while (ord(b) != 0xFF): b = jpeg.read(1)
                    while (ord(b) == 0xFF): b = jpeg.read(1)
                    if (ord(b) >= 0xC0 and ord(b) <= 0xC3):
                        jpeg.read(3)
                        h, w = struct.unpack(">HH", jpeg.read(4))
                        break
                    else:
                        jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0])-2)
                    b = jpeg.read(1)
                width = int(w)
                height = int(h)
            except struct.error:
                pass
            except ValueError:
                pass
    
        return content_type, width, height
    

    【Comments】:

    • This worked like a charm for me; +1 for a solution without third-party libraries.
    • How do you call this function? What is your data?
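
On the question of how to call it: the function expects the raw bytes of the image file, so the file has to be opened in binary mode and only the first few bytes need to be read. Below is a minimal Python 3 sketch that covers only the PNG layout described in the answer's comments (8-byte signature, 4-byte chunk length, 'IHDR', then width and height). The names `png_size` and `read_png_size` are hypothetical helpers for illustration, not part of the linked module.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data):
    """Return (width, height) from the first 24 bytes of a PNG, or None."""
    if len(data) >= 24 and data[:8] == PNG_SIGNATURE and data[12:16] == b"IHDR":
        # Width and height are big-endian unsigned 32-bit integers.
        return struct.unpack(">LL", data[16:24])
    return None


def read_png_size(path):
    """Read only the header bytes of the file at `path` (binary mode)."""
    with open(path, "rb") as f:
        return png_size(f.read(24))
```

For example, `read_png_size("picture.png")` would return a `(width, height)` tuple for a valid PNG, where `picture.png` is a placeholder file name.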
    【Solution 2】:

    A common approach is to use PIL, the Python Imaging Library, to get the dimensions:

    from PIL import Image
    import os.path
    
    filename = os.path.join('path', 'to', 'image', 'file')
    img = Image.open(filename)
    print(img.size)
    

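    You then need to loop over the files in the directory, check each one's dimensions against the required size, and move those that do not match.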
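
The loop described above might look roughly like the following sketch. It assumes Pillow is installed for the default size lookup. The names `pil_size` and `move_mismatched`, and the `get_size` parameter, are hypothetical and chosen only for illustration. Files that cannot be read as images are left where they are.

```python
import os
import shutil


def pil_size(path):
    """Return (width, height) using Pillow; imported lazily (assumption)."""
    from PIL import Image
    with Image.open(path) as img:
        return img.size


def move_mismatched(src_dir, dst_dir, expected, get_size=pil_size):
    """Move files in src_dir whose size differs from `expected` to dst_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        try:
            size = get_size(path)
        except OSError:
            # Not a readable image; leave it in place.
            continue
        if tuple(size) != tuple(expected):
            shutil.move(path, os.path.join(dst_dir, name))
            moved.append(name)
    return moved
```

A call such as `move_mismatched("photos", "photos_wrong_size", (1920, 1080))` would return the names of the files it moved; both folder names are placeholders.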

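    【Comments】:

      【Solution 3】:

      You can use the Python Imaging Library (aka PIL) to read the image header and query the dimensions.

      One way to approach it is to write a function that takes a filename and returns the dimensions (using PIL). Then walk all the files under the directory with os.path.walk (os.walk in Python 3, where os.path.walk no longer exists) and apply this function. Collecting the results, you can build a dictionary mapping filename -> dimensions, then use a list comprehension (see itertools) to filter out those that do not match the required size.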
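
The mapping-then-filter idea can be sketched as follows. This is only an illustration: `collect_sizes` and `mismatched` are hypothetical names, and `get_size` stands for whatever function returns the dimensions of one file (for instance a small wrapper around PIL's `Image.open(path).size`).

```python
import os


def collect_sizes(root, get_size):
    """Map each file path under `root` to whatever get_size(path) returns."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            sizes[path] = get_size(path)
    return sizes


def mismatched(sizes, expected):
    """Return the paths whose dimensions differ from `expected`."""
    return [path for path, size in sizes.items()
            if tuple(size) != tuple(expected)]
```

Keeping the dictionary around, rather than filtering while walking, makes it easy to inspect the full distribution of sizes before deciding what to move.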

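      【Comments】:

      • I did this, but with os.listdir instead; it works fine with ~700 images. Is os.path.walk better?
      • If os.listdir does what you need, that's fine. The main difference is that os.walk recurses into subdirectories.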
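
The difference mentioned in the last comment can be seen in a small throwaway experiment. The directory layout below is made up for the demonstration and is created in a temporary folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one file at the top level, one in a subfolder.
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("top.jpg", os.path.join("sub", "nested.jpg")):
        open(os.path.join(root, rel), "wb").close()

    # os.listdir: entries directly inside root, directories included.
    flat = sorted(os.listdir(root))
    # os.walk: file names found at any depth below root.
    recursive = sorted(
        name for _dirpath, _dirs, files in os.walk(root) for name in files
    )

print(flat)
print(recursive)
```

Here `flat` contains the `sub` directory entry but not `nested.jpg`, while `recursive` contains both image files.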
      【Solution 4】:

      Here is a script that does what you need:

      #!/usr/bin/env python
      
      """
      Get information about images in a folder.
      """
      
      from os import listdir
      from os.path import isfile, join
      
      from PIL import Image
      
      
      def print_data(data):
          """
          Parameters
          ----------
          data : dict
          """
          for k, v in data.items():
              print("%s:\t%s" % (k, v))
          print("Min width: %i" % data["min_width"])
          print("Max width: %i" % data["max_width"])
          print("Min height: %i" % data["min_height"])
          print("Max height: %i" % data["max_height"])
      
      
      def main(path):
          """
          Parameters
          ----------
          path : str
              Path where to look for image files.
          """
          onlyfiles = [f for f in listdir(path) if isfile(join(path, f))]
      
          # Filter files by extension
          onlyfiles = [f for f in onlyfiles if f.endswith(".jpg")]
      
          data = {}
          data["images_count"] = len(onlyfiles)
          data["min_width"] = 10 ** 100  # No image will be bigger than that
          data["max_width"] = 0
          data["min_height"] = 10 ** 100  # No image will be bigger than that
          data["max_height"] = 0
      
          for filename in onlyfiles:
              # Join with `path` so this also works when path is not "."
              with Image.open(join(path, filename)) as im:
                  width, height = im.size
              data["min_width"] = min(width, data["min_width"])
              data["max_width"] = max(width, data["max_width"])
              data["min_height"] = min(height, data["min_height"])
              data["max_height"] = max(height, data["max_height"])
      
          print_data(data)
      
      
      if __name__ == "__main__":
          main(path=".")
      

      【Comments】:

        【Solution 5】:
        import os
        from PIL import Image 
        
        folder_images = "/tmp/photos"
        size_images = dict()
        
        for dirpath, _, filenames in os.walk(folder_images):
            for path_image in filenames:
                image = os.path.abspath(os.path.join(dirpath, path_image))
                with Image.open(image) as img:
                    width, height = img.size
                    size_images[path_image] = {'width': width, 'height': height}
        print(size_images)
        

        folder_images is the directory that contains your images. size_images is the variable that holds the image sizes, in the format shown below.

        Example:

        {'image_name.jpg' : {'width': 100, 'height': 100} }
        

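        【Comments】:

        • While the idea behind your code is good, it lacks explanation. I would also point out that all-uppercase variable names are usually reserved for constants, so I would not recommend using one for a dictionary the way you did with SIZE_IMAGES.
        • Please, could you re-evaluate my answer?
        【Solution 6】:

        You can also use the cv2 library to check an image's dimensions.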

        import cv2
        
        # read image
        img = cv2.imread('boarding_pass.png', cv2.IMREAD_UNCHANGED)
        
        # get dimensions of image
        dimensions = img.shape
        
        # height, width, number of channels in image
        height = img.shape[0]
        width = img.shape[1]
        # Grayscale images have a 2-D shape, so there is no third entry
        channels = img.shape[2] if img.ndim == 3 else 1
        
        print('Image Dimension    : ',dimensions)
        print('Image Height       : ',height)
        print('Image Width        : ',width)
        print('Number of Channels : ',channels)
        

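        【Comments】:

          【Solution 7】:

          I am very pleased with the answers above, because they helped me write another simple answer to this question.

          Since the answers above are scripts only, readers have to run them to check that they work. So I decided to solve this problem interactively (using the Python shell).

          I think it will be clear to you. I am using Python 2.7.12 and have installed the Pillow library to access images via PIL. My current directory contains many jpg images and one png image.

          Now let's move on to the Python shell.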

          >>> #Date of creation : 3 March 2017
          >>> #Python version   : 2.7.12
          >>>
          >>> import os         #Importing os module
          >>> import glob       #Importing glob module to list the same type of image files like jpg/png(here)
          >>> 
          >>> for extension in ["jpg", 'png']:
          ...     print "List of all " + extension + " files in current directory:-"
          ...     i = 1
          ...     for imgfile in glob.glob("*."+extension):
          ...         print i,") ",imgfile
          ...         i += 1
          ...     print "\n"
          ... 
          List of all jpg files in current directory:-
          1 )  002-tower-babel.jpg
          2 )  1454906.jpg
          3 )  69151278-great-hd-wallpapers.jpg
          4 )  amazing-ancient-wallpaper.jpg
          5 )  Ancient-Rome.jpg
          6 )  babel_full.jpg
          7 )  Cuba-is-wonderfull.jpg
          8 )  Cute-Polar-Bear-Images-07775.jpg
          9 )  Cute-Polar-Bear-Widescreen-Wallpapers-07781.jpg
          10 )  Hard-work-without-a-lh.jpg
          11 )  jpeg422jfif.jpg
          12 )  moscow-park.jpg
          13 )  moscow_city_night_winter_58404_1920x1080.jpg
          14 )  Photo1569.jpg
          15 )  Pineapple-HD-Photos-03691.jpg
          16 )  Roman_forum_cropped.jpg
          17 )  socrates.jpg
          18 )  socrates_statement1.jpg
          19 )  steve-jobs.jpg
          20 )  The_Great_Wall_of_China_at_Jinshanling-edit.jpg
          21 )  torenvanbabel_grt.jpg
          22 )  tower_of_babel4.jpg
          23 )  valckenborch_babel_1595_grt.jpg
          24 )  Wall-of-China-17.jpg
          
          
          List of all png files in current directory:-
          1 )  gergo-hungary.png
          
          
          >>> #So let's display all the resolutions with the filename
          ... from PIL import Image   #Importing Python Imaging library(PIL)
          >>> for extension in ["jpg", 'png']:
          ...     i = 1
          ...     for imgfile in glob.glob("*." + extension):
          ...         img = Image.open(imgfile)
          ...         print i,") ",imgfile,", resolution: ",img.size[0],"x",img.size[1]
          ...         i += 1
          ...     print "\n"
          ... 
          1 )  002-tower-babel.jpg , resolution:  1024 x 768
          2 )  1454906.jpg , resolution:  1920 x 1080
          3 )  69151278-great-hd-wallpapers.jpg , resolution:  5120 x 2880
          4 )  amazing-ancient-wallpaper.jpg , resolution:  1920 x 1080
          5 )  Ancient-Rome.jpg , resolution:  1000 x 667
          6 )  babel_full.jpg , resolution:  1464 x 1142
          7 )  Cuba-is-wonderfull.jpg , resolution:  1366 x 768
          8 )  Cute-Polar-Bear-Images-07775.jpg , resolution:  1600 x 1067
          9 )  Cute-Polar-Bear-Widescreen-Wallpapers-07781.jpg , resolution:  2300 x 1610
          10 )  Hard-work-without-a-lh.jpg , resolution:  650 x 346
          11 )  jpeg422jfif.jpg , resolution:  2048 x 1536
          12 )  moscow-park.jpg , resolution:  1920 x 1200
          13 )  moscow_city_night_winter_58404_1920x1080.jpg , resolution:  1920 x 1080
          14 )  Photo1569.jpg , resolution:  480 x 640
          15 )  Pineapple-HD-Photos-03691.jpg , resolution:  2365 x 1774
          16 )  Roman_forum_cropped.jpg , resolution:  4420 x 1572
          17 )  socrates.jpg , resolution:  852 x 480
          18 )  socrates_statement1.jpg , resolution:  1280 x 720
          19 )  steve-jobs.jpg , resolution:  1920 x 1080
          20 )  The_Great_Wall_of_China_at_Jinshanling-edit.jpg , resolution:  4288 x 2848
          21 )  torenvanbabel_grt.jpg , resolution:  1100 x 805
          22 )  tower_of_babel4.jpg , resolution:  1707 x 956
          23 )  valckenborch_babel_1595_grt.jpg , resolution:  1100 x 748
          24 )  Wall-of-China-17.jpg , resolution:  1920 x 1200
          
          
          1 )  gergo-hungary.png , resolution:  1236 x 928
          
          
          >>> 
          

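          【Comments】:

            【Solution 8】:

            If you are using an ipython / jupyter notebook, this function works like a charm. The handy tool here is the file command of the Linux terminal. The advantages, you ask? Here:

            • Extremely fast, suited to folders containing thousands of images when you need to know the distribution of image sizes
            • No need to load the images into memory, which avoids excessive memory use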
            def get_image_size_faster(file_dir, ext='png'):
                    """
                    Function to retrieve image size without loading the image at all
            
                    params:
                    file_dir = path of the folder containing image files
                    dim_index = index of image dimensions in the `file $file_path` call output
                                For PNG : -3 # Downloads/test.png: PNG image data, 4032 x 3024, 8-bit/color RGB, non-interlaced
                                For JPEG/JPG : -2 # Downloads/test.jpg: JPEG image data,..., baseline, precision 8, 2252x1400, components 3
                                For GIF : -1 # Downloads/test.gif: GIF image data, version 89a, 498 x 373
                    """
                    dim_index_map = {
                        'png' : -3,
                        'jpg' : -2,
                        'jpeg': -2,
                        'gif' : -1
                    }
            
                    dim_index = dim_index_map[ext]
            
                    files_regex = "{file_dir}/*.{ext}".format(file_dir=file_dir, ext=ext)
                    outputs = !file $files_regex
                    dims = [tuple(map(int, x.split(',')[dim_index].strip().split('x'))) for x in outputs]
                    return dims
            

            A plain Python script alternative to this function can be written with the subprocess package, and it produces the same result.
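
One possible shape for that subprocess variant is sketched below. It reuses the index table from the answer and assumes the `file` command is on the PATH and prints its fields in the same comma-separated order as the sample outputs in the docstring above; other versions of `file` may differ. The names `parse_dims` and `get_image_sizes` are hypothetical.

```python
import glob
import os
import subprocess

# Index of the "W x H" field in the comma-separated `file` output,
# copied from the answer above.
DIM_INDEX = {"png": -3, "jpg": -2, "jpeg": -2, "gif": -1}


def parse_dims(line, ext):
    """Extract (width, height) from one line of `file` output."""
    field = line.split(",")[DIM_INDEX[ext]]
    width, height = (int(part) for part in field.strip().split("x"))
    return width, height


def get_image_sizes(file_dir, ext="png"):
    """Run `file` on every *.ext in file_dir and parse each output line."""
    paths = sorted(glob.glob(os.path.join(file_dir, "*." + ext)))
    if not paths:
        return []
    result = subprocess.run(
        ["file"] + paths, capture_output=True, text=True, check=True
    )
    return [parse_dims(line, ext) for line in result.stdout.splitlines()]
```

Splitting the parsing from the process call keeps the fragile part (the position of the dimensions field) in one small function that can be checked against sample output lines.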

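            【Comments】:

              【Solution 9】:

              I tried to use @JohnTESlade's answer, but I ran into byte/string conversion problems, so I corrected it, followed a few PEPs, and added support for the EMF type, which is what I needed.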

              import struct
              from io import BytesIO
              from typing import Tuple
              
              
              def get_image_info(data: bytes) -> Tuple[str, int, int]:
                  size = len(data)
                  height = -1
                  width = -1
                  content_type = ''
              
                  # handle GIFs
                  if (size >= 10) and data[:6] in (b'GIF87a', b'GIF89a'):
                      # Check to see if content_type is correct
                      content_type = 'image/gif'
                      w, h = struct.unpack("<HH", data[6:10])
                      width = int(w)
                      height = int(h)
              
                  # See PNG 2. Edition spec (http://www.w3.org/TR/PNG/)
                  # Bytes 0-7 are below, 4-byte chunk length, then 'IHDR'
                  # and finally the 4-byte width, height
                  elif ((size >= 24) and data[0:8] == b'\211PNG\r\n\032\n'
                        and (data[12:16] == b'IHDR')):
                      content_type = 'image/png'
                      w, h = struct.unpack(">LL", data[16:24])
                      width = int(w)
                      height = int(h)
              
                  # Maybe this is for an older PNG version.
                  elif (size >= 16) and data[0:8] == b'\211PNG\r\n\032\n':
                      # Check to see if we have the right content type
                      content_type = 'image/png'
                      w, h = struct.unpack(">LL", data[8:16])
                      width = int(w)
                      height = int(h)
              
                  # handle JPEGs
                  elif (size >= 2) and data[0:2] == b'\377\330':
                      content_type = 'image/jpeg'
                      jpeg = BytesIO(data)
                      jpeg.read(2)
                      b = jpeg.read(1)
                      w, h = -1, -1
                      try:
                          while b and ord(b) != 0xDA:
                              while ord(b) != 0xFF:
                                  b = jpeg.read(1)
                              while ord(b) == 0xFF:
                                  b = jpeg.read(1)
                              if 0xC0 <= ord(b) <= 0xC3:
                                  jpeg.read(3)
                                  h, w = struct.unpack(">HH", jpeg.read(4))
                                  break
                              else:
                                  jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0]) - 2)
                              b = jpeg.read(1)
                          width = int(w)
                          height = int(h)
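                      except (struct.error, ValueError, TypeError):
                          # Truncated or malformed JPEG (ord() of an empty read
                          # raises TypeError): leave width/height at -1
                          pass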
              
                  # Maybe this will work for most EMF types.
                  elif (size >= 40) and data[0:4] == b'\001\000\000\000':
                      # Check to see if we have the right content type
                      content_type = 'image/x-emf'
                      x, y, r, b = struct.unpack("<LLLL", data[24:40])
                      width = int(r - x)
                      height = int(b - y)
              
                  return content_type, width, height
              
              

              【Comments】:
