【Question title】: python: how to write a generic file reader with format plugins
【Posted】: 2011-03-30 17:46:45
【Question】:

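I am trying to write a generic reader for the various medical image formats that we come across. I thought, let's learn from the pros, and went to imitate how PIL (the "Python Imaging Library") generically reads files.

As I understand it, PIL has an open function that loops over a list of possible accept functions. When one of them succeeds, it uses the related factory function to instantiate the appropriate object.

So I went to do this, and my (stripped-down) effort is here: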


pluginID = []     # list of all registered plugin IDs
OPEN = {}         # plugins have open and (maybe) accept functions as a tuple

_initialized = False

import os, sys

def moduleinit():
    '''Explicitly initializes the library.  This function 
    loads all available file format drivers.

    This routine has been lifted from PIL, the Python Image Library'''

    global _initialized
    global pluginID
    if _initialized:
        return 

    visited = {}

    directories = sys.path

    try:
        directories = directories + [os.path.dirname(__file__)]
    except NameError:
        pass

    # only check directories (including current, if present in the path)
    for directory in filter(isDirectory, directories):
        fullpath = os.path.abspath(directory)
        if fullpath in visited:
            continue
        for file in os.listdir(directory):
            if file.endswith("TestReaderPlugin.py"):
                f, e = os.path.splitext(file)
                try:
                    sys.path.insert(0, directory)
                    try: # FIXME: this will not reload and hence pluginID 
                        # will be unpopulated leading to "cannot identify format"
                        __import__(f, globals(), locals(), [])
                    finally:
                        del sys.path[0]
                except ImportError:
                    print("%s: %s" % (f, sys.exc_info()[1]))
        visited[fullpath] = None

    if OPEN:
        _initialized = True
        return 1

class Reader:
    '''Base class for image file format handlers.'''
    def __init__(self, fp=None, filename=None):

        self.filename = filename

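        if fp is not None:
            self.fp = fp  # reuse the handle that accept() already opened
        elif isStringType(filename):
            import io  # io.open sidesteps the module-level open() defined below
            self.fp = io.open(filename) # attempt opening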

        # this may fail if not implemented
        self._open() # unimplemented in base class but provided by plugins

    def _open(self):
        raise NotImplementedError(
            "Reader subclass must implement _open"
            )


# this is the generic open that tries to find the appropriate handler
def open(fp):
    '''Probe an image file

    Supposed to attempt all opening methods that are available. Each 
    of them is supposed to fail quickly if the filetype is invalid for its 
    respective format'''

    filename=fp

    moduleinit() # make sure we have access to all the plugins

    for i in pluginID:
        try:
            factory, accept = OPEN[i]
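            if accept:
                # accept is expected to either return None (if unsuccessful)
                # or hand back a file handle to be used for opening.
                # Keep the result in its own name so that a rejection by one
                # plugin does not clobber the argument passed to the next one.
                handle = accept(filename)
                if handle:
                    handle.seek(0)
                    return factory(handle, filename=filename)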
        except (SyntaxError, IndexError, TypeError): 
                pass # I suppose that factory is allowed to have these 
                # exceptions for problems that weren't caught with accept()
                # hence, they are simply ignored and we try the other plugins

    raise IOError("cannot identify format")

# --------------------------------------------------------------------
# Plugin registry

def register_open(id, factory, accept=None):
    pluginID.append(id)
    OPEN[id] = factory, accept

# --------------------------------------------------------------------
# Internal:

# type stuff
def isStringType(t):
    return isinstance(t, str)

def isDirectory(f):
    '''Checks if an object is a string, and that it points to a directory'''
    return isStringType(f) and os.path.isdir(f)

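The important bit behind the scenes is the registration of all format plugins upon the first attempt to open a file (moduleinit). Every eligible plugin must be in an accessible path and be named *TestReaderPlugin.py. It will get (dynamically) imported. Each plugin module has to call register_open to provide an ID, a factory that creates the reader object, and an accept function that tests candidate files.

An example plugin looks like this: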


import TestReader

def _accept(filename):
    fp=open(filename,"r")
    # we made it here, so let's just accept this format
    return fp

class exampleTestReader(TestReader.Reader):
    format='example'

    def _open(self):
        self.data = self.fp.read()

TestReader.register_open('example', exampleTestReader, accept=_accept)

TestReader.open() is the function that a user will use:

import TestReader
a=TestReader.open(filename) # easy

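So, what is the problem? First, I am still on the hunt for the pythonic way. Is this it? My reason to doubt it is that the magic in the moduleinit stage looks messy. It is copied straight from PIL. Main problem: if you reload(TestReader), it will all stop working, because pluginID gets re-initialized to [] but the plugins do not get reloaded.

Is there a better way to set up a generic reader that
1. allows a simple open(filename) call for all formats,
2. requires only nicely encapsulated plugins for whatever format you want, and
3. works on reload?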

【Comments】:

    Tags: python image-processing polymorphism filereader


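    【Answer 1】:

    Some guidelines:

    1. Use the concept of a "peek" into a buffer to test whether it holds data you can understand.
    2. The name of the importer is something a user does not want to know (what if you have 100 importers?). Use a "facade" interface: medicimage.open(filepath).
    3. To work on reload you have to implement a little bit of logic; there are examples out there of how to achieve that.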
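
Guidelines 1 and 2 could be sketched roughly as follows. This is only an illustration, not PIL's actual implementation: the names open_image, PEEK_SIZE, _PLUGINS and ExampleReader, as well as the b"EXMP" signature, are made up for the example. The facade reads a short prefix once, rewinds, and hands the prefix to each plugin's accept function, so no plugin has to open the file itself.

```python
import io

PEEK_SIZE = 16   # assumption: enough leading bytes for every plugin's signature
_PLUGINS = {}    # format id -> (accept, factory)


def register_open(format_id, factory, accept):
    """Register a plugin; accept(prefix) must return True or False quickly."""
    _PLUGINS[format_id] = (accept, factory)


def open_image(source):
    """Facade: takes a path or a binary file object; callers never name a plugin."""
    opened_here = isinstance(source, str)
    fp = io.open(source, "rb") if opened_here else source
    prefix = fp.read(PEEK_SIZE)  # peek at the header once ...
    fp.seek(0)                   # ... then rewind for the chosen factory
    for accept, factory in _PLUGINS.values():
        if accept(prefix):
            return factory(fp)
    if opened_here:
        fp.close()
    raise IOError("cannot identify format")


class ExampleReader(object):
    """Hypothetical plugin for a made-up format whose files start with b'EXMP'."""
    format = "example"

    def __init__(self, fp):
        self.data = fp.read()


register_open("example", ExampleReader,
              lambda prefix: prefix.startswith(b"EXMP"))
```

In a package such as the medicimage mentioned above, open_image would be exposed as medicimage.open. Passing only the peeked bytes to accept keeps each format test cheap and free of side effects on the file position.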

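    【Comments】:

    • I think I am doing 1 and 2. (1) depends on how the accept method is implemented. The example given simply accepts any format without even peeking, but it could peek if needed. (2) is achieved by having TestReader.open be the facade interface you mention. Could you be more specific about (3)? Two options come to mind: (a) check whether _initialized is already defined and, if so, do not reset pluginID; (b) some variant of a deep reload. Neither of these sounds very pythonic, hence the question...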
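
Option (a) from the comment above might look like the sketch below. It assumes Python 3, where reload lives in importlib, and it writes a throwaway module named reload_demo_reader (a made-up name) to a temporary directory so that a real reload can be demonstrated. The idea relies on the documented behaviour that reload re-executes the module's code in the existing module namespace, so a registry that is only created when it is not yet defined survives.

```python
import importlib
import os
import shutil
import sys
import tempfile

MODULE_SOURCE = '''
try:
    OPEN                # still bound from the previous import when reloading
except NameError:
    OPEN = {}           # first import only

def register_open(id, factory, accept=None):
    OPEN[id] = (factory, accept)   # keyed by id, so re-registering is harmless
'''

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "reload_demo_reader.py"), "w") as handle:
    handle.write(MODULE_SOURCE)

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()
try:
    import reload_demo_reader
    reload_demo_reader.register_open("example", object)
    reloaded = importlib.reload(reload_demo_reader)
    # The plugin registered before the reload is still present afterwards.
    survived = "example" in reloaded.OPEN
finally:
    sys.path.remove(tmpdir)
    shutil.rmtree(tmpdir)
```

Using a single dict keyed by plugin ID, rather than a separate pluginID list, also means that a plugin that does get re-imported simply overwrites its own entry instead of being appended twice.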