【Question title】: Load class instance from pickle file during instantiation with HIGHEST_PROTOCOL
【Posted】: 2021-12-18 20:18:45
【Question description】:

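Goal

  • When I create an instance of ClassA:
    • If the pickle file exists, load the instance from the pickle file
    • If the pickle file does not exist, create the instance from scratch
  • Use pickle HIGHEST_PROTOCOL.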

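Failure

Someone reported the same problem in a comment here. The problem is also described in this post: if the protocol is > 1, pickle calls __new__ while loading, and because __new__ itself calls load, this creates an infinite recursion.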
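
The protocol dependence can be checked in isolation. The following is a minimal sketch, not part of the original question: the class name Probe and the calls list are hypothetical names used only to record whether a user-defined __new__ runs during unpickling.

```python
import pickle

calls = []  # hypothetical log of __new__ invocations


class Probe:
    def __new__(cls):
        calls.append('__new__')
        return super().__new__(cls)

    def __init__(self):
        self.value = 1


obj = Probe()

# Protocol 0: the instance is rebuilt without going through Probe.__new__.
calls.clear()
pickle.loads(pickle.dumps(obj, protocol=0))
calls_protocol_0 = list(calls)

# Highest protocol: the unpickler calls Probe.__new__(Probe) to rebuild it.
calls.clear()
pickle.loads(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))
calls_highest = list(calls)

print('protocol 0:', calls_protocol_0)  # protocol 0: []
print('highest:', calls_highest)        # highest: ['__new__']
```

If __new__ in turn calls pickle.load, as in the code that follows, every load with a protocol above 1 re-enters __new__, which matches the recursion reported above.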

# module_a.py
import os
import pickle
# import dill as pickle

save_path = r'C:\tests\pickle_tests\saved_instance_of_a.pkl'

def load(path):
    with open(path, 'rb') as f:
        return pickle.load(f)

def dump(x, path):
    with open(path, 'wb') as f:
        pickle.dump(
            x, f,
            protocol=pickle.HIGHEST_PROTOCOL)

class ClassA:
    def __new__(cls):
        print('__new__ called')

        if os.path.isfile(save_path):
            print('The saved pickle exists: loading from file.')
            instance = load(save_path)

        else:
            print('The saved pickle does not exist: creating.')
            instance = super(ClassA, cls).__new__(cls)

        return instance

    def __init__(self):
        print('__init__ called')
        if not os.path.isfile(save_path):
            self.my_dict = {'pi': 3.14}
            dump(self, save_path)

# myprogram.py
import os
import module_a

if __name__ == '__main__':
    instance_a = module_a.ClassA()
    print(instance_a.my_dict)

First run OK (the instance is created from scratch):

$ python myprogram.py
__new__ called
The saved pickle does not exist: creating.
__init__ called
{'pi': 3.14}

Second run fails (the instance is loaded from the pickle):

$ python myprogram.py
__new__ called
The saved pickle exists: loading from file.
__new__ called
The saved pickle exists: loading from file.
__new__ called
The saved pickle exists: loading from file.
__new__ called
The saved pickle exists: loading from file.
__new__ called
...
  File "C:\tests\pickle_tests\module_a.py", line 19, in __new__
    print('__new__ called')
  File "C:\anaconda\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
RecursionError: maximum recursion depth exceeded while calling a Python object

Current workaround

The code above works if I replace protocol=pickle.HIGHEST_PROTOCOL with protocol=0, but I do not want to use protocol 0 (or 1). I want to use protocol=pickle.HIGHEST_PROTOCOL.

First run OK:

$ python myprogram.py
__new__ called
The saved pickle does not exist: creating.
__init__ called
{'pi': 3.14}

Second run OK:

$ python myprogram.py
__new__ called
The saved pickle exists: loading from file.
__init__ called
{'pi': 3.14}

【Question discussion】:

    Tags: python pickle instantiation


    【Solution 1】:

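    Since pickle calling __new__ is what causes the problem, a simple workaround that still allows HIGHEST_PROTOCOL is to not define __new__ yourself and do everything in the __init__() method instead.

    Here is one way to do it: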

    myprogram.py:

    import os
    import module_a
    
    if __name__ == '__main__':
        instance_a = module_a.ClassA()
        print(instance_a.my_dict)
    

    module_a.py:

    import os
    import pickle
    
    SAVE_PATH = r'C:\tests\pickle_tests\saved_instance_of_a.pkl'
    
    def load(path):
        with open(path, 'rb') as f:
            return pickle.load(f)
    
    def dump(x, path):
        with open(path, 'wb') as f:
            pickle.dump(x, f, protocol=pickle.HIGHEST_PROTOCOL)
    
    
    class ClassA:
        def __init__(self):
            print('__init__ called')
            if os.path.isfile(SAVE_PATH):
                print('  loading from pickle file.')
                self.__dict__ = load(SAVE_PATH)
            else:
                print('  creating from scratch.')
                self.my_dict = {'pi': 3.14}
                dump(self.__dict__, SAVE_PATH)
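
A quick way to check this approach without touching the hard-coded path is to run it twice against a temporary directory. This is only a sketch under stated assumptions: the class name Cached, its path argument, and the source attribute are hypothetical additions for illustration and are not part of the answer's ClassA.

```python
import os
import pickle
import tempfile


class Cached:
    """Hypothetical variant of ClassA that takes the pickle path as an argument."""

    def __init__(self, path):
        if os.path.isfile(path):
            # Restore the saved attributes; no __new__ override is involved.
            with open(path, 'rb') as f:
                self.__dict__ = pickle.load(f)
            self.source = 'pickle'
        else:
            self.my_dict = {'pi': 3.14}
            # Only the attribute dictionary is pickled, not the instance.
            with open(path, 'wb') as f:
                pickle.dump(self.__dict__, f, protocol=pickle.HIGHEST_PROTOCOL)
            self.source = 'scratch'


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'saved_instance.pkl')
    first = Cached(path)   # file missing: created from scratch and saved
    second = Cached(path)  # file present: attributes loaded from the pickle

print(first.source, first.my_dict)    # scratch {'pi': 3.14}
print(second.source, second.my_dict)  # pickle {'pi': 3.14}
```

Because the pickled object is a plain dict, unpickling never calls the class's own constructor, so the protocol no longer matters.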
    

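    【Discussion】:

    • Thanks @Martineau. I thought I needed __new__ because self = load(SAVE_PATH) in __init__ failed. Your solution dumps/loads the instance's __dict__ rather than the instance itself, which had not occurred to me. I am wondering whether __dict__ is enough to capture everything about the saved instance (__dict__ = "a dictionary or other mapping object used to store an object's (writable) attributes")? If so, this is a good solution.
    • self = load(SAVE_PATH) does not work because all it does is assign a different value to the local variable name self (i.e. it does not do what you think it does). This trick will not work for immutable classes that must have a __new__() method (such as subclasses of tuple).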
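
The point about rebinding self can be illustrated with a small sketch. The class names Rebind and UpdateDict are hypothetical, and a literal dict stands in for the result of load(SAVE_PATH).

```python
class Rebind:
    def __init__(self):
        self.tag = 'original'
        # Only rebinds the local name "self"; the instance being
        # initialized is left unchanged.
        self = {'tag': 'replacement'}


class UpdateDict:
    def __init__(self):
        self.tag = 'original'
        # Replaces the instance's attribute dictionary, so the change
        # is visible on the object the caller receives.
        self.__dict__ = {'tag': 'replacement'}


a = Rebind()
b = UpdateDict()
print(a.tag)  # original
print(b.tag)  # replacement
```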