【问题标题】:YAML - Serializing attributes which are typesYAML - 序列化作为类型的属性
【发布时间】:2018-01-27 02:49:14
【问题描述】:

我在对具有类型引用作为成员的类进行 YAML 序列化时遇到问题。我正在使用 ruamel.yaml 的安全加载器。

我从 REPL 提示符运行了以下所有内容(以获取多个错误)。

初始化:

import sys
from ruamel.yaml import YAML, yaml_object

Y = YAML(typ="safe",pure=True)

# ==============

@yaml_object(Y)
class A(object):
    """Object I want to serialize"""
    yaml_tag = "!Aclass"
    def __init__(self, type):
        self.type = type
    def f(self):
        return self.type()
    pass

class T1(object):
    """This will be referenced."""
    pass

@yaml_object(Y)
class T2(object):
    """Another referenced object"""
    pass

class T3(object):
    """Yet another try"""
    pass
Y.register_class(T3.__class__)

导致失败的代码:

Y.dump(A(T1), sys.stdout)
Y.dump(A(T2), sys.stdout)
Y.dump(A(T3), sys.stdout)
Y.dump(A(int), sys.stdout)

此输出(仅最后几行回溯):

ruamel.yaml.representer.RepresenterError: cannot represent an object: <attribute '__dict__' of 'T1' objects>
ruamel.yaml.representer.RepresenterError: cannot represent an object: <attribute '__dict__' of 'T2' objects>
ruamel.yaml.representer.RepresenterError: cannot represent an object: <attribute '__dict__' of 'T3' objects>
ruamel.yaml.representer.RepresenterError: cannot represent an object: <slot wrapper '__abs__' of 'int' objects>

任何让我(安全地)唯一保存类型(我需要生成该类型的对象并检查传入对象是否属于某种类型)的解决方案都将不胜感激。函数或生成我所需类型的类也会有同样的问题,即不可序列化。


附:我还可能发现了一个错误,由于某种原因,解析器会根据是否(尝试)序列化相同的有效参数而具有不同的行为。

Y.dump(A(str), sys.stdout)
Y.dump(A(str), sys.stdout)
Y.dump(A(str), sys.stdout)
Y.dump(A(str), sys.stdout)

输出:

>>> Y.dump(A(str), sys.stdout)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\main.py", line 352, in dump
    return self.dump_all([data], stream, _kw, transform=transform)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\main.py", line 383, in dump_all
    self.representer.represent(data)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 73, in represent
    node = self.represent_data(data)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 101, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\main.py", line 552, in t_y
    tag, data, cls, flow_style=representer.default_flow_style)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 371, in represent_yaml_object
    return self.represent_mapping(tag, state, flow_style=flow_style)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 206, in represent_mapping
    node_value = self.represent_data(item_value)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 101, in represent_data
    node = self.yaml_representers[data_types[0]](self, data)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\main.py", line 492, in t_y
    tag, data, cls, flow_style=representer.default_flow_style)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 371, in represent_yaml_object
    return self.represent_mapping(tag, state, flow_style=flow_style)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 206, in represent_mapping
    node_value = self.represent_data(item_value)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 111, in represent_data
    node = self.yaml_representers[None](self, data)
  File "C:\Program Files\Anaconda3\lib\site-packages\ruamel\yaml\representer.py", line 375, in represent_undefined
    raise RepresenterError("cannot represent an object: %s" % data)
ruamel.yaml.representer.RepresenterError: cannot represent an object: <slot wrapper '__add__' of 'str' objects>
>>> Y.dump(A(str), sys.stdout)
!Aclass
type: !type {}
>>> Y.dump(A(str), sys.stdout)
Traceback (most recent call last):
# same traceback here
ruamel.yaml.representer.RepresenterError: cannot represent an object: <slot wrapper '__add__' of 'str' objects>
>>> Y.dump(A(str), sys.stdout)
!Aclass
type: !type {}
>>> 

【问题讨论】:

    标签: python-3.x serialization yaml ruamel.yaml


    【解决方案1】:

    YAML 期望转储对象,并最终通过写出标量字符串来实现。 T1 不是对象(T2T3 也不是),这就是问题所在。您可以尝试将每个类引用到一个对象中并在这些对象上使用标签,但 IMO 只会使事情复杂化。

    最终这一切都归结为获得一个标量表示,即类的字符串表示到文件中,因此您不妨调整A() 直接转储字符串表示并将其读回:

    import sys
    from ruamel.yaml import YAML, yaml_object
    from ruamel.yaml.compat import StringIO
    from ruamel.yaml.scalarstring import DoubleQuotedScalarString
    
    
    Y = YAML(typ="safe", pure=True)
    
    # ==============
    
    @yaml_object(Y)
    class A(object):
        """Object I want to serialize"""
        yaml_tag = "!Aclass"
        def __init__(self, type):
            self.type = type  #.__class__.__name__
    
        @classmethod
        def to_yaml(cls, representer, node):
            return representer.represent_scalar(
                cls.yaml_tag, u'{}'.format(node.type.__name__)
            )
    
        @classmethod
        def from_yaml(cls, constructor, node):
            if '.' in node.value:  # in some other module
                m, n = node.value.rsplit('.', 1)
                return cls(getattr(sys.modules[m], n))
            else:
                return cls(globals()[node.value])
    
    
    class T1(object):
        """This will be referenced."""
        pass
    
    
    @yaml_object(Y)
    class T2(object):
        """Another referenced object"""
        pass
    
    
    class T3(object):
        """Yet another try"""
        pass
    Y.register_class(T3)
    
    
    for t in T1, T2, T3, DoubleQuotedScalarString:
        print('----------------------')
        x = StringIO()
        s = A(t)
        print('s', s.type)
        Y.dump(s, x)
        print(x.getvalue())
    
        d = Y.load(x.getvalue())
        print('d', d.type)
    

    给出:

    ----------------------
    s <class '__main__.T1'>
    !Aclass T1
    ...
    
    d <class '__main__.T1'>
    ----------------------
    s <class '__main__.T2'>
    !Aclass T2
    ...
    
    d <class '__main__.T2'>
    ----------------------
    s <class '__main__.T3'>
    !Aclass T3
    ...
    
    d <class '__main__.T3'>
    ----------------------
    s <class 'ruamel.yaml.scalarstring.DoubleQuotedScalarString'>
    !Aclass DoubleQuotedScalarString
    ...
    
    d <class 'ruamel.yaml.scalarstring.DoubleQuotedScalarString'>
    

    如果A() 上还有其他需要转储/加载的属性,您应该创建一个字典(将字符串转换为.type)并转储/加载它。

    我认为您没有发现真正的错误,但您在错误后继续操作会产生副作用:Y 对象(及其组件)处于未定义状态。捕获错误后,您不应重用 YAML() 实例。这应该在文档中更清楚。所以如果你想在 for 循环中做一个try/except,你应该在try 部分中移动Y = YAML(typ='safe', pure=True)

    【讨论】:

    • 好吧,这似乎工作正常,如果我想保留例如T2 在我的 . 但是,由于我构造了给定类的对象,并且 Y 无法判断(至少,不是从您的 A 实现)加载对象是否安全,我可能正在加载恶意类型我的代码。在这种情况下,即使该类未注册,T1 也已“加载”。我猜,一种可能的解决方法是检查 A.from_yaml() 找到的类是否由 Y 注册(即它是否在 constructor.yaml_constructors 中?)。关于第二部分,这是有道理的。
    • 您对安全性的看法是对的,我在研究答案时想到了这一点,但忘记包括在内。我会将所有相关类型放在一个模块中(或将多个模块放在一个子目录中)并从那里导入它们。然后你可以在Afrom_yaml()中检查node.value中的字符串。您还可以创建自己的@yaml_type,设置您测试的(唯一)属性(在转储和/或加载时)。没有必要重新使用 ruamel.yaml 的对象注册转储(这本质上是别的东西)。
    • 我在我的答案中做了一个小模型,请随时修改它(或您自己的答案)以使其安全。编辑:我在发帖时错过了你的评论,不太清楚你的意思是什么?您的意思是使用 @yaml_type 装饰器在对象中设置隐藏属性吗?是什么阻止了某人对自己的班级做同样的事情并冒充我们的班级?
    【解决方案2】:

    添加到Anthon's answer,我开始修改 A.from_yaml 以更安全,尽管我还没有完成 _check_registered() 的所有案例。这个想法是加载 Y 允许加载实例的所有类型,并阻止所有其他类型。将其视为 WIP:

    import sys
    from ruamel.yaml import YAML, yaml_object
    from ruamel.yaml.compat import StringIO
    from ruamel.yaml.scalarstring import DoubleQuotedScalarString
    
    
    Y = YAML(typ="safe", pure=True)
    
    # ==============
    
    @yaml_object(Y)
    class A(object):
        """Object I want to serialize"""
        yaml_tag = "!Aclass"
        def __init__(self, type):
            self.type = type  #.__class__.__name__
    
        @classmethod
        def to_yaml(cls, representer, node):
            return representer.represent_scalar(
                cls.yaml_tag, u'{}'.format(node.type.__name__)
            )
    
        @classmethod
        def from_yaml(cls, constructor, node):
            if '.' in node.value:  # in some other module
                m, n = node.value.rsplit('.', 1)
                t = getattr(sys.modules[m], n)
            else:
                t = globals()[node.value]
            cls._check_registered(t,constructor, node)
            return cls(t)
    
        @classmethod
        def _check_registered(cls, t, constructor, node):
            # Check if type "t" is registered in "constr"
            # Note: only a very basic check, 
            # and ideally should be made more secure
    
            if hasattr(t,"yaml_tag"):
                if t.yaml_tag in constructor.yaml_constructors: 
    
                    return
                raise Exception("Error: Tag not registered!")
            else:
                #
                raise Exception("Error: No attribute 'yaml_tag'!")
            pass
    
        pass
    
    class T1(object):
        """This will be referenced."""
        yaml_tag = u"!T1"
        pass
    
    
    @yaml_object(Y)
    class T2(object):
        """Another referenced object"""
        yaml_tag = u"!T2"
    
        def __init__(self):
            print("Initializing...")
            pass
        pass
    
    class T2_bad(object):
        """Malicious class impersonating T2"""
        # Note: It's not registered
        yaml_tag = u"!T2"
    
        def __init__(self):
            print("Evil code here!")
            pass
    
        pass
    
    
    class T3(object):
        """Yet another try"""
        yaml_tag = u"!T3"
        pass
    Y.register_class(T3)
    
    
    
    for t in T1, T2, T2_bad, T3, DoubleQuotedScalarString:
        try:
            print('----------------------')
            x = StringIO()
            s = A(t)
            print('s', s.type)
            Y.dump(s, x)
            print(x.getvalue())
            d = Y.load(x.getvalue())
            print('d', d.type)
            d.type()
        except Exception as e:
            print(e)
            continue
        pass
    

    这会返回:

    ----------------------
    s <class '__main__.T1'>
    !Aclass T1
    ...
    
    Error: Tag not registered!
    ----------------------
    s <class '__main__.T2'>
    !Aclass T2
    ...
    
    d <class '__main__.T2'>
    Initializing...
    <__main__.T2 object at 0x0000015B8EC82F60>
    ----------------------
    s <class '__main__.T2_bad'>
    !Aclass T2_bad
    ...
    
    d <class '__main__.T2_bad'>
    Evil code here!
    <__main__.T2_bad object at 0x0000015B8EC82EF0>
    ----------------------
    s <class '__main__.T3'>
    !Aclass T3
    ...
    
    d <class '__main__.T3'>
    <__main__.T3 object at 0x0000015B8EC82E10>
    ----------------------
    s <class 'ruamel.yaml.scalarstring.DoubleQuotedScalarString'>
    !Aclass DoubleQuotedScalarString
    ...
    
    Error: No attribute 'yaml_tag'!
    

    可以看出,它仍然不安全(运行“邪恶代码”),也不允许未定义 yaml_tag 的类型。随意修改以解决此问题。

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 1970-01-01
      • 1970-01-01
      • 2022-01-05
      • 1970-01-01
      • 1970-01-01
      • 2020-02-19
      • 1970-01-01
      • 1970-01-01
      相关资源
      最近更新 更多