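【Question title】: Generating anchors with PyYAML.dump()?
【Posted】: 2013-08-06 14:08:06
【Question】:

I'd like to be able to generate anchors in the YAML produced by PyYAML's dump() function. Is there a way to do this? Ideally, the anchors would have the same name as the YAML nodes.

Example: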

import yaml
yaml.dump({'a': [1,2,3]})
'a: [1, 2, 3]\n'

What I'd like to be able to generate is YAML such as:

import yaml
yaml.dump({'a': [1,2,3]})
'a: &a [1, 2, 3]\n'

Can I write a custom emitter or dumper to do this? Is there another way?

【Comments】:

    Tags: yaml pyyaml cross-reference


    【Answer 1】:

    By default, anchors are only emitted when a reference to a previously seen object is detected:

    >>> import yaml
    >>>
    >>> foo = {'a': [1,2,3]}
    >>> doc = (foo,foo)
    >>>
    >>> print(yaml.safe_dump(doc, default_flow_style=False))
    - &id001
      a:
      - 1
      - 2
      - 3
    - *id001
    

    If you want to override how the anchors are named, you will have to customize the Dumper class, specifically its generate_anchor() method. ANCHOR_TEMPLATE may also be useful.
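As a minimal sketch of the ANCHOR_TEMPLATE route: the class name `SharedAnchorDumper`, the `shared%03d` template, and the sample data are made up for illustration. The template is formatted with a running counter, so this changes the anchor's prefix but still does not tie it to a key name.

```python
import yaml


class SharedAnchorDumper(yaml.SafeDumper):
    # ANCHOR_TEMPLATE is formatted with a running counter; the stock value is 'id%03d'.
    # 'shared%03d' is an arbitrary replacement chosen for this example.
    ANCHOR_TEMPLATE = 'shared%03d'


items = [1, 2, 3]
# The same list object appears twice, so the dumper emits an anchor and an alias.
text = yaml.dump({'a': items, 'b': items}, Dumper=SharedAnchorDumper)
print(text)
```

The anchor is still only generated because the list is referenced twice; a value that appears once gets no anchor with this approach.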

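    In your example the node name is straightforward, but you need to take into account the many possibilities for YAML values. For instance, a key could be a sequence rather than a single value: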

    >>> import yaml
    >>>
    >>> foo = {('a', 'b', 'c'): [1,2,3]}
    >>> doc = (foo,foo)
    >>>
    >>> print(yaml.dump(doc, default_flow_style=False))
    !!python/tuple
    - &id001
      ? !!python/tuple
      - a
      - b
      - c
      : - 1
        - 2
        - 3
    - *id001
    

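    【Comments】:

      【Answer 2】:

      This is not so easy, unless the data you want to use for the anchor is inside the node. The anchor gets attached to the node's contents (in your example, "[1,2,3]"), and the node does not know that this value is associated with the key "a".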

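      import yaml

      l = [1, 2, 3]
      foo = {'a': l, 'b': l}  # the same list object is referenced twice

      class SpecialAnchor(yaml.Dumper):

          def generate_anchor(self, node):
              # Log the node that needs an anchor, then defer to the default naming
              print('Generating anchor for {}'.format(str(node)))
              anchor = super().generate_anchor(node)
              print('Generated "{}"'.format(anchor))
              return anchor

      # default_flow_style=None reproduces the flow-style output shown below on PyYAML 5.1+
      y1 = yaml.dump(foo, Dumper=SpecialAnchor, default_flow_style=None)
      print(y1)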
      

      gives you:

      Generating anchor for SequenceNode(tag='tag:yaml.org,2002:seq', value=[ScalarNode(tag='tag:yaml.org,2002:int', value='1'), ScalarNode(tag='tag:yaml.org,2002:int', value='2'), ScalarNode(tag='tag:yaml.org,2002:int', value='3')])
      Generated "id001"
      a: &id001 [1, 2, 3]
      b: *id001
      

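      So far I haven't found a way to get at the key 'a' given the node...

      【Comments】:

      • Did you mean Dumper=SpecialAnchor?
      【Answer 3】:

      I wrote a custom anchor class to force an anchor value for top-level nodes. It does not simply override the anchor string (via generate_anchor); it actually forces the anchor to be emitted, even if the node is never referenced later: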

      import yaml

      class CustomAnchor(yaml.Dumper):
          def __init__(self, *args, **kwargs):
              super(CustomAnchor, self).__init__(*args, **kwargs)
              self.depth = 0
              self.basekey = None
              self.newanchors = {}
      
          def anchor_node(self, node):
              self.depth += 1
              if self.depth == 2:
                  assert isinstance(node, yaml.ScalarNode), "yaml node not a string: %s" % node
                  self.basekey = str(node.value)
                  node.value = self.basekey + "_ALIAS"
              if self.depth == 3:
                  assert self.basekey, "could not find base key for value: %s" % node
                  self.newanchors[node] = self.basekey
              super(CustomAnchor, self).anchor_node(node)
              if self.newanchors:
                  self.anchors.update(self.newanchors)
                  self.newanchors.clear()
      

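      Note that I override the node name so that it gets an "_ALIAS" suffix, but you can remove that line to keep the node name and the anchor name identical, or change it to something else.

      For example, dumping {'FOO': 'BAR'} results in:

      FOO_ALIAS: &FOO BAR

      Also, I only wrote it to handle a single top-level key/value pair at a time, and it only forces an anchor for the top-level key. If you want to turn a dict into a YAML file where every key is a top-level YAML node, you need to iterate over the dict and dump each key/value pair as {key: value}, or rewrite the class to handle dicts with multiple keys.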
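The per-pair approach described above might look like the sketch below. The class is copied from this answer so that the block runs on its own; the helper `dump_each_pair` and the sample data are hypothetical names introduced for illustration. Each call to `yaml.dump` creates a fresh dumper, so the depth counter starts from zero for every pair.

```python
import yaml


class CustomAnchor(yaml.Dumper):
    """Copy of the class from this answer: anchors the value of a single top-level key."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.depth = 0
        self.basekey = None
        self.newanchors = {}

    def anchor_node(self, node):
        self.depth += 1
        if self.depth == 2:  # the top-level key
            assert isinstance(node, yaml.ScalarNode), "yaml node not a string: %s" % node
            self.basekey = str(node.value)
            node.value = self.basekey + "_ALIAS"
        if self.depth == 3:  # the value that belongs to that key
            assert self.basekey, "could not find base key for value: %s" % node
            self.newanchors[node] = self.basekey
        super().anchor_node(node)
        if self.newanchors:
            self.anchors.update(self.newanchors)
            self.newanchors.clear()


def dump_each_pair(data):
    """Hypothetical helper: dump every pair with its own dumper and join the results."""
    return "".join(yaml.dump({key: value}, Dumper=CustomAnchor) for key, value in data.items())


text = dump_each_pair({'FOO': 'BAR', 'BAZ': [1, 2]})
print(text)
```

Joining the fragments yields one mapping with an anchor per top-level key. This relies on every fragment being a block-style mapping at the same indentation, which is an assumption about the dump settings rather than something the class enforces.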

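      【Comments】:

        【Answer 4】:

        This question is quite old, and aaa90210 already gave some good pointers in his answer, but the class provided there didn't really do what I wanted, and I don't think it generalizes well.

        I tried to come up with a dumper that allows adding anchors and makes sure the corresponding aliases are created when the key comes up again later in the file.

        This is by no means fully featured, and it could probably be made safer, but I hope it can serve as inspiration for others: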

        import yaml
        from typing import Dict
        
        
        class CustomAnchor(yaml.Dumper):
            """Custom Dumper class to create anchors for keys throughout the YAML file.
        
            Attributes:
                added_anchors: mapping of key names to the node objects representing their value, for nodes that have an anchor
            """
        
            def __init__(self, *args, **kwargs):
                """Initialize class.
        
                We call the constructor of the parent class.
                """
                super().__init__(*args, **kwargs)
                self.filter_keys = ['a', 'b']
                self.added_anchors: Dict[str, yaml.ScalarNode] = {}
        
            def anchor_node(self, node):
                """Override method from parent class.
        
                This method first checks if the node contains the keys of interest. If anchors already exist for these keys,
                it replaces the reference to the value node with the one that the anchor points to. If no anchor exists for
                those keys, it creates one and keeps a reference to the value node in the ``added_anchors`` attribute.
        
                Args:
                    node (yaml.Node): the node being processed by the dumper
                """
                if isinstance(node, yaml.MappingNode):
                    # let's check through the mapping to find keys which are of interest
                    for i, (key_node, value_node) in enumerate(node.value):
                        if (
                            isinstance(key_node, yaml.ScalarNode)
                            and key_node.value in self.filter_keys
                        ):
                            if key_node.value in self.added_anchors:  # anchor exists
                                # replace value node to tell the dumper to create an alias
                                node.value[i] = (key_node, self.added_anchors[key_node.value])
                            else:  # no anchor yet exists but we need to create one
                                self.anchors.update({value_node: key_node.value})
                                self.added_anchors[key_node.value] = value_node
                super().anchor_node(node)
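A possible way to exercise this dumper is sketched below. The class is a condensed copy of the one above (docstrings trimmed) so that the block runs on its own, and the sample data is made up for illustration. It sticks to scalar values, since collection values were not exercised here. Note that the second occurrence of a filtered key is swapped for an alias to the first value, so the example uses the same value in both places.

```python
import yaml


class CustomAnchor(yaml.Dumper):
    """Condensed copy of the dumper above, with the docstrings trimmed."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.filter_keys = ['a', 'b']
        self.added_anchors = {}

    def anchor_node(self, node):
        if isinstance(node, yaml.MappingNode):
            for i, (key_node, value_node) in enumerate(node.value):
                if isinstance(key_node, yaml.ScalarNode) and key_node.value in self.filter_keys:
                    if key_node.value in self.added_anchors:
                        # Reuse the node seen earlier so that an alias is emitted
                        node.value[i] = (key_node, self.added_anchors[key_node.value])
                    else:
                        # Register the anchor under the key's own name
                        self.anchors.update({value_node: key_node.value})
                        self.added_anchors[key_node.value] = value_node
        super().anchor_node(node)


# Made-up sample data: the key 'a' occurs in two different mappings.
data = {'first': {'a': 1}, 'second': {'a': 1}}
text = yaml.dump(data, Dumper=CustomAnchor)
print(text)
```

With this input, the first `a` should carry the anchor `&a` and the second should become the alias `*a`.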
        
        

        【Comments】:

          【Answer 5】:

          I couldn't get @beeb's answer to run at all, so I went ahead and tried to generalize @aaa90210's answer instead:

          import yaml
          
          class _CustomAnchor(yaml.Dumper):
            anchor_tags = {}
            def __init__(self,*args,**kwargs):
              super().__init__(*args,**kwargs)
              self.new_anchors = {}
              self.anchor_next = None
            def anchor_node(self, node):
              if self.anchor_next is not None:
                self.new_anchors[node] = self.anchor_next
                self.anchor_next = None
              if isinstance(node.value, str) and node.value in self.anchor_tags:
                self.anchor_next = self.anchor_tags[node.value]
          
              super().anchor_node(node)
          
              if self.new_anchors:
                self.anchors.update(self.new_anchors)
                self.new_anchors.clear()
          def CustomAnchor(tags):
            return type('CustomAnchor', (_CustomAnchor,), {'anchor_tags': tags})
          
          # Sample data; `foo` was not defined in the original snippet
          foo = {'a': [1, 2, 3]}
          print(yaml.dump(foo, Dumper=CustomAnchor({'a': 'a_name'})))
          

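          This doesn't provide a way to tell apart two nodes that have the same value as their name. That would need a YAML equivalent of XML's XPath, which I don't see in pyyaml :(


          The class factory CustomAnchor lets you pass in a dictionary of anchors keyed by node value: {value: anchor_name}

          【Comments】: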
