【Question title】: xQuery to change node hierarchies (remove a child from one node and return it as sibling)
【Posted】: 2014-06-16 00:31:51
【Question】:

I have an XML document that looks like this:

<dict>
    <word>
        <sense>
            <definition> This is the text of the definition. 
                <example>
                    <quote>This is the text of an example.</quote>
                </example>
                <source>
                    <place>This is the name of the place recorded</place>
                </source>. 
            </definition>
        </sense>
    </word>
</dict>

I need to transform it using xQuery so that <example> and its children become siblings of <definition>, whereas <source> and its children should become children of <example>. In other words, I need this as output:

<word>
    <sense>
        <definition> This is the text of the definition. </definition>
        <example>
            <quote>This is the text of an example.</quote>
            <source>
                <place>This is the name of the place recorded.</place>
            </source>
        </example>
    </sense>
</word>

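As you can see, there is also a problem with the full stop that follows the original <source> element: it needs to become the last piece of text before the closing </place> tag.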

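I've created an xQuery file and figured out how to remove the elements from the hierarchy, but I am having trouble recursively processing the nodes and adding new elements in the same function.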

xquery version "3.0";
declare namespace saxon="http://saxon.sf.net/";
declare option saxon:output "indent=yes";
declare option saxon:output "saxon:indent-spaces=3";


declare function local:test($node as item()*) as item()* {
    typeswitch($node)
        case text() return normalize-space($node)
        case element(word) return <word>{local:recurse($node)}</word>
        case element(dict) return <dict>{local:recurse($node)}</dict>
        case element(sense) return <sense>{local:recurse($node)}</sense>
        case element(definition) return local:definition($node)
        case element(example) return local:example($node)
        case element(source) return local:source($node)
        case element(place) return <place>{local:recurse($node)}</place>
        default return local:recurse($node)
};

declare function local:definition($nodes as item()*) as item()*{

(: here I need to process children of definition - except <source> and its
children will become children of <example>; and <example> should be returned 
as a next sibling of definition. THIS IS THE PART THAT I DON'T KNOW HOW TO DO :)

<definition>
{
 for $node in $nodes/node()
    return
        local:test($node)
}
</definition>

};

declare function local:example($node as item()*) as item()* {
(: here i am removing <example> because I don't want it to be a child
of <definition> any more. THIS BIT WORKS AS IT SHOULD :)

if ($node/parent::definition) then ()
   else <example>{$node/@*}{local:recurse($node)}</example>
};

declare function local:source($node as item()*) as item()* {
(: here i am removing <source> because I don't want it to be a child
of <definition> any more.  :)

if ($node/parent::definition) then ()
   else <source>{$node/@*}{local:recurse($node)}</source>
};


declare function local:recurse($nodes as item()*) as item()* {
    for $node in $nodes/node()
    return
        local:test($node)
};


local:test(doc("file:test.xml"))

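This shouldn't be a very difficult thing to do, but I am having conceptual difficulties with how xQuery handles this kind of problem. I'd be very grateful for your help.

XSLT is not an option.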

【Question comments】:

    Tags: xml xpath recursion xml-parsing xquery


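    【Solution 1】:

    For the sake of completeness, here is a recursive XQuery 1.0 solution with just one recursive function. I agree with Jens that the given example can easily be handled without recursion, but if the actual input is bigger and you don't have XQuery Update at your disposal, you could try the following: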

    declare function local:recurse($node as item()*) as item()* {
        typeswitch($node)
            case text()
                return normalize-space($node)
            case element(definition)
                return element {node-name($node)} {
                    $node/@*,
                    local:recurse($node/node() except $node/(example|source))
                }
            case element(sense)
                return element {node-name($node)} {
                    $node/@*,
                    local:recurse($node/node()),
                    <example>{
                        $node/definition/example/@*,
                        $node/definition/example/node(),
                        $node/definition/source
                    }</example>
                }
            case element()
                return element {node-name($node)} {
                    $node/@*,
                    local:recurse($node/node())
                }
            default return $node
    };
    
    
    let $xml :=
    <dict>
        <word>
            <sense>
                <definition> This is the text of the definition. 
                    <example>
                        <quote>This is the text of an example.</quote>
                    </example>
                    <source>
                        <place>This is the name of the place recorded.</place>
                    </source>
                </definition>
            </sense>
        </word>
    </dict>
    return local:recurse($xml)
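
One thing to watch when adapting the function above to a real file: a `typeswitch` case such as `text()` or `element()` matches exactly one item, so a call that passes several child nodes at once falls through to `default` and returns them unchanged. The inline example avoids this only because each element there has a single child once boundary whitespace is stripped. The following is a minimal sketch of a variant that visits children one at a time. The function name `local:restructure`, the `document-node()` branch, and the guard around `<example>` are assumptions added for illustration, not part of the original answer, and text nodes (including whitespace) are copied as they are.

```xquery
xquery version "1.0";

(: Sketch only: same idea as local:recurse above, but each child is
   passed to the function individually. :)
declare function local:restructure($node as node()) as node()* {
  typeswitch ($node)
    (: Assumption: step through the document node so that the result
       of doc(...) can be passed in directly. :)
    case document-node()
      return
        for $child in $node/node()
        return local:restructure($child)
    (: <definition> keeps everything except <example> and <source>. :)
    case element(definition)
      return element { node-name($node) } {
        $node/@*,
        for $child in $node/node() except $node/(example | source)
        return local:restructure($child)
      }
    (: <sense> gets the rebuilt <example> appended after its children. :)
    case element(sense)
      return element { node-name($node) } {
        $node/@*,
        for $child in $node/node()
        return local:restructure($child),
        (: Assumption: only emit <example> when there is content for it. :)
        if ($node/definition/(example | source)) then
          <example>{
            $node/definition/example/@*,
            $node/definition/example/node(),
            $node/definition/source
          }</example>
        else ()
      }
    (: Any other element is rebuilt with its children processed. :)
    case element()
      return element { node-name($node) } {
        $node/@*,
        for $child in $node/node()
        return local:restructure($child)
      }
    (: Text, comments and processing instructions are copied as they are. :)
    default return $node
};

local:restructure(doc("file:test.xml"))
```

Like the original, this sketch does not move the misplaced full stop; it only changes how the tree is walked.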
    

    HTH!

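    【Comments】:

      【Solution 2】:

      I would go for XQuery Update, which is also supported by Saxon and makes this much easier. The query below works on a copy of the input file, but with only minor modifications you can also change the original document directly.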

      (: Copy the input file :)
      copy $result := doc("file:test.xml")
      modify (
        for $definition in $result//definition
        return (
          (: Create new example element, and add it after the definition :)
          insert node element example {
            $definition/example/quote,
            $definition/source
          } after $definition,
          (: Throw away the old elements :)
          delete nodes $definition/(example, source)
        )
      )
      return $result/dict/word
      

      Be aware that this does not fix the broken input with the misplaced full stop, but I also don't see any approach to that in your code.
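
As a rough illustration of changing the original document directly, the `copy ... modify ... return` wrapper can be dropped so that the updating expressions target the nodes of the source document itself. This is a sketch under the assumption that the processor has XQuery Update enabled; whether and how the pending changes are written back to `test.xml` depends on the processor and its configuration.

```xquery
(: Sketch: updating query without copy/modify. It returns no result;
   it only produces pending updates against the source document. :)
for $definition in doc("file:test.xml")//definition
return (
  (: Build the new <example> from the old pieces and place it after <definition> :)
  insert node element example {
    $definition/example/quote,
    $definition/source
  } after $definition,
  (: Remove the old <example> and <source> from inside <definition> :)
  delete nodes $definition/(example, source)
)
```

The inserted content is copied when the query is evaluated, so deleting the old `<example>` and `<source>` in the same query does not affect the new `<example>`.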

      If you prefer a version without update statements, there is still no need for the complicated approach using recursive functions:

      for $word in doc("file:test.xml")/dict/word
      return element word {
        for $sense in $word/sense
        return element sense {
          for $definition in $sense/definition
          return (
            element definition { $definition/text() },
            element example { $definition/(example/quote, source) }
          )
        }
      }
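
For the misplaced full stop, one possible approach is to collect the text that follows `<source>` inside `<definition>` and append it to the text of `<place>`. The sketch below extends the non-updating query above; the variable name `$trailing` is hypothetical, and it assumes that `<place>` contains only text and that everything after `<source>` belongs at the end of `<place>`.

```xquery
for $word in doc("file:test.xml")/dict/word
return element word {
  for $sense in $word/sense
  return element sense {
    for $definition in $sense/definition
    (: Assumption: text after <source> (here ". ") is the stray punctuation :)
    let $trailing :=
      normalize-space(
        string-join($definition/source/following-sibling::text(), "")
      )
    return (
      (: Keep the definition text, minus the stray text after <source> :)
      element definition {
        $definition/text() except $definition/source/following-sibling::text()
      },
      element example {
        $definition/example/quote,
        for $source in $definition/source
        return element source {
          for $place in $source/place
          (: Append the collected punctuation to the place name :)
          return element place {
            concat(normalize-space($place), $trailing)
          }
        }
      }
    )
  }
}
```

If a `<definition>` has no `<source>`, `$trailing` is the empty string and nothing is appended.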
      

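      【Comments】:

      • Thank you very much, Jens. My actual code is more complicated than what I showed above, and I can't use xQueryUpdate in this particular case, but both of your suggestions work and I am very grateful for them.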