【Question title】: XPath sorting not persistent?
【Posted】: 2023-03-14 00:51:02
【Question】:

I have the following XML:

<doc>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
  </ActivityNarrativeInformation>
 <ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
  </ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>486</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>It was a dark and stormy night; the rain fell in torrents--except at occasional intervals, when
 </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>488</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>scene lies), rattling along the housetops, and fiercely agitating the scanty flame of the lamps that
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>487</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>was checked by a violent gust of wind which swept up the streets (for it is in London that our
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>987654321</ActivityID>
  <ActivityNarrativeInformationID>222222222</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>489</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>struggled against the darkness.
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31921</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Papa Bear was very big and growly. Mamma Bear was middle-sized and pleasant.
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31923</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Papa bear loved to fix things around the house; Mama bear loved to grow flowers in her garden; and, Baby bear loved playing in the yard. They were very happy. </ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31920</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>Once upon a time there were three bears, Papa Bear, Mamma Bear and Baby Bear
</ActivityNarrativeText>
</ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>55555555</ActivityID>
  <ActivityNarrativeInformationID>77777777</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>31922</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>And Baby Bear, well, he was small, and
sometimes he squeaked! They lived in a pretty little house on the edge of the forest
</ActivityNarrativeText>
</ActivityNarrativeInformation>
</doc>

I need to group the ActivityNarrativeInformation elements by ActivityID and concatenate their ActivityNarrativeText values in ActivityNarrativeSequenceNumber order.

I managed to sort the elements with the following XPath query (XPath 3.1): sort(//ActivityNarrativeInformation[ActivityID=123456789], (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber})

So the result looks like this:

<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
  </ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
  </ActivityNarrativeInformation>
<ActivityNarrativeInformation>
  <ActivityID>123456789</ActivityID>
  <ActivityNarrativeInformationID>111111111</ActivityNarrativeInformationID>
  <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
  <ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
</ActivityNarrativeInformation>

The problem is that when I try to narrow the result down to just the ActivityNarrativeText elements by appending /ActivityNarrativeText, like this

sort(//ActivityNarrativeInformation[ActivityID=123456789], (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber})/ActivityNarrativeText

or

sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText, (), function($seq) {$seq/ActivityNarrativeSequenceNumber})

the order is lost:

<ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
<ActivityNarrativeText>triple shot frappuccino, extra hot, with whipped cream in a tall cup </ActivityNarrativeText>
<ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>

What am I doing wrong?

【Question comments】:

    Tags: xpath xpath-3.1


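    【Solution 1】:

    You lose the order when you write /ActivityNarrativeText; it returns the <ActivityNarrativeText> elements in the same order as in the input file.

    Applied to nodes, /something does not just map each node to its children.

    It means:

    • map each node

    • reorder all resulting nodes into document order

    • remove duplicates

    You can use the simple map operator instead: !ActivityNarrativeText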
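
The three steps listed above can be modeled outside of XPath. The following is only a sketch in Python's standard library, not an XPath engine: `simple_map` and `path_step` are hypothetical helper names standing in for `!` and `/`, and the sample texts are shortened.

```python
import xml.etree.ElementTree as ET

XML = """<doc>
  <ActivityNarrativeInformation>
    <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>She Sells</ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>triple shot</ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>likes to take</ActivityNarrativeText>
  </ActivityNarrativeInformation>
</doc>"""

root = ET.fromstring(XML)

# Document order: the position of every element in a depth-first walk.
doc_pos = {el: i for i, el in enumerate(root.iter())}


def simple_map(nodes, child_tag):
    """Model of `nodes ! child`: map each node in turn and keep that order."""
    return [child for node in nodes for child in node.findall(child_tag)]


def path_step(nodes, child_tag):
    """Model of `nodes / child`: map, drop duplicates, re-sort into document order."""
    mapped = simple_map(nodes, child_tag)
    unique = list(dict.fromkeys(mapped))  # de-duplicate by node identity
    return sorted(unique, key=doc_pos.__getitem__)


# Sort the parent elements by sequence number first, as the question does.
sorted_infos = sorted(
    root.findall("ActivityNarrativeInformation"),
    key=lambda n: int(n.findtext("ActivityNarrativeSequenceNumber")),
)

bang = [t.text for t in simple_map(sorted_infos, "ActivityNarrativeText")]
slash = [t.text for t in path_step(sorted_infos, "ActivityNarrativeText")]

print(bang)   # sorted order survives
print(slash)  # back in input-document order
```

In this model the `!` variant keeps the sorted order, while the `/` variant falls back to the order of the input file, which matches what the question observed.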

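    【Comments】:

      【Solution 2】:

      In addition to the correct answer (use ! rather than / after sorting), one of your attempts would actually work if the sort key function selected the right element as the sort key: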

      sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText, (), function($text) {$text/../ActivityNarrativeSequenceNumber})
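
Why the key has to go through the parent can be seen in a small Python model. This is a sketch under assumptions, not XPath itself: `parent_of` is a hypothetical lookup standing in for the `..` step, the texts are shortened, and keys are compared as integers here, which is this sketch's choice.

```python
import xml.etree.ElementTree as ET

XML = """<doc>
  <ActivityNarrativeInformation>
    <ActivityID>123456789</ActivityID>
    <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>She Sells</ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityID>123456789</ActivityID>
    <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>triple shot</ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityID>123456789</ActivityID>
    <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>likes to take</ActivityNarrativeText>
  </ActivityNarrativeInformation>
</doc>"""

root = ET.fromstring(XML)

# ElementTree elements have no parent pointer, so build a child -> parent lookup.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Counterpart of //ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText
texts = [
    text
    for info in root.findall("ActivityNarrativeInformation")
    if info.findtext("ActivityID") == "123456789"
    for text in info.findall("ActivityNarrativeText")
]


def child_key(text_el):
    # Like $seq/ActivityNarrativeSequenceNumber: a text element has no such
    # child, so every key is empty and the stable sort changes nothing.
    return text_el.findtext("ActivityNarrativeSequenceNumber", default="")


def parent_key(text_el):
    # Like $text/../ActivityNarrativeSequenceNumber: look the number up on the parent.
    return int(parent_of[text_el].findtext("ActivityNarrativeSequenceNumber"))


unsorted_attempt = [t.text for t in sorted(texts, key=child_key)]
ordered = [t.text for t in sorted(texts, key=parent_key)]

print(unsorted_attempt)
print(ordered)
```

The first list stays in input order because all keys are equal; the second follows the sequence numbers.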
      

      【Comments】:

        【Solution 3】:

        If what you want is to extract a coherent sentence from your sample XML for a specific ActivityID, then this expression

        string-join(sort(//ActivityNarrativeInformation[ActivityID=123456789]/ActivityNarrativeText/concat(normalize-space()," "), (), function($ActivityNarrativeSequenceNumber) {$ActivityNarrativeSequenceNumber}))
        

        should output

        She Sells Sea Shells by the Sea Shore and she also likes to take long walks on the beach while she drinks a triple shot frappuccino, extra hot, with whipped cream in a tall cup 
        

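        【Comments】:

        • Tested here: videlibri.de/cgi-bin/xidelcgi, but the result is still as described, i.e. the order is incorrect
        • @Macin Interesting; I tried it in BaseX and it worked, but now it seems to work only for that particular ActivityID and not for the other two... let me check further.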
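
A likely explanation for the behaviour described in these comments: the key function in the expression returns the item itself, and the items at that point are the concatenated text strings, so the sort is alphabetical rather than by sequence number. The Python sketch below illustrates this with shortened texts from the question. It assumes a plain codepoint string comparison, which is what Python uses; the collation of a given XPath processor may differ.

```python
# (sequence number, shortened text) pairs in input-document order.
narratives = {
    "123456789": [
        (1, "She Sells Sea Shells by the Sea Shore and she also"),
        (3, "triple shot frappuccino, extra hot"),
        (2, "likes to take long walks on the beach"),
    ],
    "987654321": [
        (486, "It was a dark and stormy night"),
        (488, "scene lies), rattling along the housetops"),
        (487, "was checked by a violent gust of wind"),
        (489, "struggled against the darkness."),
    ],
}


def by_text(rows):
    """Sort on the text itself, as the expression's key function effectively does."""
    return [text for _, text in sorted(rows, key=lambda row: row[1])]


def by_sequence(rows):
    """Sort on the sequence number, which is what the question asks for."""
    return [text for _, text in sorted(rows, key=lambda row: row[0])]


# True where the alphabetical order happens to match the intended order.
agrees = {aid: by_text(rows) == by_sequence(rows) for aid, rows in narratives.items()}
print(agrees)
```

For 123456789 the two orders coincide only because "She", "likes" and "triple" happen to sort that way; for 987654321 they do not.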
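        【Solution 4】:

        Tested here: videlibri.de/cgi-bin/xidelcgi

        If you are using xidel, please add its tag, and perhaps also one for Windows or for Unix.

        I'm not so sure this can be done with XPath alone. I believe you are better off using XQuery.

        For the narrative of <ActivityID>123456789</ActivityID> you could do: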

        $ xidel -s input.xml --xquery '
          normalize-space(
            for $x in //ActivityNarrativeInformation
            where $x/ActivityID = 123456789
            order by $x/ActivityNarrativeSequenceNumber
            return
            $x/ActivityNarrativeText
          )
        '
        

        For all narratives I would suggest:

        $ xidel -s input.xml --xquery '
          for $narrative at $i in //ActivityNarrativeInformation
          group by $id:=$narrative/ActivityID
          count $i
          return (
            $i,
            normalize-space(
              for $seq in $narrative
              order by $seq/ActivityNarrativeSequenceNumber
              return
              $seq/ActivityNarrativeText
            )
          )
        '
        1
        Once upon a time there were three bears, [...]
        2
        She Sells Sea Shells by the Sea Shore and [...]
        3
        It was a dark and stormy night; the rain [...]
        

        First group by <ActivityID>, then sort the sentences by <ActivityNarrativeSequenceNumber> in a second for loop.
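
For readers without an XQuery processor, the same group-then-sort idea can be sketched in Python's standard library. This is only an illustration under assumptions: `narratives_by_activity` is a hypothetical helper name, the sample is a shortened subset of the question's XML, and sequence numbers are compared as integers.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """<doc>
  <ActivityNarrativeInformation>
    <ActivityID>123456789</ActivityID>
    <ActivityNarrativeSequenceNumber>1</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>She Sells Sea Shells by the Sea Shore and she also</ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityID>123456789</ActivityID>
    <ActivityNarrativeSequenceNumber>3</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>triple shot frappuccino </ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityID>123456789</ActivityID>
    <ActivityNarrativeSequenceNumber>2</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>likes to take long walks on the beach while she drinks a</ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityID>987654321</ActivityID>
    <ActivityNarrativeSequenceNumber>487</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>was checked by a violent gust of wind
</ActivityNarrativeText>
  </ActivityNarrativeInformation>
  <ActivityNarrativeInformation>
    <ActivityID>987654321</ActivityID>
    <ActivityNarrativeSequenceNumber>486</ActivityNarrativeSequenceNumber>
    <ActivityNarrativeText>It was a dark and stormy night
</ActivityNarrativeText>
  </ActivityNarrativeInformation>
</doc>"""


def narratives_by_activity(xml_text):
    """Group narrative fragments by ActivityID, then join each group in sequence order."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for info in root.iter("ActivityNarrativeInformation"):
        seq = int(info.findtext("ActivityNarrativeSequenceNumber"))
        # Collapse runs of whitespace, roughly what normalize-space() does per string.
        text = " ".join(info.findtext("ActivityNarrativeText").split())
        groups[info.findtext("ActivityID")].append((seq, text))
    return {
        activity_id: " ".join(text for _, text in sorted(rows, key=lambda row: row[0]))
        for activity_id, rows in groups.items()
    }


result = narratives_by_activity(XML)
for activity_id, narrative in result.items():
    print(activity_id, narrative)
```

Normalizing each fragment before joining sidesteps the question of how a whitespace-normalizing function should treat a whole sequence of strings.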

        Update 2021-07-05: I had forgotten about XPath's ! operator. With it, a single for loop is enough:

        $ xidel -s input.xml --xquery '
          for $narrative at $i in //ActivityNarrativeInformation
          order by $narrative/ActivityNarrativeSequenceNumber
          group by $id:=$narrative/ActivityID
          count $i
          return (
            $i,
            normalize-space($narrative ! ActivityNarrativeText)
          )
        '
        

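        【Comments】:

        • How does normalize-space() handle a sequence here? normalize-space() should accept at most one string. When a sequence of strings is passed to xidel's normalize-space(), it seems to act only on the first string. Saxon and Zorba both correctly raise a cardinality error. In this example, is $narrative always just a single item?
        • @DavidDenenberg No, $narrative is a sequence. If normalize-space() returns only the first string with your xidel binary, then your binary is too old. normalize-space() accepts a sequence as input. At least that is my reading of w3.org/TR/xpath-functions-31/#func-normalize-space. Recent xidel builds support it.
        • Yes, it does appear to be an older binary. I don't doubt that the latest version of xidel supports this, but it is a misreading of the spec, which clearly shows "xs:string?" as the argument type (zero or one atomic string).
        • @DavidDenenberg: Xidel also raises an error in standard XQuery mode. But all type checks are disabled by default, because query evaluation runs 20% faster that way.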