【Question title】: XSLT wrap specified element under condition in new parent tag
【Posted】: 2020-07-29 11:06:58
【Question description】:

My XML is structured as follows:

<?xml version="1.0" encoding="utf-8" ?>
<pages>
<page id="1" bbox="0.000,0.000,462.047,680.315" rotate="0">
<textbox id="0" bbox="191.745,592.218,249.042,603.578">
<textline bbox="191.745,592.218,249.042,603.578">
<text font="NUMPTY+ImprintMTnum" bbox="191.745,592.218,199.339,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">A</text>
<text font="NUMPTY+ImprintMTnum" bbox="199.227,592.218,205.657,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">P</text>
<text font="NUMPTY+ImprintMTnum" bbox="205.545,592.218,211.975,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">P</text>
<text font="NUMPTY+ImprintMTnum" bbox="211.023,592.218,218.617,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">A</text>
<text font="NUMPTY+ImprintMTnum" bbox="218.515,592.218,226.109,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">R</text>
<text font="NUMPTY+ImprintMTnum" bbox="226.008,592.218,233.602,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">A</text>
<text font="NUMPTY+ImprintMTnum" bbox="232.812,592.218,240.932,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">T</text>
<text font="NUMPTY+ImprintMTnum" bbox="240.922,592.218,249.042,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">O</text>
</textline>
</textbox>
<textbox id="1" bbox="44.614,554.008,58.101,564.246">
<textline bbox="44.614,554.008,58.101,564.246">
<text font="NUMPTY+ImprintMTnum" bbox="44.614,554.008,49.369,564.246" colourspace="DeviceGray" ncolour="0" size="10.238">2</text>
<text font="NUMPTY+ImprintMTnum" bbox="49.268,554.008,54.022,564.246" colourspace="DeviceGray" ncolour="0" size="10.238">4</text>
<text font="NUMPTY+ImprintMTnum" bbox="53.922,554.008,58.101,564.246" colourspace="DeviceGray" ncolour="0" size="10.238">a</text>
</textline>
</textbox>
<textbox id="2" bbox="43.563,475.008,58.117,485.246">
<textline bbox="43.563,475.008,58.117,485.246">
<text font="NUMPTY+ImprintMTnum" bbox="43.563,475.008,48.317,485.246" colourspace="DeviceGray" ncolour="0" size="10.238">2</text>
<text font="NUMPTY+ImprintMTnum" bbox="48.226,475.008,52.980,485.246" colourspace="DeviceGray" ncolour="0" size="10.238">4</text>
<text font="NUMPTY+ImprintMTnum" bbox="52.889,475.008,58.117,485.246" colourspace="DeviceGray" ncolour="0" size="10.238">b</text>
</textline>
</textbox>
</page>
</pages>

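The real file is much longer. Whenever the first number of a bbox attribute lies a certain distance from the preceding bbox attribute (the code below compares it with the third number of the preceding sibling's bbox), I want to insert a <newline> parent tag. I want a tag to be closed only when another one needs to be opened. Everything else works, but I don't know how to wrap these text elements, because I am new to XSLT.

This is the code so far: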

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" />

  <xsl:template match="textbox">
    <xsl:apply-templates select="textline" />  
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <xsl:template match="textline">
    <xsl:apply-templates select="text" />  
  </xsl:template>

  <xsl:template match="text[@bbox and text()]">
    <!-- each @bbox has this format: "x1,y1,x2,y2" (top-left/bottom-right coordinates) -->
    <xsl:variable name="x1this" select="number(substring-before(@bbox, ','))" />
    <xsl:variable name="x2prev" select="number(substring-before(substring-after(substring-after(preceding-sibling::text[@bbox][1]/@bbox, ','), ','), ','))" />
    <xsl:variable name="distance" select="$x1this - $x2prev" />
    <xsl:variable name="nextCharacter" select="following-sibling::text[normalize-space()][1]" />
    <xsl:variable name="isEOL" select="not($nextCharacter)" />
    <xsl:variable name="isHyphen" select=". = '-'" />
    <xsl:choose>
        <xsl:when test="$distance &gt; 10">
          <newline></newline>          
        </xsl:when>
        <xsl:when test="$distance &gt; 2">
            <whitespace></whitespace><!-- regular space for small gap -->
        </xsl:when>
    </xsl:choose>
    <xsl:choose>
        <xsl:when test="$isHyphen and $isEOL"></xsl:when><!-- suppress hyphens at the end of the line -->
        <xsl:when test="not($isHyphen) and $isEOL"><xsl:copy-of select="concat(., '&#xA;')" /></xsl:when><!-- add newline at the end of non-hyphenated lines -->
        <xsl:otherwise><xsl:copy-of select="." /></xsl:otherwise><!-- output character as-is -->
    </xsl:choose>
  </xsl:template>

  <xsl:template match="text">
    <xsl:if test="following-sibling::text"><!-- suppress end-of-line spaces, this re-connects hyphenated words -->
        <xsl:text> </xsl:text>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>

Edit: the expected output is as follows:

    <text font="NUMPTY+ImprintMTnum" bbox="191.745,592.218,199.339,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">A</text><text font="NUMPTY+ImprintMTnum" bbox="199.227,592.218,205.657,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">P</text><text font="NUMPTY+ImprintMTnum" bbox="205.545,592.218,211.975,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">P</text><text font="NUMPTY+ImprintMTnum" bbox="211.023,592.218,218.617,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">A</text><text font="NUMPTY+ImprintMTnum" bbox="218.515,592.218,226.109,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">R</text><text font="NUMPTY+ImprintMTnum" bbox="226.008,592.218,233.602,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">A</text><text font="NUMPTY+ImprintMTnum" bbox="232.812,592.218,240.932,603.578" colourspace="DeviceGray" ncolour="0" size="11.360">T</text>O


    <text font="NUMPTY+ImprintMTnum" bbox="44.614,554.008,49.369,564.246" colourspace="DeviceGray" ncolour="0" size="10.238">2</text><text font="NUMPTY+ImprintMTnum" bbox="49.268,554.008,54.022,564.246" colourspace="DeviceGray" ncolour="0" size="10.238">4</text>a


    <text font="NUMPTY+ImprintMTnum" bbox="43.563,475.008,48.317,485.246" colourspace="DeviceGray" ncolour="0" size="10.238">2</text><text font="NUMPTY+ImprintMTnum" bbox="48.226,475.008,52.980,485.246" colourspace="DeviceGray" ncolour="0" size="10.238">4</text>b


    <text font="NUMPTY+ImprintMTnum" bbox="44.614,421.608,49.369,431.846" colourspace="DeviceGray" ncolour="0" size="10.238">2</text><text font="NUMPTY+ImprintMTnum" bbox="49.268,421.608,54.022,431.846" colourspace="DeviceGray" ncolour="0" size="10.238">4</text>c


    <text font="NUMPTY+ImprintMTnum" bbox="43.563,339.508,48.317,349.746" colourspace="DeviceGray" ncolour="0" size="10.238">2</text><text font="NUMPTY+ImprintMTnum" bbox="48.226,339.508,52.980,349.746" colourspace="DeviceGray" ncolour="0" size="10.238">4</text>d


    <text font="NUMPTY+ImprintMTnum" bbox="44.949,237.108,49.703,247.347" colourspace="DeviceGray" ncolour="0" size="10.238">2</text><text font="NUMPTY+ImprintMTnum" bbox="49.274,237.108,54.028,247.347" colourspace="DeviceGray" ncolour="0" size="10.238">5</text>a

    **<newline>**
    <text font="PYNIYO+ImprintMTnum-Italic" bbox="68.031,553.639,76.375,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">T</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="76.231,553.639,79.479,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">i</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="79.334,553.639,83.161,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">t</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="83.017,553.639,88.112,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">o</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="87.968,553.639,91.216,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">l</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="91.071,553.639,96.167,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">o</text>
 <whitespace/><text font="NUMPTY+ImprintMTnum" bbox="99.311,553.628,104.406,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">I</text>
<text font="NUMPTY+ImprintMTnum" bbox="104.261,553.628,107.510,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">l</text>
<text font="NUMPTY+ImprintMTnum" bbox="107.365,553.628,110.269,566.110" colourspace="DeviceGray" ncolour="0" size="12.482"> </text>
 <text font="NUMPTY+ImprintMTnum" bbox="110.658,553.628,119.002,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">C</text>
<text font="NUMPTY+ImprintMTnum" bbox="118.857,553.628,123.953,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">a</text>
<text font="NUMPTY+ImprintMTnum" bbox="123.808,553.628,130.183,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">u</text>
<text font="NUMPTY+ImprintMTnum" bbox="130.038,553.628,134.555,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">s</text>
<text font="NUMPTY+ImprintMTnum" bbox="134.410,553.628,137.659,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">i</text>
<text font="NUMPTY+ImprintMTnum" bbox="137.514,553.628,143.889,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">d</text>
<text font="NUMPTY+ImprintMTnum" bbox="143.744,553.628,146.993,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">i</text>
<text font="NUMPTY+ImprintMTnum" bbox="146.848,553.628,151.943,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">c</text>
<text font="NUMPTY+ImprintMTnum" bbox="151.799,553.628,157.595,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">o</text>
<text font="NUMPTY+ImprintMTnum" bbox="157.450,553.628,161.277,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">]</text>
<text font="NUMPTY+ImprintMTnum" bbox="161.132,553.628,164.036,566.110" colourspace="DeviceGray" ncolour="0" size="12.482"> </text>
 <text font="PYNIYO+ImprintMTnum-Italic" bbox="164.417,553.639,168.244,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">s</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="168.099,553.639,173.895,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">p</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="173.751,553.639,177.578,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">s</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="176.966,553.639,180.215,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">.</text>
<text font="PYNIYO+ImprintMTnum-Italic" bbox="180.070,553.639,182.974,566.366" colourspace="DeviceGray" ncolour="0" size="12.727"> </text>
 <text font="PYNIYO+ImprintMTnum-Italic" bbox="183.363,553.639,189.159,566.366" colourspace="DeviceGray" ncolour="0" size="12.727">a</text>
 <whitespace/><text font="NUMPTY+ImprintMTnum" bbox="192.314,553.628,201.937,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">D</text><text font="NUMPTY+ImprintMTnum" bbox="201.793,553.628,207.589,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">o</text><text font="NUMPTY+ImprintMTnum" bbox="207.444,553.628,213.819,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">n</text><text font="NUMPTY+ImprintMTnum" bbox="213.674,553.628,216.578,566.110" colourspace="DeviceGray" ncolour="0" size="12.482"> </text> <text font="NUMPTY+ImprintMTnum" bbox="216.967,553.628,225.311,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">R</text><text font="NUMPTY+ImprintMTnum" bbox="225.166,553.628,230.962,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">o</text><text font="NUMPTY+ImprintMTnum" bbox="230.818,553.628,237.192,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">d</text><text font="NUMPTY+ImprintMTnum" bbox="237.048,553.628,241.565,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">r</text><text font="NUMPTY+ImprintMTnum" bbox="241.420,553.628,244.668,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">i</text><text font="NUMPTY+ImprintMTnum" bbox="244.524,553.628,250.320,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">g</text><text font="NUMPTY+ImprintMTnum" bbox="250.064,553.628,255.860,566.110" colourspace="DeviceGray" ncolour="0" size="12.482">o</text> <text>
    </text>
    **</newline>**
    **<newline>**
    <text font="QKWQNQ+ImprintMTnum-Bold" bbox="272.661,554.072,277.415,564.757" colourspace="DeviceGray" ncolour="0" size="10.685">1</text>
    ... continues
**</newline>**

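【Question discussion】:

  • I don't see any Python code. What problem are you running into?
  • Yes, because I read the XSLT through Python. Every time the distance I specified is greater than 10 ( ). I have a Python version of this question here: stackoverflow.com/questions/61245945/…
  • As I understand it, transforming something into a new element would be e.g. <newline><xsl:copy-of select="."/></newline>, which wraps the currently processed text in a newline element. It would be very helpful if you showed us the result XML you want to create.
  • I have edited the question so you can see the expected output! It works as you say; the problem is that it wraps only a single text tag, and I want to wrap several.
  • The usual way to wrap several adjacent elements based on a condition in XSLT 1 is sibling recursion; sometimes a key on the id of a preceding or following sibling node also works. To help others understand your core problem, it would help to strip all irrelevant data from the sample. You also need to state in more detail when elements should be wrapped and when they should not.

Tags: python xml xslt tags elementtree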


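【Solution 1】:

As @MartinHonnen said in his comment, the classic solution to this problem is sibling recursion.

The essence of the technique is:

  • From the parent textline element, apply templates to the first child text element only: <xsl:apply-templates select="text[1]"/>

  • From each text element, apply templates to the immediately following sibling: <xsl:apply-templates select="following-sibling::text[1]"/>

There is a working example of this technique here:

How to apply XSL templates to start and finish XML element from different parts of the document

See if you can adapt it.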
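
To make the two steps above concrete, here is a minimal XSLT 1.0 sketch of sibling recursion applied to the question's data. It is not the asker's final stylesheet, and it rests on assumptions the question leaves open: groups are formed inside a single textline only, a new group starts when the gap between a text element's x1 and the previous sibling's x2 exceeds 10, and the whitespace and hyphen handling from the question's code is left out. The names maxGap and member are made up for this illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Assumed threshold: a horizontal gap larger than this starts a new group -->
  <xsl:variable name="maxGap" select="10"/>

  <!-- Identity template: copy everything that is not handled below -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="textline">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <!-- A group starts at the first text child and at every text child whose
           x1 lies more than $maxGap to the right of the previous sibling's x2 -->
      <xsl:for-each select="text[not(preceding-sibling::text)
          or number(substring-before(@bbox, ','))
             - number(substring-before(substring-after(substring-after(
                 preceding-sibling::text[1]/@bbox, ','), ','), ',')) &gt; $maxGap]">
        <newline>
          <xsl:apply-templates select="." mode="member"/>
        </newline>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

  <!-- Sibling recursion: copy this text element, then continue with the next
       sibling only while that sibling still belongs to the same group -->
  <xsl:template match="text" mode="member">
    <!-- @bbox is "x1,y1,x2,y2"; x2 is the third comma-separated number -->
    <xsl:variable name="x2"
        select="number(substring-before(substring-after(substring-after(@bbox, ','), ','), ','))"/>
    <xsl:variable name="next" select="following-sibling::text[1]"/>
    <xsl:variable name="nextX1" select="number(substring-before($next/@bbox, ','))"/>
    <xsl:copy-of select="."/>
    <xsl:if test="$next and not($nextX1 - $x2 &gt; $maxGap)">
      <xsl:apply-templates select="$next" mode="member"/>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

The recursion stops exactly where the next group begins, so each newline element is closed only when another one has to be opened. In the sample from the question no gap inside a textline exceeds 10, so each textline should end up with a single newline element around all of its text children. A text element without a bbox yields NaN for the gap, the comparison is false, and the element stays in the current group. If groups need to span several textline or textbox elements, the starting condition and the sibling axis would have to be adapted.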

【Discussion】:

  • There is no classic solution to an ill-defined problem.