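【Question title】: XSL Merging Multiple XML Records in the Same File
【Posted】: 2020-01-28 21:21:31
【Question】:

I have a single XML file that contains multiple records. Each record has a key. I want to select all records by key and collapse the records that share a key into one XML record. Some of the data in each XML record is duplicated, and there are empty elements. I also want to remove the duplicate tags and the empty tags.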

Input

<Data>
    <Record>
        <Key>12345</Key>
        <Number>09095I</Number>
        <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
        <Text_Field_2>This is Text Field 2</Text_Field_2>
        <Author>A1</Author>
        <Author>A2</Author>
        <Author></Author>
        <Author>A1</Author>
        <Author>A2</Author>
        <Author>A3</Author>
        <Author></Author>
        <Author>A1</Author>
        <Date>10/12/2019</Date>
        <Summary>Record 1: Summary 1 Text</Summary>
    </Record>
    <Record>
        <Key>12345</Key>
        <Number>09095I</Number>
        <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
        <Text_Field_2>This is Text_Field_2</Text_Field_2>
        <Author>A2</Author>
        <Author></Author>
        <Author>A1</Author>
        <Author>A3</Author>
        <Author></Author>
        <Author>B2</Author>
        <Author></Author>
        <Author>B2</Author>
        <Date>10/12/2019</Date>
        <Summary>Record 2: Summary 1 Text</Summary>
    </Record>
    <Record>
        <Key>23456</Key>
        <Number>43095I</Number>
        <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
        <Text_Field_2>This is Text_Field_2</Text_Field_2>
        <Author>AA2</Author>
        <Author></Author>
        <Author>AA1</Author>
        <Author>AA3</Author>
        <Author></Author>
        <Author>AA3</Author>
        <Author>BB2</Author>
        <Author></Author>
        <Author>AA3</Author>
        <Date>01/12/2020</Date>
        <Summary>Record 1: Summary 1 Text</Summary>
    </Record>
    <Record>
        <Key>23456</Key>
        <Number>43095I</Number>
        <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
        <Text_Field_2>This is Text_Field_2</Text_Field_2>
        <Author>AA1</Author>
        <Author>AA3</Author>
        <Author></Author>
        <Author>CC2</Author>
        <Author></Author>
        <Author>AA1</Author>
        <Author>CC2</Author>
        <Date>01/12/2020</Date>
        <Summary>Record 2: Summary 1 Text</Summary>
    </Record>
    <Record>
        <Key>23456</Key>
        <Number>43095I</Number>
        <Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
        <Text_Field_2>This is Text_Field_2</Text_Field_2>
        <Author>AA1</Author>
        <Author>AA3</Author>
        <Author></Author>
        <Author>CC2</Author>
        <Author></Author>
        <Author>AA1</Author>
        <Author>CC3</Author>
        <Date>01/12/2020</Date>
        <Summary>Record 3: Summary 1 Text</Summary>
    </Record>
    <Record>
        <Key>778899</Key>
        <Number>998822I</Number>
        <Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
        <Text_Field_2>This is Text_Field_2</Text_Field_2>
        <Author>A2</Author>
        <Author></Author>
        <Author>D1</Author>
        <Author>D2</Author>
        <Author></Author>
        <Author>D3</Author>
        <Author>D33</Author>
        <Author></Author>
        <Author>D33</Author>
        <Date>10/12/2019</Date>
        <Summary>Record 1: Summary 1 Text</Summary>
    </Record>
</Data>

Desired output

<Data>
    <Record>
        <Key>12345</Key>
        <Number>09095I</Number>
        <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
        <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
        <Text_Field_2>This is Text Field 2</Text_Field_2>
        <Author>A1</Author>
        <Author>A2</Author>
        <Author>A3</Author>
        <Author>B2</Author>
        <Date>10/12/2019</Date>
        <Summary>Record 1: Summary 1 Text</Summary>
        <Summary>Record 2: Summary 1 Text</Summary>
    </Record>

    <Record>
        <Key>23456</Key>
        <Number>43095I</Number>
        <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
        <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
        <Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
        <Text_Field_2>This is Text_Field_2</Text_Field_2>
        <Author>AA1</Author>
        <Author>AA2</Author>
        <Author>AA3</Author>
        <Author>BB2</Author>
        <Author>CC2</Author>
        <Author>CC3</Author>
        <Date>01/12/2020</Date>
        <Summary>Record 1: Summary 1 Text</Summary>
        <Summary>Record 2: Summary 1 Text</Summary>
        <Summary>Record 3: Summary 1 Text</Summary>
    </Record>
    <Record>
        <Key>778899</Key>
        <Number>998822I</Number>
        <Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
        <Text_Field_2>This is Text_Field_2</Text_Field_2>
        <Author>A2</Author>
        <Author>D1</Author>
        <Author>D2</Author>
        <Author>D3</Author>
        <Author>D33</Author>
        <Date>10/12/2019</Date>
        <Summary>Record 1: Summary 1 Text</Summary>
    </Record>
</Data>

I have used this code, but I am not sure it is the right path.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*" />

    <xsl:key name="key" match="Record" use="Key"/>
    <xsl:key name="kNamedSiblings" match="*" 
           use="concat(generate-id(..), '+', name())"/>

    <xsl:template match="*">
      <xsl:copy>
        <xsl:apply-templates select="key('kNamedSiblings', 
                                         concat(generate-id(..), '+', name())
                                        )/node()" />
        </xsl:copy>
    </xsl:template>
    <xsl:template match="*[not(*) and . = '']" />
    <xsl:template match="*[generate-id() != 
                         generate-id(key('kNamedSiblings', 
                                         concat(generate-id(..), '+', name()))[1]
                                    )]" />
</xsl:stylesheet>

Current output

<?xml version="1.0"?>
<Data>
  <Record>

    <Key>12345</Key>
    <Number>09095I</Number>
    <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
    <Text_Field_2>This is Text Field 2</Text_Field_2>
    <Author>A1A2A1A2A3A1</Author>
    <Date>10/12/2019</Date>
    <Summary>Record 1: Summary 1 Text</Summary>

    <Key>12345</Key>
    <Number>09095I</Number>
    <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
    <Text_Field_2>This is Text_Field_2</Text_Field_2>
    <Author>A2A1A3B2B2</Author>
    <Date>10/12/2019</Date>
    <Summary>Record 2: Summary 1 Text</Summary>

    <Key>23456</Key>
    <Number>43095I</Number>
    <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
    <Text_Field_2>This is Text_Field_2</Text_Field_2>
    <Author>AA2AA1AA3AA3BB2AA3</Author>
    <Date>01/12/2020</Date>
    <Field_Text_1>This is the Text 1</Field_Text_1>

    <Key>23456</Key>
    <Number>43095I</Number>
    <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
    <Text_Field_2>This is Text_Field_2</Text_Field_2>
    <Author>AA1AA3CC2AA1CC2</Author>
    <Date>01/12/2020</Date>
    <Field_Text_1>This is the Text 1</Field_Text_1>

    <Key>23456</Key>
    <Number>43095I</Number>
    <Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
    <Text_Field_2>This is Text_Field_2</Text_Field_2>
    <Author>AA1AA3CC2AA1CC3</Author>
    <Date>01/12/2020</Date>
    <Field_Text_1>This is the Text 1</Field_Text_1>

    <Key>778899</Key>
    <Number>998822I</Number>
    <Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
    <Text_Field_2>This is Text_Field_2</Text_Field_2>
    <Author>A2A3A3A3</Author>
    <Date>10/12/2019</Date>
    <Field_Text_1>This is the Text 1</Field_Text_1>
  </Record>
</Data>

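My current code creates one large record rather than three separate records. In addition, the Author elements are not preserved; instead, a single element is created and the values are lumped together. I understand this is a staged solution that involves:
- merging multiple records into one per key
- removing empty tags
- removing duplicate tags that have the same value
- keeping the original XML structure

An explanation of the solution would also be a great help.

【Comments】:

  • Please confirm that you are using XSLT 2.0

Tags: xml xslt merge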


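【Solution 1】:

Since your stylesheet indicates that you are able to use XSLT 2.0, you can simplify your approach by replacing the complicated xsl:key usage with the simpler xsl:for-each-group: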

<xsl:template match="Data">
  <xsl:copy>
    <xsl:for-each-group select="Record" group-by="Key">
      <xsl:copy>
        <xsl:for-each-group select="current-group()/*[normalize-space()]" group-by="concat(name(),.)">
          <xsl:sort select="name()" order="ascending" />
          <xsl:copy-of select="current-group()[1]" />
        </xsl:for-each-group>
      </xsl:copy>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>

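This template groups the Record elements by Key, then groups the non-empty child elements of each group by a string composed of the element name and its content. The resulting groups are sorted alphabetically by element name, so that elements with the same name end up next to each other.
Finally, the first element of each group is output; any other members of the group are duplicates and are dropped.

The output is: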

<?xml version="1.0" encoding="UTF-8"?>
<Data>
   <Record>
      <Author>A1</Author>
      <Author>A2</Author>
      <Author>A3</Author>
      <Author>B2</Author>
      <Date>10/12/2019</Date>
      <Key>12345</Key>
      <Number>09095I</Number>
      <Summary>Record 1: Summary 1 Text</Summary>
      <Summary>Record 2: Summary 1 Text</Summary>
      <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
      <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
      <Text_Field_2>This is Text Field 2</Text_Field_2>
      <Text_Field_2>This is Text_Field_2</Text_Field_2>
   </Record>
   <Record>
      <Author>AA2</Author>
      <Author>AA1</Author>
      <Author>AA3</Author>
      <Author>BB2</Author>
      <Author>CC2</Author>
      <Author>CC3</Author>
      <Date>01/12/2020</Date>
      <Key>23456</Key>
      <Number>43095I</Number>
      <Summary>Record 1: Summary 1 Text</Summary>
      <Summary>Record 2: Summary 1 Text</Summary>
      <Summary>Record 3: Summary 1 Text</Summary>
      <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
      <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
      <Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
      <Text_Field_2>This is Text_Field_2</Text_Field_2>
   </Record>
   <Record>
      <Author>A2</Author>
      <Author>D1</Author>
      <Author>D2</Author>
      <Author>D3</Author>
      <Author>D33</Author>
      <Date>10/12/2019</Date>
      <Key>778899</Key>
      <Number>998822I</Number>
      <Summary>Record 1: Summary 1 Text</Summary>
      <Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
      <Text_Field_2>This is Text_Field_2</Text_Field_2>
   </Record>
</Data>
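
If the original child order matters (the desired output in the question starts with Key and Number rather than Author), a variant without xsl:sort might be worth trying. The following is only a sketch under that assumption and is not part of the answer above: it groups the non-empty children by name() first and then by string value, relying on the fact that xsl:for-each-group with group-by processes groups in order of first appearance.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="Data">
    <xsl:copy>
      <!-- One output Record per distinct Key -->
      <xsl:for-each-group select="Record" group-by="Key">
        <xsl:copy>
          <!-- Non-empty children of all records in this group, grouped by
               element name; groups come out in order of first appearance -->
          <xsl:for-each-group select="current-group()/*[normalize-space()]"
                              group-by="name()">
            <!-- Within one element name, keep one element per distinct value -->
            <xsl:for-each-group select="current-group()" group-by="string(.)">
              <!-- The context item is the first member of the value group -->
              <xsl:copy-of select="."/>
            </xsl:for-each-group>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

With the question's input this should emit Key, Number, Text_Field_1, Text_Field_2, Author, Date, Summary in that order. Authors appear in order of first appearance rather than sorted by value, and Key 12345 still gets two Text_Field_2 elements because the two input values differ ("This is Text Field 2" versus "This is Text_Field_2").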

【Comments】:

  • Good answer, +1 from me!
【Solution 2】:

In addition to zx485's good XSLT 2.0 answer, here is an XSLT 1.0 stylesheet that groups with two keys:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>
    <xsl:strip-space elements="*" />
    <xsl:key name="Record-by-Key" match="Record" use="Key"/>
    <xsl:key name="Record-by-Key-child-by-name-value" match="Record/*" 
             use="concat(../Key,'+',name(),'+',.)"/>
    <xsl:template match="Data">
      <Data>
        <xsl:for-each 
             select="*[generate-id()=generate-id(key('Record-by-Key',Key)[1])]">
            <Record>
                <xsl:for-each  
                    select="key('Record-by-Key',Key)
                            /*[generate-id()
                                =generate-id(
                                    key('Record-by-Key-child-by-name-value',
                                        concat(../Key,'+',name(),'+',.))[1])]">
                    <xsl:sort select="name()"/>
                    <xsl:copy-of select="self::*[node()]"/>                
                </xsl:for-each>
            </Record>
        </xsl:for-each>  
      </Data>
    </xsl:template>
</xsl:stylesheet>

Output:

<Data>
   <Record>
      <Author>A1</Author>
      <Author>A2</Author>
      <Author>A3</Author>
      <Author>B2</Author>
      <Date>10/12/2019</Date>
      <Key>12345</Key>
      <Number>09095I</Number>
      <Summary>Record 1: Summary 1 Text</Summary>
      <Summary>Record 2: Summary 1 Text</Summary>
      <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
      <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
      <Text_Field_2>This is Text Field 2</Text_Field_2>
      <Text_Field_2>This is Text_Field_2</Text_Field_2>
   </Record>
   <Record>
      <Author>AA2</Author>
      <Author>AA1</Author>
      <Author>AA3</Author>
      <Author>BB2</Author>
      <Author>CC2</Author>
      <Author>CC3</Author>
      <Date>01/12/2020</Date>
      <Key>23456</Key>
      <Number>43095I</Number>
      <Summary>Record 1: Summary 1 Text</Summary>
      <Summary>Record 2: Summary 1 Text</Summary>
      <Summary>Record 3: Summary 1 Text</Summary>
      <Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
      <Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
      <Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
      <Text_Field_2>This is Text_Field_2</Text_Field_2>
   </Record>
   <Record>
      <Author>A2</Author>
      <Author>D1</Author>
      <Author>D2</Author>
      <Author>D3</Author>
      <Author>D33</Author>
      <Date>10/12/2019</Date>
      <Key>778899</Key>
      <Number>998822I</Number>
      <Summary>Record 1: Summary 1 Text</Summary>
      <Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
      <Text_Field_2>This is Text_Field_2</Text_Field_2>
   </Record>
</Data>
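
For readers new to the key-based (Muenchian) grouping used in the stylesheet above, here is a minimal single-level sketch of the same idea. It is an illustration only: the Keys and Group output elements are hypothetical names, not part of the answer. A Record is selected only when it is the first node that the key returns for its own Key value, which yields exactly one representative per distinct Key.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every Record by the string value of its Key child -->
  <xsl:key name="Record-by-Key" match="Record" use="Key"/>

  <xsl:template match="/Data">
    <!-- Keys and Group are hypothetical element names used for illustration -->
    <Keys>
      <!-- Keep only the first Record of each distinct Key value -->
      <xsl:for-each
          select="Record[generate-id() = generate-id(key('Record-by-Key', Key)[1])]">
        <!-- key(...) returns all Records sharing this Key, so count() is the group size -->
        <Group key="{Key}" count="{count(key('Record-by-Key', Key))}"/>
      </xsl:for-each>
    </Keys>
  </xsl:template>
</xsl:stylesheet>
```

Run against the question's input, this reports three groups: 12345 with 2 records, 23456 with 3 and 778899 with 1. The full stylesheet applies the same test a second time to the child elements, using a key built from the record's Key, the element name and the element value.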

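Addendum: the order of the children can also be enforced by name...

【Comments】:

  • Good answer, +1 from me!
【Solution 3】:

Since we already have XSLT 1 and XSLT 2 solutions, for completeness here is an XSLT 3 solution using xsl:merge: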

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="#all"
    version="3.0">

  <xsl:output indent="yes"/>

  <xsl:mode on-no-match="shallow-copy"/>

  <xsl:template match="Data">
      <xsl:copy>
          <xsl:merge>
              <xsl:merge-source select="Record">
                  <xsl:merge-key select="Key"/>
              </xsl:merge-source>
              <xsl:merge-action>
                  <xsl:copy>
                      <xsl:merge>
                          <xsl:merge-source select="current-merge-group()/*[normalize-space()]" sort-before-merge="yes">
                              <xsl:merge-key select="name()"/>
                              <xsl:merge-key select="."/>
                          </xsl:merge-source>
                          <xsl:merge-action>
                              <xsl:copy-of select="."/>
                          </xsl:merge-action>
                      </xsl:merge>
                  </xsl:copy>
              </xsl:merge-action>
          </xsl:merge>
      </xsl:copy>
  </xsl:template>

</xsl:stylesheet>

https://xsltfiddle.liberty-development.net/gWEaSv5
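
One thing to keep in mind with xsl:merge is that each merge source has to be sorted on its merge keys already, unless sort-before-merge="yes" is set; the outer merge-source above does not set it, so it relies on the Record elements arriving in Key order. If that cannot be guaranteed, an XSLT 3 processor also accepts the grouping approach with a composite key. The following is a sketch of that alternative under those assumptions, not part of the answer above:

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="Data">
    <xsl:copy>
      <!-- Grouping does not require the input to be sorted by Key -->
      <xsl:for-each-group select="Record" group-by="Key">
        <xsl:copy>
          <!-- composite="yes" treats (name, value) as one two-part key,
               so no string concatenation is needed -->
          <xsl:for-each-group select="current-group()/*[normalize-space()]"
                              group-by="name(), string(.)" composite="yes">
            <!-- Bring elements with the same name together -->
            <xsl:sort select="name()"/>
            <!-- The context item is the first member of each group -->
            <xsl:copy-of select="."/>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The composite key also sidesteps a theoretical weakness of a concatenated key, where different name and value pairs could produce the same string.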

【Comments】:

  • Thanks, everyone. I can't use XSLT 3.0, but the other solutions work. How do I mark more than one as the answer?