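【Posted】:2020-01-28 21:21:31
【Question】:
I have a single XML file that contains multiple records. Each record has a key. I want to group all records that share a key and collapse each group into a single XML record. Some of the data in each XML record is duplicated, and there are empty elements. I also want to remove the duplicate tags and the empty tags.
Input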
<Data>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Author>A1</Author>
<Author>A2</Author>
<Author></Author>
<Author>A1</Author>
<Author>A2</Author>
<Author>A3</Author>
<Author></Author>
<Author>A1</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2</Author>
<Author></Author>
<Author>A1</Author>
<Author>A3</Author>
<Author></Author>
<Author>B2</Author>
<Author></Author>
<Author>B2</Author>
<Date>10/12/2019</Date>
<Summary>Record 2: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA2</Author>
<Author></Author>
<Author>AA1</Author>
<Author>AA3</Author>
<Author></Author>
<Author>AA3</Author>
<Author>BB2</Author>
<Author></Author>
<Author>AA3</Author>
<Date>01/12/2020</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1</Author>
<Author>AA3</Author>
<Author></Author>
<Author>CC2</Author>
<Author></Author>
<Author>AA1</Author>
<Author>CC2</Author>
<Date>01/12/2020</Date>
<Summary>Record 2: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1</Author>
<Author>AA3</Author>
<Author></Author>
<Author>CC2</Author>
<Author></Author>
<Author>AA1</Author>
<Author>CC3</Author>
<Date>01/12/2020</Date>
<Summary>Record 3: Summary 1 Text</Summary>
</Record>
<Record>
<Key>778899</Key>
<Number>998822I</Number>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2</Author>
<Author></Author>
<Author>D1</Author>
<Author>D2</Author>
<Author></Author>
<Author>D3</Author>
<Author>D33</Author>
<Author></Author>
<Author>D33</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
</Data>
Desired output
<Data>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Author>A1</Author>
<Author>A2</Author>
<Author>A3</Author>
<Author>B2</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
</Record>
<Record>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1</Author>
<Author>AA2</Author>
<Author>AA3</Author>
<Author>BB2</Author>
<Author>CC2</Author>
<Author>CC3</Author>
<Date>01/12/2020</Date>
<Summary>Record 1: Summary 1 Text</Summary>
<Summary>Record 2: Summary 1 Text</Summary>
<Summary>Record 3: Summary 1 Text</Summary>
</Record>
<Record>
<Key>778899</Key>
<Number>998822I</Number>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2</Author>
<Author>D1</Author>
<Author>D2</Author>
<Author>D3</Author>
<Author>D33</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
</Record>
</Data>
I tried the code below, but I am not sure it is the right approach.
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:fn="http://www.w3.org/2005/xpath-functions">
<xsl:output method="xml" indent="yes"/>
<xsl:strip-space elements="*" />
<xsl:key name="key" match="Record" use="Key"/>
<xsl:key name="kNamedSiblings" match="*"
use="concat(generate-id(..), '+', name())"/>
<xsl:template match="*">
<xsl:copy>
<xsl:apply-templates select="key('kNamedSiblings',
concat(generate-id(..), '+', name())
)/node()" />
</xsl:copy>
</xsl:template>
<xsl:template match="*[not(*) and . = '']" />
<xsl:template match="*[generate-id() !=
generate-id(key('kNamedSiblings',
concat(generate-id(..), '+', name()))[1]
)]" />
</xsl:stylesheet>
Current output
<?xml version="1.0"?>
<Data>
<Record>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text Field 2</Text_Field_2>
<Author>A1A2A1A2A3A1</Author>
<Date>10/12/2019</Date>
<Summary>Record 1: Summary 1 Text</Summary>
<Key>12345</Key>
<Number>09095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2A1A3B2B2</Author>
<Date>10/12/2019</Date>
<Summary>Record 2: Summary 1 Text</Summary>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 1: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA2AA1AA3AA3BB2AA3</Author>
<Date>01/12/2020</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 2: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1AA3CC2AA1CC2</Author>
<Date>01/12/2020</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
<Key>23456</Key>
<Number>43095I</Number>
<Text_Field_1>Record 3: This is Text Field 1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>AA1AA3CC2AA1CC3</Author>
<Date>01/12/2020</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
<Key>778899</Key>
<Number>998822I</Number>
<Text_Field_1>Record 1: This is Text_Field_1</Text_Field_1>
<Text_Field_2>This is Text_Field_2</Text_Field_2>
<Author>A2A3A3A3</Author>
<Date>10/12/2019</Date>
<Field_Text_1>This is the Text 1</Field_Text_1>
</Record>
</Data>
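My current code creates one large record instead of three separate records. In addition, the Author elements are not preserved: a single Author element is created and all the values are concatenated into it. I understand this calls for a multi-stage solution involving:
- merging multiple records that share a key into one record
- removing empty tags
- removing duplicate tags that have the same value
- preserving the original XML structure
An explanation of the solution would also be very helpful.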
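For reference, the stages listed above could be sketched with nested `xsl:for-each-group` instructions. This is a minimal sketch, not a definitive answer, and it rests on several assumptions: an XSLT 2.0 processor is available (the stylesheet in the question declares `version="2.0"`), every child of `Record` is a simple text-only element, "duplicate" means same element name and exactly the same string value, and the values of each field should be sorted (inferred from the desired output, where `AA1` comes before `AA2` although `AA2` appears first in the input).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/Data">
    <xsl:copy>
      <!-- Stage 1: form one group per distinct Key value -->
      <xsl:for-each-group select="Record" group-by="Key">
        <!-- Shallow copy of the first Record in the group -->
        <xsl:copy>
          <!-- Stage 2: pool the children of every Record in this group,
               skip empty ones, and group them by element name.
               Groups come out in order of first appearance, which keeps
               the original field order (Key, Number, ..., Summary). -->
          <xsl:for-each-group select="current-group()/*[normalize-space()]"
                              group-by="name()">
            <!-- Stage 3: within one element name, keep a single copy
                 per distinct string value. Sorting by value is an
                 assumption taken from the desired output. -->
            <xsl:for-each-group select="current-group()" group-by="string(.)">
              <xsl:sort select="current-grouping-key()"/>
              <xsl:copy-of select="."/>
            </xsl:for-each-group>
          </xsl:for-each-group>
        </xsl:copy>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats about this sketch. Because values are compared as exact strings, the group for key 12345 would keep both `This is Text Field 2` and `This is Text_Field_2`, whereas the desired output lists only the first; that difference looks like a typo in the sample input, but this is a guess. Also, `xsl:for-each-group` does not exist in XSLT 1.0, so a 1.0-only processor would need a different grouping technique such as Muenchian grouping with `xsl:key`.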
【Comments】:
- Please confirm that you are using XSLT 2.0.