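【Posted】: 2016-09-27 09:56:09
【Question】:
I am converting XML to CSV data. By looking at various examples, I was able to write code that parses an XML file and produces a CSV file. However, the CSV file my code returns does not show all of the tags present in the XML file.
I use an XSLT for the conversion. I am new to XSLT, so I believe the problem is in my stylesheet.
Here is the Java code: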
package com.adarsh.conversions;
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
class XMLtoCsVConversion {
    public static void main(String[] args) throws Exception {
        File stylesheet = new File("style.xsl");
        File xmlSource = new File("sample_data.xml");

        // Parse the input XML into a DOM tree.
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(xmlSource);

        // Compile the stylesheet into a Transformer.
        StreamSource stylesource = new StreamSource(stylesheet);
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(stylesource);

        // Apply the stylesheet and write the text output to the CSV file.
        Source source = new DOMSource(document);
        Result outputTarget = new StreamResult(new File("/tmp/x.csv"));
        transformer.transform(source, outputTarget);
    }
}
Here is the XSLT I am using:
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <xsl:for-each select="*/*[1]/*">
      <xsl:value-of select="name()" />
      <xsl:if test="not(position() = last())">,</xsl:if>
    </xsl:for-each>
    <xsl:text>&#xA;</xsl:text>
    <xsl:apply-templates select="*/*" mode="row"/>
  </xsl:template>

  <xsl:template match="*" mode="row">
    <xsl:apply-templates select="*" mode="data" />
    <xsl:text>&#xA;</xsl:text>
  </xsl:template>

  <xsl:template match="*" mode="data">
    <xsl:choose>
      <xsl:when test="contains(text(),',')">
        <xsl:text>"</xsl:text>
        <xsl:call-template name="doublequotes">
          <xsl:with-param name="text" select="text()" />
        </xsl:call-template>
        <xsl:text>"</xsl:text>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="." />
      </xsl:otherwise>
    </xsl:choose>
    <xsl:if test="position() != last()">,</xsl:if>
  </xsl:template>

  <xsl:template name="doublequotes">
    <xsl:param name="text" />
    <xsl:choose>
      <xsl:when test="contains($text,'&quot;')">
        <xsl:value-of select="concat(substring-before($text,'&quot;'),'&quot;&quot;')" />
        <xsl:call-template name="doublequotes">
          <xsl:with-param name="text" select="substring-after($text,'&quot;')" />
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$text" />
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
Here is the XML file I want to convert to CSV:
<?xml version="1.0"?>
<school id="100" name="WGen School">
  <grade id="1">
    <classroom id="101" name="Mrs. Jones' Math Class">
      <teacher id="10100000001" first_name="Barbara" last_name="Jones"/>
      <student id="10100000010" first_name="Michael" last_name="Gil"/>
      <student id="10100000011" first_name="Kimberly" last_name="Gutierrez"/>
      <student id="10100000013" first_name="Toby" last_name="Mercado"/>
      <student id="10100000014" first_name="Lizzie" last_name="Garcia"/>
      <student id="10100000015" first_name="Alex" last_name="Cruz"/>
    </classroom>
    <classroom id="102" name="Mr. Smith's PhysEd Class">
      <teacher id="10200000001" first_name="Arthur" last_name="Smith"/>
      <teacher id="10200000011" first_name="John" last_name="Patterson"/>
      <student id="10200000010" first_name="Nathaniel" last_name="Smith"/>
      <student id="10200000011" first_name="Brandon" last_name="McCrancy"/>
      <student id="10200000012" first_name="Elizabeth" last_name="Marco"/>
      <student id="10200000013" first_name="Erica" last_name="Lanni"/>
      <student id="10200000014" first_name="Michael" last_name="Flores"/>
      <student id="10200000015" first_name="Jasmin" last_name="Hill"/>
      <student id="10200000016" first_name="Brittany" last_name="Perez"/>
      <student id="10200000017" first_name="William" last_name="Hiram"/>
      <student id="10200000018" first_name="Alexis" last_name="Reginald"/>
      <student id="10200000019" first_name="Matthew" last_name="Gayle"/>
    </classroom>
    <classroom id="103" name="Brian's Homeroom">
      <teacher id="10300000001" first_name="Brian" last_name="O'Donnell"/>
    </classroom>
  </grade>
</school>
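One way to see what the stylesheet's header loop actually selects is to evaluate its path, `*/*[1]/*`, directly against the input. The sketch below does that with the JDK's XPath API on a trimmed copy of the sample; the class name `HeaderPathCheck`, the helper `namesSelectedBy`, and the second path are made up for illustration. It suggests the loop visits the three `classroom` elements rather than any data fields, because in this input the data is stored in attributes, not in child elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

class HeaderPathCheck {

    // Trimmed copy of the sample input (assumption: same shape as sample_data.xml).
    static final String SAMPLE =
          "<school id=\"100\" name=\"WGen School\">"
        + "<grade id=\"1\">"
        + "<classroom id=\"101\" name=\"Mrs. Jones' Math Class\">"
        + "<teacher id=\"10100000001\" first_name=\"Barbara\" last_name=\"Jones\"/>"
        + "<student id=\"10100000010\" first_name=\"Michael\" last_name=\"Gil\"/>"
        + "</classroom>"
        + "<classroom id=\"102\" name=\"Mr. Smith's PhysEd Class\">"
        + "<teacher id=\"10200000001\" first_name=\"Arthur\" last_name=\"Smith\"/>"
        + "</classroom>"
        + "<classroom id=\"103\" name=\"Brian's Homeroom\">"
        + "<teacher id=\"10300000001\" first_name=\"Brian\" last_name=\"O'Donnell\"/>"
        + "</classroom>"
        + "</grade>"
        + "</school>";

    // Returns the names of all nodes the XPath selects, starting from the document node.
    public static List<String> namesSelectedBy(String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("XPath check failed", e);
        }
    }

    public static void main(String[] args) {
        // The header loop's path: school / first grade / every child element.
        System.out.println(namesSelectedBy("*/*[1]/*"));
        // An attribute-based path for comparison: attributes of the first classroom.
        System.out.println(namesSelectedBy("*/*/*[1]/@*"));
    }
}
```

The first call lists only `classroom` elements; the second lists the attributes of the first classroom, which is where the values for the CSV columns are held.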
The expected result is:
classroom id, classroom_name, teacher_1_id, teacher_1_last_name, teacher_1_first_name, teacher_2_id, teacher_2_last_name, teacher_2_first_name, student_id, student_last_name, student_first_name, grade
101, Mrs. Jones' Math Class, 10100000001, Jones, Barbara, , , , 10100000010, Gil, Michael, 2
101, Mrs. Jones' Math Class, 10100000001, Jones, Barbara, , , , 10100000011, Gutierrez, Kimberly, 2
101, Mrs. Jones' Math Class, 10100000001, Jones, Barbara, , , , 10100000013, Mercado, Toby, 1
101, Mrs. Jones' Math Class, 10100000001, Jones, Barbara, , , , 10100000014, Garcia, Lizzie, 1
101, Mrs. Jones' Math Class, 10100000001, Jones, Barbara, , , , 10100000015, Cruz, Alex, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000010, Smith, Nathaniel, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000011, McCrancy, Brandon, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000012, Marco, Elizabeth, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000013, Lanni, Erica, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000014, Flores, Michael, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000015, Hill, Jasmin, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000016, Perez, Brittany, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000017, Hiram, William, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000018, Reginald, Alexis, 1
102, Mr. Smith's PhysEd Class, 10200000001, Smith, Arthur, 10200000011, Patterson, John, 10200000019, Gayle, Matthew, 1
103, Brian's Homeroom, 10300000001, O'Donnell, Brian, , , , , , ,
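However, all I get is:
classroomclassroomclassroom
Can someone help me solve this?
P.S. I have already looked at other Stack Overflow questions about CSV to XML conversion, and I used the information in those posts to help me build my XSL.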
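A minimal sketch of one possible direction, assuming the values should be read from attributes (`@id`, `@name`, `@first_name`, `@last_name`) and that each student produces one row prefixed with its classroom and up to two teachers. The class name `AttributeCsvSketch`, the embedded stylesheet, and the trimmed sample are illustrative, not taken from the question. It writes no space after commas, does no quoting (the input is assumed to contain no commas), and takes the last column from the `id` attribute of the enclosing `grade`, so it will not reproduce the `2` values shown in the expected result.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class AttributeCsvSketch {

    // Hypothetical XSLT 1.0 stylesheet that reads attributes instead of child elements.
    static final String XSLT =
          "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='/'>"
        + "<xsl:text>classroom_id,classroom_name,teacher_1_id,teacher_1_last_name,"
        + "teacher_1_first_name,teacher_2_id,teacher_2_last_name,teacher_2_first_name,"
        + "student_id,student_last_name,student_first_name,grade&#xA;</xsl:text>"
        + "<xsl:apply-templates select='school/grade/classroom'/>"
        + "</xsl:template>"
        + "<xsl:template match='classroom'>"
        // Columns shared by every row of this classroom; a missing teacher yields empty strings.
        + "<xsl:variable name='prefix' select=\"concat(@id, ',', @name, ',', "
        + "teacher[1]/@id, ',', teacher[1]/@last_name, ',', teacher[1]/@first_name, ',', "
        + "teacher[2]/@id, ',', teacher[2]/@last_name, ',', teacher[2]/@first_name)\"/>"
        // One row per student; ../../@id is the id of the enclosing grade.
        + "<xsl:for-each select='student'>"
        + "<xsl:value-of select=\"concat($prefix, ',', @id, ',', @last_name, ',', "
        + "@first_name, ',', ../../@id)\"/>"
        + "<xsl:text>&#xA;</xsl:text>"
        + "</xsl:for-each>"
        // A classroom without students still gets one row with empty student and grade columns.
        + "<xsl:if test='not(student)'>"
        + "<xsl:value-of select=\"concat($prefix, ',,,,')\"/>"
        + "<xsl:text>&#xA;</xsl:text>"
        + "</xsl:if>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Trimmed copy of the sample input (assumption: same shape as sample_data.xml).
    static final String SAMPLE =
          "<school id=\"100\" name=\"WGen School\">"
        + "<grade id=\"1\">"
        + "<classroom id=\"101\" name=\"Mrs. Jones' Math Class\">"
        + "<teacher id=\"10100000001\" first_name=\"Barbara\" last_name=\"Jones\"/>"
        + "<student id=\"10100000010\" first_name=\"Michael\" last_name=\"Gil\"/>"
        + "</classroom>"
        + "<classroom id=\"102\" name=\"Mr. Smith's PhysEd Class\">"
        + "<teacher id=\"10200000001\" first_name=\"Arthur\" last_name=\"Smith\"/>"
        + "<teacher id=\"10200000011\" first_name=\"John\" last_name=\"Patterson\"/>"
        + "<student id=\"10200000010\" first_name=\"Nathaniel\" last_name=\"Smith\"/>"
        + "</classroom>"
        + "<classroom id=\"103\" name=\"Brian's Homeroom\">"
        + "<teacher id=\"10300000001\" first_name=\"Brian\" last_name=\"O'Donnell\"/>"
        + "</classroom>"
        + "</grade>"
        + "</school>";

    // Applies the stylesheet to the given XML and returns the CSV text.
    public static String toCsv(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.print(toCsv(SAMPLE));
    }
}
```

The same stylesheet text could be saved as `style.xsl` and used with the original Java program unchanged, since only the stylesheet decides which nodes are read.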
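【Comments】:
-
Your expected output is unreadable. Please post it formatted as code.
-
That output is not CSV (comma-separated values) but tab-separated values. I updated it to show the tabs visually, so that we have a clue about what is going on. I suspect the lines got changed along the way, because it still does not look right.
-
@michael.hor257k: Thank you for pointing that out. I built the expected format in Excel, so it had no commas. Sorry about that. I have now edited the question to show a proper CSV file, with each value separated by a comma.
-
@AdarshBhat Your expected output has no quoted values. Can your input contain commas? If so, in which fields? Also, I don't see where student_grade is supposed to come from.
-
@michael.hor257k No, my input does not contain commas. Also, student_grade refers to the grade tag in the XML. I have changed the expected output to show just grade instead of student_grade.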
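On the quoting point raised in the comments: if a field could ever contain a comma, the usual CSV convention is to wrap it in double quotes and double any embedded quotes, which is what the `doublequotes` template in the stylesheet is meant to do. A small sketch of the same idea in Java follows; the class `CsvQuote` and method `quote` are hypothetical names. Unlike the stylesheet, which quotes only on a comma, this version also quotes fields containing a double quote or a line break, following the common RFC 4180 convention.

```java
class CsvQuote {

    // Quotes a field only when needed, doubling any embedded double quotes.
    public static String quote(String field) {
        boolean needsQuoting = field.contains(",")
                || field.contains("\"")
                || field.contains("\n")
                || field.contains("\r");
        if (!needsQuoting) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        System.out.println(quote("Jones"));          // prints Jones
        System.out.println(quote("Smith, Arthur"));  // prints "Smith, Arthur"
        System.out.println(quote("5\" disk"));       // prints "5"" disk"
    }
}
```

Since the input here is stated to contain no commas, this step can be skipped for the sample data.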