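【Posted】: 2015-05-15 17:22:27
【Question】:
I'm trying to parse XML by hand (yes, I know there are easier ways to parse/validate, such as XStream), but I can't seem to get the text content of just a single element. For example: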
<container>
    <element0>textThatIWant</element0> //only returned by .getTextContent
    <element1>
        <subelement0>textThatIDontWant</subelement0> //but also returned by
        <subelement1>textThatIDontWant</subelement1> //.getTextContent
    </element1>
</container>
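I'm printing the results to the console and mostly getting what I'm looking for, but the only ways I've found to get a text string are .getTextContent(), which also returns all the text of the child elements with no whitespace between them (otherwise I would split on whitespace), and .getNodeValue().toString(), which throws NullPointerExceptions. @Jihar mentioned something like .getTextValue(), but Eclipse doesn't recognize it (maybe there is something I can implement/inherit/add to get that functionality). Any help?
Here is the code I'm using: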
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.SAXException;
import java.io.*;
public class Test {
    public static void main(String[] args) throws ParserConfigurationException, SAXException, IOException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        StringBuilder xmlStringBuilder = new StringBuilder();
        String appendage = "..."; // This string holds the XML-formatted data I'll be
                                  // using, in one long annoying line; I'll include it
                                  // separately for clarity
        xmlStringBuilder.append(appendage);
        ByteArrayInputStream input = new ByteArrayInputStream(xmlStringBuilder.toString().getBytes("UTF-8"));
        System.out.println("Test Results:");
        System.out.println();
        Document doc = builder.parse(input);
        Element root = doc.getDocumentElement();
        NodeList children = root.getChildNodes();
        System.out.println(root.getTagName());
        System.out.println();
        // First level: direct children of the root element
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                Element childElement = (Element) child;
                // Concatenating the element itself calls its toString()
                System.out.println(childElement.getTagName() + " " + childElement);
                NodeList grandChildren = child.getChildNodes();
                // Second level: print each grandchild's tag name
                for (int x = 0; x < grandChildren.getLength(); x++) {
                    Node grandChild = grandChildren.item(x);
                    if (grandChild instanceof Element) {
                        Element grandChildElement = (Element) grandChild;
                        System.out.print("\t" + grandChildElement.getTagName() + ":\t");
                        NodeList greatGrandChildren = grandChild.getChildNodes();
                        // Third level: print the text of each great-grandchild, comma-separated
                        for (int y = 0; y < greatGrandChildren.getLength(); y++) {
                            Node greatGrandChild = greatGrandChildren.item(y);
                            if (greatGrandChild instanceof Element) {
                                Element greatGrandChildElement = (Element) greatGrandChild;
                                System.out.print(" " + greatGrandChildElement.getTextContent());
                                if (y < greatGrandChildren.getLength() - 1) {
                                    System.out.print(",");
                                }
                            }
                        }
                        System.out.println();
                    }
                }
            }
        }
    }
}
Here is the full appendage variable:
String appendage = "<?xml version=\"1.0\"?><branch0><name>business</name><taxINFO/><personnel><executives><name>Billy Bob</name><name>Colonel Jessup</name></executives><managerial/><operations><name>sabrina</name><name>lisa</name></operations><services><name>jamie</name><name>justin</name><name>forest</name></services></personnel><regions><ebay><area>OK</area><area>BE</area><area>EV</area><area>WC</area></ebay><sbay><area>SJ</area><area>MP</area><area>SV</area><area>MV</area></sbay><S.F.><area>SF</area></S.F.><N.Y.><area>NY</area></N.Y.><S.CA><area>SD</area><area>LA</area></S.CA></regions><products/><services/></branch0>";
Or, broken across lines for readability:
String appendage = "
<?xml version=\"1.0\"?>
<branch0>
    <name>business</name>
    <taxINFO/>
    <personnel>
        <executives>
            <name>Billy Bob</name>
            <name>Colonel Jessup</name>
        </executives>
        <managerial/>
        <operations>
            <name>sabrina</name>
            <name>lisa</name>
        </operations>
        <services>
            <name>jamie</name>
            <name>justin</name>
            <name>forest</name>
        </services>
    </personnel>
    <regions>
        <ebay>
            <area>OK</area>
            <area>BE</area>
            <area>EV</area>
            <area>WC</area>
        </ebay>
        <sbay>
            <area>SJ</area>
            <area>MP</area>
            <area>SV</area>
            <area>MV</area>
        </sbay>
        <S.F.>
            <area>SF</area>
        </S.F.>
        <N.Y.>
            <area>NY</area>
        </N.Y.>
        <S.CA>
            <area>SD</area>
            <area>LA</area>
        </S.CA>
    </regions>
    <products/>
    <services/>
</branch0>";
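Finally, my console output (you'll see [name: null], where I would like it to say something like [name: business] or even just business, but without the run-together, whitespace-free text of the child elements):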
Test Results:
branch0
name [name: null]
taxINFO [taxINFO: null]
personnel [personnel: null]
    executives: Billy Bob, Colonel Jessup
    managerial:
    operations: sabrina, lisa
    services: jamie, justin, forest
regions [regions: null]
    ebay: OK, BE, EV, WC
    sbay: SJ, MP, SV, MV
    S.F.: SF
    N.Y.: NY
    S.CA: SD, LA
products [products: null]
services [services: null]
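A note on the [name: null] entries: in the DOM API, getNodeValue() is defined to return null for Element nodes, because an element's text is stored in child Text nodes rather than on the element itself. The bracketed form appears to be the toString() of the parser's node implementation, which is an implementation detail and not part of the DOM interfaces. The following is only a minimal sketch of that behavior; the class name NodeValueDemo and the helper parseRoot are hypothetical names introduced for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class NodeValueDemo {
    // Hypothetical helper: parses an XML string and returns its root element.
    // Checked parser exceptions are wrapped so the helper is easy to call.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element name = parseRoot("<name>business</name>");
        // Implementation-specific toString(); with the JDK's built-in parser
        // this is typically the bracketed "[name: null]" form.
        System.out.println(name);
        // An Element has no node value of its own, so this prints "null".
        // Calling .toString() on it is what raises the NullPointerException.
        System.out.println(name.getNodeValue());
        // The text is held by a child node of type TEXT_NODE.
        Node first = name.getFirstChild();
        System.out.println(first.getNodeType() == Node.TEXT_NODE); // true
        System.out.println(first.getNodeValue());                  // business
    }
}
```

This also suggests why .getNodeValue().toString() fails on elements while the same call on a text child would not.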
Here is my console output using .getTextContent:
Test Results:
business
branch0
name business
taxINFO
personnel Billy BobColonel Jessupsabrinalisajamiejustinforest
    executives: Billy Bob, Colonel Jessup
    managerial:
    operations: sabrina, lisa
    services: jamie, justin, forest
regions OKBEEVWCSJMPSVMVSFNYSDLA
    ebay: OK, BE, EV, WC
    sbay: SJ, MP, SV, MV
    S.F.: SF
    N.Y.: NY
    S.CA: SD, LA
products
services
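For reference, one possible way to get the behavior described above (an element's own text without the text of its child elements) is to walk the element's direct children and keep only the text nodes. This is a sketch under assumptions, not an established API: ownText and parseRoot are hypothetical helper names, and trimming the result is a choice made here so that indentation whitespace in pretty-printed XML does not show up in the output.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import java.io.StringReader;

public class OwnTextDemo {
    // Hypothetical helper: concatenates only the element's direct text
    // children, ignoring any text that belongs to nested elements.
    static String ownText(Element element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            // Text and CDATA nodes carry character data as their node value.
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        // Assumption: surrounding whitespace is not wanted.
        return text.toString().trim();
    }

    // Hypothetical helper: parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<branch0><name>business</name><personnel>"
                + "<executives><name>Billy Bob</name></executives>"
                + "</personnel></branch0>");
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                Element element = (Element) child;
                // Prints "name [business]" and then "personnel []"
                System.out.println(element.getTagName() + " [" + ownText(element) + "]");
            }
        }
    }
}
```

In the code from the question, a call such as ownText(childElement) could stand in for the bare childElement in the first println, while getTextContent() stays in use for the leaf elements.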
【Discussion】:
Tags: java xml string get element