【Posted】: 2013-10-14 18:09:03
【Question】:
In my XML, I have multi-line elements:
<tag id="sometag" ...>
| first line
| second line
| third line
| fourth line
<tag ...>
....
<tag id="someothertag" ...>
| ANOTHER FIRST LINE
| ANOTHER SECOND LINE
| ANOTHER THIRD LINE
| ANOTHER FORTH LINE
<tag ...>
Then in Java I have the necessary startElement, endElement, and characters methods, but I'm seeing some strange behavior from characters:
public void characters(char[] ch, int start, int length){
    Log.d(TAG, "characters( \"" + (new String(ch)).replaceAll("[\r\n]", "\\n") + "\", " + start + ", " + length + " )");
}
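Other than that, I do nothing with the characters. I basically create two instances of the parser. In the first instance I search for sometag; once I have found what I'm looking for, I throw an exception to stop parsing and return that element.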
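For readers unfamiliar with the stop-by-exception pattern mentioned above, here is a minimal sketch of it. This is not the asker's actual code: the names EarlyStopSearch, FoundException, and Finder are hypothetical, and it assumes the standard JAXP SAXParserFactory on a desktop JVM rather than the Android setup used in the question.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EarlyStopSearch {

    /** Hypothetical marker exception used only to abort parsing early. */
    static class FoundException extends SAXException {
        FoundException(String message) {
            super(message);
        }
    }

    /** Handler that aborts the parse as soon as it sees the wanted id. */
    static class Finder extends DefaultHandler {
        private final String wantedId;

        Finder(String wantedId) {
            this.wantedId = wantedId;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attrs) throws SAXException {
            if (wantedId.equals(attrs.getValue("id"))) {
                // Throwing from a callback makes parse() return immediately.
                throw new FoundException("found " + wantedId);
            }
        }
    }

    /** Returns true if some element carries the given id attribute. */
    static boolean contains(String xml, String id) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new Finder(id));
            return false; // reached the end of the document without a match
        } catch (SAXException e) {
            if (e instanceof FoundException) {
                return true; // our own marker, not a real parse error
            }
            throw new IllegalStateException(e);
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

This sketch stops at the start tag. To hand back the element's text as well, the throw would presumably move to endElement, after the text has been collected.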
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "sometag"!
...then I look for someothertag with a brand-new instance, doing the same thing as before.
D/MyProgram( 1565): STARTING document parsing...
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | first line", 0, 20 )
D/MyProgram( 1565): characters( "n | first line", 0, 1 )
D/MyProgram( 1565): characters( " | second line", 0, 23 )
D/MyProgram( 1565): characters( "n | second line", 0, 1 )
D/MyProgram( 1565): characters( " | third line", 0, 26 )
D/MyProgram( 1565): characters( "n | third line", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 22 )
D/MyProgram( 1565): characters( "n | fourth lineline", 0, 1 )
D/MyProgram( 1565): characters( " | fourth lineline", 0, 4 )
D/MyProgram( 1565): Successfully found "someothertag"!
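I know XML parsing is stream-based (it parses chunks rather than the whole string), but this is very strange behavior. Here are some of the very confusing things I noticed: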
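- On each call to characters(), the parser neither picks up where it left off nor finishes the characters it has already parsed: I even get characters from before the first character array ('n', which is the replacement for the newline).
- ch contains extra characters that are not in the original: "line" is appended to "fourth line".
- When I create a brand-new parser instance, characters are "re-read". The second run should look something like this: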
...this...
D/MyProgram( 1565): characters( "n", 0, 1 )
D/MyProgram( 1565): characters( " ", 0, 4 )
D/MyProgram( 1565): characters( "n ", 0, 1 )
D/MyProgram( 1565): characters( " | ANOTHER FIRST LINE", 0, 20 )
D/MyProgram( 1565): characters( "n | ANOTHER SECOND LINE", 0, 1 )
...and so on.
Any idea what I'm doing wrong? Thanks in advance.
【Comments】:
- It looks like you are not respecting start and length.
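To illustrate what the comment above is getting at: SAX passes characters() a buffer that the parser may reuse, and only the slice from start to start + length belongs to the current call. That would explain leftovers such as "lineline" when the whole array is converted with new String(ch). Below is a minimal sketch of a handler that respects start and length and accumulates the chunks. The class name TagTextCollector and the helper textOf are hypothetical, the standard JAXP parser is assumed instead of the Android environment, and the target element is assumed to have no child elements.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TagTextCollector extends DefaultHandler {
    private final String wantedId;
    private final StringBuilder text = new StringBuilder();
    private boolean inside = false;

    TagTextCollector(String wantedId) {
        this.wantedId = wantedId;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attrs) {
        // Start collecting only inside the element with the wanted id.
        if (wantedId.equals(attrs.getValue("id"))) {
            inside = true;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Assumption: the target element has no children, so any end tag
        // seen while collecting closes the target itself.
        inside = false;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (inside) {
            // Copy only the valid slice; the rest of ch may hold stale data.
            text.append(ch, start, length);
        }
    }

    String getText() {
        return text.toString();
    }

    /** Hypothetical helper: parses xml and returns the text of the element with the given id. */
    static String textOf(String xml, String id) {
        try {
            TagTextCollector handler = new TagTextCollector(id);
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.getText();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For logging a single call, new String(ch, start, length) gives the same slice. The StringBuilder matters because the parser is free to deliver one text node across several characters() calls.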
Tags: java xml parsing character sax