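【Posted】: 2022-01-28 05:07:32
【Question】:
This is a follow-up to my previous question (Perl XML::LibXML Getting info from specific nodes).
Given the XML data below, I can't figure out how to get the data that appears after the <tab/> tag, which has no closing tag, without also pulling in all the data from the child nodes of that section. See below for more details:
Sample XML: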
<title number="3">
<catchline>Uniform Agricultural Cooperative Association Act</catchline>
<chapter number="3-1">
<catchline>
General Provisions Relating to Agricultural Cooperative Associations
</catchline>
<section number="3-1-1">
<histories>
<history>
Amended by Chapter
<modchap sess="2010GS">378</modchap>
, 2010 General Session
</history>
<modyear>2010</modyear>
</histories>
<catchline>Declaration of policy.</catchline>
<tab/>
It is the declared policy of this state, as one means of improving the economic position of agriculture, to encourage the organization of producers of agricultural products into effective associations under the control of such producers, and to that end this act shall be liberally construed. THIS IS THE DATA THAT I WANT TO GET
</section>
<section number="3-1-1.1">
<histories>
<history>
Amended by Chapter
<modchap sess="1996GS">79</modchap>
, 1996 General Session
</history>
<modyear>1996</modyear>
</histories>
<catchline>General corporation laws do not apply.</catchline>
<tab/>
<xref depth="1" refnumber="16-10a" start="0">
Title 16, Chapter 10a, Utah Revised Business Corporation Act
</xref>
, does not apply to domestic or foreign corporations governed by this chapter, except as specifically provided in Sections
<xref depth="3" refnumber="3-1-13.4" start="0">3-1-13.4</xref>
,
<xref depth="3" refnumber="3-1-13.7" start="0">3-1-13.7</xref>
, and
<xref depth="3" refnumber="3-1-16.1" start="0">3-1-16.1</xref>
.
</section>
</chapter>
</title>
Here is my current Perl script:
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

my $dom = XML::LibXML->load_xml(location => "file.xml");
my $titleName = $dom->findvalue('/title/catchline');
print "Title $titleName\n";

my @chapters = $dom->findnodes('/title/chapter');
for my $chapter (@chapters) {
    my $chapterNo   = $chapter->getAttribute('number');
    my $chapterName = $chapter->findvalue('catchline');
    print " Chapter #$chapterNo - $chapterName\n";

    my @sections = $chapter->findnodes('section');
    for my $section (@sections) {
        my $sectionNo   = $section->getAttribute('number');
        my $sectionName = $section->findvalue('catchline');
        # textContent concatenates the text of every descendant of <section>
        my $sectionData = $section->textContent;
        print " Section #$sectionNo - $sectionName\nSECDATA: $sectionData\n\n";
    }
}
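This works, but what happens is probably exactly what I asked for: it prints everything inside <section> for the $sectionData variable.
What I want is to get only the data that follows the <tab/> tag, without anything that sits inside other tags, such as the child tags <histories>, <history>, <xref>, and so on.
For example, the string:
, does not apply to domestic or foreign corporations governed by this chapter, except as specifically provided in Sections
is not contained in any specific tag. How do I get this data?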
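A minimal sketch of why that string is still reachable: text that is not wrapped in its own element is stored as a text node, which is a direct child of <section> just like <catchline> and <tab/>. The inline XML below is a shortened, hypothetical stand-in for the sample data, and `@child_names` is a name introduced only for this illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Shortened stand-in for one <section> of the sample data (an assumption,
# inlined so the sketch runs without file.xml).
my $xml = '<section number="3-1-1"><catchline>Declaration of policy.</catchline>'
        . '<tab/>It is the declared policy of this state.</section>';

my $dom = XML::LibXML->load_xml(string => $xml);
my ($section) = $dom->findnodes('/section');

# List the direct children of <section>; XML::LibXML reports text nodes
# under the name "#text".
my @child_names = map { $_->nodeName } $section->childNodes;
print "$_\n" for @child_names;
```

The bare text shows up as its own child after `tab`, so it can be selected separately from the element children.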
The current output is:
Title Uniform Agricultural Cooperative Association Act
Chapter #3-1 -
General Provisions Relating to Agricultural Cooperative Associations
Section #3-1-1 - Declaration of policy.
SECDATA:
Amended by Chapter
378
, 2010 General Session
2010
Declaration of policy.
It is the declared policy of this state, as one means of improving the economic position of agriculture, to encourage the organization of producers of agricultural products into effective associations under the control of such producers, and to that end this act shall be liberally construed.
Section #3-1-1.1 - General corporation laws do not apply.
SECDATA:
Amended by Chapter
79
, 1996 General Session
1996
General corporation laws do not apply.
Title 16, Chapter 10a, Utah Revised Business Corporation Act
, does not apply to domestic or foreign corporations governed by this chapter, except as specifically provided in Sections
3-1-13.4
,
3-1-13.7
, and
3-1-16.1
.
But what I'm looking for is more like:
Title Uniform Agricultural Cooperative Association Act
Chapter #3-1 -
General Provisions Relating to Agricultural Cooperative Associations
Section #3-1-1 - Declaration of policy.
SECDATA:
It is the declared policy of this state, as one means of improving the economic position of agriculture, to encourage the organization of producers of agricultural products into effective associations under the control of such producers, and to that end this act shall be liberally construed.
Section #3-1-1.1 - General corporation laws do not apply.
SECDATA:
, does not apply to domestic or foreign corporations governed by this chapter, except as specifically provided in Sections
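One possible way to get close to this output, offered as a sketch rather than a definitive answer, is to select only the text nodes that are siblings of <tab/> with the XPath expression `tab/following-sibling::text()`. The helper name `text_after_tab`, the `%section_data` hash, and the trimmed-down inline XML are assumptions made for this example; the real script would keep loading file.xml.

```perl
use strict;
use warnings;
use XML::LibXML;

# Trimmed-down copy of the sample data, inlined so the sketch is self-contained.
my $xml = <<'XML';
<chapter number="3-1">
<section number="3-1-1">
<histories><history>Amended by Chapter <modchap sess="2010GS">378</modchap>, 2010 General Session</history></histories>
<catchline>Declaration of policy.</catchline>
<tab/>
It is the declared policy of this state.
</section>
<section number="3-1-1.1">
<catchline>General corporation laws do not apply.</catchline>
<tab/>
<xref depth="1" refnumber="16-10a" start="0">Title 16, Chapter 10a</xref>
, does not apply to domestic or foreign corporations, except as provided in Sections
<xref depth="3" refnumber="3-1-13.4" start="0">3-1-13.4</xref>
.
</section>
</chapter>
XML

my $dom = XML::LibXML->load_xml(string => $xml);

# Hypothetical helper: returns the non-blank bare text fragments that
# follow <tab/> directly inside the given <section>.
sub text_after_tab {
    my ($section) = @_;
    my @pieces;
    # text() matches only text nodes that are siblings of <tab/>, so text
    # nested inside <xref>, <histories>, <catchline>, etc. is skipped.
    for my $node ($section->findnodes('tab/following-sibling::text()')) {
        my $text = $node->data;
        $text =~ s/^\s+|\s+$//g;             # trim surrounding whitespace
        push @pieces, $text if length $text; # drop whitespace-only nodes
    }
    return @pieces;
}

my %section_data;
for my $section ($dom->findnodes('/chapter/section')) {
    my $no = $section->getAttribute('number');
    $section_data{$no} = [ text_after_tab($section) ];
    my $first = $section_data{$no}[0] // '';
    print "Section #$no\nSECDATA: $first\n\n";
}
```

The sketch prints only the first fragment of each section, which matches the desired output above. The remaining fragments (for example the trailing "." in section 3-1-1.1) stay available in `$section_data{$no}` and could be joined if the punctuation between the <xref> elements is wanted as well.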
【Discussion】:
Tags: xml perl xml-libxml