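[0] First, an analysis of the book's source code shows that the whole problem rests on the following assumptions: level numbers start at 0, they increase one step at a time (a line is never more than one level deeper than the line before it), and the input contains no errors.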

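Because of this assumption, the LEVEL value does not have to be stored as a node attribute: it simply reflects the node's position in the stack, so a stack structure easily takes care of the trouble LEVEL would otherwise cause.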
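The idea that a line's level is just its depth in a stack can be sketched in a few lines. This is a minimal illustration only, not the book's code: the sample input and the helper name `parents_by_stack` are made up for this example, and it assumes well-formed input as stated above. Unlike the listings in this post, the stack here holds no root element, so the pop condition is `stack.size > level` rather than `level + 1 < stack.size`.

```ruby
# Hypothetical sample input; real GEDCOM files carry many more tags.
SAMPLE = <<~GEDCOM
  0 @I1@ INDI
  1 NAME Jamis Gordon /Buck/
  2 SURN Buck
  1 SEX M
  0 TRLR
GEDCOM

# Returns [tag, parent_tag] pairs; parent_tag is nil for level-0 lines.
def parents_by_stack(text)
  stack = []   # stack[i] is the most recent tag seen at level i
  pairs = []
  text.each_line do |line|
    level, tag = line.split(" ", 3)
    level = level.to_i
    # Drop every entry at this level or deeper; what remains on top is the parent.
    stack.pop while stack.size > level
    pairs << [tag, stack.last]
    stack.push(tag)
  end
  pairs
end

parents_by_stack(SAMPLE).each { |tag, parent| puts "#{tag} -> #{parent.inspect}" }
```

The level number is read only to decide how many entries to pop; it is never stored on a node.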

 

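[1] The first idea is to use REXML: parse each line, store the result in an XML structure, and write everything out at the end. The source code is the same as in the book.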

require 'rexml/document'

doc = REXML::Document.new '<gedcom/>'
stack = [doc.root]  # stack[i + 1] holds the most recent element at level i
IO.read(ARGV[0]).each_line do |line|
  next if line =~ /^\s*$/  # skip blank lines
  line =~ /^\s*(\d+)\s+(@\S+@|\S+)\s*(.*?)$/ or raise "Invalid GEDCOM"
  level, tag, data = $1.to_i, $2, $3
  # Pop until the top of the stack is this line's parent
  stack.pop while (level + 1 < stack.size)
  parent = stack.last
  if tag =~ /^@(\S+)@$/
    # "@ID@ TYPE" line: the element is named after TYPE and carries the id
    ele = parent.add_element data
    ele.attributes['id'] = tag
  else
    # "TAG data" line: the element is named after TAG, with data as its text
    ele = parent.add_element tag
    ele.text = data
  end
  stack.push ele
end
File.open("output_std.txt", "w") do |file|
  doc.write(file, 0)
end

 

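[2] When the data set is very large, it is not feasible to hold everything in memory as an XML structure and write it out in one go. So fight and run: emit the output while parsing.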

 


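Wrap the node in a class: the second parsing step (deciding whether a line carries a tag or an id) is handed to the class constructor, and producing the output is handed to the class as well.

The first part of a node's output (the opening tag and any text) is printed when the node is pushed onto the stack, and the second part (the closing tag) when it is popped.

From there it is natural to wrap the stack in a class too.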

 

require 'rexml/text'
# Streaming version: print tags while pushing and popping instead of building a tree
class Node
  def initialize(tag_or_id, data = "")
    if tag_or_id =~ /@.*@/
      @name, @myid, @value = data, tag_or_id, ""
    else
      @name, @value, @myid = tag_or_id, data, ""
    end
  end
  # Opening tag plus any text, printed when the node is pushed
  def to_s_first
    s = @myid.empty? ? "<#{@name}>\n" : "<#{@name} id='#{@myid}'>\n"
    s += (@value + "\n") unless @value.empty?
    s
  end
  # Closing tag, printed when the node is popped
  def to_s_last
    "</#{@name}>\n"
  end
end
# A stack that prints a node's opening tag on push and its closing tag on pop
class Stack < Array
  def push(obj)
    raise "type error" unless obj.is_a? Node
    print obj.to_s_first
    super(obj)
  end

  def pop
    print self.last.to_s_last
    super
  end
end
# Temporarily redirect $stdout to the given file while the block runs
def file_write_env(file)
  $stdout = file
  yield
ensure
  $stdout = STDOUT  # restore even if the block raises
end

stack = Stack.new
File.open("output.txt", "w") do |file|
  file_write_env(file) do
    stack.push(Node.new("gedcom"))
    File.foreach($*[0]) do |line|  # read the input one line at a time
      next if line =~ /^\s*$/
      line =~ /^\s*(\d+)\s+(@\S+@|\S+)\s*(.*?)$/ or raise "error"
      level, tag_or_id, data = $1.to_i, $2, REXML::Text::normalize($3)
      stack.pop while (level + 1 < stack.size)
      stack.push Node.new(tag_or_id, data)
    end
    stack.pop until stack.empty?  # close every element still open, including the root
  end
end

Line 50 calls REXML::Text::normalize to escape the string, so that special characters are escaped in roughly the same way as in REXML's own XML output.
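To see what that call does in isolation, here is a small sketch. The sample string is invented for illustration; the only library call used is `REXML::Text.normalize` with its default arguments, the same call the listing makes.

```ruby
require 'rexml/text'

# Hypothetical data field containing characters that are special in XML.
raw = 'Tom & Jerry <cat> "quoted"'

# With no doctype given, normalize replaces the ampersand and the other
# predefined XML characters with their entity references.
escaped = REXML::Text.normalize(raw)

puts raw
puts escaped
```

Without this step, a data field containing `<` or `&` would be printed as is by `Node#to_s_first` and would produce output that is not well-formed XML.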

 

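[PS] One thing I do not fully understand: the output produced by building the document with REXML differs from the output of the last approach in how whitespace is handled. REXML writes only a single space where the input has several in a row, while the last two approaches stay faithful to the input.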
