【Question Title】:Append strings to array if found in paragraph using `.match` in Ruby
【Posted】:2016-12-18 22:30:06
【Question】:

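I am trying to search a paragraph for each word in an array, and then output a new array containing only the words that were found.

But so far I have been unable to get the desired output format.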

paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands.
The four largest are Honshu, Hokkaido, Kyushu and Shikoku, which make up about ninety-seven percent of Japan's land area.
The country is divided into 47 prefectures in eight regions."

words_to_find = %w[ Japan archipelago fishing country ]

words_found = []

words_to_find.each do |w|
    paragraph.match(/#{w}/) ? words_found << w : nil
end

puts words_found

Currently the output I get is a vertical list of the words, one per line.

Japan
archipelago
country

But I want ['Japan', 'archipelago', 'country'].

I don't have much experience matching text in paragraphs and am not sure what I'm doing wrong here. Can anyone give some guidance?

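【Comments on the question】:

  • words_found is already what you want. puts prints one element per line.
  • Ah, thank you. I need to read up on puts and p.
  • P.S. You can use p words_found to see what it really is.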
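
The difference the comments describe can be seen in a minimal sketch. The array is hard-coded here for illustration, and `as_literal` is a name introduced only for this example. Note that Ruby's inspect form uses double quotes rather than the single quotes shown in the question.

```ruby
words_found = %w[Japan archipelago country]

# puts writes each element of an array on its own line.
puts words_found

# p writes the inspect form, which looks like an array literal.
p words_found

# inspect returns that same representation as a String.
as_literal = words_found.inspect
puts as_literal
```

So the array itself is fine; only the choice of printing method changes how it is displayed.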

Tags: arrays ruby regex


【Solution 1】:

Here are a couple of ways to do that. Both are case-insensitive.

Use a regular expression

# Interpolate the union's source inside a non-capturing group. Interpolating the
# Regexp itself would embed (?-mix:...), which switches the i flag off for the
# alternation.
r = /
    \b                                          # Match a word break
    (?:#{ Regexp.union(words_to_find).source }) # Match any word in words_to_find
    \b                                          # Match a word break
    /xi                                         # Free-spacing regex definition mode (x)
                                                # and case-indifferent (i)
  #=> equivalent to /\b(?:Japan|archipelago|fishing|country)\b/i

paragraph.scan(r).uniq
  #=> ["Japan", "archipelago", "country"]
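
To see what the word breaks (`\b`) contribute, here is a small sketch. It assumes a shortened, hypothetical paragraph in which "countryside" replaces "country", and it uses `select` with `match?` (available since Ruby 2.4) rather than the `scan` call above; `loose` and `strict` are names introduced for illustration.

```ruby
# Hypothetical paragraph: "countryside" contains "country" as a substring.
paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands. " \
            "The countryside is divided into 47 prefectures."
words_to_find = %w[Japan archipelago fishing country]

# Without \b, "country" matches inside "countryside".
loose = words_to_find.select { |w| paragraph.match?(/#{Regexp.escape(w)}/i) }

# With \b on both sides, only whole words count.
strict = words_to_find.select { |w| paragraph.match?(/\b#{Regexp.escape(w)}\b/i) }

p loose   #=> ["Japan", "archipelago", "country"]
p strict  #=> ["Japan", "archipelago"]
```

`Regexp.escape` keeps any regex metacharacters in a search word from being interpreted as pattern syntax.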

Intersect two arrays

words_to_find_hash = words_to_find.each_with_object({}) { |w,h| h[w.downcase] = w }
  #=> {"japan"=>"Japan", "archipelago"=>"archipelago", "fishing"=>"fishing",
  #    "country"=>"country"}

words_to_find_hash.values_at(*paragraph.delete(".;:,?'").
                               downcase.
                               split.
                               uniq & words_to_find_hash.keys)
  #=> ["Japan", "archipelago", "country"] 
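
A variation on the same idea, sketched under the assumption that the words contain only ASCII letters: extract word tokens with `scan` instead of deleting a fixed list of punctuation characters, then keep the search words whose lowercase form appears among the tokens. The shortened paragraph and the name `tokens` are introduced here for illustration.

```ruby
paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands. " \
            "The country is divided into 47 prefectures in eight regions."
words_to_find = %w[Japan archipelago fishing country]

# Collect the distinct lowercase word tokens from the paragraph.
tokens = paragraph.downcase.scan(/[a-z]+/).uniq

# Keep the words whose lowercase form is one of the tokens.
words_found = words_to_find.select { |w| tokens.include?(w.downcase) }
p words_found  #=> ["Japan", "archipelago", "country"]
```

Unlike the intersection above, this keeps the results in the order of words_to_find rather than the order in which the words appear in the paragraph.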

【Comments】:

    【Solution 2】:

    That is because you are using puts to print the array. puts writes each element on its own line, appending "\n" to the end of every word:

    #!/usr/bin/env ruby
    def run_me
        paragraph = "Japan is a stratovolcanic archipelago of 6,852 islands.
        the four largest are Honshu, Hokkaido, Kyushu and Shikoku, which make up about ninety-seven percent of Japan's land area.
        the country is divided into 47 prefectures in eight regions."

        words_to_find = %w[ Japan archipelago fishing country ]

        find_words_from_a_text_file paragraph, words_to_find
    end

    # words_to_find is taken as a plain Array (no splat), so each w is a String.
    def find_words_from_a_text_file(paragraph, words_to_find)
        words_found = []

        words_to_find.each do |w|
            words_found << w if paragraph.match(/#{w}/)
        end

        # print each element on its own line with each and puts
        words_found.each { |x| puts "with enum and puts : : #{x}" }

        # or use print, which does not add a newline
        print "with print :"; print words_found, "\n"

        # or with p
        p words_found
    end

    run_me

    Output:

    za:ruby_dir za$ ./fooscript.rb
    with enum and puts : : Japan
    with enum and puts : : archipelago
    with enum and puts : : country
    with print :["Japan", "archipelago", "country"]
    ["Japan", "archipelago", "country"]
    

    【Comments】:
