【Question title】:Selecting an entire paragraph by just matching a string
【Posted】:2015-09-09 13:46:16
【Question】:

Suppose I have these two paragraphs after reading a file:

'Baa, baa, black sheep, have you any wool?
Yes sir, yes sir, three bags full!
One for the master
One for the dame'

'Mary had a little lamb,
its fleece was white as snow;
And everywhere that Mary went,
the lamb was sure to go.'

If I search for 'lamb', is there any code (using regex or something else) that will select the entire second paragraph?

【Comments】:

    Tags: python regex nltk text-mining


    【Solution 1】:

    Assuming all the paragraphs are in a single string, something like this should work:

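    def select_paragraph(text, word, delimiter='\n\n'):
        # Paragraphs are separated by a blank line, so split on '\n\n';
        # splitting on '\n' would return the matching lines, not paragraphs.
        return [p for p in text.split(delimiter) if word in p]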
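
A slightly more tolerant variation on the same idea, as a sketch only. It assumes paragraphs are separated by blank lines that may contain stray whitespace. The name `select_paragraphs` and the sample `text` are illustrative and not part of the answer above.

```python
import re

def select_paragraphs(text, word):
    """Return every blank-line-separated paragraph that contains `word`."""
    # Split on blank lines; a line holding only whitespace also counts as blank.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    return [p for p in paragraphs if word in p]

text = """Baa, baa, black sheep, have you any wool?
Yes sir, yes sir, three bags full!
One for the master
One for the dame

Mary had a little lamb,
its fleece was white as snow;
And everywhere that Mary went,
the lamb was sure to go."""

matches = select_paragraphs(text, "lamb")
print(matches[0])
```

The function returns a list, so a word that appears in several paragraphs yields all of them, and a word that appears nowhere yields an empty list.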
    

    【Discussion】:

      【Solution 2】:

      This selects a paragraph that contains lamb:

      ([^\']*(?=lamb)[^\']*)
      


      Here is the Python code:

      import re
      data = """
      'Baa, baa, black sheep, have you any wool?
      Yes sir, yes sir, three bags full!
      One for the master
      One for the dame'
      
      'Mary had a little lamb,
      its fleece was white as snow;
      And everywhere that Mary went,
      the lamb was sure to go.'
      """
      
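      # Find the first run of quote-free text that contains "lamb".
      match = re.search(r"([^']*(?=lamb)[^']*)", data)
      if match:  # re.search returns None when nothing matches
          print(match.group())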
      

      Output:

      Mary had a little lamb,
      its fleece was white as snow;
      And everywhere that Mary went,
      the lamb was sure to go.
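
The pattern above stops at the first hit and also matches longer words such as "lambda". A possible variation, offered as a sketch: it assumes each paragraph is wrapped in single quotes and contains no apostrophes of its own, and the helper name `find_quoted_paragraphs` is made up for illustration.

```python
import re

def find_quoted_paragraphs(data, word):
    """Return all single-quoted paragraphs containing `word` as a whole word."""
    # re.escape guards against regex metacharacters in the search word;
    # \b on both sides restricts the match to whole words.
    pattern = r"'([^']*\b" + re.escape(word) + r"\b[^']*)'"
    return re.findall(pattern, data)

data = """
'Baa, baa, black sheep, have you any wool?
Yes sir, yes sir, three bags full!
One for the master
One for the dame'

'Mary had a little lamb,
its fleece was white as snow;
And everywhere that Mary went,
the lamb was sure to go.'
"""

result = find_quoted_paragraphs(data, "lamb")
print(result[0])
```

Because `re.findall` returns the contents of the capturing group, the surrounding quotes are not included in the results.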
      

      【Discussion】:
