【Question title】: How to use two or more delimiters with split() in Python
【Posted】: 2020-01-14 04:55:42
【Question】:

I have text in which the delimiter can be any character from the list [;,.?]

txt1 = "Kids of today have started selling drugs or taken drugs at this age, then we are finished as parent,what generation are we going to have when our generation is no more,am sick to my stomach, it means we do not have tomorrow leaders or future leader, drugs at this stage woowowow parent and Guidance's fasten your belt if not we will wake up someday to see what we never thought could happen"
txt2 = "There was a clear warning sign, and this person chose to take a risk regardless. It was quite a stupid decision to climb the fence, but even this is probably a common activity that generally never results in death. More of a freak accident than a definite way for someone to die. At the very most, the only changes that should be made by the airport / authorities would be to the fence design, making it more difficult for people to climb up. Barricading the area off completely and banning people from the area would be comparable to fencing off a scenic mountain path that hundreds of people like to climb and enjoy safely, but which does produce the occasional fatality when people slip. Just because this area carries a (clearly communicated) risk shouldn't be a reason for the authorities to step in and make adjustments. People take risks and are responsible for their own safety in areas like this. One fatality is a tiny drop in the bucket compared to the hundreds of people doing this each month without incident."

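How can I split a multi-sentence text into separate sentences based on which delimiter is present? For example, in txt1 the delimiter should be "," (comma), while in txt2 it should be "." (period).

I used re.split() for this but did not get the result I wanted. This is what I tried: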

 print(re.split(';|,|.|?',txt1))

【Question comments】:

Tags: python regex python-3.x split


【Solution 1】:

You have to put the escape character \ in front of . and ? because both are regex metacharacters:

print(re.split(r';|,|\.|\?', txt1))
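
To make the difference concrete, here is a minimal sketch comparing the unescaped pattern from the question with the escaped one. The `sample` string is a made-up input, not the asker's text.

```python
import re

sample = "one;two,three.four?five"  # hypothetical sample text

# Unescaped, `?` is a quantifier with nothing in front of it to repeat,
# so the pattern does not even compile.
try:
    re.split(';|,|.|?', sample)
    compiled = True
except re.error:
    compiled = False

# Escaped, `.` and `?` match only the literal characters.
parts = re.split(r';|,|\.|\?', sample)

print(compiled)  # False
print(parts)     # ['one', 'two', 'three', 'four', 'five']
```

The raw-string prefix `r` keeps Python from treating `\.` and `\?` as string escapes before the regex engine sees them.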

To drop the empty strings from the result, filter it with a list comprehension:

[x for x in re.split(r';|,|\.|\?', txt1) if x]
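
As a small sketch of what the filter does, the example below uses a shortened, made-up input that ends with a delimiter. It also strips surrounding whitespace, which goes one step beyond the answer above.

```python
import re

# Hypothetical shortened input; note the trailing period.
sample = "finished as parent,what generation, it means."

pieces = re.split(r';|,|\.|\?', sample)
# The trailing '.' leaves an empty string at the end of `pieces`.

# Strip whitespace and drop pieces that end up empty.
sentences = [p.strip() for p in pieces if p.strip()]

print(pieces)     # ['finished as parent', 'what generation', ' it means', '']
print(sentences)  # ['finished as parent', 'what generation', 'it means']
```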

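【Comments】:

【Solution 2】:

The dot and the question mark are both regex metacharacters, which means that when used unescaped they have a special meaning rather than their literal value. A quick fix is to split on a regex character class, inside which these characters are taken literally:

print(re.split('[;,.?]', txt1))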
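
A quick check, on a made-up `sample` string, that the character class behaves the same as the escaped alternation:

```python
import re

sample = "ab.cd?ef;gh,ij"  # hypothetical sample text

# Inside [...] the dot and question mark need no backslash.
by_class = re.split(r'[;,.?]', sample)

# Equivalent alternation, where both must be escaped.
by_alternation = re.split(r';|,|\.|\?', sample)

print(by_class)                    # ['ab', 'cd', 'ef', 'gh', 'ij']
print(by_class == by_alternation)  # True
```

The letters survive intact, which shows that `.` inside the class is not acting as "any character".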
    

【Comments】:

【Solution 3】:

Try this:

import re
DATA = "sample, text"
print(re.split(r'[;,.?]+', DATA))
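
The `+` after the character class is what distinguishes this pattern: a run of consecutive delimiters is treated as a single split point. A minimal sketch on a made-up string:

```python
import re

sample = "Wait... what?Really"  # hypothetical sample text

# Without `+`, each dot is its own split point, leaving empty strings between them.
without_plus = re.split(r'[;,.?]', sample)

# With `+`, the three dots count as one delimiter.
with_plus = re.split(r'[;,.?]+', sample)

print(without_plus)  # ['Wait', '', '', ' what', 'Really']
print(with_plus)     # ['Wait', ' what', 'Really']
```

An empty string can still appear at the start or end of the result if the text begins or ends with a delimiter.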
      

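【Comments】:

• Please always put your answer in context instead of just pasting code. See here for more details.
【Solution 4】:

If you have a list of delimiters, you can pass it to re.split() directly.

Build a string from the list in the form '[your delimiters]':

del_list = '[your delimiters]'  # placeholder for a character class such as '[;,.?]'
print(re.split('{0}'.format(del_list), txt1))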
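
One way to build that string from an actual Python list is sketched below. The `delimiters` list and the sample text are assumptions for illustration, and the use of `re.escape` is an addition not found in the answer above. It guards against delimiters such as `]` or `^` that would otherwise change the meaning of the character class.

```python
import re

delimiters = [';', ',', '.', '?']  # hypothetical list of single-character delimiters

# Escape each delimiter, then wrap them all in one character class.
pattern = '[' + ''.join(re.escape(d) for d in delimiters) + ']'

parts = re.split(pattern, "a;b,c.d?e")
print(parts)  # ['a', 'b', 'c', 'd', 'e']
```

This sketch only covers single-character delimiters. For multi-character delimiters, joining the escaped items with `|` instead of using a character class would be the usual alternative.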
      

【Comments】:
