【问题标题】:Regular expression to remove two line breaks conditionally有条件地删除两个换行符的正则表达式
【发布时间】:2022-01-15 09:49:30
【问题描述】:

我正在尝试删除文本文件中每行末尾的两个换行符,但前提是它遵循小写字母,即 [a-z]。 我有以下代码:

import re
text = "I want this \n\nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."
texts = re.sub(r"(?<=[a-z])\r?\n\n"," ", text)
print(texts)

我要获取的文字

text = "I want this \nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."

在我的代码中,文本没有任何变化。我的代码遵循这个已回答的问题:Regular expression to remove line breaks,但我无法使其适应两个换行符。

【问题讨论】:

  • “如果它跟在小写字母后面”是指它跟在小写字母后面吗?恰恰相反

标签: python re


【解决方案1】:

看来你需要一个积极的前瞻,而不是后瞻,在之后有一个小写字符

import re

expected = "I want this \nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."
text     = "I want this \n\nwith one line break \n\nIn this place \nNothing happens \n\nHere\nnothing should\nhappen too."
texts = re.sub(r"\n\n(?=[a-z])", "\n", text)

print(texts == expected)

【讨论】:

  • OP 确实在后面看了。你提出了一个前瞻(这是正确的,但你的命名是错误的)
猜你喜欢
  • 2011-07-01
  • 1970-01-01
  • 2020-08-24
  • 2019-02-13
  • 1970-01-01
  • 2016-08-23
  • 1970-01-01
  • 1970-01-01
  • 1970-01-01
相关资源
最近更新 更多