How to remove empty lines with or without whitespace in Python (answers)

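【Question title】: How to remove empty lines with or without whitespace in Python
【Posted】: 2011-04-12 07:54:33
【Question description】:

I have a large string that I split on newlines. How can I remove all the blank lines (those that are empty or contain only whitespace)?

Pseudocode: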

for stuff in largestring:
   remove stuff that is blank

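【Question comments】:

Personally, I found the answer here to be the best solution.
A one-liner that removes empty lines (ones with no whitespace) is this. The question title could perhaps be changed to "Remove empty lines containing only whitespace in Python".

Tags: python string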

【Solution 1】:

str_with_space = """
    example line 1

    example line 2
    example line 3

    example line 4"""

# Keep only lines that are non-empty after stripping, and strip each kept line.
new_str = '\n'.join(el.strip() for el in str_with_space.split('\n') if el.strip())
print(new_str)

Output:

example line 1
example line 2
example line 3
example line 4

【Comments】:

【Solution 2】:

Using a regular expression:

import re
re.sub(r'(?<=\n)\s+', '', s, flags=re.MULTILINE)

When the input is:

foo
<tab> <tab>

bar

The output will be:

foo
bar
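
One thing to watch for when passing a flag to re.sub: its signature is re.sub(pattern, repl, string, count=0, flags=0), so a flag given as the fourth positional argument is interpreted as count rather than flags. The sketch below uses a made-up sample string to illustrate the effect; passing the flag by keyword avoids it.

```python
import re

# re.MULTILINE is an integer flag; its numeric value is 8.
multiline_as_int = int(re.MULTILINE)

sample = "x" * 10  # hypothetical input containing ten matches

# count=8 is what re.sub effectively receives when re.MULTILINE is
# passed as the fourth positional argument: at most 8 replacements.
limited = re.sub(r"x", "", sample, count=multiline_as_int)

# Passing the flag by keyword leaves count at its default (replace all).
unlimited = re.sub(r"x", "", sample, flags=re.MULTILINE)

print(multiline_as_int, repr(limited), repr(unlimited))
```

For short inputs with fewer than eight matches the mistake goes unnoticed, which is probably why it is easy to overlook.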

【Comments】:

【Solution 3】:

I use this solution to remove empty lines and merge everything into one line:

match_p = re.sub(r'\s{2}', '', my_txt) # my_txt is text above
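
To see what this substitution does, here is a small sketch with a made-up my_txt. The pattern \s{2} removes whitespace two characters at a time (two newlines, two spaces, a newline followed by a space, and so on), so it also deletes double spaces inside a line and leaves one whitespace character behind wherever a run has odd length.

```python
import re

my_txt = "line 1\n\n  line 2\n"  # hypothetical sample text

# "\n\n" and the two leading spaces are each removed as a pair;
# the single trailing "\n" is not matched and stays.
match_p = re.sub(r"\s{2}", "", my_txt)
print(repr(match_p))  # 'line 1line 2\n'
```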

【Comments】:

【Solution 4】:

lines = bigstring.split('\n')
lines = [line for line in lines if line.strip()]

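【Comments】:

This works for lines = ['Line\n', '\n', 'Line\n'], but the input is 'Line\n\nLine\n'.
@Walter: Actually, if you use 'Line\n\nLine\n'.split() the way you are supposed to, it works just fine.
Works for me with bigstring.split('\n').
This is not at all what the OP asked for. Try it with "a b c": it returns "a\nb\nc".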
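
The disagreement in these comments comes down to which splitting method is used. A short sketch of how split(), split('\n') and splitlines() differ on the inputs mentioned above (the name non_blank is introduced here for illustration):

```python
text = "Line\n\nLine\n"

# split() with no argument splits on any run of whitespace and drops
# empty strings, so blank lines vanish, but lines are also broken at spaces.
print(text.split())       # ['Line', 'Line']
print("a b c".split())    # ['a', 'b', 'c']

# split('\n') keeps empty strings, including one after the final newline.
print(text.split("\n"))   # ['Line', '', 'Line', '']

# splitlines() also keeps blank lines but adds no trailing empty string.
print(text.splitlines())  # ['Line', '', 'Line']

# Filtering on strip() then removes empty and whitespace-only lines.
non_blank = [line for line in text.splitlines() if line.strip()]
print(non_blank)          # ['Line', 'Line']
```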

【Solution 5】:

If you are not willing to try regular expressions (which you should), you can use this:

s.replace('\n\n','\n')

Repeat this a few times to make sure no empty lines remain. Or chain the calls:

s.replace('\n\n','\n').replace('\n\n','\n')
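
Rather than guessing how many repetitions are enough, the replacement can be looped until nothing is left to replace. A minimal sketch (the helper name collapse_newlines is made up for illustration); note that it only handles truly empty lines, not lines containing spaces or tabs:

```python
def collapse_newlines(s):
    """Replace runs of consecutive newlines with a single newline."""
    # Each pass roughly halves the length of a run, so keep going
    # until no double newline is left.
    while "\n\n" in s:
        s = s.replace("\n\n", "\n")
    return s

print(repr(collapse_newlines("a\n\n\n\n\nb\n\nc")))  # 'a\nb\nc'
print(repr(collapse_newlines("a\n \nb")))            # 'a\n \nb' (unchanged)
```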

Just to encourage you to use regular expressions, here are two introductory videos that I find intuitive:

• Regular Expressions (Regex) Tutorial

• Python Tutorial: re Module

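【Comments】:

For this, you would probably want to use a regex. When writing code, "repeat a few times to make sure" is not a good idea, because you may leave the problem unsolved or waste time running more passes than needed.
+1 for regex, but as a lazy hack (or if importing the regex module is too slow), you can chain replace calls: s.replace('\n\n','\n').replace('\n\n','\n'). Tested on 3.6.
@evan_b I hadn't thought of chaining the calls. Which one gets executed first?
The execution order appears to be left to right, but after a brief search I could not find this documented anywhere, so relying on order-sensitive replacements may not be safe.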
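
On the ordering question: a chained call such as s.replace(...).replace(...) is necessarily evaluated left to right, because the second replace is a method call on the string returned by the first. A small sketch with made-up inputs where the order is visible:

```python
s = "ab"

# First a -> b gives "bb", then b -> c gives "cc".
left_to_right = s.replace("a", "b").replace("b", "c")

# Swapping the two calls gives a different result:
# first b -> c gives "ac", then a -> b gives "bc".
swapped = s.replace("b", "c").replace("a", "b")

print(left_to_right, swapped)  # cc bc
```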

【Solution 6】:

You can simply use rstrip:

    for stuff in largestring:
        print(stuff.rstrip("\n"))
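
Note that iterating directly over a string yields single characters, not lines, so the loop above only makes sense if largestring is already a list of lines (or a file object). Assuming largestring is one big string, as in the question, a sketch that splits it first and skips blank lines could look like this (the sample value and the name kept are made up):

```python
largestring = "first\n\n   \nsecond\n"  # hypothetical sample input

# Iterating over the string itself gives characters.
print(list("ab\n"))  # ['a', 'b', '\n']

# splitlines() gives lines; strip() returns '' (falsy) for blank ones.
kept = []
for stuff in largestring.splitlines():
    if stuff.strip():
        kept.append(stuff)

print(kept)  # ['first', 'second']
```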

【Comments】:

【Solution 7】:

Surprised that a multiline re.sub has not been suggested (oh, because you have already split your string... but why?):

>>> import re
>>> a = "Foo\n \nBar\nBaz\n\n   Garply\n  \n"
>>> print(a)
Foo
 
Bar
Baz

   Garply
  

>>> print(re.sub(r'\n\s*\n', '\n', a, flags=re.MULTILINE))
Foo
Bar
Baz
   Garply

>>>
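
A variant in which re.MULTILINE actually changes the matching is to anchor on line starts with ^. The sketch below is an alternative to the pattern above, not the answer's own code: it deletes each line consisting only of spaces or tabs, including blank lines at the very start of the string. A whitespace-only last line with no trailing newline is left alone.

```python
import re

a = "\nFoo\n \nBar\nBaz\n\n   Garply\n  \n"

# With re.MULTILINE, ^ matches at the start of every line.
# [ \t]* is used instead of \s* so a match never runs across a newline.
cleaned = re.sub(r"^[ \t]*\n", "", a, flags=re.MULTILINE)
print(repr(cleaned))  # 'Foo\nBar\nBaz\n   Garply\n'
```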

【Comments】:

With a multiline sub, \s* will match any number of \n as well as any other whitespace:

>>> import re
>>> a = "foo\n \n\t\n \nbar\n\n \n baz"
>>> print(re.sub(r'\n\s*\n', '\n', a, flags=re.MULTILINE))
foo
bar
 baz

grumble. I obviously can't figure out the markdown in comments.

【Solution 8】:

Same as what @NullUserException said; this is how I wrote it:

removedWhitespace = re.sub(r'^\s*$', '', line)

【Comments】:

【Solution 9】:

My version:

while '' in all_lines:
    all_lines.pop(all_lines.index(''))
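
The loop above removes only entries that are exactly '', and every pass rescans the list from the start. A sketch of an alternative using a list comprehension, which also drops whitespace-only entries (the sample data is made up):

```python
all_lines = ["a", "", "  ", "b", "\t", "c", ""]

# Keep a line only if something remains after stripping whitespace.
all_lines = [line for line in all_lines if line.strip()]
print(all_lines)  # ['a', 'b', 'c']
```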

【Comments】:

【Solution 10】:

I also tried the regex and list solutions, and the list one is faster.

Here is my solution (based on the previous answers):

text = "\n".join([ll.rstrip() for ll in original_text.splitlines() if ll.strip()])
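
Using rstrip() rather than strip() on the kept lines preserves leading indentation while still trimming trailing whitespace. A small sketch with a made-up original_text:

```python
original_text = "def f():\n    \n    return 1   \n\n"

text = "\n".join(
    [ll.rstrip() for ll in original_text.splitlines() if ll.strip()]
)
print(repr(text))  # 'def f():\n    return 1'
```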

【Comments】:

【Solution 11】:

Using a regular expression:

if re.match(r'^\s*$', line):
    # line is empty (has only the following: \t\n\r and whitespace)
    pass

Using a regular expression + filter():

filtered = filter(lambda x: not re.match(r'^\s*$', x), original)

As seen on codepad.
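
In Python 3, filter() returns a lazy iterator rather than a list, so wrap it in list() if a list is needed. A sketch assuming original is a list of lines; the name blank_line and the sample data are introduced here for illustration:

```python
import re

blank_line = re.compile(r"^\s*$")  # compiled once, reused for every line

original = ["foo", "", " \t", "bar"]  # hypothetical list of lines

filtered = list(filter(lambda x: not blank_line.match(x), original))
print(filtered)  # ['foo', 'bar']
```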

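【Comments】:

Thanks for all the answers; this solution, however, is exactly what I was looking for! Thanks a lot.
gimel's solution, with the text re-joined afterwards, performs much better. I compared the two solutions on a small text (10 lines, 3 of which were blank). The results: regex: 1000 loops, best of 3: 452 us per loop; join, split and strip: 100000 loops, best of 3: 5.41 us per loop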
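
A comparison like this can be reproduced with the standard timeit module. The sketch below uses a made-up 10-line text with 3 blank lines and hypothetical helper names (with_regex, with_strip); it is not the commenter's benchmark, and absolute timings depend on the machine and Python version, so none are shown.

```python
import re
import timeit

# 10 lines, 3 of them blank (made-up sample).
lines = ["line %d" % i for i in range(7)]
lines[2:2] = ["", "   ", "\t"]
text = "\n".join(lines)

def with_regex(s):
    # Filter line by line with a regular expression.
    return "\n".join(x for x in s.split("\n") if not re.match(r"^\s*$", x))

def with_strip(s):
    # Filter line by line with str.strip().
    return "\n".join(x for x in s.split("\n") if x.strip())

# Both approaches should produce the same text.
assert with_regex(text) == with_strip(text)

for func in (with_regex, with_strip):
    seconds = timeit.timeit(lambda: func(text), number=10000)
    print(func.__name__, seconds)
```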
Introductory video: Python Tutorial: re Module - How to Write and Match Regular Expressions (Regex) - YouTube

【Solution 12】:

Try a list comprehension and string.strip():

>>> mystr = "L1\nL2\n\nL3\nL4\n  \n\nL5"
>>> mystr.split('\n')
['L1', 'L2', '', 'L3', 'L4', '  ', '', 'L5']
>>> [line for line in mystr.split('\n') if line.strip() != '']
['L1', 'L2', 'L3', 'L4', 'L5']

【Comments】:

You can shorten it by omitting != '' and writing just "if line.strip()".