Python: combine lines without blank new lines (answers)

【Question title】: Python combine lines without blank new lines
【Posted】: 2020-04-21 07:23:58
【Question】:

I need your help with the following problem. For example, I have some large text files like this:

This is the Name of the Person

This is his surname

He likes to sing 
every time.

I only want to merge "He likes to sing" with "every time.", because afterwards I run other regular expressions on each string.

So the output should be:

This is the Name of the Person

This is his surname

He likes to sing every time.

So I tried this:

for file in file_list:
    with open(file, 'r', encoding='UTF-8', errors='ignore') as f_in:
        for line in f_in:
              if not line.startswith('\n'):
                line.replace('\n', '')
                print(line)

Thanks for your help.

【Comments】:

print() appends a newline at the end by default. Try print(line, end="")
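
Besides the extra newline from print(), the attempt in the question discards the result of line.replace(...): str.replace returns a new string and never modifies line in place. A minimal sketch of the same line-by-line idea with that result kept is shown here. The helper name merge_wrapped_lines is hypothetical, and io.StringIO stands in for a real file so the example is self-contained.

```python
import io


def merge_wrapped_lines(f_in):
    """Join consecutive non-blank lines; keep blank lines as separators.

    `merge_wrapped_lines` is a hypothetical helper name used for illustration.
    """
    out = []
    for line in f_in:
        if line.strip() == "":
            # Blank line: end the current paragraph and add the empty separator line.
            out.append("\n\n")
        else:
            # str.replace returns a new string, so the result must be kept.
            out.append(line.replace("\n", ""))
    return "".join(out)


sample = io.StringIO(
    "This is the Name of the Person\n\n"
    "This is his surname\n\n"
    "He likes to sing \nevery time.\n"
)
print(merge_wrapped_lines(sample))
```

This relies on the wrapped line ending in a space ("He likes to sing "), as it does in the sample from the question.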

Tags: python newline

【Solution 1】:

You can try this:

for file in file_list:
    with open(file, 'r', encoding='UTF-8', errors='ignore') as f_in:
        lines = [i.replace('\n', ' ') for i in f_in.read().split('\n\n')]

    # here you do something with your `lines`
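
Splitting on the literal '\n\n' assumes exactly one empty line between paragraphs and no stray whitespace on that line. As a hedged variant, the sketch below splits with a regular expression instead. The function name split_paragraphs is hypothetical, and the sketch assumes the text was read in Python's default text mode, so line endings are already '\n'.

```python
import re


def split_paragraphs(text):
    """Return one string per paragraph, with inner line breaks collapsed.

    `split_paragraphs` is a hypothetical name used for illustration.
    """
    # A paragraph break is a newline, optional whitespace, then another newline.
    blocks = re.split(r"\n\s*\n", text.strip())
    # str.split() without arguments splits on any run of whitespace, so joining
    # with a single space turns "sing \nevery" into "sing every".
    return [" ".join(block.split()) for block in blocks]
```

Each element of the returned list can then be processed on its own, in the same way as `lines` above.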


【Solution 2】:

I think it would be better to do it this way:

for file_name in file_list:
    with open(file_name, "r", encoding="UTF-8", errors="ignore") as file:
        text = file.read()
        text_blocks = text.split("\n\n")
        for text_block in text_blocks:
            formatted_text_block = text_block.replace("\n", "")
            # then you can do whatever you want with this new block of text
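
The question mentions running regular expressions on each string afterwards, which is why the merge matters: a pattern that spans the old line break only matches once the block is joined. A small sketch of that follow-up step is given here. The pattern and the function name find_matching_blocks are hypothetical and chosen only for illustration.

```python
import re

text = (
    "This is the Name of the Person\n\n"
    "This is his surname\n\n"
    "He likes to sing \nevery time.\n"
)

# Hypothetical pattern; it spans the position of the original line break.
pattern = re.compile(r"sing every time")


def find_matching_blocks(text, pattern):
    """Merge each blank-line-separated block, then keep the blocks that match."""
    matches = []
    for text_block in text.split("\n\n"):
        merged = text_block.replace("\n", "")
        if pattern.search(merged):
            matches.append(merged)
    return matches


print(find_matching_blocks(text, pattern))
```

Searching the raw text with the same pattern finds nothing, because the newline still sits between "sing " and "every".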


【Solution 3】:

You can split the text into sections on \n\n, then merge each section by splitting it on \n and joining the pieces:

with open("data.txt") as f:
    for line in f.read().split("\n\n"):
        print("".join(line.split("\n")) + "\n")

Output:

This is the Name of the Person

This is his surname

He likes to sing every time.

If you want to write the output to a new file, you can do this:

with open("data.txt") as f, open("output.txt", mode="w") as o:
    for line in f.read().split("\n\n"):
        o.write("".join(line.split("\n")) + "\n\n")

We have to add the extra \n ourselves here because write(), unlike print(), does not append a newline.

output.txt

This is the Name of the Person

This is his surname

He likes to sing every time.

Another option is to collect all lines into a single string and then write the whole string to the file:

with open("data.txt") as f, open("output.txt", mode="w") as o:
    lines = "\n\n".join("".join(line.split("\n")) for line in f.read().split("\n\n"))
    o.write(lines)

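The problem with the solutions above is that they use read() to load the entire file into memory before processing, which can be slow and memory-intensive for large files.

Instead, we can write a generator function that yields the sections from the file one at a time: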

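def collect_file_sections(f):
    section = []
    for line in f:
        line = line.strip()
        if line:
            section.append(line)
            continue
        # Blank line: emit the finished section; skip runs of blank lines
        if section:
            yield section
        section = []
    # Emit the last section when the file does not end with a blank line
    if section:
        yield section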

Then write the sections like this:

with open("data.txt") as f, open("output.txt", mode="w") as o:
    o.write("\n\n".join(" ".join(section) for section in collect_file_sections(f)))
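
Note that joining all sections into one string before writing still builds the full output in memory. A hedged sketch of a streaming variant follows: it writes each section as soon as the generator yields it. The function name write_sections is hypothetical, io.StringIO stands in for real files, and collect_file_sections is repeated here (in a form that skips empty sections) only to keep the block runnable on its own.

```python
import io


def collect_file_sections(f):
    """Yield each blank-line-separated section as a list of stripped lines."""
    section = []
    for line in f:
        line = line.strip()
        if line:
            section.append(line)
        elif section:
            # Blank line after some content: the section is complete.
            yield section
            section = []
    if section:
        yield section


def write_sections(f_in, f_out):
    """Write sections one at a time, separated by an empty line.

    `write_sections` is a hypothetical helper name used for illustration.
    """
    first = True
    for section in collect_file_sections(f_in):
        if not first:
            f_out.write("\n\n")
        f_out.write(" ".join(section))
        first = False


src = io.StringIO(
    "This is the Name of the Person\n\n"
    "This is his surname\n\n"
    "He likes to sing \nevery time.\n"
)
dst = io.StringIO()
write_sections(src, dst)
print(dst.getvalue())
```

With real files, the same function can be called with the two handles from the with statement, so only one section is held in memory at a time.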
