【Question title】: Python - Writing Separate Files per Section of a Single File
【Posted】: 2017-05-23 23:25:12
【Question】:

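I have a .txt file containing 5 sections of data. Each section has a header line "Section X". I want to parse this single file and write 5 separate files from it. Each section starts at its header and ends just before the next section header. The code below creates 5 separate files; however, they are all blank.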

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2",
    "Section 3", "Section 4", "Section 5"]

with open(filename+".txt", "rb") as oldfile:
    for i in dimensionsList:
        licycle = cycle(dimensionsList)
        nextelem = licycle.next()
        with open(i+".txt", "w") as newfile: 
            for line in oldfile:
                if line.strip() == i:
                    break
            for line in oldfile:
                if line.strip() == nextelem:
                    break
                newfile.write(line)

【Comments】:

Tags: python python-2.7 parsing

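【Answer 1】:

Problem

Testing your code, it works only for Section 1 (the other files were blank for me too). I realized the problem lies in the transition between sections (and in licycle being recreated on every iteration, so nextelem is always "Section 1").

The "Section 2" header line is read in the second for loop (if line.strip() == nextelem), which consumes it. When the outer loop moves on to Section 2, the next line in the file is already Section 2's data (not the text "Section 2"), so the first for loop never finds the header.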
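A minimal sketch of why the header gets lost, assuming Python 3 and an in-memory io.StringIO standing in for the real .txt file (the sample lines are made up for illustration). Both loops pull from the same iterator, so the line that triggers break in the first loop has already been consumed when the second loop starts.

```python
import io

# Hypothetical sample data standing in for the real input file.
oldfile = io.StringIO("Section 1\naaaa\nSection 2\nccc\n")

consumed = []
for line in oldfile:
    consumed.append(line.strip())
    if line.strip() == "Section 2":
        break  # the "Section 2" header has now been read and is gone

# A second loop over the same file object resumes after the header.
remaining = [line.strip() for line in oldfile]

print(consumed)   # ['Section 1', 'aaaa', 'Section 2']
print(remaining)  # ['ccc']
```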

It is hard to explain in words, but try the following code:

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                  "Section 5"]

with open(filename + ".txt", "rb") as oldfile:
    licycle = cycle(dimensionsList)
    nextelem = licycle.next()
    for i in dimensionsList:
        print(nextelem)
        with open(i + ".txt", "w") as newfile:
            for line in oldfile:
                print("ignoring %s" % (line.strip()))
                if line.strip() == i:
                    nextelem = licycle.next()
                    break
            for line in oldfile:
                if line.strip() == nextelem:
                    # nextelem = licycle.next()
                    print("ignoring %s" % (line.strip()))
                    break
                print("printing %s" % (line.strip()))
                newfile.write(line)
            print('')

It will print:

Section 1
ignoring Section 1
printing aaaa
printing bbbb
ignoring Section 2

Section 2
ignoring ccc
ignoring ddd
ignoring Section 3
ignoring eee
ignoring fff
ignoring Section 4
ignoring ggg
ignoring hhh
ignoring Section 5
ignoring iii
ignoring jjj

Section 2

Section 2

Section 2

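It works for Section 1 and it detects Section 2, but after that it keeps ignoring lines because it never finds "Section 2" again.

If the file were rewound each time (always starting again from line 1), I think the program would work. But I wrote simpler code that should work for you.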
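The rewinding idea could be sketched as follows, assuming Python 3. The name split_by_rewinding and its parameters are hypothetical, and an in-memory io.StringIO replaces the real files so the sketch runs anywhere. Calling seek(0) before each section means the search for a header always starts from line 1.

```python
import io

def split_by_rewinding(oldfile, headers):
    """Return {header: text} by rescanning the file once per section."""
    sections = {}
    for index, header in enumerate(headers):
        # The header that ends this section, or None for the last one.
        next_header = headers[index + 1] if index + 1 < len(headers) else None
        oldfile.seek(0)  # restart from line 1 for every section
        body = []
        inside = False
        for line in oldfile:
            stripped = line.strip()
            if not inside:
                # Still skipping lines until this section's header appears.
                inside = stripped == header
            elif stripped == next_header:
                break
            else:
                body.append(line)
        sections[header] = "".join(body)
    return sections

# Made-up sample data for illustration.
sample = io.StringIO("Section 1\naaaa\nbbbb\nSection 2\nccc\nSection 3\neee\n")
result = split_by_rewinding(sample, ["Section 1", "Section 2", "Section 3"])
print(result["Section 2"])
```

This rereads the input once per section, so it does more work than a single pass over the file.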

Solution

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                  "Section 5"]

with open(filename + ".txt", "rb") as oldfile:

    licycle = cycle(dimensionsList)
    nextelem = licycle.next()
    newfile = None
    line = oldfile.readline()

    while line:

        # Case 1: Found new section
        if line.strip() == nextelem:
            if newfile is not None:
                newfile.close()
            nextelem = licycle.next()
            newfile = open(line.strip() + '.txt', 'w')

        # Case 2: Print line to current section
        elif newfile is not None:
            newfile.write(line)

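        line = oldfile.readline()

    # Close the last section's file, which the loop leaves open
    if newfile is not None:
        newfile.close()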

When it finds a section header, it starts writing to a new file for that section. Otherwise, it keeps writing to the current file.
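For readers on Python 3, where raw_input and licycle.next() no longer exist, the same single-pass idea might look like the sketch below. The function name split_sections, its parameters, and the temporary demo directory are assumptions for illustration, not part of the answer. It also matches any known header via a set instead of tracking only the next expected one with cycle. As in the code above, the header line itself is not copied into the output file.

```python
import os
import tempfile

def split_sections(path, headers, out_dir):
    """Write each section's lines to <out_dir>/<header>.txt in one pass."""
    wanted = set(headers)
    newfile = None
    try:
        with open(path) as oldfile:
            for line in oldfile:
                name = line.strip()
                if name in wanted:
                    # Found a header: close the previous file, open the next.
                    if newfile is not None:
                        newfile.close()
                    newfile = open(os.path.join(out_dir, name + ".txt"), "w")
                elif newfile is not None:
                    # Ordinary line: it belongs to the current section.
                    newfile.write(line)
    finally:
        # Make sure the last section's file is closed.
        if newfile is not None:
            newfile.close()

# Demo in a throwaway directory with made-up data.
demo_dir = tempfile.mkdtemp()
source = os.path.join(demo_dir, "input.txt")
with open(source, "w") as f:
    f.write("Section 1\naaaa\nbbbb\nSection 2\nccc\nddd\n")

split_sections(source, ["Section 1", "Section 2"], demo_dir)
with open(os.path.join(demo_dir, "Section 1.txt")) as f:
    section_one = f.read()
with open(os.path.join(demo_dir, "Section 2.txt")) as f:
    section_two = f.read()
print(section_two)
```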

P.S.: Below is the file I used as an example:

Section 1
aaaa
bbbb
Section 2
ccc
ddd
Section 3
eee
fff
Section 4
ggg
hhh
Section 5
iii
jjj

【Discussion】: