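【Question Title】: Python - Writing Separate Files per Section of a Single File
【Posted】: 2017-05-23 23:25:12
【Question】:

I have a .txt file containing 5 sections of data. Each section starts with a header line "Section X". I want to parse this single file and write 5 separate files from it. Each section starts at its header and ends just before the next section header. The code below creates 5 separate files; however, they are all blank.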

from itertools import cycle

filename = raw_input("Which file?: \n")

dimensionsList = ["Section 1", "Section 2",
    "Section 3", "Section 4", "Section 5"]

with open(filename+".txt", "rb") as oldfile:
    for i in dimensionsList:
        licycle = cycle(dimensionsList)
        nextelem = licycle.next()
        with open(i+".txt", "w") as newfile: 
            for line in oldfile:
                if line.strip() == i:
                    break
            for line in oldfile:
                if line.strip() == nextelem:
                    break
                newfile.write(line)

【Comments】:

    Tags: python python-2.7 parsing


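    【Answer 1】:

    Problem

    When I tested your code, it only worked for Section 1 (the other files were blank for me too). The problem is the transition between sections (and also that licycle is recreated on every iteration, so nextelem is always the first element).

    The "Section 2" header line is consumed by the second for loop (at if line.strip() == nextelem). When the outer loop moves on, the next line read is already Section 2's data rather than the "Section 2" header, so the first for loop never finds it.

    This is hard to put into words, so try running the following code: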

    from itertools import cycle
    
    filename = raw_input("Which file?: \n")
    
    dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                      "Section 5"]
    
    with open(filename + ".txt", "rb") as oldfile:
        licycle = cycle(dimensionsList)
        nextelem = licycle.next()
        for i in dimensionsList:
            print(nextelem)
            with open(i + ".txt", "w") as newfile:
                for line in oldfile:
                    print("ignoring %s" % (line.strip()))
                    if line.strip() == i:
                        nextelem = licycle.next()
                        break
                for line in oldfile:
                    if line.strip() == nextelem:
                        # nextelem = licycle.next()
                        print("ignoring %s" % (line.strip()))
                        break
                    print("printing %s" % (line.strip()))
                    newfile.write(line)
                print('')
    

    It will print:

    Section 1
    ignoring Section 1
    printing aaaa
    printing bbbb
    ignoring Section 2
    
    Section 2
    ignoring ccc
    ignoring ddd
    ignoring Section 3
    ignoring eee
    ignoring fff
    ignoring Section 4
    ignoring ggg
    ignoring hhh
    ignoring Section 5
    ignoring iii
    ignoring jjj
    
    Section 2
    
    Section 2
    
    Section 2
    

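    It works for Section 1 and detects the "Section 2" header, but after that it keeps ignoring lines, because the "Section 2" header has already been consumed and is never found again.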
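This behaviour comes from both for loops sharing a single file iterator. The following is a minimal sketch of that effect, written for Python 3 (an assumption; the code in this thread targets Python 2.7) and using an in-memory io.StringIO as a hypothetical stand-in for the input file:

```python
import io

# Hypothetical in-memory stand-in for the input file.
sample = io.StringIO("Section 1\naaaa\nSection 2\nccc\n")

# First loop: stop at the "Section 2" header. The header line has already
# been pulled from the iterator by the time we break.
for line in sample:
    if line.strip() == "Section 2":
        break

# Second loop: iteration resumes after the header, so the header itself
# can never be matched again by a later search.
remaining = [line.strip() for line in sample]
print(remaining)
```

Only the data line after the header is left, which is why a later search for "Section 2" runs to the end of the file without a match.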

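    If reading restarted from the first line of the file for every section, I think the program would work. But I wrote simpler code that should work for you.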
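For comparison, the "restart from the first line" idea could look roughly like the sketch below. It is not the approach used in the solution that follows. It assumes Python 3, and the helper name extract_section and the in-memory sample are hypothetical; seek(0) is what rewinds the file before each scan.

```python
import io

def extract_section(src, header, all_headers):
    """Return the data lines under `header`, rescanning `src` from the top."""
    src.seek(0)  # rewind so that earlier scans cannot hide this header
    lines = []
    inside = False
    for line in src:
        name = line.strip()
        if name == header:
            inside = True          # found the section we want
        elif name in all_headers:
            if inside:
                break              # reached the next section header
        elif inside:
            lines.append(line)     # data line of the wanted section
    return lines

# Hypothetical sample data, shaped like the file described in the question.
headers = ["Section 1", "Section 2", "Section 3"]
src = io.StringIO("Section 1\naaaa\nSection 2\nccc\nddd\nSection 3\neee\n")
sections = {h: extract_section(src, h, headers) for h in headers}
```

This reads the file once per section, so it does more work than a single pass, which is one reason to prefer a single loop.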

    Solution

    from itertools import cycle
    
    filename = raw_input("Which file?: \n")
    
    dimensionsList = ["Section 1", "Section 2", "Section 3", "Section 4",
                      "Section 5"]
    
    with open(filename + ".txt", "rb") as oldfile:
    
        licycle = cycle(dimensionsList)
        nextelem = licycle.next()
        newfile = None
        line = oldfile.readline()
    
        while line:
    
            # Case 1: Found new section
            if line.strip() == nextelem:
                if newfile is not None:
                    newfile.close()
                nextelem = licycle.next()
                newfile = open(line.strip() + '.txt', 'w')
    
            # Case 2: Print line to current section
            elif newfile is not None:
                newfile.write(line)
    
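            line = oldfile.readline()

        # Close the file for the last section once the input is exhausted
        if newfile is not None:
            newfile.close()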
    

    When it finds a section header, it starts writing to a new file for that section. Otherwise, it keeps writing to the current file.
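The same single-pass idea can be sketched for Python 3 as a function. This is an illustration under stated assumptions, not the answer's exact code: the name split_sections and the out_dir parameter are hypothetical, and it matches any known header through a set rather than only the next expected one from cycle. Like the solution above, it does not copy the header line into the output file.

```python
import os

def split_sections(path, headers, out_dir="."):
    """Write each section of `path` to '<header>.txt' inside `out_dir`.

    Returns the list of files written, in the order they were opened.
    """
    headers = set(headers)
    written = []
    out = None
    try:
        with open(path, "r") as src:
            for line in src:
                name = line.strip()
                if name in headers:
                    # Case 1: new section header, switch output files
                    if out is not None:
                        out.close()
                    target = os.path.join(out_dir, name + ".txt")
                    out = open(target, "w")
                    written.append(target)
                elif out is not None:
                    # Case 2: data line, belongs to the current section
                    out.write(line)
    finally:
        # Make sure the last section's file is closed, even on errors
        if out is not None:
            out.close()
    return written
```

A call such as split_sections("data.txt", ["Section 1", "Section 2"]) would then create "Section 1.txt" and "Section 2.txt" in the current directory, where "data.txt" is a placeholder name. Lines that appear before the first header are skipped.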

    P.S.: Below is the file I used as an example:

    Section 1
    aaaa
    bbbb
    Section 2
    ccc
    ddd
    Section 3
    eee
    fff
    Section 4
    ggg
    hhh
    Section 5
    iii
    jjj
    

    【Comments】:
