Split a file into parts based on the data using Python

【Question title】: Split a file into parts based on the data using Python
【Posted】: 2014-03-23 05:25:01
【Question】:

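I am looking for the Python equivalent of an awk script that splits a file into 26 parts based on a flag in each record. The file holds 26 different record types, a leftover from the hierarchical databases Burroughs used in the 1970s. I want to open 26 files named f_A to f_Z, rather than the traditional f1, and stream the records out as I read them in, without holding the whole file in a buffer.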

# Gawk original - split new valuation roll format into record types A-Z
# run gawk -F\| -f split.awk input_file
# creates A.raw, B.raw, .... Z.raw
# Oct 1995 
{ident = $8; 
file = ident".raw";
print $0 >> file}

So I thought I could build a file handle name and then call it with eval() or something similar to direct each record to the right output.

for line in fileinput.input(src):
    parts = line.split('|')
    recType = parts[7]
    recFile = 'f_'+recType
    if not recType in openFiles:
        eval(recFile) = open(recType+".raw",'w') # how should this line be written?
    eval(recFile).write(line)
    # ....

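I can get the name of the underlying file from f1.name, and I can evaluate a variable name to get a handle, e.g. eval("f_A"), but I cannot see how to open a file using a handle whose name is not hard-coded.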

【Comments】:

Tags: python file split eval

【Solution 1】:

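eval is best avoided and, fortunately, it is almost never needed. In this case, open(recType+".raw",'w') creates a file handle. All you need to do is associate that handle with recType, and that is what dictionaries are for.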
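As a minimal sketch of that idea, the snippet below stores handles in a dictionary keyed by record type and looks them up by key instead of building variable names. The name handles is hypothetical, and io.StringIO buffers stand in for real files purely so the example runs anywhere.

```python
import io

# Hypothetical registry: record type -> handle.
# io.StringIO is a stand-in for open(...) so no files are created.
handles = {}
for rec_type in ("A", "B"):
    handles[rec_type] = io.StringIO()

# Look the handle up by key; no eval() and no f_A / f_B variables needed.
handles["A"].write("first A record\n")
handles["B"].write("first B record\n")
handles["A"].write("second A record\n")
```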

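In the code below, openFiles is a dictionary. Each time we encounter a new recType, we open a file for it and save the file handle in openFiles under the key recType. Whenever we want to write to that file again, we simply ask the dictionary for the handle. Thus: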

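import fileinput

openFiles = {}  # maps each record type to its open file handle
for line in fileinput.input(src):  # src is the path of the input file
    parts = line.split('|')
    recType = parts[7]  # the 8th pipe-delimited field holds the record type
    if recType not in openFiles:
        # first time this record type is seen: open its output file
        openFiles[recType] = open('f_' + recType, 'w')
    openFiles[recType].write(line)
    # ....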
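The loop above leaves the output files open. One possible way to close them reliably is contextlib.ExitStack, sketched below. The function name split_by_record_type, its parameters, and the sample records are assumptions made for illustration and are not part of the original answer. The sketch also strips whitespace from the record type, which is an assumption about the data, and it opens files in 'w' mode, which truncates existing files, unlike the >> append in the gawk script.

```python
import os
import tempfile
from contextlib import ExitStack


def split_by_record_type(src, out_dir, field_index=7, sep="|"):
    """Stream src line by line into one output file per record type.

    Returns a dict mapping each record type to the number of lines written.
    """
    counts = {}
    with ExitStack() as stack, open(src) as infile:
        handles = {}  # record type -> open file handle
        for line in infile:
            # Assumption: the type field may carry stray whitespace.
            rec_type = line.split(sep)[field_index].strip()
            if rec_type not in handles:
                path = os.path.join(out_dir, "f_" + rec_type)
                # enter_context registers the file so the stack closes it.
                handles[rec_type] = stack.enter_context(open(path, "w"))
                counts[rec_type] = 0
            handles[rec_type].write(line)
            counts[rec_type] += 1
    return counts


# Demo with made-up records; field 8 is the record type.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "input.txt")
    with open(sample, "w") as f:
        f.write("1|2|3|4|5|6|7|A|first\n")
        f.write("1|2|3|4|5|6|7|B|second\n")
        f.write("1|2|3|4|5|6|7|A|third\n")
    demo_counts = split_by_record_type(sample, tmp)
    with open(os.path.join(tmp, "f_A")) as f:
        demo_a_lines = f.readlines()

print(demo_counts)  # prints {'A': 2, 'B': 1}
```

Only one input line is held in memory at a time, so the approach still streams records as the question asks. With 26 record types, the number of simultaneously open files stays well within typical operating system limits.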

【Comments】:

Please add some explanation to make the answer more educational.
@LaszloPapp OK. I hope it is more educational now.