How to open a file, convert a string and write to a new file

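【Question title】: How to open file, convert string and write to new file
【Posted】: 2018-04-03 18:06:49
【Question】:

I am trying to open a text file, remove certain words that end with ], and then write the new content to a new file. With the code below, new_content contains what I need and a new file is created, but it is empty. I can't figure out why. I have tried different indentation and passing in an encoding type, with no luck. Any help is much appreciated.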

import glob
import os
import nltk, re, pprint
from nltk import word_tokenize, sent_tokenize
import pandas
import string
import collections

path = "/pathtofiles"

for file in glob.glob(os.path.join(path, '*.txt')):
    if file.endswith(".txt"):
        f = open(file, 'r')
        flines = f.readlines()
        for line in flines: 
            content = line.split() 

            for word in content:
                if word.endswith(']'):
                    content.remove(word)

            new_content = ' '.join(content)

            f2 = open((file.rsplit( ".", 1 )[ 0 ] ) + "_preprocessed.txt", "w")
            f2.write(new_content)
            f.close

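【Comments】:

for word in content: if word.endswith(']'): content.remove(word) removes items from the list while iterating over it: bad.
f.close does nothing (it is missing the parentheses), and the indentation is wrong.
if file.endswith(".txt") is guaranteed to always be true, because you are already globbing for *.txt.
You never close f2 at all.
You should open the output file in mode 'a' for writing. See: docs.python.org/3/library/functions.html#open. Or build a list of words and then use writelines.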
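
The first comment's point about removing items while iterating can be reproduced with a small sketch. The sample words below are invented for illustration and are not from the asker's files.

```python
# Hypothetical sample line; the words are made up for illustration.
words = "alpha] beta] gamma delta]".split()

# Buggy: removing from the list being iterated shifts the remaining items,
# so the element right after each removed one is skipped.
buggy = list(words)
for word in buggy:
    if word.endswith(']'):
        buggy.remove(word)

# Safer: build a new list instead of mutating the one being iterated.
fixed = [word for word in words if not word.endswith(']')]

print(buggy)  # 'beta]' survives because it was skipped
print(fixed)
```

Here 'beta]' slips through the buggy loop because it moves into the slot the iterator has already passed.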

Tags: python string text

【Solution 1】:

This should work, @firefly. Happy to answer any questions you have.

import glob
import os

path = "/pathtofiles"

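# glob already restricts the matches to *.txt, so no extra endswith() check is needed
for file in glob.glob(os.path.join(path, '*.txt')):
    # the with-block closes the input file automatically
    with open(file, 'r') as f:
        flines = f.readlines()

    new_content = []
    for line in flines:
        content = line.split()

        # collect the words to keep instead of removing from the list being iterated
        new_content_line = []
        for word in content:
            if not word.endswith(']'):
                new_content_line.append(word)

        new_content.append(' '.join(new_content_line))

    # write once per input file, after all of its lines have been processed
    with open((file.rsplit(".", 1)[0]) + "_preprocessed.txt", "w") as f2:
        f2.write('\n'.join(new_content))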

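【Comments】:

Are you sure about mode="w"?
Works on my machine ¯\_(ツ)_/¯. AFAIK the only difference between w and a is that w creates a new file instead of appending. The OP seems to want a fresh file every time, so w makes more sense to me.
Thanks @PeterDolan! I understand what this is doing and why it is better. But the file is still blank for me! I am working on a Mac and have also tried on Windows. Is there anything else I might be doing wrong?
So, for some reason, this ended up working for some of the files but not all of them... After adding encoding = 'utf-8' after both r and w, it works for all files.
Interesting, maybe your files contain some unusual characters. Glad to hear it's working. Any other questions?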
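
For reference, the encoding fix from the last comments could be folded into the answer roughly as follows. This is a minimal sketch, not the answerer's code: the helper name preprocess_file and the sample text are assumptions made for illustration. Without an explicit encoding, open() falls back to a locale-dependent default, which is one plausible reason some files failed while others worked.

```python
import os
import tempfile


def preprocess_file(src_path, encoding="utf-8"):
    """Hypothetical helper: drop words ending in ']' and write <name>_preprocessed.txt."""
    with open(src_path, "r", encoding=encoding) as src:
        cleaned = [
            " ".join(w for w in line.split() if not w.endswith("]"))
            for line in src
        ]
    dst_path = src_path.rsplit(".", 1)[0] + "_preprocessed.txt"
    # "w" creates or truncates the output; it is opened once per input file.
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write("\n".join(cleaned))
    return dst_path


# Usage on a throwaway file containing a non-ASCII character.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.txt")
    with open(sample, "w", encoding="utf-8") as fh:
        fh.write("keep drop] café\nsecond line]\n")
    out = preprocess_file(sample)
    with open(out, encoding="utf-8") as fh:
        result = fh.read()

print(repr(result))
```

To process a whole directory, the helper could be called for each path returned by glob.glob(os.path.join(path, '*.txt')), as in the answer above.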