【Posted】: 2020-07-03 10:58:24
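【Question】:
In the method below, I sort the contents of a file by timestamp, and that part works correctly. However, I don't know how to add a newline when writing to the new file: everything is written on a single line, and I want each record on its own line in the output file. Because the input is very large, I need to process it in chunks in this case, so using readlines or storing everything in a data structure won't work.
1) My input file has the following format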
TIME[04.26_12:30:30:853664] ID[ROLL:201987623] MARKS[PHY:100|MATH:200|CHEM:400]
TIME[03.27_12:29:30.553669] ID[ROLL:201987623] MARKS[PHY:100|MATH:1200|CHEM:900]
TIME[03.26_12:28:30.753664] ID[ROLL:2341987623] MARKS[PHY:100|MATH:200|CHEM:400]
TIME[03.26_12:29:30.853664] ID[ROLL:201978623] MARKS[PHY:0|MATH:0|CHEM:40]
TIME[04.27_12:29:30.553664] ID[ROLL:2034287623] MARKS[PHY:100|MATH:200|CHEM:400]
The code is as follows
import re
from itertools import groupby
from typing import Tuple

# Captures the timestamp and the roll number; \s* allows the space before ID[.
regex = re.compile(r"^.*TIME\[([^]]+)\]\s*ID\[ROLL:([^]]+)\].+$")


def func1(arg) -> bool:
    # Keep only lines that look like records.
    return bool(regex.match(arg))


def func2(arg) -> Tuple[str, int]:
    # Sort key: (timestamp, roll number).
    match = regex.match(arg)
    if match:
        return match.group(1), int(match.group(2))
    return "", 0


def func3(arg) -> int:
    # Grouping key: roll number.
    match = regex.match(arg)
    if match:
        return int(match.group(2))
    return 0


def read_in_chunks(file_object, chunk_size=1024*1024):
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


with open('b.txt') as fr:
    for chunk in read_in_chunks(fr):
        collection = filter(func1, chunk.splitlines())
        collection = sorted(collection, key=func2)
        for key, group in groupby(collection, key=func3):
            # "a" appends; "wa" is not a valid mode for open().
            with open(f"ROLL_{key}", mode="a") as fw:
                fw.writelines(group)  # want suggestions to append newline character before every line
2) The actual output I get now
In the file named ROLL_201987623.txt
TIME[03.27_12:29:30.553669] ID[ROLL:201987623] MARKS[PHY:100|MATH:1200|CHEM:900] TIME[04.26_12:30:30:853664] ID[ROLL:201987623] MARKS[PHY:100|MATH:200|CHEM:400]
3) Expected output (I want the lines separated as in the input format)
TIME[03.27_12:29:30.553669] ID[ROLL:201987623] MARKS[PHY:100|MATH:1200|CHEM:900]
TIME[04.26_12:30:30:853664] ID[ROLL:201987623] MARKS[PHY:100|MATH:200|CHEM:400]
Currently I get all the output on a single line, which is the main problem for me.
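A side note on the chunked read in the code above: `file_object.read(chunk_size)` cuts the text at an arbitrary character, so the first and last items of `chunk.splitlines()` can be fragments of a record. The following is a minimal sketch, not part of the original question, of one way to keep chunks aligned to line boundaries by carrying the trailing fragment over to the next read. The name `read_line_chunks` and the demo data are assumptions made for illustration, and records are assumed to be separated by `\n`.

```python
import io


def read_line_chunks(file_object, chunk_size=1024 * 1024):
    """Yield lists of complete lines, reading about chunk_size characters at a time."""
    remainder = ""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        lines = (remainder + data).split("\n")
        # The last element is an unfinished line, or "" if the data ended on a newline.
        remainder = lines.pop()
        if lines:
            yield lines
    if remainder:
        # The file did not end with a newline; emit the final line as well.
        yield [remainder]


# Tiny in-memory demo with a deliberately small chunk size.
demo = io.StringIO("first record\nsecond record\nthird record")
demo_chunks = list(read_line_chunks(demo, chunk_size=8))
```

Each yielded list can then be filtered, sorted and grouped in place of `chunk.splitlines()`. Memory use stays bounded by the chunk size plus the longest line.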
【Discussion】:
-
What about line + '\n'?
-
Do you mean 'fw.writelines(group+'\n')'?
-
That gives an error, @snakecharmerb: TypeError: unsupported operand type(s) for +: 'itertools._grouper' and 'str'
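-
One possible reading of the `line + '\n'` suggestion, as a minimal sketch: `group` is an iterator produced by `itertools.groupby`, so it cannot be concatenated with a string, but the newline can be appended to each line inside a generator expression. The newlines are missing in the first place because `str.splitlines()` drops them and `writelines()` does not add any. The helper name `write_groups`, the `.txt` suffix, the temporary output directory and the sample lines below are assumptions for illustration.

```python
import os
import re
import tempfile
from itertools import groupby

# Same idea as the pattern in the question; \s* tolerates the space before ID[.
regex = re.compile(r"^.*TIME\[([^]]+)\]\s*ID\[ROLL:([^]]+)\].+$")


def sort_key(line):
    """(timestamp, roll number), as in the question's func2."""
    match = regex.match(line)
    return match.group(1), int(match.group(2))


def roll_of(line):
    """Roll number, as in the question's func3."""
    return int(regex.match(line).group(2))


def write_groups(lines, out_dir):
    """Hypothetical helper: append each roll's lines to ROLL_<id>.txt, one per line."""
    matching = sorted(filter(regex.match, lines), key=sort_key)
    for roll, group in groupby(matching, key=roll_of):
        path = os.path.join(out_dir, f"ROLL_{roll}.txt")
        with open(path, mode="a") as fw:
            # Add the newline per line; `group + "\n"` fails because group is an iterator.
            fw.writelines(line + "\n" for line in group)


sample = [
    "TIME[04.26_12:30:30:853664] ID[ROLL:201987623] MARKS[PHY:100|MATH:200|CHEM:400]",
    "TIME[03.27_12:29:30.553669] ID[ROLL:201987623] MARKS[PHY:100|MATH:1200|CHEM:900]",
    "TIME[03.26_12:29:30.853664] ID[ROLL:201978623] MARKS[PHY:0|MATH:0|CHEM:40]",
]
out_dir = tempfile.mkdtemp()
write_groups(sample, out_dir)
```

Because the files are opened in append mode, the same roll number can show up in several chunks and its lines still end up in one file, each on its own line.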
Tags: python python-3.x file-handling