【Question title】: Remove lines in one file, which are not present in another, by key [duplicate]
【Posted】: 2021-02-04 10:27:35
【Question description】:

I have two text files that look like this:

File1:

STR_ape,        1000
STR_banana,     1001
STR_orange,     1004
STR_strawberry, 1005
STR_gooseberry, 1007
...

File2:

1000="Some stringA"
1001="Some stringB"
1002="Some stringC"
1003="Some stringD"
1004="Some stringE"
1005="Some stringF"
1006="Some stringG"
1007="Some stringH"
...

So some of the string IDs in File1 map to strings in File2. What I want to do is remove every string in File2 whose ID is not present in File1. That means File2 should end up looking like this:

1000="Some stringA"
1001="Some stringB"
1004="Some stringE"
1005="Some stringF"
1007="Some stringH"
...

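In other words, strings whose numbers do not appear in File1 should be removed. This can of course be done with a counter and a for loop, but is there a built-in function or some simpler way to do this in Python 3.x?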

Tags: python


【Solution 1】:

I think this is the simplest and clearest way to accomplish this task:

First collect all the keys from file1, then collect all the matching lines from file2, and finally overwrite file2 with the matching lines.

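file1 = 'file1.txt'
file2 = 'file2.txt'

# Collect the numeric IDs from file1 (the field after the comma).
keys = set()
with open(file1, 'r') as fp:
    for line in fp:
        if ',' in line:  # skip blank or malformed lines
            keys.add(line.split(',')[1].strip())

# Keep only the file2 lines whose ID (the text before '=') is in file1.
new_lines = []
with open(file2, 'r') as fp:
    for line in fp:
        key = line.split('=')[0].strip()
        if key in keys:
            new_lines.append(line)

# Overwrite file2 with the matching lines.
with open(file2, 'w') as fp:
    fp.writelines(new_lines)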
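
The same three steps can also be packaged as a small function that works on any iterables of lines, which makes the filtering easy to try without touching real files. This is only a sketch: the name `keep_known_ids` is made up for illustration, and it assumes the ID is the text after the first comma in file1 and before the first `=` in file2, as in the samples from the question.

```python
def keep_known_ids(key_lines, data_lines):
    """Return the data lines whose ID also appears in key_lines.

    Hypothetical helper: assumes 'NAME, ID' lines in key_lines and
    'ID="text"' lines in data_lines, as in the question's samples.
    """
    ids = set()
    for line in key_lines:
        # str.partition never raises, so lines without a comma are skipped.
        _, sep, value = line.partition(',')
        if sep and value.strip():
            ids.add(value.strip())
    # Keep a data line only if the text before its first '=' is a known ID.
    return [line for line in data_lines
            if line.partition('=')[0].strip() in ids]


if __name__ == "__main__":
    key_lines = ["STR_ape,        1000\n", "STR_orange,     1004\n", "\n"]
    data_lines = ['1000="Some stringA"\n',
                  '1002="Some stringC"\n',
                  '1004="Some stringE"\n']
    print("".join(keep_known_ids(key_lines, data_lines)), end="")
```

Because open file objects iterate over their lines, the two files can be passed in directly, for example `keep_known_ids(open_file1, open_file2)` inside a `with` block.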

【Solution 2】:

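In Python 3.x the io module is the recommended way to read and write files (the built-in open() is the same function as io.open). A couple of loops then filter out the IDs that are not present: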

import io

s1 = """
STR_ape,        1000
STR_banana,     1001
STR_orange,     1004
STR_strawberry, 1005
STR_gooseberry, 1007
"""

s2 = """
1000="Some stringA"
1001="Some stringB"
1002="Some stringC"
1003="Some stringD"
1004="Some stringE"
1005="Some stringF"
1006="Some stringG"
1007="Some stringH"
"""

# To read from the real files instead of the sample strings:
# with io.open("File1", "r") as f:
#     s1 = f.read()
# with io.open("File2", "r") as f:
#     s2 = f.read()

# Process File1: collect the IDs (the field after the comma)
ids = set()
for item in s1.strip().split("\n"):
    if "," not in item:
        continue  # skip blank or malformed lines
    ids.add(item.split(",")[1].strip())

print(sorted(ids))

# Process File2: keep the lines whose ID (the text before "=") is known
lines = []
for item in s2.strip().split("\n"):
    if not item.strip():
        continue  # skip blank lines
    key = item.split("=")[0].strip()  # 'key' avoids shadowing the built-in id()
    if key in ids:
        lines.append(item)

text = "\n".join(lines)
print(text)

# Save the result to a new file rather than overwriting the original File2
with io.open("File3", "w") as f:
    f.write(text)
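
If File2 itself has to be updated in the end, one cautious option is to write the filtered text to a temporary file in the same directory and then swap it in with `os.replace`, so that an error in the middle of writing does not leave a half-written File2. The following is a minimal sketch of that idea, not part of the answer above; the helper name `replace_file_contents` is hypothetical.

```python
import os
import tempfile


def replace_file_contents(path, text):
    """Write text to a temp file next to path, then rename it over path.

    Hypothetical helper for illustration only.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the same directory so the final rename
    # stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
        # Swap the finished file in; the original is untouched until here.
        os.replace(tmp_path, path)
    except BaseException:
        # Something failed before the swap: clean up the temp file.
        os.remove(tmp_path)
        raise
```

With the variables from the code above, the call would be `replace_file_contents("File2", text)`.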
    
