【Question title】: Finding which byte position differs between files with Python 3
【Posted】: 2021-02-24 15:26:52
【Question description】:

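I have been looking for a Python 3 script that finds which bytes (or rather, which byte positions) differ between two files of the same size. filecmp only returns True/False, and none of the other scripts I found worked. I wrote the one below; it works for smaller files but is very slow on large ones. Does anyone have a better (read: faster) script, or a way to make my approach faster?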

#Get file length
items = [a,b,c]  # a, b, c: paths of the files to compare (defined elsewhere)
file = open(a,"rb")
filelength = len(file.read())
file.close()

#Create dictionaries where bytes will be stored
comp = {}
compare = {}
for i in range(0,filelength):
    comp[i] = []
    compare[i] = {}

#Add bytes to dictionaries
for i in items:
    file = open(i,"rb")
    k = file.read(1)
    count = 0
    while k:
        comp[count].append(k)
        k = file.read(1)
        count = count + 1
    file.close()

#Get if the byte at the same position is identical to the others or not
for i in range(0,filelength):
    for n in comp[i]:
        if n in compare[i]:
            compare[i][n] = compare[i][n] + 1
        else:
            compare[i][n] = 1

#Print which byte differs
for i in compare:
    if compare[i][max(compare[i], key=compare[i].get)] < len(items):
        print("Byte number: " + str(i+1))
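
One avoidable cost in the first step: `len(file.read())` loads the whole file into memory just to count its bytes. A small sketch, assuming the paths refer to regular files; `same_size` is a hypothetical helper name, not something from the script above. `os.path.getsize` asks the file system for the size instead, which also makes it cheap to check the same-size precondition before comparing.

```python
import os

def same_size(paths):
    """Return the common size in bytes, or None if the sizes differ."""
    # getsize() reads file metadata only; no file contents are loaded.
    sizes = {os.path.getsize(p) for p in paths}
    return sizes.pop() if len(sizes) == 1 else None
```

If this returns None, the files cannot match byte for byte, and the dictionaries keyed on the first file's length would not line up with the others anyway.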

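【Comments】:

  • Am I interpreting your code correctly: items contains the list of files to compare against each other, and you want to know at which positions in each file any mismatch occurs?
  • @itprorh66 Yes, exactly!

Tags: python python-3.x byte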


【Solution 1】:

I don't know whether this solution is faster than yours, but it is simpler to implement.

Functions used:

def read_byte(afl):
    rslt = [None]*len(afl)
    for i in range(len(afl)):
        if afl[i]:
            rslt[i] = afl[i].read(1)
    return rslt   
                
def closeFiles(afl):
    for f in afl:
        f.close()
        
def all_bytes_equal(data):
    for i in range(1, len(data)):
        if data[0] != data[i]:
            return False
    return True

def compareFiles(flist):
    active_files = [None] * len(flist)
    byte_count = 0
    for i in range(len(flist)):
        active_files[i] = open(flist[i],"rb")
    while True:
        data = read_byte(active_files)
        if all_bytes_equal(data):
            if data[0] == b'':
                print ("All files compare")
                closeFiles(active_files)
                break
            byte_count += 1
        else:
            for i in range(1, len(active_files)):
                if data[0] != data[i]:
                    print(f"File {flist[i]} differs at byte count {byte_count}")
            closeFiles(active_files)
            break
             

Usage:

file_list = ['test1.txt','test2.txt','test3.txt'] #list of paths to each file being compared
compareFiles(file_list)
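
Both versions above call `read(1)` once per byte, which is likely the main cost on large files, and `compareFiles` stops at the first mismatch rather than listing every differing position. A minimal sketch of a chunked variant, assuming the files are the same size as stated in the question; `diff_positions` and `chunk_size` are illustrative names, not part of the original answer. It reads each file in blocks, skips blocks that are identical across all files, and scans byte by byte only inside blocks that differ.

```python
def diff_positions(paths, chunk_size=1 << 16):
    """Yield 0-based offsets at which the files in `paths` do not all agree.

    Assumes the files have the same size; if they do not, comparison
    stops at the end of the shortest file.
    """
    if len(paths) < 2:
        return  # nothing to compare against
    files = [open(p, "rb") for p in paths]
    try:
        offset = 0
        while True:
            chunks = [f.read(chunk_size) for f in files]
            shortest = min(len(c) for c in chunks)
            if shortest == 0:
                break  # at least one file is exhausted
            first = chunks[0]
            # Fast path: if every block equals the first, skip the scan.
            if any(c != first for c in chunks[1:]):
                for i in range(shortest):
                    b = first[i]
                    if any(c[i] != b for c in chunks[1:]):
                        yield offset + i
            offset += shortest
    finally:
        for f in files:
            f.close()
```

Called as `for pos in diff_positions(file_list): print("Byte number: " + str(pos + 1))`, it reports the same 1-based byte numbers as the script in the question. How much faster it is depends on file size and on how many blocks differ, so it is worth timing on real data.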
