【问题标题】:Checking for gzip or plain text and reading the file without checking extensions - python检查 gzip 或纯文本并在不检查扩展名的情况下读取文件 - python
【发布时间】:2014-10-29 10:00:42
【问题描述】:

我有文件,读取时它们的格式完全相同,但唯一的区别是我不确定其中一些文件是否为gzip

示例文件如下:

der ||| the ||| 0.3 ||| ||| 
das ||| the ||| 0.4 ||| |||  
das ||| it ||| 0.1 ||| ||| 
das ||| this ||| 0.1 ||| ||| 
die ||| the ||| 0.3 ||| ||| 

当我读到它时,我正在这样做:

try: 
    with gzip.open(phrasetablefile, 'rb') as fin:
        for line in fin:
            # do something
except:
    with open(phrasetablefile, 'rb') as fin:
        for line in fin:
            # do something

有没有其他方法可以避免丑陋的重复代码?(注意 #do something 是一段很长的代码)

有没有办法做到以下几点?

try: 
    with gzip.open(phrasetablefile, 'rb') as fin:
except:
    with open(phrasetablefile, 'rb') as fin:
        for line in fin:
            # do something

【问题讨论】:

  • 为什么不把它拉到一个函数中,例如process,那么就只有process(fin)?另外,至少应该是except Exception
  • @jonrsharpe,智慧之言。

标签: python file-io compression gzip


【解决方案1】:

警告:未经测试的代码

要么做(正如@jonrsharpe 建议的那样):

def process(fin):
    for line in fin:
        pass # do something

try:
    with gzip.open(phrasetablefile, 'rb') as fin:
        process(fin)
except:
    with open(phrasetablefile, 'rb') as fin:
        process(fin)

或者试试这样的:

try: 
    fin = gzip.open(phrasetablefile, 'rb')
except:
    fin = open(phrasetablefile, 'rb')

for line in fin:
    pass # do something
fin.close()

【讨论】:

    【解决方案2】:

    如果你有一个 gzip 后缀,你可以这样做吗?

    if phrasetablefile.endswith('.gz'):
        opener = gzip.open
    else:
        opener = open
    
    with opener(phrasetablefile, 'rb') as fin:
        for line in fin:
            # do something
    

    【讨论】:

      猜你喜欢
      • 1970-01-01
      • 2013-08-31
      • 1970-01-01
      • 2015-06-30
      • 1970-01-01
      • 1970-01-01
      • 2014-12-10
      • 1970-01-01
      • 2014-11-08
      相关资源
      最近更新 更多