【Question title】: Print lines of csv file that contain specified keyword
【Posted】: 2014-06-19 10:06:56
【Question】:

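I'm new to Python, but I'd like to do some data analysis on a few csv files. I want to print only the rows of a csv file that contain certain keywords. I use the first block below to print all valid rows. From those rows, I want to print the ones that include a keyword. Thanks for your help.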

import csv
import sys

csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for f in ['1.csv']:
    reader = csv.reader(open(f, 'rU'), delimiter='|', quotechar='\\')
    for row in reader:
        try:
            print row[2] 
            valids += 1
        except:
            invalids += 1
print 'parsed %s records. ignored %s' % (valids, invalids)

With the keywords:

    for w in ['ford', 'hyundai','honda', 'jeep', 'maserati','audi','jaguar', 'volkswagen','chevrolet','chrysler']: 

I think I need to filter my code at the top with an if statement, but I've been struggling with this for hours and can't seem to get it to work.

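【Comments】:

  • Which columns do you want to search for the keywords?
  • The file is a single-column CSV (so the first column). Thanks
  • So you don't need the csv module at all.
  • Please be more specific about what you have tried so far. That will show that you have actually done some work and will draw more attention to your question.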
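
The comment above notes that a single-column file does not need the csv module. A minimal sketch of that idea in Python 3 follows. The function name `lines_with_keywords`, the sample lines, and the choice of a case-insensitive substring test are all assumptions for illustration; the question does not say whether case matters.

```python
def lines_with_keywords(lines, keywords):
    """Yield each line (newline stripped) that contains any keyword.

    Matching is a case-insensitive substring test (an assumption).
    """
    lowered = [k.lower() for k in keywords]
    for line in lines:
        text = line.rstrip('\n')
        haystack = text.lower()
        # any() stops at the first keyword found in the line.
        if any(k in haystack for k in lowered):
            yield text


# Hypothetical sample lines standing in for the contents of '1.csv'.
sample = ['I drive a Ford Focus\n', 'bicycle\n', 'used jeep for sale\n']
matches = list(lines_with_keywords(sample, ['ford', 'jeep']))
for line in matches:
    print(line)
```

An open file object can be passed in place of `sample`, since iterating over a file yields its lines. Note that a substring test also matches inside longer words, so 'audi' would match a line containing 'Saudi'.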

Tags: python csv if-statement printing lines


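【Solution 1】:

Your guess is correct. All you need to do is filter the rows with an if statement that checks whether any field matches a keyword. Here is how you can do it (I also made a few improvements to your code and explained them in the comments):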

import csv
import sys

# First, create a set of the keywords. Membership tests on a set are faster
# than on a list. The curly brackets create a set.
keywords = {'ford', 'hyundai','honda', 'jeep', 'maserati','audi','jaguar',
            'volkswagen','chevrolet','chrysler'}
csv.field_size_limit(sys.maxsize)
invalids = 0
valids = 0
for filename in ['1.csv']:
    # The with statement makes sure that your file is closed automatically,
    # even when an error occurs. This is a common idiom.
    # In addition, in Python 2, CSV files should be opened in 'rb' mode.
    with open(filename, 'rb') as f:
        reader = csv.reader(f, delimiter='|', quotechar='\\')
        for row in reader:
            try:
                print row[2]
                valids += 1
            # Don't use bare except clauses. They will catch
            # exceptions you don't want or intend to catch.
            except IndexError:
                invalids += 1
            # The filtering is done here: print the row as soon as one
            # field is exactly equal to a keyword.
            for field in row:
                if field in keywords:
                    print row
                    break
# Prefer the str.format() method over the old style string formatting.
print 'parsed {0} records. ignored {1}'.format(valids, invalids)
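
The code in this answer is written for Python 2 (print statements, 'rb' mode). For readers on Python 3, the filtering step might look like the sketch below. The function name `filter_rows` and the in-memory sample data are assumptions for illustration, not part of the original answer. As in the answer, a row matches only when a whole field is equal to a keyword.

```python
import csv
import io


def filter_rows(lines, keywords, delimiter='|', quotechar='\\'):
    """Return the rows in which at least one field equals a keyword."""
    reader = csv.reader(lines, delimiter=delimiter, quotechar=quotechar)
    # any() stops at the first matching field, like the break in the answer.
    return [row for row in reader if any(field in keywords for field in row)]


keywords = {'ford', 'hyundai', 'honda', 'jeep', 'maserati', 'audi', 'jaguar',
            'volkswagen', 'chevrolet', 'chrysler'}

# Hypothetical sample data standing in for the contents of '1.csv'.
sample = io.StringIO('ford|red|2010\nbmw|blue|2012\nx|jeep\n')
for row in filter_rows(sample, keywords):
    print(row)
```

With a real file in Python 3, the csv documentation recommends opening it in text mode with `newline=''`, for example `open('1.csv', newline='')`, rather than in 'rb' mode.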
