【Question title】: Print the count of lines written into CSV file
【Posted】: 2013-06-12 13:03:01
【Question】:

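[Python3] I have a script that reads a (long) CSV file containing email addresses and their corresponding country codes, and splits them into separate files by country code. That works fine, but I would like the script to print the number of lines (i.e. emails) it has written to each file.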

Also, I'm quite new to programming and Python, so I'd be glad to receive any optimization suggestions or other general tips!

The input file looks like this:

12345@12345.com     us
xyz@xyz.com         gb
aasdj@ajsdf.com     fr
askdl@kjasdf.com    de
sdlfj@aejf.com      nl
...                 ...

The output should look like this:

Done!
us: 20000
gb: 20000
de: 10000
fr: 10000
nl: 10000
...

My code is as follows:

import csv, datetime
from collections import defaultdict

"""
Script splits a (long) list of email addresses with associated country codes by country codes.
Input file should have only two columns of data - ideally.
"""

# Declaring variables
emails = defaultdict(list)
in_file = "test.tsv"          # Write filename here.
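filename = in_file.rsplit(".", 1)  # [base name, extension]; splits on the last dot only

# Checks if file is comma or tab separated and sets delimiter variable.
if filename[1] == "csv":
    delimiter = ','
elif filename[1] == "tsv":
    delimiter = '\t'
else:
    # Fail early instead of hitting an undefined 'delimiter' further down
    raise ValueError("Unsupported file extension: " + filename[1])

# Reads csv/tsv file and cleans email addresses.
# newline='' is what the csv module expects for files opened in Python 3
with open(in_file, 'r', newline='') as f: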
    reader = csv.reader(f, delimiter=delimiter)
    for row in reader:
        # Gets rid of empty rows
        if row:
            # Gets rid of non-emails
            if '@' in row[0]:
                # Strips the emails from whitespace and appends to the 'emails' list
                # Also now 'cc' is in the first position [0] and email in the second [1]
                emails[row[1].strip()].append(row[0].strip()+'\n')

# Outputs the emails by cc and names the file.
for key, value in emails.items():
    # Key is 'cc' and value is 'email'
    # File is named by "today's date-original file's name-cc"
    with open('{0:%Y%m%d}-{1}-{2}.csv'.format(datetime.datetime.now(), filename[0], key), 'w') as f:
        f.writelines(value)

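【Comments】:

  • Can you post a short example of the emails dict? Say 5 or 10 emails?
  • Isn't this just len(value)?
  • @jamylak I think the OP wants a separate count for each country code, which I guess would involve looking at the addresses.
  • @pypat: It's already in key at the time of writing.
  • Sounds like you want print("{}: {}".format(key, len(value))). See the documentation for format().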

Tags: python python-3.x csv


【Solution 1】:

To get the desired output, print the key (your country code) and the length of the value (your list of emails), like this:

# Outputs the emails by cc and names the file.
for key, value in emails.items():
    # Key is 'cc' and value is 'email'
    # File is named by "today's date-original file's name-cc"
    with open('{0:%Y%m%d}-{1}-{2}.csv'.format(datetime.datetime.now(), filename[0], key), 'w') as f:
        f.writelines(value)

    # The file is closed (de-indented from the with), but we're still in the for loop
    # Use the format() method of a string to print in the form 'cc: number of emails'
    print('{}: {}'.format(key, len(value)))

This uses format() to produce a string such as gb: 30000 (the Python documentation has more examples of its usage).
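
For anyone who wants to try the counting step without touching real files, here is a minimal, self-contained sketch. It is not the asker's exact script: the helper names group_emails_by_country and format_counts and the in-memory sample data are made up for illustration, and sorting the report by group size is an extra assumption, not something the question requires.

```python
import csv
import io
from collections import defaultdict


def group_emails_by_country(lines, delimiter="\t"):
    """Group cleaned email addresses by country code (hypothetical helper)."""
    emails = defaultdict(list)
    for row in csv.reader(lines, delimiter=delimiter):
        # Skip empty rows, rows without a country column, and non-emails
        if len(row) >= 2 and "@" in row[0]:
            emails[row[1].strip()].append(row[0].strip())
    return emails


def format_counts(emails):
    """Return one 'cc: count' string per country code, largest group first."""
    ordered = sorted(emails.items(), key=lambda item: len(item[1]), reverse=True)
    return ["{}: {}".format(cc, len(addresses)) for cc, addresses in ordered]


# Made-up sample data standing in for the real tab-separated input file
sample = io.StringIO(
    "12345@12345.com\tus\n"
    "xyz@xyz.com\tgb\n"
    "abc@abc.com\tus\n"
    "\n"
    "not-an-email\tfr\n"
)

grouped = group_emails_by_country(sample)
report = format_counts(grouped)
print("Done!")
print("\n".join(report))
```

Because each value in the dict is the list of emails for one country code, len() of that list is exactly the number of lines that would be written to that country's file, so no separate counter is needed.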
