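【Posted】: 2019-10-15 23:19:34
【Question】:
I'm currently working on a Python project that imports a text data file (a CSV in my case) and then outputs the employees who worked together on a common project for the longest time. First, here are the code and the data file: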
from collections import defaultdict
from itertools import combinations
from datetime import datetime
import csv

d = defaultdict(list)
with open("data.csv") as f:
    next(f)  # skip header
    r = csv.reader(f)
    # unpack each row; use ProjectID as the key and append (EmpID, FromDate, ToDate)
    for EmpID, ProjectID, FromDate, ToDate in r:
        d[int(ProjectID)].append((EmpID, FromDate, ToDate))

for job, aref in d.items():
    if len(aref) >= 2:
        for ref in combinations(aref, 2):
            # the overlap runs from the later start date to the earlier end date
            begin = max(map(lambda x: x[1], ref))
            end = min(map(lambda x: x[2], ref))
            delta = datetime.strptime(end, '%Y-%m-%d') \
                - datetime.strptime(begin, '%Y-%m-%d')
            dd = delta.days
            if dd > 0:
                print('Employees with EmpID:', ref[0][0], 'and', ref[1][0],
                      'worked together on a common project (Project ID:', job, ') for a total of', dd, 'days')
This is the data file I'm importing:
EmpID,ProjectID,DateFrom,DateTo
1,100,2014-11-01,2015-05-01
2,101,2013-12-06,2014-10-06
3,102,2015-06-04,2017-09-04
5,103,2014-10-01,2015-12-01
2,100,2013-03-07,2015-11-07
2,103,2015-07-09,2019-01-19
4,102,2013-11-13,2014-03-13
4,103,2016-02-14,2017-03-15
5,104,2014-03-15,2015-11-09
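Now I have a new requirement: if the 'DateTo' column contains the value 'NULL', I have to treat it as today's date. I assume there is a built-in Python function that returns the current date, and that I could add an if statement inside the CSV-reading block to replace 'NULL' with today's date (but as far as I know, the file is only opened in read mode?). If anyone can give me any hints, I'd appreciate it. Thanks.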
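One way to approach this, as a minimal sketch rather than a definitive answer: the file does not need to be writable, because the replacement can happen in memory while each row is read. `datetime.date.today()` returns the current date, and `isoformat()` renders it as `YYYY-MM-DD`, matching the format already used in the data. The helper name `resolve_to_date` and the inline sample rows (including the `NULL` row) are assumptions made for illustration; they are not part of the original project.

```python
import csv
import io
from datetime import date


def resolve_to_date(value, today=None):
    """Return `value` unchanged unless it is the 'NULL' placeholder.

    Hypothetical helper: 'NULL' is replaced by today's date as YYYY-MM-DD.
    """
    today = today or date.today()
    return today.isoformat() if value.strip().upper() == "NULL" else value


# Hypothetical sample standing in for data.csv; the second row is open-ended.
sample = io.StringIO(
    "EmpID,ProjectID,DateFrom,DateTo\n"
    "1,100,2014-11-01,2015-05-01\n"
    "2,100,2013-03-07,NULL\n"
)

reader = csv.reader(sample)
next(reader)  # skip header
rows = [
    (emp_id, int(project_id), date_from, resolve_to_date(date_to))
    for emp_id, project_id, date_from, date_to in reader
]
```

In the original loop, the same idea would amount to appending `resolve_to_date(ToDate)` instead of `ToDate`, leaving the rest of the overlap calculation untouched.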
Edit: My earlier attempt at a Pandas solution (about 50% complete):
# Load the Pandas libraries with alias 'pd'
import pandas as pd
import datetime as dt
import numpy as np
# Read data from file 'filename.csv'
# (in the same directory that your python process is based)
# Control delimiters, rows, column names with read_csv (see later)
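# The dates in data.csv use dashes (e.g. 2014-11-01), so the format must match;
# with '%Y/%m/%d' and errors='coerce', every value would silently become NaT.
date_parser = lambda c: pd.to_datetime(c, format='%Y-%m-%d', errors='coerce')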
df = pd.read_csv('data.csv', delimiter = ',', parse_dates=[2,3], date_parser=date_parser)
df.set_index("EmpID", inplace = True)
df.sort_values(['ProjectID'], inplace=True)
df['Days Worked'] = (df['DateTo'] - df['DateFrom']).dt.days
cutdown_projecs = df.groupby('ProjectID').filter(lambda x: len(x) >= 2)
print(cutdown_projecs)
【Comments】:
-
You need to read from this file and write to a second file, so open 2 files.
-
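@jezrael I tried a Pandas solution earlier, and I can show you my progress. I'm just not familiar with the package and the operations I can perform on a DataFrame, so I decided to go with the plain Python approach.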
-
@jezrael Yes, it's all solved. Thanks.
-
@GerganZhekov - Super, good luck.
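The suggestion in the first comment, reading from one file and writing the cleaned rows to a second, could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `copy_with_today` is hypothetical, and in-memory `io.StringIO` objects stand in for the two files so the example runs on its own. With real files, both would typically be opened with `newline=""`, one in read mode and one in write mode.

```python
import csv
import io
from datetime import date


def copy_with_today(src, dst, today=None):
    """Copy CSV rows from src to dst, replacing 'NULL' in DateTo with today's date.

    Hypothetical helper; src and dst are open text file objects.
    """
    today_text = (today or date.today()).isoformat()
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row["DateTo"] == "NULL":
            row["DateTo"] = today_text
        writer.writerow(row)


# In-memory stand-ins for the source and destination files (assumed sample data).
source = io.StringIO(
    "EmpID,ProjectID,DateFrom,DateTo\n"
    "2,100,2013-03-07,NULL\n"
)
target = io.StringIO()
copy_with_today(source, target, today=date(2019, 10, 15))
```

Passing `today` explicitly keeps the result reproducible; leaving it out falls back to the current date.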
Tags: python python-3.x pandas csv dictionary