【Posted】: 2016-08-01 18:40:39
【Problem description】:
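I have a number of Excel files with a similar format:
| name | email | cat1 | cat2 | cat3 |
| smith | email | 01JAN2016 | 01JAN2014 | 01JAN2015 |
The first two columns contain strings (name and email address). Each of the following columns contains the date on which that person completed the item for cat(x).
I would like to compare the dates against current_date and add a new column, "status", whose value is "delinquent" if any date in the row is earlier than the current date and "compliant" otherwise, and then write the new DataFrame out to an Excel spreadsheet.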
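One way to read the rule above: a row is "delinquent" exactly when its earliest date falls before the reference date. The following is a minimal sketch under assumptions, not the asker's code. The toy rows, the email addresses, the `date_cols` list and the fixed `reference_date` (set to the posting date so the result is reproducible) are all illustrative.

```python
import pandas as pd

# Toy frame mirroring the layout in the question; all values are illustrative.
df = pd.DataFrame({
    "name": ["smith", "jones"],
    "email": ["smith@example.com", "jones@example.com"],
    "cat1": pd.to_datetime(["2016-01-01", "2030-01-01"]),
    "cat2": pd.to_datetime(["2014-01-01", "2031-01-01"]),
    "cat3": pd.to_datetime(["2015-01-01", "2032-01-01"]),
})

# Assumption: the date columns are known by name.
date_cols = ["cat1", "cat2", "cat3"]

# Assumption: a fixed reference date instead of datetime.now().
reference_date = pd.Timestamp("2016-08-01")

# Earliest date per row; if it is before the reference date, at least one date is.
earliest = df[date_cols].min(axis=1)
df["status"] = earliest.lt(reference_date).map(
    {True: "delinquent", False: "compliant"}
)

print(df[["name", "status"]])
```

In real use `reference_date` would presumably be `pd.Timestamp.now()`, and the frame would come from `pd.read_excel` rather than being built by hand.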
My initial attempt lets me filter for "old" dates easily, but everything starts to break as soon as I try to add a column based on a condition:
import pandas as pd
import numpy as np
import datetime
current_date = datetime.datetime.now()
writer = pd.ExcelWriter('pd_output.xlsx', engine='xlsxwriter', datetime_format= 'mmm d yyy')
df = pd.read_excel(tracker,'Sheet1')
print(df.values) # displays dates as Timestamp('2016-01-01 00:00:00'); any value which is < current_date displays as True, else False
print(df < current_date) # removes dates that are not older than current_date but does not delete the column, i.e. someone with no old dates still shows up with columns 3+ blank
# a couple of versions of what I have been trying, unsuccessfully
df['Status'] = np.where(df[df < current_date], 'delinquent', 'compliant') # error: 'Wrong number of items passed'
df['Status'] = np.where(df == 'True', 'delinquent', 'compliant') # error: 'str' object has no attribute 'view'
df['Status'] = df.Set.map(lambda x: 'delinquent' if 'True' in df else 'compliant') # from another post; error: no attribute 'Set' or 'map'
# send to output excel
df.to_excel(writer,sheet_name='Sheet1')
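I would like the output to show each row with an added "Status" column that flags whether the row contains an offending date, denoted by "compliant" or "delinquent". I suspect my comparison is wrong (using True rather than another .where), but I can't seem to get it right.
【Discussion】: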
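A likely cause of the "wrong number of items passed" error is a shape mismatch. Comparing several columns at once yields a two-dimensional boolean mask, while a new column needs one value per row. The sketch below is an illustration under assumptions: the toy data and the fixed `reference_date` are made up, and the date columns are assumed to have a datetime dtype, as the `Timestamp(...)` output in the question suggests. It reduces the mask with `any(axis=1)` before passing it to `np.where`.

```python
import numpy as np
import pandas as pd

# Illustrative data; in the question this frame would come from pd.read_excel.
df = pd.DataFrame({
    "name": ["smith", "jones"],
    "email": ["smith@example.com", "jones@example.com"],
    "cat1": pd.to_datetime(["2016-01-01", "2030-01-01"]),
    "cat2": pd.to_datetime(["2014-01-01", "2031-01-01"]),
    "cat3": pd.to_datetime(["2015-01-01", "2032-01-01"]),
})
reference_date = pd.Timestamp("2016-08-01")  # assumption: stands in for "now"

# Restrict the comparison to the datetime columns so the string columns
# (name, email) are never compared against a date.
date_cols = df.select_dtypes(include="datetime").columns

# 2-D mask: one boolean per (row, date column).
mask = df[date_cols] < reference_date
print(mask.shape)

# Collapse to one boolean per row: True if any date in the row is old.
row_is_late = mask.any(axis=1)

# np.where now receives a 1-D condition, so the result fits a single column.
df["Status"] = np.where(row_is_late, "delinquent", "compliant")
print(df[["name", "Status"]])
```

Note that comparing against the string 'True' cannot match anything here, because the mask holds booleans rather than strings.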