【Posted】: 2018-07-19 01:52:50
【Question】:
I'm trying to loop over a list with pandasql's sqldf, but sqldf doesn't seem to pick up the loop variable. Below is a stylized outline of my problem:
import pandas as pd
from pandasql import sqldf
from datetime import datetime
FreqGamePlay = pd.DataFrame({'CONTACT_WID' : [1, 2, 3, 1, 4],
                             'TITLE_NOMIN_DT' : pd.to_datetime(['20130102', '20140103', '20120518',
                                                                '20140317', '20111123']),
                             'FreqGamePlay' : [12, 9, 22, 4, 5]})
FreqGamePlay = FreqGamePlay[['CONTACT_WID', 'TITLE_NOMIN_DT', 'FreqGamePlay']]
periodsList = ['2012-12-26', '2012-02-28']
for i in periodsList:
    temp = sqldf("select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay where TITLE_NOMIN_DT > i group by CONTACT_WID;", globals())
    print(temp)
The program above raises the following error:
PandaSQLException: (sqlite3.OperationalError) no such column: i [SQL: 'select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay where TITLE_NOMIN_DT > i group by CONTACT_WID;']
But if I hard-code the date by hand, it works fine:
for i in periodsList:
    temp = sqldf("select CONTACT_WID, sum(FreqGamePlay) as FGP from FreqGamePlay where TITLE_NOMIN_DT > '2012-12-26' group by CONTACT_WID;", globals())
    print(temp)
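However, hard-coding is not practical, because the date list in the actual program is much longer. Any suggestions are much appreciated, thanks.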
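For reference, here is a minimal sketch of what seems to be going on, using only the standard-library sqlite3 module (the error message shows that SQLite is the backend). The query is just a string, so SQLite reads a bare `i` as a column name and never sees the Python loop variable; the value has to be passed in explicitly. The in-memory table, the ISO-formatted date strings, the added `ORDER BY`, and the `results` dict are assumptions made for illustration and are not part of the original program. With sqldf itself, which takes only a query string and an environment, the analogous step would presumably be to build the string on each iteration (for example with `str.format`) with the date in quotes.

```python
import sqlite3

# Stand-in for the FreqGamePlay DataFrame: the same rows, stored in an
# in-memory SQLite table. Dates are kept as ISO-8601 text (an assumption)
# so that string comparison orders them chronologically.
rows = [
    (1, "2013-01-02", 12),
    (2, "2014-01-03", 9),
    (3, "2012-05-18", 22),
    (1, "2014-03-17", 4),
    (4, "2011-11-23", 5),
]
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE FreqGamePlay "
    "(CONTACT_WID INTEGER, TITLE_NOMIN_DT TEXT, FreqGamePlay INTEGER)"
)
conn.executemany("INSERT INTO FreqGamePlay VALUES (?, ?, ?)", rows)

# The ? placeholder marks where the date goes; ORDER BY is added only to
# make the row order deterministic.
query = (
    "SELECT CONTACT_WID, SUM(FreqGamePlay) AS FGP "
    "FROM FreqGamePlay WHERE TITLE_NOMIN_DT > ? "
    "GROUP BY CONTACT_WID ORDER BY CONTACT_WID"
)

periodsList = ["2012-12-26", "2012-02-28"]
results = {}
for i in periodsList:
    # The current value of i is bound to the placeholder on each pass.
    results[i] = conn.execute(query, (i,)).fetchall()
    print(i, results[i])
```

Binding the value keeps the quoting correct for every date in the list, however long it is.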
【Comments】:
Tags: python sql pandas pandasql