Redshift SQL subquery uses ungrouped column from outer query

【Question title】: Redshift SQL subquery uses ungrouped column from outer query
【Posted】: 2020-10-20 03:47:13
【Question】:

When I try to run a query with a subquery, I get the following error: subquery uses ungrouped column "s.event_captured_dt" from outer query

select

to_date(s.event_captured_dt,'DD Mon YYYY')
,count(*) as count_
,(select count(*) 
    from spc_raw_responsys_kanui.sent s1
    where to_date(s1.event_captured_dt,'DD Mon YYYY') between to_date(s.event_captured_dt,'DD Mon YYYY') - 7 and to_date(s.event_captured_dt,'DD Mon YYYY')
) as count7days_
    
from spc_raw_responsys_kanui.sent s

where 1=1 
and to_date(s.event_captured_dt,'DD Mon YYYY') >= '2020-06-01 00:00:00'

group by to_date(s.event_captured_dt,'DD Mon YYYY')

This is what I am trying to achieve:

[image: Goal]

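【Comments】:

Please provide sample data, expected results, and an explanation of the logic.

Tags: sql date select subquery amazon-redshift

【Solution 1】:

Does it work if you pre-aggregate in a subquery first?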

select
    s.*,
    (
        select count(*) 
        from spc_raw_responsys_kanui.sent s1
        where to_date(s1.event_captured_dt,'DD Mon YYYY') 
            between s.event_captured_date - 7 
            and s.event_captured_date
    ) as count7days_
from (
    select to_date(event_captured_dt,'DD Mon YYYY') event_captured_date, count(*) as count_
    from spc_raw_responsys_kanui.sent s
    where to_date(event_captured_dt,'DD Mon YYYY') >= '2020-06-01 00:00:00'
    group by to_date(event_captured_dt,'DD Mon YYYY')
) s
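
If the correlated subquery turns out to be slow on a large table, one possible variant is to pre-aggregate per day in the same way and then sum the daily counts with a window function. This is only a sketch under assumptions that are not stated in the question: it assumes every calendar date in the range has at least one row, because `rows between` counts rows rather than days, and the CTE names `daily` and `rolling` are made up for illustration. The original `between date - 7 and date` condition is inclusive on both ends, so it covers 8 days; `7 preceding and current row` mirrors that.

```sql
-- Sketch: rolling count with a window function instead of a correlated subquery.
-- Assumption: no gaps in the dates, so 8 consecutive rows equal 8 consecutive days.
with daily as (
    select
        to_date(event_captured_dt, 'DD Mon YYYY') as event_captured_date,
        count(*) as count_
    from spc_raw_responsys_kanui.sent
    -- Start 7 days early so the first reported days still see a full look-back window.
    where to_date(event_captured_dt, 'DD Mon YYYY') >= '2020-06-01'::date - 7
    group by 1
),
rolling as (
    select
        event_captured_date,
        count_,
        sum(count_) over (
            order by event_captured_date
            rows between 7 preceding and current row
        ) as count7days_
    from daily
)
select event_captured_date, count_, count7days_
from rolling
-- Drop the extra look-back days that were only needed for the running sum.
where event_captured_date >= '2020-06-01'::date
order by event_captured_date;
```

If some dates have no rows at all, the window would reach further back than 7 days, so in that case the correlated subquery above (or a join against a calendar table) is the safer choice.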

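【Discussion】:

Yes, it works! But I guess it does not perform very well: it took 2 minutes to run.
Are there other options, or should I leave it like this?
@MiguelFilho: From a performance perspective, pre-aggregating is better. The performance problem is a separate issue that has to do with the subquery. For starters, you should consider storing your dates as dates rather than as strings. Then you can remove all the to_date() calls and create an index on the event_captured_dt column.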
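
The suggestion to store the date as a real date could look roughly like the sketch below. The table name `sent_typed` and the column name `event_captured_date` are hypothetical and do not appear in the question. Note also that Redshift has no conventional `create index`; a sort key on the date column is the closest equivalent, which is what this sketch assumes.

```sql
-- Sketch: hypothetical copy of the table with a typed DATE column,
-- sorted on that column so range filters can skip blocks.
create table spc_raw_responsys_kanui.sent_typed
sortkey (event_captured_date)
as
select
    s.*,
    to_date(s.event_captured_dt, 'DD Mon YYYY') as event_captured_date
from spc_raw_responsys_kanui.sent s;

-- The daily aggregation then needs no to_date() calls.
select
    event_captured_date,
    count(*) as count_
from spc_raw_responsys_kanui.sent_typed
where event_captured_date >= '2020-06-01'
group by event_captured_date;
```

Whether this helps in practice depends on table size and distribution, so it is worth comparing run times before and after on the real data.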