【Question title】: How can you generate a date list from a range in Amazon Redshift?
【Posted】: 2018-01-24 17:54:16
【Question description】:

"Getting date list in a range in PostgreSQL" shows how to get a date range in PostgreSQL. However, Redshift does not support generate_series():

ans=> select (generate_series('2012-06-29', '2012-07-03', '1 day'::interval))::date;
ERROR:  function generate_series("unknown", "unknown", interval) does not exist
HINT:  No function matches the given name and argument types. You may need to add explicit type casts.

Is there a way to replicate what generate_series() does in Redshift?

【Question comments】:

Tags: amazon-redshift


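【Solution 1】:

A hack, but it works:

Use a table that contains many rows, together with a window function, to generate the series.

This works as long as the series you generate is shorter than the number of rows in the table you use to generate it.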

WITH x(dt) AS (SELECT '2016-01-01'::date)
SELECT 
    dateadd(
        day, 
        COUNT(*) over(rows between unbounded preceding and current row) - 1, 
    dt)
FROM users, x 
LIMIT 100

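The initial date 2016-01-01 sets the start of the series, and the LIMIT sets how many days are generated.

Update:

Redshift has partial support for the generate_series function, but unfortunately its documentation does not mention it.

The following works and, as of this writing (2018-01-29), is the shortest and most readable way to generate a series of dates. Because x starts at 1, the first date returned is 2016-01-02; start the series at 0 to include 2016-01-01.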
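To cover an explicit start and end date rather than a fixed row count, the same trick can be sketched as below. This is only an illustration: it assumes, as in the answer, that a table named users exists and has at least as many rows as there are days in the range, and it swaps the running COUNT(*) for ROW_NUMBER(). The CTE names bounds and seq are made up for this example.

```sql
-- Sketch: date list between two bounds, numbering the rows of an existing table.
-- Assumes "users" has at least as many rows as days in the range.
WITH bounds AS (
    SELECT '2012-06-29'::date AS start_date,
           '2012-07-03'::date AS end_date
),
seq AS (
    -- 0-based row counter; ROW_NUMBER() returns bigint
    SELECT ROW_NUMBER() OVER () - 1 AS n
    FROM users
)
SELECT dateadd(day, n::int, start_date)::date AS dt
FROM seq
CROSS JOIN bounds
WHERE n <= datediff(day, start_date, end_date)
ORDER BY dt;
```

If users is very large, it may be worth limiting the rows read in seq, since every row is numbered before the filter is applied.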

SELECT ('2016-01-01'::date + x)::date 
FROM generate_series(1, 100, 1) x
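For the exact range in the question, the same query might look like the sketch below. The bounds 0 and 4 are simply the day offsets of 2012-06-29 and 2012-07-03 from the start date, written out by hand, and the query is subject to the leader-node limitation raised in the comments on this answer.

```sql
-- Sketch: dates from 2012-06-29 through 2012-07-03 (offsets 0..4).
-- Leader-node only: expect this to fail once it is joined to a real table.
SELECT ('2012-06-29'::date + x)::date AS dt
FROM generate_series(0, 4, 1) x;
```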

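【Comments】:

  • generate_series will work when it runs on the leader node. However, when the query is joined with an actual table, it will try to execute on the other nodes and fail. The Redshift documentation on unsupported PostgreSQL features notes that some unsupported functions do not return an error when run on the leader node.
  • Yes, you're right. It only works on single-node clusters. I will update the answer with your comment.
【Solution 2】:

If you don't want to depend on any existing table, one option is to pre-generate a series table filled with a range of numbers, one per row.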

create table numbers as (
  select
          p0.n
          + p1.n*2
          + p2.n * power(2,2)
          + p3.n * power(2,3)
          + p4.n * power(2,4)
          + p5.n * power(2,5)
          + p6.n * power(2,6)
          + p7.n * power(2,7)
          + p8.n * power(2,8)
          + p9.n * power(2,9)
          + p10.n * power(2,10)
          as number
        from
          (select 0 as n union select 1) p0,
          (select 0 as n union select 1) p1,
          (select 0 as n union select 1) p2,
          (select 0 as n union select 1) p3,
          (select 0 as n union select 1) p4,
          (select 0 as n union select 1) p5,
          (select 0 as n union select 1) p6,
          (select 0 as n union select 1) p7,
          (select 0 as n union select 1) p8,
          (select 0 as n union select 1) p9,
          (select 0 as n union select 1) p10
  order by 1
);

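This creates a table containing the numbers 0 through 2^11 - 1 (2047), because eleven one-bit subqueries (p0 through p10) are combined. If you need more numbers, just add more clauses.

Once you have this table, you can join to it in place of generate_series: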
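A quick sanity check on the generated table can confirm its range. The query below is a minimal sketch against the numbers table created above; with the eleven subqueries p0 through p10 it should report 2048 rows spanning 0 to 2047. Note that power() returns a floating-point value, so the number column is not an integer type, which is why the answer casts it with number::int before passing it to dateadd.

```sql
-- Sketch: verify the contents of the "numbers" table created above.
SELECT MIN(number) AS smallest,
       MAX(number) AS largest,
       COUNT(*)    AS row_count
FROM numbers;
```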

with date_range as (select 
   '2012-06-29'::timestamp as start_date , 
   '2012-07-03'::timestamp as end_date
)
select
    dateadd(day, number::int, start_date)
from date_range
inner join numbers on number <= datediff(day, start_date, end_date)
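A common reason to build such a date list is to report on days that have no data at all. The sketch below left-joins the generated dates to a purely hypothetical table orders(order_id, created_at); that table and its columns are assumptions for illustration and do not appear in the question. The CTE name calendar is likewise invented here.

```sql
-- Sketch: count rows per day, including days with zero rows.
-- "orders", "order_id" and "created_at" are hypothetical names.
WITH date_range AS (
    SELECT '2012-06-29'::timestamp AS start_date,
           '2012-07-03'::timestamp AS end_date
),
calendar AS (
    SELECT dateadd(day, number::int, start_date)::date AS dt
    FROM date_range
    INNER JOIN numbers ON number <= datediff(day, start_date, end_date)
)
SELECT c.dt,
       COUNT(o.order_id) AS orders_placed  -- counts only matched rows, so empty days give 0
FROM calendar c
LEFT JOIN orders o ON o.created_at::date = c.dt
GROUP BY c.dt
ORDER BY c.dt;
```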

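【Comments】:

    【Solution 3】:

    @michael_erasmus This is interesting. I changed it to use bitwise operators for better performance. Note that with ten one-bit subqueries (p0 through p9) the view yields 0 through 1023, despite its name.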

    CREATE OR REPLACE VIEW v_series_0_to_1024 AS SELECT
      p0.n 
      | (p1.n << 1) 
      | (p2.n << 2) 
      | (p3.n << 3) 
      | (p4.n << 4) 
      | (p5.n << 5) 
      | (p6.n << 6) 
      | (p7.n << 7) 
      | (p8.n << 8) 
      | (p9.n << 9)
      as number
    from
      (select 0 as n union select 1) p0,
      (select 0 as n union select 1) p1,
      (select 0 as n union select 1) p2,
      (select 0 as n union select 1) p3,
      (select 0 as n union select 1) p4,
      (select 0 as n union select 1) p5,
      (select 0 as n union select 1) p6,
      (select 0 as n union select 1) p7,
      (select 0 as n union select 1) p8,
      (select 0 as n union select 1) p9
    order by number
    

    Date series for the last 30 days:

    select dateadd(day, -number, current_date) as dt from v_series_0_to_1024 where number < 30
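The same view can also produce the range from the original question. This is a minimal sketch that assumes the view v_series_0_to_1024 defined above exists; it should return the five dates from 2012-06-29 through 2012-07-03, and it only works for ranges of at most 1024 days because the view stops at 1023.

```sql
-- Sketch: dates between two bounds using the series view defined above.
SELECT dateadd(day, number, '2012-06-29'::date)::date AS dt
FROM v_series_0_to_1024
WHERE number <= datediff(day, '2012-06-29'::date, '2012-07-03'::date)
ORDER BY dt;
```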
    

    【Comments】: