【Question title】: Top N items in every month - BIGQUERY
【Posted】: 2020-12-05 02:45:16
【Question description】:

I have the following BigQuery query:

WITH cte AS(
SELECT *
  FROM (
    SELECT project_name,
    SUM(reward_value) AS total_reward_value,
    DATE_TRUNC(date_signing, MONTH) as month,
    date_signing,
    Row_number() over (partition by DATE_TRUNC(date_signing, MONTH)
                                          order by SUM(reward_value) desc) AS rank
      FROM `deals`
    WHERE CAST(date_signing as DATE) > '2019-12-31' 
    AND CAST(date_signing as DATE) < '2020-02-01'
    AND target_category = 'achieved'
    AND project_name IS NOT NULL
    GROUP BY project_name, month, date_signing
  )
)

SELECT * FROM cte WHERE rank <= 5

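It returns the following result:

What I want instead is the SUM for each unique project within each month, and then to keep only the top 5.

Something like this:

If I remove date_signing from the GROUP BY, I get the following error: PARTITION BY expression references column date_signing which is neither grouped nor aggregated at [16:48]

Any hints on what should be corrected would be much appreciated!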

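【Question comments】:

  • You are grouping by `date_signing`, which I don't think you want. Also, using column ordinal positions in GROUP BY or ORDER BY is not a good idea, since it makes the query hard to read and maintain.
  • I edited my question to explain why I included `date_signing` in the GROUP BY.

Tags: sql google-bigquery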


【Solution 1】:

Then maybe add one more subquery?

WITH cte AS(
  SELECT project_name,
    SUM(reward_value) as reward_sum,
    DATE_TRUNC(date_signing, MONTH) as month
  FROM `deals`
  WHERE CAST(date_signing as DATE) > '2019-12-31' 
    AND CAST(date_signing as DATE) < '2020-02-01'
    AND target_category = 'achieved'
    AND project_name IS NOT NULL
  GROUP BY project_name, month
),
ranks AS (
  SELECT 
    project_name,
    reward_sum,
    month,
    ROW_NUMBER() OVER (PARTITION BY month ORDER BY reward_sum DESC) AS rank
  FROM cte
)
SELECT * 
FROM ranks 
WHERE rank <= 5
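
The idea of aggregating first and ranking afterwards can be tried without touching the real table. The following is only a minimal sketch: `sample_deals` and its three rows are hypothetical stand-ins for `deals`, and `date_signing` is assumed to be a DATE here (the question does not state its type).

```sql
-- Hypothetical sample rows standing in for the `deals` table.
WITH sample_deals AS (
  SELECT * FROM UNNEST([
    STRUCT('A' AS project_name, 10 AS reward_value, DATE '2020-01-03' AS date_signing),
    STRUCT('A' AS project_name, 5 AS reward_value, DATE '2020-01-20' AS date_signing),
    STRUCT('B' AS project_name, 7 AS reward_value, DATE '2020-01-15' AS date_signing)
  ])
),
-- Step 1: collapse the deals into one row per project per month.
monthly AS (
  SELECT
    project_name,
    DATE_TRUNC(date_signing, MONTH) AS month,
    SUM(reward_value) AS reward_sum
  FROM sample_deals
  GROUP BY project_name, month
),
-- Step 2: rank the already-aggregated rows within each month.
-- The window only sees `month` and `reward_sum`, which are plain columns here,
-- so no ungrouped column is referenced.
ranked AS (
  SELECT
    project_name,
    month,
    reward_sum,
    ROW_NUMBER() OVER (PARTITION BY month ORDER BY reward_sum DESC) AS rank
  FROM monthly
)
SELECT *
FROM ranked
WHERE rank <= 5
ORDER BY month, rank
```

With these sample rows, project A's two January deals collapse into a single row with a reward_sum of 15, which ranks ahead of project B's 7.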

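【Comments】:

  • I added date_signing to work around this BigQuery error: PARTITION BY expression references column date_signing which is neither grouped nor aggregated at [16:48]. I've updated my question.
  • Then maybe add one more subquery?
  • That worked. A second CTE was what I needed, thanks!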
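
As a possible variation on the second CTE, BigQuery's QUALIFY clause can filter on a window function directly, so the ranking step does not need its own subquery. This is a sketch that has not been run against the asker's data; the CTE name `monthly` is made up for illustration, and the filters are copied from the question.

```sql
WITH monthly AS (
  -- Aggregate first: one row per project per month.
  SELECT
    project_name,
    SUM(reward_value) AS reward_sum,
    DATE_TRUNC(date_signing, MONTH) AS month
  FROM `deals`
  WHERE CAST(date_signing AS DATE) > '2019-12-31'
    AND CAST(date_signing AS DATE) < '2020-02-01'
    AND target_category = 'achieved'
    AND project_name IS NOT NULL
  GROUP BY project_name, month
)
SELECT project_name, reward_sum, month
FROM monthly
-- WHERE TRUE is a harmless guard (assumption: some BigQuery releases
-- expected QUALIFY to be accompanied by WHERE, GROUP BY, or HAVING).
WHERE TRUE
-- Keep only the five largest sums within each month.
QUALIFY ROW_NUMBER() OVER (PARTITION BY month ORDER BY reward_sum DESC) <= 5
```

The aggregation still happens in its own CTE, so the window function only ever sees grouped columns. The rank value itself is not returned here; add the ROW_NUMBER() expression to the SELECT list if it is needed in the output.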
【Solution 2】:

Right, you can't do that. You could show the last signing date instead:

WITH cte AS(
    SELECT project_name,
           SUM(reward_value),
           DATE_TRUNC(date_signing, MONTH) as month,
           MAX(date_signing) as last_signing_date,
           Row_number() over (partition by DATE_TRUNC(date_signing, MONTH)
                                          order by SUM(reward_value) desc) AS rank
    FROM `deals`
    WHERE CAST(date_signing as DATE) > '2019-12-31' 
        AND CAST(date_signing as DATE) < '2020-02-01'
        AND target_category = 'achieved'
        AND project_name IS NOT NULL
    GROUP BY project_name, month
  
)

SELECT * FROM cte WHERE rank <= 5
