【Question Title】: Referencing other columns in a SQL SELECT
【Posted】: 2021-03-08 12:25:02
【Question】:

I have a SQL query in BigQuery:

SELECT
  creator.country,
  (SUM(length) / 60) AS total_minutes,
  COUNT(DISTINCT creator.id) AS total_users,
  (SUM(length) / 60 / COUNT(DISTINCT creator.id)) AS minutes_per_user
FROM
  ...

As you may have noticed, the last column is equivalent to total_minutes / total_users.

I tried this, but it doesn't work:

SELECT
  creator.country,
  (SUM(length) / 60) AS total_minutes,
  COUNT(DISTINCT creator.id) AS total_users,
  (total_minutes / total_users) AS minutes_per_user
FROM
  ...

Is there any way to make this simpler?

【Discussion】:

    Tags: sql google-bigquery


    【Solution 1】:

    Not really. That is, you cannot reuse a column alias in another expression of the same SELECT. If you really want to, you can use a subquery or a CTE:

    SELECT c.*,
           total_minutes / total_users AS minutes_per_user
    FROM (SELECT creator.country,
                 (SUM(length) / 60) AS total_minutes,
                 COUNT(DISTINCT creator.id) AS total_users
          FROM ...  -- the original FROM clause (and any GROUP BY) goes here
         ) c;
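
    The answer mentions a CTE but only shows the subquery form. A minimal sketch of the CTE variant follows. The `sample` rows and the `per_country` name are made up for illustration and stand in for the elided FROM clause; the column shapes (`creator.id`, `creator.country`, `length`) are assumed from the question.

    ```sql
    -- Hypothetical inline rows standing in for the real table.
    WITH sample AS (
      SELECT * FROM UNNEST([
        STRUCT(STRUCT(1 AS id, 'US' AS country) AS creator, 120 AS length),
        STRUCT(STRUCT(1 AS id, 'US' AS country) AS creator, 240 AS length),
        STRUCT(STRUCT(2 AS id, 'US' AS country) AS creator, 60 AS length),
        STRUCT(STRUCT(3 AS id, 'DE' AS country) AS creator, 300 AS length)
      ])
    ),
    -- First step: compute the aggregates once and give them names.
    per_country AS (
      SELECT
        creator.country AS country,
        SUM(length) / 60 AS total_minutes,
        COUNT(DISTINCT creator.id) AS total_users
      FROM sample
      GROUP BY creator.country
    )
    -- Second step: the aliases are now ordinary columns and can be reused.
    SELECT
      country,
      total_minutes,
      total_users,
      total_minutes / total_users AS minutes_per_user
    FROM per_country;
    ```

    With these sample rows, US has 7 total minutes across 2 users (3.5 per user) and DE has 5 minutes for 1 user. The CTE and the subquery are equivalent here; the CTE mostly helps readability when more derived columns build on the same aggregates.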
    


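      【Solution 2】:

      Another option is to move all the business logic of the metric calculation into a UDF (temporary or permanent, depending on how you need to use it)...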

      create temp function custom_stats(arr any type) as ((
        select as struct    
          sum(length) / 60 as total_minutes,
          count(distinct id) as total_users,
          sum(length) / 60 / count(distinct id) as minutes_per_user
        from unnest(arr)
      ));
      

      ... so that the query itself stays simple and minimally verbose, as in the example below:

      select creator.country,
        custom_stats(array_agg(struct(length, creator.id))).*
      from `project.dataset.table`
      group by country
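
      To try the UDF approach without a real table, the function and the query can be run together as one script. This is only a sketch: the `sample` rows are invented for illustration, and the column shapes (`creator.id`, `creator.country`, `length`) are assumed from the question.

      ```sql
      -- Same temp function as above, repeated so the script is self-contained.
      create temp function custom_stats(arr any type) as ((
        select as struct
          sum(length) / 60 as total_minutes,
          count(distinct id) as total_users,
          sum(length) / 60 / count(distinct id) as minutes_per_user
        from unnest(arr)
      ));

      -- Hypothetical inline rows standing in for `project.dataset.table`.
      with sample as (
        select * from unnest([
          struct(struct(1 as id, 'US' as country) as creator, 120 as length),
          struct(struct(1 as id, 'US' as country) as creator, 240 as length),
          struct(struct(2 as id, 'US' as country) as creator, 60 as length),
          struct(struct(3 as id, 'DE' as country) as creator, 300 as length)
        ])
      )
      select creator.country,
        -- struct(length, creator.id) yields fields named length and id,
        -- which is what the function body expects after unnest(arr).
        custom_stats(array_agg(struct(length, creator.id))).*
      from sample
      group by creator.country;
      ```

      Note that the ratio is still spelled out in full inside the function, so the repetition is moved into one reusable place rather than eliminated.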
      
