【Question title】: SQL Count with Grouped aggregates
【Posted】: 2019-11-18 22:14:23
【Question】:

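I'm trying to write a SQL query that lets me build a line chart of historical data. I want to work out how many users are on each version of my app over time (on a daily basis). The Y axis would be the percentage of usage across all versions of the app (out of 100), the X axis would be the day, and each build would be a separate line. At any point in time, all the lines should sum to 100%.

Since this query has to be grouped by version/build as well as by date, I'm trying to figure out how to get the total number of users for any given day inside the query, so that I can compute the percentage. So far, I've got this query: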

SELECT DISTINCT 
    sub.Version, 
    sub.Build,     
    sub.app_id, 
    sub.Users, 
    sub.`day`,
    (
        SELECT COUNT(DISTINCT user_id)
        FROM snowplow_enricher_good seg
    ) AS Total,
    (sub.Users/Total) * 100 AS Percent
FROM 
(
    SELECT
        visitParamExtractString(seg.contexts, 'version') AS Version,
        visitParamExtractString(seg.contexts, 'build') AS Build,
        seg.app_id,
        seg.`day`,
        CONCAT(
            Version, 
            ' (', 
            Build, 
            ')'
        ) AS AppBuildVersion,
        COUNT(DISTINCT seg.user_id) AS Users
    FROM snowplow_enricher_good seg
    GROUP BY Version, Build, app_id, `day`
    ORDER BY Users DESC
) AS sub
WHERE sub.app_id = 'APPID';

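Note that the percentage currently shown is relative to the users across all days, not to the users of a single day. I tried adding a WHERE clause to the subquery that computes Total, but that failed.

Thanks in advance :)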

【Comments】:

    Tags: sql clickhouse


    【Answer 1】:

    Using groupArray:

    SELECT
        totalCnt,
        totalSum,
        ga.1 AS tag,
        ga.2 AS value,
        (value / totalSum) * 100 AS percent
    FROM
    (
        SELECT
            count() AS totalCnt,
            sum(value) AS totalSum,
            groupArray((tag, value)) AS ga
        FROM
        (
            SELECT
                tag,
                value
            FROM
            (
                SELECT
                    [1, 2, 3, 4, 5] AS tag,
                    [10, 100, 50, 100, 40] AS value
            )
            ARRAY JOIN
                tag,
                value
        )
    )
    ARRAY JOIN ga
    
    ┌─totalCnt─┬─totalSum─┬─tag─┬─value─┬────────────percent─┐
    │        5 │      300 │   1 │    10 │ 3.3333333333333335 │
    │        5 │      300 │   2 │   100 │  33.33333333333333 │
    │        5 │      300 │   3 │    50 │ 16.666666666666664 │
    │        5 │      300 │   4 │   100 │  33.33333333333333 │
    │        5 │      300 │   5 │    40 │ 13.333333333333334 │
    └──────────┴──────────┴─────┴───────┴────────────────────┘
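
The toy example above can be adapted to the table from the question. The following is a minimal, untested sketch: it assumes the snowplow_enricher_good table, its day, app_id, user_id and contexts columns, and the 'APPID' placeholder exactly as they appear in the question. The aliases users, total_users and ga are illustrative names. The middle query groups by day, so every day gets its own total.

```sql
SELECT
    day,
    ga.1 AS version,
    ga.2 AS users,
    (users / total_users) * 100 AS percent
FROM
(
    -- One row per day: the day's total plus an array of (version, users) tuples
    SELECT
        day,
        sum(users) AS total_users,
        groupArray((version, users)) AS ga
    FROM
    (
        -- Distinct users per day and version (column names taken from the question)
        SELECT
            day,
            visitParamExtractString(contexts, 'version') AS version,
            count(DISTINCT user_id) AS users
        FROM snowplow_enricher_good
        WHERE app_id = 'APPID'
        GROUP BY
            day,
            version
    )
    GROUP BY day
)
-- Unfold the array back into one row per (day, version)
ARRAY JOIN ga
ORDER BY
    day DESC,
    version
```

Because total_users is the sum of the per-version counts, the percentages for a day add up to 100%. The trade-off is that a user seen on two versions on the same day is counted once for each version in that total.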
    

    【Comments】:

      【Answer 2】:

      I was able to solve this using a join of two subqueries:

      SELECT 
          day, 
          app_id, 
          version, 
          version_count, 
          app_count, 
          (version_count / app_count) * 100 AS percent
      FROM (
          SELECT 
              day, 
              app_id, 
              visitParamExtractString(contexts, 'version') AS version, 
              count(DISTINCT user_id) AS version_count
          FROM 
              snowplow_enricher_good
          WHERE
              day >= subtractDays(today(), 30)
          GROUP BY 
              day, 
              app_id, 
              version
      ) 
      INNER JOIN (
          SELECT 
              day, 
              app_id, 
              count(DISTINCT user_id) AS app_count
          FROM 
              snowplow_enricher_good
          WHERE 
              day >= subtractDays(today(), 30)
          GROUP BY 
              day, 
              app_id
      )
      USING 
          day, 
          app_id
      WHERE
          app_id = 'APPID'
      ORDER BY 
          day DESC, 
          app_id, 
          version;
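
One thing to be aware of in the join above: app_count counts distinct users per day, so a user who appears under two versions on the same day is counted in both version rows but only once in the denominator, and the percentages for that day can then add up to more than 100%. If the lines must always sum to 100%, one option is to divide by the sum of the per-version counts instead. The sketch below does that with a window function. It is untested, it assumes a ClickHouse release with window function support (older servers may not have it), and day_total is an illustrative alias. Table and column names are taken from the question.

```sql
SELECT
    day,
    app_id,
    version,
    version_count,
    day_total,
    (version_count / day_total) * 100 AS percent
FROM
(
    -- Attach the per-day total to every version row without a self-join
    SELECT
        day,
        app_id,
        version,
        version_count,
        sum(version_count) OVER (PARTITION BY day, app_id) AS day_total
    FROM
    (
        -- Distinct users per day, app and version over the last 30 days
        SELECT
            day,
            app_id,
            visitParamExtractString(contexts, 'version') AS version,
            count(DISTINCT user_id) AS version_count
        FROM snowplow_enricher_good
        WHERE day >= subtractDays(today(), 30)
          AND app_id = 'APPID'
        GROUP BY
            day,
            app_id,
            version
    )
)
ORDER BY
    day DESC,
    app_id,
    version;
```

The table is scanned only once here, and the app_id filter is applied before aggregation rather than after the join.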
      

      【Comments】:
