【Question title】: Joining three different tables with date columns and group by date
【Posted】: 2016-08-24 17:54:44
【Question】:

I'm currently working with three different tables that look like this:

User:

user_id | createtime
--------+------------
001     | 2016-07-20
002     | 2016-08-15
003     | 2016-08-05
004     | 2016-08-23

Object:

objc_id | createtime | user_id
--------+------------+--------
001     | 2016-07-20 | 001
002     | 2016-07-15 | 001
003     | 2016-08-05 | 002
004     | 2016-08-23 | 001
005     | 2016-08-19 | 003
006     | 2016-08-21 | 004
007     | 2016-08-22 | 004

Event:

event_id | createtime | objc_id
---------+------------+--------
001      | 2016-08-25 | 001
002      | 2016-08-26 | 004
003      | 2016-08-28 | 002
004      | 2016-08-27 | 005
005      | 2016-08-30 | 003

I have already written select statements that tell me how many new users, objects, or events were created on a given date:

select createtime, count(user_id) as new_user
from "user"  -- quoted because USER is a reserved word in PostgreSQL
where createtime >= CURRENT_DATE - INTERVAL '30 days'
group by createtime;

select createtime, count(objc_id) as new_objc
from object
where createtime >= CURRENT_DATE - INTERVAL '30 days'
group by createtime;

select createtime, count(event_id) as new_event
from event
where createtime >= CURRENT_DATE - INTERVAL '30 days'
group by createtime;

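Now I want to write a single select statement that shows the number of new users, objects, and events together, grouped by their creation date.

But I can't work out how to do it. The result I'm after looks like this: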

createtime|new_user|new_objc|new_event
---------------------------------------
2016-07-15              1
2016-07-20    1         1
2016-07-22    
2016-07-24
2016-08-05    1         1
2016-08-15    1
2016-08-19              1
2016-08-21              1
2016-08-22              1
2016-08-23    1         1
2016-08-25                      1
2016-08-26                      1
2016-08-27                      1
2016-08-28                      1
2016-08-30                      1

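...and of course some dates may occur more than once within a table, so a count can be 1 or higher.

The goal is to see whether marketing increases the number of new users, purchased objects, and attended events over time (you need an object to attend an event).

Has anyone solved this kind of problem, or do I have to work with separate result tables?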

【Comments】:

    Tags: sql postgresql date join


    【Solution 1】:
    SELECT
        createtime
        ,COUNT(DISTINCT user_id) AS new_user
        ,COUNT(DISTINCT objc_id) AS new_objc
        ,COUNT(DISTINCT event_id) AS new_event
    FROM (
        -- One branch per table; ids a table does not have are NULL, and
        -- COUNT ignores NULLs. CAST(NULL AS INT) assumes integer id columns.
        SELECT createtime, user_id, CAST(NULL AS INT) AS objc_id, CAST(NULL AS INT) AS event_id
        FROM
           "user"  -- quoted because USER is a reserved word in PostgreSQL
        WHERE createtime >= CURRENT_DATE - INTERVAL '30 days'
        UNION ALL
        SELECT createtime, CAST(NULL AS INT) AS user_id, objc_id, CAST(NULL AS INT) AS event_id
        FROM
           object
        WHERE createtime >= CURRENT_DATE - INTERVAL '30 days'
        UNION ALL
        SELECT createtime, CAST(NULL AS INT) AS user_id, CAST(NULL AS INT) AS objc_id, event_id
        FROM
           event
        WHERE createtime >= CURRENT_DATE - INTERVAL '30 days'
    ) t
    GROUP BY
        createtime
    ORDER BY
        createtime;
    

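    At first glance your problem looks like it needs a JOIN, but what you actually need is UNION ALL and count(). Each branch fills the other tables' id columns with NULL, and count() skips NULLs, so every column counts only its own table's rows. I'm counting DISTINCT, which should be redundant because user_id, objc_id, and event_id appear to be unique primary keys.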
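To make the UNION ALL idea concrete, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same shape of query. SQLite is only a stand-in for PostgreSQL here, so the 30-day filter and the CAST calls are left out, the extra user_id/objc_id foreign-key columns are omitted, and `daily_counts` is a hypothetical helper name, not part of the answer.

```python
import sqlite3

# Sample rows copied from the question (id, createtime).
USERS = [("001", "2016-07-20"), ("002", "2016-08-15"),
         ("003", "2016-08-05"), ("004", "2016-08-23")]
OBJECTS = [("001", "2016-07-20"), ("002", "2016-07-15"), ("003", "2016-08-05"),
           ("004", "2016-08-23"), ("005", "2016-08-19"), ("006", "2016-08-21"),
           ("007", "2016-08-22")]
EVENTS = [("001", "2016-08-25"), ("002", "2016-08-26"), ("003", "2016-08-28"),
          ("004", "2016-08-27"), ("005", "2016-08-30")]

# Stack the three tables into one row set, then count per date.
# COUNT(col) skips NULLs, so each column only counts its own table's rows.
QUERY = """
SELECT createtime,
       COUNT(user_id)  AS new_user,
       COUNT(objc_id)  AS new_objc,
       COUNT(event_id) AS new_event
FROM (
    SELECT createtime, user_id, NULL AS objc_id, NULL AS event_id FROM "user"
    UNION ALL
    SELECT createtime, NULL, objc_id, NULL FROM object
    UNION ALL
    SELECT createtime, NULL, NULL, event_id FROM event
) t
GROUP BY createtime
ORDER BY createtime
"""


def daily_counts():
    """Return (createtime, new_user, new_objc, new_event) tuples, one per date."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "user" (user_id TEXT, createtime TEXT)')
    conn.execute("CREATE TABLE object (objc_id TEXT, createtime TEXT)")
    conn.execute("CREATE TABLE event (event_id TEXT, createtime TEXT)")
    conn.executemany('INSERT INTO "user" VALUES (?, ?)', USERS)
    conn.executemany("INSERT INTO object VALUES (?, ?)", OBJECTS)
    conn.executemany("INSERT INTO event VALUES (?, ?)", EVENTS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in daily_counts():
        print(row)
```

Dates with no rows in a given table come back as 0 rather than blank, and a date that appears in none of the three tables does not appear in the result at all.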

    【Comments】:

      【Solution 2】:

      If you don't want to use the query above, try the following:

      SELECT T1.createtime,
             -- DISTINCT is needed here: joining three tables on the same date
             -- multiplies rows, so a plain COUNT would over-count.
             COUNT(DISTINCT U.user_id)  AS new_user,
             COUNT(DISTINCT O.objc_id)  AS new_objc,
             COUNT(DISTINCT E.event_id) AS new_event
      FROM (
              -- UNION already removes duplicate dates
              SELECT createtime FROM "user"  -- quoted: USER is reserved in PostgreSQL
              WHERE createtime >= CURRENT_DATE - INTERVAL '30 days'
              UNION
              SELECT createtime FROM object
              WHERE createtime >= CURRENT_DATE - INTERVAL '30 days'
              UNION
              SELECT createtime FROM event
              WHERE createtime >= CURRENT_DATE - INTERVAL '30 days'
           ) AS T1
           LEFT JOIN "user" U ON U.createtime = T1.createtime
           LEFT JOIN object O ON O.createtime = T1.createtime
           LEFT JOIN event E ON E.createtime = T1.createtime
      GROUP BY T1.createtime
      ORDER BY T1.createtime;
      

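      【Comments】:

      • Because T1 is derived from the tables themselves, a full outer join would never return any right-outer rows, so a left join is more appropriate. You should also restrict createtime inside the union statements, which should perform better on large datasets.
      • @Matt That's right. As for restricting createtime on large datasets, I'm editing that in now.
      • This version works well too, but I changed T2.createtime to T1 and ordered by T1.createtime to get a better overview. Thanks a lot for your help!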
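One thing to watch with the join-on-date approach: joining several tables on createtime pairs every user row with every object row that shares the date, so a plain COUNT over-counts as soon as a date has more than one row in two tables, while COUNT(DISTINCT ...) does not. The sketch below shows this with hypothetical rows (two users and three objects on one day) in an in-memory SQLite database standing in for PostgreSQL; `fan_out_demo` is an illustrative name only.

```python
import sqlite3


def fan_out_demo():
    """Return (count_user, count_objc, distinct_user, distinct_objc) for one date."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "user" (user_id TEXT, createtime TEXT)')
    conn.execute("CREATE TABLE object (objc_id TEXT, createtime TEXT)")
    # Hypothetical data: two users and three objects created on the same day.
    conn.executemany('INSERT INTO "user" VALUES (?, ?)',
                     [("u1", "2016-08-23"), ("u2", "2016-08-23")])
    conn.executemany("INSERT INTO object VALUES (?, ?)",
                     [("o1", "2016-08-23"), ("o2", "2016-08-23"),
                      ("o3", "2016-08-23")])
    row = conn.execute("""
        SELECT COUNT(U.user_id),
               COUNT(O.objc_id),
               COUNT(DISTINCT U.user_id),
               COUNT(DISTINCT O.objc_id)
        FROM (SELECT createtime FROM "user"
              UNION
              SELECT createtime FROM object) T1
        LEFT JOIN "user" U ON U.createtime = T1.createtime
        LEFT JOIN object O ON O.createtime = T1.createtime
        GROUP BY T1.createtime
    """).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(fan_out_demo())
```

The two joins produce 2 x 3 = 6 rows for that date, so both plain counts report 6, while the DISTINCT counts report the real 2 users and 3 objects.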
      【Solution 3】:

      Use your three select statements as subqueries in a larger join.

      SELECT A.createtime, A.new_user, B.new_objc, C.new_event
      FROM (
          select createtime, count(user_id) as new_user
          from "user"  -- quoted because USER is a reserved word in PostgreSQL
          where createtime >= CURRENT_DATE - INTERVAL '30 days'
          group by createtime
      ) A INNER JOIN (
          select createtime, count(objc_id) as new_objc
          from object
          where createtime >= CURRENT_DATE - INTERVAL '30 days'
          group by createtime
      ) B ON A.createtime = B.createtime INNER JOIN (
          select createtime, count(event_id) as new_event
          from event
          where createtime >= CURRENT_DATE - INTERVAL '30 days'
          group by createtime
      ) C ON B.createtime = C.createtime;
      

      【Comments】:

      • What if a createtime from user is missing in object or event? The inner joins would drop those results.
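The concern in this comment can be checked against the question's own sample dates. The following sketch, again using in-memory SQLite as a stand-in for PostgreSQL and without the 30-day filter, inner-joins the per-day subqueries the way this solution does; `per_day` and `inner_join_rows` are hypothetical helper names.

```python
import sqlite3

# createtime values copied from the question's sample tables.
DATES = {
    "user": ["2016-07-20", "2016-08-15", "2016-08-05", "2016-08-23"],
    "object": ["2016-07-20", "2016-07-15", "2016-08-05", "2016-08-23",
               "2016-08-19", "2016-08-21", "2016-08-22"],
    "event": ["2016-08-25", "2016-08-26", "2016-08-28", "2016-08-27",
              "2016-08-30"],
}


def per_day(table):
    # SQL text of a subquery: one row per date with its row count.
    return f'(SELECT createtime, COUNT(*) AS n FROM "{table}" GROUP BY createtime)'


def inner_join_rows():
    """Return the dates surviving a two-way and a three-way inner join."""
    conn = sqlite3.connect(":memory:")
    for table, dates in DATES.items():
        conn.execute(f'CREATE TABLE "{table}" (createtime TEXT)')
        conn.executemany(f'INSERT INTO "{table}" VALUES (?)',
                         [(d,) for d in dates])
    two_way = conn.execute(
        f"SELECT A.createtime FROM {per_day('user')} A "
        f"INNER JOIN {per_day('object')} B ON A.createtime = B.createtime "
        "ORDER BY A.createtime"
    ).fetchall()
    three_way = conn.execute(
        f"SELECT A.createtime FROM {per_day('user')} A "
        f"INNER JOIN {per_day('object')} B ON A.createtime = B.createtime "
        f"INNER JOIN {per_day('event')} C ON B.createtime = C.createtime "
        "ORDER BY A.createtime"
    ).fetchall()
    conn.close()
    return two_way, three_way


if __name__ == "__main__":
    two_way, three_way = inner_join_rows()
    print("user + object:", two_way)
    print("user + object + event:", three_way)
```

With the sample data only three dates exist in both user and object, and no date exists in all three tables, so the three-way inner join returns no rows at all. The UNION-based answers keep every date because they start from the combined set of dates.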