【Question title】: Querying Customers who have rented a movie at least once every week or in the Weekend
【Posted】: 2021-09-24 22:33:39
【Question description】:

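I have a movie rental database. My tables are:

  1. Customer table:

    • Primary key: Customer_id (INT)
    • first_name (VARCHAR)
    • last_name (VARCHAR)
  2. Film table:

    • Primary key: Film_id (INT)
    • Title (VARCHAR)
    • Category (VARCHAR)
  3. Rental table:

    • Primary key: Rental_id (INT).

    The other columns in this table are:

    • Rental_date (DATETIME)
    • customer_id (INT)
    • film_id (INT)
    • payment_date (DATETIME)
    • Amount (DECIMAL(5,2))

The task is to create a master list of customers categorized as follows:

  • Regulars, who rent at least once every week
  • Weekenders, for whom most of their rentals come on Saturday and Sunday

I am not looking for code here, but for the logic to approach this problem. I have tried several ways but could not work out how to check a customer ID in every week. The code I tried is as follows: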

select
   r.customer_id
 , concat(c.first_name, ' ', c.last_name) as Customer_Name
 , dayname(r.rental_date) as day_of_rental
 , case
     when dayname(r.rental_date) in ('Monday','Tuesday','Wednesday','Thursday','Friday')
     then 'Regulars'
     else 'Weekenders'
   end as Customer_Category
from rental r
inner join customer c on r.customer_id = c.customer_id;

I know this is not correct, but I cannot get beyond this point.

【Question discussion】:

    Tags: mysql sql join


    【Solution 1】:

    This is a cohort analysis. Start with the minimal expression for each group:

    # Weekday regulars
    SELECT
       customer_id
    FROM rental
    WHERE WEEKDAY(rental_date) < 5; # 0-4 are weekdays (0 = Monday)
    
    # Weekend warriors
    SELECT
       customer_id
    FROM rental
    WHERE WEEKDAY(rental_date) > 4; # 5 and 6 are Saturday and Sunday
    

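    Now we know how to get the lists of customers who have rented on weekdays and on weekends, respectively. These queries only tell us that a customer rented on some day in the given set, so we need to make some judgments.

    Let's introduce a periodicity, which gives us something to measure a threshold against. We also need aggregation, so we group by rental.customer_id and count the distinct weeks in which each customer rented.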
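
    The weekly periodicity rests on two MySQL date functions, WEEKDAY() and YEARWEEK(). As a quick sanity check, the sketch below runs them over three literal dates (a Friday, Saturday and Sunday picked only for illustration). On a default installation YEARWEEK() with no mode argument starts weeks on Sunday, so the Saturday and Sunday of one weekend fall into different weeks; mode 1 starts weeks on Monday and keeps them together. Which mode suits the weekend cohort is a judgment call, not something the question specifies.

    ```sql
    # Illustrative dates only; they are not taken from the rental table
    SELECT
       d               AS sample_date
     , DAYNAME(d)      AS day_name
     , WEEKDAY(d)      AS weekday_index      # 0 = Monday ... 6 = Sunday
     , YEARWEEK(d)     AS week_sunday_start  # mode 0 by default
     , YEARWEEK(d, 1)  AS week_monday_start  # mode 1
    FROM (
      SELECT DATE('2021-09-24') AS d
      UNION ALL SELECT DATE('2021-09-25')
      UNION ALL SELECT DATE('2021-09-26')
    ) AS samples;
    ```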

    # Weekday regulars
    SELECT
       customer_id
     , COUNT(DISTINCT YEARWEEK(rental_date)) AS weeks_as_customer
    FROM rental
    WHERE WEEKDAY(rental_date) < 5
    GROUP BY customer_id;
    
    # Weekend warriors
    SELECT
       customer_id
     , COUNT(DISTINCT YEARWEEK(rental_date)) AS weeks_as_customer
    FROM rental
    WHERE WEEKDAY(rental_date) > 4
    GROUP BY customer_id;
    

    We also need a reference period to compare against:

    FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS weeks_in_period
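
    To make the arithmetic of that expression concrete, here is a small worked example with a fixed end date in place of NOW(). The end date '2019-01-20' is an arbitrary choice for illustration. FLOOR() keeps only completed weeks, so the trailing partial week is dropped, whereas a count of distinct YEARWEEK() values also counts partial weeks. The two numbers can therefore differ by one or two at the edges of the period.

    ```sql
    # 19 days between the two dates; 19 / 7 = 2.71..., floored to 2 full weeks
    SELECT
       DATEDIFF('2019-01-20', '2019-01-01')             AS days_in_period   # 19
     , FLOOR(DATEDIFF('2019-01-20', '2019-01-01') / 7)  AS weeks_in_period; # 2
    ```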
    

    Putting it together:

    # Weekday regulars
    SELECT
       customer_id
     , period.total_weeks
     , COUNT(DISTINCT YEARWEEK(rental_date)) AS weeks_as_customer
    FROM rental
    CROSS JOIN (
      SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
    ) AS period
    WHERE WEEKDAY(rental_date) < 5 # WHERE must come after the joins
    GROUP BY customer_id, period.total_weeks;
    
    # Weekend warriors
    SELECT
       customer_id
     , period.total_weeks
     , COUNT(DISTINCT YEARWEEK(rental_date)) AS weeks_as_customer
    FROM rental
    CROSS JOIN (
      SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
    ) AS period
    WHERE WEEKDAY(rental_date) > 4
    GROUP BY customer_id, period.total_weeks;
    

    So now we can introduce the threshold filter for each cohort.

    # Weekday regulars
    SELECT
       customer_id
     , period.total_weeks
     , COUNT(DISTINCT YEARWEEK(rental_date)) AS weeks_as_customer
    FROM rental
    CROSS JOIN (
      SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
    ) AS period
    WHERE WEEKDAY(rental_date) < 5 # WHERE must come after the joins
    GROUP BY customer_id, period.total_weeks
    HAVING total_weeks = weeks_as_customer;
    
    # Weekend warriors
    SELECT
       customer_id
     , period.total_weeks
     , COUNT(DISTINCT YEARWEEK(rental_date)) AS weeks_as_customer
    FROM rental
    CROSS JOIN (
      SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
    ) AS period
    WHERE WEEKDAY(rental_date) > 4
    GROUP BY customer_id, period.total_weeks
    HAVING total_weeks = weeks_as_customer;
    

    Then we can use these as subqueries for our master list.

    SELECT
       customer.customer_id
     , CONCAT(customer.first_name, ' ', customer.last_name) AS customer_name
     , CASE
         WHEN regulars.customer_id IS NOT NULL THEN 'regular'
         WHEN weekenders.customer_id IS NOT NULL THEN 'weekender'
         ELSE NULL
       END AS category
    FROM customer
    LEFT JOIN (
      SELECT
         rental.customer_id
       , period.total_weeks
       , COUNT(DISTINCT YEARWEEK(rental.rental_date)) AS weeks_as_customer
      FROM rental
      # A derived table cannot see the outer FROM clause, so the period is joined here
      CROSS JOIN (
        SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
      ) AS period
      WHERE WEEKDAY(rental.rental_date) < 5
      GROUP BY rental.customer_id, period.total_weeks
      HAVING total_weeks = weeks_as_customer
    ) AS regulars ON customer.customer_id = regulars.customer_id
    LEFT JOIN (
      SELECT
         rental.customer_id
       , period.total_weeks
       , COUNT(DISTINCT YEARWEEK(rental.rental_date)) AS weeks_as_customer
      FROM rental
      CROSS JOIN (
        SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
      ) AS period
      WHERE WEEKDAY(rental.rental_date) > 4
      GROUP BY rental.customer_id, period.total_weeks
      HAVING total_weeks = weeks_as_customer
    ) AS weekenders ON customer.customer_id = weekenders.customer_id
    HAVING category IS NOT NULL;
    

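    There is some ambiguity about whether you want to exclude cross-cohort customers (for example, a regular who missed a week because their only rental that week fell on the weekend). You will need to settle such inclusivity/exclusivity questions.

    That means going back to the cohort-specific queries and adjusting them to reflect the refined definition, and/or adding further subqueries that cut across cohorts and can be combined in other ways to build a better or more complete top-level view.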
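
    As one possible adjustment, here is a sketch of a "regular" cohort that takes the question literally: at least one rental in every week, on any day. The Monday-based week mode, the filter on the period start, and the use of >= instead of = are my assumptions, chosen so that weekend rentals count towards the week and partial weeks at the edges do not disqualify anyone. The alias weeks_with_rental is made up for this example.

    ```sql
    # Hypothetical variant: rentals on any day of the week count
    SELECT
       rental.customer_id
     , period.total_weeks
     , COUNT(DISTINCT YEARWEEK(rental.rental_date, 1)) AS weeks_with_rental
    FROM rental
    CROSS JOIN (
      SELECT FLOOR(DATEDIFF(DATE(NOW()), '2019-01-01') / 7) AS total_weeks
    ) AS period
    WHERE rental.rental_date >= '2019-01-01' # ignore rentals before the period starts
    GROUP BY rental.customer_id, period.total_weeks
    HAVING weeks_with_rental >= total_weeks;
    ```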

    However, given this caveat, I think what I have provided reasonably matches what you asked for.

    【Discussion】:

    • Thanks @jared!! Let me try it.
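    【Solution 2】:

    The problem with the current approach is that each rental by each customer is treated separately. I assume a customer may rent more than once, so we need to aggregate all of a customer's rentals to compute the category.

    So, for the master list: your logic says that Weekenders are customers for whom "most of their rentals come on Saturday and Sunday", while Regulars are customers who rent at least once every week.

    Two questions:

    1. For Weekenders, what is the logic behind "most"?
    2. Are the two categories mutually exclusive? From the statement it seems not, since a customer may rent only on Saturday or Sunday.

    I tried a solution in the Oracle SQL dialect (it works, but performance could be improved) with the following logic: if a customer has more rentals on weekdays than on weekends, the customer is a Regular; otherwise a Weekender. The query can be modified based on the answers to the questions above.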

    select
        c.customer_id,
        c.first_name || ' ' || c.last_name as Customer_Name,
        case
            when r.reg_count > r.we_count then 'Regulars'
            else 'Weekenders'
        end as Customer_Category
    from customer c
    inner join (
        select
            customer_id,
            count(case when trim(to_char(rental_date, 'DAY')) in ('MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY') then 1 end) as reg_count,
            count(case when trim(to_char(rental_date, 'DAY')) in ('SATURDAY','SUNDAY') then 1 end) as we_count
        from rental
        group by customer_id
    ) r on r.customer_id = c.customer_id;
    

    Updated query, based on the clarification given in the comments:

    select
            c.customer_id,
            c.first_name || ' ' || c.last_name as Customer_Name,
            case when rg.cnt>0 then 1 else 0 end as REGULAR,
            case when we.cnt>0 then 1 else 0 end as WEEKENDER
    from customer c
    left outer join
    (select customer_id, count(rental_id) cnt from rental where trim(to_char(rental_date, 'DAY')) in ('MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY') group by customer_id) rg on rg.customer_id=c.customer_id
    left outer join
    (select customer_id, count(rental_id) cnt from rental where trim(to_char(rental_date, 'DAY')) in ('SATURDAY','SUNDAY') group by customer_id) we on we.customer_id=c.customer_id;
    

    Test data:

    insert into customer values (1, 'nonsensical', 'coder');
    insert into rental values(1, 1, sysdate, 1, sysdate, 500);
    
    insert into customer values (2, 'foo', 'bar');
    insert into rental values(2, 2, sysdate-5, 2, sysdate-5, 800); -- the current day is Friday, so sysdate-5 is a Sunday
    
    

    Query output (first query):

    CUSTOMER_ID    CUSTOMER_NAME          CUSTOMER_CATEGORY
    1              nonsensical coder      Regulars
    2              foo bar                Weekenders
    

    Query output (second query):

    CUSTOMER_ID   CUSTOMER_NAME       REGULAR   WEEKENDER
    1             nonsensical coder    0         1
    2             foo bar              1         0
    

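    【Discussion】:

    • Thanks for replying @ashutosh. To answer your questions: 1. "Most" here most likely means Saturday or Sunday or both. 2. The two categories are not mutually exclusive; a Regular can spill over into the weekend. My problem is how to check each customer ID in every week to confirm that they qualify as a Regular. The same goes for weekends: I need to check whether each customer ID appears on every weekend.
    • Updated query and output provided.
    • Since the categories are not mutually exclusive, we can have both of them associated with one customer.
    【Solution 3】:

    First, you don't need the customer table. You can add it in after the classification.

    To solve the problem, you need the following information:

    • Total number of rentals.
    • Total number of weeks with rentals.
    • Total number of weeks overall, or the number of weeks with no rentals.
    • Total number of rentals on weekends.

    You can get this information using aggregation: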

    select r.customer_id,
           count(*) as num_rentals,
           count(distinct yearweek(rental_date)) as num_weeks,
           (to_days(max(rental_date)) - to_days(min(rental_date)) ) / 7 as num_weeks_overall,
           sum(dayname(r.rental_date) in ('Saturday', 'Sunday')) as weekend_rentals
    from rental r
    group by r.customer_id;
    

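    Now, your question is a bit ambiguous about what to do with someone who rents only on weekends but does so every week. So I am just making arbitrary assumptions for the final classification: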

    select r.customer_id,
           (case when num_weeks > 10 and
                      num_weeks >= num_weeks_overall * 0.9
                 then 'Regular'     -- at least 10 weeks and rents in 90% of the weeks
                 when weekend_rentals >= 0.8 * num_rentals
                 then 'Weekender'   -- 80% of rentals are on the weekend
                 else 'Hoi Polloi'
             end) as category
    from (select r.customer_id,
                 count(*) as num_rentals,
                 count(distinct yearweek(rental_date)) as num_weeks,
                 (to_days(max(rental_date)) - to_days(min(rental_date)) ) / 7 as num_weeks_overall,
                 sum(dayname(r.rental_date) in ('Saturday', 'Sunday')) as weekend_rentals
          from rental r
          group by r.customer_id
         ) r;
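
    To add the customer table back in after the classification, one option is to wrap the query above in a derived table and left join it to customer. This is only a sketch: the alias cat and the 'No rentals' label for customers without any rental are my own placeholders, and the thresholds are the same arbitrary ones used above.

    ```sql
    select c.customer_id,
           concat(c.first_name, ' ', c.last_name) as customer_name,
           coalesce(cat.category, 'No rentals') as category  -- placeholder label
    from customer c
    left join
         (select r.customer_id,
                 (case when num_weeks > 10 and
                            num_weeks >= num_weeks_overall * 0.9
                       then 'Regular'
                       when weekend_rentals >= 0.8 * num_rentals
                       then 'Weekender'
                       else 'Hoi Polloi'
                   end) as category
          from (select r.customer_id,
                       count(*) as num_rentals,
                       count(distinct yearweek(rental_date)) as num_weeks,
                       (to_days(max(rental_date)) - to_days(min(rental_date)) ) / 7 as num_weeks_overall,
                       sum(dayname(r.rental_date) in ('Saturday', 'Sunday')) as weekend_rentals
                from rental r
                group by r.customer_id
               ) r
         ) cat
         on cat.customer_id = c.customer_id;
    ```

    A left join keeps customers who have never rented; switch to an inner join if the master list should contain only customers with at least one rental.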
    

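    【Discussion】:

    • Thanks a lot, Gordon. I was just wondering whether this code checks that each customer ID has at least one transaction every week in order to qualify as a Regular?
    • @nonsensical_coder . . . It uses the rules specified in the comments.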