【Question title】:SQL: why does dateA - dateB <= '3 years' give a different result than dateA <= dateB + '3 years'
【Posted】:2022-01-25 10:50:33
【Question】:

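I am working through a MODE.com SQL practice exercise on date formats.

The exercise is: write a query that counts the number of companies acquired within 3 years, 5 years, and 10 years of being founded (in 3 separate columns). Include a column for the total number of companies acquired. Group by category and limit to only rows with a founding date.

It uses two tables:

tutorial.crunchbase_companies_clean_date, which contains information on all companies, such as the company name and founding year.

tutorial.crunchbase_acquisitions_clean_date, which contains information on all acquired companies, such as the acquired company's name and acquisition date.

My code is: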

SELECT companies.category_code,
       COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '3 years' THEN 1 ELSE NULL END) AS less_than_3_years,
       COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '5 years' THEN 1 ELSE NULL END) AS between_3_to_5_years,
       COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '10 years' THEN 1 ELSE NULL END) AS within_10_years,
       COUNT(1) AS total
FROM tutorial.crunchbase_companies_clean_date companies
JOIN tutorial.crunchbase_acquisitions_clean_date acq 
ON companies.permalink = acq.company_permalink
WHERE companies.founded_at_clean IS NOT NULL
GROUP BY 1
ORDER BY total DESC

The result is: My result

The answer query is:

SELECT companies.category_code,
       COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '3 years'
                       THEN 1 ELSE NULL END) AS acquired_3_yrs,
       COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '5 years'
                       THEN 1 ELSE NULL END) AS acquired_5_yrs,
       COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '10 years'
                       THEN 1 ELSE NULL END) AS acquired_10_yrs,
       COUNT(1) AS total
  FROM tutorial.crunchbase_companies_clean_date companies
  JOIN tutorial.crunchbase_acquisitions_clean_date acquisitions
    ON acquisitions.company_permalink = companies.permalink
 WHERE founded_at_clean IS NOT NULL
 GROUP BY 1
 ORDER BY 5 DESC

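The result is: The answer result

You can see in the screenshots that the results are very similar, but some of the numbers differ.

The only difference between my query and the answer is in the COUNT expressions, but I don't really see what the difference is, for example:

acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= '3 years'

acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + INTERVAL '3 years'

I tried adding INTERVAL to my SELECT statement: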

SELECT companies.category_code,
       COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= INTERVAL '3 years' THEN 1 ELSE NULL END) AS less_than_3_years,
       COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= INTERVAL '5 years' THEN 1 ELSE NULL END) AS between_3_to_5_years,
       COUNT(CASE WHEN acq.acquired_at_cleaned - companies.founded_at_clean:: timestamp <= INTERVAL '10 years' THEN 1 ELSE NULL END) AS within_10_years,
       COUNT(1) AS total

and removing INTERVAL from the answer query:

SELECT companies.category_code,
       COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + '3 years'
                       THEN 1 ELSE NULL END) AS acquired_3_yrs,
       COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + '5 years'
                       THEN 1 ELSE NULL END) AS acquired_5_yrs,
       COUNT(CASE WHEN acquisitions.acquired_at_cleaned <= companies.founded_at_clean::timestamp + '10 years'
                       THEN 1 ELSE NULL END) AS acquired_10_yrs,
       COUNT(1) AS total

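But the results are the same.

I also looked at the difference between the acquired date and the founded date, to see whether that value can be compared with an INTERVAL. The result is expressed in days, which looked promising to me. The result

I have tried to provide all the information for your consideration. I hope someone can help. Thanks in advance!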

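【Comments】:

  • Please don't use links in your question. Update it with all the relevant information, preferably as editable text.
  • It may be that your database treats INTERVAL '3 years' as an interval but does not treat a bare '3 years' that way, which could lead to some string comparison. Think about the types of the expressions on each side of the <= operator.
  • @NickW Thanks for the suggestion. I have edited my question. I hope it is clearer now.
  • @JonArmstrong I also tried using INTERVAL in my query, and removing INTERVAL from the answer query, but the results are the same. I also specify timestamp in the comparison, so I don't think it is a string comparison problem.
  • A test case would be helpful. It looks like the only difference between the two queries is the "INTERVAL" operation versus a possible string comparison. Could you include (in the question) the adjustments to the queries that use INTERVAL?

Tags: sql postgresql datetime case intervals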


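【Solution 1】:

My suggestion is to add/subtract an INTERVAL to/from one date/time and then compare the result with the other date/time. Don't subtract the two date/times and then compare the difference with a string literal. Regardless of the actual number of days between someDateTime and someDateTime +/- '3 YEARS', your database appears to treat '3 YEARS' as a fixed number of days when it compares intervals. In PostgreSQL, interval comparison counts each month as 30 days, so '3 YEARS' compares as 3 * 360 days rather than 3 * 365. The actual number of days in a year is 365 or 366, depending on whether a leap year is crossed.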
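To make the fixed-length behaviour concrete, here is a minimal sketch, assuming PostgreSQL (the question's tag). When two intervals are compared, PostgreSQL counts every month as 30 days and every day as 24 hours, so a year compares as 360 days. Subtracting two timestamps, on the other hand, yields an interval in actual days. The dates are illustrative only and are not taken from the Crunchbase tables.

```sql
-- Illustrative literals only; no tables required.
SELECT TIMESTAMP '2021-12-25' - TIMESTAMP '2018-12-25'   AS elapsed,              -- 1096 days
       INTERVAL '3 years' = INTERVAL '1080 days'         AS three_years_is_1080,  -- true
       INTERVAL '1096 days' <= INTERVAL '3 years'        AS subtract_form,        -- false
       TIMESTAMP '2021-12-25'
         <= TIMESTAMP '2018-12-25' + INTERVAL '3 years'  AS add_form;             -- true
```

A real span of three years is 1095 or 1096 days, so the subtraction form leaves out acquisitions that fall in roughly the last two weeks of the three-year window. That could explain why some of the counts differ.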

Here is a simple example of comparing against specific intervals, which also requires us to know whether, and how many, leap years are crossed.

Fiddle

Test case:

WITH dates AS (
        SELECT '2021-01-01'::date AS xdate
     )
SELECT xdate - (xdate - INTERVAL '1' YEAR) AS diff
     , xdate - (xdate - INTERVAL '1' YEAR) = '1 YEAR'   AS b1
     , xdate - (xdate - INTERVAL '1' YEAR) = '365 DAYS' AS b2
     , xdate - (xdate - INTERVAL '1' YEAR) = '366 DAYS' AS b3
  FROM dates
;

-- AND --

WITH dates AS (
        SELECT '2021-01-01'::date AS xdate
     )
SELECT xdate - (xdate - INTERVAL '1' YEAR) AS diff
     , xdate - (xdate - INTERVAL '1' YEAR) = INTERVAL '1' YEAR   AS b1
     , xdate - (xdate - INTERVAL '1' YEAR) = INTERVAL '365 DAYS' AS b2
     , xdate - (xdate - INTERVAL '1' YEAR) = INTERVAL '366 DAYS' AS b3
  FROM dates
;

Result:

diff     | b1 | b2 | b3
366 days | f  | f  | t

Fiddle

WITH dates AS (
        SELECT '2021-01-01'::date AS xdate
     )
   , diff AS (
        SELECT xdate - (xdate - INTERVAL '1' YEAR) AS diff
          FROM dates
     )
SELECT diff
     , CASE WHEN diff = (366*24*60*60 * INTERVAL '1' SECOND)
            THEN 1
        END AS compare1
     , 366*24*60*60 AS seconds
     , CASE WHEN diff = (366*24*60*60 * INTERVAL '1' SECOND)
            THEN 1
        END AS compare2
     , CASE WHEN diff = '31622400 SECONDS'
            THEN 1
        END AS compare3
  FROM diff
;

Result:

diff     | compare1 | seconds  | compare2 | compare3
366 days | 1        | 31622400 | 1        | 1

Original answer:

The fiddle for PostgreSQL

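The behaviour shown here (below) is similar to what was posted.

The problem is that the resulting value is not necessarily what you think it is.

Here is a test case in PostgreSQL that may represent your problem.

This may be related to leap years, where the number of days in a year is not constant.

So comparing dates is probably safer than assuming some fixed number of days, which may be the assumption that <= '3 years' makes.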
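As a rough illustration of how this shows up in the exercise's counts, the sketch below runs both predicates side by side in PostgreSQL. The three inline rows are hypothetical stand-ins for the joined Crunchbase tables. founded_at_clean is given as text only so that the ::timestamp cast from the original queries still applies; the real column type is an assumption.

```sql
-- Hypothetical inline rows standing in for the joined Crunchbase tables.
WITH deals (category_code, founded_at_clean, acquired_at_cleaned) AS (
    VALUES ('web',      '2018-12-25', TIMESTAMP '2021-12-25'),  -- exactly 3 calendar years later
           ('web',      '2019-06-01', TIMESTAMP '2020-06-01'),  -- 1 year later
           ('software', '2010-01-01', TIMESTAMP '2021-12-25')   -- almost 12 years later
)
SELECT category_code,
       -- Add the interval to one date, then compare dates (the answer's form).
       COUNT(CASE WHEN acquired_at_cleaned <= founded_at_clean::timestamp + INTERVAL '3 years'
                  THEN 1 END) AS acquired_3_yrs_add,
       -- Subtract the dates, then compare intervals (the question's form).
       COUNT(CASE WHEN acquired_at_cleaned - founded_at_clean::timestamp <= INTERVAL '3 years'
                  THEN 1 END) AS acquired_3_yrs_subtract,
       COUNT(*) AS total
  FROM deals
 GROUP BY category_code
 ORDER BY total DESC;
```

With these three rows, the add form counts 2 for web and the subtract form counts 1, because the first row's difference of 1096 days is more than the 1080 days that '3 years' compares as.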

Test SQL:

WITH test (acquired_at_cleaned, founded_at_clean, n) AS (
        SELECT current_date, current_date - INTERVAL '4' YEAR, 4 UNION
        SELECT current_date, current_date - INTERVAL '3' YEAR, 3 UNION
        SELECT current_date, current_date - INTERVAL '2' YEAR, 2 UNION
        SELECT current_date, current_date - INTERVAL '1' YEAR, 1
     )
   , cases AS (
        SELECT test.*
             , CASE WHEN acquired_at_cleaned <= founded_at_clean::timestamp + INTERVAL '3' year
                    THEN 1 ELSE NULL
                END AS acquired_3_yrs_case1
             , CASE WHEN acquired_at_cleaned - founded_at_clean::timestamp <= '3 year'
                    THEN 1 ELSE NULL
                END AS acquired_3_yrs_case2
             , acquired_at_cleaned - founded_at_clean::timestamp AS x1
             , acquired_at_cleaned - (n * INTERVAL '1' YEAR) AS x2
          FROM test
     )
SELECT acquired_at_cleaned AS acquired
     , founded_at_clean    AS founded
     , n
     , acquired_3_yrs_case1 AS case1
     , acquired_3_yrs_case2 AS case2
     , x1, x2
  FROM cases
 ORDER BY founded_at_clean
;

Result:

acquired   | founded             | n | case1 | case2 | x1        | x2
2021-12-25 | 2017-12-25 00:00:00 | 4 | null  | null  | 1461 days | 2017-12-26 00:00:00
2021-12-25 | 2018-12-25 00:00:00 | 3 | 1     | null  | 1096 days | 2018-12-26 00:00:00
2021-12-25 | 2019-12-25 00:00:00 | 2 | 1     | 1     | 731 days  | 2019-12-26 00:00:00
2021-12-25 | 2020-12-25 00:00:00 | 1 | 1     | 1     | 365 days  | 2020-12-26 00:00:00

Interesting results.
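The test above depends on current_date, so its output changes with the day it is run. A reduced variant with the acquisition date pinned to 2021-12-25 (an assumption made only for repeatability) should reproduce the case1 and case2 columns, here as booleans instead of 1/null. The alias g(n) and the column names are illustrative.

```sql
-- Same two predicates as the test above, with a fixed acquisition date
-- instead of current_date so the output is repeatable.
WITH test AS (
    SELECT TIMESTAMP '2021-12-25'                          AS acquired,
           TIMESTAMP '2021-12-25' - n * INTERVAL '1 year'  AS founded,
           n
      FROM generate_series(1, 4) AS g(n)
)
SELECT n,
       acquired - founded                        AS elapsed,         -- actual days between the two
       acquired <= founded + INTERVAL '3 years'  AS case1_add,       -- calendar-aware comparison
       acquired - founded <= INTERVAL '3 years'  AS case2_subtract   -- fixed-length comparison
  FROM test
 ORDER BY n DESC;
```

Only the n = 3 row differs between the two predicates: 1096 days is within three calendar years, but it is more than the 1080 days that '3 years' compares as.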

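【Comments】:

  • Sorry, I am very new to SQL and Stackoverflow. Your test SQL is too complex for me to follow right now, but I get your point. Taking leap years into account, which approach do you think is the right way to handle the query? Thank you very much for your advice on the question and the answer. Merry Christmas!
  • @ericzheng0404 I will adjust the answer in line with my suggestion, which is to add/subtract an INTERVAL to/from one date/time and then compare with the other date/time.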