【Question title】: Are two inner joins the best way to optimize this query?
【Posted】: 2017-08-20 14:35:32
【Question】:

I was just given a challenge at school to optimize this query. It is a theoretical question.

The challenge:

SELECT TO_CHAR(CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableA."date"),'YYYY-MM') AS "date_month",
COUNT(DISTINCT CASE WHEN (tableB."date" IS NOT NULL) THEN tableB._id ELSE NULL END) AS "tableB.countB",
COUNT(DISTINCT CASE WHEN (tableC."date" IS NOT NULL) THEN tableC._id ELSE NULL END) AS "tableC.countC"
FROM tableA AS tableA
LEFT JOIN tableB AS tableB ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableB."date"))) = (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableA."date")))
LEFT JOIN tableC AS tableC ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableC."date"))) = (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',tableA."date")))
WHERE tableA."date" >= CONVERT_TIMEZONE ('America/Los_Angeles','UTC',DATEADD (month,-17,DATE_TRUNC('month',DATE_TRUNC('day',CONVERT_TIMEZONE ('UTC','America/Los_Angeles',GETDATE ())))))
GROUP BY 1
ORDER BY 1 DESC LIMIT 500;

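To optimize it, I simply removed the CASE expressions from the query above and used DECODE instead. I think this should also make the query more efficient: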

SELECT    To_char(Convert_timezone ('UTC','America/Los_Angeles',tablea."date"),'YYYY-MM') AS "date_month", 
          Count(DISTINCT 
           decode(tableb."date", not null,tableb._id,null)
           AS "tableB.countB",
          Count(DISTINCT 
           decode(tablec."date", not null,tablec._id ,null)
            AS "tableC.countC"  
FROM      tablea AS tablea 
LEFT JOIN tableb AS tableb 
ON        ( 
                    Date (Convert_timezone ('UTC','America/Los_Angeles',tableb."date"))) = (Date (Convert_timezone ('UTC','America/Los_Angeles',tablea."date")))
LEFT JOIN tablec AS tablec 
ON        ( 
                    Date (Convert_timezone ('UTC','America/Los_Angeles',tablec."date"))) = (Date (Convert_timezone ('UTC','America/Los_Angeles',tablea."date")))
WHERE     tablea."date" >= convert_timezone ('America/Los_Angeles','UTC',Dateadd (month,-17,Date_trunc('month',Date_trunc('day',Convert_timezone ('UTC','America/Los_Angeles',Getdate ())))) group BY 1 ORDER BY 1 DESC limit 500;

What would you suggest? If we remove one of the left joins and merge the statements, can the query be optimized further?

【Comments】:

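  • Please post the DDL and DML for the tables involved, and do not post images. Having this information helps others reproduce your problem quickly and solve it better. Here is a sample that may help. -- sample data: create table t1 ( id int ) insert into t1 values (1), (2), (1) My current query / what I tried: select id,count(*) as cnt from t1 group by id My current result: id cnt 1 2 2 1 My expected result: id cnt 1 2 2 1 1 2
  • One quick optimization is to not give a table an alias that is exactly the same as the table name. FROM tableA AS tableA is exactly the same as FROM tableA. This simple change will not make the code any faster, but it will make your brain (and certainly mine!) work faster when reading the code.
  • Please tag your post with the correct database product name. You tagged it Oracle, which is certainly incorrect (I will remove the oracle tag shortly). In Oracle you cannot GROUP BY 1 (that is, by the first column), there is no convert_timezone, and so on. I do not recognize your product, but it is definitely not Oracle Database.
  • ... or, use shorter aliases that actually make the SQL shorter and clearer. This helps readability too. Also, format it into separate clauses (SELECT, FROM, JOIN, WHERE, ORDER BY, GROUP BY, HAVING, etc.) so that they are easy to separate and distinguish. And use indentation that is consistent with and supports the logical structure, along with unobtrusive line breaks, to separate these sections from one another.
  • Tag the question with the database you are actually using.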

Tags: sql oracle join optimization


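【Solution 1】:

... or, use shorter aliases that actually make the SQL shorter and clearer. This helps readability too. Also, format it into separate clauses (SELECT, FROM, JOIN, WHERE, ORDER BY, GROUP BY, HAVING, etc.) so that they are easy to separate and distinguish. And use indentation that is consistent with and supports the logical structure, along with unobtrusive line breaks, to separate these sections from one another.
As an example, here is your first SQL query reformatted, but logically identical to what you posted: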

SELECT TO_CHAR(CONVERT_TIMEZONE ('UTC','America/Los_Angeles', a."date"),'YYYY-MM') date_month,
   COUNT(DISTINCT CASE WHEN (b."date" IS NOT NULL) THEN b._id ELSE NULL END) countB,
   COUNT(DISTINCT CASE WHEN (c."date" IS NOT NULL) THEN c._id ELSE NULL END) countC
FROM tableA a   
  LEFT JOIN tableB b 
     ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',b."date"))) = 
        (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',a."date")))
  LEFT JOIN tableC c 
     ON (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',c."date"))) = 
        (DATE (CONVERT_TIMEZONE ('UTC','America/Los_Angeles',a."date")))
WHERE a."date" >= CONVERT_TIMEZONE ('America/Los_Angeles', 'UTC', 
       DATEADD (month,-17,DATE_TRUNC('month', 
       DATE_TRUNC('day',CONVERT_TIMEZONE ('UTC','America/Los_Angeles', 
                        GETDATE ())))))
GROUP BY 1
ORDER BY 1 DESC LIMIT 500;

And here is an optimized version:

SELECT DatePart(month, DATEADD(hour, -8, a."date")) date_month, -- month number after a fixed 8-hour shift from UTC
  sum(case when b."date" is not null then 1 else 0 end) countb,
  sum(case when c."date" is not null then 1 else 0 end) countc
FROM tableA a    
  LEFT JOIN tableB b 
     ON b."date" = a."date" -- Timezone offsets are not necessary, 
  LEFT JOIN tableC c  
     ON c."date" = a."date" -- both in same timezone 
WHERE a."date" >= DateAdd(hour, 8,
            DATEADD (month,-17,DATE_TRUNC('month', 
             GETDATE ())))
GROUP BY 1
ORDER BY 1 DESC LIMIT 500;
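
The optimized version above swaps the named time zone for a fixed 8-hour shift. A small sketch of what that trade-off can mean, written in Python rather than SQL so it can be run anywhere: America/Los_Angeles is 8 hours behind UTC only in winter, so a fixed shift can place a summer timestamp near a month boundary in the wrong month. The helper names and the two sample timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

LA = ZoneInfo("America/Los_Angeles")

def local_month_tz(utc_dt):
    """Month bucket using the real time-zone rules, including DST."""
    return utc_dt.astimezone(LA).strftime("%Y-%m")

def local_month_fixed(utc_dt, hours=8):
    """Month bucket using a fixed shift of `hours` from UTC."""
    return (utc_dt - timedelta(hours=hours)).strftime("%Y-%m")

# Hypothetical sample timestamps, both stored in UTC.
summer = datetime(2017, 8, 1, 7, 30, tzinfo=timezone.utc)   # 00:30 PDT on Aug 1
winter = datetime(2017, 12, 1, 7, 30, tzinfo=timezone.utc)  # 23:30 PST on Nov 30

for ts in (summer, winter):
    print(ts.isoformat(), local_month_tz(ts), local_month_fixed(ts))
```

The two bucketing rules agree for the winter timestamp and disagree for the summer one, so whether the fixed shift is acceptable depends on how much the month boundaries matter for the report.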

【Comments】:

    【Solution 2】:

    Presumably the _id columns are unique. So:

    SELECT TO_CHAR(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', a."date"), 'YYYY-MM') AS date_month,
           SUM(CASE WHEN b."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableB_countB,
           SUM(CASE WHEN c."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableC_countC
    FROM tableA a LEFT JOIN
         tableB b
         ON DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', b."date")) = DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', a."date")) LEFT JOIN
         tableC c
         ON DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', c."date")) = DATE(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', a."date"))
    WHERE a."date" >= CONVERT_TIMEZONE('America/Los_Angeles', 'UTC',
                                       DATEADD(month, -17, DATE_TRUNC('month', DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', GETDATE())))))
    GROUP BY 1
    ORDER BY 1 DESC
    LIMIT 500;
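
Replacing COUNT(DISTINCT ...) with SUM(CASE ...) as above relies on more than unique _id values: each tableA row also has to match at most one row in tableB and in tableC, because two joins on the same key multiply the matches. A minimal sketch of that effect, using Python's built-in SQLite as a stand-in engine; the column name d and the toy rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (d TEXT);
    CREATE TABLE tableB (_id INTEGER PRIMARY KEY, d TEXT);
    CREATE TABLE tableC (_id INTEGER PRIMARY KEY, d TEXT);
    -- One day in tableA, two matching rows in tableB, three in tableC.
    INSERT INTO tableA VALUES ('2017-08-01');
    INSERT INTO tableB (d) VALUES ('2017-08-01'), ('2017-08-01');
    INSERT INTO tableC (d) VALUES ('2017-08-01'), ('2017-08-01'), ('2017-08-01');
""")

query = """
    SELECT COUNT(DISTINCT b._id)                            AS distinct_b,
           COUNT(DISTINCT c._id)                            AS distinct_c,
           SUM(CASE WHEN b.d IS NOT NULL THEN 1 ELSE 0 END) AS sum_b,
           SUM(CASE WHEN c.d IS NOT NULL THEN 1 ELSE 0 END) AS sum_c
    FROM tableA a
    LEFT JOIN tableB b ON b.d = a.d
    LEFT JOIN tableC c ON c.d = a.d
"""
# The joins yield 1 * 2 * 3 = 6 rows, so the SUMs count 6 while DISTINCT counts 2 and 3.
distinct_b, distinct_c, sum_b, sum_c = conn.execute(query).fetchone()
print(distinct_b, distinct_c, sum_b, sum_c)  # 2 3 6 6
```

If several rows per day are possible in either joined table, the two forms return different numbers, so it is worth checking the data before treating them as interchangeable.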
    

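    Next, the time zone conversions in the ON clauses seem unnecessary, because both sides are converted from the same time zone. If the values have no time component (as a name like date suggests), then DATE() is not needed either: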

    SELECT TO_CHAR(CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', a."date"), 'YYYY-MM') AS date_month,
           SUM(CASE WHEN b."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableB_countB,
           SUM(CASE WHEN c."date" IS NOT NULL THEN 1 ELSE 0 END) AS tableC_countC
    FROM tableA a LEFT JOIN
         tableB b
         ON b."date" = a."date" LEFT JOIN
         tableC c
         ON c."date" = a."date"
    WHERE a."date" >= CONVERT_TIMEZONE('America/Los_Angeles', 'UTC',
                                       DATEADD(month, -17, DATE_TRUNC('month', DATE_TRUNC('day', CONVERT_TIMEZONE('UTC', 'America/Los_Angeles', GETDATE())))))
    GROUP BY 1
    ORDER BY 1 DESC
    LIMIT 500;
    

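    The WHERE clause is fine. Because it compares the bare column a."date" against a computed constant, it can make use of an index on a(date).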
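
One way to see why comparing the bare column matters is to ask a query planner. The sketch below uses Python's built-in SQLite purely as a convenient stand-in (the asker's database has its own planner and may behave differently); the column name d and the index name idx_a_d are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (_id INTEGER PRIMARY KEY, d TEXT);
    CREATE INDEX idx_a_d ON tableA (d);
""")

def plan(sql):
    """Return the detail strings from SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]

# Bare column on the left: the planner can seek into the index.
bare = plan("SELECT _id FROM tableA WHERE d >= '2016-03-01'")
# Column wrapped in a function: the planner has to visit every row.
wrapped = plan("SELECT _id FROM tableA WHERE date(d) >= '2016-03-01'")

print(bare)
print(wrapped)
```

In SQLite the first plan is reported as a SEARCH using idx_a_d and the second as a SCAN. The same reasoning is why the WHERE clause keeps all the date arithmetic on the constant side of the comparison.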

    【Comments】:
