【Question title】: LEFT JOIN count causing figures to 'square' instead of count?
【Posted】: 2019-03-12 18:08:40
【Question description】:

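I rewrote a query to join tables instead of running subqueries, because I need to look up about 10 figures and 10 subqueries were causing a bit of a performance problem.

(Table and column names have been changed for simplicity.)

The query previously looked like this: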

SELECT t1.col1, t1.col2, t1.col3, 
(SELECT COUNT(j1.j_id) FROM jointable1 as j1 WHERE t1.t_employee_id = j1.j_employee_id
    AND t1.t_week_ending = j1.j_week_ending AND j1.j_reason <> 'DNC') as col4,
(SELECT COUNT(j2.j_id) FROM jointable1 as j2 WHERE t1.t_employee_id = j2.j_employee_id
    AND t1.t_week_ending = j2.j_week_ending) as col5
FROM table1 as t1
GROUP BY t1.col1, t1.col2, t1.col3;

I have rewritten it like this:

SELECT t1.col1, t1.col2, t1.col3, COUNT(j1.j_id) as col4, COUNT(j2.o_id) as col5
FROM table1 as t1
LEFT JOIN jointable1 as j1 ON (t1.t_employee_id = j1.j_employee_id
    AND t1.t_week_ending = j1.j_week_ending)
    AND j1.j_reason <> 'DNC'
GROUP BY t1.col1, t1.col2, t1.col3;

The problem is that in the first query above, the values returned for col4 and col5 are correct. Say they return 7 and 8.

+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
|    1 |    0 |    0 |   34 |
|    0 |    3 |    3 |    9 |
|    7 |    1 |    0 |    2 |
|    3 |    2 |    2 |    9 |
|    4 |    1 |    0 |    4 |
|    1 |   11 |    1 |    4 |
|    5 |    2 |    5 |   21 |
|    2 |    3 |    0 |    3 |
|    2 |    3 |    0 |    2 |
+------+------+------+------+

But in the second query they come back squared, that is, multiplied by themselves. So 7 becomes 49 and 8 becomes 64.

+------+------+------+------+
| col1 | col2 | col3 | col4 |
+------+------+------+------+
|    1 |    0 |    0 | 1156 |
|    0 |    3 |    3 |   81 |
|    7 |    1 |    0 |   16 |
|    3 |    2 |    2 |   81 |
|    4 |    1 |    0 |   16 |
|    1 |   11 |    1 |   16 |
|    5 |    2 |    5 |  441 |
|    2 |    3 |    0 |    9 |
|    2 |    3 |    0 |    4 |
+------+------+------+------+

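I don't know whether it's the LEFT JOIN or something missing from the GROUP BY, but any help correcting this would be great, as would any help rewriting it in a more efficient way.

【Discussion】:

  • Can you show us some sample table data and the expected result? (As formatted text, not images.) Also take a look at stackoverflow.com/help/mcve

Tags: sql sql-server database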


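【Solution 1】:

If there are multiple matching records in your JOINs, the number of rows can multiply, and that gives incorrect results when you use an aggregate function such as COUNT. You need to use COUNT together with DISTINCT, as shown below.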

 SELECT   t1.col1, 
          t1.col2, 
          t1.col3, 
          Count(DISTINCT j1.j_id) AS col4, 
          Count(DISTINCT j1.o_id) AS col5 
FROM      table1                  AS t1 
LEFT JOIN jointable1              AS j1 
ON        t1.t_employee_id = j1.j_employee_id 
AND       t1.t_week_ending = j1.j_week_ending 
AND       j1.j_reason <> 'DNC' 
GROUP BY  t1.col1, 
          t1.col2, 
          t1.col3;

Note: your query uses the alias j2, which is not defined anywhere; you need to correct that as appropriate.
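
To make the row multiplication concrete, here is a minimal sketch. It assumes the real query joined jointable1 twice, once as j1 and once as j2 (the posted query only shows j1), and it uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for SQL Server. The sample rows and the reduced column list are invented for illustration.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col1 INT, t_employee_id INT, t_week_ending TEXT);
CREATE TABLE jointable1 (j_id INTEGER PRIMARY KEY, j_employee_id INT,
                         j_week_ending TEXT, j_reason TEXT);
-- Hypothetical sample data: one employee/week with three matching rows.
INSERT INTO table1 VALUES (1, 100, '2019-03-10');
INSERT INTO jointable1 (j_employee_id, j_week_ending, j_reason) VALUES
    (100, '2019-03-10', 'A'),
    (100, '2019-03-10', 'B'),
    (100, '2019-03-10', 'C');
""")

# Joining the same one-to-many table twice pairs every j1 row with every j2 row.
join_twice = """
FROM table1 AS t1
LEFT JOIN jointable1 AS j1
       ON t1.t_employee_id = j1.j_employee_id
      AND t1.t_week_ending = j1.j_week_ending
LEFT JOIN jointable1 AS j2
       ON t1.t_employee_id = j2.j_employee_id
      AND t1.t_week_ending = j2.j_week_ending
GROUP BY t1.col1
"""

plain = conn.execute(
    "SELECT COUNT(j1.j_id), COUNT(j2.j_id) " + join_twice
).fetchone()
distinct = conn.execute(
    "SELECT COUNT(DISTINCT j1.j_id), COUNT(DISTINCT j2.j_id) " + join_twice
).fetchone()

print(plain)     # (9, 9): 3 x 3 joined rows, so each count is squared
print(distinct)  # (3, 3): DISTINCT collapses the duplicated ids
```

COUNT(DISTINCT ...) hides the fan-out rather than avoiding it: the join still produces the multiplied rows before they are aggregated.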

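【Discussion】:

    【Solution 2】:

    Try writing the query with OUTER APPLY. That can be more efficient. Also, your second query will not give the correct count for col5. For col4 you need to count the rows where j_reason is not DNC, and for col5 you need to count all rows.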

    SELECT  t1.col1, t1.col2, t1.col3, c4.col4, c5.col5
    FROM    table1 as t1
    OUTER APPLY
    (
        SELECT  COUNT(j1.j_id) col4
        FROM    jointable1 as j1 
        WHERE   t1.t_employee_id = j1.j_employee_id
        AND     t1.t_week_ending = j1.j_week_ending 
        AND     j1.j_reason <> 'DNC'
    )c4
    OUTER APPLY
    (
        SELECT  COUNT(j2.j_id) col5
        FROM    jointable1 as j2 
        WHERE   t1.t_employee_id = j2.j_employee_id
        AND     t1.t_week_ending = j2.j_week_ending
    )c5
    

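    【Discussion】:

    • Thank you. Initially it ran slower when retrieving the query in SSMS, but in the software/application it is more than 55-60% faster than before. Really appreciate it!
    【Solution 3】:

    It is better to do the counts in subqueries that work out all of the combinations and then join to those results, because you then know that each subquery will join only *one* row.

    The problem arises when you join multiple tables in a one-to-many fashion. If you have two 1-to-2 relationships and join both, you get 4 rows, not 2.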

    SELECT t1.col1, t1.col2, t1.col3, j1.Cnt /* add j2.Cnt here once j2 is joined */
    FROM table1 as t1
    LEFT JOIN (select j_employee_id,j_week_ending,COUNT(j_id) AS Cnt
         from jointable1
         where j_reason <> 'DNC'
         group by j_employee_id,j_week_ending) j1
    ON (t1.t_employee_id = j1.j_employee_id
        AND t1.t_week_ending = j1.j_week_ending)
    /* Same again for j2 */
    /* Don't need GROUP BY out here at all now? */
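
A minimal sketch of the template above with the j2 part filled in. It again assumes an in-memory SQLite database via Python's sqlite3 as a stand-in for SQL Server. The sample rows, the reduced column list, and the COALESCE that turns a missing match into 0 are additions for illustration, not part of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col1 INT, t_employee_id INT, t_week_ending TEXT);
CREATE TABLE jointable1 (j_id INTEGER PRIMARY KEY, j_employee_id INT,
                         j_week_ending TEXT, j_reason TEXT);
-- Hypothetical sample data: employee 100 has three rows (one DNC),
-- employee 200 has none.
INSERT INTO table1 VALUES (1, 100, '2019-03-10'), (2, 200, '2019-03-10');
INSERT INTO jointable1 (j_employee_id, j_week_ending, j_reason) VALUES
    (100, '2019-03-10', 'A'),
    (100, '2019-03-10', 'B'),
    (100, '2019-03-10', 'DNC');
""")

# Each derived table is grouped on the join keys, so it contributes
# at most one row per table1 row and the counts cannot multiply.
rows = conn.execute("""
SELECT t1.col1,
       COALESCE(j1.cnt, 0) AS col4,
       COALESCE(j2.cnt, 0) AS col5
FROM table1 AS t1
LEFT JOIN (SELECT j_employee_id, j_week_ending, COUNT(j_id) AS cnt
           FROM jointable1
           WHERE j_reason <> 'DNC'
           GROUP BY j_employee_id, j_week_ending) AS j1
       ON t1.t_employee_id = j1.j_employee_id
      AND t1.t_week_ending = j1.j_week_ending
LEFT JOIN (SELECT j_employee_id, j_week_ending, COUNT(j_id) AS cnt
           FROM jointable1
           GROUP BY j_employee_id, j_week_ending) AS j2
       ON t1.t_employee_id = j2.j_employee_id
      AND t1.t_week_ending = j2.j_week_ending
ORDER BY t1.col1
""").fetchall()

print(rows)  # [(1, 2, 3), (2, 0, 0)]
```

Without the COALESCE, a table1 row with no matches would show NULL instead of 0, which differs from what the original correlated COUNT subqueries return.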
    

    【Discussion】:

      【Solution 4】:
      SELECT 
          t1.col1,
          t1.col2,
          t1.col3, 
          cnt.col4,
          cnt.col5
      FROM table1 as t1
          LEFT JOIN (
              SELECT j1.j_employee_id
                  ,j1.j_week_ending
                  ,SUM(CASE WHEN j1.j_reason <> 'DNC' AND j1.j_id IS NOT NULL THEN 1 ELSE 0 END) as col4
                  ,COUNT(j1.j_id) as col5
              FROM jointable1 as j1
              GROUP BY j1.j_employee_id, j1.j_week_ending
          ) cnt ON t1.t_employee_id = cnt.j_employee_id
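              AND t1.t_week_ending = cnt.j_week_ending;
      /* No outer GROUP BY: cnt already returns one row per (j_employee_id, j_week_ending),
         and SQL Server rejects cnt.col4 / cnt.col5 in the select list if they are not grouped. */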
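
As a rough check of the conditional-aggregation approach, the sketch below compares it with the correlated subqueries from the question. It assumes an in-memory SQLite database via Python's sqlite3 in place of SQL Server. The sample rows, the reduced column list, and the COALESCE wrappers are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (col1 INT, t_employee_id INT, t_week_ending TEXT);
CREATE TABLE jointable1 (j_id INTEGER PRIMARY KEY, j_employee_id INT,
                         j_week_ending TEXT, j_reason TEXT);
-- Hypothetical sample data.
INSERT INTO table1 VALUES (1, 100, '2019-03-10'), (2, 200, '2019-03-10');
INSERT INTO jointable1 (j_employee_id, j_week_ending, j_reason) VALUES
    (100, '2019-03-10', 'A'),
    (100, '2019-03-10', 'B'),
    (100, '2019-03-10', 'DNC');
""")

# One pass over jointable1 produces both counts per (employee, week).
conditional = conn.execute("""
SELECT t1.col1, COALESCE(cnt.col4, 0), COALESCE(cnt.col5, 0)
FROM table1 AS t1
LEFT JOIN (SELECT j_employee_id, j_week_ending,
                  SUM(CASE WHEN j_reason <> 'DNC' THEN 1 ELSE 0 END) AS col4,
                  COUNT(j_id) AS col5
           FROM jointable1
           GROUP BY j_employee_id, j_week_ending) AS cnt
       ON t1.t_employee_id = cnt.j_employee_id
      AND t1.t_week_ending = cnt.j_week_ending
ORDER BY t1.col1
""").fetchall()

# The original formulation: one correlated subquery per figure.
subqueries = conn.execute("""
SELECT t1.col1,
       (SELECT COUNT(j.j_id) FROM jointable1 AS j
         WHERE j.j_employee_id = t1.t_employee_id
           AND j.j_week_ending = t1.t_week_ending
           AND j.j_reason <> 'DNC'),
       (SELECT COUNT(j.j_id) FROM jointable1 AS j
         WHERE j.j_employee_id = t1.t_employee_id
           AND j.j_week_ending = t1.t_week_ending)
FROM table1 AS t1
ORDER BY t1.col1
""").fetchall()

print(conditional)  # [(1, 2, 3), (2, 0, 0)]
print(conditional == subqueries)  # True
```

With around 10 figures to look up, each extra figure becomes one more SUM(CASE ...) column in the same derived table, so jointable1 is still scanned only once.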
      

      【Discussion】:
