【Question title】: Select non-duplicated records
【Posted】: 2014-01-14 03:35:50
【Question description】:

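I have a table with about 50 million records.

The table structure is shown below; the callerid and call_start fields are both indexed.

id -- callerid -- call_start

I want to select all records whose call_start is '2013-12-22' or later and whose callerid does not appear anywhere in the table before '2013-12-22'.

I tried something like this: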

SELECT DISTINCT 
  ca.`callerid` 
FROM
  call_archives AS ca 
WHERE ca.`call_start` >= '2013-12-22' 
  AND ca.`callerid` NOT IN 
  (SELECT DISTINCT 
    ca.`callerid` 
  FROM
    call_archives AS ca 
  WHERE ca.`call_start` < '2013-12-21')

But this is very slow. Any suggestions would be much appreciated.

【Discussion】:

    Tags: mysql sql performance select not-exists


    【Solution 1】:

    Use NOT EXISTS instead of NOT IN.

    Try this:

    SELECT DISTINCT ca.callerid 
    FROM call_archives AS ca 
    WHERE ca.call_start>='2013-12-22' AND 
      NOT EXISTS(SELECT 1 FROM call_archives AS ca1 
                 WHERE ca.callerid = ca1.callerid AND ca1.call_start <'2013-12-21');
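
The correlated subquery looks up rows by callerid and then filters on call_start, so it may benefit from one composite index rather than two single-column ones. The following is a minimal sketch, assuming no such index exists yet; the index name idx_callerid_start is hypothetical, and building it on a table of this size can take a while.

```sql
-- Assumption: call_archives has separate indexes on callerid and call_start,
-- but no composite index. idx_callerid_start is a made-up name.
ALTER TABLE call_archives
  ADD INDEX idx_callerid_start (callerid, call_start);

-- Inspect the plan afterwards to see whether the subquery uses the new index.
EXPLAIN
SELECT DISTINCT ca.callerid
FROM call_archives AS ca
WHERE ca.call_start >= '2013-12-22'
  AND NOT EXISTS (SELECT 1
                  FROM call_archives AS ca1
                  WHERE ca1.callerid = ca.callerid
                    AND ca1.call_start < '2013-12-21');
```

Whether this helps depends on the MySQL version and the data distribution, so it is worth comparing the EXPLAIN output before and after.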
    

    【Discussion】:

      【Solution 2】:

      Just curious whether this query runs fast on your table:

      SELECT ca.`callerid` 
      FROM call_archives AS ca 
      GROUP BY ca.`callerid` 
      HAVING MIN(ca.`call_start`) >='2013-12-22' 
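
One caveat, offered as a sketch rather than a verified result: the GROUP BY form filters on a single cutoff, while the query in the question uses two ('2013-12-22' for the outer filter and '2013-12-21' for the subquery). A callerid whose earliest call falls on 2013-12-21 would pass the original query but not `MIN(call_start) >= '2013-12-22'`. Assuming callerid is never NULL, a variant that keeps both cutoffs might look like this:

```sql
-- Sketch: intended to match the NOT IN / NOT EXISTS queries,
-- under the assumption that callerid is NOT NULL.
SELECT ca.callerid
FROM call_archives AS ca
GROUP BY ca.callerid
HAVING MIN(ca.call_start) >= '2013-12-21'   -- no call before 2013-12-21
   AND MAX(ca.call_start) >= '2013-12-22';  -- at least one call on or after 2013-12-22
```

If the intended cutoff is '2013-12-22' in both places, as the wording of the question suggests, the single MIN condition is enough.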
      

      【Discussion】:

      • Query execution time improved significantly. Thanks.
      【Solution 3】:

      Try NOT EXISTS:

      SELECT DISTINCT 
        ca.`callerid` 
      FROM
        call_archives AS ca 
      WHERE ca.`call_start` >= '2013-12-22' 
        AND NOT EXISTS 
        (SELECT 
          1 
        FROM
          call_archives AS cb 
        WHERE ca.`callerid` = cb.`callerid` 
          AND cb.`call_start` < '2013-12-21')
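
For comparison, the same anti-join can also be written with LEFT JOIN ... IS NULL. This is only a sketch of an alternative form; it is not taken from the answers above, and whether it runs faster than NOT EXISTS depends on the MySQL version and the available indexes.

```sql
-- Keep callers with a call on or after 2013-12-22
-- for which no matching earlier row (cb) was found.
SELECT DISTINCT ca.callerid
FROM call_archives AS ca
LEFT JOIN call_archives AS cb
  ON cb.callerid = ca.callerid
 AND cb.call_start < '2013-12-21'
WHERE ca.call_start >= '2013-12-22'
  AND cb.callerid IS NULL;   -- no earlier call exists for this callerid
```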
      

      【Discussion】:

      • This reduced the execution time, but not as much as I expected.