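【Question title】: Optimizing an embedded SELECT query in MySQL
【Posted】: 2010-10-30 01:36:28
【Question description】:

OK, here is the query I am currently running on a table with 45,000 records and 65 MB in size... and it is only going to get bigger (so I have to think about future performance here):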

SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
AND member_id NOT IN (SELECT p2.member_id FROM payments p2 WHERE p2.completed=1 AND p2.tm_completed < '2009-05-01' AND p2.tm_completed IS NOT NULL GROUP BY p2.member_id)

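As you can probably imagine, it brings the MySQL server to a standstill...

What it does: it pulls the number of new users who signed up, with at least one "completed" payment, where tm_completed is not NULL (it is only populated for completed payments), and where (the embedded SELECT) the member has never had a "completed" payment before. That last condition means he is a new member: the system does rebills and so on, so this is the only way to tell an existing member who was just rebilled from a new member being charged for the first time.

Now, is there any way to optimize this query so that it uses fewer resources and stops bringing my MySQL server to its knees?

Am I missing any information that would clarify this further? Let me know...

Edit:

Here are the indexes already on that table: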

Keyname       Type     Cardinality  Field
PRIMARY       PRIMARY  46757        payment_id
member_id     INDEX    23378        member_id
payer_id      INDEX    11689        payer_id
coupon_id     INDEX    1            coupon_id
tm_added      INDEX    46757        tm_added, product_id
tm_completed  INDEX    46757        tm_completed, product_id

【Comments】:

  • Do you have indexes on the fields used in your search conditions?

Tags: database optimization mysql


【Solution 1】:

This kind of IN subquery is rather slow in MySQL. I would rewrite it like this:

SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM   payments p
WHERE  tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND    completed > 0
AND    NOT EXISTS (
           SELECT member_id
           FROM   payments
           WHERE  member_id = p.member_id
           AND    completed = 1
           AND    tm_completed < '2009-05-01');

The check "tm_completed IS NOT NULL" is not needed, because your BETWEEN condition already implies it.

Also make sure you have an index on:
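The reason is that any comparison against NULL evaluates to NULL rather than true, so a row with a NULL tm_completed can never pass the BETWEEN test. A quick way to convince yourself, as a sketch that assumes nothing beyond the payments table from the question:

```sql
-- BETWEEN on a NULL operand yields NULL, which WHERE treats as "not true".
SELECT NULL BETWEEN '2009-05-01' AND '2009-05-30' AS result;

-- So this count is always 0, whatever the data in payments looks like:
-- the two conditions can never both hold for the same row.
SELECT COUNT(*)
FROM   payments
WHERE  tm_completed IS NULL
AND    tm_completed BETWEEN '2009-05-01' AND '2009-05-30';
```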

(tm_completed, completed)
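A minimal sketch of how that index could be created, assuming the table is called payments as in the question. Both index names are made up for illustration. The second index is not part of the suggestion above: it is only an assumption that an index leading with member_id might help the correlated NOT EXISTS lookup, so check it with EXPLAIN before keeping it.

```sql
-- Composite index suggested above (index name is hypothetical).
ALTER TABLE payments
  ADD INDEX idx_tm_completed_completed (tm_completed, completed);

-- Assumption, not from the answer: may let the NOT EXISTS subquery
-- resolve member_id, completed and tm_completed from the index alone.
ALTER TABLE payments
  ADD INDEX idx_member_completed_tm (member_id, completed, tm_completed);
```

Every extra index slows down writes a little, so it is probably worth dropping whichever one EXPLAIN shows is not being used.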

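【Comments】:

  • Beat me to the punch; +1 for speed
  • Wow... I had no idea. This is only slightly different from what I already had, just replacing "IN" with "EXISTS"... Thanks!
【Solution 2】:

I had fun putting together this solution, which does not need a subquery: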

SELECT COUNT(p1.payment_id) AS signup_count,
       SUM(p1.amount)       AS signup_amount
  FROM payments p1
       LEFT JOIN payments p2
              ON p1.member_id = p2.member_id
             AND p2.completed = 1
             AND p2.tm_completed < DATE '2009-05-01'
 WHERE p1.completed > 0
   AND p1.tm_completed BETWEEN DATE '2009-05-01' AND DATE '2009-05-30'
   AND p2.member_id IS NULL;
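To see how this compares with the original NOT IN version on your own data, you could prefix each variant with EXPLAIN and compare the output. This is only a sketch of the approach; the plan you actually get depends on your MySQL version and on which indexes exist.

```sql
-- Show the execution plan instead of running the query.
-- Compare the "key" and "rows" columns between the query variants.
EXPLAIN
SELECT COUNT(p1.payment_id) AS signup_count,
       SUM(p1.amount)       AS signup_amount
  FROM payments p1
       LEFT JOIN payments p2
              ON p1.member_id = p2.member_id
             AND p2.completed = 1
             AND p2.tm_completed < DATE '2009-05-01'
 WHERE p1.completed > 0
   AND p1.tm_completed BETWEEN DATE '2009-05-01' AND DATE '2009-05-30'
   AND p2.member_id IS NULL;
```

A NULL in the "key" column for either table means no index is being used for that step, which is usually the first thing to fix.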

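【Comments】:

  • This technique works very well, especially in MySQL (which has historically had problems with subqueries).
  • I like this answer too... Apparently, when I run EXPLAIN on the two answers I picked here, I get the same performance/resource usage (roughly 12,000 times faster to compute than with the "IN" subquery). Amazing! Thanks...
【Solution 3】:

Avoid using IN with a subquery; MySQL does not optimize these well (although optimizations for this are pending in 5.4 and 6.0). Rewriting it as a join will probably improve performance: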

SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
LEFT JOIN (SELECT p2.member_id
          FROM payments p2
          WHERE p2.completed=1
          AND p2.tm_completed < '2009-05-01'
          AND p2.tm_completed IS NOT NULL
          GROUP BY p2.member_id) foo
ON p.member_id = foo.member_id
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
AND foo.member_id IS NULL  -- anti-join test must be in WHERE; inside ON it filters nothing

Second, I would need to see your table schema; are you using indexes?

【Comments】:
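For reference, these two statements would print the schema and the index list in a form that is easy to paste into the question (assuming the table is named payments, as in the query):

```sql
-- Full table definition, including column types and keys.
SHOW CREATE TABLE payments;

-- One row per indexed column, with cardinality estimates.
SHOW INDEX FROM payments;
```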
