【Question title】: How to calculate the median correctly in MySQL 5.7
【Posted】: 2020-12-19 04:17:15
【Question】:

Here is my fiddle: https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=7946871d9c25cd8914353c70fde1fe8d

This is my query:

select count(user_id) as itung, user_Id from

(SELECT t1.user_id, 
       t1.createdAt cretecompare1, 
       t2.createdAt cretecompare2,
       DATEDIFF(t2.createdAt, t1.createdAt) diff
-- table for a transaction
FROM test t1
-- table for prev. transaction
JOIN test t2 ON t1.user_id = t2.user_id 
            AND t1.createdAt < t2.createdAt
            AND 7 NOT IN (t1.status_id, t2.status_id)
JOIN (SELECT t3.user_id
      FROM test t3
      WHERE t3.status_id != 7
      GROUP BY t3.user_id
      HAVING SUM(t3.createdAt < '2020-04-01') > 1
         AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id
WHERE NOT EXISTS (SELECT NULL
                   FROM test t5
                   WHERE t1.user_id = t5.user_id
                     AND t5.status_id != 7
                     AND t1.createdAt < t5.createdAt
                     AND t5.createdAt < t2.createdAt)
HAViNG cretecompare2  BETWEEN '2020-02-01' AND '2020-04-01') aa
group by user_Id
output table:
    +--------+---------+
    |  itung | user_Id |
    +--------+---------+
    |      1 |      13 |
    |      2 |      14 |
    +--------+---------+

Based on that table, I want to find max(itung), min(itung) and the median of itung with this query:

select max(itung), min(itung), format(avg(itung), 2),  IF(count(*)%2 = 1, CAST(SUBSTRING_INDEX(SUBSTRING_INDEX( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ',')
, ',', 50/100 * COUNT(*)), ',', -1) AS DECIMAL), ROUND((CAST(SUBSTRING_INDEX(SUBSTRING_INDEX
( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ','), ',', 50/100
* COUNT(*) + 1), ',', -1) AS DECIMAL) + CAST(SUBSTRING_INDEX(SUBSTRING_INDEX
( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ','), ',', 50/100
* COUNT(*)), ',', -1) AS DECIMAL)) / 2)) as median from
(select count(user_id) as itung, user_Id from 
(SELECT t1.user_id, 
       t1.createdAt cretecompare1, 
       t2.createdAt cretecompare2,
       DATEDIFF(t2.createdAt, t1.createdAt) diff
-- table for a transaction
FROM test t1
-- table for prev. transaction
JOIN test t2 ON t1.user_id = t2.user_id 
            AND t1.createdAt < t2.createdAt
            AND 7 NOT IN (t1.status_id, t2.status_id)
JOIN (SELECT t3.user_id
      FROM test t3
      WHERE t3.status_id != 7
      GROUP BY t3.user_id
      HAVING SUM(t3.createdAt < '2020-04-01') > 1
         AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id
WHERE NOT EXISTS (SELECT NULL
                   FROM test t5
                   WHERE t1.user_id = t5.user_id
                     AND t5.status_id != 7
                     AND t1.createdAt < t5.createdAt
                     AND t5.createdAt < t2.createdAt)
HAViNG cretecompare2  BETWEEN '2020-02-01' AND '2020-04-01') aa
group by user_Id) ab

output table:
+------------+------------+-----------------------+--------+
| max(itung) | min(itung) | format(avg(itung), 2) | median |
+------------+------------+-----------------------+--------+
|          2 |          1 |                  1.50 |      2 |
+------------+------------+-----------------------+--------+

As you can see, the median is wrong: it should be 1.5, not 2. Where is the mistake in my median query?

【Comments】:

    Tags: mysql median


    【Solution 1】:

    Your ROUND() rounds the reported median to an integer. If you don't want that, remove it:

    select max(itung), min(itung), format(avg(itung), 2),  IF(count(*)%2 = 1, CAST(SUBSTRING_INDEX(SUBSTRING_INDEX( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ',') , ',', 50/100 * COUNT(*)), ',', -1) AS DECIMAL), (CAST(SUBSTRING_INDEX(SUBSTRING_INDEX ( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ','), ',', 50/100 * COUNT(*) + 1), ',', -1) AS DECIMAL) + CAST(SUBSTRING_INDEX(SUBSTRING_INDEX ( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ','), ',', 50/100 * COUNT(*)), ',', -1) AS DECIMAL)) / 2) as median
    

    Or keep the rounding and pass the number of decimal places to round to, here 3:

    select max(itung), min(itung), format(avg(itung), 2),  IF(count(*)%2 = 1, CAST(SUBSTRING_INDEX(SUBSTRING_INDEX( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ',') , ',', 50/100 * COUNT(*)), ',', -1) AS DECIMAL), ROUND((CAST(SUBSTRING_INDEX(SUBSTRING_INDEX ( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ','), ',', 50/100 * COUNT(*) + 1), ',', -1) AS DECIMAL) + CAST(SUBSTRING_INDEX(SUBSTRING_INDEX ( GROUP_CONCAT(itung ORDER BY itung SEPARATOR ','), ',', 50/100 * COUNT(*)), ',', -1) AS DECIMAL)) / 2, 3)) as median
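
    To see the effect of ROUND() in isolation, here is a minimal sketch that uses the two middle values from the question's output table (1 and 2) as literals instead of the GROUP_CONCAT expressions. The column aliases are made up for illustration.

    ```sql
    -- Middle values 1 and 2 averaged three ways (MySQL 5.7).
    SELECT (1 + 2) / 2           AS unrounded,  -- 1.5000
           ROUND((1 + 2) / 2)    AS rounded_0,  -- 2, what the original query reported
           ROUND((1 + 2) / 2, 3) AS rounded_3;  -- 1.500
    ```

    ROUND() with a single argument rounds to zero decimal places, which is why the even-count branch of the original query returned 2.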
    

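    Note that finding the median from a GROUP_CONCAT comma-separated list of all values only works when there are not many rows, because GROUP_CONCAT truncates its result at @@group_concat_max_len, which defaults to 1024 on MySQL and on MariaDB before 10.2.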
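
    If the list may exceed that limit, the variable can be raised before running the median query. A minimal sketch, assuming a session-level change is acceptable; the value 1000000 is only an illustrative choice, not a recommendation:

    ```sql
    -- Inspect the current limit (1024 by default on MySQL 5.7).
    SELECT @@group_concat_max_len;

    -- Raise it for the current session only; 1000000 is an arbitrary example value.
    SET SESSION group_concat_max_len = 1000000;

    -- After running a GROUP_CONCAT query, check whether the result was cut:
    -- MySQL reports truncation as a warning.
    SHOW WARNINGS;
    ```

    A truncated list would silently shift or drop the middle elements, so checking the warnings after a large run is a cheap safeguard.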

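    【Discussion】:

    • I tried it in the fiddle but got a syntax error; check dbfiddle.uk/…
    • Heh; I pasted it from the command line, where it works. Checking now what went wrong
    • The last bit () / 2) as median) was left out, sorry; dbfiddle.uk/…
    • Thanks, this is what I wanted, but if I have a lot of rows I will have to increase group_concat_max_len
    • @FachryDzaky Your CAST( ... AS DECIMAL) also rounds to an integer; I got rid of it. In the non-rounding query I added 0+ to the odd branch of the IF, and in the rounding query I moved the rounding around the whole IF: dbfiddle.uk/…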
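
    The last comment's point about CAST can be checked with a literal. This is a small sketch, not the query from the linked fiddle; the string '1.5' and the aliases are made up to stand in for a fractional element extracted by SUBSTRING_INDEX:

    ```sql
    -- DECIMAL without precision/scale means DECIMAL(10,0), so the fraction is rounded away.
    SELECT CAST('1.5' AS DECIMAL)        AS cast_default,  -- 2
           CAST('1.5' AS DECIMAL(10, 2)) AS cast_scaled,   -- 1.50
           0 + '1.5'                     AS numeric_ctx;   -- 1.5
    ```

    With the integer counts in this question the cast is harmless, but it matters as soon as the values being ranked have a fractional part.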