【Question title】: How to calculate median of an inner join field when using group by?
【Posted】: 2017-03-30 14:23:49
【Question description】:

I have the following query, which retrieves the number of sales of a specific item along with the average price of those sales.

SELECT COUNT(1) AS num_sales, DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date, AVG(prices.price) AS avg_price
FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503 AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC

I also have a for loop that runs a separate query for each day to get the median price (assuming the number of results is even):

SELECT prices.price FROM sales INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7503 
AND (`prices`.`source` = 0 or (`prices`.`price` >= 400 and `prices`.`source` > 0))
AND DATE(sales.created_at) = "<THE DATE OF THE CURRENT FOR-LOOP OBJECT>"
ORDER BY prices.price ASC
LIMIT 1 OFFSET <NUMBER OF THE MIDDLE ROW>

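As you can imagine, this is very slow, because in some cases hundreds of queries have to run against a large table (the sales table has several hundred million rows).

How can I rewrite the first SQL query so that it also computes the median of prices.price, analogous to AVG(prices.price)? I have looked at answers such as this one, but I can't work out how to adapt them to my specific scenario.

I have already spent several hours trying to get this working, but my SQL knowledge isn't good enough. Any help would be much appreciated!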

root@ns525077:~# mysql -V
mysql  Ver 14.14 Distrib 5.7.13, for Linux (x86_64) using  EditLine wrapper

Table schema:

CREATE TABLE `prices` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `item_id` int(11) unsigned NOT NULL,
 `price` decimal(8,2) NOT NULL,
 `net_price` decimal(8,2) NOT NULL,
 `source` tinyint(4) NOT NULL,
 `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 PRIMARY KEY (`id`),
 UNIQUE KEY `id` (`id`),
 KEY `prices_ibfk_1` (`item_id`),
 CONSTRAINT `prices_ibfk_1` FOREIGN KEY (`item_id`) REFERENCES `items` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=4861375 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `sales` (
 `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
 `price_id` int(11) unsigned DEFAULT NULL,
 `item_key` varchar(40) COLLATE utf8_unicode_ci NOT NULL,
 `created_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 `updated_at` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
 PRIMARY KEY (`id`),
 UNIQUE KEY `id` (`id`),
 UNIQUE KEY `item_key` (`item_key`),
 KEY `price_id` (`price_id`),
 KEY `created_at` (`created_at`),
 KEY `price_id__created_at__IX` (`price_id`,`created_at`),
 CONSTRAINT `sales_ibfk_1` FOREIGN KEY (`price_id`) REFERENCES `prices` (`id`) ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=386156944 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

Example output of my first query: (output not preserved)

【Comments】:

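  • Please post your SHOW CREATE TABLE output.
  • Please share: what is the data type of price? What is the maximum number of rows per day?
  • @e4c5 I've added the CREATE TABLE output. The maximum number of rows per day depends on the number of sales recorded. It can be several hundred thousand.
  • Could you also provide a few rows of sample data? It isn't clear to me why you need timestamps in both tables.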

Tags: mysql sql


【Solution 1】:

After a lot of searching, I found the answer to my problem here. Perhaps my original question wasn't phrased well.
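
The core of that approach, as I understand it, is to build an ordered, comma-separated list of values with GROUP_CONCAT and then cut out the middle element with two nested SUBSTRING_INDEX calls. A minimal illustration on a literal string (the five prices are made-up sample values, not data from my tables):

```sql
-- The list below is hypothetical sample data, already sorted ascending.
-- Inner call keeps everything up to the 3rd element: '100.00,250.00,400.00'
-- Outer call keeps the last element of that result:  '400.00'
SELECT SUBSTRING_INDEX(
           SUBSTRING_INDEX('100.00,250.00,400.00,900.00,1200.00', ',', 3),
           ',', -1) AS middle_value;
```

With an even number of elements there is no single middle element, so the two central elements are extracted the same way and averaged.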

I adapted the solution to my situation, and this is the query that works:

SELECT COUNT(1) AS num_sales,
       DATE_FORMAT(sales.created_at, '%Y-%m-%d') AS date,
       AVG(prices.price) AS avg_price,
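       -- Median: collect the day's prices into an ordered, comma-separated
       -- string and pick the middle element(s) with SUBSTRING_INDEX.
       CASE (COUNT(*) % 2)
       -- Odd row count: take the single middle value.
       WHEN 1 THEN SUBSTRING_INDEX(
           SUBSTRING_INDEX(
               GROUP_CONCAT(prices.price
                            ORDER BY prices.price SEPARATOR ',')
               , ',', (COUNT(*) + 1) DIV 2)
           , ',', -1)
       -- Even row count: average the two middle values.
       -- DIV keeps the positions as integers instead of relying on rounding.
       ELSE (SUBSTRING_INDEX(
                 SUBSTRING_INDEX(
                     GROUP_CONCAT(prices.price
                                  ORDER BY prices.price SEPARATOR ',')
                     , ',', COUNT(*) DIV 2)
                 , ',', -1)
             + SUBSTRING_INDEX(
                 SUBSTRING_INDEX(
                     GROUP_CONCAT(prices.price
                                  ORDER BY prices.price SEPARATOR ',')
                     , ',', COUNT(*) DIV 2 + 1)
                 , ',', -1)) / 2
       END AS median_price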
FROM sales
  INNER JOIN prices ON prices.id = sales.price_id
WHERE prices.item_id = 7381
      AND (`prices`.`source` = 0
           OR (`prices`.`price` >= 400
               AND `prices`.`source` > 0))
GROUP BY date
ORDER BY date ASC;
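
One caveat worth checking before relying on this with large groups: the result of GROUP_CONCAT is truncated to group_concat_max_len bytes, which defaults to 1024 in MySQL 5.7. If a day has many sales, the concatenated price list may be cut off, and the "middle" element would then be taken from the truncated string. A minimal sketch of raising the limit for the current session follows. The value 16777216 is only an illustrative assumption; it should be sized to the largest day (roughly rows per day times the length of one price plus the separator), and the effective length is also bounded by max_allowed_packet.

```sql
-- Check the current limit (1024 bytes by default in MySQL 5.7).
SHOW VARIABLES LIKE 'group_concat_max_len';

-- Raise it for this session only; 16 MiB is an assumed, illustrative value.
SET SESSION group_concat_max_len = 16777216;

-- Run this right after the median query: a truncated GROUP_CONCAT
-- is reported as a warning rather than an error.
SHOW WARNINGS;
```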

【Discussion】:
