MySql Split Query Group Percentage: Answers

【Question title】: MySql Split Query Group Percentage
【Posted】: 2016-12-30 10:31:15
【Question】:

I am trying to do the following, but I have not been able to yet :(.

I have these tables:

table1 -> tb1_id, tb1_name 

Sample Data:
--------------
1  group1
2  group2
3  group3
4  group4
5  group5

table2 -> tb2_id, tb2_sector, tb2_tb3_id   

Sample Data:
--------------
1  alpha   1
2  beta    2
3  gamma   2
4  delta   2
5  epsilon 4

table3 -> tb3_id, tb3_mid, tb3_section 

Sample Data:
--------------
1   234   alpha,beta,gamma,delta

This is the output I am looking for:

Name    Count     %
------  -----    -----
group1   1       25%
group2   3       75%
group3   0       0%
group4   0       0%
group5   0       0%

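Basically, I need to split a comma-separated column value (tb3_section in table3), find the right group for each value (table2 gives me the group id that links to table1), and then count the values per group and compute each group's percentage (taking the overall total as 100%).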

This is what I have tried so far:

I searched for examples of splitting values and found one that splits by first creating a numbers table:

create table numbers (
  `n` INT(11) SIGNED 
  , PRIMARY KEY(`n`)
);

INSERT INTO numbers(n) SELECT @row := @row + 1 FROM 
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t,
(SELECT 0 UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) t2, 
(SELECT 0 UNION ALL SELECT 1) t8, 
(SELECT @row:=0) ti;

Then I did this:

select tb3_section, count(1) from ( 
  select 
    tb3_mid, 
    substring_index(
      substring_index(tb3_section, ',', n), 
      ',', 
      -1
    ) as tb3_section
  from table3 
  join numbers
    on char_length(tb3_section) 
      - char_length(replace(tb3_section, ',', '')) 
      >= n - 1
) tb3_section_dashboard
group by 1

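This does not give me the count per group. It only splits tb3_section, without the correct counts or the corresponding percentages. Any ideas would be much appreciated.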

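Latest update

First of all, I would like to thank @eggyal for pointing me in the right direction, and @Shadow who, despite knowing I was not taking the best approach, came up with a quick solution to my problem. I managed to change the approach and removed the comma-separated values from table3. Instead, I now add one row per value (and added a constraint to avoid duplicates).

Now table3 looks like: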

Sample Data:
--------------
1   234   alpha
2   234   beta
3   234   gamma
4   234   delta
5   235   alpha

This is the query I adapted from @Shadow's example:

SELECT t1.tb1_name, COUNT(t3.tb3_section) AS no_per_group,
COUNT(t3.tb3_section) / t4.no_of_groups AS percentage 
FROM t1 LEFT JOIN t2 ON t1.tb1_id=t2.tb2_tb3_id 
INNER JOIN t3 ON t2.tb2_sector=t3.tb3_section>0 
JOIN (SELECT COUNT(*) AS no_of_groups
       FROM t3 INNER JOIN t2 ON t2.tb2_sector=t3.tb3_section>0) t4 
GROUP BY t1.tb1_name

I now use = to match exact values instead of find_in_set. I get something like the following, but the percentages look odd and the group without any match is missing:

Name        no_per_group     percentage
-----       -------------    ----------
group1      2                0.1053
group3      3                0.1579
group4      3                0.1579
group5      3                0.1579

However, I still need something like this:

Name    Count     %
------  -----    -----
group1   1       25%
group2   3       75%
group3   0       0%
group4   0       0%
group5   0       0%

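Note that a group must still be shown even when it has no matches. Because I have thousands of records that differ from one another, I also need to add another condition: where tb3_mid=234. That way the result is restricted to a single tb3_mid.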

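【Comments】:

You may be interested in reading @Bill Karwin's answer to "Is storing a delimited list in a database column really that bad?"
Thanks a lot @eggyal for pointing out Bill's great answer. In my case the problem is that I don't have full control over the actual tables, because I didn't create them :(. However, I am now experiencing first-hand the problems of storing delimited values in a column.
You can construct a join predicate from MySQL's FIND_IN_SET() function.

Tags: c# mysql stored-procedures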

【Solution 1】:

The best solution is to redesign the table structure and move the data from the delimited list into a separate table.
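As a rough illustration of that redesign, the sketch below stores one row per section value. The table name table3_normalized, the column types, and the key name uq_mid_section are assumptions made for the example, not taken from the real schema; the section is spelled gamma here on the assumption that it should match table2.

```sql
-- Hypothetical normalized replacement for table3:
-- one row per (tb3_mid, tb3_section) pair instead of a comma-separated list.
CREATE TABLE table3_normalized (
  tb3_id      INT NOT NULL AUTO_INCREMENT,
  tb3_mid     INT NOT NULL,
  tb3_section VARCHAR(64) NOT NULL,   -- assumed length
  PRIMARY KEY (tb3_id),
  -- Keeps the same section from being stored twice for one tb3_mid.
  UNIQUE KEY uq_mid_section (tb3_mid, tb3_section)
);

-- The single row "234  alpha,beta,gamma,delta" becomes four rows.
INSERT INTO table3_normalized (tb3_mid, tb3_section) VALUES
  (234, 'alpha'),
  (234, 'beta'),
  (234, 'gamma'),
  (234, 'delta');
```

With this layout the sections can be joined to table2 with a plain equality comparison, which can also use an index.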

The quick solution is to use MySQL's find_in_set() function.
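For reference, a minimal look at how find_in_set() behaves on a list like the one in tb3_section (the literal strings are only illustrative):

```sql
-- FIND_IN_SET returns the 1-based position of the first argument
-- within the comma-separated list, or 0 when it is not present.
SELECT FIND_IN_SET('beta',    'alpha,beta,gamma,delta') AS pos_found,    -- 2
       FIND_IN_SET('epsilon', 'alpha,beta,gamma,delta') AS pos_missing;  -- 0
```

The comparison is against each whole list element, so a list written with spaces after the commas ('alpha, beta') would not match 'beta'.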

To get the total number of matching entries in the messages table (table3):

select count(*) as no_of_groups
from t3 inner join t2 on find_in_set(t2.tb2_sector,t3.tb3_section)>0

To get the count per group, add a join to table1 and group by the group name. To calculate the percentage, add the query above as a subquery:

select t1.tb1_name, count(t3.tb3_section) as no_per_group, count(t3.tb3_section) / t4.no_of_groups as percentage
from t1 left join t2 on t1.tb1_id=t2.tb2_tb3_id
inner join t3 on find_in_set(t2.tb2_sector,t3.tb3_section)>0
join (select count(*) as no_of_groups
      from t3 inner join t2 on find_in_set(t2.tb2_sector,t3.tb3_section)>0) t4  -- no join condition, so this is a Cartesian (cross) join
group by t1.tb1_name
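One caveat with the query above: the inner join to t3 discards the groups that have no matching section, so they never reach the result. A possible variant, offered only as a sketch, uses left joins throughout and formats the ratio as a whole-number percentage. It assumes the real table names are table1, table2 and table3 (aliased to t1, t2, t3) and that the section is spelled gamma in both tables.

```sql
SELECT t1.tb1_name,
       COUNT(t3.tb3_id) AS no_per_group,          -- counts only matched rows
       CONCAT(IFNULL(ROUND(100 * COUNT(t3.tb3_id)
                           / NULLIF(t4.no_of_groups, 0)), 0), '%') AS percentage
FROM table1 AS t1
LEFT JOIN table2 AS t2
       ON t2.tb2_tb3_id = t1.tb1_id
LEFT JOIN table3 AS t3                             -- LEFT keeps groups without matches
       ON FIND_IN_SET(t2.tb2_sector, t3.tb3_section) > 0
CROSS JOIN (SELECT COUNT(*) AS no_of_groups        -- overall number of matches
            FROM table3 AS t3
            INNER JOIN table2 AS t2
                    ON FIND_IN_SET(t2.tb2_sector, t3.tb3_section) > 0) AS t4
GROUP BY t1.tb1_id, t1.tb1_name, t4.no_of_groups
ORDER BY t1.tb1_id;
```

NULLIF guards against a division by zero when nothing matches at all, and IFNULL turns that case into 0%. Under the stated assumptions, the sample data should yield 1 and 25% for group1, 3 and 75% for group2, and 0 and 0% for the remaining groups.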

【Discussion】:

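Thank you very much, Shadow, for taking the time to help me. Your query solves the original problem. I managed to convince my colleagues to change the table structure, so I followed your and @eggyal's suggestion and modified it. I will now update my question with what I did. I almost get the result I want, except that I do not get 0% when there is no match; I think a left join must be missing somewhere.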
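For the restructured table3 (one row per section) and the tb3_mid=234 condition from the question, a sketch along these lines may give the missing 0% rows. It assumes the tables are named table1, table2 and table3, and that tb3_section holds exactly the values stored in tb2_sector.

```sql
SELECT t1.tb1_name,
       COUNT(t3.tb3_id) AS no_per_group,
       CONCAT(IFNULL(ROUND(100 * COUNT(t3.tb3_id)
                           / NULLIF(t4.total, 0)), 0), '%') AS percentage
FROM table1 AS t1
LEFT JOIN table2 AS t2
       ON t2.tb2_tb3_id = t1.tb1_id
LEFT JOIN table3 AS t3
       ON t3.tb3_section = t2.tb2_sector
      AND t3.tb3_mid = 234        -- filter in ON, not WHERE, so empty groups survive
CROSS JOIN (SELECT COUNT(*) AS total
            FROM table3 AS t3
            INNER JOIN table2 AS t2
                    ON t2.tb2_sector = t3.tb3_section
            WHERE t3.tb3_mid = 234) AS t4
GROUP BY t1.tb1_id, t1.tb1_name, t4.total
ORDER BY t1.tb1_id;
```

Putting tb3_mid = 234 in a WHERE clause of the outer query would filter out the rows where t3 is NULL and effectively turn the left join back into an inner join, which is why the condition sits in the ON clause here.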