Count the number of unique words in a column

Time: 2017-02-26 23:49:01

Tags: mysql

I have a table similar to this:

quantity | part_numbers
1        | T101; T103
3        | T103; T102
1        | T101; T102; T103

I'm trying to write a script that returns:

part_number | quantity
T101        | 2
T102        | 4
T103        | 5

I found this script, which works, but it doesn't take the quantity into account:

SELECT SUM(total_count) as total, value
FROM (                   

SELECT (count(*)) AS total_count, REPLACE(REPLACE(REPLACE(x.value,'?',''),'.',''),'!','') as value
FROM (                   
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.part_numbers, ' ', n.n), ' ', -1) value
  FROM order_items t CROSS JOIN 
(                        
   SELECT a.N + b.N * 10 + 1 n
     FROM                
    (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
   ,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
    ORDER BY n           
) n                      
 WHERE n.n <= 1 + (LENGTH(t.part_numbers) - LENGTH(REPLACE(t.part_numbers, ' ', '')))
 ORDER BY value          

) AS x                   
GROUP BY x.value         

) AS y                   
GROUP BY value           
order by total desc   

1 Answer:

Answer 0: (score: 2)

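First, you should fix your data structure. Storing lists as delimited strings is the wrong thing to do in SQL. You should have a table with one row per item and per part.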
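As a rough illustration of that advice, here is a minimal sketch of what a normalized layout could look like. The table name `order_item_parts` and the `order_item_id` column are hypothetical (they do not appear in the question), and the inserted rows are just the three sample rows from the question, split by hand.

```sql
-- Hypothetical normalized table: one row per (order item, part).
-- order_item_parts and order_item_id are assumed names, not from the question.
CREATE TABLE order_item_parts (
    order_item_id INT         NOT NULL,
    part_number   VARCHAR(20) NOT NULL,
    quantity      INT         NOT NULL,
    PRIMARY KEY (order_item_id, part_number)
);

-- The question's three sample rows, one row per part.
INSERT INTO order_item_parts (order_item_id, part_number, quantity) VALUES
    (1, 'T101', 1), (1, 'T103', 1),
    (2, 'T103', 3), (2, 'T102', 3),
    (3, 'T101', 1), (3, 'T102', 1), (3, 'T103', 1);

-- With this layout the report is a plain aggregate, no string splitting needed.
SELECT part_number, SUM(quantity) AS quantity
FROM order_item_parts
GROUP BY part_number
ORDER BY part_number;
```

With the data stored this way, the totals come from a single `GROUP BY` rather than from a numbers table and nested `SUBSTRING_INDEX` calls.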

Sometimes, though, we are stuck with other people's really bad design decisions. You can modify the query to include the quantity:

SELECT value, SUM(total_count), SUM(total_quantity)
FROM (SELECT COUNT(*) as total_count, SUM(quantity) as total_quantity,
             REPLACE(REPLACE(REPLACE(x.value,'?',''),'.',''),'!','') as value
      FROM (SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(oi.part_numbers, ' ', n.n), ' ', -1) as value,
                   oi.quantity
            FROM order_items oi CROSS JOIN
                 (SELECT d1.N + d2.N * 10 + 1 n
                  FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
                       ) d1 CROSS JOIN
                       (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
                       ) d2
                 ) n
            WHERE n.n <= 1 + LENGTH(oi.part_numbers) - LENGTH(REPLACE(oi.part_numbers, ' ', ''))
           ) x
      GROUP BY x.value
     ) y
GROUP BY value ;

The only real change I made to the query was to include quantity in the various subqueries.

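Other changes:

  • Removed the comma in the FROM clause. I abhor such commas. Be explicit; use CROSS JOIN.
  • Named the subqueries something more meaningful than a and b. I use d1 and d2, which to me mean "digit 1" and "digit 2".
  • Similarly, order_items uses the alias oi rather than t.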
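One thing worth checking against the sample data: the values there are separated by "; " (semicolon plus space), while the queries split on a single space, which would leave a trailing semicolon on every part number except the last. The following is only a sketch of one way to handle that, assuming the delimiter is always exactly "; " and that no row holds more than 100 parts. It splits on the two-character delimiter and sums the quantity directly:

```sql
-- Sketch: split part_numbers on '; ' and total the quantity per part.
-- Assumes the delimiter is always exactly '; ' and at most 100 parts per row.
SELECT x.part_number, SUM(x.quantity) AS quantity
FROM (SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(oi.part_numbers, '; ', n.n), '; ', -1) AS part_number,
             oi.quantity
      FROM order_items oi CROSS JOIN
           (SELECT d1.N + d2.N * 10 + 1 AS n   -- numbers 1..100
            FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
                  UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
                 ) d1 CROSS JOIN
                 (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
                  UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9
                 ) d2
           ) n
      -- The delimiter is two characters long, so divide the length
      -- difference by 2 to get the number of delimiters in the row.
      WHERE n.n <= 1 + (LENGTH(oi.part_numbers) - LENGTH(REPLACE(oi.part_numbers, '; ', ''))) / 2
     ) x
GROUP BY x.part_number
ORDER BY x.part_number;
```

On the three sample rows from the question this should give T101 = 2, T102 = 4 and T103 = 5. If the real data mixes delimiters (for example ";" without a space), the split would need to be adjusted accordingly.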