MySQL with GROUP BY gives incorrect results

Time: 2015-08-31 16:35:43

Tags: mysql

Here is the SQLFiddle with the schema and data.

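I'm trying to sum 2 columns, one at the parent level and the other at the child level. The query I'm currently using gives me the correct sum at the child level, but the sum at the parent level is double-counted because the child level involves another one-to-many relationship.

Ugh... that's a terrible explanation, so here it is in plain English: salesman Joe was involved in 2 sales. For the first sale, he received 2 sets of commission, based on 2 different commission types. I'm trying to show Joe's total sale value alongside the total value of the splits that apply to him. The split total is fine, but the sale value is double-counted because I'm obviously grouping/joining incorrectly (see the last example below).

This works fine: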

select sp.person_name, pr.description,
sum(spl.split) as SplitValue
from sale s, product pr, sales_person sp, sales_split spl
where s.product_id = pr.id
and s.id = spl.sale_id
and sp.id = spl.sales_person_id
group by sp.id;

person_name | description | SplitValue
----------- | ----------- | ----------
Joe         | Widget 1    | 50
Sam         | Widget 1    | 10

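This also produces the correct split and sale values, but it now shows 3 rows for Joe (i.e. row 2 is a duplicate of row 1). I only want to show Joe's sale of "Widget 1" once, so this is not correct: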

select sp.person_name, pr.description,
sum(s.sale_value) as SaleValue, sum(spl.split) as SplitValue
from sale s, product pr, sales_person sp, sales_split spl, sales_split_agreement ssa
where s.id = spl.sale_id
and s.product_id = pr.id
and sp.id = spl.sales_person_id
and sp.id = ssa.sales_person_id
and spl.sales_person_id = ssa.sales_person_id
and ssa.id = spl.sales_split_agreement_id
group by sp.id, spl.id;

person_name | description | SplitValue | SaleValue
----------- | ----------- | ---------- | ---------
Joe         | Widget 1    | 10         | 20
Joe         | Widget 1    | 10         | 20
Joe         | Widget 2    | 30         | 30
Sam         | Widget 1    | 10         | 20

Now the duplicate row is gone, but Joe's SaleValue is incorrect: it should be 50, not 70.

select sp.person_name, pr.description,
sum(spl.split) as SplitValue, sum(s.sale_value) as SaleValue
from sale s, product pr, sales_person sp, sales_split spl, sales_split_agreement ssa
where s.id = spl.sale_id
and s.product_id = pr.id
and sp.id = spl.sales_person_id
and sp.id = ssa.sales_person_id
and spl.sales_person_id = ssa.sales_person_id
and ssa.id = spl.sales_split_agreement_id
group by sp.id;


person_name | description | SplitValue | SaleValue
----------- | ----------- | ---------- | ---------
Joe         | Widget 1    | 50         | 70
Sam         | Widget 1    | 10         | 20

In other words, I'm after a query that produces this result (i.e. with the correct SaleValue of 50 for Joe):

person_name | description | SplitValue | SaleValue
----------- | ----------- | ---------- | ---------
Joe         | Widget 1    | 50         | 50
Sam         | Widget 1    | 10         | 20

Any help would be greatly appreciated!

Update 1:

For clarity, here are the schema and test data from the fiddle:

CREATE TABLE product
    (`id` int, `description` varchar(12))
;

INSERT INTO product
    (`id`, `description`)
VALUES
    (1, 'Widget 1'),
    (2, 'Widget 2')
;


CREATE TABLE sales_person
    (`id` int, `person_name` varchar(7))
;

INSERT INTO sales_person
    (`id`, `person_name`)
VALUES
    (1, 'Joe'),
    (2, 'Sam')
;


CREATE TABLE sale
    (`id` int, `product_id` int, `sale_value` int)
;

INSERT INTO sale
    (`id`, `product_id`, `sale_value`)
VALUES
    (1, 1, 20.00),
    (2, 2, 30.00)
;

CREATE TABLE split_type
    (`id` int, `description` varchar(6))
;

INSERT INTO split_type
    (`id`, `description`)
VALUES
    (1, 'Type 1'),
    (2, 'Type 2')
;

CREATE TABLE sales_split_agreement
    (`id` int, `sales_person_id` int, `split_type_id` int, `percentage` int)
;

INSERT INTO sales_split_agreement
    (`id`, `sales_person_id`, `split_type_id`, `percentage`)
VALUES
    (1, 1, 1, 50),
    (2, 1, 2, 50),
    (3, 2, 1, 50),
    (4, 1, 1, 100)
;


CREATE TABLE sales_split
    (`id` int, `sale_id` int, `sales_split_agreement_id` int, `sales_person_id` int, `split` int )
;

INSERT INTO sales_split
    (`id`, `sale_id`, `sales_split_agreement_id`, `sales_person_id`, `split`)
VALUES
    (1, 1, 1, 1, 10),
    (2, 1, 2, 1, 10),
    (3, 1, 3, 2, 10),
    (4, 2, 4, 1, 30)
;

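1 answer:

Answer 0: (score: 2)

I think you were on the right track, but I decided to start over from scratch. Getting the SplitValue for each person doesn't require all of those tables. In fact, all you need is sales_split and sales_person, like this: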

SELECT sp.person_name, SUM(ss.split) AS SplitValue
FROM sales_person sp
JOIN sales_split ss ON sp.id = ss.sales_person_id
GROUP BY sp.id, sp.person_name;

Similarly, you can get the total sale value for each person with a join between sale, sales_split, and sales_person:

SELECT sp.person_name, SUM(s.sale_value) AS SaleValue
FROM sale s
JOIN sales_split ss ON ss.sale_id = s.id
JOIN sales_person sp ON sp.id = ss.sales_person_id
GROUP BY sp.id, sp.person_name;

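At this point, I realized your expected results looked wrong (for this data set). With this join, Joe's sale value comes out as 70, because his sales_split rows 1 (sale value 20), 2 (sale value 20), and 4 (sale value 30) add up to 70. Still, I think this query will serve you better than the one you have.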
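To see where the 70 comes from, it can help to look at the joined rows before any aggregation. The following is only a minimal sketch, assuming the schema and test data from the question's Update 1 are loaded; the column aliases are illustrative and not part of the original queries.

```sql
-- Assumes the question's schema and test data (Update 1) are loaded.
-- Lists Joe's joined rows before any SUM is applied.
SELECT ss.id AS split_id, ss.sale_id, s.sale_value, ss.split
FROM sales_split ss
JOIN sale s ON s.id = ss.sale_id
WHERE ss.sales_person_id = 1   -- Joe
ORDER BY ss.id;

-- split_id | sale_id | sale_value | split
-- 1        | 1       | 20         | 10
-- 2        | 1       | 20         | 10
-- 4        | 2       | 30         | 30
```

Sale 1 appears twice because Joe has two split rows for it, so summing sale_value over these rows counts its 20 twice (20 + 20 + 30 = 70).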

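At this point, you can get the values for each sales_person_id by joining these two subqueries to the sales_person table. I dropped the join to sales_person inside the subqueries because it is no longer needed there. It even makes the subqueries a little cleaner: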

SELECT sp.person_name, COALESCE(t1.SplitValue, 0) AS SplitValue, COALESCE(t2.SaleValue, 0) AS SaleValue
FROM sales_person sp
LEFT JOIN(
  SELECT ss.sales_person_id, SUM(ss.split) AS SplitValue
  FROM sales_split ss
  GROUP BY ss.sales_person_id) t1 ON t1.sales_person_id = sp.id
LEFT JOIN(
  SELECT ss.sales_person_id, SUM(s.sale_value) AS SaleValue
  FROM sale s
  JOIN sales_split ss ON ss.sale_id = s.id
  GROUP BY ss.sales_person_id) t2 ON t2.sales_person_id = sp.id;

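Here is an SQL Fiddle example.

EDIT: I now understand why Joe's actual sale value is 50: he has two splits on sale id 1. To account for this, I first got a distinct list of sales for each sales person, like this: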

SELECT DISTINCT sale_id, sales_person_id
FROM sales_split;

This way, there is only one row for sales_person_id = 1 and sale_id = 1. Then it's easy to join that to the sale table and get the correct sale value for each sales_person:

SELECT t.sales_person_id, SUM(s.sale_value) AS SaleValue
FROM(
  SELECT DISTINCT sale_id, sales_person_id
  FROM sales_split) t
JOIN sale s ON s.id = t.sale_id
GROUP BY t.sales_person_id;
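As a quick sanity check, one way to spot which sales people are affected by this kind of double counting is to compare the number of split rows with the number of distinct sales per person. This is a hedged sketch against the question's test data, not part of the original answer; the aliases split_rows and distinct_sales are made up for illustration.

```sql
-- Assumes the question's schema and test data (Update 1) are loaded.
-- If split_rows > distinct_sales, a plain join to sale would count
-- at least one sale more than once for that person.
SELECT sales_person_id,
       COUNT(*) AS split_rows,
       COUNT(DISTINCT sale_id) AS distinct_sales
FROM sales_split
GROUP BY sales_person_id
ORDER BY sales_person_id;

-- sales_person_id | split_rows | distinct_sales
-- 1               | 3          | 2
-- 2               | 1          | 1
```

Joe (id 1) has 3 split rows over 2 distinct sales, which is why his sale total needs the DISTINCT step, while Sam (id 2) is unaffected.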

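The rest of my answer above still applies. I wrote one query to get the SplitValue and another to get the SaleValue, and I joined them together. So all I have to do now is replace the incorrect subquery further up with the one I just gave you: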

SELECT sp.person_name, COALESCE(t1.SplitValue, 0) AS SplitValue, COALESCE(t2.SaleValue, 0) AS SaleValue
FROM sales_person sp
LEFT JOIN(
  SELECT ss.sales_person_id, SUM(ss.split) AS SplitValue
  FROM sales_split ss
  GROUP BY ss.sales_person_id) t1 ON t1.sales_person_id = sp.id
LEFT JOIN(
  SELECT t.sales_person_id, SUM(s.sale_value) AS SaleValue
  FROM(
    SELECT DISTINCT sale_id, sales_person_id
    FROM sales_split) t
  JOIN sale s ON s.id = t.sale_id
  GROUP BY t.sales_person_id) t2 ON t2.sales_person_id = sp.id;

Here is the updated SQL Fiddle.
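For comparison, the same totals can probably also be reached with a single derived table that first collapses sales_split to one row per (sales person, sale). This is an alternative sketch rather than the query from the answer above, it again assumes the question's schema and test data, and the aliases x and split_total are hypothetical.

```sql
-- Assumes the question's schema and test data (Update 1) are loaded.
-- Step 1 (derived table x): one row per person and sale, with that
--         person's splits on the sale already summed.
-- Step 2: join each of those rows to its sale exactly once, then total.
SELECT sp.person_name,
       SUM(x.split_total) AS SplitValue,
       SUM(s.sale_value)  AS SaleValue
FROM sales_person sp
JOIN (
  SELECT sales_person_id, sale_id, SUM(split) AS split_total
  FROM sales_split
  GROUP BY sales_person_id, sale_id) x ON x.sales_person_id = sp.id
JOIN sale s ON s.id = x.sale_id
GROUP BY sp.id, sp.person_name
ORDER BY sp.id;

-- person_name | SplitValue | SaleValue
-- Joe         | 50         | 50
-- Sam         | 10         | 20
```

Note one difference in behavior: because this sketch uses inner joins, a sales person with no rows in sales_split is omitted, whereas the LEFT JOIN version above returns them with zeros.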

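You mentioned in the comments that you shortened your data for brevity, which is fine. I'm leaving my joins as they are, and I trust that this gives you enough direction to adjust them to match your real structure.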