SQL Server: Group similar sales together

Time: 2013-04-09 10:34:43

Tags: sql sql-server group-by sql-server-2012 aggregate-functions

I'm trying to build some reports in SQL Server. Here is the basic table setup:

Order (ID, DateCreated, Status)

Product (ID, Name, Price)

Order_Product_Mapping (OrderID, ProductID, Quantity, Price, DateOrdered)

I want to create a report that groups products with similar sales over a period of time:

Sales over 1 month:

  1. Coca, Pepsi, Tiger: $20000 avg (Coca: $21000, Pepsi: $19000, Tiger: $20000)
  2. Bread, Meat: $10000 avg (Bread: $11000, Meat: $9000)

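Note that the text in () is only for clarification and is not part of the report. The user defines how much variation between sales still counts as similar. For example, sales that differ by less than 5% are considered similar and should be grouped together. The time period is also user-defined.

I can calculate the total sales over a period, but I have no idea how to group products by how much their sales differ. I'm using SQL Server 2012. Any help is appreciated.

Sorry, my English is not very good :)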

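Update: I figured out what I really need ;)

For a known array of numbers such as: 1, 2, 3, 50, 52, 100, 102, 105

I need to split them into groups of at least 3 numbers, where the difference between any two items in a group is less than 10.

For the array above, the output should be:

[1, 2, 3]

[100, 102, 105]

=> The algorithm takes 3 parameters: the array, the minimum number of items needed to form a group, and the maximum difference between 2 items.

How can I implement it in C#?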

4 Answers:

Answer 0: (score: 1)

I can't believe I did it~~~

-- The threshold is the key to this query:
-- if two neighbouring totals differ by no more than the threshold,
-- they belong to the same group.
-- In your case, I think it is 200.
DECLARE @th int
SET @th = 200

-- Very simple: calculate the total sales per product for a time range.
-- (Price and Quantity both live in Order_Product_Mapping.)
;WITH totals AS (
  SELECT p.name AS col, SUM(op.price * op.quantity) AS val
  FROM order_product_mapping op
  JOIN [order] o ON o.id = op.orderid
  JOIN product p ON p.id = op.productid
  WHERE op.dateordered >= '2013-03-01' AND op.dateordered < '2013-04-01'
  GROUP BY p.name
),
-- Give each row a row number, highest total first.
cte_rn AS (
  SELECT col, val, ROW_NUMBER() OVER (ORDER BY val DESC) AS rn
  FROM totals
),
-- The show starts now.
-- First, let each row know the number of the row before it
-- (the first row points at itself).
cte_last_rn AS (
  SELECT col, val, rn, CASE WHEN rn = 1 THEN 1 ELSE rn - 1 END AS lrn
  FROM cte_rn
),
-- Then join each row to the row before it and compare the two totals:
-- 1 if the difference is more than the threshold (a new group starts here),
-- otherwise 0.
cte_range AS (
  SELECT
    c1.col, c1.val,
    CASE
      WHEN c2.val - c1.val <= @th THEN 0
      ELSE 1
    END AS [range],
    c1.rn
  FROM cte_last_rn c1
  JOIN cte_rn c2 ON c1.lrn = c2.rn
),
-- Even trickier here:
-- join the last CTE to itself and, for each row, sum the 0/1 flags
-- of that row and of all rows before it. The running sum is the group number.
cte_rank AS (
  SELECT c1.col, c1.val, SUM(c2.[range]) AS [rank]
  FROM cte_range c1
  JOIN cte_range c2 ON c1.rn >= c2.rn
  GROUP BY c1.col, c1.val
)
-- Now the totals are properly grouped, and we can group on the rank.
SELECT
  AVG(c1.val) AS [AVG],
  (
    SELECT c2.col + ', ' AS 'data()'
    FROM cte_rank c2
    WHERE c2.[rank] = c1.[rank]
    ORDER BY c2.val DESC
    FOR XML PATH('')
  ) AS product,
  (
    SELECT CAST(c2.val AS nvarchar(MAX)) + ', ' AS 'data()'
    FROM cte_rank c2
    WHERE c2.[rank] = c1.[rank]
    ORDER BY c2.val DESC
    FOR XML PATH('')
  ) AS price
FROM cte_rank c1
GROUP BY c1.[rank]
-- keep only groups with at least 3 products
HAVING COUNT(1) > 2

The result looks like this:

AVG     PRODUCT     PRICE
28      A, B, C     30, 29, 27
12      D, E, F     15, 12, 10
3       G, H, I     4, 3, 2

To understand how I did the concatenation, read: Concatenate many rows into a single text string?
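The flag-and-running-sum idea behind the query may be easier to follow outside SQL. Below is a minimal Python sketch of the same logic, not a translation of the query itself. The function name `group_by_gaps`, the sample totals, and the threshold of 3 are assumptions chosen only for illustration.

```python
from itertools import accumulate


def group_by_gaps(totals, threshold, min_items=3):
    """Group (name, value) pairs whose neighbouring values differ by <= threshold."""
    # Sort by value, highest first, like ROW_NUMBER() OVER (ORDER BY val DESC).
    rows = sorted(totals, key=lambda r: r[1], reverse=True)
    # Flag 1 where the drop from the previous row exceeds the threshold.
    flags = [0] + [
        1 if prev[1] - cur[1] > threshold else 0
        for prev, cur in zip(rows, rows[1:])
    ]
    # A running sum of the flags gives every row its group number.
    group_ids = list(accumulate(flags))
    groups = {}
    for (name, value), gid in zip(rows, group_ids):
        groups.setdefault(gid, []).append((name, value))
    # Keep only groups that are large enough (HAVING COUNT(1) > 2 in the query).
    return [g for g in groups.values() if len(g) >= min_items]


# Hypothetical totals, assumed for this example only.
sample = [("A", 30), ("B", 29), ("C", 27), ("D", 15), ("E", 12),
          ("F", 10), ("G", 4), ("H", 3), ("I", 2), ("X", 100)]
for g in group_by_gaps(sample, threshold=3):
    print([name for name, _ in g], [value for _, value in g])
```

Note that this compares each row only with its immediate neighbour, so a long chain of small steps can still end up in one group even if its first and last members are far apart.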

Answer 1: (score: 1)

By the way, if you just want C#:

using System.Collections.Generic;
using System.Linq;

var maxDifference = 10;
var minItems = 3;

// I assume your list is not ordered, so order it first
var array = new List<int> { 3, 2, 50, 1, 51, 100, 105, 102 }.OrderBy(a => a).ToList();

var result = new List<List<int>>();
var group = new List<int>();
var lastNum = array.First();
var totalDiff = 0; // distance from the first number of the current group
foreach (var n in array)
{
    totalDiff += n - lastNum;

    // if the distance between the current number and the first number of the
    // current group is within maxDifference, add it to the current group
    // (use < instead of <= if the limit must be strict)
    if (totalDiff <= maxDifference)
    {
        group.Add(n);
        lastNum = n;
        continue;
    }

    // if the current group has minItems items or more, add it to the final result
    if (group.Count >= minItems)
        result.Add(group);

    // start a new group
    group = new List<int> { n };
    lastNum = n;
    totalDiff = 0;
}

// don't forget the last group
if (group.Count >= minItems)
    result.Add(group);

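The key point here is that the array must be sorted, so you don't have to jump around or store extra values to compute the distances.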
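For comparison, here is a minimal sketch of the same sorted sweep as a reusable function taking the three parameters from the question. It is written in Python for brevity; the name `group_close_numbers` is hypothetical, and the `<=` comparison mirrors the code above rather than the strict "less than" in the question.

```python
def group_close_numbers(numbers, min_items, max_difference):
    """Split numbers into runs whose overall spread is <= max_difference."""
    groups = []
    current = []
    for n in sorted(numbers):
        # current[0] is the smallest value in the group, so n - current[0]
        # is the largest difference that n would create within it.
        if current and n - current[0] > max_difference:
            # The current group is closed; keep it only if it is big enough.
            if len(current) >= min_items:
                groups.append(current)
            current = []
        current.append(n)
    # The last group is still open when the loop ends.
    if len(current) >= min_items:
        groups.append(current)
    return groups


print(group_close_numbers([3, 2, 50, 1, 51, 100, 105, 102], 3, 10))
# prints [[1, 2, 3], [100, 102, 105]]
```

Because the input is sorted, comparing each number with the first item of the current group is enough to bound the difference between any two items in that group.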

Answer 2: (score: 0)

This query should produce what you expect; it shows the sales of each product for every month in which it was ordered:

SELECT CONVERT(CHAR(4), OP.DateOrdered, 100) + CONVERT(CHAR(4), OP.DateOrdered, 120) AS Month,
       Product.Name,
       AVG(OP.Quantity * OP.Price) AS Turnover
FROM Order_Product_Mapping OP
INNER JOIN Product ON Product.ID = OP.ProductID
GROUP BY CONVERT(CHAR(4), OP.DateOrdered, 100) + CONVERT(CHAR(4), OP.DateOrdered, 120),
         Product.Name

Untested, but if you provide sample data I can work with it.

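Answer 3: (score: 0)

It looks like I was overcomplicating things. Here is what should solve the problem:

- Run a query to get the sales for each product.

- Run k-means or a similar algorithm.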
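As a rough illustration of the second step, here is a minimal one-dimensional k-means sketch (Lloyd's algorithm) in Python. The function name `kmeans_1d`, the way the starting centres are seeded, and the sample totals (taken from the example in the question) are assumptions. Note that k-means needs the number of groups `k` up front and does not enforce a maximum difference inside a group.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on a list of numbers; returns k lists (clusters)."""
    data = sorted(values)
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of values")
    # Assumption: seed the centres with evenly spaced items of the sorted data.
    centres = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Assignment step: put every value into the cluster of its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its old centre).
        updated = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
        if updated == centres:
            break  # converged: no centre moved
        centres = updated
    return clusters


# Monthly totals from the example in the question.
print(kmeans_1d([21000, 19000, 20000, 11000, 9000], k=2))
# prints [[9000, 11000], [19000, 20000, 21000]]
```

The cluster means here are 10000 and 20000, which matches the averages shown in the question's sample report.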