Question

I have a dataset like this:

id     firstevent   allevents
1       apple       apple, orange
1       apple       apple
1       orange      orange,apple
2       orange      orange,apple
2       orange      orange,apple
3       apple       apple
4       banana      banana,orange, apple
4       orange      orange, apple
4       apple       apple

I'm using STRING_AGG to concatenate all the values for each id, with the following query:

SELECT id,
       STRING_AGG(FirstEvent, ';') AS FirstEvent,
       STRING_AGG(allevents, ';') AS allEvents
FROM mProcessingTime
GROUP BY id

My output looks like this:

id      FirstEvent                allevents
1       apple; apple; orange      apple, orange; apple; orange,apple
2       orange;orange             orange,apple; orange,apple
3       apple                     apple
4       banana; apple; orange     banana,orange, apple; orange, apple; apple

I want to change this output so that each aggregated list contains only distinct values. My expected output is:

id      FirstEvent                 allevents
1       apple; orange             apple, orange; apple; orange,apple
2       orange                    orange,apple
3       apple                     apple
4       banana; apple; orange     banana,orange, apple; orange, apple; apple

I tried using DISTINCT inside the STRING_AGG function, but it doesn't work.

Can you help me?

Edit: added extra information to give a clearer picture.

Answer 1

I'll illustrate the solution with one column, but you get the idea:

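select id,
       -- plain aggregation: duplicates are kept
       STRING_AGG(FirstEvent, ';') as FirstEvent,
       -- rw = 1 only on the first row of each (id, firstevent) pair, so each
       -- distinct value is aggregated once; STRING_AGG skips the NULLs
       STRING_AGG(case when rw = 1 then FirstEvent end, ';') as allevents
from (
    select *, row_number() over (partition by id, firstevent order by id) rw
    from xx  -- xx stands for the source table (mProcessingTime in the question)
) t
group by t.id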
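
To get the expected output from the question, the same ROW_NUMBER idea can be applied to each column separately. The following is only a sketch, assuming SQL Server 2017 or later (where STRING_AGG is available) and the table and column names from the question; the aliases rw_first and rw_all are made up for illustration. The order of the values inside each list is not guaranteed unless you add WITHIN GROUP (ORDER BY ...).

```sql
-- Sketch: de-duplicate each column independently with its own ROW_NUMBER.
-- Assumes the question's table mProcessingTime(id, firstevent, allevents).
SELECT t.id,
       -- keep a value only on the first row of its (id, firstevent) group
       STRING_AGG(CASE WHEN t.rw_first = 1 THEN t.firstevent END, ';') AS FirstEvent,
       -- keep a value only on the first row of its (id, allevents) group
       STRING_AGG(CASE WHEN t.rw_all = 1 THEN t.allevents END, ';') AS allevents
FROM (
    SELECT id,
           firstevent,
           allevents,
           ROW_NUMBER() OVER (PARTITION BY id, firstevent ORDER BY id) AS rw_first,
           ROW_NUMBER() OVER (PARTITION BY id, allevents ORDER BY id) AS rw_all
    FROM mProcessingTime
) AS t
GROUP BY t.id;
```

Because STRING_AGG ignores NULL inputs, the rows where the CASE expression yields NULL simply drop out of the list. Note that 'orange,apple' and 'orange, apple' differ by a space, so they count as two distinct values.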

Answer 2

One option is to use DISTINCT in a subquery, like this:

SELECT top 10 id,
       STRING_AGG(FirstEvent, ';') as FirstEvent
from (select distinct id, firstevent from mProcessingTime) t
GROUP BY id
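
A single DISTINCT subquery covers only one column, because DISTINCT over (id, firstevent, allevents) together would keep rows that repeat a firstevent value. One possible way to handle both columns, sketched here under the assumption of SQL Server 2017 or later and the question's table, is to de-duplicate and aggregate each column in its own CTE and then join the results on id. The CTE names first_events and all_events are hypothetical.

```sql
-- Sketch: one DISTINCT subquery per column, aggregated separately, then joined.
-- Assumes the question's table mProcessingTime(id, firstevent, allevents).
WITH first_events AS (
    SELECT id, STRING_AGG(firstevent, ';') AS FirstEvent
    FROM (SELECT DISTINCT id, firstevent FROM mProcessingTime) AS d
    GROUP BY id
),
all_events AS (
    SELECT id, STRING_AGG(allevents, ';') AS allevents
    FROM (SELECT DISTINCT id, allevents FROM mProcessingTime) AS d
    GROUP BY id
)
SELECT f.id, f.FirstEvent, a.allevents
FROM first_events AS f
JOIN all_events AS a
  ON a.id = f.id
ORDER BY f.id;
```

This reads the table twice, so on a large table the ROW_NUMBER approach may be the cheaper choice; comparing the execution plans would settle it.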

Find and concatenate all distinct values in one query
