Question

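I have a table with columns A, B and C. Column A may contain duplicates.

I need a query that gives me a result set with unique values in column A, and I don't care which of the duplicate rows it picks.

I know nothing about the rest of the data beforehand.

An example might be: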

A    B    C
1    8    8
1    7    7
2    10   10

In this case, I want to select:

A    B    C
1    x    x
2    10   10

x = it doesn't matter which value is picked.

Kind regards,

Matthias Vance

Edit

I thought I had found a solution:

SELECT * FROM (
   SELECT * FROM test GROUP BY a
) table_test;

But that doesn't work after all.

It results in:

[Microsoft][ODBC Excel Driver] Cannot group on fields selected with '*'

Answer 1

Wouldn't this simple query work?

SELECT A, MIN(B), MIN(C) FROM test GROUP BY A

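It groups by A and selects only the minimum of B and the minimum of C within each group of A. The values of B and C may come from different rows, for example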

A  B  C
1  2  3
1  4  1

would return

A  B  C
1  2  1
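
To make the point concrete, here is a minimal sketch that replays the example above. It assumes an in-memory SQLite database as a stand-in for the real data source (the question uses the ODBC Excel driver, which may behave differently); the table name test is taken from the question.

```python
import sqlite3

# In-memory SQLite table standing in for the real data source (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (A INTEGER, B INTEGER, C INTEGER)")
source_rows = [(1, 2, 3), (1, 4, 1)]
conn.executemany("INSERT INTO test VALUES (?, ?, ?)", source_rows)

# MIN(B) and MIN(C) are computed independently of each other.
rows = conn.execute("SELECT A, MIN(B), MIN(C) FROM test GROUP BY A").fetchall()
print(rows)                    # [(1, 2, 1)]
print(rows[0] in source_rows)  # False: this combination is not a real row
```

Whether that matters depends on whether B and C have to belong together.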

Answer 2

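The hard part is getting b and c from the same row. The following query uses a subquery to eliminate the rows whose b or c value is not the lowest. It joins the table to itself and states that no row with a lower b or c may exist. The "not" is implemented by prev.a is null in the WHERE clause.

The subquery is called semiunique because there may still be duplicate rows with identical b and c. The outer query takes care of those with GROUP BY. Since b and c are identical in those rows, it doesn't matter which one we choose, so we can use min() to pick one.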

select a, min(b), min(c)
from (
    select cur.a, cur.b, cur.c
    from YourTable cur
    left outer join YourTable prev
        on cur.a = prev.a
        and (cur.b > prev.b
            or (cur.b = prev.b and cur.c > prev.c))
    where prev.a is null
) semiunique
group by semiunique.a
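
As a rough check of the query above, the following sketch runs it against an in-memory SQLite database. SQLite and the sample rows are assumptions made for illustration; the table name YourTable comes from the query. The data includes an exact duplicate of the lowest row, which is the case the outer GROUP BY is there for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (a INTEGER, b INTEGER, c INTEGER)")
# Hypothetical sample data: (1, 7, 9) appears twice on purpose.
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 8, 8), (1, 7, 9), (1, 7, 9), (2, 10, 10)],
)

query = """
select a, min(b), min(c)
from (
    select cur.a, cur.b, cur.c
    from YourTable cur
    left outer join YourTable prev
        on cur.a = prev.a
        and (cur.b > prev.b
            or (cur.b = prev.b and cur.c > prev.c))
    where prev.a is null
) semiunique
group by semiunique.a
order by a
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 7, 9), (2, 10, 10)]
```

A plain min(b), min(c) over the same data would give (1, 7, 8), which is not a row in the table.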

Based on your comment, a simpler version can grab "something" for b and c:

select a, min(b), min(c)
from YourTable
group by a

Answer 3

This works on SQL Server 2008 and illustrates the concept. You need a unique column.

declare @temp as table (
id int identity(1,1),
a int,
b int, 
c int)

insert into @temp
    select 1 as A, 8 as B, 8 as C
    union
    select 1, 7, 7
    union 
    select 2, 10, 10

select a, b, c from @temp
where id in (select MAX(id) from @temp
group by a)

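Seeing that you are using Excel, I would use the same principle. Add another column to the spreadsheet and make sure it is unique. Use that column as your ID column.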
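
The same principle can be tried outside SQL Server. This is a small sketch that assumes SQLite instead, where an INTEGER PRIMARY KEY column plays the role of the identity column; the table name t is made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is filled in automatically by SQLite (1, 2, 3, ...) when omitted.
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER, c INTEGER)"
)
conn.executemany(
    "INSERT INTO t (a, b, c) VALUES (?, ?, ?)",
    [(1, 8, 8), (1, 7, 7), (2, 10, 10)],
)

# Keep only the row with the highest id for each value of a.
rows = conn.execute(
    "SELECT a, b, c FROM t "
    "WHERE id IN (SELECT MAX(id) FROM t GROUP BY a) "
    "ORDER BY a"
).fetchall()
print(rows)  # [(1, 7, 7), (2, 10, 10)]
```

Because each id identifies exactly one row, b and c always come from the same row.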

Answer 4

Try this:

select A, B, C
from test x
where not exists (select *
                  from test y
                  where y.A = x.A
                        and (y.B < x.B or (y.B = x.B and y.C < x.C)))
order by A

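But since it contains a correlated subquery, it may be slow. (OTOH, at least in theory, the database engine could optimize it into what I show below.)

What about doing something outside of SQL? What are you going to do with the results?

If you are going to process them with some program, why not just get: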

select A, B, C from test order by A, B, C

and then do the following:

prev_a = None
for a, b, c in get_query_result():
    if a != prev_a:
        prev_a = a
        yield (a, b, c)

in your application?
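
For reference, a self-contained variant of that loop might look like the sketch below. The function name first_row_per_a and the hard-coded list (standing in for the sorted query result) are hypothetical; a seen_any flag is used so that a NULL/None value of a in the first row is not skipped.

```python
def first_row_per_a(rows):
    """Yield the first row for each value of a; rows must be sorted by a."""
    seen_any = False
    prev_a = None
    for a, b, c in rows:
        if not seen_any or a != prev_a:
            seen_any = True
            prev_a = a
            yield (a, b, c)


# Stand-in for the result of: select A, B, C from test order by A, B, C
sorted_rows = [(1, 7, 7), (1, 8, 8), (2, 10, 10)]
unique_rows = list(first_row_per_a(sorted_rows))
print(unique_rows)  # [(1, 7, 7), (2, 10, 10)]
```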

I don't know PHP, but I guess it would be something like this:

$query = "SELECT a,b,c FROM test ORDER BY a,b,c";
$result = odbc_exec($connect, $query);
$prev_a = NULL;  # I don't know what you would normally use here in PHP
while (odbc_fetch_row($result)) {
  $a = odbc_result($result, 1);
  if (is_null($prev_a) or $a != $prev_a) { 
    $b = odbc_result($result, 2);
    $c = odbc_result($result, 3);
    print("A = $a, B = $b, C = $c\n");
    $prev_a = $a;
  }
}

Answer 5

Select A
    , Max(b) -- Since you don't care about the value
    , Max(c) -- Since you don't care about the value
From table t
Group By A

Answer 6

All rows with a unique value in A

SELECT * FROM table t1 INNER JOIN
(SELECT A FROM table GROUP BY A HAVING COUNT(A) = 1) as t2 
ON t1.A = t2.A

I don't understand what you mean by "one of the rows that has a duplicate value in A". Can you explain?

Using your example, in MySQL executing

SELECT * FROM table GROUP BY A

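would give you the desired result (this relies on MySQL's non-standard GROUP BY handling, which is unavailable when ONLY_FULL_GROUP_BY is enabled):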

A    B    C
1    8    8
2    10   10

Answer 7

-- All rows that are unique in column A
select *
from table
where col_a in (select col_a from table group by col_a having count(*)=1);

-- One row per dupe
-- (as written, max(col_a) within each col_a group is col_a itself,
--  so every row of a duplicated value is returned)
select *
from table
where col_a in (select max(col_a) from table group by col_a having count(*)>1);

Answer 8

Another option is to use the ROW_NUMBER() function. I'm not sure whether it works with the ODBC Excel driver, though:

select a, b, c from (
select * 
, ROW_NUMBER() OVER (PARTITION BY A ORDER BY A) as RN
from @temp
) q where rn = 1
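
The window-function idea can be tried quickly with SQLite, which supports ROW_NUMBER() from version 3.25 on. This sketch assumes such a version is bundled with your Python, and it orders each partition by b, c (a choice made here, not in the query above) so that the surviving row is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (a INTEGER, b INTEGER, c INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, 8, 8), (1, 7, 7), (2, 10, 10)],
)

# Number the rows within each value of a, then keep only the first one.
query = """
SELECT a, b, c FROM (
    SELECT a, b, c,
           ROW_NUMBER() OVER (PARTITION BY a ORDER BY b, c) AS rn
    FROM test
) q
WHERE rn = 1
ORDER BY a
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(1, 7, 7), (2, 10, 10)]
```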

Answer 9

select * 
from table T 
where id = (
  select min(id) from table where a = T.a
)

UPD. But if your table has no primary key (why?), then:

select A, min(B), min(C)
from TABLE
group by A

Answer 10

I know this is a dirty way, but it will work for this case.

Pseudocode:

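create table #tmpStaging with primary key on col (A)

for each row in flatFile/excel/
begin
    begin try
        insert into #tmpStaging
    end try
    begin catch
        -- do nothing
    end catch
end

select * from #tmpStaging will give you the rows without duplicates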
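
The pseudocode above could be sketched roughly as follows. SQLite, the table name staging and the hard-coded source rows are all assumptions standing in for the temp table and the flat file or Excel sheet.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key on a makes a second insert of the same a value fail.
conn.execute(
    "CREATE TABLE staging (a INTEGER PRIMARY KEY, b INTEGER, c INTEGER)"
)

# Stand-in for the rows read from the flat file / Excel sheet.
source_rows = [(1, 8, 8), (1, 7, 7), (2, 10, 10)]
for row in source_rows:
    try:
        conn.execute("INSERT INTO staging VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        pass  # duplicate value of a: ignore this row

staged = conn.execute("SELECT a, b, c FROM staging ORDER BY a").fetchall()
print(staged)  # [(1, 8, 8), (2, 10, 10)]
```

The first row seen for each value of a wins, and later duplicates are silently dropped.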

Answer 11

This will give you the first of each duplicate

SELECT  DISTINCT
    A,
    (SELECT TOP 1 B FROM @Table tB WHERE tb.A = t.A) B,
    (SELECT TOP 1 C FROM @Table tB WHERE tb.A = t.A) C
FROM    @Table t

Answer 12

Try this,

SELECT UT.[A],
(SELECT TOP 1 B FROM [YourTable] WHERE [YourTable].A= UT.A) AS B,
(SELECT TOP 1 C FROM [YourTable] WHERE [YourTable].A= UT.A) AS C  FROM [YourTable] AS UT GROUP BY UT.[A]

I haven't tried it yet... who knows :).

Select uniques and one of the doubles
