BigQuery - filter out only unique results

Time: 2015-06-24 12:35:26

Tags: unique google-bigquery

My database looks like this:

Entry-Key     Name     Surname     Age
10a           Smith    Alex        35
11b           Finn     John        41
10a           Smith    Al          35
10c           Finn     Berta       28
11b           Fin      John        41

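I need to get the unique rows from it. A plain GROUP BY over all columns doesn't work, because the Name/Surname columns sometimes contain inaccuracies.

I'd like to group by Entry-Key only, then find the first occurrence of each key in the table and take only that row. I know how to do this in Excel, but since the database has around 100,000 rows, Excel isn't really an option.

The idea is to end up with this table: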

10a           Smith    Alex        35
11b           Finn     John        41
10c           Finn     Berta       28

Please help!

1 answer:

Answer 0 (score: 2):

Based on your logic, you could run the following query:

-- BigQuery legacy SQL: a comma between subqueries in FROM acts as UNION ALL,
-- and FIRST() takes the first value encountered within each group.
select key, first(name), first(surname), first(age) from
(select '10a' as key,           'Smith' as name,    'Alex' as surname,        35 as age),
(select '11b' as key,           'Finn' as name,     'John' as surname,        41 as age),
(select '10a' as key,           'Smith' as name,    'Al' as surname,          35 as age),
(select '10c' as key,           'Finn' as name,     'Berta' as surname,       28 as age),
(select '11b' as key,           'Fin' as name,      'John' as surname,        41 as age)
group by key

This returns:

+-----+-----+-------+-------+-----+
| Row | key |  f0_  |  f1_  | f2_ |
+-----+-----+-------+-------+-----+
|   1 | 10a | Smith | Alex  |  35 |
|   2 | 11b | Finn  | John  |  41 |
|   3 | 10c | Finn  | Berta |  28 |
+-----+-----+-------+-------+-----+
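
To sanity-check the "first occurrence per key" logic outside BigQuery, here is a minimal Python sketch run on the sample rows from the question. It is only an illustration: `first_row_per_key` is a hypothetical helper name, and "first" here simply means first in the list. In a table without an explicit ordering column, which row counts as first may not be well defined.

```python
# Sketch: keep only the first row seen for each key.
# The rows mirror the sample table from the question; first_row_per_key
# is a hypothetical helper, not part of any BigQuery API.

rows = [
    ("10a", "Smith", "Alex", 35),
    ("11b", "Finn", "John", 41),
    ("10a", "Smith", "Al", 35),
    ("10c", "Finn", "Berta", 28),
    ("11b", "Fin", "John", 41),
]


def first_row_per_key(rows):
    """Return the first row encountered for each key, in order of first appearance."""
    seen = {}
    for row in rows:
        key = row[0]
        # setdefault stores the row only if this key has not been seen yet.
        seen.setdefault(key, row)
    # Dicts preserve insertion order (Python 3.7+), so keys stay in first-seen order.
    return list(seen.values())


unique_rows = first_row_per_key(rows)
for row in unique_rows:
    print(row)
```

Later variants such as ('10a', 'Smith', 'Al', 35) are dropped because their key has already been seen, which matches the grouping by key in the query above.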