Question

In BigQuery, I want to overwrite the value of a column based on a count that is computed per group.

My current query:

UPDATE MAN_18.MAN_4y
SET actual_related_customer = customer_code
WHERE (SELECT IF(arc_count > 1,1,0) AS double_cust FROM (SELECT 
COUNT(DISTINCT(actual_related_customer)) AS arc_count
FROM MAN_18.MAN_4y
GROUP BY customer_code)
WHERE double_cust = 1)

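The desired result is as follows:

In table MAN_4y, I want to work with the columns customer_code and actual_related_customer. First, I want to know whether a customer_code has more than one distinct actual_related_customer, which means arc_count is greater than 1 for that customer_code. If so, I want to take that set of customer_code values and look them up in MAN_4y to inspect their actual_related_customer values. If a row for one of these customer_code values has an actual_related_customer that differs from its customer_code, I want to overwrite it with the current customer_code value. Put another way: for every customer_code whose arc_count > 1, write that customer_code value into actual_related_customer.

Can anyone help me?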

Answer 1

Below is how I would do it in one simple step (BigQuery Standard SQL):

#standardSQL
UPDATE `project.MAN_18.MAN_4y`
SET actual_related_customer = customer_code
WHERE customer_code IN (
  SELECT customer_code
  FROM `project.MAN_18.MAN_4y`
  GROUP BY customer_code
  HAVING COUNT(DISTINCT actual_related_customer) > 1
)

As an extremely simplified example:

If the original table looks like this:

Row     customer_code   actual_related_customer  
1       1               3    
2       1               4    
3       2               5

then after applying the UPDATE above, the table becomes:

Row     customer_code   actual_related_customer  
1       1               1    
2       1               1    
3       2               5

Unless I have misread the question, this is exactly the expected result.
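To check the logic without modifying any data, the simplified example can be reproduced as a read-only query. This is only a sketch: the `sample` CTE is a hypothetical stand-in for MAN_4y, and both columns are assumed to be integers, as in the table above.

```sql
-- Read-only preview of the UPDATE logic on the simplified example.
-- `sample` is a hypothetical inline stand-in for MAN_18.MAN_4y.
WITH sample AS (
  SELECT 1 AS customer_code, 3 AS actual_related_customer UNION ALL
  SELECT 1, 4 UNION ALL
  SELECT 2, 5
)
SELECT
  customer_code,
  actual_related_customer AS before_update,
  -- Same condition as the UPDATE's WHERE clause: the customer_code
  -- has more than one distinct actual_related_customer.
  IF(
    customer_code IN (
      SELECT customer_code
      FROM sample
      GROUP BY customer_code
      HAVING COUNT(DISTINCT actual_related_customer) > 1
    ),
    customer_code,
    actual_related_customer
  ) AS after_update
FROM sample
ORDER BY customer_code, before_update;
```

The after_update column should show 1, 1, 5, matching the updated table above. Replacing `sample` with the real table gives a preview of what the UPDATE would change before it is run.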

Answer 2

What you are trying to do cannot be done directly in BQ.

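What you need to do is write the result of the subquery to a destination table and then reference that destination table in the UPDATE statement, because the database engine does not provide a way to reference the same table in a subquery.

This results in two steps.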
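The two steps might look roughly like the sketch below. The destination table name `tmp_multi_arc` is hypothetical, and the `project` prefix is a placeholder for the real project ID; adjust both to your environment.

```sql
-- Step 1: materialize the customer codes that have more than one
-- distinct actual_related_customer.
-- `tmp_multi_arc` is a hypothetical destination table name.
CREATE OR REPLACE TABLE `project.MAN_18.tmp_multi_arc` AS
SELECT customer_code
FROM `project.MAN_18.MAN_4y`
GROUP BY customer_code
HAVING COUNT(DISTINCT actual_related_customer) > 1;

-- Step 2: update the original table, referencing only the destination table
-- in the subquery.
UPDATE `project.MAN_18.MAN_4y`
SET actual_related_customer = customer_code
WHERE customer_code IN (
  SELECT customer_code
  FROM `project.MAN_18.tmp_multi_arc`
);

-- Optional cleanup once the update has been verified.
DROP TABLE `project.MAN_18.tmp_multi_arc`;
```

Keeping the intermediate table around until the result has been checked makes it easy to see exactly which customer codes were affected.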

BigQuery query to update a field based on a grouped count condition
