"表格对于JOIN来说太大了。考虑使用JOIN EACH"但是没有连接

时间:2015-11-19 16:54:51

标签: join google-bigquery

我收到来自BigQuery的错误声明"表格对于JOIN来说太大了。考虑使用JOIN EACH",但SQL语句中没有任何显式连接。我看过另一个类似的问题,但我不明白他们给出的答案与我遇到的问题有什么关系。

'Too large to JOIN' error when I'm not using JOIN

这是SQL

SELECT
userId AS userId,
deviceId AS deviceId,
MAX(CASE WHEN event_parameters.Name = 'Result' THEN event_parameters.Value END) AS ResultValue,
MAX(CASE WHEN event_parameters.Name = 'Points' THEN event_parameters.Value END) AS _Points,
MAX(CASE WHEN event_parameters.Name = 'Time' THEN event_parameters.Value END) AS _Duration,
CASE WHEN event_name IN ('Level setA_lvl_01_onboarding Finished') THEN 1 WHEN event_name IN ('Level setA_lvl_02_onboarding Finished') THEN 2 WHEN event_name IN ('Level setA_lvl_03_onboarding Finished') THEN 3 WHEN event_name IN ('Level setA_lvl_04_onboarding Finished') THEN 4 WHEN event_name IN ('Level setA_lvl_05 Finished') THEN 5 WHEN event_name IN ('Level setA_lvl_06 Finished') THEN 6 WHEN event_name IN ('Level setA_lvl_07 Finished') THEN 7 WHEN event_name IN ('Level setB_lvl_01 Finished') THEN 8 WHEN event_name IN ('Level setB_lvl_02 Finished') THEN 9 WHEN event_name IN ('Level setB_lvl_03 Finished') THEN 10 WHEN event_name IN ('Level setB_lvl_04 Finished') THEN 11 WHEN event_name IN ('Level setB_lvl_05 Finished') THEN 12 WHEN event_name IN ('Level setC_lvl_01 Finished') THEN 13 WHEN event_name IN ('Level setC_lvl_02 Finished') THEN 14 WHEN event_name IN ('Level setC_lvl_03 Finished') THEN 15 WHEN event_name IN ('Level setC_lvl_04 Finished') THEN 16 WHEN event_name IN ('Level setC_lvl_05 Finished') THEN 17 WHEN event_name IN ('Level setA_lvl_08 Finished') THEN 18 WHEN event_name IN ('Level setA_lvl_09 Finished') THEN 19 WHEN event_name IN ('Level setA_lvl_10 Finished') THEN 20 WHEN event_name IN ('Level setA_lvl_11 Finished') THEN 21 WHEN event_name IN ('Level setA_lvl_12 Finished') THEN 22 WHEN event_name IN ('Level setA_lvl_13 Finished') THEN 23 WHEN event_name IN ('Level setA_lvl_14 Finished') THEN 24 WHEN event_name IN ('Level setB_lvl_06 Finished') THEN 25 WHEN event_name IN ('Level setB_lvl_07 Finished') THEN 26 WHEN event_name IN ('Level setB_lvl_08 Finished') THEN 27 WHEN event_name IN ('Level setB_lvl_09 Finished') THEN 28 WHEN event_name IN ('Level setB_lvl_10 Finished') THEN 29 WHEN event_name IN ('Level setB_lvl_11 Finished') THEN 30 WHEN event_name IN ('Level setC_lvl_06 Finished') THEN 31 WHEN event_name IN ('Level setC_lvl_07 Finished') THEN 32 WHEN event_name IN ('Level setC_lvl_08 Finished') THEN 33 WHEN event_name IN ('Level setC_lvl_09 Finished') THEN 34 WHEN event_name IN ('Level setC_lvl_10 Finished') THEN 35 WHEN event_name IN ('Level setC_lvl_11 Finished') THEN 36 WHEN event_name IN ('Level setC_lvl_12 Finished') THEN 37 WHEN event_name IN ('Level setA_lvl_15 Finished') THEN 38 WHEN event_name IN ('Level setA_lvl_16 Finished') THEN 39 WHEN event_name IN ('Level setA_lvl_17 Finished') THEN 40 WHEN event_name IN ('Level setA_lvl_18 Finished') THEN 41 WHEN event_name IN ('Level setA_lvl_19 Finished') THEN 42 WHEN event_name IN ('Level setA_lvl_20 Finished') THEN 43 WHEN event_name IN ('Level setA_lvl_21 Finished') THEN 44 WHEN event_name IN ('Level setB_lvl_12 Finished') THEN 45 WHEN event_name IN ('Level setB_lvl_13 Finished') THEN 46 WHEN event_name IN ('Level setB_lvl_14 Finished') THEN 47 WHEN event_name IN ('Level setB_lvl_15 Finished') THEN 48 WHEN event_name IN ('Level setB_lvl_16 Finished') THEN 49 WHEN event_name IN ('Level setB_lvl_17 Finished') THEN 50 WHEN event_name IN ('Level setB_lvl_18 Finished') THEN 51 WHEN event_name IN ('Level setC_lvl_13 Finished') THEN 52 WHEN event_name IN ('Level setC_lvl_14 Finished') THEN 53 WHEN event_name IN ('Level setC_lvl_15 Finished') THEN 54 WHEN event_name IN ('Level setC_lvl_16 Finished') THEN 55 WHEN event_name IN ('Level setC_lvl_17 Finished') THEN 56 WHEN event_name IN ('Level setC_lvl_18 Finished') THEN 57 WHEN event_name IN ('Level setC_lvl_19 Finished') THEN 58 WHEN event_name IN ('Level setC_lvl_20 Finished') THEN 59 ELSE 60 END AS _level,
FROM
TABLE_QUERY([dbset:dbname], "table_id CONTAINS 'game_table1' or table_id CONTAINS 'game_table2'  or table_id CONTAINS 'game_table3'")
WHERE
event_name LIKE 'Level%Finished'
GROUP BY
userId,
deviceId

我怀疑问题是" CASE WHEN event_name IN",因为链接的问题表明WHERE IN是一个隐式连接,所以我想知道它是否也是这种情况,或者如果LIKE声明也可能是问题的一部分。

谢谢, 布拉德

2 个答案:

答案 0 :(得分:1)

半连接语句(遵循模式WHERE ... IN (SELECT ...)的查询)有时会发生这种情况。您始终可以将半连接重写为JOIN,这通常会有所帮助,因为您可以使用JOIN EACH(没有IN EACH运算符)。

但是,你的查询虽然有IN,但它不是半连接(Mosha指出)。正如他所要求的,如果你可以分享你的工作ID,BigQuery的一位工程师可以调查发生的事情。

答案 1 :(得分:0)

我同意Mosha和Jordan我们需要jobid所以我们可以进一步调试这个问题。完全不清楚为什么您提供的查询会导致联接。

但是,我想在这里提到其他问题:通过使用连接来替换那个长CASE语句,你可能会获得更好的性能。我首先感到困惑的是你在那里使用IN而不是=。如果你做了那个转换,我认为你会得到相同的语义。

假设这是真的,你可以更好地创建一个小尺寸表,将“Level set *”字符串映射到整数,以便你可以将它连接回你的表。

希望这是有道理的。

相关问题