How do I query an array in BigQuery?

Date: 2017-07-18 18:03:50

Tags: google-bigquery

Schema in BigQuery:

field: items, type: string

The value in the items field is stored as a string: {"data": [{"id": "1234", "plan": {"sub_id": "567", "metadata": {"currentlySelling": "true", "custom_attributes": "{\"shipping\": true,\"productLimit\":10}", "Features": "[\"10 products\", \"Online support\"]"}, "name": "Personal", "object": "plan"}, "quantity": 1}], "has_more": false}

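Two questions:
1) How can I query inside the arrays, for example: is shipping true, or is one of the features "Online support"?
2) I have to store the data as a string because the "custom_attributes" value can change. Is there a better way to store data in BigQuery when the value of one of the nested keys can change?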

1 answer:

Answer 0: (score: 2)

Your query would look something like this:

#standardSQL
SELECT game
FROM YourTable
WHERE EXISTS (SELECT 1 FROM UNNEST(participant) WHERE name = 'sam');

This returns every game in which 'sam' was a participant. Here is a self-contained example:

#standardSQL
WITH YourTable AS (
  SELECT 'A' AS game, ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('tony', 12), ('julia', 12)] AS participant UNION ALL
  SELECT 'B', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('jacob', 12)] UNION ALL
  SELECT 'C', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('julia', 12)]
)
SELECT game
FROM YourTable
WHERE EXISTS (SELECT 1 FROM UNNEST(participant) WHERE name = 'sam');
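
The example above uses a game/participant table rather than the JSON string from the question. As a rough sketch, the same EXISTS ... UNNEST pattern might be applied to that string as follows. It assumes the column is named items, and that the JSON_QUERY_ARRAY, JSON_VALUE and JSON_VALUE_ARRAY functions of current BigQuery standard SQL are available (they are newer than this question). Because custom_attributes and Features are themselves JSON text stored inside string values, each is parsed a second time.

```sql
#standardSQL
-- Assumption: the column is called `items` and holds the JSON text from the question.
-- A raw string (r'''...''') keeps the \" escapes exactly as they appear in the data.
WITH YourTable AS (
  SELECT r'''{"data": [{"id": "1234", "plan": {"sub_id": "567", "metadata": {"currentlySelling": "true", "custom_attributes": "{\"shipping\": true,\"productLimit\":10}", "Features": "[\"10 products\", \"Online support\"]"}, "name": "Personal", "object": "plan"}, "quantity": 1}], "has_more": false}''' AS items
)
SELECT
  JSON_VALUE(item, '$.id') AS id,
  JSON_VALUE(item, '$.plan.name') AS plan_name
FROM YourTable,
  -- One row per element of the top-level "data" array.
  UNNEST(JSON_QUERY_ARRAY(items, '$.data')) AS item
WHERE
  -- custom_attributes is a JSON document stored as a string: extract it, then parse it again.
  JSON_VALUE(JSON_VALUE(item, '$.plan.metadata.custom_attributes'), '$.shipping') = 'true'
  OR EXISTS (
    -- Features is a JSON array stored as a string: extract it, then turn it into ARRAY<STRING>.
    SELECT 1
    FROM UNNEST(JSON_VALUE_ARRAY(JSON_VALUE(item, '$.plan.metadata.Features'))) AS feature
    WHERE feature = 'Online support'
  );
```

Under these assumptions the sample row would be expected to satisfy both conditions; drop either branch of the OR to test one condition at a time.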

If you want to pivot the data so that there is one column per participant, you can use a query like this:

#standardSQL
CREATE TEMP FUNCTION WasParticipant(
    p_name STRING, participant ARRAY<STRUCT<name STRING, age INT64>>) AS (
  EXISTS(SELECT 1 FROM UNNEST(participant) WHERE name = p_name)
);

WITH YourTable AS (
  SELECT 'A' AS game, ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('tony', 12), ('julia', 12)] AS participant UNION ALL
  SELECT 'B', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('jacob', 12)] UNION ALL
  SELECT 'C', ARRAY<STRUCT<name STRING, age INT64>>[('sam', 12), ('max', 12), ('julia', 12)]
)
SELECT
  ARRAY_AGG(IF(WasParticipant('sam', participant), game, NULL) IGNORE NULLS) AS sams_games,
  ARRAY_AGG(IF(WasParticipant('tony', participant), game, NULL) IGNORE NULLS) AS tonys_games,
  ARRAY_AGG(IF(WasParticipant('julia', participant), game, NULL) IGNORE NULLS) AS julias_games,
  ARRAY_AGG(IF(WasParticipant('max', participant), game, NULL) IGNORE NULLS) AS maxs_games
FROM YourTable;

This returns a single row with one array column per participant, each containing the games that participant played in.
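
On the second question, one possible layout (an assumption, not something stated in the answer above) is to keep the stable keys as typed ARRAY and STRUCT columns and leave only the changeable custom_attributes value as JSON text. The sketch below builds one such row inline; the column names mirror the question's keys, and the nesting is simplified for illustration. BigQuery also offers a JSON column type that may suit the changeable part.

```sql
#standardSQL
-- Hypothetical layout: typed columns for stable keys, a JSON string only for custom_attributes.
WITH YourTable AS (
  SELECT
    FALSE AS has_more,
    [STRUCT(
      '1234' AS id,
      1 AS quantity,
      STRUCT(
        '567' AS sub_id,
        'Personal' AS name,
        ['10 products', 'Online support'] AS features,
        '{"shipping": true, "productLimit": 10}' AS custom_attributes
      ) AS plan
    )] AS data
)
SELECT item.id
FROM YourTable, UNNEST(data) AS item
WHERE 'Online support' IN UNNEST(item.plan.features)                 -- plain array lookup
   OR JSON_VALUE(item.plan.custom_attributes, '$.shipping') = 'true'; -- JSON only where needed
```

With this shape, the array lookups need no JSON parsing, and only the changeable key is read through a JSON function.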
