Joining two JSON objects in Snowflake

Date: 2021-02-03 15:33:02

Tags: sql json snowflake-cloud-data-platform

Hi,

I have two JSON objects, both generated from the same Snowflake table (Table 1 here). I want to join/merge them on their "_id" field to produce a single nested JSON structure.

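  1. How can I do this? I tried aliasing them and using SELECT * from dc JOIN rs ON rs.:_id = dc:_id, but I get either an invalid identifier error or an "unexpected keyword ON" error.
  2. Is there a simpler way to do this merge without running two separate OBJECT_CONSTRUCT queries?

I have included the JSON samples below.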

{ "_id": 786433, "rmpostcode": "LL65 1HL" }
{ "_id": 786434, "rmpostcode": "LL65 1HN" }
{ "_id": 786435, "rmpostcode": "LL65 1HP" }
{ "_id": 786436, "rmpostcode": "LL65 1HR" }
{ "_id": 786437, "rmpostcode": "LL65 1HS" }

generated from the table with

SELECT OBJECT_CONSTRUCT(
    '_id', h."ID",
    'rmpostcode', "rmpostcode"
)
FROM TABLE1 h

and the other one

{ "_id": 524323, "coords": [ { "eastings": 265099, "northings": 666879 } ] }
{ "_id": 524381, "coords": [ { "eastings": 265787, "northings": 668537 } ] }
{ "_id": 524447, "coords": [ { "eastings": 265024, "northings": 668238 } ] }
{ "_id": 524496, "coords": [ { "eastings": 268534, "northings": 665428 } ] }
{ "_id": 524785, "coords": [ { "eastings": 260938, "northings": 664166 } ] }

generated with

SELECT OBJECT_CONSTRUCT(
    '_id', h."ID",
    'coords', array_agg(object_construct(
                        'northings', h."northings",
                        'eastings', h."eastings"))
)
FROM TABLE1 h
group by h."ID"

Edit: trying the answer suggested by @Felipe Hoffa still does not work. The code is below:

with dc AS
(
SELECT OBJECT_CONSTRUCT(
    '_id', h."ID",
    'coords', array_agg(object_construct(
                        'northings', h."northings",
                        'eastings', h."eastings"))
)
FROM "V_TABLES_09092020"."DEV"."v31av8oct20hyperoptic" h
group by "ID"
),
rs AS
(SELECT OBJECT_CONSTRUCT(
    '_id', h."ID",
    'rmpostcode', "rmpostcode"
)
FROM "V_TABLES_09092020"."DEV"."v31av8oct20hyperoptic" h
)

SELECT my_object_assign(dc, rs)
FROM dc 
JOIN rs 
ON rs:"_id" = dc:"_id";

This gives me: SQL compilation error: error line 23 at position 3 invalid identifier 'RS'

I am also trying to create temp tables this way:

create or replace temp table dc AS 
SELECT OBJECT_CONSTRUCT(
    '_id', h."ID",
    'coords', array_agg(object_construct(
                        'northings', h."northings",
                        'eastings', h."eastings"))
)
FROM "V_TABLES_09092020"."DEV"."v31av8oct20hyperoptic" h
group by "ID"

but I get

SQL compilation error: Missing column specification

2 Answers:

Answer 0: (score: 1)

How to fix the query: you need to put quotes around "_id":

SELECT * 
from dc 
JOIN rs 
ON rs:"_id" = dc:"_id";

Setup:

create or replace temp table dc as
select parse_json(value) dc
from table(split_to_table('{ "_id": 786433, "rmpostcode": "LL65 1HL" }
{ "_id": 786434, "rmpostcode": "LL65 1HN" }
{ "_id": 786435, "rmpostcode": "LL65 1HP" }
{ "_id": 786436, "rmpostcode": "LL65 1HR" }
{ "_id": 786437, "rmpostcode": "LL65 1HS" }', '\n'))
;

create or replace temp table rs as
select parse_json(value) rs
from table(split_to_table('{ "_id": 786433, "coords": [ { "eastings": 265099, "northings": 666879 } ] }
{ "_id": 786434, "coords": [ { "eastings": 265787, "northings": 668537 } ] }
{ "_id": 524447, "coords": [ { "eastings": 265024, "northings": 668238 } ] }
{ "_id": 524496, "coords": [ { "eastings": 268534, "northings": 665428 } ] }
{ "_id": 524785, "coords": [ { "eastings": 260938, "northings": 664166 } ] }', '\n'))
;
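
Note that the setup above gives the VARIANT column a name (parse_json(value) dc). The CREATE TABLE ... AS in the question leaves the OBJECT_CONSTRUCT expression unnamed, and that is the likely cause of "Missing column specification". A minimal sketch, assuming the table and columns from the question and using dc as an arbitrary column name:

```sql
create or replace temp table dc as
select object_construct(
    '_id', h."ID",
    'coords', array_agg(object_construct(
                        'northings', h."northings",
                        'eastings', h."eastings"))
) as dc  -- the alias names the new table's only column
from "V_TABLES_09092020"."DEV"."v31av8oct20hyperoptic" h
group by h."ID";
```

With the column named dc, an expression such as dc:"_id" has a column to resolve against.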

--

Update: if you want to merge the two objects, you can solve this with a simple assign() JS UDF:

create or replace function my_object_assign(o1 VARIANT, o2 VARIANT) 
returns VARIANT 
language javascript 
as 'return Object.assign(O1, O2);';

SELECT my_object_assign(dc, rs)
FROM dc 
JOIN rs 
ON rs:"_id" = dc:"_id";
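
Applied to the CTEs from the question's edit, the same idea might look like the sketch below. It assumes the "invalid identifier 'RS'" error comes from the CTEs exposing an unnamed column: in rs:"_id" the name before the colon must be a column, not the CTE itself. The column aliases dc and rs are chosen here for illustration, and my_object_assign is the UDF defined above.

```sql
with dc as (
    select object_construct(
        '_id', h."ID",
        'coords', array_agg(object_construct(
                            'northings', h."northings",
                            'eastings', h."eastings"))
    ) as dc  -- name the column so it can be referenced
    from "V_TABLES_09092020"."DEV"."v31av8oct20hyperoptic" h
    group by h."ID"
),
rs as (
    select object_construct(
        '_id', h."ID",
        'rmpostcode', h."rmpostcode"
    ) as rs  -- name the column so it can be referenced
    from "V_TABLES_09092020"."DEV"."v31av8oct20hyperoptic" h
)
select my_object_assign(dc.dc, rs.rs) as merged
from dc
join rs
  on rs.rs:"_id" = dc.dc:"_id";
```

Object.assign copies the keys of the second object onto the first, so for a key present in both (here "_id") the second value wins; the join condition makes the two values equal anyway. The rs CTE has no GROUP BY, so if the table holds several rows per ID, the join returns one merged object per source row.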

Answer 1: (score: 0)

OK, based on your comments, I think your situation is that you have a table defined similar to this:

create or replace temporary table public.test1 (
  "ID" numeric(38,0),
  "rmpostcode" varchar,
  "eastings" numeric(38,0),
  "northings" numeric(38,0)
);

In this table there is a 1:1 relationship between ID and rmpostcode, and each ID has one or more eastings/northings combinations. Like this:

insert into public.test1 ("ID", "rmpostcode", "eastings", "northings")
values  (1111, 'ABC123', 256789, 887345), 
        (2222, 'LH9ZXQ', 678443, 921009),
        (9876, 'PZ12RR', 876234, 237862),
        (4567, 'W113LM', 234233, 123244)
;

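To extract this tabular data into JSON objects with the eastings/northings values in an array, you were right to try the OBJECT_CONSTRUCT and ARRAY_AGG functions. Based on your sample code, though, I think your problem is that the GROUP BY expression is not defined correctly: every non-aggregated column that appears in the SELECT, here "ID" and "rmpostcode", has to be listed in the GROUP BY.

This code should produce JSON objects like the ones you describe: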

select object_construct(
    '_id', "ID", 
    'rmpostcode', "rmpostcode", 
    'coords', array_agg(object_construct(
                        'northings', "northings",
                        'eastings', "eastings")
                       )
     ) 
from public.test1 
group by "ID", "rmpostcode"

Result:

{ "_id": 9876, "coords": [ { "eastings": 876234, "northings": 237862 } ], "rmpostcode": "PZ12RR" }
{ "_id": 4567, "coords": [ { "eastings": 234233, "northings": 123244 } ], "rmpostcode": "W113LM" }
{ "_id": 1111, "coords": [ { "eastings": 256789, "northings": 887345 } ], "rmpostcode": "ABC123" }
{ "_id": 2222, "coords": [ { "eastings": 678443, "northings": 921009 } ], "rmpostcode": "LH9ZXQ" }
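
The sample data has only one coordinate pair per ID, so each coords array has a single element. To see the aggregation do its work, here is a small sketch that adds a second, made-up coordinate pair for ID 1111 and reruns the query. The extra row and the doc alias are hypothetical; the WITHIN GROUP clause is optional and only makes the order of the array elements deterministic.

```sql
-- Hypothetical extra row: a second coordinate pair for the existing ID 1111.
insert into public.test1 ("ID", "rmpostcode", "eastings", "northings")
values (1111, 'ABC123', 256790, 887346);

select object_construct(
    '_id', "ID",
    'rmpostcode', "rmpostcode",
    'coords', array_agg(object_construct(
                        'northings', "northings",
                        'eastings', "eastings"))
              within group (order by "eastings", "northings")
     ) as doc
from public.test1
group by "ID", "rmpostcode";
```

The object for ID 1111 should now carry two elements in coords, while the other IDs still have one each.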