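SQL query to match and return all occurrences of a search string

Posted: 2019-09-17 20:31:57

Tags: sql amazon-athena

I have a JSON document stored in a column (record) of a table (TABLE), as shown below. I need to write a SQL query that returns the values of the fields "a", "b", and "k" for every element of aaagroup, not just the first one.

The result should be: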

NAME1   age1    comment1
NAME2   age2    
NAME3            comment3

JSON data:

{
    "reportfile": {
        "aaa": {
            "aaagroup": [{
                "a": "NAME1",
                "b": "age1",
                "k": "comment1"
            },
            {
                "a": "NAME2",
                "b": "age2"
            },
            {
                "a": "NAME3",
                "k": "comment3"
            }]
        },
        "dsa": {
            "dsagroup": [{
                "j": "Name"
            },
            {
                "j": "Title"
            }]
        }
    }
}

I have used the following query, which works when there is only a single occurrence:

Data:

{"reportfile":{"aaa":{"aaagroup":[{"a":"NAME1","k":"age1"}]},"dsa":{"dsagroup":[{"j":"USERNAME"}],"l":"1","m":"1"}}}

Query:

select 
    substr(cc.BUS_NME, 1, strpos(cc.BUS_NME,'"')-1) as BUS_NME,
    substr(cc.AGE, 1, strpos(cc.AGE,'"')-1) as AGE
from 
    (select
         substr(bb.aaa,strpos(bb.aaa,'"a":"')+5) as BUS_NME,
         substr(bb.aaa,strpos(bb.aaa,'"k":"')+5) as AGE 
     from 
         (select substr(aa.G, strpos(aa.G,'"aaagroup'),strpos(aa.G,'},')) as aaa
          from 
              (select substr(record, strpos(record,'"aaagroup')) as G 
               from TABLE) aa) bb) cc

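1 answer:

Answer 0 (score: 0)

ush rani, if I have understood your question correctly, you have an external table like the one below. You can try the query that follows to get the desired result from that table.

Sample external table: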

CREATE EXTERNAL TABLE Ext_JSON_data(
reportfile string
  )
ROW FORMAT SERDE 
  'org.openx.data.jsonserde.JsonSerDe' 
WITH SERDEPROPERTIES (  
'serialization.format' = '1'
  )
LOCATION
  's3://bucket/folder/'

Query to get the desired result:

WITH the_table AS (
    SELECT CAST(social AS MAP(VARCHAR, JSON)) AS social_data
    FROM (
        VALUES
        (JSON '{"aaa": {"aaagroup": [{"a": "NAME1","b": "age1","k": "comment1"},{"a": "NAME2","b": "age2"},{"a": "NAME3","k": "comment3"}]},"dsa": {"dsagroup": [{"j": "Name"},{"j": "Title"}]}}')
    ) AS t (social)
),
cte_first_level AS (
    SELECT
        first_level_key,
        CAST(first_level_value AS MAP(VARCHAR, JSON)) AS first_level_value
    FROM the_table
    CROSS JOIN UNNEST (social_data) AS t (first_level_key, first_level_value)
),
cte_second_level AS (
    SELECT
        first_level_key,
        SECOND_level_key,
        SECOND_level_value
    FROM cte_first_level
    CROSS JOIN UNNEST (first_level_value) AS t (SECOND_level_key, SECOND_level_value)
)
SELECT
    first_level_key,
    SECOND_level_key,
    SECOND_level_value,
    items,
    items['a'] value_of_a,
    items['b'] value_of_b,
    items['k'] value_of_k
FROM cte_second_level
CROSS JOIN UNNEST(CAST(json_extract(SECOND_level_value, '$') AS ARRAY<MAP<VARCHAR, VARCHAR>>)) t (items)

Query output:

[Image: screenshot of the query output; not available]
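
If only the aaagroup rows are needed, the array can probably be addressed directly by its JSON path instead of unnesting every level. The following is a minimal sketch, not a tested solution: it assumes a Presto-based Athena engine, and the names source, item and u are illustrative only. The inline VALUES row stands in for the record column of TABLE from the question. It uses element_at rather than the items['b'] subscript because element_at returns NULL for a missing key, whereas the subscript may raise a "Key not present in map" error on newer engine versions when an element has no "b" or "k".

```sql
-- Sketch only: the VALUES row is a stand-in for TABLE.record from the question.
WITH source AS (
    SELECT record
    FROM (
        VALUES
        ('{"reportfile":{"aaa":{"aaagroup":[{"a":"NAME1","b":"age1","k":"comment1"},{"a":"NAME2","b":"age2"},{"a":"NAME3","k":"comment3"}]},"dsa":{"dsagroup":[{"j":"Name"},{"j":"Title"}]}}}')
    ) AS t (record)
)
SELECT
    element_at(item, 'a') AS a,  -- NULL when the key is absent
    element_at(item, 'b') AS b,
    element_at(item, 'k') AS k
FROM source
-- Pull out only the aaagroup array and turn each element into one row.
CROSS JOIN UNNEST(
    CAST(json_extract(record, '$.reportfile.aaa.aaagroup') AS ARRAY(MAP(VARCHAR, VARCHAR)))
) AS u (item)
```

Under those assumptions this should yield one row per element of aaagroup, with NULL where "b" or "k" is missing. To run it against the real data, replace the source CTE with a plain SELECT of record from the actual table.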