Split function in BigQuery SQL

Date: 2018-08-15 00:42:05

Tags: google-bigquery

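I am trying to use a simple SPLIT function in BigQuery SQL and then read individual array elements (the query is taken from Hive SQL). However, SPLIT in BigQuery turns the field into a REPEATED field, and I cannot get the desired result. Can someone write an equivalent query in BigQuery #standardSQL?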

-- Hive query
select hierarchy, hier_array, hier_array[0] as level0, hier_array[1] as level1, hier_array[2] as level2 from
( select hierarchy, split(hierarchy,'-') as hier_array
from gcs_publish.cr_party_dnm_gu_rel ) z
limit 10;

-- Desired output

hierarchy   hier_array  level0  level1
10000-211817-26510-25429    ["10000","211817","26510","25429"]  10000   211817
10019-10369 ["10019","10369"]   10019   10369
10021   ["10021"]   10021   
10022-17256 ["10022","17256"]   10022   17256
10033   ["10033"]   10033   
10037-3098187   ["10037","3098187"] 10037   3098187
10042   ["10042"]   10042   
10050-11038-211637808-34880075  ["10050","11038","211637808","34880075"]    10050   11038
10052   ["10052"]   10052   
10053   ["10053"]   10053   

1 answer:

Answer 0 (score: 2)

The query below is for BigQuery Standard SQL and should get you started:

#standardSQL
SELECT 
  hierarchy, 
  hier_array, 
  hier_array[SAFE_OFFSET(0)] AS level0,  
  hier_array[SAFE_OFFSET(1)] AS level1, 
  hier_array[SAFE_OFFSET(2)] AS level2
FROM (
  SELECT hierarchy, SPLIT(hierarchy,'-') AS hier_array  
  FROM `gcs_publish.cr_party_dnm_gu_rel`
) z  
LIMIT 10   

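The result looks like this (hier_array is a repeated field, so its elements are shown on separate lines within a single row):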

Row hierarchy                   hier_array  level0  level1  level2   
1   10000-211817-26510-25429    10000       10000   211817  26510    
                                211817               
                                26510                
                                25429                
2   10019-10369                 10019       10019   10369   null     
                                10369                
3   10021                       10021       10021   null    null       

If you check the JSON representation of this result (the first row, for example):

  {
    "hierarchy": "10000-211817-26510-25429",
    "hier_array": [
      "10000",
      "211817",
      "26510",
      "25429"
    ],
    "level0": "10000",
    "level1": "211817",
    "level2": "26510"
  },  

To me, this looks like exactly what you expect.
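
To try the same idea without access to the table, here is a minimal sketch that uses a few hierarchy values copied from the question as inline data. The CTE name sample_rows is hypothetical and only stands in for `gcs_publish.cr_party_dnm_gu_rel`. It also shows why SAFE_OFFSET is used: it returns NULL when the index is past the end of the array, whereas plain OFFSET raises an out-of-bounds error for short hierarchies such as 10021.

```sql
#standardSQL
-- sample_rows is a hypothetical inline stand-in for
-- `gcs_publish.cr_party_dnm_gu_rel`; the values are taken from the question.
WITH sample_rows AS (
  SELECT '10000-211817-26510-25429' AS hierarchy UNION ALL
  SELECT '10019-10369' UNION ALL
  SELECT '10021'
)
SELECT
  hierarchy,
  hier_array,
  hier_array[SAFE_OFFSET(0)] AS level0,
  hier_array[SAFE_OFFSET(1)] AS level1,  -- NULL when there is no second element
  hier_array[SAFE_OFFSET(2)] AS level2   -- NULL when there is no third element
FROM (
  -- SPLIT returns an ARRAY<STRING>, which BigQuery displays as a repeated field
  SELECT hierarchy, SPLIT(hierarchy, '-') AS hier_array
  FROM sample_rows
) z
```

If the hier_array column is not needed in the output, dropping it from the outer SELECT leaves one flat row per hierarchy with only the scalar level columns.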