Split the value in a column into all columns of the row in BigQuery

Time: 2017-03-14 13:20:02

Tags: sql google-bigquery

I have a table with 7 columns, as follows:

date                         org   cus_id   prod_id   sales_qty   sales_amount   profit_amount
30-AUG-14 55 12 34 56 78 99  null   null      null     null           null        null
31-AUG-14 22 32 43 65 76 88  null   null      null     null           null        null

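In fact, the value in the first column is the concatenation of the values of all columns in that row. I want to fix this by splitting the first column's value across all the columns. The expected output is: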

date        org   cus_id   prod_id   sales_qty   sales_amount   profit_amount
 30-AUG-14   55       12        34          56             78              99  
 31-AUG-14   22       32        43          65             76              88  

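I think splitting the string value is the right approach, but I am not familiar with how to split it and put the parts into the existing columns. Could you give me some suggestions? Thanks in advance.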

2 answers:

Answer 0: (score: 1)

You can use a user-defined function (UDF) to expand the value into the existing columns or into new columns.

To pass a row's values to a JavaScript function in standard SQL, define a function that takes a struct with the same row type as the table:

s STRUCT<tdate STRING, org INT64, cus_id INT64, prod_id INT64, sales_qty INT64, sales_amount INT64, profit_amount INT64>

Then transform the existing values with JavaScript code:

var fields = s.tdate.split(' ');
s.org = fields[1];

You can add logic such as "do not overwrite if a value already exists", or write the results to new columns. Then run a query like this over the whole row.

For example:

#standardSQL
CREATE TEMPORARY FUNCTION AddField(s STRUCT<tdate STRING, org INT64, cus_id INT64, prod_id INT64, sales_qty INT64, sales_amount INT64, profit_amount INT64>)
  RETURNS STRUCT<tdate STRING, org INT64, cus_id INT64, prod_id INT64, sales_qty INT64, sales_amount INT64, profit_amount INT64>
  LANGUAGE js AS """
  // Split the concatenated value on spaces; fields[0] is the date itself.
  var fields = s.tdate.split(' ');
  s.tdate = fields[0];  // keep only the date part in the first column
  s.org = fields[1];
  s.cus_id = fields[2];
  s.prod_id = fields[3];
  s.sales_qty = fields[4];
  s.sales_amount = fields[5];
  s.profit_amount = fields[6];
  return s;
""";
-- Dummy data reproducing the table from the question
WITH mytable AS (
  SELECT "30-AUG-14 55 12 34 56 78 99" AS tdate, NULL AS org, NULL AS cus_id, NULL AS prod_id, NULL AS sales_qty, NULL AS sales_amount, NULL AS profit_amount
  UNION ALL
  SELECT "31-AUG-14 22 32 43 65 76 88" AS tdate, NULL AS org, NULL AS cus_id, NULL AS prod_id, NULL AS sales_qty, NULL AS sales_amount, NULL AS profit_amount
)
SELECT AddField(t).*
FROM mytable AS t;

You can find more complex UDF examples in the migration guide and the UDF docs.
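The "do not overwrite if a value already exists" logic mentioned in this answer could be sketched in plain JavaScript roughly as follows. The helper name `fillFromDate` is hypothetical (it is not part of BigQuery), the field names follow the struct used in this answer, and the sketch assumes the concatenated string holds exactly seven whitespace-separated tokens: the date followed by six integers.

```javascript
// Hypothetical helper: mirrors what the UDF body could do with the row object `s`.
// Assumption: s.tdate holds 7 whitespace-separated tokens (date + six integers).
function fillFromDate(s) {
  var names = ['org', 'cus_id', 'prod_id', 'sales_qty', 'sales_amount', 'profit_amount'];
  var fields = s.tdate.trim().split(/\s+/);
  if (fields.length !== names.length + 1) {
    return s; // unexpected shape: leave the row untouched
  }
  names.forEach(function (name, i) {
    // Fill only columns that are still empty, so existing values are kept.
    if (s[name] === null || s[name] === undefined) {
      s[name] = parseInt(fields[i + 1], 10);
    }
  });
  s.tdate = fields[0]; // keep only the date part in the first column
  return s;
}
```

Under these assumptions, the function definition could be placed inside the UDF body, which would then end with `return fillFromDate(s);`.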

Answer 1: (score: 1)

Try the following:

#standardSQL
SELECT 
  SPLIT(date, ' ')[OFFSET(0)] AS date,        
  SPLIT(date, ' ')[OFFSET(1)] AS org,   
  SPLIT(date, ' ')[OFFSET(2)] AS cus_id,   
  SPLIT(date, ' ')[OFFSET(3)] AS prod_id,   
  SPLIT(date, ' ')[OFFSET(4)] AS sales_qty,   
  SPLIT(date, ' ')[OFFSET(5)] AS sales_amount,   
  SPLIT(date, ' ')[OFFSET(6)] AS profit_amount  
FROM yourTable

You can test it with the dummy data from the example in the question:

#standardSQL
WITH yourTable AS (
  SELECT '30-AUG-14 55 12 34 56 78 99' AS date, NULL AS org, NULL AS cus_id, NULL AS prod_id, NULL AS sales_qty, NULL AS sales_amount, NULL AS profit_amount UNION ALL
  SELECT '31-AUG-14 22 32 43 65 76 88', NULL, NULL, NULL, NULL, NULL, NULL
)
SELECT 
  SPLIT(date, ' ')[OFFSET(0)] AS date,        
  SPLIT(date, ' ')[OFFSET(1)] AS org,   
  SPLIT(date, ' ')[OFFSET(2)] AS cus_id,   
  SPLIT(date, ' ')[OFFSET(3)] AS prod_id,   
  SPLIT(date, ' ')[OFFSET(4)] AS sales_qty,   
  SPLIT(date, ' ')[OFFSET(5)] AS sales_amount,   
  SPLIT(date, ' ')[OFFSET(6)] AS profit_amount  
FROM yourTable  

If you need the fields converted to INT64, use the following:

#standardSQL
SELECT 
  SPLIT(date, ' ')[OFFSET(0)] AS date,        
  CAST(SPLIT(date, ' ')[OFFSET(1)] AS INT64) AS org,   
  CAST(SPLIT(date, ' ')[OFFSET(2)] AS INT64) AS cus_id,   
  CAST(SPLIT(date, ' ')[OFFSET(3)] AS INT64) AS prod_id,   
  CAST(SPLIT(date, ' ')[OFFSET(4)] AS INT64) AS sales_qty,   
  CAST(SPLIT(date, ' ')[OFFSET(5)] AS INT64) AS sales_amount,   
  CAST(SPLIT(date, ' ')[OFFSET(6)] AS INT64) AS profit_amount  
FROM yourTable