Split a column into multiple columns

Time: 2017-04-26 13:56:12

Tags: google-bigquery

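I need help with Google BigQuery. We have a data table in which one column captures the status trail of a support ticket's lifecycle.
The field holds the date on which the ticket entered each status: Assigned, Work In Progress, Closed, Pending, and so on. Example:

'Assigned (03/01/2017 06:13:47 AM) -> Work In Progress (03/02/2017 05:27:52 AM) -> Resolved (04/06/2017 03:34:16 AM)'

We need to create multiple columns from this one column: one for Assigned, another for Resolved, and so on. We tried a few options, such as: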

substr(STATUS_TRAIL,11,20) assigned_date, right(STATUS_TRAIL,34) as date,

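But the results are not reliable, because fixed character positions break whenever a row is missing one of the statuses: some tickets have not been closed yet, some have not been worked on at all, and some are already closed.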

2 answers:

Answer 0: (score: 2)

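Try the following in BigQuery Standard SQL. Each REGEXP_EXTRACT looks for one status by name, so a status that is absent from the trail simply yields NULL for that column: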

  
#standardSQL
WITH yourTable AS (
  SELECT 'Assigned (03/01/2017 06:13:47 AM) -> Work In Progress (03/02/2017 05:27:52 AM) -> Resolved (04/06/2017 03:34:16 AM)' AS ticket 
)
SELECT 
  REGEXP_EXTRACT(ticket, r'Assigned \((\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)\)') AS assigned,
  REGEXP_EXTRACT(ticket, r'Work In Progress \((\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)\)') AS inprogress,
  REGEXP_EXTRACT(ticket, r'Resolved \((\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)\)') AS resolved
FROM yourTable
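
The query above returns the dates as strings. If real TIMESTAMP columns are wanted, one possible extension is to wrap each extraction in PARSE_TIMESTAMP. The sketch below is not part of the original answer: the second input row is a made-up ticket that has not been resolved yet, the shorter pattern `[^)]+` is a simplification that captures whatever sits between the parentheses, and the format string assumes every date follows the 12-hour layout shown in the question.

```sql
#standardSQL
WITH yourTable AS (
  SELECT 'Assigned (03/01/2017 06:13:47 AM) -> Work In Progress (03/02/2017 05:27:52 AM) -> Resolved (04/06/2017 03:34:16 AM)' AS ticket
  UNION ALL
  -- Hypothetical ticket that has not been resolved yet
  SELECT 'Assigned (03/05/2017 09:00:00 AM) -> Work In Progress (03/06/2017 01:30:00 PM)'
)
SELECT
  -- %I is the 12-hour clock hour, which pairs with the AM/PM marker %p.
  -- The SAFE. prefix returns NULL instead of raising an error on a malformed date.
  SAFE.PARSE_TIMESTAMP('%m/%d/%Y %I:%M:%S %p',
    REGEXP_EXTRACT(ticket, r'Assigned \(([^)]+)\)')) AS assigned,
  SAFE.PARSE_TIMESTAMP('%m/%d/%Y %I:%M:%S %p',
    REGEXP_EXTRACT(ticket, r'Work In Progress \(([^)]+)\)')) AS inprogress,
  SAFE.PARSE_TIMESTAMP('%m/%d/%Y %I:%M:%S %p',
    REGEXP_EXTRACT(ticket, r'Resolved \(([^)]+)\)')) AS resolved
FROM yourTable
```

For the second row, `resolved` comes back as NULL because REGEXP_EXTRACT finds no match and PARSE_TIMESTAMP passes the NULL through.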

Answer 1: (score: 1)

Try the SPLIT function:

#standardSQL
WITH Input AS (
   SELECT 'Assigned (03/01/2017 06:13:47 AM) -> Work In Progress (03/02/2017 05:27:52 AM) -> Resolved (04/06/2017 03:34:16 AM)' AS STATUS_TRAIL
)
SELECT
  events[SAFE_OFFSET(0)] AS assigned_event,
  events[SAFE_OFFSET(1)] AS progress_event,
  events[SAFE_OFFSET(2)] AS resolved_event
FROM (
  SELECT
    SPLIT(STATUS_TRAIL, ' -> ') AS events
  FROM Input
);

Or, as another way of looking at the data, you can model it as an array of entries, each containing an event type and a timestamp:

#standardSQL
WITH Input AS (
   SELECT 'Assigned (03/01/2017 06:13:47 AM) -> Work In Progress (03/02/2017 05:27:52 AM) -> Resolved (04/06/2017 03:34:16 AM)' AS STATUS_TRAIL
)
SELECT
  ARRAY(
    SELECT AS STRUCT
      parts[SAFE_OFFSET(0)] AS type,
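      -- %I (12-hour clock) pairs with %p; the trailing ')' matches the closing parenthesis left over from the split
      PARSE_TIMESTAMP('%m/%d/%Y %I:%M:%S %p)', parts[SAFE_OFFSET(1)]) AS timestamp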
    FROM (
      SELECT SPLIT(event_string, ' (') AS parts
      FROM UNNEST(event_strings) AS event_string
    )
  ) AS events
FROM (
  SELECT SPLIT(STATUS_TRAIL, ' -> ') AS event_strings
  FROM Input
);

Each row of the output looks something like:

[{Assigned, 2017-03-01 06:13:47+00},
 {Work In Progress, 2017-03-02 05:27:52+00},
 {Resolved, 2017-04-06 03:34:16+00}]
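
To get from this array back to the one-column-per-status layout the question asks for, one option is to look each status up by its type rather than by its position, so a missing status does not shift the others. The following is a sketch under stated assumptions, not part of the original answer: the `Parsed` CTE name and the `ts` field name are illustrative, the status labels are taken from the sample string, and MIN is an arbitrary choice that keeps the earliest timestamp if a status appears more than once.

```sql
#standardSQL
WITH Input AS (
  SELECT 'Assigned (03/01/2017 06:13:47 AM) -> Work In Progress (03/02/2017 05:27:52 AM) -> Resolved (04/06/2017 03:34:16 AM)' AS STATUS_TRAIL
),
Parsed AS (
  SELECT
    ARRAY(
      SELECT AS STRUCT
        parts[SAFE_OFFSET(0)] AS type,
        -- %I is the 12-hour clock hour, paired with the AM/PM marker %p
        PARSE_TIMESTAMP('%m/%d/%Y %I:%M:%S %p)', parts[SAFE_OFFSET(1)]) AS ts
      FROM (
        SELECT SPLIT(event_string, ' (') AS parts
        FROM UNNEST(SPLIT(STATUS_TRAIL, ' -> ')) AS event_string
      )
    ) AS events
  FROM Input
)
SELECT
  -- Each scalar subquery returns NULL when the status is absent from the trail
  (SELECT MIN(e.ts) FROM UNNEST(events) AS e WHERE e.type = 'Assigned') AS assigned,
  (SELECT MIN(e.ts) FROM UNNEST(events) AS e WHERE e.type = 'Work In Progress') AS inprogress,
  (SELECT MIN(e.ts) FROM UNNEST(events) AS e WHERE e.type = 'Resolved') AS resolved
FROM Parsed
```

Adding another status column, such as Pending or Closed, would only need one more scalar subquery with the matching label.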