Return values based on conditions

Time: 2019-05-10 11:31:34

Tags: sql google-bigquery

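I am trying to return values from the uid column, keeping only each uid that appears at least once under 2 different values of a categorical variable: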

+--------+--------+---------+
|  uid   |  type  | period  |
+--------+--------+---------+
| abc123 | event1 | control |
| abc123 | event1 | test    |
| def456 | event1 | control |
| def456 | event1 | control |
+--------+--------+---------+

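In this case, abc123 would return a count of 2 for event1, because the uid appears in both the test period and the control period. def456 would not return a count, because it occurs in only one period. That gives this intermediate table: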

+--------+-----------+
|  uid   | typecount |
+--------+-----------+
| abc123 |         2 |
+--------+-----------+

Here is my code so far:

with cb as(
  select uid, count(type) as cbuffercount, period
    from `AJG.ABV_buff_wperiods`
    where type="bufferStart" and seq>12 and not uid="null" and not uid="" and period="control"
    group by uid, period
    having count(uid)>1),
tb as(
  select uid, count(type) as tbuffercount, period
    from `AJG.ABV_buff_wperiods`
    where type="bufferStart" and seq>12 and not uid="null" and not uid="" and period="test"
    group by uid, period
    having count(uid)>1),
ci as(
  select uid, count(instance) as cinstancecount, period
    from `AJG.ABV_buff_wperiods`
    where seq>12 and not uid="null" and not uid="" and period="control"
    group by uid, period
    having count(uid)>1),
ti as(
    select uid, count(instance) as tinstancecount, period
    from `AJG.ABV_buff_wperiods`
    where seq>12 and not uid="null" and not uid="" and period="test"
    group by uid, period
    having count(uid)>1)
select uid, cb.cbuffercount, tb.tbuffercount, ci.cinstancecount, ti.tinstancecount,
cb.cbuffercount-tb.tbuffercount as absbufferddx, (cb.cbuffercount/ci.cinstancecount)-(tb.tbuffercount/tb.tinstancecount) as proportionalbufferddx
from
  cb join tb
  using(uid)
where
  cb.uid=tb.uid
order by absbufferddx desc

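I also have a second problem: when I try to select columns from the with clause, BigQuery does not recognize the last 2 tables I defined there (for example, ci.cinstancecount). I ran a query with just cb and tb. Why would adding the two extra tables break it?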

1 Answer:

Answer 0: (score: 1)

Is this what you want?

select uid, count(distinct period)
from t
group by uid
having count(distinct period) >= 2;
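
To check this against the sample rows in the question, the data can be inlined as a CTE so the query runs without a real table. This is only a sketch: the CTE name t follows the answer's placeholder, and the alias typecount is borrowed from the expected output in the question.

```sql
-- Sample rows from the question, inlined so no real table is needed.
with t as (
  select 'abc123' as uid, 'event1' as type, 'control' as period union all
  select 'abc123', 'event1', 'test' union all
  select 'def456', 'event1', 'control' union all
  select 'def456', 'event1', 'control'
)
select uid, count(distinct period) as typecount
from t
group by uid
-- Keep only uids seen in at least two distinct periods.
having count(distinct period) >= 2;
```

With these four rows, abc123 has two distinct periods (control and test) and is kept with a count of 2, while def456 has only control and is filtered out.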

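If you want to count distinct combinations of event and period, then I would suggest string munging. BigQuery does not support count(distinct) on arrays or structs, so you might as well do this: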

select uid, count(distinct concat(type, '|', period))
from t
group by uid
having count(distinct concat(type, '|', period)) >= 2;
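
On the second part of the question: a CTE defined in a with clause is only visible to the final select if it appears in that select's from clause. The query in the question joins only cb and tb, so ci and ti are out of scope when their columns are selected. Separately, tb.tinstancecount looks like a typo for ti.tinstancecount, since tinstancecount is defined in ti. Below is a minimal sketch of the corrected query, assuming the four CTEs are kept exactly as written in the question and that inner joins are the intended behavior (a uid missing from any of the four CTEs is dropped).

```sql
with cb as(
  select uid, count(type) as cbuffercount, period
    from `AJG.ABV_buff_wperiods`
    where type="bufferStart" and seq>12 and not uid="null" and not uid="" and period="control"
    group by uid, period
    having count(uid)>1),
tb as(
  select uid, count(type) as tbuffercount, period
    from `AJG.ABV_buff_wperiods`
    where type="bufferStart" and seq>12 and not uid="null" and not uid="" and period="test"
    group by uid, period
    having count(uid)>1),
ci as(
  select uid, count(instance) as cinstancecount, period
    from `AJG.ABV_buff_wperiods`
    where seq>12 and not uid="null" and not uid="" and period="control"
    group by uid, period
    having count(uid)>1),
ti as(
  select uid, count(instance) as tinstancecount, period
    from `AJG.ABV_buff_wperiods`
    where seq>12 and not uid="null" and not uid="" and period="test"
    group by uid, period
    having count(uid)>1)
select uid, cb.cbuffercount, tb.tbuffercount, ci.cinstancecount, ti.tinstancecount,
  cb.cbuffercount-tb.tbuffercount as absbufferddx,
  -- tinstancecount comes from ti, not tb
  (cb.cbuffercount/ci.cinstancecount)-(tb.tbuffercount/ti.tinstancecount) as proportionalbufferddx
from cb
  join tb using(uid)
  -- ci and ti must be joined here before their columns can be selected
  join ci using(uid)
  join ti using(uid)
order by absbufferddx desc
```

The original where cb.uid=tb.uid is omitted here because using(uid) already applies that condition.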