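Unexpected output for count over time

Time: 2018-03-20 15:20:51

Tags: sql amazon-redshift

In the following query, I get the cumulative reps and their respective clients over time (as of each month end):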

select month, count(rep_id), sum(clients)
  from
( 
  select date_trunc('month', date)::date as month, rep_id, 
    row_number() over (PARTITION BY rep_id, date_trunc('month', date) order by date desc) as rnk,
    sum(case when applied_date <= date then 1 else 0 end) clients
  from
  ( 
    select r.rep_id, r.created_date, u.id as user_id, u.applied_date
    from reps r
    left outer join clients u on r.id = u.rep_id
  ) z
cross join
  (select * from calendar
    where date between '2018-01-01' and convert_timezone('PST', getdate())
  ) c
group by date, rep_id
) sub
where rnk=1
group by 1

Here is the output:

month       count   sum
1/1/2018    1000    2000
2/1/2018    1000    3000
3/1/2018    1000    4000

The clients-over-time column is correct, but how can I get the correct cumulative count of reps, like this:

month       count   sum
1/1/2018    350     2000
2/1/2018    700     3000
3/1/2018    1000    4000

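FYI, I can get this in Tableau using AGG(rep_id).

1 Answer:

Answer 0: (score: 1)

There are a few things in the query that need to be addressed.

count(rep_id) counts non-NULL values. Since there are probably no NULL rep_id values, this is the same as counting rows.

group by date, rep_id returns one row per rep_id per date, not per month. I understand you need to group by date because you use it in the row_number() function.

Without modifying the original query too much, I would write it like this (sorry, untested):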
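To make the difference concrete, here is a minimal sketch that runs the three variants of count side by side. It uses Python's built-in sqlite3 module rather than Redshift, and the rep_days table and its rows are hypothetical toy data, not taken from the question.

```python
import sqlite3

# Hypothetical toy data: one row per (rep, calendar day), plus one NULL rep_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rep_days (rep_id INTEGER, day TEXT)")
conn.executemany(
    "INSERT INTO rep_days VALUES (?, ?)",
    [(1, "2018-01-30"), (1, "2018-01-31"), (2, "2018-01-31"), (None, "2018-01-31")],
)

# count(*) counts rows, count(col) skips NULLs, count(DISTINCT col) also
# collapses repeated values.
total_rows, non_null, distinct_reps = conn.execute(
    "SELECT count(*), count(rep_id), count(DISTINCT rep_id) FROM rep_days"
).fetchone()

print(total_rows, non_null, distinct_reps)  # 4 3 2
```

Only count(DISTINCT rep_id) gives the number of reps; the other two depend on how many rows each rep happens to produce.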
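The effect can be reproduced on a small scale. The sketch below again uses sqlite3 with hypothetical tables (two reps, four calendar days), and it stands in for the rnk = 1 filter with a "last day of each month" subquery, which is an assumption made to avoid window functions. Because created_date is never checked, every rep shows up in every month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reps (rep_id INTEGER, created_date TEXT);
CREATE TABLE calendar (day TEXT);
INSERT INTO reps VALUES (1, '2018-01-15'), (2, '2018-02-10');
INSERT INTO calendar VALUES
  ('2018-01-30'), ('2018-01-31'), ('2018-02-27'), ('2018-02-28');
""")

# Grouping by day and rep_id yields one group per rep per calendar day.
per_day_groups = conn.execute("""
    SELECT count(*) FROM (
        SELECT c.day, r.rep_id
        FROM reps r CROSS JOIN calendar c
        GROUP BY c.day, r.rep_id
    )
""").fetchone()[0]

# Keeping only the last day of each month (the role played by rnk = 1)
# still leaves every rep in every month.
month_end_counts = conn.execute("""
    SELECT strftime('%Y-%m', c.day) AS ym, count(r.rep_id)
    FROM reps r CROSS JOIN calendar c
    WHERE c.day IN (SELECT max(day) FROM calendar
                    GROUP BY strftime('%Y-%m', day))
    GROUP BY ym
    ORDER BY ym
""").fetchall()

print(per_day_groups)    # 8
print(month_end_counts)  # [('2018-01', 2), ('2018-02', 2)]
```

Rep 2 was created in February but is still counted in January, which mirrors the constant 1000 in the question's output.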

select month, count(DISTINCT rep_id), count(DISTINCT user_id) 
  from
( 
  select date_trunc('month', date)::date as month, 
    CASE WHEN created_date <= date THEN rep_id ELSE NULL END rep_id, 
    case when applied_date <= date then user_id else null end user_id
  from
  ( 
    select r.rep_id, r.created_date, u.id as user_id, u.applied_date
    from reps r
    left outer join clients u on r.id = u.rep_id
  ) z
cross join
  (select * from calendar
    where date between '2018-01-01' and convert_timezone('PST', getdate())
  ) c
) sub
group by 1

This:

   CASE WHEN created_date <= date THEN rep_id ELSE NULL END rep_id, 
    case when applied_date <= date then user_id else null end user_id

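returns rep_id or user_id only when the rep was created, or the client applied, on or before the calendar date; otherwise it returns NULL.

Then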

count(DISTINCT rep_id), count(DISTINCT user_id) 

counts the distinct reps and clients seen in or before each month, ignoring the NULLs.
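As a rough check of this approach, the sketch below runs the same pattern against hypothetical toy data in sqlite3. The table contents are invented, strftime replaces Redshift's date_trunc, the calendar column is named day, and the join uses rep_id on both sides; all of these are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reps (rep_id INTEGER, created_date TEXT);
CREATE TABLE clients (id INTEGER, rep_id INTEGER, applied_date TEXT);
CREATE TABLE calendar (day TEXT);
INSERT INTO reps VALUES (1, '2018-01-15'), (2, '2018-02-10'), (3, '2018-03-05');
INSERT INTO clients VALUES
  (10, 1, '2018-01-20'), (11, 1, '2018-02-15'),
  (12, 2, '2018-03-01'), (13, 2, '2018-02-20');
INSERT INTO calendar VALUES
  ('2018-01-01'), ('2018-01-31'), ('2018-02-01'),
  ('2018-02-28'), ('2018-03-01'), ('2018-03-20');
""")

# A rep or client contributes a non-NULL id only on days on or after its
# own date; count(DISTINCT ...) then ignores NULLs and duplicate ids.
cumulative = conn.execute("""
    SELECT strftime('%Y-%m', c.day) AS ym,
           count(DISTINCT CASE WHEN r.created_date <= c.day THEN r.rep_id END),
           count(DISTINCT CASE WHEN u.applied_date <= c.day THEN u.id END)
    FROM reps r
    LEFT JOIN clients u ON u.rep_id = r.rep_id
    CROSS JOIN calendar c
    GROUP BY ym
    ORDER BY ym
""").fetchall()

print(cumulative)
# [('2018-01', 1, 1), ('2018-02', 2, 3), ('2018-03', 3, 4)]
```

The rep count now grows month over month instead of staying constant, and rep 3, who has no clients, is still counted once its created_date is reached.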