Time: 2015-04-23 12:55:37

Tags: sql sas proc-sql

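I have a table in SAS with, for example, a customer_id and 5 columns holding that customer's monthly status. A customer can have one of 6 different statuses. For example: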

customer_id   month1    month2    month3    month4    month5 
12345678      Waiting   Inactive  Active    Active    Canceled

I would like to return the single value that occurs most often across month1 - month5. In this case that value is Active. The result would be:

customer_id   frequent
12345678      Active    

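Is there a function for this in SAS? I have some ideas for how to do it with SQL, but it would be complicated, with many CASE WHEN conditions and so on. I'm new to SAS, so I suspect there is a better solution.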

2 answers:

Answer 0: (score: 2)

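If you use an array to split the data set into one observation per month of each customer's history, you can then use the summary functions in proc sql to find the most frequent status, breaking ties in favor of the status seen in the most recent month (assuming month 5 is the latest).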

data want1;
    set have;
    array m(*) month1 -- month5;
    do i = 1 to dim(m);
        cid = customer_id;
        frequent = m(i);
        position = i;
        output;
    end;
    keep cid frequent position;
run;

proc sql;
    create table want2 as select
    cid as customer_id,
    frequent,
    max(position) as max_pos,
    count(frequent) as count
    from want1
    group by cid, frequent;
quit;

proc sort data = want2; by customer_id descending count descending max_pos; run;

data want3;
    set want2;
    by customer_id descending count descending max_pos;
    if first.customer_id;
    drop max_pos count;
run;
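
For only five columns, the same idea can also be sketched in a single data step, without reshaping the data. The block below is a minimal, untested sketch and not part of the original answer: it assumes the input data set is called have (as above), that month1-month5 are character variables with no missing values, and that 20 characters is long enough for any status. The names want, best, n, i and j are made up for illustration.

```sas
/* Sketch only: assumes data set "have" with character variables month1-month5 */
data want;
    set have;
    array m(*) month1-month5;
    length frequent $ 20;          /* assumed maximum length of a status */
    best = 0;                      /* highest count seen so far in this row */
    do i = 1 to dim(m);
        n = 0;
        /* count how many months carry the same status as month i */
        do j = 1 to dim(m);
            if m(j) = m(i) then n = n + 1;
        end;
        /* ">=" lets a later month win a tie, favoring the more recent status */
        if n >= best then do;
            best = n;
            frequent = m(i);
        end;
    end;
    keep customer_id frequent;
run;
```

For the sample row this should leave frequent = Active, since Active is the only status that appears twice. The nested loop makes 25 comparisons per row, which is fine for 5 months; for many more columns the transpose-and-summarize approach above is likely to scale better.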

Answer 1: (score: 0)

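A rather poor solution, but it does work when there are only 2 different values and, as here, 5 months. If the number of Active values is >= 3, then Active is the most common value: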

select customer_id, case when (case when month1 = 'Active' then 1 else 0 end +
                               case when month2 = 'Active' then 1 else 0 end +
                               case when month3 = 'Active' then 1 else 0 end +
                               case when month4 = 'Active' then 1 else 0 end +
                               case when month5 = 'Active' then 1 else 0 end) >= 3
                             then 'Active' else 'Waiting' end
from tablename

Another way, using UNION ALL:

select customer_id, month, count(*) as cnt
from
(
    select customer_id, month1 as month from tablename
    UNION ALL
    select customer_id, month2 from tablename
    UNION ALL
    select customer_id, month3 from tablename
    UNION ALL
    select customer_id, month4 from tablename
    UNION ALL
    select customer_id, month5 from tablename
) t
group by customer_id, month
-- highest count first, so the most frequent status is the row that is kept
order by cnt desc
-- keeps one row in total, so add a WHERE on customer_id to pick the customer
fetch first 1 row only

FETCH FIRST is ANSI SQL; some DBMS products use TOP or LIMIT instead.
