How to create a rank-based column from conditions on multiple columns in an R data frame

Asked: 2018-01-02 05:27:49

Tags: r if-statement dataframe multiple-columns cumsum

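I have a data frame with 3 columns, and I want to create a 4th column based on the values in the other columns. To create the new_rank column, every user starts at 1, and whenever matric_1 is greater than 15 and matric_2 is greater than 20, the rank of the subsequent rows increases by 1.

I think I need to use the cumsum function in R, but I am struggling with the ifelse condition. The data frame code is as follows: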

df <- data.frame(
  user_id  = c("a","a","a","a","b","b","b","c","c","c","c","c","d","d","d","d"),
  matric_1 = c(10,23,4,5,17,5,40,1,2,18,19,5,18,2,19,2),
  matric_2 = c(10,25,10,13,21,10,7,3,4,22,21,4,23,4,21,4),
  new_rank = c(1,1,2,2,1,2,2,1,1,1,2,3,1,2,2,3)  # desired output
)

2 Answers:

Answer 0: (score: 1)

Based on @akrun's solution, but using data.table:

library('data.table')
setDT(df)
df[, rank := shift( x = cumsum(matric_1 > 15 & matric_2 > 20) + 1,
                    fill = 1, 
                    type = "lag" ), 
   by = user_id]
df
#     user_id matric_1 matric_2 new_rank rank
#  1:       a       10       10        1    1
#  2:       a       23       25        1    1
#  3:       a        4       10        2    2
#  4:       a        5       13        2    2
#  5:       b       17       21        1    1
#  6:       b        5       10        2    2
#  7:       b       40        7        2    2
#  8:       c        1        3        1    1
#  9:       c        2        4        1    1
# 10:       c       18       22        1    1
# 11:       c       19       21        2    2
# 12:       c        5        4        3    3
# 13:       d       18       23        1    1
# 14:       d        2        4        2    2
# 15:       d       19       21        2    2
# 16:       d        2        4        3    3

Data:

df <- data.frame(
  user_id  = c("a","a","a","a","b","b","b","c","c","c","c","c","d","d","d","d"),
  matric_1 = c(10,23,4,5,17,5,40,1,2,18,19,5,18,2,19,2),
  matric_2 = c(10,25,10,13,21,10,7,3,4,22,21,4,23,4,21,4),
  new_rank = c(1,1,2,2,1,2,2,1,1,1,2,3,1,2,2,3)
)
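
To see why the lag is needed, it may help to look at the intermediate vectors for a single user. The following is a minimal base-R sketch for user "a" only; the names `hit`, `bumped` and `rank_a` are introduced here for illustration and do not appear in the answer above.

```r
# Values for user "a", copied from the data above
matric_1 <- c(10, 23, 4, 5)
matric_2 <- c(10, 25, 10, 13)

# TRUE on rows where both thresholds are exceeded
hit <- matric_1 > 15 & matric_2 > 20

# Running count of hits, offset by 1 so that ranks start at 1
bumped <- cumsum(hit) + 1

# Shift down by one row so the increase applies to the rows after a hit;
# the first row has no predecessor and is filled with 1
rank_a <- c(1, head(bumped, -1))

print(hit)     # FALSE  TRUE FALSE FALSE
print(bumped)  # 1 2 2 2
print(rank_a)  # 1 1 2 2
```

Without the shift, the row that satisfies the condition would itself receive the higher rank, whereas the question asks for the increase to start on the following row.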

Answer 1: (score: 1)

After grouping by 'user_id', create 'new_rank1' by taking the cumsum of the lag of the logical vector.
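
The description above can be sketched in base R as follows. This is an illustrative version under stated assumptions, not necessarily the answerer's own code: it uses `ave()` for the grouping, and `hit` and `lag1` are names introduced here (`lag1` is a hypothetical helper, not a library function).

```r
df <- data.frame(
  user_id  = c("a","a","a","a","b","b","b","c","c","c","c","c","d","d","d","d"),
  matric_1 = c(10,23,4,5,17,5,40,1,2,18,19,5,18,2,19,2),
  matric_2 = c(10,25,10,13,21,10,7,3,4,22,21,4,23,4,21,4),
  new_rank = c(1,1,2,2,1,2,2,1,1,1,2,3,1,2,2,3)
)

# Logical vector: TRUE where both thresholds are exceeded
hit <- df$matric_1 > 15 & df$matric_2 > 20

# Hypothetical helper: shift a vector down by one position, padding with 0
lag1 <- function(x) c(0, head(x, -1))

# Within each user_id, lag the hits by one row and take the running sum;
# adding 1 makes every user start at rank 1
df$new_rank1 <- ave(as.numeric(hit), df$user_id,
                    FUN = function(x) cumsum(lag1(x))) + 1

# Compare against the expected column from the question
print(all(df$new_rank1 == df$new_rank))
```

Taking the lag before the cumulative sum (rather than after, as in the data.table answer) gives the same ranks on this data, because the lagged hit is padded with 0 on the first row of each group.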