Equivalent of the forvalues command

Time: 2019-02-20 16:33:35

Tags: r stata

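I have long-format data with multiple event outcomes per ID, like this (the actual data contains many IDs; this is a simplified version).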

data <- data.frame(id=c(rep(1, 4), rep(2, 3), rep(3, 3)), 
               event=c(1, 1, 0, 0, 1, 1, 0, 1, 1, 0), 
               eventcount=c(1, 2, 0, 0, 1, 2, 0, 1, 2, 0), 
               firstevent=c(1, 0, 0, 0, 1, 0, 0, 1, 0, 0), 
               time=c(100, 250, 150, 300, 240, 400, 150, 350, 700, 200) )

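I want to pick up events that occur within a specific time window after the first event. In this case, I want to detect a second event within 100 to 150 days of the first event. In Stata, we can use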

gen event2=1 if id==id[_n-1]& time-time[_n-1]>100 & time-time[_n-1]<=150 & firstevent[_n-1]==1 & firstevent==0 & event==1
forvalues i = 2/3 {
    replace event2=1 if id==id[_n-`i'] & time-time[_n-`i']>100 & time-time[_n-`i']<=150 & firstevent[_n-`i']==1 & firstevent==0 & event==1
}

In this case, the result would be

data_after <- data.frame(id=c(rep(1, 4), rep(2, 3), rep(3, 3)), 
                     event=c(1, 1, 0, 0, 1, 1, 0, 1, 1, 0),  
                     eventcount=c(1, 2, 0, 0, 1, 2, 0, 1, 2, 0),  
                     firstevent=c(1, 0, 0, 0, 1, 0, 0, 1, 0, 0), 
                     time=c(100, 250, 150, 300, 240, 400, 150, 350, 700, 200),  
                     event2=c(NA, 1, NA, NA, NA, NA, NA, NA, NA, NA))

How should I write this in R?

1 Answer:

Answer 0: (score: 0)

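# Gap in time between each row and the previous row of the same id
# (0 for the first row of each id)
intervals = ave(
    data$time,
    data$id,
    FUN = function(x)
        c(0, diff(x))
)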
intervals
# [1]    0  150 -100  150    0  160 -250    0  350 -500

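# TRUE where the gap is between 100 and 150 days and the row is an event.
# Note: this uses >= 100, whereas the Stata code uses a strict > 100.
meets_duration_requirement = ave(
    intervals,
    data$id,
    FUN = function(x)
        x >= 100 & x <= 150
) == 1 & data$event == 1

# Keep only the qualifying row that is the second row within its id
choose_second = meets_duration_requirement == 1 &
    ave(meets_duration_requirement, data$id, FUN = seq_along) == 2 #if you want third event, change this to 3

# Build event2: 1 for the chosen rows, NA everywhere else
replace(x = rep(NA, NROW(data)),
        list = choose_second,
        1)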
# [1] NA  1 NA NA NA NA NA NA NA NA
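
If you want something closer to a line-by-line translation of the Stata code, a plain `for` loop can play the role of `forvalues`. The following is only a sketch under a few assumptions: `lag_by` is a hypothetical helper (not a base R function) that mimics Stata's `x[_n-i]` by returning `NA` where no earlier row exists, the rows are already in the order Stata would see them, and the loop runs over lags 1 to 3 to cover both the initial `gen` and the `forvalues i = 2/3` block.

```r
data <- data.frame(id = c(rep(1, 4), rep(2, 3), rep(3, 3)),
                   event = c(1, 1, 0, 0, 1, 1, 0, 1, 1, 0),
                   eventcount = c(1, 2, 0, 0, 1, 2, 0, 1, 2, 0),
                   firstevent = c(1, 0, 0, 0, 1, 0, 0, 1, 0, 0),
                   time = c(100, 250, 150, 300, 240, 400, 150, 350, 700, 200))

# Hypothetical helper: value i rows earlier, like x[_n-i] in Stata.
# Rows with no earlier value get NA.
lag_by <- function(x, i) c(rep(NA, i), x)[seq_along(x)]

data$event2 <- NA_real_
for (i in 1:3) {
    gap <- data$time - lag_by(data$time, i)
    hit <- data$id == lag_by(data$id, i) &
        gap > 100 & gap <= 150 &
        lag_by(data$firstevent, i) == 1 &
        data$firstevent == 0 & data$event == 1
    # which() drops NA comparisons, so out-of-range lags never match
    data$event2[which(hit)] <- 1
}
data$event2
# [1] NA  1 NA NA NA NA NA NA NA NA
```

Unlike the `ave` approach above, this compares each row with the first-event row itself (up to three rows back) and keeps the strict `> 100` bound from the Stata code. If the first event can be more than three rows back, the upper limit of the loop would need to be raised.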