Conditional cumulative sum within the same row in R

Time: 2018-11-27 10:03:32

Tags: r if-statement excel-formula dataset

I have a dataset like this:

dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
              Col1 = rep(c("B","S","S","B"), 4),
              Col2 = rep(c(1,2,3,4), 4),
              Col3 = rep(c(0.1,0.2,0.3,0.4), 4))

I am trying to create a new column, Col4, as shown below:

dat1 <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
               Col1 = rep(c("B","S","S","B"), 4),
               Col2 = rep(c(1,2,3,4), 4),
               Col3 = rep(c(0.1,0.2,0.3,0.4), 4),
               Col4 = rep(c(1, 0.8, 1.26, 4), 4))

This is what I have tried so far:

library(dplyr)

d1 <- dat %>% 
  group_by(Col0) %>% 
  mutate(Col4 = if_else(Col1 == 'B', Col2,
                        if_else(Col1 == 'S' & lag(Col1 == "B"), lag(Col2)- Col3*lag(Col2), 0)))
d1

The result I get is not the Col4 I want. The conditions for computing Col4 are:

 If Col1 is B, take the value of Col2 as it is.
 If Col1 is S and the previous value of Col1 is B, then 1 - (0.2 * 1), which equals 0.8.
 If Col1 is S and the previous value of Col1 is also S, then (1 + 0.8) - ((1 + 0.8) * 0.3), which equals 1.26.

Basically, it is like first taking the difference, then taking a cumulative sum that includes that difference, and so on.

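For now I am using a simple example to show what I want to achieve. The actual dataset has more than 1 million observations and thousands of groups. To make matters worse, the combination of "B" and "S" varies: in some groups it is B, B, S, S, and so on.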

Any help would be greatly appreciated. I have tried approaches other than if_else() and looked at many conditional cumulative sum questions, but to no avail.

I think this could be done easily in Excel with the SUMIF() function, but I need to do it in R.

1 answer:

Answer 0: (score: 0)

It looks like you did not finish the if_else:

dat <- data.frame(Col0 =rep(c("grp1","grp2","grp3", "grp4"), each = 4),
          Col1 = rep(c("B","S","S","B"), 4),
          Col2 = rep(c(1,2,3,4), 4),
          Col3 = rep(c(0.1,0.2,0.3,0.4), 4))
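library(dplyr)

# 1-(0.2*1) and 1.26 are the constants from the question's example, so this
# reproduces the sample output but does not generalise to other values.
d1 <- dat %>% 
   group_by(Col0) %>% 
   mutate(Col4 = if_else(Col1 == 'B', Col2,
                    if_else(Col1 == 'S' & lag(Col1) == "B", 1-(0.2*1),
                            if_else(Col1 == 'S' & lag(Col1) == 'S',1.26,0))))
d1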
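
Since each S value depends on the Col4 values computed before it, one way to avoid hard-coded constants is to walk through each group row by row. The following is only a minimal sketch in base R, not a tested solution for the full dataset. The helper name running_col4 is hypothetical, and it assumes that the running total restarts at every B row. The question does not say what should happen after a second B (for example in B, B, S, S), so adjust that part if the total should carry over instead.

```r
# Hypothetical helper (not from the original post): computes Col4 for ONE group.
# Assumption: the running total restarts at every "B" row.
running_col4 <- function(type, value, rate) {
  out <- numeric(length(type))
  total <- 0  # sum of Col4 values since the most recent "B" row, inclusive
  for (i in seq_along(type)) {
    if (type[i] == "B") {
      out[i] <- value[i]              # B: take Col2 as it is
      total <- out[i]                 # restart the running total (assumption)
    } else {
      out[i] <- total * (1 - rate[i]) # S: total - total * Col3
      total <- total + out[i]         # include the new value in the total
    }
  }
  out
}

dat <- data.frame(Col0 = rep(c("grp1", "grp2", "grp3", "grp4"), each = 4),
                  Col1 = rep(c("B", "S", "S", "B"), 4),
                  Col2 = rep(c(1, 2, 3, 4), 4),
                  Col3 = rep(c(0.1, 0.2, 0.3, 0.4), 4),
                  stringsAsFactors = FALSE)

# Apply the helper group by group; split() keeps the row order within a group.
dat$Col4 <- NA_real_
for (rows in split(seq_len(nrow(dat)), dat$Col0)) {
  dat$Col4[rows] <- running_col4(dat$Col1[rows], dat$Col2[rows], dat$Col3[rows])
}
dat
```

On the sample data this gives 1, 0.8, 1.26, 4 in every group, matching the desired Col4. The loop visits each row once, but whether that is fast enough for more than 1 million rows has not been checked here.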