Getting information from the rows above and below

Time: 2014-10-22 07:53:37

Tags: r

I am new to R and don't know how to get the right output from my data:

My data:

row1    101 woody   5
row2    101 woody   0
row3    111 kiln    23
row4    200 weez    2
row5    315 rowt    0

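For example, in row 3 the value in column 3 is greater than 0, and its column 1 value (111) lies between the column 1 values of the rows above (101) and below (200). So the condition is: for any row, the value in column 3 must be greater than 0 and its column 1 value must lie between the column 1 values of the rows above and below.

Required output: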

        col1 col2   col3
row1    101 woody   After_none
row2    101 woody   0
row3    111 kiln    Between_woody_weez
row4    200 weez    Between_kiln_rowt
row5    315 rowt    0

I would be glad if someone could help me. Thanks.

Added more data to run Akru's code:

col1    col2    col3
255 mwu 21
77031   netw    0
77031   netw    0
77031   netw    0
82513   cuu 91
88206   cxum    0
88206   cxum    0
88206   cxum    0
188450  xaii    25
188450  xaii    0
188450  xaii    0
188450  xaii    0
188450  xaii    0
199800  aau 0

The code runs on this data sample, but the output is not quite right:

col1 col2 col3              colN
255  mwu   21        After_none
77031 netw    0              <NA>
77031 netw    0              <NA>
77031 netw    0              <NA>
82513  cuu   91  Between_mwu_netw
88206 cxum    0              <NA>
88206 cxum    0              <NA>
88206 cxum    0              <NA>
188450 xaii   25 Between_netw_cxum
188450 xaii    0              <NA>
188450 xaii    0              <NA>
188450 xaii    0              <NA>
188450 xaii    0              <NA>
199800  aau 0                 <NA>

But the expected output is:

col1 col2 col3              
255  mwu   21        
77031 netw    0              
77031 netw    0              
77031 netw    0              
82513  Between_mwu_cxum   91
88206 cxum    0              
88206 cxum    0              
88206 cxum    0              
188450 Between_cxum_aau   25 
188450 xaii    0              
188450 xaii    0              
188450 xaii    0              
188450 xaii    0   
199800  aau 0           

Or the same result in an extra column "colN" would be fine too.

1 Answer:

Answer 0 (score: 0)

One approach is:

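  indx <- df$col3 > 0                      # rows that need a label
  df$colN <- df$col3
  df$colN[indx] <- vapply(which(indx), function(i) {
      # rows above and below row i; seq_len() keeps these empty at the edges
      i1 <- seq_len(i - 1)
      i2 <- seq_len(nrow(df) - i) + i
      indx1 <- df$col1[i] > df$col1[i1]    # rows above with a smaller col1
      indx2 <- df$col1[i] < df$col1[i2]    # rows below with a larger col1
      if (any(indx1) && any(indx2)) {
          # the nearest such row on each side supplies the name
          paste("Between", df$col2[i1][max(which(indx1))],
                df$col2[i2][min(which(indx2))], sep = "_")
      } else as.character(df$col3[i])      # no neighbour on one side: keep the value
  }, character(1))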

  df
  #     col1 col2 col3              colN
  #1     255  mwu   21                21
  #2   77031 netw    0                 0
  #3   77031 netw    0                 0
  #4   77031 netw    0                 0
  #5   82513  cuu   91 Between_netw_cxum
  #6   88206 cxum    0                 0
  #7   88206 cxum    0                 0
  #8   88206 cxum    0                 0
  #9  188450 xaii   25  Between_cxum_aau
  #10 188450 xaii    0                 0
  #11 188450 xaii    0                 0
  #12 188450 xaii    0                 0
  #13 188450 xaii    0                 0
  #14 199800  aau    0                 0
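
To make the lookup easier to follow, here is a minimal sketch that traces the same steps by hand for a single row (row 9, where several rows below share the same col1). The data frame is rebuilt so the snippet runs on its own; the names `above`, `below`, `prev_row` and `next_row` are illustrative and do not appear in the answer's code.

```r
# Same sample as the df used in this answer, rebuilt so the snippet is standalone
df <- data.frame(
  col1 = c(255, 77031, 77031, 77031, 82513, 88206, 88206, 88206,
           188450, 188450, 188450, 188450, 188450, 199800),
  col2 = c("mwu", "netw", "netw", "netw", "cuu", "cxum", "cxum", "cxum",
           "xaii", "xaii", "xaii", "xaii", "xaii", "aau"),
  col3 = c(21, 0, 0, 0, 91, 0, 0, 0, 25, 0, 0, 0, 0, 0),
  stringsAsFactors = FALSE
)

i <- 9                                   # row with col1 = 188450, col3 = 25
above <- seq_len(i - 1)                  # row numbers 1..8
below <- seq_len(nrow(df) - i) + i       # row numbers 10..14

smaller_above <- df$col1[i] > df$col1[above]
larger_below  <- df$col1[i] < df$col1[below]

# nearest row above with a smaller col1: the last TRUE in smaller_above
prev_row <- above[max(which(smaller_above))]
# nearest row below with a larger col1: the first TRUE in larger_below
next_row <- below[min(which(larger_below))]

label <- paste("Between", df$col2[prev_row], df$col2[next_row], sep = "_")
print(label)
```

Rows 10 to 13 have the same col1 as row 9, so they are skipped and the row below comes from row 14 ("aau").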

Update

If you want to change col2 instead, do the following:

 df$col2[indx] <- vapply(which(indx), function(i) {
     # rows above and below row i; seq_len() keeps these empty at the edges
     i1 <- seq_len(i - 1)
     i2 <- seq_len(nrow(df) - i) + i
     indx1 <- df$col1[i] > df$col1[i1]
     indx2 <- df$col1[i] < df$col1[i2]
     if (any(indx1) && any(indx2)) {
         paste("Between", df$col2[i1][max(which(indx1))],
               df$col2[i2][min(which(indx2))], sep = "_")
     } else df$col2[i] #replaced here
 }, character(1))

 df
 #    col1              col2 col3
 #1     255               mwu   21
 #2   77031              netw    0
 #3   77031              netw    0
 #4   77031              netw    0
 #5   82513 Between_netw_cxum   91
 #6   88206              cxum    0
 #7   88206              cxum    0
 #8   88206              cxum    0
 #9  188450  Between_cxum_aau   25
 #10 188450              xaii    0
 #11 188450              xaii    0
 #12 188450              xaii    0
 #13 188450              xaii    0
 #14 199800               aau    0
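
One edge case worth checking in this kind of row-range indexing is the first or last row having col3 > 0. In R the `:` operator counts downward when its left side is larger than its right side, whereas `seq_len()` returns an empty vector. The sketch below uses a hypothetical three-element vector `x` standing in for col1 to show the difference.

```r
x <- c(10, 20, 30)                 # hypothetical col1 values
n <- length(x)

down_first <- 1:(1 - 1)            # c(1, 0): counts down, not empty
safe_first <- seq_len(1 - 1)       # integer(0): no rows above row 1

down_last <- (n + 1):n             # c(4, 3): index 4 is past the end
safe_last <- seq_len(n - n) + n    # empty: no rows below the last row

x[down_last]                       # NA 30

res_down <- any(x[n] < x[down_last])   # NA, which if () cannot evaluate
res_safe <- any(x[n] < x[safe_last])   # FALSE, a clean "no row below"

print(res_down)
print(res_safe)
```

With empty index vectors, `any()` simply returns `FALSE`, so the `if` test stays well defined at both ends of the data frame.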

Data

 df <-  structure(list(col1 = c(255L, 77031L, 77031L, 77031L, 82513L, 
 88206L, 88206L, 88206L, 188450L, 188450L, 188450L, 188450L, 188450L, 
 199800L), col2 = c("mwu", "netw", "netw", "netw", "cuu", "cxum", 
 "cxum", "cxum", "xaii", "xaii", "xaii", "xaii", "xaii", "aau"
 ), col3 = c(21L, 0L, 0L, 0L, 91L, 0L, 0L, 0L, 25L, 0L, 0L, 0L, 
 0L, 0L)), .Names = c("col1", "col2", "col3"), class = "data.frame", 
 row.names = c(NA,-14L))
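
As a cross-check, the same lookup can be wrapped in a small function and run on the five-row sample from the question. This is only a sketch: `label_between` is a hypothetical helper name, and mapping the question's three data columns to `col1`, `col2` and `col3` is an assumption.

```r
# Hypothetical helper wrapping the same lookup; not part of the original answer
label_between <- function(df) {
  out <- as.character(df$col3)
  for (i in which(df$col3 > 0)) {
    above <- seq_len(i - 1)
    below <- seq_len(nrow(df) - i) + i
    prev <- above[df$col1[above] < df$col1[i]]   # rows above with smaller col1
    nxt  <- below[df$col1[below] > df$col1[i]]   # rows below with larger col1
    if (length(prev) > 0 && length(nxt) > 0) {
      out[i] <- paste("Between", df$col2[max(prev)], df$col2[min(nxt)], sep = "_")
    }
  }
  out
}

# First sample from the question (column names assumed)
small <- data.frame(
  col1 = c(101, 101, 111, 200, 315),
  col2 = c("woody", "woody", "kiln", "weez", "rowt"),
  col3 = c(5, 0, 23, 2, 0),
  stringsAsFactors = FALSE
)
small$colN <- label_between(small)
print(small)
```

Rows 3 and 4 come out as Between_woody_weez and Between_kiln_rowt, as in the question's required output. Row 1 keeps its original value, because this approach does not produce the "After_none" label the question asks for.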