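Data manipulation question

Time: 2013-01-17 05:52:58

Tags: r

I want to create two new columns named prey and preyrow. prey is the y value of the next row within the same x value, and preyrow is the row value of that next row. The last row of each x group has no next row, so both columns should be NA there.

The original table looks like this: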

   x           y row
1  1  0.60697546   1
2  1 -0.68600911   2
3  1 -0.53499454   3
4  1  0.05591587   4
5  2  0.11937963   5
6  2 -0.39951846   6
7  2  0.97430697   7
8  3  0.42852135   8
9  3  0.27695563   9
10 4 -0.29530769  10

I would like the output table to look like this:

   x           y row        prey preyrow
1  1  0.60697546   1 -0.68600911       2
2  1 -0.68600911   2 -0.53499454       3
3  1 -0.53499454   3  0.05591587       4
4  1  0.05591587   4          NA      NA
5  2  0.11937963   5 -0.39951846       6
6  2 -0.39951846   6  0.97430697       7
7  2  0.97430697   7          NA      NA
8  3  0.42852135   8  0.27695563       9
9  3  0.27695563   9          NA      NA
10 4 -0.29530769  10          NA      NA

1 answer:

Answer 0 (score: 2)

I think this is what you need (using data.table):

library(data.table)

# Rebuild the example data from the question
df <- structure(list(x = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L),
      y = c(0.60697546, -0.68600911, -0.53499454, 0.05591587, 0.11937963,
      -0.39951846, 0.97430697, 0.42852135, 0.27695563, -0.29530769),
      row = 1:10), .Names = c("x", "y", "row"), class = "data.frame",
      row.names = c("1", "2", "3", "4", "5", "6", "7", "8", "9", "10"))
dt <- data.table(df, key="x")
# Per x group, take rows 2..n+1 of .SD (columns y and row); index n+1 is
# past the end of the group, so it yields a row of NA
dt.out <- dt[, .SD[2:(nrow(.SD)+1)], by=x]
setnames(dt.out, c("x", "prey", "preyrow"))
# Drop the duplicate x column and bind the shifted columns onto the original
dt.out <- cbind(dt, subset(dt.out, select=-c(x)))

> dt.out

    x           y row        prey preyrow
 1: 1  0.60697546   1 -0.68600911       2
 2: 1 -0.68600911   2 -0.53499454       3
 3: 1 -0.53499454   3  0.05591587       4
 4: 1  0.05591587   4          NA      NA
 5: 2  0.11937963   5 -0.39951846       6
 6: 2 -0.39951846   6  0.97430697       7
 7: 2  0.97430697   7          NA      NA
 8: 3  0.42852135   8  0.27695563       9
 9: 3  0.27695563   9          NA      NA
10: 4 -0.29530769  10          NA      NA
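
As a possible alternative, newer data.table releases (1.9.6 and later, as far as I know) provide shift(), which can express the same per-group "next value" without the cbind step. This is only a minimal sketch, assuming such a version is installed; it rebuilds dt from the question's data and adds the two columns by reference:

```r
library(data.table)

# Same example data as in the question
dt <- data.table(
  x   = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L),
  y   = c(0.60697546, -0.68600911, -0.53499454, 0.05591587, 0.11937963,
          -0.39951846, 0.97430697, 0.42852135, 0.27695563, -0.29530769),
  row = 1:10
)

# Within each x group, take the next y and the next row value.
# The last row of a group has no successor, so shift() fills it with NA.
dt[, `:=`(prey    = shift(y,   type = "lead"),
          preyrow = shift(row, type = "lead")),
   by = x]

print(dt)
```

This should give the same prey and preyrow columns as the output above, with NA at the end of every x group.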