Question

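I could not find an answer in other posts, and where they deal with a similar topic I did not understand the answers, because I am relatively new to R and to programming in general. I am working with the following survey output X (excerpt):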

A1B1         A1B2       A1B3          A1B4          A2B1          A2B2        A2B3      ...
-0.37014356  1.08841141 -0.126574243 -0.59169360  1.682673457 -0.427706432 -0.76091938  ...
3.03017573  1.39812421  0.243516558 -4.67181650 -0.378640756  2.039940436 -0.40785893   ...
3.50183121  1.51249433 -0.775449944 -4.23887560 -0.456911873  0.431838943  0.91108052   ...

...

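For each row of X, I want to calculate the range, i.e. the difference between the maximum and the minimum (diff(range(X[i,n:m])), where n:m is 1:4), of the first four columns, then of the second four (5:8) and the third four (9:12), and put the results into a second matrix with i rows and 3 columns.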

E.g. for the first row and the first four columns, it would be 1.08841141 + 0.59169360 = 1.68010501.

To do this, I created a new matrix and tried to fill it with the values:

newmatrix <- matrix(0,nrow(X),3)
newmatrix[1:nrow(X),1] <- for (i in (1:nrow(X))) {diff(range(X[i,1:4]))}  
newmatrix[1:nrow(X),2] <- for (i in (1:nrow(X))) {diff(range(X[i,5:8]))}   
newmatrix[1:nrow(X),3] <- for (i in (1:nrow(X))) {diff(range(X[i,9:12]))}

I get the following error:

Error in newmatrix[1:nrow(RBetas), 1] <- for (i in (1:nrow(RBetas))) { : 
  number of items to replace is not a multiple of replacement length

Thanks for your help!

Answer 1

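Assuming the column blocks are defined by the first two characters of the column names, i.e. A1, A2, we can extract those two characters with substr and use them as the grouping index for split. We can then get the result either with apply combined with range and diff, or with pmax and pmin.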

  indx <- substr(colnames(df), 1,2)

If the grouping is based on position rather than on the column names, this should also work:

  indx <- (1:ncol(df)-1)%/%4 +1
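
To make the two grouping indexes concrete, here is a small sketch of what each one evaluates to for the seven column names in the example, and what split then does with it. The variable names `cn`, `by_name` and `by_pos` are only illustrative and do not appear in the answer.

```r
# Column names from the example data
cn <- c("A1B1", "A1B2", "A1B3", "A1B4", "A2B1", "A2B2", "A2B3")

# Grouping by the first two characters of each name
by_name <- substr(cn, 1, 2)

# Grouping by position, in consecutive blocks of 4 columns
by_pos <- (seq_along(cn) - 1) %/% 4 + 1

# split() turns either index into a list of column positions per block
blocks <- split(seq_along(cn), by_name)
print(by_name)
print(by_pos)
print(blocks)
```

Both indexes put columns 1 to 4 in the first block and columns 5 to 7 in the second; they only differ in how the blocks are labelled.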


  res1 <- sapply(split(seq_len(ncol(df)), indx),
               function(i) do.call(pmax,df[,i, drop=FALSE])-
                                 do.call(pmin, df[,i, drop=FALSE]))

Or

 res2 <- sapply(split(seq_len(ncol(df)), indx),
            function(i) apply(df[,i, drop=FALSE], 1,
                          function(x) diff(range(x))) )



 identical(res1, res2)
 #[1] TRUE
 res1
 #        A1       A2
 #[1,] 1.680105 2.443593
 #[2,] 7.701992 2.447799
 #[3,] 7.740707 1.367992
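
The pmax/pmin variant works because both functions operate element-wise across their arguments, so passing the columns of a data frame as separate arguments yields the row-wise maximum and minimum. A minimal sketch on a toy data frame (the values and the name `toy` are made up for illustration):

```r
# Toy data frame with illustrative values only
toy <- data.frame(a = c(1, 5, 3), b = c(4, 2, 6))

# do.call() passes each column as a separate argument, i.e. pmax(a, b)
row_max <- do.call(pmax, toy)
row_min <- do.call(pmin, toy)
row_range <- row_max - row_min

# Same result computed row by row with apply()
row_range_apply <- apply(toy, 1, function(x) diff(range(x)))
print(row_range)
```

The pmax/pmin route avoids an explicit pass over the rows, which tends to matter when the data has many rows and few columns per block.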

Or, using your code

 newmatrix <- matrix(0, nrow(df), 2) #here the example dataset is only 7 columns
 for(i in (1:nrow(df))) newmatrix[i,1] <-  diff(range(df[i,1:4]))
 for(i in (1:nrow(df))) newmatrix[i,2] <-  diff(range(df[i,5:7]))
 newmatrix
 #    [,1]     [,2]
 #[1,] 1.680105 2.443593
 #[2,] 7.701992 2.447799
 #[3,] 7.740707 1.367992
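
The attempt in the question fails because a for loop in R does not return the values computed in its body: the loop expression itself evaluates to NULL, so there is nothing to assign into the matrix column. A minimal sketch of the difference, using a toy matrix with made-up values and vapply as one possible way to collect one number per row:

```r
# Toy matrix (illustrative values only)
X <- matrix(c(1, 5, 2, 8,
              4, 0, 9, 3), nrow = 2, byrow = TRUE)

# A for loop evaluates to NULL, whatever its body computes
loop_value <- for (i in seq_len(nrow(X))) diff(range(X[i, 1:4]))
print(is.null(loop_value))

# vapply() returns one number per row, which can be assigned in one step
newmatrix <- matrix(0, nrow(X), 1)
newmatrix[, 1] <- vapply(seq_len(nrow(X)),
                         function(i) diff(range(X[i, 1:4])),
                         numeric(1))
print(newmatrix)
```

Assigning inside the loop body, as in the corrected code above, is the other way around the same problem.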

If you have multiple column blocks, you could try a double for loop

 lst <- split(seq_len(ncol(df)), indx) #keep the columns to group in a `list`
 newmatrix <- matrix(0, nrow(df), length(lst)) # one column per block
 for(i in 1:nrow(df)){
   for(j in seq_along(lst)){
     newmatrix[i,j] <- diff(range(df[i, lst[[j]]]))
   }
 }

newmatrix
#      [,1]     [,2]
#[1,] 1.680105 2.443593
#[2,] 7.701992 2.447799
#[3,] 7.740707 1.367992
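
If the input is a plain numeric matrix rather than a data frame, the same block-wise idea can be wrapped in a small helper. This is only a sketch: `block_ranges` and its `size` argument are hypothetical names that are not part of the answer, and it assumes blocks are defined purely by column position.

```r
# Hypothetical helper: per-row range (max - min) for consecutive column blocks
block_ranges <- function(X, size = 4) {
  X <- as.matrix(X)
  # Block id for every column: 1,1,1,1,2,2,... (the last block may be shorter)
  indx <- (seq_len(ncol(X)) - 1) %/% size + 1
  blocks <- split(seq_len(ncol(X)), indx)
  # One result column per block, one entry per row of X
  out <- vapply(blocks,
                function(cols) apply(X[, cols, drop = FALSE], 1,
                                     function(x) diff(range(x))),
                numeric(nrow(X)))
  matrix(out, nrow = nrow(X), dimnames = list(NULL, names(blocks)))
}

# The three rows and seven columns shown in the question
X <- rbind(
  c(-0.37014356, 1.08841141, -0.126574243, -0.59169360,
     1.682673457, -0.427706432, -0.76091938),
  c( 3.03017573, 1.39812421,  0.243516558, -4.67181650,
    -0.378640756,  2.039940436, -0.40785893),
  c( 3.50183121, 1.51249433, -0.775449944, -4.23887560,
    -0.456911873,  0.431838943,  0.91108052)
)
res <- block_ranges(X)
print(res)
```

With the full 12-column survey output, the same call would return three columns, one per block of four.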

Data

df <- structure(list(A1B1 = c(-0.37014356, 3.03017573, 3.50183121), 
A1B2 = c(1.08841141, 1.39812421, 1.51249433), A1B3 = c(-0.126574243, 
0.243516558, -0.775449944), A1B4 = c(-0.5916936, -4.6718165, 
-4.2388756), A2B1 = c(1.682673457, -0.378640756, -0.456911873
), A2B2 = c(-0.427706432, 2.039940436, 0.431838943), A2B3 = c(-0.76091938, 
-0.40785893, 0.91108052)), .Names = c("A1B1", "A1B2", "A1B3", 
"A1B4", "A2B1", "A2B2", "A2B3"), class = "data.frame", row.names = c(NA, 
-3L))
