Contingency tables in a sparse matrix

Asked: 2016-12-27 14:34:13

Tags: r sparse-matrix contingency

I have a large sparse matrix, and I want to build a contingency table for every pair of its columns. For example, say my sparse matrix is Mat:

D1   D2  D3  D4  D5  ..  Dn
1    0   1   0   0   ..  0
0    1   1   1   1   ..  1
..   ..  ..  ..  ..  ..  ..
1    0   1   0   1   ..  1

I need a contingency table for every combination of Di and Dj, i.e. the tables for (D1,D2), (D1,D3), (D1,D4), ..., (D1,Dn), (D2,D3), (D2,D4), ..., (D2,Dn), ..., (Dn-1,Dn).

Structure of each contingency table:

 r1  r2
 r3  r4



#where r1 is total number of 1's in Di column 
#         r2 is total number of 1's in Di AND Dj column
#         r3 is total number of 1's in Di AND Dj column 
#         r4 is total number of 1's in Dj column

Also:

for each i in (1:n-1) {
    for each j in (i+1 : n) {
        Calculate r1,r2,r3,r4
        create contingency table for Di and Dj
        apply Fisher's exact test to that table
    }
}

I am looking for a fast implementation, because this currently takes more than 2-3 days.

1 answer:

Answer 0 (score: 1)

Here is one idea for getting all the 2 x 2 matrices:

fun1 <- function(x, y) {
  matrix(data = c(sum(m1[, x]), sum(m1[, c(x, y)]), sum(m1[, c(x, y)]), sum(m1[, y])),
         nrow = 2, ncol = 2)
}
# where m1 is your original matrix

ind1 <- combn(1:ncol(m1),2)[1,]
ind2 <- combn(1:ncol(m1),2)[2,]
final.list <- Map(fun1, ind1, ind2)

head(final.list, 2)
#[[1]]
#     [,1] [,2]
#[1,]    3    6
#[2,]    6    3

#[[2]]
#     [,1] [,2]
#[1,]    3    6
#[2,]    6    3
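The last step in the question is to run Fisher's exact test on each table. A minimal sketch of that step, assuming the tables are built in the same r1..r4 layout as `fun1` above: the small `m1` here is made-up illustration data (not the asker's matrix), and `idx` and `p.values` are hypothetical names. `fisher.test` comes from base R's stats package.

```r
# Hypothetical example data: 6 rows, 4 binary columns
m1 <- matrix(c(1, 0, 1, 0, 1, 1,
               0, 1, 1, 1, 0, 0,
               1, 1, 0, 0, 1, 0,
               0, 0, 1, 1, 1, 1),
             nrow = 6,
             dimnames = list(NULL, paste0("D", 1:4)))

# Same table layout as fun1 above
fun1 <- function(x, y) {
  matrix(c(sum(m1[, x]), sum(m1[, c(x, y)]),
           sum(m1[, c(x, y)]), sum(m1[, y])),
         nrow = 2, ncol = 2)
}

# Call combn once: each column of idx is one (i, j) pair with i < j
idx <- combn(ncol(m1), 2)
final.list <- Map(fun1, idx[1, ], idx[2, ])

# One p-value per pair, labelled "Di-Dj"
p.values <- vapply(final.list,
                   function(tab) fisher.test(tab)$p.value,
                   numeric(1))
names(p.values) <- paste(colnames(m1)[idx[1, ]],
                         colnames(m1)[idx[2, ]], sep = "-")
```

With n columns this still makes n(n-1)/2 calls to `fisher.test`, so for a very wide matrix the test itself is likely to dominate the running time.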

Data

dput(m1)
structure(c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 
1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1), .Dim = c(6L, 
6L), .Dimnames = list(NULL, c("D1", "D2", "D3", "D4", "D5", "D6"
)))

Or, similarly, using precomputed column sums:

fun2 <- function(x, y) {
  matrix(data = c(c.sums[x], sum(c.sums[c(x, y)]), sum(c.sums[c(x, y)]), c.sums[y]),
         nrow = 2, ncol = 2)
}

ind1 <- combn(1:ncol(m1),2)[1,]
ind2 <- combn(1:ncol(m1),2)[2,]
c.sums <- colSums(m1)

final.list2 <- Map(fun2, ind1, ind2)
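Since `fun2` only needs the column sums, the four entries for every pair can also be computed in one vectorised step, without building a list of matrices. The following is a sketch under that assumption: `all.tabs` and `both` are hypothetical names, and the data reproduces the `dput(m1)` example above. The last line covers a different reading of "Di AND Dj" (the number of rows where both columns are 1), which is an assumption about the asker's intent and not what `fun1`/`fun2` compute.

```r
# The example data from above: six identical columns 0,1,0,1,0,1
m1 <- matrix(rep(c(0, 1), 18), nrow = 6,
             dimnames = list(NULL, paste0("D", 1:6)))

idx    <- combn(ncol(m1), 2)        # 2 x 15 matrix of column pairs (i < j)
c.sums <- unname(colSums(m1))       # computed once, reused for every pair

r1 <- c.sums[idx[1, ]]              # number of 1's in Di
r4 <- c.sums[idx[2, ]]              # number of 1's in Dj
r2 <- r1 + r4                       # the value fun2 places in r2 and r3

# One row per pair; columns follow the r1..r4 layout of the question
all.tabs <- cbind(i = idx[1, ], j = idx[2, ],
                  r1 = r1, r2 = r2, r3 = r2, r4 = r4)

# Alternative reading (assumption): rows where Di and Dj are both 1.
# crossprod(m1) is t(m1) %*% m1, so entry (i, j) counts joint 1's.
both <- crossprod(m1)[t(idx)]
```

The first row of `all.tabs` holds 3, 6, 6, 3, matching the first matrix printed earlier. For a genuinely sparse input, the Matrix package provides `colSums` and `crossprod` methods for its sparse classes, so the same steps should carry over without converting to a dense matrix.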