Fast way to count how many elements of each list are contained in another list in R

Date: 2016-06-07 15:19:16

Tags: r list comparison match

Given two lists of character vectors:

set.seed(0)  

indexes <- list(c("1","2","3"),c("2","3","4"))
> indexes
[[1]]
 [1] "1" "2" "3"

[[2]]
 [1] "2" "3" "4"

try <- list(as.character(round(rnorm(10,2,2),0)),
        as.character(round(rnorm(10,2,2),0)),
        as.character(round(rnorm(10,2,2),0)))
> try
[[1]]
 [1] "5"  "1"  "5"  "5"  "3"  "-1" "0"  "1"  "2"  "7" 

[[2]]
 [1] "4" "0" "0" "1" "1" "1" "3" "0" "3" "0"

[[3]]
 [1] "2"  "3"  "2"  "4"  "2"  "3"  "4"  "1"  "-1" "2" 

I would like to check how many elements of each "sub-list" inside try are contained in each "sub-list" of indexes, in a "pairwise comparison"-ish way.

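For example: in try[[1]] the elements ("1","3","1","2") are contained in indexes[[1]], so the result of this match is 4. Then, for the match between try[[2]] and indexes[[1]], we have ("1","1","1","3","3"), so the result is 5. The same reasoning applies to try[[3]] and indexes[[1]]. Then we move on to the match between try[[1]] and indexes[[2]], given by ("3","2"), so the result here is 2, and so on.
I would like the results stored in variables as output (see the example below).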
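
As a quick illustration of the counting rule described above, a single pair can be checked with `%in%`. This is a minimal sketch; the data are hard-coded copies of the values printed earlier so the snippet runs on its own.

```r
# Data copied from the printed output above (hard-coded for self-containment).
indexes <- list(c("1", "2", "3"), c("2", "3", "4"))
try <- list(
  c("5", "1", "5", "5", "3", "-1", "0", "1", "2", "7"),
  c("4", "0", "0", "1", "1", "1", "3", "0", "3", "0"),
  c("2", "3", "2", "4", "2", "3", "4", "1", "-1", "2")
)

# Elements of try[[1]] that also appear in indexes[[1]] (duplicates are kept).
hits_1_1 <- try[[1]][try[[1]] %in% indexes[[1]]]
hits_1_1
# [1] "1" "3" "1" "2"

# The desired result is simply the number of such elements.
n_1_1 <- sum(try[[1]] %in% indexes[[1]])
n_1_1
# [1] 4
```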

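I found a working solution, but I have a massive list to apply it to (my real try list has 4 million elements and my indexes list has 100 elements), so what I am doing is very slow. Here is my solution: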

for(i in 1:length(indexes)){
  tmp <- lapply(try,function(x) sum(x %in% indexes[[i]]))
  assign(paste0("a",i),tmp)
}

> a1
[[1]]
 [1] 4

[[2]]
 [1] 5

[[3]]
 [1] 7

> a2
[[1]]
 [1] 2

[[2]]
 [1] 3

[[3]]
 [1] 8

2 Answers:

Answer 0 (score: 2)

If this is still too slow, you may want to consider using compiled code, e.g. with Rcpp. I don't see a way to do this with vectorized functions:

combs <- expand.grid(try = seq_along(try), indexes = seq_along(indexes))
combs$n_match <-  mapply(function(i, j, a, b) sum(a[[i]] %in% b[[j]]), 
       combs[,1], combs[,2], 
       MoreArgs = list(a = try, b = indexes))
#  try indexes n_match
#1   1       1       4
#2   2       1       5
#3   3       1       7
#4   1       2       2
#5   2       2       3
#6   3       2       8
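
One possible way to cut down the number of R-level function calls, which is not part of the original answer, is to flatten `try` into a single vector with a group id per element, call `%in%` once per entry of `indexes`, and aggregate with `tabulate`. The sketch below assumes the flattened vector fits in memory; the names `flat`, `group` and `n_match` are illustrative, and whether it is actually faster on the real data would need to be benchmarked.

```r
# Data copied from the question (hard-coded for self-containment).
indexes <- list(c("1", "2", "3"), c("2", "3", "4"))
try <- list(
  c("5", "1", "5", "5", "3", "-1", "0", "1", "2", "7"),
  c("4", "0", "0", "1", "1", "1", "3", "0", "3", "0"),
  c("2", "3", "2", "4", "2", "3", "4", "1", "-1", "2")
)

# Flatten once: all values in one vector, plus the sub-list each value came from.
flat  <- unlist(try, use.names = FALSE)
group <- rep(seq_along(try), lengths(try))

# One %in% call per element of indexes; tabulate() counts matches per sub-list.
# Result: rows correspond to try, columns to indexes.
n_match <- vapply(
  indexes,
  function(idx) tabulate(group[flat %in% idx], nbins = length(try)),
  integer(length(try))
)
n_match
#      [,1] [,2]
# [1,]    4    2
# [2,]    5    3
# [3,]    7    8
```

With this layout the inner loop over the 4 million sub-lists is replaced by one vectorized membership test per index set, so only 100 calls to the anonymous function would be needed for the sizes mentioned in the question.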

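Answer 1 (score: 0)

I. How many elements

(The outputs shown in this answer do not match the seeded data in the question, which gives 4, 5, 7, 2, 3, 8; they appear to come from a different random draw.)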

length(try[[1]][which(try[[1]] %in% indexes[[1]])])
# [1] 7
length(try[[2]][which(try[[2]] %in% indexes[[1]])])
# [1] 3
length(try[[3]][which(try[[3]] %in% indexes[[1]])])
# [1] 6
length(try[[1]][which(try[[1]] %in% indexes[[2]])])
# [1] 5
length(try[[2]][which(try[[2]] %in% indexes[[2]])])
# [1] 4
length(try[[3]][which(try[[3]] %in% indexes[[2]])])
# [1] 5

II. Which elements

try[[1]][which(try[[1]] %in% indexes[[1]])]
# [1] "2" "1" "1" "1" "3" "1" "1"
try[[2]][which(try[[2]] %in% indexes[[1]])]
# [1] "3" "1" "2"
try[[3]][which(try[[3]] %in% indexes[[1]])]
# [1] "3" "1" "2" "3" "3" "2"
try[[1]][which(try[[1]] %in% indexes[[2]])]
# [1] "2" "3" "4" "4" "4"
try[[2]][which(try[[2]] %in% indexes[[2]])]
# [1] "3" "4" "4" "2"
try[[3]][which(try[[3]] %in% indexes[[2]])]
# [1] "3" "2" "3" "3" "2"
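
The six hand-written calls above can be generated with nested `lapply`, which also scales to longer lists. This is a minimal sketch using the seeded data from the question rather than the values shown in this answer; the names `matched` and `counts` are illustrative.

```r
# Data copied from the question (hard-coded for self-containment).
indexes <- list(c("1", "2", "3"), c("2", "3", "4"))
try <- list(
  c("5", "1", "5", "5", "3", "-1", "0", "1", "2", "7"),
  c("4", "0", "0", "1", "1", "1", "3", "0", "3", "0"),
  c("2", "3", "2", "4", "2", "3", "4", "1", "-1", "2")
)

# matched[[j]][[i]] holds the elements of try[[i]] found in indexes[[j]].
matched <- lapply(indexes, function(idx) {
  lapply(try, function(x) x[x %in% idx])
})

# counts[[j]][i] is the number of such elements.
counts <- lapply(matched, lengths)

matched[[1]][[1]]
# [1] "1" "3" "1" "2"
counts[[1]]
# [1] 4 5 7
counts[[2]]
# [1] 2 3 8
```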