Question

I have a list with the structure shown below, and I want to find the maximum value of X2 in the second element (b) of each list item.

Sample data:

[[1]]
[[1]]$a
[1] 2

[[1]]$b
   X1      X2
1  58 1686729
2 106 1682303


[[2]]
[[2]]$a
[1] 3

[[2]]$b
   X1      X2
1  24 1642468
2  89 1695581
3 156 1634019

I have looked into several approaches that can be applied to lists, such as lapply:

lapply(result, function(x) x[which.max(x$b)])

and I also tried the rlist package:

library(rlist)
list.filter(result, max(b$area))

but without success.

The output I need is the maximum of X2 in b for each list element.

Answer 1

Use lapply() to find the row with the maximum X2 in $b of each list element, then cbind() that row with the element's a value.

l_max <- lapply(l, function(x) {
  b <- x$b
  cbind(a=x$a, b[which.max(b$X2),])
})

Bind the results into a single data frame with bind_rows() from dplyr.

library(dplyr)  # provides the %>% pipe used below

l_max %>% 
  dplyr::bind_rows()
#     a X1      X2
# 1   2 58 1686729
# 2   3 89 1695581

Sample data:

l <- list(
  list(a = 2, 
       b = data.frame(X1 = c(58, 106),  X2 = c(1686729, 1682303))),
  list(a = 3, 
       b = data.frame(X1 = c(24, 89,156),  X2 = c(1642468, 1695581,1634019)))
)

Using your example:

l_max <- lapply(l, function(x) {
  b <- x$b
  cbind(a = x$a, b[which.max(b[,2]),]) # NOTICE I used [,2] to refer to the second column
                             #b$area works too if all df share the col name
})

l_max %>% 
  dplyr::bind_rows()
#   a  rt    area
# 1 2  58 1686729
# 2 3  89 1695581
# 3 4 101 1679889
# 4 5  88 1695983
# 5 6 105 1706445
# 6 7 121 1702019
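
One caveat worth keeping in mind: which.max() returns only the first position of the maximum, so if two rows share the largest value, only the first is kept. The following is a minimal sketch of how all tied rows could be kept instead. The data frame here is hypothetical (it is not from the question) and was made up only to contain a tie.

```r
# Hypothetical data with a tie in X2 (not taken from the question)
b <- data.frame(X1 = c(10, 20, 30), X2 = c(5, 9, 9))

# which.max() stops at the first maximal row
first_max <- b[which.max(b$X2), ]

# Comparing against max() keeps every row that reaches the maximum
all_max <- b[b$X2 == max(b$X2), ]

nrow(first_max)
nrow(all_max)
```

If the column may contain NA, note that max() returns NA unless na.rm = TRUE is passed, whereas which.max() skips missing values on its own.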

Another solution, using purrr::map_df(), avoids the separate bind_rows() step:

purrr::map_df(l, function(x) {
  b <- x$b
  cbind(a = x$a, b[which.max(b[,2]),]) 
})

All in base R, using mapply():

t(mapply(function(x) {
  b <- x$b
  cbind(a = x$a, b[which.max(b[,2]),]) 
}, l))

Or using Map():

do.call("rbind", Map(function(x) {
  b <- x$b
  cbind(a = x$a, b[which.max(b[,2]),]) 
}, l))
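
If only the maximum values themselves are needed, rather than the whole row, a shorter base R sketch is possible. It assumes that every list element has a data frame b with a numeric column X2, as in the sample data; vapply() is chosen here so that the result is guaranteed to be a plain numeric vector.

```r
# Sample data copied from the answer above
l <- list(
  list(a = 2,
       b = data.frame(X1 = c(58, 106), X2 = c(1686729, 1682303))),
  list(a = 3,
       b = data.frame(X1 = c(24, 89, 156), X2 = c(1642468, 1695581, 1634019)))
)

# One maximum per list element; numeric(1) makes vapply() fail loudly
# if any element does not yield exactly one number
max_x2 <- vapply(l, function(x) max(x$b$X2), numeric(1))
max_x2
# [1] 1686729 1695581
```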

Answer 2

You can also use sapply():

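# `result` is the list from the question; avoid naming it `list`,
# which would shadow the base R function
t(sapply(result, function(elem){
  c(a = elem$a, elem$b[which.max(elem$b$area), ])
}))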
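
Because c() combines a number with a one-row data frame, each element returned inside sapply() is a list, so the transposed result is a matrix whose cells are list elements rather than numbers. The sketch below shows one way to turn that into an ordinary numeric data frame. The list name result and the column names rt and area are assumptions based on the question and on the output shown in Answer 1.

```r
# Assumed data: same values as the sample, with the rt/area column names
result <- list(
  list(a = 2,
       b = data.frame(rt = c(58, 106), area = c(1686729, 1682303))),
  list(a = 3,
       b = data.frame(rt = c(24, 89, 156), area = c(1642468, 1695581, 1634019)))
)

# The sapply() approach: yields a 2 x 3 matrix of mode "list"
m <- t(sapply(result, function(elem) {
  c(a = elem$a, elem$b[which.max(elem$b$area), ])
}))

# Flatten the cells (column-major order) and rebuild a numeric matrix,
# keeping the column names, then convert to a data frame
num <- matrix(unlist(m), nrow = nrow(m), dimnames = dimnames(m))
df <- as.data.frame(num)
df
```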

Find the maximum value of a variable in a list

2 answers: