Group by column, keep the minimum value, and concatenate the selected values on ties

Time: 2019-01-25 05:19:05

Tags: r dataframe

I have a data frame that looks like this:

new_df <- structure(list(intype = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("A30", 
    "A31"), class = "factor"), inerror = c(0.54, 0.14, 0.94, 0, 2.11, 
    0), inmethod = structure(c(1L, 2L, 3L, 1L, 2L, 3L), .Label = c("A", 
    "B", "C"), class = "factor")), class = "data.frame", row.names = c(NA, 
    -6L))

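I want to create a new data frame that keeps only the best method for each intype, i.e. the one with the smallest error. In case of ties, I want to concatenate the best methods. The resulting data frame should look like this: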

+--------+---------+----------+
| intype | inerror | inmethod |
+--------+---------+----------+
| A30    |    0.14 |        B |
| A31    |    0.00 |      A,C |
+--------+---------+----------+

Currently, I am using

require(plyr)
new_df[new_df$inerror == ddply(new_df, .(intype), summarise, Value = min(inerror))$Value,]

but this does not work.
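A likely reason the attempt fails is vector recycling: the grouped summary holds one minimum per intype (two values), while the column being compared has six rows. The sketch below reproduces this in base R, using `tapply` as a stand-in for the `ddply` call (an assumption made only to avoid the plyr dependency), and then uses `ave` to get a per-row group minimum instead. The names `group_min`, `recycled`, `row_min` and `best` are illustrative.

```r
# Data from the question, rebuilt with data.frame() for readability
new_df <- data.frame(
  intype   = factor(c("A30", "A30", "A30", "A31", "A31", "A31")),
  inerror  = c(0.54, 0.14, 0.94, 0, 2.11, 0),
  inmethod = factor(c("A", "B", "C", "A", "B", "C"))
)

# One minimum per intype: a vector of length 2 (0.14 for A30, 0 for A31)
group_min <- tapply(new_df$inerror, new_df$intype, min)

# Comparing 6 values with 2 recycles the minima as 0.14, 0, 0.14, 0, 0.14, 0,
# so the A30 minimum in row 2 is compared with 0 and never matches
recycled <- new_df$inerror == as.vector(group_min)

# ave() repeats each group's minimum on every row of that group,
# so the comparison lines up row by row
row_min <- ave(new_df$inerror, new_df$intype, FUN = min)
best <- new_df[new_df$inerror == row_min, ]
```

Here `best` still has one row per tied method; the concatenation step is what the answers below add.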

5 answers:

Answer 0: (score: 2)

Here is one way using dplyr:

library(dplyr)

new_df %>% 
  group_by(intype) %>% 
  filter(inerror == min(inerror)) %>%            # keep rows at the group minimum
  group_by(intype, inerror) %>% 
  summarise(inmethod = toString(inmethod)) %>%   # join tied methods with ", "
  ungroup()

# A tibble: 2 x 3
intype inerror inmethod
<chr>    <dbl> <chr>   
1 A30       0.14 B       
2 A31       0    A, C
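Note that `toString` joins values with a comma followed by a space, so the result reads "A, C", whereas the desired output in the question shows "A,C". If the exact form matters, `paste` with `collapse = ","` could be used instead. A minimal base R illustration (the vector `methods` is made up for the example):

```r
methods <- c("A", "C")

# toString() separates elements with ", "
with_space <- toString(methods)

# paste() with collapse lets you choose the separator
without_space <- paste(methods, collapse = ",")
```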

Answer 1: (score: 2)

Another tidyverse solution, slightly different from Shree's:

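library(dplyr)

# df is assumed to be the question's data frame (new_df)
df %>%
  group_by(intype, inerror) %>%
  summarise(inmethod = toString(inmethod)) %>%  # join methods sharing an error value
  arrange(intype, inerror) %>%                  # smallest error first per intype
  distinct(intype, .keep_all = TRUE)            # keep that first row per intype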

# A tibble: 2 x 3
# Groups:   intype [2]
  intype inerror inmethod
  <fct>    <dbl> <chr>   
1 A30       0.14 B       
2 A31       0    A, C    

Answer 2: (score: 2)

Not a great answer, but using data.table:

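library(data.table)

df <- data.table(df)
# keep the rows at each group's minimum (assumes rows are ordered by intype)
df <- df[df[,.(inerror == min(inerror)), .(intype)]$V1]
# join the remaining methods within each intype
df <- df[, inmethod := toString(inmethod), .(intype)]
# drop the duplicated rows left by the ties
df <- unique(df)
df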

   intype inerror inmethod
1:    A30    0.14        B
2:    A31    0.00     A, C

答案 3 :(得分:1)

Using data.table, you can do the following:
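The code for this answer is not preserved here. As a rough idea of what a single-step data.table approach could look like (this is a sketch, not the answer's original code), the minimum and the tied methods can be computed together per group. It assumes the data.table package is installed; `dt` and `result` are illustrative names.

```r
library(data.table)

# Data from the question, rebuilt with data.frame() for readability
new_df <- data.frame(
  intype   = factor(c("A30", "A30", "A30", "A31", "A31", "A31")),
  inerror  = c(0.54, 0.14, 0.94, 0, 2.11, 0),
  inmethod = factor(c("A", "B", "C", "A", "B", "C"))
)

dt <- as.data.table(new_df)

# For each intype: take the minimum error and paste together
# every method whose error equals that minimum
result <- dt[, .(
  inerror  = min(inerror),
  inmethod = paste(inmethod[inerror == min(inerror)], collapse = ",")
), by = intype]
```

Using `paste(..., collapse = ",")` rather than `toString` gives "A,C" without a space, matching the format requested in the question.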

Answer 4: (score: 1)

Just for completeness, a base R solution:

do.call(rbind, lapply(split(new_df, new_df$intype),  function(x) {
  x <- x[x$inerror == min(x$inerror), ]
  data.frame(intype = x$intype[1], 
             inerror = x$inerror[1], 
             inmethod = paste0(x$inmethod, collapse = ","))
}))