Weight variable in dplyr top_n

Time: 2014-07-28 07:35:48

Tags: r dplyr

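I am trying to use the top_n function from the dplyr package, but it only seems to work when I let it fall back to the default weight (the last variable in the data frame). The following example, which uses the default weight, works: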

library(dplyr)
library(babynames)
ba <- babynames
ba %>% filter(year == 2013) %>% group_by(sex) %>% top_n(n = 5)

Selecting by prop
Source: local data frame [10 x 5]

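However, these do not. Passing the column name as a string returns every row, and passing it unquoted raises an error: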

ba %>% filter(year == 2013) %>% group_by(sex) %>% top_n(n = 5, wt = "prop")
Source: local data frame [33,072 x 5]

ba %>% filter(year == 2013) %>% group_by(sex) %>% top_n(n = 5, wt = prop)
Error in top_n(`ba %>% filter(year == 2013) %>% group_by(sex)`, n = 5,  : 
  object 'prop' not found

1 answer:

Answer 0 (score: 2)

This looks like a bug. Please file a bug report. Here is a corrected version that seems to work as expected.

top_n <- function (x, n, wt = NULL) 
{
  wt <- substitute(wt) # new line to correct is.null(wt)
  if (is.null(wt)) {
    vars <- tbl_vars(x)
    message("Selecting by ", vars[length(vars)])
    wt <- as.name(vars[length(vars)])
  }
  call <- substitute(filter(x, rank(desc(wt), ties.method = "min") <= 
                              n), list(n = n, wt = substitute(wt)))
  eval(call)
}
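
The comment on the added line suggests that the problem lies in is.null(wt): it evaluates wt, so a bare column name such as prop is looked up as an ordinary variable before filter ever sees it. The following is a minimal base R sketch of that behavior, not dplyr's own code. The helper names check_forced and check_captured are hypothetical, and prop is assumed not to exist as a variable in the calling environment.

```r
# Hypothetical helpers (not part of dplyr) that isolate the argument handling.
check_forced <- function(wt = NULL) {
  # is.null() evaluates wt, so a bare name is looked up as an ordinary variable.
  is.null(wt)
}

check_captured <- function(wt = NULL) {
  # substitute() returns the unevaluated argument expression instead.
  wt <- substitute(wt)
  is.null(wt)
}

# Assumption: no variable called 'prop' exists in the calling environment.
forced   <- tryCatch(check_forced(prop), error = function(e) e)
captured <- check_captured(prop)
default  <- check_captured()

inherits(forced, "error")  # whether the forced lookup failed
captured                   # whether the supplied weight was treated as missing
default                    # whether the default (no weight) case is detected
```

With the capture in place, the corrected top_n above should accept an unquoted column, for example top_n(n = 5, wt = prop). A quoted name such as "prop" is still a constant string rather than a column reference, so it is unlikely to behave as a weight.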