Grouping by a dynamic column name in R

Time: 2021-06-21 07:43:33

Tags: r dplyr group-by

I want to group by a column whose name is supplied dynamically as a string.

df:
col1
a
b
c
d
a
c
d
b
a
b
d

I created the following function:

fun1 <- function(df,column_name){
  
  col_name1 = noquote(column_name)
  
  out_df = df %>% group_by(col_name1)%>%dplyr::summarise('Count'=n())
                                                              
  return(out_df)
}

Here column_name is a string, for example column_name = 'col1'.

When I call the function, it gives the following error:

Error: Must group by variables found in `.data`.
* Column `col_name1` is not found.

I get this error even though the column exists. Where am I going wrong?

2 Answers:

Answer 0 (score: 1)

library(dplyr)
fun1 <- function(df,column_name){
  
  col_name1 <-  sym(column_name)
  
  out_df <-  df %>% 
    group_by(!!col_name1) %>%
    summarise('Count' = n())
  
  return(out_df)
}

fun1(iris, "Species")

# A tibble: 3 x 2
  Species    Count
  <fct>      <int>
1 setosa        50
2 versicolor    50
3 virginica     50
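
For context on why the original attempt fails, here is a minimal sketch (the small data frame and the variable names are made up for illustration). As far as I understand it, noquote() only changes how a string prints, so the value is still a character vector, and group_by() treats the bare symbol col_name1 as a column name rather than looking up its value. sym() turns the string into a symbol, which !! can then inject into group_by().

```r
library(dplyr)

column_name <- "col1"

# noquote() only adds a print class; the underlying value is still a string
nq <- noquote(column_name)
is.character(nq)   # TRUE
class(nq)          # "noquote"

# sym() converts the string into a symbol that !! can inject into group_by()
s <- rlang::sym(column_name)
is.symbol(s)       # TRUE

# Illustrative data only (not the data from the question)
toy <- data.frame(col1 = c("a", "b", "a"))
res <- toy %>%
  group_by(!!s) %>%
  summarise(Count = n())
res
```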

This should also work, with the advantage that it accepts more than one column name:

fun1 <- function(df, column_name){
  df %>% 
    # all_of() is the current replacement for the superseded one_of()
    group_by(across(all_of(column_name))) %>%
    summarise('Count' = n())
}
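
A possible usage example with two grouping columns, as a sketch rather than a definitive version: count_by is a hypothetical name, mtcars is used only because it ships with R, and all_of() is used here as the currently recommended selector in place of one_of(). The .groups = "drop" argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical helper: accepts a character vector of column names
count_by <- function(df, column_names) {
  df %>%
    group_by(across(all_of(column_names))) %>%
    # drop the grouping so the result is a plain tibble
    summarise(Count = n(), .groups = "drop")
}

# One row per (cyl, gear) combination that occurs in mtcars
res <- count_by(mtcars, c("cyl", "gear"))
res
```

Unlike the sym() version, all_of() raises an error if any of the requested columns is missing, which may or may not be what you want.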

Answer 1 (score: 0)

You can use the .data pronoun:

fun1 <- function(df,column_name){

  out_df = df %>% group_by(.data[[column_name]]) %>% summarise(Count = n())
  return(out_df)
}

fun1(df, 'col1')

#  col1  Count
#  <chr> <int>
#1 a         3
#2 b         3
#3 c         2
#4 d         3 

This can also be written with count, which works the same way:

fun2 <- function(df,column_name){
  df %>% count(.data[[column_name]], name = 'Count')
}
fun2(df, 'col1')
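
To try both versions end to end, here is a self-contained sketch that rebuilds the data frame from the question. The data.frame() call is my reconstruction of the printed df, so treat it as an assumption.

```r
library(dplyr)

# Reconstruction of the question's data (assumed from the printed column)
df <- data.frame(
  col1 = c("a", "b", "c", "d", "a", "c", "d", "b", "a", "b", "d")
)

# group_by() + summarise() with the .data pronoun
fun1 <- function(df, column_name) {
  df %>%
    group_by(.data[[column_name]]) %>%
    summarise(Count = n())
}

# The same idea with count()
fun2 <- function(df, column_name) {
  df %>% count(.data[[column_name]], name = "Count")
}

res1 <- fun1(df, "col1")
res2 <- fun2(df, "col1")
```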