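Rename a column using the data frame's title/name

Time: 2017-04-21 15:37:53

Tags: r dplyr

I have a data frame named "Something". I am aggregating one of its numeric columns with summarise, and I would like the name of the resulting column to contain "Something", i.e. the data frame's name inside the column name.

Example: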

    temp <- Something %>% 
    group_by(Month) %>% 
    summarise(avg_score=mean(score))

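But I would like the aggregated column to be named "avg_Something_score". Does that make sense?

5 answers:

Answer 0 (score: 4)

We can use the devel version of dplyr (soon to be released as 0.6.0), which supports quosures: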

library(dplyr)
myFun <- function(data, group, value){
      dataN <- quo_name(enquo(data))
      group <- enquo(group)
      value <- enquo(value)

      newName <- paste0("avg_", dataN, "_", quo_name(value))
     data %>%
        group_by(!!group) %>%
        summarise(!!newName := mean(!!value))
 }

myFun(mtcars, cyl, mpg)
# A tibble: 3 × 2
#   cyl avg_mtcars_mpg
#  <dbl>          <dbl>
#1     4       26.66364
#2     6       19.74286
#3     8       15.10000

myFun(iris, Species, Petal.Width)
# A tibble: 3 × 2
#     Species avg_iris_Petal.Width
#     <fctr>                <dbl>
#1     setosa                0.246
#2 versicolor                1.326
#3  virginica                2.026

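Here, enquo captures the input arguments (much like substitute in base R) and converts them to quosures. quo_name turns a quosure into a string, and unquoting (!! or UQ) inside group_by/summarise/mutate etc. evaluates the quosure. The column name on the left-hand side of the assignment (:=) can likewise be unquoted, which yields the desired column name.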
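To make the naming step easier to follow in isolation, here is a minimal sketch that builds the column name from the captured arguments and unquotes it on the left-hand side of :=. The helper name avg_with_df_name is hypothetical, and the sketch assumes a recent rlang where as_label is available (it plays the role quo_name plays in the answer); it is not the answer author's code.

```r
library(dplyr)
library(rlang)

# Hypothetical helper for illustration only.
avg_with_df_name <- function(data, value) {
  data_q  <- enquo(data)   # capture the expression passed as `data`
  value_q <- enquo(value)  # capture the column expression

  # Turn both captured expressions into strings and build the new name.
  new_name <- paste0("avg_", as_label(data_q), "_", as_label(value_q))

  # Unquote the string on the left of := and the column on the right.
  summarise(data, !!new_name := mean(!!value_q))
}

res <- avg_with_df_name(mtcars, mpg)
names(res)
```

Calling it on mtcars with mpg should give a single column whose name combines the data frame name and the column name.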

Answer 1 (score: 3)

You can use rename_ from dplyr together with deparse(substitute(Something)), like this:

Something %>%
  group_by(Month) %>%
  summarise(avg_score = mean(score)) %>%
  rename_(.dots = setNames("avg_score", paste0("avg_", deparse(substitute(Something)), "_score")))

Answer 2 (score: 2)

python3 slackbot.py | awk -F "friendly_name:" {'print $2'}

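Answer 3 (score: 2)

It seems to make more sense to generate the new column name dynamically, so that you do not have to hard-code the data frame's name inside setNames. Perhaps something like the function below, which takes a data frame, a grouping variable, and a numeric variable: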

library(dplyr)
library(lazyeval)

my_fnc = function(data, group, value) {

  df.name = deparse(substitute(data))

  data %>%
    group_by_(group) %>%
    summarise_(avg = interp(~mean(v), v=as.name(value))) %>%
    rename_(.dots = setNames("avg", paste0("avg_", df.name, "_", value)))
}

Now let's run the function on two different data frames:

my_fnc(mtcars, "cyl", "mpg")
    cyl avg_mtcars_mpg
  <dbl>          <dbl>
1     4       26.66364
2     6       19.74286
3     8       15.10000
my_fnc(iris, "Species", "Petal.Width")
     Species avg_iris_Petal.Width
1     setosa                0.246
2 versicolor                1.326
3  virginica                2.026
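The underscore verbs (group_by_, summarise_, rename_) and lazyeval have since been deprecated in dplyr. As a rough sketch of the same idea with current verbs, assuming dplyr 1.0.0 or later, something like the following should behave similarly. The name my_fnc2 is hypothetical and this is not part of the original answer.

```r
library(dplyr)

# Hypothetical rewrite of my_fnc; `group` and `value` are column names as strings.
my_fnc2 <- function(data, group, value) {
  # Name of the data frame as written by the caller, e.g. "mtcars".
  df_name  <- deparse(substitute(data))
  new_name <- paste0("avg_", df_name, "_", value)

  data %>%
    group_by(across(all_of(group))) %>%            # group by a column given as a string
    summarise(!!new_name := mean(.data[[value]]))  # name the result dynamically
}

res <- my_fnc2(mtcars, "cyl", "mpg")
names(res)
```

Because the data frame name is recovered with deparse(substitute(data)), this only gives a useful name when the function is called directly with a named object rather than at the end of a pipe.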

Answer 4 (score: 0)

You can use colnames(Something) <- c("score", "something_avg_score")