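Dynamically generating column names

Time: 2018-08-31 19:27:05

Tags: r dplyr

I want to generate columns dynamically, based on the bin width used in a cut statement.

How can I dynamically generate AGE1 through AGEn, as in the example below?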


1 answer:

Answer 0: (score: 1)

You can wrap this in a function. The version below uses a for loop:

library(dplyr)
library(tidyr)

cut_function <- function(df, num_cuts) {
  num_by <- num_cuts  # bin width passed to seq()
  df_out <- df %>%
    mutate(AGEGROUP = cut(AGE, breaks = seq(10, 20, by = num_by), right = FALSE)) %>%
    group_by(AGEGROUP) %>%
    summarise(SUM.NUM = sum(NUM)) %>%
    # lower bound of each bin, parsed from labels such as "[10,12)"
    mutate(AGELOW = as.numeric(substr(as.character(AGEGROUP), 2, 3)))

  # generate AGE1 ... AGE(num_by - 1)
  for (i in seq_len(num_by - 1)) {
    # this is the core of the function:
    # it assigns a new column whose name is built from the index i,
    # and i runs from 1 to num_by - 1
    df_out[[paste0("AGE", i)]] <- df_out$AGELOW + i
  }

  # reshape to long format: every column except SUM.NUM is gathered
  df_out %>%
    select(-AGEGROUP) %>%
    gather(AGE, value, -SUM.NUM)
}
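
The core step of the loop is assigning to a column whose name is only known at run time. As a minimal sketch of that step in isolation, the snippet below does it twice: once with base R `[[<-`, as in the function above, and once with dplyr's `!!` and `:=`. The toy tibble and the names `toy`, `base_out`, `tidy_out`, and `nm` are hypothetical and not part of the original answer.

```r
library(dplyr)

# Hypothetical toy data: lower bounds of three age bins
toy <- tibble(AGELOW = c(10, 13, 16))

# Base R: build the name with paste0() and assign with [[<-
base_out <- toy
for (i in 1:2) {
  base_out[[paste0("AGE", i)]] <- base_out$AGELOW + i
}

# dplyr: unquote the generated name on the left-hand side of :=
tidy_out <- toy
for (i in 1:2) {
  nm <- paste0("AGE", i)
  tidy_out <- mutate(tidy_out, !!nm := AGELOW + i)
}

names(tidy_out)
```

Both variants should produce the same columns; which one to use is mostly a matter of whether the rest of the code is already a dplyr pipeline.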

Test
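
The post does not show how `df` was built. Judging from the printed output further down (integer `AGE` from 10 to 19, one numeric `NUM` per row), a stand-in could look like the sketch below. The seed and the use of `rnorm()` are assumptions, so the sums it produces will not match the values printed in this answer.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, not from the original post
# Assumed shape: one row per age 10..19 with a random NUM value
df <- tibble(AGE = 10:19, NUM = rnorm(10))
```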

cut_function(df,2)
    # A tibble: 10 x 3
   SUM.NUM AGE    value
     <dbl> <chr>  <dbl>
 1   0.311 AGELOW    10
 2  -3.43  AGELOW    12
 3  -0.237 AGELOW    14
 4   1.82  AGELOW    16
 5   0.332 AGELOW    18
 6   0.311 AGE1      11
 7  -3.43  AGE1      13
 8  -0.237 AGE1      15
 9   1.82  AGE1      17
10   0.332 AGE1      19

cut_function(df,3)
    # A tibble: 12 x 3
   SUM.NUM AGE    value
     <dbl> <chr>  <dbl>
 1  -2.56  AGELOW    10
 2  -0.799 AGELOW    13
 3   1.58  AGELOW    16
 4   0.569 AGELOW    NA
 5  -2.56  AGE1      11
 6  -0.799 AGE1      14
 7   1.58  AGE1      17
 8   0.569 AGE1      NA
 9  -2.56  AGE2      12
10  -0.799 AGE2      15
11   1.58  AGE2      18
12   0.569 AGE2      NA

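However

Looking at the desired output from your data frame, I think there is an easier way to get what you want: simply replace summarise with mutate in the call.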

num_by <- 2  # bin width; 2 matches the output shown below

df %>%
  mutate(AGEGROUP = cut(AGE, breaks = seq(10, 20, by = num_by), right = FALSE)) %>%
  group_by(AGEGROUP) %>%
  mutate(SUM.NUM = sum(NUM))

# gives essentially the same output as your df_out2

# A tibble: 10 x 4
# Groups:   AGEGROUP [5]
     AGE     NUM AGEGROUP SUM.NUM
   <int>   <dbl> <fct>      <dbl>
 1    10  0.463  [10,12)    0.311
 2    11 -0.151  [10,12)    0.311
 3    12 -2.87   [12,14)   -3.43 
 4    13 -0.562  [12,14)   -3.43 
 5    14 -0.276  [14,16)   -0.237
 6    15  0.0392 [14,16)   -0.237
 7    16  1.99   [16,18)    1.82 
 8    17 -0.168  [16,18)    1.82 
 9    18 -0.236  [18,20)    0.332
10    19  0.569  [18,20)    0.332

You can wrap the above in a function, with no need for a for loop.
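
One way to read that suggestion is sketched below. The function name `cut_mutate` and the `demo` data are hypothetical; the bin edges 10 to 20 are taken from the code above, and the trailing `ungroup()` is an addition so that the result is a plain, ungrouped tibble.

```r
library(dplyr)

# Hypothetical wrapper around the mutate-based approach
cut_mutate <- function(df, num_by) {
  df %>%
    mutate(AGEGROUP = cut(AGE, breaks = seq(10, 20, by = num_by), right = FALSE)) %>%
    group_by(AGEGROUP) %>%
    mutate(SUM.NUM = sum(NUM)) %>%  # per-bin sum repeated on every row of the bin
    ungroup()                       # not in the original; drops the grouping
}

# Deterministic stand-in data so the sums are easy to check by hand
demo <- tibble(AGE = 10:19, NUM = 1:10)
res <- cut_mutate(demo, 2)
```

With a bin width of 2, each pair of consecutive ages shares one `SUM.NUM`, so every original row is kept and no reshaping step is needed.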