How to add indicator variable columns and append a suffix to their names?

Time: 2020-10-11 21:06:59

Tags: r dataframe

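I am working with a data frame in which I have created binary variables that indicate whether a given individual is present in the "Players" column.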

Layer       Grade       Players                    Var 2             NYAL 08   NYAL 27        
Top           A         NYAL 08; NYAL 27; NYAL 80  NYAL 08; MAAC 48    1       1      ...
Bottom        D         MAAC 27; MAAC 45; MAAC 65  NYAL 27             0       0      ...    
Middle        B         NYAL 08; MAAC 48; NYAL 66  MAAC 48;MAAC 22     0       0      ...       
...

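I would like to add binary variables to the same dataset that simply indicate whether a given individual is present in Var 2. However, because most of the individuals are the same in both columns, I would like to append the letter "B" to the column names to distinguish these new indicator columns from the existing ones. How can this be done?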

Layer       Grade       Players             Var 2            NYAL 08 NYAL 27 NYAL 08B NYAL 27B    
Top           A         NYAL 08; NYAL 27   NYAL 08; MAAC 48    1       1      1       0
Bottom        D         MAAC 27; MAAC 45   NYAL 27             0       0      0       1
Middle        B         NYAL 08; MAAC 48   NYAL 27; MAAC 22    0       0      0       1

2 answers:

Answer 0: (score: 1)

Based on the example shown:

library(qdapTools)
# tabulate the individuals listed in Var2, one count column per individual
var2_out <- mtabulate(strsplit(df1$Var2, ";\\s+"))
# keep only individuals that already have an indicator column (columns 5 onward)
nm1 <- intersect(names(var2_out), names(df1)[-(1:4)])
df1[paste0(nm1, "B")] <- var2_out[nm1]

Output

df1
#    Layer Grade          Players             Var2 NYAL 08 NYAL 27 NYAL 08B NYAL 27B
#1    Top     A NYAL 08; NYAL 27 NYAL 08; MAAC 48       1       1        1        0
#2 Bottom     D MAAC 27; MAAC 45          NYAL 27       0       0        0        1
#3 Middle     B NYAL 08; MAAC 48 NYAL 27; MAAC 22       0       0        0        1

Data

df1 <- structure(list(Layer = c("Top", "Bottom", "Middle"), Grade = c("A", 
"D", "B"), Players = c("NYAL 08; NYAL 27", "MAAC 27; MAAC 45", 
"NYAL 08; MAAC 48"), Var2 = c("NYAL 08; MAAC 48", "NYAL 27", 
"NYAL 27; MAAC 22"), `NYAL 08` = c(1L, 0L, 0L), `NYAL 27` = c(1L, 
0L, 0L)), row.names = c(NA, -3L), class = "data.frame")

Answer 1: (score: 1)

A base R option

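# for each row, test whether each existing indicator name (the last two columns) occurs in Var2; unary + turns TRUE/FALSE into 1/0
u <- t(sapply(strsplit(df$Var2,";\\s+"),function(v) +sapply(tail(names(df),2),`%in%`, v)))
# append the new columns, suffixing their names with "B"
df <- cbind(df,`colnames<-`(u,paste0(colnames(u),"B")))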

which gives

   Layer Grade          Players             Var2 NYAL 08 NYAL 27 NYAL 08B
1    Top     A NYAL 08; NYAL 27 NYAL 08; MAAC 48       1       1        1
2 Bottom     D MAAC 27; MAAC 45          NYAL 27       0       0        0
3 Middle     B NYAL 08; MAAC 48 NYAL 27; MAAC 22       0       0        0
  NYAL 27B
1        0
2        1
3        1

Data

df <- structure(list(Layer = c("Top", "Bottom", "Middle"), Grade = c("A", 
"D", "B"), Players = c("NYAL 08; NYAL 27", "MAAC 27; MAAC 45",
"NYAL 08; MAAC 48"), Var2 = c("NYAL 08; MAAC 48", "NYAL 27",
"NYAL 27; MAAC 22"), `NYAL 08` = c(1L, 0L, 0L), `NYAL 27` = c(1L,
0L, 0L)), row.names = c(NA, -3L), class = "data.frame")
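
For readers who find the one-liner above dense, the same idea can be written as an explicit loop. This is only a sketch under two assumptions that are not stated in the question: every column after the first four is an existing indicator column, and entries in Var2 are separated by a semicolon with optional whitespace around it. The names ind_names and var2_split are introduced here purely for illustration.

```r
# Example data copied from the question
df <- data.frame(
  Layer = c("Top", "Bottom", "Middle"),
  Grade = c("A", "D", "B"),
  Players = c("NYAL 08; NYAL 27", "MAAC 27; MAAC 45", "NYAL 08; MAAC 48"),
  Var2 = c("NYAL 08; MAAC 48", "NYAL 27", "NYAL 27; MAAC 22"),
  `NYAL 08` = c(1L, 0L, 0L),
  `NYAL 27` = c(1L, 0L, 0L),
  check.names = FALSE,        # keep the spaces in the column names
  stringsAsFactors = FALSE
)

# Assumption: every column after the first four is an existing indicator
ind_names <- names(df)[-(1:4)]

# Split Var2 on ";" and tolerate missing or extra whitespace around it
var2_split <- strsplit(df$Var2, "\\s*;\\s*")

# Add one 0/1 column per existing indicator, with "B" appended to its name
for (nm in ind_names) {
  present <- vapply(var2_split, function(v) nm %in% v, logical(1))
  df[[paste0(nm, "B")]] <- as.integer(present)
}

df
```

Taking the names from the column positions rather than from tail(names(df), 2) means the loop does not need to know in advance how many indicator columns exist, at the cost of relying on the first four columns being the non-indicator ones.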