Filter a data frame by a vector of column names plus constant column names

Time: 2018-03-22 04:23:46

Tags: r dataframe filter

This must be easy, but for the life of me I can't find the right syntax.

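I want to keep all of the "ID_" columns, however many there are and whatever number is appended to each, and also keep the other columns whose names are constant.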

Something like the command below does not work (the data is recreated on every run, so the number of ID columns varies):

###Does not work, but shows what I am trying to do
testdf1 <- df1[,c(paste(idvec, collapse="','"),"ConstantNames_YESwant")]

Recreated data:

rand <- sample(1:2, 1)
if(rand==1){
  df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    ID_2=11,
    ID_3=111,
    LotsOfColumnsWithVariousNames_NOwant="unwanted_data",
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
  desired.df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    ID_2=11,
    ID_3=111,
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
}
if(rand==2){
  df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    LotsOfColumnsWithVariousNames_NOwant="unwanted_data",
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
  desired.df1 <- data.frame(
    ID_0=0,
    ID_1=1,
    ConstantNames_YESwant="wanted_data",
    stringsAsFactors = FALSE
  )
}

2 answers:

Answer 0 (score: 2)

Is this what you want?

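library(tidyverse)

# Regex match: columns named "ID_" followed by digits
df1 %>% 
  select(matches("^ID_[0-9]+$"), ConstantNames_YESwant)

# Prefix match: columns whose names start with "ID_"
df1 %>% 
  select(starts_with("ID_"), ConstantNames_YESwant)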

# ID_0 ID_1 ConstantNames_YESwant
# 1    0    1           wanted_data

Answer 1 (score: 2)

In base R, you can do:

# Get all the ID columns (anchored so that only names starting with "ID_" match)
idvec <- grep("^ID_", colnames(df1), value = TRUE)
# Select the ID columns plus the constant column names you want
df1[c(idvec, "ConstantNames_YESwant")]

#  ID_0 ID_1 ConstantNames_YESwant
#1    0    1           wanted_data
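
As a side note on why the attempt in the question fails, here is a minimal base R sketch. It assumes the question's rand == 2 data; the names `glued`, `keep` and `result` are only illustrative and do not appear in the original post. `paste(..., collapse = )` glues all the ID names into a single string, which is not a column name, whereas `c()` keeps one element per column.

```r
# Sample data mirroring the question's rand == 2 case.
df1 <- data.frame(
  ID_0 = 0,
  ID_1 = 1,
  LotsOfColumnsWithVariousNames_NOwant = "unwanted_data",
  ConstantNames_YESwant = "wanted_data",
  stringsAsFactors = FALSE
)

# Names that are exactly "ID_" followed by one or more digits.
idvec <- grep("^ID_[0-9]+$", colnames(df1), value = TRUE)

# paste() with collapse returns ONE string, so it cannot index several columns.
glued <- paste(idvec, collapse = "','")
length(glued)               # 1
glued %in% colnames(df1)    # FALSE

# c() builds a character vector with one entry per column to keep.
keep <- c(idvec, "ConstantNames_YESwant")
result <- df1[keep]
names(result)
```

Because `idvec` is computed from `colnames(df1)`, the same lines should work unchanged for the rand == 1 case, where there are four ID columns.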