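Selecting columns by multiple conditions on the column name

Time: 2019-03-28 21:18:35

Tags: r grep

I am trying to sum columns whose names contain certain strings, but I can't figure out how to do it with grep.

I want to sum the following columns. Their names all contain both "Under 1.30" and "Under 5 years only":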

"Under 1.30: - Married-couple family: - With related children of the householder under 18 years: - Under 5 years only"

"Under 1.30: - Other family: - Male householder, no wife present: - With related children of the householder under 18 years: - Under 5 years only"  

"Under 1.30: - Other family: - Female householder, no husband present: - With related children of the householder under 18 years: - Under 5 years only" 

I tried the following code, but it returns more columns than the three shown above.

names(B17022[,grep("^Under 1.30.[Under 5 years only]", names(B17022))]) 

For example, it also returns:

"Under 1.30: - Married-couple family:" 
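A likely reason for the extra matches is that square brackets in a regular expression define a character class: [Under 5 years only] matches any single character from that set (U, n, d, e, r, a space, 5, and so on), not the literal phrase. The sketch below illustrates this with a few shortened column names, which are stand-ins and not the full names from the table.

```r
# Shortened stand-in column names (assumed for illustration only).
cols <- c(
  "Under 1.30: - Married-couple family:",
  "Under 1.30: - Married-couple family: - Under 5 years only",
  "1.30 to 1.49: - Married-couple family:"
)

# The pattern from the question: after "Under 1.30" and one arbitrary
# character, it only requires ONE character from the bracketed set.
pattern <- "^Under 1.30.[Under 5 years only]"
hits <- grepl(pattern, cols)

# A character class on its own matches exactly one character,
# so the whole phrase does not match when anchored at both ends.
one_char <- grepl("^[Under 5 years only]$", c("U", " ", "Under 5 years only"))

print(hits)
print(one_char)
```

Here the first name matches because the ":" is consumed by the dot and the following space is a member of the character class.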

2 answers:

Answer 0 (score: 0)

How about:

names(B17022[grep("^Under.*Under 5 years only", names(B17022))])

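Edit: explanation

.* matches any sequence of zero or more characters, so the pattern matches anything between the leading "Under" and the final "Under 5 years only".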
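One thing to keep in mind is that the pattern above only anchors on "Under", so it would also accept any other name that starts with "Under" and contains "Under 5 years only". A slightly stricter variant, sketched here under the assumption that only the "Under 1.30" group is wanted, spells out the prefix and escapes the dot. The "Under 1.85" name is hypothetical and exists only to show the difference.

```r
# Shortened stand-in column names; the third one is hypothetical.
cols <- c(
  "Under 1.30: - Married-couple family: - Under 5 years only",
  "Under 1.30: - Married-couple family:",
  "Under 1.85: - Married-couple family: - Under 5 years only"
)

# Pattern from the answer: any name starting with "Under" that
# later contains "Under 5 years only".
loose <- grepl("^Under.*Under 5 years only", cols)

# Stricter sketch: literal "Under 1.30" prefix (dot escaped) and
# "Under 5 years only" at the very end of the name.
strict <- grepl("^Under 1\\.30.*Under 5 years only$", cols)

print(loose)
print(strict)
```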

Answer 1 (score: 0)

You can use grepl, which returns a logical vector of TRUE and FALSE values. That makes it easy to combine several conditions.

names(B17022)[grepl("Under 1.30", names(B17022)) &
              grepl("Under 5 years only", names(B17022))]

Data used:

B17022 <- data.frame(matrix(rnorm(3), ncol = 3))

# Column names as given in the question
names(B17022) <- c(
  "Under 1.30: - Married-couple family: - With related children of the householder under 18 years: - Under 5 years only",
  "Under 1.30: - Other family: - Male householder, no wife present: - With related children of the householder under 18 years: - Under 5 years only",
  "Under 1.30: - Other family: - Female householder, no husband present: - With related children of the householder under 18 years: - Under 5 years only"
)
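To get from the matching names to the sum the question asks for, the same logical vector can be used to subset the data frame before calling rowSums. The following is a minimal sketch, not the answerer's code: the data frame demo, its values, and its shortened column names are made up for illustration, and fixed = TRUE is used so that the dot in "Under 1.30" is treated literally.

```r
# Made-up data: two rows, four columns (filled column by column).
demo <- data.frame(matrix(c(1, 2, 10, 20, 100, 200, 5, 5), ncol = 4))

# Shortened stand-in names; the last one lacks "Under 1.30".
names(demo) <- c(
  "Under 1.30: - Married-couple family: - Under 5 years only",
  "Under 1.30: - Other family: - Male householder: - Under 5 years only",
  "Under 1.30: - Other family: - Female householder: - Under 5 years only",
  "1.30 to 1.49: - Married-couple family: - Under 5 years only"
)

# Both conditions must hold; fixed = TRUE matches the strings literally.
keep <- grepl("Under 1.30", names(demo), fixed = TRUE) &
  grepl("Under 5 years only", names(demo), fixed = TRUE)

# drop = FALSE keeps a data frame even if only one column matches.
total <- rowSums(demo[, keep, drop = FALSE])

print(keep)
print(total)
```

If a single grand total across all rows is wanted instead of one sum per row, sum(total) would give it.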