How to count the number of times a string occurs

Time: 2020-06-14 14:23:12

Tags: r dataframe

How can I count how many times a string ("Yes") occurs in each row and put the result in a new column?


Expected result:

Affiliation
Yes,Yes
Yes,Yes,Yes,Yes
Yes,Yes,Yes
Yes,No,Yes
No,Yes,Yes
Yes,Yes,Yes,Yes,Yes,Yes,Yes
Yes,Yes,No,Yes,Yes,Yes,Yes

Any suggestions would be helpful.

1 Answer:

Answer 0: (score: 1)

You can use str_count from the stringr package to count how many times "Yes" occurs in each row.

library(dplyr)
library(stringr)

df %>% mutate(count = str_count(Affiliation, 'Yes'))

#                  Affiliation count
#1                     Yes,Yes     2
#2             Yes,Yes,Yes,Yes     4
#3                 Yes,Yes,Yes     3
#4                  Yes,No,Yes     2
#5                  No,Yes,Yes     2
#6 Yes,Yes,Yes,Yes,Yes,Yes,Yes     7
#7  Yes,Yes,No,Yes,Yes,Yes,Yes     6

In base R, you can split each string on commas and count the "Yes" values:

sapply(strsplit(df$Affiliation, ','), function(x) sum(x == 'Yes'))
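
The base R call above returns a plain vector rather than a new column. As a minimal sketch, assuming a small sample data frame shaped like the one in the question (the column names count and count_regex are illustrative, not from the original post), the result could be stored like this, with a regex-based alternative shown for comparison:

```r
# Sample data mirroring the Affiliation column from the question (assumed subset).
df <- data.frame(
  Affiliation = c("Yes,Yes", "Yes,Yes,Yes,Yes", "Yes,No,Yes"),
  stringsAsFactors = FALSE
)

# Split each row on commas, then count the tokens that are exactly "Yes".
# vapply() is used instead of sapply() so the result is always an integer vector.
df$count <- vapply(
  strsplit(df$Affiliation, ",", fixed = TRUE),
  function(x) sum(x == "Yes"),
  integer(1)
)

# Alternative: count matches of the whole word "Yes" in each string.
matches <- regmatches(
  df$Affiliation,
  gregexpr("\\bYes\\b", df$Affiliation, perl = TRUE)
)
df$count_regex <- lengths(matches)

print(df)
```

The split-and-compare version only counts tokens that equal "Yes" exactly, while a plain substring pattern would also match "Yes" inside a longer value; the word boundaries in the second version are one way to guard against that.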

Data

df <- structure(list(Affiliation = c("Yes,Yes", "Yes,Yes,Yes,Yes", 
"Yes,Yes,Yes", "Yes,No,Yes", "No,Yes,Yes", "Yes,Yes,Yes,Yes,Yes,Yes,Yes", 
"Yes,Yes,No,Yes,Yes,Yes,Yes")), class = "data.frame", row.names = c(NA, -7L))