I have a data frame with 3 columns:
df
A B C
round1 test1 testing1
round1 test1 testing2
round1 test1 testing3
round1 test1 testing4
round1 test1 testing5
round2 test2 testing1
round2 test2 testing2
round2 test2 testing3
round2 test2 testing4
round2 test2 testing5
.
.
.
.
.
round100 test30 testing30
round100 test30 testing31
How can I remove the rows where the numeric values in the strings of columns B and C match?
Answer 0 (score: 2)
Just extract the numeric part of each string and compare.
NumB = sub("\\D+(\\d+).*", "\\1", DAT$B)
NumC = sub("\\D+(\\d+).*", "\\1", DAT$C)
DAT = DAT[NumB != NumC,]
The sample data used above (create it before running the lines above):
DAT = read.table(text="A B C
round1 test1 testing1
round1 test1 testing2
round1 test1 testing3
round1 test1 testing4
round1 test1 testing5
round2 test2 testing1
round2 test2 testing2
round2 test2 testing3
round2 test2 testing4
round2 test2 testing5",
header=TRUE, stringsAsFactors = FALSE)
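As a self-contained check of the extract-and-compare idea, here is a minimal sketch on a four-row subset of the data. The helper name `first_num` is hypothetical (it is not part of the answer above), and the sketch assumes every value has a non-digit prefix followed by at least one digit.

```r
# Hypothetical helper: return the first run of digits that follows a non-digit prefix.
first_num <- function(x) sub("\\D+(\\d+).*", "\\1", x)

# Small subset of the question's data, built inline so the example runs on its own.
DAT <- data.frame(
  A = c("round1", "round1", "round2", "round100"),
  B = c("test1", "test1", "test2", "test30"),
  C = c("testing1", "testing2", "testing2", "testing31"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose numbers in B and C differ.
kept <- DAT[first_num(DAT$B) != first_num(DAT$C), ]
print(kept)
```

The comparison here is on character strings. If the data might contain leading zeros (for example `test01` against `testing1`), wrapping both sides in `as.numeric()` would compare by value instead.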
Answer 1 (score: 2)
Replace the non-digits ("\\D") with the empty string and compare what is left:
subset(DF, gsub("\\D", "", B) != gsub("\\D", "", C))
This gives the following output, where the input DF is shown in reproducible form further below:
A B C
2 round1 test1 testing2
3 round1 test1 testing3
4 round1 test1 testing4
5 round1 test1 testing5
6 round2 test2 testing1
8 round2 test2 testing3
9 round2 test2 testing4
10 round2 test2 testing5
12 round100 test30 testing31
The input in reproducible form is:
Lines <- "
A B C
round1 test1 testing1
round1 test1 testing2
round1 test1 testing3
round1 test1 testing4
round1 test1 testing5
round2 test2 testing1
round2 test2 testing2
round2 test2 testing3
round2 test2 testing4
round2 test2 testing5
round100 test30 testing30
round100 test30 testing31"
DF <- read.table(text = Lines, header = TRUE)
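One difference between the two approaches is worth knowing: `gsub("\\D", "", x)` removes every non-digit, so digits from separate groups are concatenated, while the `sub()` pattern keeps only the first run of digits. The sketch below uses made-up strings (the value `test1b2` is not from the question's data) to show where the two would disagree.

```r
# Made-up inputs; "test1b2" contains two separate digit groups.
x <- c("test1", "test1b2", "testing30")

all_digits   <- gsub("\\D", "", x)                # strips every non-digit character
first_digits <- sub("\\D+(\\d+).*", "\\1", x)     # keeps only the first run of digits

print(data.frame(x, all_digits, first_digits))
```

For the data in the question, each string has a single digit group, so both approaches select the same rows.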