It would be great if you could help me with my problem. My dataset is a bit unusual; it looks like this:
1 2
1 [34, 67], [17, 76] [17, 76], , , , , ,
I want to get rid of the "[", the "]" and the extra ",", and build a numeric vector.
Ideally, the result should look like this
1 2
1 "[34, 67]", "[17, 76]" "[17, 76]"
or
1 2
1 "34, 67", "17, 76" "17, 76"
I tried the following
a=trimws(df[1,1])
a=unlist(strsplit(a, split=", "))
but this returns "[34", "67]", "[17", "76]". Is there a simple way to do this?
Here is a sample of my data from dput():
structure(list(rse1e = structure(c(3L, 7L), .Label = c("", ", , , , , , ",
"[118, 25], [17, 76], [56, 56], [34, 67], , , ", "[17, 76], , , , , , ",
"[34, 67], [118, 25], [17, 76], [0, 84], [84, 42], [56, 56], [151, 8]",
"[34, 67], [168, 0], , , , , ", "[56, 56], [0, 84], [34, 67], [168, 0], [151, 8], , ",
"[56, 56], [118, 25], [0, 84], , , , ", "{\"ImportId\":\"rse1e\"}",
"rse1e"), class = "factor"), rse2e = structure(6:7, .Label = c("",
", , , , , , , ", "[0, 54], [173, 11], [22, 49], [108, 27], [86, 32], [43, 43], [130, 22], [216, 0]",
"[108, 27], [0, 54], , , , , , ", "[151, 16], [216, 0], [108, 27], , , , , ",
"[22, 49], [108, 27], [86, 32], [151, 16], , , , ", "[43, 43], [108, 27], [173, 11], [130, 22], [0, 54], , , ",
"[86, 32], , , , , , , ", "{\"ImportId\":\"rse2e\"}", "rse2e"
), class = "factor")), row.names = 15:16, class = "data.frame")
Answer 0: (score: 1)
I'm not quite sure what your data looks like, but you can remove the brackets and split by | like this:
f <- "1 [34, 67], [17, 76] | [17, 76]"
[1] "1 [34, 67], [17, 76] | [17, 76]"
# remove the brackets
a <- gsub("\\[|\\]", "", f)
a
[1] "1 34, 67, 17, 76 | 17, 76"
# split by |, we need unlist here since strsplit() returns a list
unlist(strsplit(a, "(?<=[|])", perl = TRUE))
[1] "1 34, 67, 17, 76 |" " 17, 76"
If you don't want to keep the | separator in the result, you can do the following:
unlist(strsplit(a, "[|]", perl = TRUE))
[1] "1 34, 67, 17, 76 " " 17, 76"
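The pieces above still carry the spaces that surrounded the separator. As a small sketch of one way to tidy that up (assuming the same bracket-free string as above; `a` and `parts` are just illustrative names), a fixed-string split followed by trimws() could be used:

```r
# Bracket-free string, as produced by the gsub() step above
a <- "1 34, 67, 17, 76 | 17, 76"

# fixed = TRUE treats "|" as a literal character rather than a regex operator
parts <- trimws(strsplit(a, "|", fixed = TRUE)[[1]])
parts
# [1] "1 34, 67, 17, 76" "17, 76"
```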
Answer 1: (score: 0)
You can try
df[]<-trimws(gsub("\\[|\\]|,","",as.matrix(df)))
which gives
> df
rse1e rse2e
15 118 25 17 76 56 56 34 67 22 49 108 27 86 32 151 16
16 56 56 0 84 34 67 168 0 151 8 43 43 108 27 173 11 130 22 0 54
Edit: split the string into its bracketed pieces
s <- "[34, 67], [118, 25], [17, 76], [0, 84], [84, 42], [56, 56], [151, 8]"
> unlist(regmatches(s,gregexpr("\\[.*?\\]",s)))
[1] "[34, 67]" "[118, 25]" "[17, 76]" "[0, 84]" "[84, 42]" "[56, 56]" "[151, 8]"
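If numbers are wanted rather than strings, the bracketed pieces can be taken one step further. The following is only a sketch under the assumption that every piece has the form "[a, b]" with non-negative integers; `pieces`, `inner` and `pairs` are names made up for the example:

```r
# A cell in the question's format, including the trailing empty slots
s <- "[34, 67], [118, 25], [17, 76], , , "

# Extract each "[...]" piece; perl = TRUE makes the lazy quantifier explicit
pieces <- unlist(regmatches(s, gregexpr("\\[.*?\\]", s, perl = TRUE)))

# Drop the brackets to get the form shown in the question
inner <- gsub("\\[|\\]", "", pieces)
inner
# [1] "34, 67"  "118, 25" "17, 76"

# Split each piece on the comma and convert to numeric
pairs <- lapply(strsplit(inner, ", *"), as.numeric)
```

Here `pairs` is a list with one numeric vector of length two per bracketed piece, and the empty slots at the end of the cell are ignored because they never match the pattern.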
Answer 2: (score: 0)
We can also replace every non-digit character with a space and trim the result.
df[] <- trimws(gsub('\\D', ' ', unlist(df)))
To get the output in separate columns, we can use cSplit
splitstackshape::cSplit(df, names(df), sep = " ")
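If the goal is a numeric vector per cell rather than a cleaned-up string, a base R alternative is to pull out the runs of digits directly. This is a minimal sketch, assuming non-negative integers as in the sample data; `x`, `nums` and `pairs` are illustrative names, and a factor column like those in the dput() output would first need as.character():

```r
# Two cells in the same format as the question's data (illustrative values)
x <- c("[118, 25], [17, 76], , , ", "[56, 56], , , , , , ")

# gregexpr() locates every run of digits; regmatches() extracts them per cell
nums <- lapply(regmatches(x, gregexpr("[0-9]+", x)), as.numeric)

# Optionally reshape one cell into one row per [a, b] pair
pairs <- matrix(nums[[1]], ncol = 2, byrow = TRUE)
```

Each element of `nums` holds the numbers of one cell in their original order, so cells with different numbers of pairs simply yield vectors of different lengths.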