How do I split a matrix column into two columns?

Asked: 2014-05-21 14:38:25

Tags: r

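I have a matrix in which one column holds IDs whose parts are separated by ",". I just want to split this column into two columns, each holding one part of the ID. What is the simplest way to do this?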

My matrix is:

> L
     a    u                    
[1,] "10" "mature,MIMAT0000062"
[2,] "20" "stemloop"           
[3,] "40" "mature,MIMAT0000062"

and the expected output is:

> k
     a    u          v             
[1,] "10" "mature"   "MIMAT0000062"
[2,] "20" "stemloop" "NA"          
[3,] "40" "mature"   "MIMAT0000062"
> 

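Edit

Now I need to split that matrix into two matrices based on the "NA" values in column v: one containing all the rows with "NA" and the other containing the rows without "NA".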

Input:

>k
       a    u         v
[1,] "10" "mature"    "MIMAT0000062"
[2,] "20" "stemloop"  "NA"
[3,] "40" "mature_2"  "MIMAT0000043"

The output should look like:

>k1
       a    u         v
[1,] "10" "mature"    "MIMAT0000062"
[2,] "40" "mature_2"  "MIMAT0000043"

>k2
       a    u         v
[1,] "20" "stemloop"  "NA"

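5 answers:

Answer 0 (score: 2)

A function called cSplit, from the splitstackshape package, is fast and makes it easy to deal with these kinds of problems.

Here are some examples of the function in use, along with a few different scenarios to consider:

Your existing sample data: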

library(splitstackshape)
M1 <- cbind(a = c(10,20,40), 
            u = c("mature,MIMAT0000062", 
                  "stemloop", "mature,MIMAT0000062"))
cSplit(data.frame(M1), "u", ",")
#     a      u_1          u_2
# 1: 10   mature MIMAT0000062
# 2: 20 stemloop           NA
# 3: 40   mature MIMAT0000062

A "u" value with a leading comma:

M2 <- cbind(a = c(10,20,40), 
            u = c(",MIMAT0000062", 
                  "stemloop", "mature,MIMAT0000062"))
cSplit(data.frame(M2), "u", ",")
#     a      u_1          u_2
# 1: 10          MIMAT0000062
# 2: 20 stemloop           NA
# 3: 40   mature MIMAT0000062

A "u" value that splits into 3 columns:

M3 <- cbind(a = c(10,20,40), 
            u = c("mature,MIMAT0000062", 
                  "stemloop,,something", "mature,MIMAT0000062"))
cSplit(data.frame(M3), "u", ",")
#     a      u_1          u_2       u_3
# 1: 10   mature MIMAT0000062        NA
# 2: 20 stemloop              something
# 3: 40   mature MIMAT0000062        NA
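
For readers who cannot install the package, the same edge cases can be approximated in base R. The following is only a minimal sketch and not how cSplit is implemented; the helper name split_to_columns and its prefix argument are hypothetical. It splits each string with strsplit and pads shorter results with NA up to the widest row.

```r
# Hypothetical helper (not part of any package): split a character vector
# on a separator and return a character matrix, padding short rows with NA.
split_to_columns <- function(x, sep = ",", prefix = "u") {
  parts <- strsplit(as.character(x), sep, fixed = TRUE)
  width <- max(lengths(parts))
  # `length<-` extends each vector to `width`, filling with NA
  padded <- lapply(parts, `length<-`, width)
  out <- do.call(rbind, padded)
  colnames(out) <- paste(prefix, seq_len(width), sep = "_")
  out
}

u <- c("mature,MIMAT0000062", "stemloop,,something", ",MIMAT0000062")
res <- split_to_columns(u)
print(res)
```

In this sketch, empty fields come back as "" rather than NA, and strsplit does not produce a trailing empty piece when a value ends with the separator, so the result may differ from the package output in those cases.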

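Answer 1 (score: 1)

This approach works when the values are comma-separated. Each split result is padded to two pieces, so a value without a comma gets NA in the second column:

# L is the character matrix from the question, so index it with [, "u"] rather than $u
sep_cols = t(sapply(strsplit(as.character(L[, "u"]), ","), `length<-`, 2))
new_L = cbind(a = L[, "a"], u = sep_cols[, 1], v = sep_cols[, 2])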

Answer 2 (score: 1)

Another way:

a <- c(10,20,40)
u <- c("mature,MIMAT0000062", "stemloop", "mature,MIMAT0000062")
L <- data.frame(a,u)     #better use a data.frame


v <- strsplit(as.character(L$u), ",")
L$u <- sapply(v, `[`, 1)
L$v <- sapply(v, `[`, 2)

> L
#   a        u            v
#1 10   mature MIMAT0000062
#2 20 stemloop         <NA>
#3 40   mature MIMAT0000062
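
The follow-up in the question's edit (separating the rows with and without an ID) could be handled with logical indexing on the new column. This is a minimal sketch that assumes v holds a real missing value, as in the output above; the names missing_v, k1 and k2 are chosen for illustration, and the sample values are copied from the question's edit.

```r
# Sample data with the values from the question's edit
k <- data.frame(
  a = c(10, 20, 40),
  u = c("mature", "stemloop", "mature_2"),
  v = c("MIMAT0000062", NA, "MIMAT0000043"),
  stringsAsFactors = FALSE
)

# TRUE for rows whose v is missing. For a character matrix that stores the
# literal string "NA", the test would be k[, "v"] == "NA" instead.
missing_v <- is.na(k$v)

k1 <- k[!missing_v, , drop = FALSE]  # rows that have an ID
k2 <- k[missing_v, , drop = FALSE]   # rows without an ID
```

Using drop = FALSE keeps k2 a one-row data frame instead of letting it collapse, which also matters if the same indexing is applied to a matrix.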

Answer 3 (score: 1)

A two-liner:

L$v =sapply(strsplit(as.character(L$u),","), "[", 2)
L$u =sapply(strsplit(as.character(L$u),","), "[", 1)
#L
#   a        u            v
#1 10   mature MIMAT0000062
#2 20 stemloop         <NA>
#3 40   mature MIMAT0000062

Answer 4 (score: 1)

Using reshape2::colsplit, as joran suggested, is another option:

library(reshape2)
k = cbind(a =L$a,colsplit(L$u,",",c("u","v")))
#k
#   a   u       v
#1  10  mature  MIMAT0000062
#2  20  stemloop     
#3  40  mature  MIMAT0000062