Why aren't the rows with zeros removed from my data frame?

Date: 2015-03-03 11:05:45

Tags: r

I have the following data frame:

dat <- structure(list(V1 = structure(c(11L, 11L, 11L, 11L, 11L, 11L, 
11L, 11L, 11L, 11L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), .Label = c("XXX_LN_06.ID", 
"xxx_LN_06.ID", "aaa_LN_06.ID", "bbb_LN_06.ID", "ccc_LN_06.ID", 
"ddd_LN_06.ID", "eee_LN_06.ID", "fff_LN_06.ID", "ggg_LN_06.IN", 
"hhh_LN_06.ID", "iii_LN_06.ID", "jjj_LN_06.ID", "kkk_LN_06.ID", 
"lll_LN_06.ID", "mmm_LN_06.ID", "nnn_LN_06.ID", "ooo_LN_06.ID", 
"ppp_LN_06.ID", "qqq_IC_LN_06.ID", "rrr_LN_06.ID", "sss_LN_06.ID"
), class = "factor"), V2 = structure(c(1L, 2L, 3L, 4L, 5L, 6L, 
7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L), .Label = c("Bcells", 
"DendriticCells", "Macrophages", "Monocytes", "NKCells", "Neutrophils", 
"StemCells", "StromalCells", "abTcells", "gdTCells"), class = "factor"), 
    V3 = c(4474.2737, 5893.97307, 9414.21112, 5743.65136, 4100.84016, 
    7280.7078, 5317.92682, 11905.14762, 4697.03516, 4661.754, 
    0, 0, 0, 0, 0, 0, 0, 0, 0, 0), V4 = c(1.501, 1.978, 3.159, 
    1.927, 1.376, 2.443, 1.785, 3.995, 1.576, 1.564, 0, 0, 0, 
    0, 0, 0, 0, 0, 0, 0)), .Names = c("V1", "V2", "V3", "V4"), row.names = 191:210, class = "data.frame")

It prints as:

> dat
              V1             V2        V3    V4
191 iii_LN_06.ID         Bcells  4474.274 1.501
192 iii_LN_06.ID DendriticCells  5893.973 1.978
193 iii_LN_06.ID    Macrophages  9414.211 3.159
194 iii_LN_06.ID      Monocytes  5743.651 1.927
195 iii_LN_06.ID        NKCells  4100.840 1.376
196 iii_LN_06.ID    Neutrophils  7280.708 2.443
197 iii_LN_06.ID      StemCells  5317.927 1.785
198 iii_LN_06.ID   StromalCells 11905.148 3.995
199 iii_LN_06.ID       abTcells  4697.035 1.576
200 iii_LN_06.ID       gdTCells  4661.754 1.564
201 ggg_LN_06.IN         Bcells     0.000 0.000
202 ggg_LN_06.IN DendriticCells     0.000 0.000
203 ggg_LN_06.IN    Macrophages     0.000 0.000
204 ggg_LN_06.IN      Monocytes     0.000 0.000
205 ggg_LN_06.IN        NKCells     0.000 0.000
206 ggg_LN_06.IN    Neutrophils     0.000 0.000
207 ggg_LN_06.IN      StemCells     0.000 0.000
208 ggg_LN_06.IN   StromalCells     0.000 0.000
209 ggg_LN_06.IN       abTcells     0.000 0.000
210 ggg_LN_06.IN       gdTCells     0.000 0.000

What I want to do is remove the rows that contain zeros, yielding:

> dat
              V1             V2        V3    V4
191 iii_LN_06.ID         Bcells  4474.274 1.501
192 iii_LN_06.ID DendriticCells  5893.973 1.978
193 iii_LN_06.ID    Macrophages  9414.211 3.159
194 iii_LN_06.ID      Monocytes  5743.651 1.927
195 iii_LN_06.ID        NKCells  4100.840 1.376
196 iii_LN_06.ID    Neutrophils  7280.708 2.443
197 iii_LN_06.ID      StemCells  5317.927 1.785
198 iii_LN_06.ID   StromalCells 11905.148 3.995
199 iii_LN_06.ID       abTcells  4697.035 1.576
200 iii_LN_06.ID       gdTCells  4661.754 1.564

Why doesn't this work?

row_sub = apply(dat, 1, function(row) any(row ==0 ))
dat[row_sub,]

2 answers:

Answer 0: (score: 4)

You can try this:

a <- which(dat == 0, arr.ind = TRUE)

dat[-a[,1],]

Or, following @David's comment below:

dat[rowSums(dat == 0L) == 0L, ]

Or:

dat[!rowSums(dat == 0L), ]

Output:

> dat[-a[,1],]
              V1             V2        V3    V4
191 iii_LN_06.ID         Bcells  4474.274 1.501
192 iii_LN_06.ID DendriticCells  5893.973 1.978
193 iii_LN_06.ID    Macrophages  9414.211 3.159
194 iii_LN_06.ID      Monocytes  5743.651 1.927
195 iii_LN_06.ID        NKCells  4100.840 1.376
196 iii_LN_06.ID    Neutrophils  7280.708 2.443
197 iii_LN_06.ID      StemCells  5317.927 1.785
198 iii_LN_06.ID   StromalCells 11905.148 3.995
199 iii_LN_06.ID       abTcells  4697.035 1.576
200 iii_LN_06.ID       gdTCells  4661.754 1.564
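One caveat about the `which()` variant, shown as a minimal sketch on a small hypothetical data frame (`df` and its columns are made up for illustration, not taken from the question): when no cell equals zero, `a` has no rows, and negative indexing with an empty index drops every row instead of keeping them all. The `rowSums()` mask does not have this problem.

```r
# Hypothetical data frame that contains no zeros at all
df <- data.frame(id = c("a", "b", "c"), x = c(1.5, 2.0, 3.25))

a <- which(df == 0, arr.ind = TRUE)  # no matches, so a has 0 rows
dropped <- df[-a[, 1], ]             # -integer(0) is integer(0): selects nothing
nrow(dropped)                        # every row is gone

keep <- rowSums(df == 0) == 0        # logical mask, one entry per row
kept <- df[keep, ]
nrow(kept)                           # all rows are kept
```

If the data may contain no zeros, the logical-mask forms are probably the safer choice.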

Your problem:

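In your case, row_sub is a vector containing only FALSE, so the subset returns no rows: logical indexing returns only the rows where the vector is TRUE. Note also that even with a correct row_sub, dat[row_sub, ] would select the rows that contain zeros; to drop them you would index with !row_sub.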

> row_sub
  191   192   193   194   195   196   197   198   199   200   201   202   203 
FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
  204   205   206   207   208   209   210 
FALSE FALSE FALSE FALSE FALSE FALSE FALSE 
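If you would rather keep the `apply()` approach, one possible repair is to pass only the numeric columns, so the rows stay numeric, and to negate the result when subsetting. This is a sketch on a hypothetical miniature of the question's data; `df`, `num_cols`, and `has_zero` are illustrative names.

```r
# Hypothetical miniature of the question's data: one factor, two numeric columns
df <- data.frame(
  V2 = factor(c("Bcells", "Monocytes", "Bcells", "Monocytes")),
  V3 = c(4474.2737, 5743.65136, 0, 0),
  V4 = c(1.501, 1.927, 0, 0)
)

num_cols <- sapply(df, is.numeric)  # TRUE only for V3 and V4

# With numeric columns only, apply() hands each row over as a numeric vector
has_zero <- apply(df[, num_cols, drop = FALSE], 1, function(row) any(row == 0))

result <- df[!has_zero, ]           # negate: keep the rows WITHOUT zeros
```

Here `result` should contain only the first two rows, with the factor column intact.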

Answer 1: (score: 4)

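Because apply first coerces the data frame to a matrix, and since dat contains factor columns, that matrix is of type character. You can (and should) debug this kind of thing first, like so: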

apply(dat, 1, function(row) { print(str(row)) } )

Part of the output is:

NULL
 Named chr [1:4] "ggg_LN_06.IN" "StromalCells" "    0.000" "0.000"
 - attr(*, "names")= chr [1:4] "V1" "V2" "V3" "V4"

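You can easily see that every value is a character string. Comparing a string such as "    0.000" with 0 coerces 0 to "0", and the two strings differ, so row == 0 is never TRUE.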
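The coercion can be reproduced outside `apply()`. The sketch below calls `as.matrix()` directly on a small hypothetical data frame (`df`, `m`, `is_char`, and `cmp` are illustrative names), which is the conversion `apply()` performs on a data frame before looping over rows.

```r
# Hypothetical two-row data frame with one factor and one numeric column
df <- data.frame(
  V2 = factor(c("Bcells", "Monocytes")),
  V3 = c(4474.274, 0)
)

m <- as.matrix(df)           # the factor column forces a character matrix
is_char <- is.character(m)   # TRUE

# The numeric column was formatted as padded text, so the zero is no longer 0
cmp <- m[2, "V3"] == 0       # FALSE: 0 is coerced to "0" and compared as text
literal_cmp <- "0.000" == 0  # FALSE for the same reason
```

This is why testing the numeric columns directly, or using `dat == 0` on the data frame itself, behaves differently from the row-wise `apply()` call.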