Subsetting a data frame by an array of values

Time: 2016-02-24 21:47:08

Tags: r

I have this data frame:

'data.frame':   114034 obs. of  4 variables:
 $ Ore     : chr  "01 00" "01 01" "01 02" "01 03" ...
 $ SquareID: chr  "10000" "10000" "10000" "10000" ...
 $ Intens  : num  0.0118 0.00987 0.00538 0.01318 0.00273 ...
 $ Count   : num  69 78 51 86 35 ...

I need to subset the data frame, keeping only the rows where SquareID equals one of the values in this array.

Df<-subset.data.frame(Df,Df$SquareID==c(4702,4703,4704,4705,5434,5435,4706,4707,4708,4709,4820,4939,4821,4822,5551,5057,4823,5058,4824,4825,5059,4826,5174,4940,4941,5175,4942,4943,5177,5178,4944,5060,4945,5061,4946,5062,5063,4728,5295,5296,5297,5180,5181,4845,4846,4847,4963,4964,5199,4353,5082,4355,4356,4585,9536,4586,4587,4470,4588,4471,4589,4472,4473,5412,5413,5414,9653,4590,4591,5530,5315,5316,5318))

Doing it this way, I get the following warning and the wrong result:

  

longer object length is not a multiple of shorter object length

2 answers:

Answer 0: (score: 1)

nrussell is right.
Suppose you have this data frame:

> df <- data.frame(SquareId = c("1","2","3","4"), Intens = c(15,30,45,60))
> df
  SquareId Intens
1        1     15
2        2     30
3        3     45
4        4     60

You can subset it like this:

> df <- subset(df, SquareId %in% c("2","3"))
> df
  SquareId Intens
2        2     30
3        3     45
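
For comparison, here is a minimal sketch of the same filter written with bracket indexing instead of subset(). It reuses the toy data frame from this answer; the names `wanted`, `by_subset`, and `by_bracket` are hypothetical and only serve the illustration.

```r
# Toy data frame from this answer (not the asker's real data)
df <- data.frame(SquareId = c("1", "2", "3", "4"), Intens = c(15, 30, 45, 60))

# Hypothetical vector of IDs to keep
wanted <- c("2", "3")

# subset() evaluates the condition inside the data frame
by_subset <- subset(df, SquareId %in% wanted)

# Bracket indexing: logical row mask, empty column index keeps all columns
by_bracket <- df[df$SquareId %in% wanted, ]

print(by_bracket)
```

Both forms should select the same rows here. Bracket indexing is often preferred inside functions, since the help page for subset() describes it as a convenience function intended for interactive use.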

Answer 1: (score: 1)

The problem is the logical comparison in this line:

Df<-subset.data.frame(Df,Df$SquareID==c(4702,4....))

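Df$SquareID is a vector of length 114034 (the number of rows in the data frame), while c(4702,4....) is a vector of length 73. The == operator compares the two vectors element by element, recycling the shorter one, and 114034 is not a multiple of 73, hence the warning "longer object length is not a multiple of shorter object length". It therefore does not test whether each SquareID appears anywhere in the list. As nrussell suggested, you need %in%.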

 Df<-subset(Df,Df$SquareID %in% c(4702,4....))
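
To make the difference concrete, here is a small sketch on made-up values (the vectors `ids` and `wanted` are hypothetical and much shorter than the real data). It shows how == recycles the shorter vector position by position, whereas %in% tests membership.

```r
# Hypothetical toy vectors, standing in for Df$SquareID and the list of IDs
ids <- c("1", "2", "3", "4", "5")
wanted <- c(2, 5)

# `==` recycles `wanted` to 2, 5, 2, 5, 2 and compares position by position.
# The length-mismatch warning is suppressed here only to keep the output clean.
eq_mask <- suppressWarnings(ids == wanted)

# `%in%` checks, for each element of `ids`, whether it occurs anywhere in `wanted`
in_mask <- ids %in% wanted

print(eq_mask)
print(in_mask)
```

In this sketch the == mask selects nothing, because no ID happens to line up with its recycled partner, while the %in% mask picks out "2" and "5". The character IDs are compared against numbers in both cases, which works because R coerces the numbers to character before comparing.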