Subsetting gridded data using percentiles

Asked: 2018-11-22 08:55:25

Tags: r

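I have gridded daily maximum temperature data (K) with 24249 observations and 963 variables. I am looking for a way in R to select all the days on which the maximum temperature is above the 90th percentile.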

> dim(DailyT)
[1] 24249   963
> DailyT[1:4,1:7]
     x    y  1988-05-01 1988-05-02 1988-05-03 1988-05-04 1988-05-05
1 34.000 33   291.7603   291.8044   291.6158   292.9659   293.7032
2 34.125 33   291.7240   291.7951   291.5439   292.9451   293.7017
3 34.250 33   291.6884   291.7866   291.4721   292.9250   293.7001
4 34.375 33   291.6521   291.7781   291.4010   292.9049   293.6986

I tried this, but it did not work:

df<- DailyT[DailyT[,3:963] <= quantile(DailyT[,3:963],.9, na.rm = T, type = 6) ] 

1 answer:

Answer 0 (score: 0)

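First, you need an id column so that you can identify the rows later. Then compute the 90th percentile of all temperature values. Finally, subset the data to the rows in which any cell is at or above q.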

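DailyT <- cbind(id = rownames(DailyT), DailyT)  # id column to identify rows later
# 90th percentile over all temperature columns (columns 1:3 are now id, x, y)
q <- quantile(as.matrix(DailyT[, -(1:3)]), .9, na.rm = TRUE, type = 6)  # about 293.7015 for the sample data
# keep the rows in which at least one temperature value is at or above q
DailyT.q <- DailyT[which(sapply(1:nrow(DailyT), function(x) any(DailyT[x, -(1:3)] >= q))), ]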

This yields:

> DailyT.q
  id      x  y X1988.05.01 X1988.05.02 X1988.05.03 X1988.05.04 X1988.05.05
1  1 34.000 33    291.7603    291.8044    291.6158    292.9659    293.7032
2  2 34.125 33    291.7240    291.7951    291.5439    292.9451    293.7017
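With 24249 rows, the row-by-row sapply() call can be slow. The same selection can be sketched in vectorized form with rowSums(). This is only a sketch under some assumptions: it starts from the original data frame (x and y in the first two columns, no id column), and the names temps, keep and hot_cells are made up for illustration.

```r
# Sample data copied from the dput() output in this answer
DailyT <- data.frame(
  x = c(34, 34.125, 34.25, 34.375),
  y = c(33L, 33L, 33L, 33L),
  X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
  X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
  X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
  X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
  X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)
)

# Temperature columns only (assumes x and y are the first two columns)
temps <- as.matrix(DailyT[, -(1:2)])

# One global 90th percentile over all grid cells and days
q <- quantile(temps, 0.9, na.rm = TRUE, type = 6)

# TRUE for rows (grid cells) with at least one value at or above q
keep <- rowSums(temps >= q, na.rm = TRUE) > 0

# hypothetical name for the subset of grid cells
hot_cells <- DailyT[keep, ]
```

On the sample data this keeps the same two grid cells (x = 34.000 and x = 34.125) as the sapply() version.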

Edit: To get the quantile for each row, use apply():

q90 <- apply(DailyT[, 4:8], MARGIN = 1, quantile, .9, na.rm = TRUE, type = 6)

> data.frame(DailyT, q90=q90)
  id      x  y X1988.05.01 X1988.05.02 X1988.05.03 X1988.05.04 X1988.05.05      q90
1  1 34.000 33    291.7603    291.8044    291.6158    292.9659    293.7032 293.7032
2  2 34.125 33    291.7240    291.7951    291.5439    292.9451    293.7017 293.7017
3  3 34.250 33    291.6884    291.7866    291.4721    292.9250    293.7001 293.7001
4  4 34.375 33    291.6521    291.7781    291.4010    292.9049    293.6986 293.6986
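If the goal is to list the individual days on which each grid cell reaches its own 90th percentile, the row-wise quantiles can be compared back against the temperature matrix. The following is a rough sketch of that idea and not necessarily what the question intended; it starts from the original data frame without the id column, and temps, exceed, idx and hot_days are hypothetical names.

```r
# Sample data copied from the dput() output in this answer
DailyT <- data.frame(
  x = c(34, 34.125, 34.25, 34.375),
  y = c(33L, 33L, 33L, 33L),
  X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
  X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
  X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
  X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
  X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)
)

# Temperature columns only (assumes x and y are the first two columns)
temps <- as.matrix(DailyT[, -(1:2)])

# One 90th percentile per grid cell (row)
q90 <- apply(temps, MARGIN = 1, quantile, 0.9, na.rm = TRUE, type = 6)

# q90 has one entry per row, so it is recycled down the columns:
# exceed[i, j] is TRUE when cell i on day j is at or above that cell's q90
exceed <- temps >= q90

# Long table of (cell, day) pairs that reach the threshold
idx <- which(exceed, arr.ind = TRUE)
hot_days <- data.frame(
  x    = DailyT$x[idx[, "row"]],
  y    = DailyT$y[idx[, "row"]],
  date = colnames(temps)[idx[, "col"]],
  temp = temps[idx]
)
```

With only five days per cell in the sample, the type 6 estimate coincides with the row maximum, so at most the hottest day of each cell is flagged. On the full 961-day series the threshold would sit below the maximum and more days would qualify.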

Data

> dput(DailyT)
structure(list(x = c(34, 34.125, 34.25, 34.375),
               y = c(33L, 33L, 33L, 33L),
               X1988.05.01 = c(291.7603, 291.724, 291.6884, 291.6521),
               X1988.05.02 = c(291.8044, 291.7951, 291.7866, 291.7781),
               X1988.05.03 = c(291.6158, 291.5439, 291.4721, 291.401),
               X1988.05.04 = c(292.9659, 292.9451, 292.925, 292.9049),
               X1988.05.05 = c(293.7032, 293.7017, 293.7001, 293.6986)),
          class = "data.frame", row.names = c(NA, -4L))