Aggregating with on= and secondary indices in data.table

Time: 2017-11-03 10:37:34

Tags: r data.table

Looking at the 4th data.table vignette (Secondary indices and auto indexing), example 2f appears to return the wrong month labels.

library(data.table)
flights <- as.data.table(read.csv(url("https://github.com/arunsrinivasan/flights/wiki/NYCflights14/flights14.csv")))

The example gives:

> head(flights["JFK", max(dep_delay), keyby = month, on = "origin"])

   month   V1
1:     1  881
2:     1 1014
3:     1  920
4:     1 1241
5:     1  853
6:     1  798

But reproducing it without the secondary index (an ordinary filter on origin instead of on=) yields:

> head(flights[origin == "JFK", max(dep_delay), keyby = month])

   month   V1
1:     1  881
2:     2 1014
3:     3  920
4:     4 1241
5:     5  853
6:     6  798

The error can be seen by looking up the row with dep_delay == 1014: it belongs to month 2, not month 1.

> flights[month =="1" & dep_delay == 1014]
Empty data.table (0 rows) of 17 cols: year,month,day,dep_time,dep_delay,arr_time...


> flights[month =="2" & dep_delay == 1014]
   year month day dep_time dep_delay arr_time arr_delay cancelled carrier tailnum flight origin dest air_time distance hour min
1: 2014     2  21      844      1014     1151      1007         0      DL  N983DL   2459    JFK  MCO      139      944    8  44
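
For a check that does not depend on downloading the flights data, the same comparison can be sketched on a small made-up table. The table `toy` and its values are hypothetical stand-ins, not taken from flights14, and whether the two forms agree may depend on the installed data.table version, so no expected output is shown for the `on=` form.

```r
library(data.table)

# Hypothetical toy data standing in for flights14 (values are made up)
toy <- data.table(
  origin    = c("JFK", "JFK", "LGA", "JFK", "JFK", "LGA"),
  month     = c(1L, 2L, 1L, 2L, 3L, 3L),
  dep_delay = c(10L, 50L, 5L, 70L, 30L, 99L)
)

# Form used in the vignette example: subset through on=, then group by month
via_on <- toy["JFK", max(dep_delay), keyby = month, on = "origin"]

# Form used for comparison: ordinary logical filter, then group by month
via_filter <- toy[origin == "JFK", max(dep_delay), keyby = month]

print(via_on)
print(via_filter)

# TRUE only when both forms agree on the month labels and on the maxima
same <- identical(via_on$month, via_filter$month) &&
  identical(via_on$V1, via_filter$V1)
print(same)
```

If `same` comes out FALSE, comparing the `month` columns of the two results shows directly whether the labels or the maxima differ.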

Is this an error in the example code, or a bug in data.table?

0 answers:

No answers yet.