Calculating the time spent in each category

Time: 2013-04-18 13:19:02

Tags: r

I have a string of categories, like this:

categoryVector <- c("1_100_1_2_3")

I also have the time at which each category was entered:

timeVector <- c("2013-03-07 05:16:50,617_2013-03-07 05:19:24,984_2013-03-07 05:21:06,002_2013-03-07 05:21:06,833_2013-03-07 05:21:10,713")  

I want to calculate the time spent in category 1 and in category 2:

Time spent in category 1: (time at 100 - time at first 1) + (time at 2 - time at second 1)
Time spent in category 2: time at 3 - time at 2
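As a rough check of the two formulas on the sample data, here is a minimal R sketch. It assumes the comma in each timestamp separates seconds from milliseconds and that the times can be parsed in UTC; the names `stamps`, `secs`, `cat1`, and `cat2` are illustrative and do not come from the original post.

```r
# The five sample timestamps, in the same order as categories 1, 100, 1, 2, 3.
stamps <- c("2013-03-07 05:16:50,617", "2013-03-07 05:19:24,984",
            "2013-03-07 05:21:06,002", "2013-03-07 05:21:06,833",
            "2013-03-07 05:21:10,713")

# Swap the comma for a dot so that "%OS" can read the fractional seconds.
parsed <- as.POSIXct(sub(",", ".", stamps, fixed = TRUE),
                     format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")
secs <- as.numeric(parsed)  # seconds since the epoch, fractions included

# Category 1 was entered twice (positions 1 and 3), category 2 once (position 4).
cat1 <- (secs[2] - secs[1]) + (secs[4] - secs[3])  # approximately 155.198 s
cat2 <- secs[5] - secs[4]                          # approximately 3.880 s
round(c(category1 = cat1, category2 = cat2), 3)
```

Each interval is attributed to the category that was active when it started, which is how the formulas above read.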

I need to repeat these calculations for 200K+ records. Is there an efficient way to do this in R?

1 answer:

Answer 0: (score: 0)

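# Put each "_"-separated timestamp on its own line; sep="," then splits
# the timestamp (V1) from the milliseconds (V2).
inp <- read.table(text=gsub("_", "\n", timeVector), sep=",")
# Only V1 is converted, so the differences below are in whole seconds.
inp$V1 <- as.POSIXct(inp$V1)
# Same trick for the categories: one per row.
inp2 <- read.table(text=gsub("_", "\n", categoryVector))

# Seconds until the next timestamp; the last row has no successor, hence NA.
# as.numeric() with explicit units keeps the column a plain number.
inp$diffs <- c(as.numeric(difftime(inp$V1[-1], inp$V1[-nrow(inp)], units="secs")), NA)
inp <- cbind(inp, inp2)
inp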
                   V1  V2 diffs  V1
1 2013-03-07 05:16:50 617   154   1
2 2013-03-07 05:19:24 984   102 100
3 2013-03-07 05:21:06   2     0   1
4 2013-03-07 05:21:06 833     4   2
5 2013-03-07 05:21:10 713    NA   3
# should probably rename those columns (there are now two named V1)
# Sum the durations by category, which is the fourth column.
tapply(inp$diffs, inp[,4], sum, na.rm=TRUE)
#  1   2   3 100 
#154   4   0 102
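The totals above are in whole seconds because the millisecond column is never used. A possible variant that keeps the fractional seconds and can be applied to many records is sketched below. The function name `time_per_category` is hypothetical, the sketch assumes every record has equally many categories and timestamps, it parses times in UTC, and it has not been benchmarked on 200K records.

```r
categoryVector <- c("1_100_1_2_3")
timeVector <- c("2013-03-07 05:16:50,617_2013-03-07 05:19:24,984_2013-03-07 05:21:06,002_2013-03-07 05:21:06,833_2013-03-07 05:21:10,713")

# Hypothetical helper: seconds spent per category for ONE record.
time_per_category <- function(categories, times) {
  cats   <- strsplit(categories, "_", fixed = TRUE)[[1]]
  stamps <- strsplit(times, "_", fixed = TRUE)[[1]]
  # Turn "50,617" into "50.617" so "%OS" reads the fractional seconds.
  secs <- as.numeric(as.POSIXct(sub(",", ".", stamps, fixed = TRUE),
                                format = "%Y-%m-%d %H:%M:%OS", tz = "UTC"))
  # diff() gives the time until the next timestamp; the last category
  # has no following timestamp, so it is left out of the grouping.
  tapply(diff(secs), cats[-length(cats)], sum)
}

res <- time_per_category(categoryVector, timeVector)
round(res, 3)  # category "1" is about 155.198 s, "100" about 101.018 s, "2" about 3.880 s

# With many records, Map() applies the helper pairwise and returns a list.
results <- Map(time_per_category, categoryVector, timeVector)
```

Returning a list from `Map()` avoids assuming that every record contains the same set of categories. If that loop turns out to be too slow, splitting all records at once and grouping by a record id is a natural next step.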