Time difference between adjacent timestamps in R

Time: 2016-01-22 18:18:44

Tags: r apply lubridate

I am looking for an efficient way to compute the time differences between consecutive timestamps within a given date. My data looks like this:

Time_formatted    Date       Hours_diff
1/20/2016 19:19   1/20/2016  0:46
1/20/2016 18:33   1/20/2016  2:43
1/20/2016 15:50   1/20/2016  1:28
1/20/2016 14:22   1/20/2016  1:50
1/20/2016 12:32   1/20/2016  4:52
1/20/2016 7:40    1/20/2016  0
1/19/2016 23:23   1/19/2016
1/19/2016 23:06   1/19/2016
1/19/2016 22:37   1/19/2016
1/19/2016 21:56   1/19/2016
1/19/2016 21:05   1/19/2016
1/19/2016 17:53   1/19/2016
1/19/2016 17:39   1/19/2016
1/19/2016 17:01   1/19/2016
1/19/2016 15:31   1/19/2016

My first idea was to use the ave function in R:

ave(data$Time_formatted, data$Date, FUN=difftime)

But this call clearly produces an error. Since the data set is very large, a loop would be inefficient. Any ideas on how to solve this kind of problem?

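2 answers:

Answer 0: (score: 1)

First, I prepare the data. I make sure the dates and times are stored in appropriate formats and sort the data frame by the Time_formatted column: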

# convert times to POSIXct, dates to Date
data$Time_formatted <- as.POSIXct(data$Time_formatted, format = "%m/%d/%Y %H:%M")
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
# sort
data <- data[order(data$Time_formatted), ]

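Then I use tapply() together with diff() to compute the differences in minutes. Note that I prepend an extra zero for the first time of each day, where the time difference is undefined: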

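my_diff <- function(x, ...) {
   # ask for minutes explicitly: diff() on POSIXct otherwise chooses its units automatically
   c(0, as.numeric(diff(x, ...), units = "mins"))
}
# relies on data being sorted by time, so the group order matches the row order
diffs <- unlist(tapply(data$Time_formatted, data$Date, my_diff))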
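As an aside, the ave() idea from the question can be made to work. The original call fails because difftime() expects two time arguments, whereas ave() hands FUN a single vector per group. The following is a minimal sketch, assuming a small hypothetical sample of the question's rows and a hypothetical column name diff_mins. The timestamps are converted to numeric seconds first so that ave() returns plain numbers.

```r
# Hypothetical sample: a few rows taken from the question's data
data <- data.frame(
  Time_formatted = as.POSIXct(
    c("1/19/2016 15:31", "1/19/2016 17:01", "1/19/2016 17:39",
      "1/20/2016 7:40", "1/20/2016 12:32"),
    format = "%m/%d/%Y %H:%M", tz = "UTC"
  )
)
data$Date <- as.Date(data$Time_formatted)

# The rows must be sorted by time, because diff() works in row order
data <- data[order(data$Time_formatted), ]

# Minutes since the previous timestamp on the same date, 0 for the first one.
# as.numeric() gives seconds since the epoch, so dividing by 60 gives minutes.
data$diff_mins <- ave(
  as.numeric(data$Time_formatted), data$Date,
  FUN = function(x) c(0, diff(x)) / 60
)
data$diff_mins
# expected: 0 90 38 0 292
```

Unlike unlist(tapply(...)), ave() writes each result back to the row it came from, so the result stays aligned with the data frame regardless of how the groups are ordered.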

The last step is to convert the time differences from minutes to %H:%M, as follows (see this answer for the use of formatC()):

mins2hm <- function(min) {
   h <- min %/% 60   # whole hours
   m <- min %% 60    # remaining minutes
   # pad the minutes to two digits; flag must be a character string
   paste(h, formatC(m, width = 2, flag = "0"), sep = ":")
}
data$diffs <- mins2hm(diffs)
data
##         Time_formatted       Date diffs
## 15 2016-01-19 15:31:00 2016-01-19  0:00
## 14 2016-01-19 17:01:00 2016-01-19  1:30
## 13 2016-01-19 17:39:00 2016-01-19  0:38
## 12 2016-01-19 17:53:00 2016-01-19  0:14
## 11 2016-01-19 21:05:00 2016-01-19  3:12
## 10 2016-01-19 21:56:00 2016-01-19  0:51
## 9  2016-01-19 22:37:00 2016-01-19  0:41
## 8  2016-01-19 23:06:00 2016-01-19  0:29
## 7  2016-01-19 23:23:00 2016-01-19  0:17
## 6  2016-01-20 07:40:00 2016-01-20  0:00
## 5  2016-01-20 12:32:00 2016-01-20  4:52
## 4  2016-01-20 14:22:00 2016-01-20  1:50
## 3  2016-01-20 15:50:00 2016-01-20  1:28
## 2  2016-01-20 18:33:00 2016-01-20  2:43
## 1  2016-01-20 19:19:00 2016-01-20  0:46
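A quick way to check the conversion helper on its own is to call it on a few hand-picked minute values. The sketch below repeats the helper so that it runs by itself; the input values are arbitrary examples, not part of the original data.

```r
# Same helper as in the answer, repeated so the snippet is self-contained
mins2hm <- function(min) {
  h <- min %/% 60   # whole hours
  m <- min %% 60    # remaining minutes
  paste(h, formatC(m, width = 2, flag = "0"), sep = ":")
}

# Arbitrary test values: 0, 5, 90 and 292 minutes
res <- mins2hm(c(0, 5, 90, 292))
res
# expected: "0:00" "0:05" "1:30" "4:52"
```

The second value shows why the zero padding matters: without it, 5 minutes would be rendered as "0:5".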

Answer 1: (score: 1)

The steps and their output are as follows:

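> df <- read.csv("data.txt", header = TRUE, stringsAsFactors = FALSE)
> # parse into POSIXct / Date, which are better suited to data frame columns than POSIXlt
> df$Time_formatted <- as.POSIXct(df$Time_formatted, format = "%m/%d/%Y %H:%M")
> df$Date <- as.Date(df$Date, format = "%m/%d/%Y")
> df <- df[order(df$Time_formatted, decreasing = TRUE), ]  # make sure it is ordered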
> df

        Time_formatted       Date
1  2016-01-20 19:19:00 2016-01-20
2  2016-01-20 18:33:00 2016-01-20
3  2016-01-20 15:50:00 2016-01-20
4  2016-01-20 14:22:00 2016-01-20
5  2016-01-20 12:32:00 2016-01-20
6  2016-01-20 07:40:00 2016-01-20
7  2016-01-19 23:23:00 2016-01-19
8  2016-01-19 23:06:00 2016-01-19
9  2016-01-19 22:37:00 2016-01-19
10 2016-01-19 21:56:00 2016-01-19
11 2016-01-19 21:05:00 2016-01-19
12 2016-01-19 17:53:00 2016-01-19
13 2016-01-19 17:39:00 2016-01-19
14 2016-01-19 17:01:00 2016-01-19
15 2016-01-19 15:31:00 2016-01-19


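> # time difference in minutes to the next (earlier) row; the last row gets 0
> df$Hours_diff <- c(-as.numeric(diff(df$Time_formatted), units = "mins"), 0)
> df[which(diff(df$Date) != 0), "Hours_diff"] <- 0  # set the earliest time point of each day to 0
> # format as H:MM, zero-padding the minutes
> df$Hours_diff <- ifelse(df$Hours_diff > 0,
+   sprintf("%d:%02d", as.integer(df$Hours_diff %/% 60), as.integer(df$Hours_diff %% 60)), "0")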
> df

        Time_formatted       Date Hours_diff
1  2016-01-20 19:19:00 2016-01-20       0:46
2  2016-01-20 18:33:00 2016-01-20       2:43
3  2016-01-20 15:50:00 2016-01-20       1:28
4  2016-01-20 14:22:00 2016-01-20       1:50
5  2016-01-20 12:32:00 2016-01-20       4:52
6  2016-01-20 07:40:00 2016-01-20          0
7  2016-01-19 23:23:00 2016-01-19       0:17
8  2016-01-19 23:06:00 2016-01-19       0:29
9  2016-01-19 22:37:00 2016-01-19       0:41
10 2016-01-19 21:56:00 2016-01-19       0:51
11 2016-01-19 21:05:00 2016-01-19       3:12
12 2016-01-19 17:53:00 2016-01-19       0:14
13 2016-01-19 17:39:00 2016-01-19       0:38
14 2016-01-19 17:01:00 2016-01-19       1:30
15 2016-01-19 15:31:00 2016-01-19          0
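The transcript above reads data.txt, which is not included here. To try the same steps without that file, a few of the question's rows can be supplied inline. This is only a sketch under that assumption: the inline text stands in for the file, and the gaps are kept as numeric minutes (in a hypothetical vector mins) rather than being formatted as strings.

```r
# Assumed stand-in for data.txt: five rows copied from the question
df <- read.csv(text = "Time_formatted,Date
1/20/2016 19:19,1/20/2016
1/20/2016 18:33,1/20/2016
1/20/2016 7:40,1/20/2016
1/19/2016 23:23,1/19/2016
1/19/2016 23:06,1/19/2016", stringsAsFactors = FALSE)

df$Time_formatted <- as.POSIXct(df$Time_formatted,
                                format = "%m/%d/%Y %H:%M", tz = "UTC")
df$Date <- as.Date(df$Date, format = "%m/%d/%Y")
df <- df[order(df$Time_formatted, decreasing = TRUE), ]

# Gap in minutes to the next (earlier) row; the last row has no earlier row
mins <- c(-as.numeric(diff(df$Time_formatted), units = "mins"), 0)

# A gap that crosses a date boundary is reset to 0
mins[which(diff(df$Date) != 0)] <- 0
mins
# expected: 46 653 0 17 0
```

Keeping the result numeric makes it easy to summarise later, for example with mean() or max(), and the H:MM formatting can be applied as a final display step.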