How do I calculate a duration (in minutes) based on a condition?

Time: 2017-04-12 02:01:04

Tags: r

MY DATASET

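My dataset contains the start and finish times of a number of people (ID) working in different areas (Location) on different days of the week (Day). An example of my dataset is below: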

> head(WeekOne, 15)
                 Start              Finish Day     ID Location
1  2017-04-12 00:00:00 2017-04-12 00:02:55  D1 Daniel   Office
2  2017-04-12 00:02:55 2017-04-12 00:06:18  D1 Daniel   Office
3  2017-04-12 00:06:18 2017-04-12 00:08:20  D1 Daniel   OnSite
4  2017-04-12 00:08:20 2017-04-12 00:08:40  D1 Daniel   OnSite
5  2017-04-12 00:08:40 2017-04-12 00:10:11  D1 Daniel   Travel
6  2017-04-12 00:10:11 2017-04-12 00:10:18  D1 Daniel   Travel
7  2017-04-12 00:10:18 2017-04-12 00:17:52  D1 Daniel   Travel
8  2017-04-12 00:17:52 2017-04-12 00:19:00  D1 Daniel   Travel
9  2017-04-12 00:19:00 2017-04-12 00:19:56  D1 Daniel   OnSite
10 2017-04-12 00:19:56 2017-04-12 00:28:48  D1 Daniel   OnSite
11 2017-04-12 00:00:00 2017-04-12 00:03:52  D2 Daniel   OnSite
12 2017-04-12 00:03:52 2017-04-12 00:04:05  D2 Daniel   Office
13 2017-04-12 00:04:05 2017-04-12 00:08:32  D2 Daniel   Office
14 2017-04-12 00:08:32 2017-04-12 00:16:01  D2 Daniel   Travel
15 2017-04-12 00:16:01 2017-04-12 00:25:35  D2 Daniel   OnSite

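I want to know the total time (in minutes) that each ID spent in each Location over the week. Day goes up to D7, and I have a separate data.frame for each week, so I only need to iterate over Location and ID.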

WHAT I HAVE TRIED

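I tried the code below, but it returns the minutes in a strange format and does not account for the same location being visited more than once in a day. For example, Daniel is OnSite twice on D1.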

WeekOne %>% 
  group_by(ID, Location) %>% 
  summarise(Duration = max(Finish) - min(Start))
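A small base R sketch of why the first-start-to-last-finish span overstates the time: it uses only Daniel's four OnSite rows on D1, re-typed from the sample above. The `tz = "UTC"` setting and the names `onsite`, `span_mins` and `spent_mins` are assumptions made for illustration. Requesting `units = "mins"` from `difftime()` and converting with `as.numeric()` also gives plain numbers rather than a difftime object whose units R picks automatically, which may be the cause of the odd-looking output.

```r
# Daniel's four OnSite rows on D1, copied from the sample data.
onsite <- data.frame(
  Start  = as.POSIXct(c("2017-04-12 00:06:18", "2017-04-12 00:08:20",
                        "2017-04-12 00:19:00", "2017-04-12 00:19:56"), tz = "UTC"),
  Finish = as.POSIXct(c("2017-04-12 00:08:20", "2017-04-12 00:08:40",
                        "2017-04-12 00:19:56", "2017-04-12 00:28:48"), tz = "UTC")
)

# Span from the first start to the last finish: this also counts the
# Travel rows that sit between the two OnSite visits.
span_mins <- as.numeric(difftime(max(onsite$Finish), min(onsite$Start),
                                 units = "mins"))

# Sum of the individual intervals: only the time actually logged as OnSite.
spent_mins <- sum(as.numeric(difftime(onsite$Finish, onsite$Start,
                                      units = "mins")))

print(c(span = span_mins, spent = spent_mins))
```

The span comes to 22.5 minutes, while the rows themselves add up to roughly 12.2 minutes, so summing the per-row durations is the safer quantity to aggregate.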

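I did consider creating a new column, WeekOne$Level, that numbers each run of consecutive rows at the same Location, so that repeat visits and changes of Location get their own level. I could then iterate over each level and use the code above. For example: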

> head(WeekOne, 15)
                 Start              Finish Day     ID Location Level
1  2017-04-12 00:00:00 2017-04-12 00:02:55  D1 Daniel   Office 1
2  2017-04-12 00:02:55 2017-04-12 00:06:18  D1 Daniel   Office 1
3  2017-04-12 00:06:18 2017-04-12 00:08:20  D1 Daniel   OnSite 2
4  2017-04-12 00:08:20 2017-04-12 00:08:40  D1 Daniel   OnSite 2
5  2017-04-12 00:08:40 2017-04-12 00:10:11  D1 Daniel   Travel 3
6  2017-04-12 00:10:11 2017-04-12 00:10:18  D1 Daniel   Travel 3
7  2017-04-12 00:10:18 2017-04-12 00:17:52  D1 Daniel   Travel 3
8  2017-04-12 00:17:52 2017-04-12 00:19:00  D1 Daniel   Travel 3
9  2017-04-12 00:19:00 2017-04-12 00:19:56  D1 Daniel   OnSite 4
10 2017-04-12 00:19:56 2017-04-12 00:28:48  D1 Daniel   OnSite 4
11 2017-04-12 00:00:00 2017-04-12 00:03:52  D2 Daniel   OnSite 5 
12 2017-04-12 00:03:52 2017-04-12 00:04:05  D2 Daniel   Office 6
13 2017-04-12 00:04:05 2017-04-12 00:08:32  D2 Daniel   Office 6
14 2017-04-12 00:08:32 2017-04-12 00:16:01  D2 Daniel   Travel 7
15 2017-04-12 00:16:01 2017-04-12 00:25:35  D2 Daniel   OnSite 8

WeekOne %>% 
  group_by(ID, Level) %>% 
  summarise(Duration = max(Finish) - min(Start))

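However, I am not sure how to add this column. Grouping by Level also drops Location from the result, the approach seems cumbersome, and it does not solve the problem of the minutes being returned in a funny format.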
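For reference, a run counter like the Level column can be sketched in base R by flagging each row whose ID/Day/Location combination differs from the previous row and taking the cumulative sum of those flags. This is only an illustration under the assumption that the rows are already in chronological order within each ID and Day; `key` and `new_run` are hypothetical helper names, and only the three columns needed for the counter are re-typed from the sample.

```r
# Day, ID and Location for the 15 sample rows, re-typed from the question.
WeekOne <- data.frame(
  Day = rep(c("D1", "D2"), times = c(10, 5)),
  ID = "Daniel",
  Location = c("Office", "Office", "OnSite", "OnSite", "Travel",
               "Travel", "Travel", "Travel", "OnSite", "OnSite",
               "OnSite", "Office", "Office", "Travel", "OnSite"),
  stringsAsFactors = FALSE
)

# A row starts a new run when its ID/Day/Location combination differs
# from the row directly above it (assumes chronological row order).
key <- paste(WeekOne$ID, WeekOne$Day, WeekOne$Location, sep = "|")
new_run <- c(TRUE, key[-1] != key[-length(key)])

# The running count of run starts is the run number.
WeekOne$Level <- cumsum(new_run)
print(WeekOne$Level)
```

On the sample rows this reproduces the Level values shown in the table (1 1 2 2 3 3 3 3 4 4 5 6 6 7 8). Such a column is not needed for the weekly totals, though: every row already carries its own Start and Finish, so per-row durations can be summed directly.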

MY QUESTION

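How can I quickly and easily calculate the total duration each ID spent in each Location over the week? I would like the duration in minutes, rounded to the nearest minute. For example: 3 minutes.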

1 Answer:

Answer 0: (score: 1)

You want to calculate the duration of each row first, and then take the sum by ID and Location:

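library(dplyr)

WeekOne %>% 
      # explicit units, so R does not pick secs/mins/hours automatically
      mutate(Duration = as.numeric(difftime(Finish, Start, units = "mins"))) %>%
      group_by(ID, Location) %>% 
      # total minutes per ID and Location; drop the 1 to round to whole minutes
      summarize(Total_Duration = round(sum(Duration), 1))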
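To try the sum-of-row-durations approach on the 15 sample rows, here is a minimal self-contained sketch. The data frame is re-typed from the question, and `to_time`, `day1`, `day2`, `totals`, the `Minutes` and `Total_Minutes` column names and `tz = "UTC"` are assumptions made for illustration. It requests minutes explicitly through `difftime()` and rounds to whole minutes, as the question asks; `.groups = "drop"` assumes dplyr 1.0 or later.

```r
library(dplyr)

# Interval boundaries for each day, re-typed from the sample rows.
day1 <- c("00:00:00", "00:02:55", "00:06:18", "00:08:20", "00:08:40", "00:10:11",
          "00:10:18", "00:17:52", "00:19:00", "00:19:56", "00:28:48")
day2 <- c("00:00:00", "00:03:52", "00:04:05", "00:08:32", "00:16:01", "00:25:35")

# Hypothetical helper: attach the date used in the sample and parse as UTC.
to_time <- function(x) as.POSIXct(paste("2017-04-12", x), tz = "UTC")

WeekOne <- data.frame(
  Start    = to_time(c(head(day1, -1), head(day2, -1))),
  Finish   = to_time(c(tail(day1, -1), tail(day2, -1))),
  Day      = rep(c("D1", "D2"), times = c(10, 5)),
  ID       = "Daniel",
  Location = c("Office", "Office", "OnSite", "OnSite", "Travel",
               "Travel", "Travel", "Travel", "OnSite", "OnSite",
               "OnSite", "Office", "Office", "Travel", "OnSite"),
  stringsAsFactors = FALSE
)

totals <- WeekOne %>%
  # per-row duration as a plain number of minutes
  mutate(Minutes = as.numeric(difftime(Finish, Start, units = "mins"))) %>%
  group_by(ID, Location) %>%
  # sum first, then round, so rounding error does not accumulate per row
  summarise(Total_Minutes = round(sum(Minutes)), .groups = "drop")

print(totals)
```

For these rows the totals work out to 11 minutes for Office, 26 for OnSite and 18 for Travel. Because the two OnSite visits on D1 are summed row by row, the Travel time between them is not counted as OnSite.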