Count the number of days between two given dates in a table, grouped by month

Time: 2018-09-17 12:22:21

Tags: r

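I have a set of start dates and end dates. I need to count the number of days month by month, excluding weekends and national holidays. The output data is shown here:

My code has become too complicated and still does not give the correct result.

The code I tried is: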

library(lubridate)  # ymd(), year(), month()
library(zoo)        # as.yearmon()

sd <- "24-Jan-18"
ed <- "4-Mar-18"
sd_m <- ymd(strptime(as.character(sd), format = "%d-%b-%y"))
ed_m <- ymd(strptime(as.character(ed), format = "%d-%b-%y"))
s_m <- format(sd_m, "%b-%Y")
e_m <- format(ed_m, "%b-%Y")
no_months <- (year(ed_m) - year(sd_m)) * 12 + month(ed_m) - month(sd_m) + 1
i = 0
day_count = as.vector(0)
# last day (frac = 1) and first day (frac = 0) of every month in the range
e_mon = as.Date(seq(as.yearmon(sd_m), as.yearmon(ed_m), 1/12), frac = 1)
s_mon = as.Date(seq(as.yearmon(sd_m), as.yearmon(ed_m), 1/12), frac = 0)
# 'holiday' stands for the number of holidays in the range; it is not computed yet
day_count[1] = sum(!weekdays(seq(sd_m, e_mon[1], "days")) %in% c('Saturday', 'Sunday')) - holiday
i = 2
for (i in 1:(no_months - 1)) {
  day_count[i] = sum(!weekdays(seq(s_mon[i], e_mon[i], "days")) %in% c('Saturday', 'Sunday')) - holiday
}
day_count[no_months] = sum(!weekdays(seq(s_mon[no_months], ed_m, "days")) %in% c('Saturday', 'Sunday')) - holiday

To count the holidays, I was about to write a clumsy for loop over this vector:

hol = c("2018-01-26", "2018-05-01", "2018-08-15", "2018-09-13", "2018-10-02", "2018-12-25")

I also tried using bizdays:

create.calendar(name = 'my_cal', holidays = hol1, weekdays = c('Saturday', 'Sunday'))

But it gives an error:

bizdays(sd_m,e_mon[1],my_cal)
Error in check_calendar(cal) : object 'my_cal' not found

Please help me build this!
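One likely cause of the error above is that `create.calendar()` registers the calendar under the name `'my_cal'`, while the `bizdays()` call passes `my_cal` as a bare R symbol, and no object with that name exists. The following is only a sketch of a call that refers to the calendar by its name. It rests on several assumptions that are not in the question: lowercase weekday names (as used in the bizdays package examples), an explicit `start.date`/`end.date` range wide enough to cover every queried date, and `financial = FALSE` so that both end points of the interval are counted.

```r
library(bizdays)

# Holiday dates from the question, converted to Date
hol <- as.Date(c("2018-01-26", "2018-05-01", "2018-08-15",
                 "2018-09-13", "2018-10-02", "2018-12-25"))

# Register a calendar under the name "my_cal".
# Assumptions (not from the question): lowercase weekday names,
# an explicit date range, and financial = FALSE to count both end points.
create.calendar(name = "my_cal", holidays = hol,
                weekdays = c("saturday", "sunday"),
                start.date = as.Date("2018-01-01"),
                end.date = as.Date("2018-12-31"),
                financial = FALSE)

# Refer to the calendar by its name as a string, not as a bare symbol
n_jan <- bizdays(as.Date("2018-01-24"), as.Date("2018-01-31"), "my_cal")
print(n_jan)
```

Assigning the result of `create.calendar()` to a variable and passing that object as the third argument should work as well, if the installed version returns the calendar object.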

2 Answers:

Answer 0: (score: 1)

A tidyverse approach, using a few lubridate functions

sd ="2018-01-24"
ed ="2018-03-04"

#create a data.frame with all days from start date (sd) to end date (ed)
df <- data.frame( dates = seq( as.Date(sd), as.Date(ed), by = "days"))

#create the vector with Holiday-dates
holidays_v <- as.Date( c("2018-01-26", "2018-05-01", "2018-08-15", "2018-09-13", "2018-10-02", "2018-12-25") )

library(tidyverse)

df %>% 
  #filter out all days that are Sundays (wday == 1) or Saturdays (wday == 7), or that are in the vector of holidays
  filter( !lubridate::wday( dates ) %in% c(1,7) & !dates %in% holidays_v ) %>%
  #create period to summarise by (here: year-month)
  mutate( period = paste( lubridate::year(dates), formatC(lubridate::month(dates), width = 2, format = "d", flag = "0"), sep = "-") ) %>%
  # group by period
  group_by( period ) %>%
  #... and summarise
  summarise( number = n() )

# # A tibble: 3 x 2
#   period  number
#   <chr>    <int>
# 1 2018-01      5
# 2 2018-02     20
# 3 2018-03      2
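
As a cross-check on the counts above, the same filter can be written in base R. This is a minimal sketch and not part of the original answer. It assumes the same `sd`, `ed` and holiday dates, and it detects weekends with `format(x, "%u")` (weekday number, Monday = 1), so the result does not depend on the session locale or on lubridate's week-start setting.

```r
# Same inputs as in the answer above
sd <- as.Date("2018-01-24")
ed <- as.Date("2018-03-04")
holidays_v <- as.Date(c("2018-01-26", "2018-05-01", "2018-08-15",
                        "2018-09-13", "2018-10-02", "2018-12-25"))

# All calendar days in the range, both end points included
dates <- seq(sd, ed, by = "day")

# "%u" gives the weekday as "1" (Monday) to "7" (Sunday), independent of locale
is_weekend <- format(dates, "%u") %in% c("6", "7")
is_holiday <- dates %in% holidays_v

# Keep working days only and count them per year-month
working <- dates[!is_weekend & !is_holiday]
counts <- table(period = format(working, "%Y-%m"))
print(counts)
```

The table should agree with the tibble above: 5 days for 2018-01, 20 for 2018-02 and 2 for 2018-03.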

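Answer 1: (score: 0)

Here is a base R solution. Using the input shown reproducibly in the Note at the end, toLong creates a data frame d with one row per date and removes the weekends and holidays from it. It then aggregates by year and month. This is applied to each row of the input, giving a list L of data frames, which are then combined with rbind. Finally, the result is converted to wide form. If column names of the form yyyy-mm are acceptable, the last line of code can be omitted.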

toLong <- function(row, sd, ed, hol) {
  s <- seq(sd, ed, "day")
  d <- data.frame(row, s, ym = format(s, "%Y-%m"))
  d <- subset(d, ! weekdays(s) %in% c("Saturday", "Sunday"))
  d <- subset(d, ! s %in% hol)
  data.frame(row, sd, ed, aggregate(s ~ ym, d, FUN = length))
}

L <- Map(toLong, 1:nrow(DF), DF$sd, DF$ed, MoreArgs = list(hol = hol))
DF2 <- do.call("rbind", L)
xt <- xtabs(s ~ row + ym, DF2)
DF3 <- cbind(DF, as.data.frame.matrix(xt))
names(DF3)[-(1:2)] <- format(as.Date(paste0(names(DF3)[-(1:2)], "-01")), "%b %Y")

giving:

> DF3
          sd         ed Oct 2018 Nov 2018 Jan 2018 Feb 2018 Mar 2018
1 2018-10-01 2018-11-01       22        1        0        0        0
2 2018-01-24 2018-03-04        0        0        5       20        2
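
The final reshaping step may be easier to follow on a tiny example. The sketch below uses made-up counts in a hypothetical long data frame `long_counts` (these are not the real results) to show how `xtabs` turns one line per input row and month into one column per month, filling the months that a row does not touch with 0.

```r
# Hypothetical long-format counts: one line per (input row, month)
long_counts <- data.frame(
  row = c(1, 1, 2),
  ym  = c("2019-01", "2019-02", "2019-02"),
  s   = c(3, 7, 4)
)

# Cross-tabulate: rows are input rows, columns are months, cells sum s
xt <- xtabs(s ~ row + ym, long_counts)

# Convert the contingency table into an ordinary data frame
wide_counts <- as.data.frame.matrix(xt)
print(wide_counts)
#   2019-01 2019-02
# 1       3       7
# 2       0       4
```

Row 2 has no line for 2019-01 in the long data, so its cell in the wide table is 0.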

Note

The reproducible input is:

DF <-
  structure(list(sd = structure(c(17805, 17555), class = "Date"), 
    ed = structure(c(17836, 17594), class = "Date")), row.names = c(NA, 
  -2L), class = "data.frame")

hol <- as.Date(c("2018-01-26", "2018-05-01", "2018-08-15", "2018-09-13", 
  "2018-10-02", "2018-12-25"))