Short character date to short date in R

Time: 2017-04-14 20:09:54

Tags: r datetime

I have a df with dates in the following format.

Date                     Year
<chr>                    <dbl>
Sunday, Jul 27           2008
Tuesday, Jul 29          2008
Wednesday, July 31 (1)   2008
Wednesday, July 31 (2)   2008

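Is there a simple way to get columns and values in the following format? I would also like to remove the (1) and (2) markers from the two July 31 dates.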

Date         Year    Month    Day    Day_of_Week
2008-07-27   2008    07       27     Sunday

2 answers:

Answer 0: (score: 4)

Using base R, you can:

dat <- data.frame(
  Date = c("Sunday, Jul 27" ,"Tuesday, Jul 29", "Wednesday, July 31", "Wednesday, July 31"),
  Year = rep(2008, 4),
  stringsAsFactors = FALSE
)


dts <- as.POSIXlt(paste(dat$Year, dat$Date), format = "%Y %A, %B %d")

POSIXlt provides a list-based representation of date/times. To see the components, try unclass(dts[1]).
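
As a small illustration of that list-based layout, the sketch below reads a few components directly. The date and the variable name x are chosen here purely for illustration and are not part of the answer.

```r
# A fixed, hypothetical date chosen for illustration.
x <- as.POSIXlt("2008-07-27", tz = "UTC")

names(unclass(x))   # component names such as "sec", "min", "hour", "mday", "mon", "year", "wday"
x$year + 1900       # years are stored as an offset from 1900
x$mon + 1           # months are 0-based (0 = January)
x$mday              # day of the month is 1-based
x$wday              # day of the week, 0 = Sunday
```

The 0-based month is the detail that matters most when building a Month column.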

From here the rest is arguably academic:

dat$Month <- 1 + dts$mon         # months are 0-based in POSIXlt
dat$Day <- dts$mday
dat$Day_of_Week <- weekdays(dts) # computed from the date, not copied from the input text
dat
#                 Date Year Month Day Day_of_Week
# 1     Sunday, Jul 27 2008     7  27      Sunday
# 2    Tuesday, Jul 29 2008     7  29     Tuesday
# 3 Wednesday, July 31 2008     7  31    Thursday
# 4 Wednesday, July 31 2008     7  31    Thursday
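
The question also asked for the "(1)"/"(2)" markers to be removed and for a Date column like 2008-07-27. Here is a minimal base R sketch of one way to get there. It assumes an English locale for the day and month names, and the names clean and out are introduced only for this example.

```r
# Hypothetical input mirroring the question, including the "(1)"/"(2)" markers.
dat <- data.frame(
  Date = c("Sunday, Jul 27", "Tuesday, Jul 29",
           "Wednesday, July 31 (1)", "Wednesday, July 31 (2)"),
  Year = rep(2008, 4),
  stringsAsFactors = FALSE
)

# Drop a trailing parenthesised marker such as " (1)".
clean <- sub("\\s*\\([^)]*\\)\\s*$", "", dat$Date)

# Parse year + cleaned text; on input, %A and %B accept full or abbreviated names.
dts <- as.POSIXlt(paste(dat$Year, clean), format = "%Y %A, %B %d", tz = "UTC")

out <- data.frame(
  Date        = as.Date(dts),
  Year        = dat$Year,
  Month       = format(dts, "%m"),  # zero-padded character, e.g. "07"
  Day         = dts$mday,
  Day_of_Week = weekdays(dts),      # derived from the parsed date, in the current locale
  stringsAsFactors = FALSE
)
out
```

Month is kept as a character string here so that the leading zero survives. Use 1 + dts$mon instead if a number is preferred.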

Answer 1: (score: 2)

library(dplyr)
library(lubridate)

dat <- tibble(  # tibble() replaces the deprecated data_frame()
  date = c('Sunday, Jul 27', 'Tuesday, Jul 29',
           'Wednesday, July 31 (1)', 'Wednesday, July 31 (2)'),
  year = rep(2008, 4)
)

dat1 <- dat %>%
  mutate(date = gsub("\\s*\\([^\\)]+\\)", "", as.character(date)),  # strip "(1)", "(2)"
         date = parse_date_time(date, 'A, b! d '))                  # parsed without a year
year(dat1$date) <- dat1$year  # set the year from the year column
dat1

# A tibble: 4 × 2
        date  year
      <dttm> <dbl>
1 2008-07-27  2008
2 2008-07-29  2008
3 2008-07-31  2008
4 2008-07-31  2008
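
Once the column holds real dates, the remaining columns from the question can be derived with lubridate's accessor functions. This is a minimal sketch, not part of the answer: the dates are typed in directly as stand-ins for the parsed column, and the names d and res are illustrative.

```r
library(lubridate)

# Hypothetical dates standing in for the parsed column.
d <- ymd(c("2008-07-27", "2008-07-29", "2008-07-31"))

res <- data.frame(
  Date        = d,
  Year        = year(d),
  Month       = month(d),                            # numeric month, 1-12
  Day         = day(d),
  Day_of_Week = wday(d, label = TRUE, abbr = FALSE)  # ordered factor of weekday names
)
res
```

The weekday labels follow the session locale, so the exact strings may differ between systems.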