Lubridate mdy function

Time: 2016-06-09 21:02:46

Tags: r lubridate

I'm trying to convert the following dates and am not having success with one of them ([1]): "4/2/10" becomes "0010-04-02".

Is there a way to correct this?

Thanks, Vivek

library(lubridate)

data <- data.frame(initialDiagnose = c("4/2/10","14.01.2009", "9/22/2005", 
        "4/21/2010", "28.01.2010", "09.01.2009", "3/28/2005", 
        "04.01.2005", "04.01.2005", "Created on 9/17/2010", "03 01 2010"))

mdy <- mdy(data$initialDiagnose) 
dmy <- dmy(data$initialDiagnose) 
mdy[is.na(mdy)] <- dmy[is.na(mdy)] # some dates are ambiguous, here we give 
data$initialDiagnose <- mdy        # mdy precedence over dmy
data

   initialDiagnose
1       0010-04-02
2       2009-01-14
3       2005-09-22
4       2010-04-21
5       2010-01-28
6       2009-09-01
7       2005-03-28
8       2005-04-01
9       2005-04-01
10      2010-09-17
11      2010-03-01

1 answer:

Answer 0 (score: 3)

I think this is happening because the mdy() function prefers to match the year with %Y (the actual year) over %y (the 2-digit abbreviation for the year, which defaults to 19XX or 20XX).

There is a workaround, though. I looked at the help file for lubridate::parse_date_time (?parse_date_time), and near the bottom there is an example of adding an argument that prefers matching the year with the %y format over the %Y format. The relevant code from the help file:

## ** how to use `select_formats` argument **
## By default %Y has precedence:
parse_date_time(c("27-09-13", "27-09-2013"), "dmy")
## [1] "13-09-27 UTC"   "2013-09-27 UTC"

## to give priority to %y format, define your own select_format function:
my_select <-   function(trained){
  n_fmts <- nchar(gsub("[^%]", "", names(trained))) + grepl("%y", names(trained))*1.5
  names(trained[ which.max(n_fmts) ])
}

parse_date_time(c("27-09-13", "27-09-2013"), "dmy", select_formats = my_select)
## [1] "2013-09-27 UTC" "2013-09-27 UTC"

So, for your example, you can adapt this code and replace the mdy <- mdy(data$initialDiagnose) line with:

# Define a select function that prefers %y over %Y. This is copied 
# directly from the help files
my_select <-   function(trained){
  n_fmts <- nchar(gsub("[^%]", "", names(trained))) + grepl("%y", names(trained))*1.5
  names(trained[ which.max(n_fmts) ])
}

# Parse as mdy dates
mdy <- parse_date_time(data$initialDiagnose, "mdy", select_formats = my_select)
# [1] "2010-04-02 UTC" NA               "2005-09-22 UTC" "2010-04-21 UTC" NA              
# [6] "2009-09-01 UTC" "2005-03-28 UTC" "2005-04-01 UTC" "2005-04-01 UTC" "2010-09-17 UTC"
#[11] "2010-03-01 UTC"

Then run the remaining lines of code from your question. The first entry now parses as 2010-04-02 instead of 0010-04-02, and the two NA entries are still filled in from the dmy result.
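To see the difference between %Y and %y in isolation, here is a minimal sketch using only base R's as.Date, not lubridate. It illustrates the two format specifiers and does not show how mdy() chooses between them internally. The variable names are just for illustration.

```r
# Base R only: parse the same string with a four-digit-year
# specifier (%Y) and with a two-digit-year specifier (%y).
x <- "4/2/10"

date_full  <- as.Date(x, format = "%m/%d/%Y")  # "10" is taken as the literal year 10
date_short <- as.Date(x, format = "%m/%d/%y")  # "10" is expanded to 2010

# Pull out the numeric year rather than relying on how the date prints,
# since printing of years below 1000 can vary by platform.
year_full  <- as.POSIXlt(date_full)$year + 1900
year_short <- as.POSIXlt(date_short)$year + 1900

print(c(full = year_full, short = year_short))
```

With %Y the "10" is kept as year 10, which is where the 0010-04-02 in the question comes from. With %y, base R maps 00 to 68 onto 20XX and 69 to 99 onto 19XX.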