Calculating the difference between two two-digit years

Asked: 2017-10-26 07:10:25

Tags: r difference

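Is there a simple way in R to calculate the difference between two columns of two-digit years (years only, no months or days, as they are not needed here), so as to produce a column of ages?

I'm quite new to this and have been playing around with 'if' statements and algebra without success.

The data looks like this, but is larger: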

dat <- data.frame(year1=c("98","99","00","01","02"),
                  year2=c("03","04","05","06","07"))

4 Answers:

Answer 0 (score: 3)

You can use strptime() with the format %y:

dat <- data.frame(year1=c("98","99","00","01","02"),
    year2=c("03","04","05","06","07"),
    stringsAsFactors = FALSE) # You might want to use this as a default!

dat$year1 <- strptime(dat$year1, format = "%y")
dat$year2 <- strptime(dat$year2, format = "%y")

as.vector(difftime(dat$year2,
    dat$year1,
    units = "days"))/365.242
4.999311 5.002163 4.999425 4.999425 4.999425
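
Since only whole years are wanted, the division by 365.242 can be avoided. A minimal sketch, assuming the same example data: strptime() returns a POSIXlt object whose year component counts years since 1900, so the calendar years can be subtracted directly. The names y1 and y2 are introduced here purely for illustration.

```r
dat <- data.frame(year1 = c("98","99","00","01","02"),
                  year2 = c("03","04","05","06","07"),
                  stringsAsFactors = FALSE)

# POSIXlt stores the year as an offset from 1900
y1 <- strptime(dat$year1, format = "%y")$year + 1900
y2 <- strptime(dat$year2, format = "%y")$year + 1900

# Whole-year difference, no day arithmetic involved
dat$age <- y2 - y1
dat$age
```

This keeps the result an exact whole number instead of a value such as 4.999311.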

Answer 1 (score: 2)

Format to date, format to numeric, take the difference:

do.call(`-`, lapply(dat[1:2], function(x) 
    as.numeric(format(as.Date(x, format="%y"), "%Y"))))
#[1] -5 -5 -5 -5 -5
# This is year1 - year2, hence the negative sign; use dat[2:1] for year2 - year1.

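This may give wrong results if you have old dates from the early 1900s. According to ?strptime, when %y is read on input, values 00 to 68 are prefixed by 20 and values 69 to 99 by 19.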
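
To see the century cutoff in practice (per ?strptime, %y reads 00 to 68 as 20xx and 69 to 99 as 19xx), and to sidestep it when the data is known to contain older years, here is a small sketch. expand_year() and its pivot argument are hypothetical names made up for this illustration; they are not part of base R.

```r
# %y on input: 00-68 become 20xx, 69-99 become 19xx
format(as.Date(c("68", "69"), format = "%y"), "%Y")
#[1] "2068" "1969"

# Hypothetical helper: two-digit years up to `pivot` are read as 20xx,
# everything above it as 19xx
expand_year <- function(yy, pivot) {
  yy <- as.integer(as.character(yy))  # also handles factor columns
  ifelse(yy <= pivot, 2000L + yy, 1900L + yy)
}

expand_year(c("98", "00", "17", "18"), pivot = 17)
#[1] 1998 2000 2017 1918
```

Choosing the pivot explicitly makes the assumption about the data visible instead of relying on the built-in cutoff.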

Answer 2 (score: 0)

df$age <- ifelse(df$year2 < df$year1, df$year2 - df$year1 + 100, df$year2 -df$year1)

This should work, assuming that year2 is some kind of current year, that year1 is a birth year, and that nobody was born before 1918.

Example:
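
The same wrap-around can also be written with the modulo operator, since R's %% returns a non-negative result for a positive divisor. A sketch under the same assumption (every age is below 100), using made-up integer vectors year1 and year2:

```r
# Made-up two-digit years for illustration
year1 <- c(98L, 99L, 0L, 50L, 17L)
year2 <- c(3L, 4L, 5L, 17L, 17L)

# The ifelse() form from above
age_ifelse <- ifelse(year2 < year1, year2 - year1 + 100L, year2 - year1)

# Equivalent modulo form: negative differences wrap around by 100
age_mod <- (year2 - year1) %% 100L

all(age_ifelse == age_mod)
age_mod
```

Both forms treat equal years as age 0, so neither can represent an age of exactly 100.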

df <- data.frame(year1 = sample(18:99, 1000, replace = TRUE), 
                 year2 = sample(1:99, 1000, replace = TRUE))

> head(df)
  year1 year2
1    73    88
2    50    17
3    47    41
4    54    43
5    36    82
6    62    85

df$age <- ifelse(df$year2 < df$year1, df$year2 - df$year1 + 100, df$year2 -df$year1)

> head(df)
  year1 year2 age
1    73    88  15
2    50    17  67
3    47    41  94
4    54    43  89
5    36    82  46
6    62    85  23

Using your example data:

dat <- data.frame(year1=c("98","99","00","01","02"),
                  year2=c("03","04","05","06","07"))

dat$age <- ifelse(as.numeric(as.character(dat$year2)) < as.numeric(as.character(dat$year1)), 
                  as.numeric(as.character(dat$year2)) - as.numeric(as.character(dat$year1)) + 100, 
                  as.numeric(as.character(dat$year2)) - as.numeric(as.character(dat$year1)))

> dat
  year1 year2 age
1    98    03   5
2    99    04   5
3    00    05   5
4    01    06   5
5    02    07   5
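
The as.numeric(as.character(...)) wrapping matters because, in R versions before 4.0, data.frame() turns strings into factors by default, and as.numeric() on a factor returns its internal level codes rather than the printed values. A sketch that does the conversion once per column; to_int is a hypothetical helper name used only here:

```r
dat <- data.frame(year1 = c("98","99","00","01","02"),
                  year2 = c("03","04","05","06","07"),
                  stringsAsFactors = TRUE)

as.numeric(dat$year1)  # level codes, not the years

# Hypothetical helper: go through character to recover the printed value
to_int <- function(x) as.integer(as.character(x))

y1 <- to_int(dat$year1)
y2 <- to_int(dat$year2)
dat$age <- ifelse(y2 < y1, y2 - y1 + 100L, y2 - y1)
dat$age
```

Converting once up front keeps the ifelse() call readable and avoids repeating the same conversion six times.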

Answer 3 (score: 0)

One approach is to use as.Date together with a dplyr chain:

library(dplyr)

dat %>%
  mutate(year1 = as.Date(year1, format = "%y"), 
         year2 = as.Date(year2, format = "%y")) %>%
  mutate(age = year2 - year1)

Returns:

       year1      year2       age
1 1998-10-26 2003-10-26 1826 days
2 1999-10-26 2004-10-26 1827 days
3 2000-10-26 2005-10-26 1826 days
4 2001-10-26 2006-10-26 1826 days
5 2002-10-26 2007-10-26 1826 days

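P.S. It assumes a default day and month for both columns (the current date), but since these are the same for both columns, they do not affect the difference calculation.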
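
If the age is wanted in years rather than days, the year can be extracted from each Date before subtracting. A sketch in base R, using vectors d1 and d2 introduced for illustration; the same expression could presumably be placed inside mutate().

```r
d1 <- as.Date(c("98","99","00","01","02"), format = "%y")
d2 <- as.Date(c("03","04","05","06","07"), format = "%y")

d2 - d1  # a difftime in days

# Extract the four-digit year from each Date, then subtract
age <- as.integer(format(d2, "%Y")) - as.integer(format(d1, "%Y"))
age
```

Because the month and day are identical in both vectors, the year difference is exact and leap days play no role.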