How do I get the minimum, maximum, and length between dates for each year?

Time: 2019-06-19 13:33:39

Tags: scala date max rdd min

I have an RDD of type RDD[String]; a sample looks like this:

1990,1990-07-08
1994,1994-06-18
1994,1994-06-18
1994,1994-06-22
1994,1994-06-22
1994,1994-06-26
1994,1994-06-26
1954,1954-06-20
2002,2002-06-26
1954,1954-06-23
2002,2002-06-29
1954,1954-06-16
2002,2002-06-30
...

Result: (1982,52) (2006,64) (1962,32) (1966,32) (1986,52) (2002,64) (1994,52) (1974,38) (1990,52) (2010,64) (1978,38) (1954,26) (2014,64) (1958,35) (1998,64) (1970,32)

I group it nicely, but my problem is the v.size part: I do not know how to calculate that length.

Just to put it in perspective, here are expected results:

It is not a mistake that 2002 appears twice, but ignore that.

1 answer:

Answer 0 (score: 1)

Define the date format:

import java.time.format.DateTimeFormatter
val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")

And an ordering:

import java.time.LocalDate
implicit val localDateOrdering: Ordering[LocalDate] = Ordering.by(_.toEpochDay)
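As a minimal sketch of what this ordering enables: with the implicit in scope, min and max can be called directly on a collection of LocalDate values. The object name OrderingDemo is hypothetical, and the sample dates are the 1954 rows from the question.

```scala
import java.time.LocalDate

object OrderingDemo {
  // Same ordering as above: compare dates by their epoch-day number.
  implicit val localDateOrdering: Ordering[LocalDate] = Ordering.by(_.toEpochDay)

  // Sample dates taken from the 1954 rows in the question.
  val dates: Seq[LocalDate] = Seq(
    LocalDate.parse("1954-06-20"),
    LocalDate.parse("1954-06-23"),
    LocalDate.parse("1954-06-16")
  )

  // min and max resolve the implicit Ordering[LocalDate] defined above.
  val earliest: LocalDate = dates.min
  val latest: LocalDate = dates.max
}
```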

Create a function that takes "v" and returns MAX(date_of_matching_year) - MIN(date_of_matching_year) = length (in days):

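def f(v: Iterable[Array[String]]): Int = {
    // Parse the date field (index 1) of each row; an explicit lambda avoids placeholder-scoping problems with _(1)
    val parsedDates = v.map(arr => LocalDate.parse(arr(1), formatter))
    // All dates in a group share the same year, so the day-of-year difference is the length in days
    parsedDates.max.getDayOfYear - parsedDates.min.getDayOfYear
}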

Then replace v.size with f(v).
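Putting the pieces together, here is a minimal runnable sketch. The question's grouping code is not shown, so the split-and-group step below is an assumption, and a plain Seq stands in for the RDD[String] so the example runs without Spark. The object name YearSpans and the value names lines and lengths are hypothetical.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

object YearSpans {
  val formatter: DateTimeFormatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")
  implicit val localDateOrdering: Ordering[LocalDate] = Ordering.by(_.toEpochDay)

  // Length in days between the earliest and latest date in one year's group.
  def f(v: Iterable[Array[String]]): Int = {
    val parsedDates = v.map(arr => LocalDate.parse(arr(1), formatter))
    parsedDates.max.getDayOfYear - parsedDates.min.getDayOfYear
  }

  // Hypothetical stand-in for the RDD[String], using a few rows from the question.
  val lines: Seq[String] = Seq(
    "1990,1990-07-08",
    "1994,1994-06-18",
    "1994,1994-06-22",
    "1994,1994-06-26",
    "1954,1954-06-20",
    "1954,1954-06-23",
    "1954,1954-06-16"
  )

  // Assumed grouping step: split each line on "," and group by the year field,
  // then apply f(v) where the question used v.size.
  val lengths: Map[String, Int] =
    lines.map(_.split(",")).groupBy(_(0)).map { case (year, v) => (year, f(v)) }
}
```

On an actual RDD the same f could presumably be applied to each group in the same way. Note that the day-of-year difference is only meaningful because every group contains dates from a single year.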
