Handling leap years and replacing with a new date

Time: 2016-03-18 12:42:12

Tags: python pandas

I have a list of dates in a DataFrame (df) in Python, in a column named DATE:

0       1998-03-31
1       1998-06-30
2       1998-09-30
3       1998-12-31
4       1999-03-31
5       1999-06-30
6       1999-09-30
7       1999-12-31
8       2000-02-29
9       2000-06-30
10      2000-09-30
11      2000-12-31
12      2001-03-31
13      2001-06-30
14      2001-09-30
Name: DATE, dtype: datetime64[ns]

I want to change every leap-day date XXXX-02-29 to XXXX-02-28. What is the most efficient way to do this? Thanks.

4 answers:

Answer 0 (score: 1)

This can be done with apply and a lambda, using Timestamp.replace to set the day:

import pandas as pd

# Make DataFrame
df = pd.DataFrame(
    # month-end frequency; pandas 2.2+ prefers the alias '6ME'
    pd.date_range('1998-02-28', periods=12, freq='6M'),
    columns=['Date']
)
print('Original DataFrame:')
print(df)
print()

# Replace Feb 29 with Feb 28, leave every other date untouched
df['Date'] = df['Date'].apply(
    lambda x:
        x.replace(day=28) if x.month == 2 and x.day == 29
        else x
)

print('Processed DataFrame:')
print(df)
print()
Original DataFrame:
         Date
0  1998-02-28
1  1998-08-31
2  1999-02-28
3  1999-08-31
4  2000-02-29
5  2000-08-31
6  2001-02-28
7  2001-08-31
8  2002-02-28
9  2002-08-31
10 2003-02-28
11 2003-08-31

Processed DataFrame:
         Date
0  1998-02-28
1  1998-08-31
2  1999-02-28
3  1999-08-31
4  2000-02-28
5  2000-08-31
6  2001-02-28
7  2001-08-31
8  2002-02-28
9  2002-08-31
10 2003-02-28
11 2003-08-31

Answer 1 (score: 0)

You can check whether the year is a leap year, and then check whether the date is 02-29.

# Leap year: divisible by 4 and not by 100, or divisible by 400
if (year % 4 == 0 and year % 100 != 0) or year % 400 == 0:
    pass  # day/month check goes here
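
As a rough illustration of how the check above might be applied to a single date, here is a minimal sketch. The helper names is_leap_year and clamp_feb_29 are hypothetical and not part of the original answer.

```python
import datetime

def is_leap_year(year):
    """Gregorian rule: divisible by 4 and not by 100, or divisible by 400."""
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0

def clamp_feb_29(d):
    """Return d moved to Feb 28 if it falls on Feb 29, otherwise d unchanged."""
    if is_leap_year(d.year) and d.month == 2 and d.day == 29:
        return d.replace(day=28)
    return d

print(clamp_feb_29(datetime.date(2000, 2, 29)))  # 2000-02-28
print(clamp_feb_29(datetime.date(2000, 6, 30)))  # 2000-06-30
```

For valid dates the leap-year test is technically redundant, since Feb 29 cannot occur in any other year; it is kept here only to mirror the answer.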

Answer 2 (score: 0)

You can try this using where, mask, and map:

import pandas as pd

def is_leap_and_29Feb(s):
    # The outer parentheses let the expression span several lines
    return ((s.dt.year % 4 == 0) &
            ((s.dt.year % 100 != 0) | (s.dt.year % 400 == 0)) &
            (s.dt.month == 2) & (s.dt.day == 29))

mask = is_leap_and_29Feb(df.DATE)
print(mask)
0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8      True
9     False
10    False
11    False
12    False
13    False
14    False
Name: DATE, dtype: bool
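
The text above stops at the mask itself. One possible way to apply it, sketched here on the assumption that DATE is a datetime64 column, is Series.mask, which replaces values wherever the condition is True. The small sample frame and the name shifted are made up for illustration.

```python
import pandas as pd

# Small sample frame standing in for the question's df (an assumption)
df = pd.DataFrame(
    {'DATE': pd.to_datetime(['1999-12-31', '2000-02-29', '2000-06-30'])}
)

# Feb 29 only exists in leap years, so a month/day test is enough here
mask = (df['DATE'].dt.month == 2) & (df['DATE'].dt.day == 29)

# Where mask is True take the date shifted back one day,
# everywhere else keep the original value
shifted = df['DATE'] - pd.Timedelta(days=1)
df['DATE'] = df['DATE'].mask(mask, shifted)

print(df)
```

Series.where is the mirror image: it keeps values where the condition is True, so the equivalent call would be df['DATE'].where(~mask, shifted).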

Answer 3 (score: 0)

Because only leap years have a 29 February:

def _292(date): return date.month == 2 and date.day == 29
# .loc avoids chained assignment, which may silently write to a copy
df.loc[df['DATE'].apply(_292), 'DATE'] -= pd.Timedelta('1D')  # back one day, to Feb 28