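Find the number of days between 2 dates in Python, but as numbers only

Time: 2021-06-11 05:27:51

Tags: python pandas datetime

I am trying to find the difference between a series of dates and a single date. For example, the series runs from May 1 to June 1: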

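# Assumed imports; the question does not show them
import pandas as pd
from datetime import datetime as dt

date = pd.DataFrame()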

In [0]: date['test'] = pd.date_range("2021-05-01", "2021-06-01", freq = "D")

Out[0]: date
    test
0   2021-05-01 00:00:00
1   2021-05-02 00:00:00
2   2021-05-03 00:00:00
3   2021-05-04 00:00:00
4   2021-05-05 00:00:00
5   2021-05-06 00:00:00
6   2021-05-07 00:00:00
7   2021-05-08 00:00:00
8   2021-05-09 00:00:00
9   2021-05-10 00:00:00

In[1]
date['test'] = date['test'].dt.date

Out[1]:
    test
0   2021-05-01
1   2021-05-02
2   2021-05-03
3   2021-05-04
4   2021-05-05
5   2021-05-06
6   2021-05-07
7   2021-05-08
8   2021-05-09
9   2021-05-10

In[2]:date['base'] = dt.strptime("2021-05-01",'%Y-%m-%d')

Out[2]:
0   2021-05-01 00:00:00
1   2021-05-01 00:00:00
2   2021-05-01 00:00:00
3   2021-05-01 00:00:00
4   2021-05-01 00:00:00
5   2021-05-01 00:00:00
6   2021-05-01 00:00:00
7   2021-05-01 00:00:00
8   2021-05-01 00:00:00
9   2021-05-01 00:00:00

In[3]:date['base'] = date['base'].dt.date

Out[3]:
    base
0   2021-05-01
1   2021-05-01
2   2021-05-01
3   2021-05-01
4   2021-05-01
5   2021-05-01
6   2021-05-01
7   2021-05-01
8   2021-05-01
9   2021-05-01

In[4]:date['test']-date['base']

Out[4]: 
    diff
0   0 days 00:00:00.000000000
1   1 days 00:00:00.000000000
2   2 days 00:00:00.000000000
3   3 days 00:00:00.000000000
4   4 days 00:00:00.000000000
5   5 days 00:00:00.000000000
6   6 days 00:00:00.000000000
7   7 days 00:00:00.000000000
8   8 days 00:00:00.000000000
9   9 days 00:00:00.000000000
10  10 days 00:00:00.000000000

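This is all I can get. I don't want anything other than the numbers 1-10, because I need them for further numerical calculations, but I can't get rid of the rest. Also, how can I build a time series that shows only the date, without the trailing h:m:s? I don't want to apply .dt.date manually to every column, and it sometimes messes things up.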

3 Answers:

Answer 0: (score: 1)

You don't need to create the base column for this; just do the following:

>>> (date['test'] - pd.to_datetime("2021-05-01", format='%Y-%m-%d')).dt.days
0      0
1      1
2      2
3      3
4      4
...
27    27
28    28
29    29
30    30
31    31
Name: test, dtype: int64
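
If the `test` column has already been converted with `.dt.date`, as in the question, it holds plain Python `date` objects (dtype `object`) and the `.dt` accessor is no longer available on it. The following is a minimal sketch, assuming the question's column names, that converts the column back to `datetime64` first and then extracts integer days. The `days` column name is made up for illustration.

```python
import pandas as pd

date = pd.DataFrame({"test": pd.date_range("2021-05-01", "2021-06-01", freq="D")})
date["test"] = date["test"].dt.date  # object column of datetime.date, as in the question

# Convert back to datetime64 so that the subtraction yields a timedelta64 column
test = pd.to_datetime(date["test"])

# .dt.days extracts the whole-day component as plain integers
date["days"] = (test - pd.Timestamp("2021-05-01")).dt.days

print(date["days"].head())
```

Skipping `.dt.date` altogether avoids this round trip. A `datetime64` column whose times are all midnight is typically printed by pandas with the date part only, so the conversion may not be needed just for display.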

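Answer 1: (score: 0)

You can first convert the timestamps to epoch seconds (internally they are stored as integers anyway, by default nanoseconds since the Unix epoch, so epoch seconds are just a scaled version of that).

Using the approach from "pandas datetime to unix timestamp seconds":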

import pandas as pd
# start df with date column
df = pd.DataFrame({"date": pd.date_range("2021-05-01", "2021-06-01", freq = "D")})
# create a column of epoch seconds (integer seconds since 1970-01-01)
df["ts"] = (df["date"] - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")
>>> df
         date          ts
0  2021-05-01  1619827200
1  2021-05-02  1619913600
2  2021-05-03  1620000000
3  2021-05-04  1620086400
...
31 2021-06-01  1622505600

This lets you do integer math before converting back:

>>> df["days"] = (df["ts"] - df["ts"].min()) // (60*60*24)  # 1 day in seconds
>>> df
         date          ts  days
0  2021-05-01  1619827200     0
1  2021-05-02  1619913600     1
2  2021-05-03  1620000000     2
3  2021-05-04  1620086400     3
...
31 2021-06-01  1622505600    31
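
The "converting back" step is not shown above. Here is a small sketch, assuming the `ts` column built in this answer: `pd.to_datetime` with `unit="s"` turns epoch seconds back into timestamps, and floor-dividing a timedelta column by `pd.Timedelta("1D")` gives whole days without the intermediate seconds column. The column names `restored` and `days_direct` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("2021-05-01", "2021-06-01", freq="D")})
df["ts"] = (df["date"] - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")

# Epoch seconds back to timestamps
df["restored"] = pd.to_datetime(df["ts"], unit="s")

# Whole days since the first date, computed directly from the timedeltas
df["days_direct"] = (df["date"] - df["date"].min()) // pd.Timedelta("1D")

print(df.tail(3))
```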

Answer 2: (score: 0)

Alternatively, for a simple day-based series, you can use the index as the day offset (since that is how the DataFrame was generated)!

>>> import pandas as pd
>>> df = pd.DataFrame({"date": pd.date_range("2021-05-01", "2021-06-01", freq = "D")})
>>> df["days"] = df.index
>>> df
         date  days
0  2021-05-01     0
1  2021-05-02     1
2  2021-05-03     2
3  2021-05-04     3
...
31 2021-06-01    31
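
The index shortcut only holds while the index is the default 0, 1, 2, ... and the dates are consecutive days. The sketch below shows how the two can drift apart, using a weekday filter chosen purely for illustration; the names `weekdays`, `by_index` and `by_date` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("2021-05-01", "2021-06-01", freq="D")})

# Keep weekdays only (Monday=0 ... Friday=4) and renumber the rows
weekdays = df[df["date"].dt.dayofweek < 5].reset_index(drop=True)

# Row position versus the real offset from the base date
weekdays["by_index"] = weekdays.index
weekdays["by_date"] = (weekdays["date"] - pd.Timestamp("2021-05-01")).dt.days

print(weekdays.head())
```

Since 2021-05-01 is a Saturday, the first remaining row is 2021-05-03: its index is 0, but it lies 2 days after the base date. Computing the offset from the dates stays correct after filtering or sorting.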