Joining two DataFrames

Time: 2014-03-06 18:09:33

Tags: python join pandas time-series dataframe

I have two datasets that I read into two DataFrames:

               NAB.AX                                  CBA.AX
                       Close    Volume                                         Close    
    Date                                                  Date
 2013-10-02 06:52:32   36.51   4962900             2013-10-02 06:52:32.082622  21.95  

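As you can see, the date formats differ slightly. How can I join the two DataFrames on their date index? Basically, I want to match on

2013-10-02 06:52:32

and ignore the fractional seconds

.082622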

1 answer:

Answer 0 (score: 1)

You can reassign the second DataFrame's index to NumPy datetime64[s] values, which truncates each timestamp to whole seconds:

df2.index = df2.index.values.astype('datetime64[s]')

For example,

In [58]: df1 = pd.DataFrame({'Close':36.51}, index=pd.DatetimeIndex(['2013-10-02 06:52:32'])); df1
Out[58]: 
                     Close
2013-10-02 06:52:32  36.51

In [78]: df2 = pd.DataFrame({'Close':[21.95, 22.95, 23.95]}, index=pd.DatetimeIndex(['2013-10-02 06:52:32.082622', '2013-10-02 06:52:32.09', '2013-10-03 06:52:33.09'])); df2
Out[78]: 
                            Close
2013-10-02 06:52:32.082622  21.95
2013-10-02 06:52:32.090000  22.95
2013-10-03 06:52:33.090000  23.95

In [79]: df2.index = df2.index.values.astype('datetime64[s]'); df2
Out[79]: 
                     Close
2013-10-02 06:52:32  21.95
2013-10-02 06:52:32  22.95
2013-10-03 06:52:33  23.95    

In [80]: df1.join(df2, lsuffix='NAB', rsuffix='_CBA')
Out[80]: 
                     CloseNAB  Close_CBA
2013-10-02 06:52:32     36.51      21.95
2013-10-02 06:52:32     36.51      22.95

Or, if you want to keep all keys from both indexes, use an outer join:

In [81]: df1.join(df2, lsuffix='NAB', rsuffix='_CBA', how='outer')
Out[81]: 
                     CloseNAB  Close_CBA
2013-10-02 06:52:32     36.51      21.95
2013-10-02 06:52:32     36.51      22.95
2013-10-03 06:52:33       NaN      23.95
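
The session above can also be written as a standalone script. The following is only a sketch under a few assumptions: it uses `DatetimeIndex.floor("s")` instead of the `astype('datetime64[s]')` call shown in the answer (both drop the fractional seconds), the helper name `truncate_index_to_seconds` is hypothetical, and the suffixes `_NAB` and `_CBA` are chosen here for illustration.

```python
import pandas as pd

# Sample frames mirroring the example in the answer.
df1 = pd.DataFrame(
    {"Close": [36.51]},
    index=pd.DatetimeIndex(["2013-10-02 06:52:32"]),
)
df2 = pd.DataFrame(
    {"Close": [21.95, 22.95, 23.95]},
    index=pd.DatetimeIndex(
        [
            "2013-10-02 06:52:32.082622",
            "2013-10-02 06:52:32.090000",
            "2013-10-03 06:52:33.090000",
        ]
    ),
)


def truncate_index_to_seconds(df):
    """Hypothetical helper: return a copy whose DatetimeIndex is
    rounded down to whole seconds (fractional part discarded)."""
    out = df.copy()
    out.index = out.index.floor("s")
    return out


df2_sec = truncate_index_to_seconds(df2)

# DataFrame.join defaults to a left join on the index, so only
# timestamps present in df1 are kept.
left = df1.join(df2_sec, lsuffix="_NAB", rsuffix="_CBA")

# An outer join keeps the keys of both indexes and fills gaps with NaN.
outer = df1.join(df2_sec, lsuffix="_NAB", rsuffix="_CBA", how="outer")

print(left)
print(outer)
```

Because two rows of the second frame fall within the same second, the truncated index contains a duplicate key, and the join returns one row per match. If only one row per second is wanted, the duplicates would have to be collapsed before joining.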