pandas.to_datetime: which format to choose?

Time: 2016-08-07 20:53:51

Tags: python python-2.7 datetime pandas datetime-format

I have a .csv that looks like this:

"Date","Time","Open","High","Low","Close","Volume"
12/30/2002,0930,0.94,0.94,0.94,0.94,571466

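I would like to convert the "Time" column values with pandas.to_datetime, but I can't find the right format, because there is no separator between the hours and the minutes.

Can anyone help me?

3 answers:

Answer 0: (score: 1)

This should work, but I am not sure whether there is a better way: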

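from io import StringIO  # on Python 2: from StringIO import StringIO

import pandas as pd

fh = StringIO('''"Date","Time","Open","High","Low","Close","Volume"
12/30/2002,0930,0.94,0.94,0.94,0.94,571466''')

# Read "Time" as a string so the leading zero in 0930 is preserved
df = pd.read_csv(fh, dtype={'Time': object})
# Concatenate date and time, then let pandas parse the combined string
df['Timestamp'] = pd.to_datetime(df['Date'] + ' ' + df['Time'])

print(df)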

Output:

         Date  Time  Open  High   Low  Close  Volume           Timestamp
0  12/30/2002  0930  0.94  0.94  0.94   0.94  571466 2002-12-30 09:30:00
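
The call above relies on pandas inferring the layout of the combined string. As a minimal sketch, assuming the dates are always month/day/year as in the sample row, the format can also be spelled out explicitly. The variable name csv_text is only illustrative:

```python
from io import StringIO

import pandas as pd

# Same sample row as in the question
csv_text = '''"Date","Time","Open","High","Low","Close","Volume"
12/30/2002,0930,0.94,0.94,0.94,0.94,571466'''

# Read both columns as strings so nothing is converted before parsing
df = pd.read_csv(StringIO(csv_text), dtype={'Date': str, 'Time': str})

# Assumption: month/day/year dates and a four-digit HHMM time
df['Timestamp'] = pd.to_datetime(df['Date'] + ' ' + df['Time'],
                                 format='%m/%d/%Y %H%M')

print(df['Timestamp'].iloc[0])
```

With an explicit format, a row that does not match raises an error instead of being guessed at, which may or may not be what you want.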

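Answer 1: (score: 1)

You can tell pandas that there is no separator by specifying the format of the time. %H%M tells Python that you have a time with no separator. If, for example, you had a : separator, you would use format='%H:%M' instead.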
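
To make the difference between the two format strings concrete, here is a small sketch. The sample values and variable names are made up for illustration and are not taken from the question's file:

```python
from datetime import time

import pandas as pd

# Made-up sample values, not taken from the question's file
no_separator = pd.Series(['0930', '1545'])
with_colon = pd.Series(['09:30', '15:45'])

# %H is the zero-padded hour and %M the zero-padded minute; any literal
# characters between the directives must match the input exactly
parsed_plain = pd.to_datetime(no_separator, format='%H%M').dt.time
parsed_colon = pd.to_datetime(with_colon, format='%H:%M').dt.time

expected = [time(9, 30), time(15, 45)]
print(list(parsed_plain) == expected)
print(list(parsed_colon) == expected)
```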

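Assuming everything is loaded and your dataframe is named df:

import pandas

# file loading and such

# Parse the date column as full datetimes
df['Date'] = pandas.to_datetime(df['Date'])
# Parse "Time" with an explicit no-separator format and keep only the time of day
df['Time'] = pandas.DatetimeIndex(pandas.to_datetime(df['Time'], format='%H%M')).time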

which gives you

        Date      Time  Open  High   Low  Close  Volume
0 2002-12-30  09:30:00  0.94  0.94  0.94   0.94  571466

For Python 3 users:

df['Time'] = pd.to_datetime(df['Time'], format='%H%M').dt.time

which gives you

         Date      Time  Open  High   Low  Close  Volume
0  12/30/2002  09:30:00  0.94  0.94  0.94   0.94  571466 
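
One thing to watch for: if read_csv is left to infer types, a value such as 0930 is read as the integer 930 and the leading zero is lost. A defensive sketch, assuming the values are meant to be four-digit HHMM times, pads them back before parsing. The series raw below is hypothetical:

```python
import pandas as pd

# Hypothetical column as read_csv would produce it without dtype=str
raw = pd.Series([930, 1545, 5])

# Restore the leading zeros so every value is exactly four characters
padded = raw.astype(str).str.zfill(4)

# Parse the padded strings with the no-separator format
times = pd.to_datetime(padded, format='%H%M').dt.time

print(padded.tolist())
print(times.tolist())
```

Reading the column with dtype=str in the first place avoids the padding step altogether.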

Answer 2: (score: 1)

You can pass the list of columns to parse as a full datetime by passing a list of lists to the parse_dates param:

In [6]:
import io
import pandas as pd
t='''"Date","Time","Open","High","Low","Close","Volume"
12/30/2002,0930,0.94,0.94,0.94,0.94,571466'''
df = pd.read_csv(io.StringIO(t), parse_dates=[['Date','Time']], keep_date_col=True)
df

Out[6]:
            Date_Time        Date  Time  Open  High   Low  Close  Volume
0 2002-12-30 09:30:00  12/30/2002  0930  0.94  0.94  0.94   0.94  571466

You can see that the dtypes are as expected:

In [7]:    
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 8 columns):
Date_Time    1 non-null datetime64[ns]
Date         1 non-null object
Time         1 non-null object
Open         1 non-null float64
High         1 non-null float64
Low          1 non-null float64
Close        1 non-null float64
Volume       1 non-null int64
dtypes: datetime64[ns](1), float64(4), int64(1), object(2)
memory usage: 144.0+ bytes
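
As far as I know, recent pandas releases (2.2 and later) deprecate combining columns through a nested parse_dates list, along with keep_date_col. The sketch below is one possible equivalent, not the method used in this answer: it builds the same Date_Time column after reading and checks the dtype programmatically rather than by reading df.info(). The explicit format string is an assumption based on the sample row:

```python
import io

import pandas as pd

t = '''"Date","Time","Open","High","Low","Close","Volume"
12/30/2002,0930,0.94,0.94,0.94,0.94,571466'''

# Keep the original columns as strings, as keep_date_col=True did
df = pd.read_csv(io.StringIO(t), dtype={'Date': str, 'Time': str})

# Assumption: month/day/year dates and a four-digit HHMM time
combined = pd.to_datetime(df['Date'] + ' ' + df['Time'],
                          format='%m/%d/%Y %H%M')

# Put the combined column first to mirror the layout shown above
df.insert(0, 'Date_Time', combined)

print(df.dtypes)
print(pd.api.types.is_datetime64_any_dtype(df['Date_Time']))
```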