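I have a dictionary containing two numpy arrays: one of datetime.datetime objects and one masked array of data. I'm trying to turn it into a pandas DataFrame with the datetime array as a DatetimeIndex, but I'm failing.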
In [62]: dict1
Out[62]:
{'filltimes': array([datetime.datetime(2013, 8, 12, 12, 0, 1),
datetime.datetime(2013, 8, 12, 12, 30, 1),
datetime.datetime(2013, 8, 12, 13, 0, 1), ...,
datetime.datetime(2013, 9, 14, 19, 0, 1),
datetime.datetime(2013, 9, 14, 19, 30, 1),
datetime.datetime(2013, 9, 14, 20, 0, 1)], dtype=object),
'fillvals': masked_array(data = [5.553 2.604 2.604 ..., 16.896 17.271 18.022],
mask = [False False False ..., False False False],
fill_value = 1e+20)
}
In [63]: type(dict1)
Out[63]: dict
In [64]: type(dict1['filltimes'])
Out[64]: numpy.ndarray
In [65]: type(dict1['filltimes'][0])
Out[65]: datetime.datetime
In [66]: pd1=pd.DataFrame.from_dict(dict1)
In [67]: type(pd1)
Out[67]: pandas.core.frame.DataFrame
In [68]: type(pd1['filltimes'])
Out[68]: pandas.core.series.Series
In [69]: type(pd1['filltimes'][0])
Out[69]: pandas.tslib.Timestamp
In [70]: pd1.resample('D', how = 'mean')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-70-8ddb5f2158aa> in <module>()
----> 1 pd1.resample('D', how = 'mean')
/Users/andrew/anaconda/lib/python2.7/site-packages/pandas/core/generic.pyc in resample(self, rule, how, axis, fill_method, closed, label, convention, kind, loffset, limit, base)
2777 fill_method=fill_method, convention=convention,
2778 limit=limit, base=base)
-> 2779 return sampler.resample(self).__finalize__(self)
2780
2781 def first(self, offset):
/Users/andrew/anaconda/lib/python2.7/site-packages/pandas/tseries/resample.pyc in resample(self, obj)
99 return obj
100 else: # pragma: no cover
--> 101 raise TypeError('Only valid with DatetimeIndex or PeriodIndex')
102
103 rs_axis = rs._get_axis(self.axis)
TypeError: Only valid with DatetimeIndex or PeriodIndex
In [71]: pd1.reindex(pd.DatetimeIndex(pd.to_datetime(pd1['filltimes'])))
Out[71]:
filltimes fillvals
2013-08-12 12:00:01 NaT NaN
2013-08-12 12:30:01 NaT NaN
2013-08-12 13:00:01 NaT NaN
2013-08-12 13:30:01 NaT NaN
2013-08-12 14:00:01 NaT NaN
2013-08-12 14:30:01 NaT NaN
2013-08-12 15:00:01 NaT NaN
2013-08-12 15:30:01 NaT NaN
2013-08-12 16:00:01 NaT NaN
2013-08-12 16:30:01 NaT NaN
2013-08-12 17:00:01 NaT NaN
2013-08-12 17:30:01 NaT NaN
2013-08-12 18:00:01 NaT NaN
2013-08-12 18:30:01 NaT NaN
2013-08-12 19:00:01 NaT NaN
2013-08-12 19:30:01 NaT NaN
2013-08-12 20:00:01 NaT NaN
2013-08-12 20:30:01 NaT NaN
2013-08-12 21:00:01 NaT NaN
2013-08-12 21:30:01 NaT NaN
2013-08-12 22:00:01 NaT NaN
2013-08-12 22:30:01 NaT NaN
2013-08-12 23:00:01 NaT NaN
2013-08-12 23:30:01 NaT NaN
2013-08-13 00:00:01 NaT NaN
2013-08-13 00:30:01 NaT NaN
2013-08-13 01:00:01 NaT NaN
2013-08-13 01:30:01 NaT NaN
2013-08-13 02:00:01 NaT NaN
2013-08-13 02:30:01 NaT NaN
2013-08-13 03:00:01 NaT NaN
2013-08-13 03:30:01 NaT NaN
2013-08-13 04:00:01 NaT NaN
2013-08-13 04:30:01 NaT NaN
2013-08-13 05:00:01 NaT NaN
2013-08-13 05:30:01 NaT NaN
2013-08-13 06:00:01 NaT NaN
2013-08-13 06:30:01 NaT NaN
2013-08-13 07:00:01 NaT NaN
2013-08-13 07:30:01 NaT NaN
2013-08-13 08:00:01 NaT NaN
2013-08-13 08:30:01 NaT NaN
2013-08-13 09:00:01 NaT NaN
2013-08-13 09:30:01 NaT NaN
2013-08-13 10:00:01 NaT NaN
2013-08-13 10:30:01 NaT NaN
2013-08-13 11:00:01 NaT NaN
2013-08-13 11:30:01 NaT NaN
2013-08-13 12:00:01 NaT NaN
2013-08-13 12:30:01 NaT NaN
2013-08-13 13:00:01 NaT NaN
2013-08-13 13:30:01 NaT NaN
2013-08-13 14:00:01 NaT NaN
2013-08-13 14:30:01 NaT NaN
2013-08-13 15:00:01 NaT NaN
2013-08-13 15:30:01 NaT NaN
2013-08-13 16:00:01 NaT NaN
2013-08-13 16:30:01 NaT NaN
2013-08-13 17:00:01 NaT NaN
2013-08-13 17:30:01 NaT NaN
... ...
[1601 rows x 2 columns]
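As you can see, the reindexing doesn't produce what I expect, or at least what I'd hoped for. Not only is all the data lost, but the reindexing isn't retained either: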
In [72]: pd1
Out[72]:
filltimes fillvals
0 2013-08-12 12:00:01 5.553
1 2013-08-12 12:30:01 2.604
2 2013-08-12 13:00:01 2.604
3 2013-08-12 13:30:01 2.604
4 2013-08-12 14:00:01 2.101
5 2013-08-12 14:30:01 2.666
6 2013-08-12 15:00:01 3.420
7 2013-08-12 15:30:01 2.666
8 2013-08-12 16:00:01 2.478
9 2013-08-12 16:30:01 2.227
10 2013-08-12 17:00:01 2.729
11 2013-08-12 17:30:01 1.662
12 2013-08-12 18:00:01 2.792
13 2013-08-12 18:30:01 1.599
14 2013-08-12 19:00:01 1.411
15 2013-08-12 19:30:01 1.976
16 2013-08-12 20:00:01 1.536
17 2013-08-12 20:30:01 1.411
18 2013-08-12 21:00:01 1.160
19 2013-08-12 21:30:01 0.720
20 2013-08-12 22:00:01 0.720
21 2013-08-12 22:30:01 1.034
22 2013-08-12 23:00:01 0.783
23 2013-08-12 23:30:01 0.783
24 2013-08-13 00:00:01 0.846
25 2013-08-13 00:30:01 0.720
26 2013-08-13 01:00:01 0.783
27 2013-08-13 01:30:01 0.783
28 2013-08-13 02:00:01 0.595
29 2013-08-13 02:30:01 0.720
30 2013-08-13 03:00:01 1.034
31 2013-08-13 03:30:01 0.720
32 2013-08-13 04:00:01 1.160
33 2013-08-13 04:30:01 1.034
34 2013-08-13 05:00:01 1.599
35 2013-08-13 05:30:01 1.662
36 2013-08-13 06:00:01 1.599
37 2013-08-13 06:30:01 2.227
38 2013-08-13 07:00:01 1.474
39 2013-08-13 07:30:01 4.173
40 2013-08-13 08:00:01 2.855
41 2013-08-13 08:30:01 3.231
42 2013-08-13 09:00:01 3.420
43 2013-08-13 09:30:01 3.420
44 2013-08-13 10:00:01 3.043
45 2013-08-13 10:30:01 3.733
46 2013-08-13 11:00:01 4.675
47 2013-08-13 11:30:01 5.114
48 2013-08-13 12:00:01 5.490
49 2013-08-13 12:30:01 4.612
50 2013-08-13 13:00:01 4.235
51 2013-08-13 13:30:01 3.796
52 2013-08-13 14:00:01 3.545
53 2013-08-13 14:30:01 4.110
54 2013-08-13 15:00:01 3.671
55 2013-08-13 15:30:01 3.169
56 2013-08-13 16:00:01 3.231
57 2013-08-13 16:30:01 3.420
58 2013-08-13 17:00:01 2.792
59 2013-08-13 17:30:01 2.792
... ...
[1601 rows x 2 columns]
I'd really appreciate a pointer or two as to where I'm going wrong here. This whole date/time business in pandas is driving me crazy.
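For reference, here is a minimal sketch of the result I'm after, using a tiny made-up stand-in for dict1 (the four timestamps and values are assumptions, not my real data). As far as I understand it, reindex matches the new labels against the existing integer index and returns a new frame instead of modifying pd1, which would be consistent with the NaT/NaN rows above. set_index, by contrast, moves an existing column into the index. Note that recent pandas versions spell the daily mean as resample('D').mean() rather than how='mean'.

```python
import datetime

import numpy as np
import numpy.ma as ma
import pandas as pd

# Hypothetical stand-in for dict1: four half-hourly samples over two days.
dict1 = {
    'filltimes': np.array([
        datetime.datetime(2013, 8, 12, 12, 0, 1),
        datetime.datetime(2013, 8, 12, 12, 30, 1),
        datetime.datetime(2013, 8, 13, 12, 0, 1),
        datetime.datetime(2013, 8, 13, 12, 30, 1),
    ], dtype=object),
    'fillvals': ma.masked_array([1.0, 3.0, 5.0, 7.0],
                                mask=[False, False, False, False]),
}

# Build the frame column by column: convert the timestamps explicitly and
# turn any masked entries into NaN so that mean() skips them.
pd1 = pd.DataFrame({
    'filltimes': pd.to_datetime(dict1['filltimes']),
    'fillvals': dict1['fillvals'].filled(np.nan),
})

# set_index returns a NEW frame whose index is the filltimes column;
# the result has to be assigned, pd1 itself is left unchanged.
indexed = pd1.set_index('filltimes')

# With a DatetimeIndex in place, resampling to daily means works.
daily = indexed.resample('D').mean()
print(daily)
```

Since neither reindex nor set_index modifies the frame in place by default, the result has to be assigned to a name, which may also explain why pd1 looked unchanged at In [72].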