Different results while filtering pandas DataFrame by its datetime index

Time: 2017-08-30 20:28:56

Tags: python pandas datetime dataframe python-2.5

I'm trying to filter a pandas DataFrame and I'm getting different results with a test case than with the real data. With the real data I'm getting NaN values, while with the test case I'm getting what I expect.

Test case:

The test case I created has the following code:

import pandas as pd
df1 = pd.DataFrame([
["2014-08-06 12:10:00", 19.85,  299.96, 17.5,   228.5,  19.63,  571.43],
["2014-08-06 12:20:00", 19.85,  311.55, 17.85,  248.68, 19.78,  547.21],
["2014-08-06 12:30:00", 20.06,  355.27, 18.35,  224.82, 19.99,  410.68],
["2014-08-06 12:40:00", 20.14,  405.95, 18.49,  247.33, 20.5,   552.79],
["2014-08-06 12:50:00", 20.14,  352.87, 18.7,   449.33, 20.86,  616.44],
["2014-08-06 13:00:00", 20.28,  356.96, 18.92,  307.57, 21.15,  471.18]],
columns=["date_time","t1", "1", "t4", "4", "t6", "6"])
df1 = df1.set_index(["date_time"])
# Convert the string index to a DatetimeIndex
df1.index = pd.to_datetime(df1.index)

# Timestamps to look up, kept as strings in a one-column DataFrame
filter1 = pd.DataFrame(["2014-08-06 12:20:00","2014-08-06 13:00:00"])
df1_filtered = df1.ix[filter1[filter1.columns[0]][0:2]]

As you may expect, the result is:

>>> df1_filtered
                        t1       1     t4       4     t6       6
2014-08-06 12:20:00  19.85  311.55  17.85  248.68  19.78  547.21
2014-08-06 13:00:00  20.28  356.96  18.92  307.57  21.15  471.18

Using real data:

Real data comes from a txt file and looks like this:

Fecha_hora  t1  1   t4  4   t6  6
2014-08-06 12:10:00 19.85   299.96  17.5    228.5   19.63   571.43
2014-08-06 12:20:00 19.85   311.55  17.85   248.68  19.78   547.21
2014-08-06 12:30:00 20.06   355.27  18.35   224.82  19.99   410.68
2014-08-06 12:40:00 20.14   405.95  18.49   247.33  20.5    552.79
2014-08-06 12:50:00 20.14   352.87  18.7    449.33  20.86   616.44
2014-08-06 13:00:00 20.28   356.96  18.92   307.57  21.15   471.18

However, when I read the real data and apply the same filter as before, like this:

df2 = pd.read_csv(r"D:/tmp/data.txt", sep='\t', parse_dates=True, index_col=0)
df2_filtered = df2.ix[filter1[filter1.columns[0]][0:2]]

I get the following results, with every value as NaN:

>>> df2_filtered
                     t1   1  t4   4  t6   6
2014-08-06 12:20:00 NaN NaN NaN NaN NaN NaN
2014-08-06 13:00:00 NaN NaN NaN NaN NaN NaN

But I can still get the values from a certain row like this:

>>> df2.ix["2014-08-06 12:20:00"]
t1     19.85
1     311.55
t4     17.85
4     248.68
t6     19.78
6     547.21
Name: 2014-08-06 12:20:00

Question:

How can I filter my real data in order to get the same results as in my test case? Is there a better way to achieve what I'm looking for?

Note: My pandas version is 0.9.0, used under Python 2.5, which means I have no loc function.

Note 2: I even tried this under pythonanywhere.com and got the same mismatch between the two cases. However, if I check for [...] I get [...] for every single value.

1 answer:

Answer 0 (score: 1)

Hopefully goes without saying, but if at all possible, upgrade your python/pandas!

In this case, on a recent version I get missing values in both cases. I need to convert the lookup keys to datetimes, and I'm guessing that will work for you too.
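A minimal sketch of converting the lookup keys first, assuming a pandas version that has `.loc` (so not 0.9.0) and using a cut-down, single-column copy of the question's test data. The names `df`, `keys`, `filtered` and `masked` are only illustrative:

```python
import pandas as pd

# Cut-down version of the question's data, with a real DatetimeIndex
df = pd.DataFrame(
    {"t1": [19.85, 19.85, 20.06, 20.14, 20.14, 20.28]},
    index=pd.to_datetime([
        "2014-08-06 12:10:00", "2014-08-06 12:20:00", "2014-08-06 12:30:00",
        "2014-08-06 12:40:00", "2014-08-06 12:50:00", "2014-08-06 13:00:00",
    ]),
)

# Lookup keys arrive as plain strings, as in filter1 from the question
filter1 = pd.DataFrame(["2014-08-06 12:20:00", "2014-08-06 13:00:00"])

# Convert the string keys to datetimes before using them as labels
keys = pd.to_datetime(filter1[0].tolist())
filtered = df.loc[keys]

# Alternative: a boolean mask, which simply skips keys missing from the index
masked = df[df.index.isin(keys)]

print(filtered)
```

Both variants select the 12:20 and 13:00 rows here. The mask form may be the safer choice if some keys might not be present in the index.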

The convenience string based date indexing only works with scalars / slices.
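For comparison, here is a small sketch of the string-based forms that do get parsed against a DatetimeIndex: a single string key and a slice of strings. It again assumes a pandas version with `.loc` and reuses an illustrative one-column frame rather than the asker's file:

```python
import pandas as pd

df = pd.DataFrame(
    {"t1": [19.85, 19.85, 20.06, 20.14, 20.14, 20.28]},
    index=pd.to_datetime([
        "2014-08-06 12:10:00", "2014-08-06 12:20:00", "2014-08-06 12:30:00",
        "2014-08-06 12:40:00", "2014-08-06 12:50:00", "2014-08-06 13:00:00",
    ]),
)

# Slice with string endpoints: both ends are included
window = df.loc["2014-08-06 12:20:00":"2014-08-06 12:40:00"]

# Single string key at day resolution: selects every row on that day
same_day = df.loc["2014-08-06"]

print(window)
print(len(same_day))
```

A list or Series of strings is a different kind of key, so it is safer not to rely on it being parsed the same way.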
