pandas: MultiIndex slicing - mixing slices and lists

Time: 2015-10-09 20:29:03

Tags: python pandas slice multi-index

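I'm trying to use the (not really) new slicing operators in pandas, but there is something I'm not getting. Suppose I generate the following hierarchical data frame: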

import numpy as np
import pandas as pd

#Generate container to hold component DFs
df_list=[]

#Generate names for third dimension positions
third_names=['front','middle','back']

#For three positions in the third dimension...
for lab in third_names:
    #...generate the corresponding section of raw data...
    d=pd.DataFrame(np.random.uniform(size=20).reshape(4,5),columns='a b c d e'.split(' '))
    #...name the columns dimension...
    d.columns.name='dim1'
    #...generate second and third dims (to go in index)...
    d['dim2']=['one','two','three','four']
    d['dim3']=lab
    #...set index...
    d.set_index(['dim3','dim2'],inplace=True)
    #...and throw the DF in the container
    df_list.append(d)

#Concatenate component DFs together
d3=pd.concat(df_list)

#Stack the columns into the index and sort it (sort_index replaces the deprecated sortlevel)
d3_long=d3.stack().sort_index(level=0)

print(d3_long)

Yields:

dim3    dim2   dim1
back    four   a       0.501184
               b       0.627202
               c       0.329643
               d       0.484261
               e       0.884803
        one    a       0.834231
               b       0.918897
               c       0.196537
               d       0.242109
               e       0.860124
        three  a       0.782651
               b       0.998361
               c       0.849685
               d       0.210377
               e       0.866776
        two    a       0.908422
               b       0.737073
               c       0.064402
               d       0.240718
               e       0.044409
front   four   a       0.100877
               b       0.963870
               c       0.254075
               d       0.126556
               e       0.033631
        one    a       0.243552
               b       0.999168
               c       0.752251
               d       0.684718
               e       0.353013
        three  a       0.938928
               b       0.112993
               c       0.615178
               d       0.430318
               e       0.330437
        two    a       0.301921
               b       0.645425
               c       0.464172
               d       0.824765
               e       0.606823
middle  four   a       0.814888
               b       0.228860
               c       0.333184
               d       0.622176
               e       0.151248
        one    a       0.547780
               b       0.592404
               c       0.684111
               d       0.885605
               e       0.601560
        three  a       0.340951
               b       0.839149
               c       0.800098
               d       0.663753
               e       0.215224
        two    a       0.138430
               b       0.917627
               c       0.342968
               d       0.406744
               e       0.822957
dtype: float64

I can slice on the first two dimensions and get the behavior I expect...

print(d3_long.loc[(slice('front','middle'),slice('two','four')),:])

Yields:

dim3    dim2   dim1
front   four   a       0.100877
               b       0.963870
               c       0.254075
               d       0.126556
               e       0.033631
        one    a       0.243552
               b       0.999168
               c       0.752251
               d       0.684718
               e       0.353013
        three  a       0.938928
               b       0.112993
               c       0.615178
               d       0.430318
               e       0.330437
        two    a       0.301921
               b       0.645425
               c       0.464172
               d       0.824765
               e       0.606823
middle  four   a       0.814888
               b       0.228860
               c       0.333184
               d       0.622176
               e       0.151248
        one    a       0.547780
               b       0.592404
               c       0.684111
               d       0.885605
               e       0.601560
        three  a       0.340951
               b       0.839149
               c       0.800098
               d       0.663753
               e       0.215224
        two    a       0.138430
               b       0.917627
               c       0.342968
               d       0.406744
               e       0.822957
dtype: float64

However, the following call produces exactly the same result.

d3_long.loc[(slice('front','middle'),slice('two','four'),slice('b','d')),:]

It ignores the third level of the MultiIndex. And when I try to use a list construct to pick specific positions...

d3_long.loc[(slice('front','middle'),slice('two','four'),['b','d']),:]

it raises a TypeError. Any ideas?

1 Answer:

Answer 0: (score: 0)

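d3_long is actually a Series, so you don't need the trailing : in the slicer. Note that your second-level slice('two','four') doesn't select anything (it is the equivalent of [-1:1]).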
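The empty selection follows from label ordering: once the index is sorted, the dim2 labels sit in lexicographic order, so 'two' comes after 'four'. A minimal sketch on a toy single-level Series (the names `labels` and `s` are made up for illustration and are not part of the question):

```python
import pandas as pd

# Hypothetical toy data: one sorted index level that mirrors dim2.
labels = sorted(['one', 'two', 'three', 'four'])
s = pd.Series(range(4), index=pd.Index(labels, name='dim2'))

# Lexicographic order of the labels, as pandas sees them after sorting.
print(labels)

# Label slices run from start to stop in index order.
empty = s.loc['two':'four']   # start sorts after stop, so nothing is selected
full = s.loc['four':'two']    # start sorts before stop, so the whole range is selected

print(len(empty), len(full))
```

The same ordering applies to each level of a sorted MultiIndex.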

But if you reverse the order, it should give you what you expect.

In [82]: d3_long.loc[slice('front','middle'),slice('four','two'), ['b','d']]
Out[82]: 
dim3    dim2   dim1
front   four   b       0.301573
               d       0.478005
        one    b       0.306292
               d       0.281984
        three  b       0.108174
               d       0.776523
        two    b       0.028694
               d       0.527417
middle  four   b       0.285103
               d       0.647165
        one    b       0.807411
               d       0.309446
        three  b       0.277752
               d       0.939555
        two    b       0.470019
               d       0.447640
dtype: float64
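For readers who want to reproduce this end to end, here is a hedged sketch that rebuilds a Series shaped like d3_long and applies the same selection through pd.IndexSlice, which allows the colon slice syntax inside .loc. Building the index with MultiIndex.from_product is a shortcut chosen for this sketch rather than the construction used in the question, and the values are random, so they will not match the output above.

```python
import numpy as np
import pandas as pd

# Assumption: a compact stand-in for d3_long, built directly from the level labels.
index = pd.MultiIndex.from_product(
    [['front', 'middle', 'back'],
     ['one', 'two', 'three', 'four'],
     list('abcde')],
    names=['dim3', 'dim2', 'dim1'],
)
s = pd.Series(np.random.uniform(size=len(index)), index=index).sort_index()

idx = pd.IndexSlice

# A Series has a single axis, so the indexer has no trailing ", :".
# Slice bounds follow the sorted label order: 'four' comes before 'two'.
result = s.loc[idx['front':'middle', 'four':'two', ['b', 'd']]]

print(result)
```

The index is sorted before slicing because label slices on a MultiIndex rely on the levels being in sorted order.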