Creating a hierarchical pandas DataFrame from a list of sets

Time: 2014-09-11 13:02:05

Tags: python pandas

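I need some help getting a handle on the pandas Python package.

I have a list of images. Each image is described by a set of descriptor vectors, and these sets can have different sizes. I want to store this data using pandas.

Each image has a set of vectors stored as np.ndarray[shape=(N_i, 128), dtype=np.uint8], where N_i is the number of vectors in the i-th image. The labels range(0, N_i) are the feature indexes with respect to the i-th image.

Each image is indexed by an image id.

Therefore, each feature should be uniquely indexed by the image id plus the feature index within that image.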
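As a rough illustration of the indexing scheme just described, the (image id, feature index) pairs can be written out as a two-level `pd.MultiIndex`. This is only a sketch: the per-image counts in `num_vecs_list` and the level names `imageid` and `fx` are hypothetical stand-ins, not values from the real data.

```python
import pandas as pd

# Hypothetical stand-in data: three images with 4, 2 and 3 descriptors each.
imageid_list = [1, 2, 3]
num_vecs_list = [4, 2, 3]

# One (image id, feature index) pair per descriptor vector.
index_tuples = [
    (imageid, fx)
    for imageid, num_vecs in zip(imageid_list, num_vecs_list)
    for fx in range(num_vecs)
]

# Level names are illustrative only.
index = pd.MultiIndex.from_tuples(index_tuples, names=['imageid', 'fx'])

print(index.is_unique)
print(len(index))
```

Each descriptor gets exactly one entry, so the feature index may repeat across images while the pair stays unique.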

In Python, these objects look like:

    imageid_list = [1, 2, ...]
    vecs_list = [np.ndarray, np.ndarray, ....]

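How do I put this information into pandas? I tried putting vecs_list into a Series, but that just gave me a list of objects. I would like pandas to know more about the ndarrays (beyond the fact that they are objects).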
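The behavior can be reproduced without the real data. The sketch below assumes small random arrays as stand-ins for the actual descriptors (the sizes 4, 2 and 3 are made up) and shows that the resulting Series has `object` dtype, with each element still a plain ndarray.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in descriptors: random uint8 arrays of shape (N_i, 128).
rng = np.random.RandomState(0)
imageid_list = [1, 2, 3]
vecs_list = [
    rng.randint(0, 256, size=(n, 128), dtype=np.uint8)
    for n in (4, 2, 3)
]

# Each ndarray is stored as one opaque Python object per row.
vecs_series = pd.Series(vecs_list, index=imageid_list, name='vecs')

print(vecs_series.dtype)
print(type(vecs_series.loc[1]))
print(vecs_series.loc[1].shape)
```

Because pandas only sees one object per image, the per-feature index inside each array is not visible to it.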


Here is some example code that better illustrates my problem:

    print('\n+----------')
    print('Get a list of image ids (a stands for annotation)')
    aid_list = ibs.get_valid_aids()[0:3]
    print('aid_list = ')
    print(aid_list)
    print('L----------')
    #
    print('\n+----------')
    print('Get the set of vectors from each image/anotation')
    vecs_list = ibs.get_annot_desc(aid_list)
    print('vecs_list = ')
    print(vecs_list)
    print('L----------')
    #__________
    print('\n+----------')
    print('Try using just the list of ndarrays to create a hierarchy (doesnt work)')
    vecs_series = pd.Series(vecs_list, index=aid_list, name='vecs')
    print('vecs_series = ')
    print(vecs_series)
    print('L----------')
    #
    print('\n+----------')
    print('Try mapping each numpy array in the list to a dataframe')
    # list() keeps this a list on Python 3, where map() returns an iterator
    vecs_dflist = list(map(pd.DataFrame, vecs_list))
    print('vecs_dflist = ')
    print(vecs_dflist)
    print('L----------')
    #__________
    print('\n+----------')
    print('Try using just the list of dataframes to create a hierarchy (doesnt work)')
    vecs_dfseries = pd.Series(vecs_dflist, index=aid_list, name='vecs')
    print('vecs_dfseries = ')
    print(vecs_dfseries)
    print('L----------')

This produces the following output:

+----------
Get a list of image ids (a stands for annotation)
aid_list = 
[1, 2, 3]
L----------

+----------
Get the set of vectors from each image/anotation
vecs_list = 
[array([[ 19,   0,   0, ..., 106,   4,   0],
       [ 58,   0,   0, ...,  26,   4,   1],
       [ 10,  40,  55, ...,   9,  27,  54],
       ..., 
       [ 78,   0,   0, ...,   7,   5,   8],
       [ 40,  24,   2, ...,   3,   0,   5],
       [ 59,   7,   5, ...,  70,  33,  15]], dtype=uint8), array([[  0,   2,  13, ...,  29,  27,   4],
       [ 29,  21,   8, ...,  11,   5,   7],
       [  1,   1,   2, ...,   0,   4,   3],
       ..., 
       [ 10,  27,  39, ...,  35,  85,  23],
       [  1,  27, 115, ...,  88,   2,   1],
       [ 31,   1,   2, ...,  15,  10,   5]], dtype=uint8), array([[  0,   0,   0, ...,   0,   0,  14],
       [  0,   0,   1, ...,   3,  27, 127],
       [ 16,   8,  18, ...,  25, 123,  23],
       ..., 
       [  5,  52,   6, ...,  21,  87,  31],
       [ 27,  55,  30, ...,  12,  56,  13],
       [ 79,  29,   0, ...,  18,  21,  29]], dtype=uint8)]
L----------

+----------
Try using just the list of ndarrays to create a hierarchy (doesnt work)
vecs_series = 
1    [[19, 0, 0, 1, 3, 0, 0, 36, 2, 0, 0, 12, 117, ...
2    [[0, 2, 13, 1, 3, 37, 64, 0, 3, 18, 29, 1, 4, ...
3    [[0, 0, 0, 0, 7, 33, 30, 2, 12, 0, 0, 2, 7, 30...
Name: vecs, dtype: object
L----------

+----------
Try mapping each numpy array in the list to a dataframe
vecs_dflist = 
[      0    1    2    3    4   ...   123  124  125  126  127
0      19    0    0    1    3 ...     0   12  106    4    0
1      58    0    0    2    2 ...    11   38   26    4    1
2      10   40   55   37   10 ...     0    0    9   27   54
3      71    6    0    0    1 ...     1    3    0    0    0
4       0    0    3    2   82 ...     4    3    0    0   26
...   ...  ...  ...  ...  ... ...   ...  ...  ...  ...  ...
1070    8   14    6   11   14 ...    29   46   15    7   15
1071   23   17    3   19   48 ...     4    7    5   13   52
1072   78    0    0    0    0 ...    11   56    7    5    8
1073   40   24    2   12   42 ...    25   19    3    0    5
1074   59    7    5    2    0 ...     1   21   70   33   15

[1075 rows x 128 columns],       0    1    2    3    4   ...   123  124  125  126  127
0       0    2   13    1    3 ...     0    8   29   27    4
1      29   21    8    2   14 ...    26   23   11    5    7
2       1    1    2    2    3 ...   117   15    0    4    3
3      20   78   43   27   27 ...    20   35   13   25   16
4       8   10    4    0   20 ...   103   23    0    0    0
...   ...  ...  ...  ...  ... ...   ...  ...  ...  ...  ...
1203   40    6    0   22   57 ...    13   21   13   26   13
1204   62    9    6    8   13 ...     3    9   10    3    4
1205   10   27   39   13    3 ...     6   11   35   85   23
1206    1   27  115    7    2 ...    16   81   88    2    1
1207   31    1    2    1    5 ...     2    4   15   10    5

[1208 rows x 128 columns],      0    1    2    3    4   ...   123  124  125  126  127
0      0    0    0    0    7 ...   118    1    0    0   14
1      0    0    1    3    6 ...     0    0    3   27  127
2     16    8   18    1    0 ...     0    0   25  123   23
3      2    0    0    1    4 ...     0    0    0   29   94
4      0   12    8    2    0 ...    11    7    3    3    3
..   ...  ...  ...  ...  ... ...   ...  ...  ...  ...  ...
904    0    3    0    0   72 ...     0    0    0    2   14
905   41   11    1   24   46 ...     0    7   21   71   63
906    5   52    6    0   17 ...     0    0   21   87   31
907   27   55   30    5    5 ...     2    0   12   56   13
908   79   29    0    0  100 ...    11   81   18   21   29

[909 rows x 128 columns]]
L----------

+----------
Try using just the list of dataframes to create a hierarchy (doesnt work)
vecs_dfseries = 
1          0    1    2    3    4   ...   123  124  ...
2          0    1    2    3    4   ...   123  124  ...
3         0    1    2    3    4   ...   123  124  1...
Name: vecs, dtype: object
L----------
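For comparison with the attempts above, here is a sketch of one possible way to get a two-level index from a list of per-image DataFrames, using `pd.concat` with its `keys` and `names` arguments. It is not from the original post: the random arrays stand in for the real descriptors, and the level names `aid` and `fx` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in descriptors: random uint8 arrays of shape (N_i, 128).
rng = np.random.RandomState(0)
aid_list = [1, 2, 3]
vecs_list = [
    rng.randint(0, 256, size=(n, 128), dtype=np.uint8)
    for n in (4, 2, 3)
]

# One DataFrame per image; its default row index is the feature index.
vecs_dflist = [pd.DataFrame(vecs) for vecs in vecs_list]

# Stack the frames, using the image ids as the outer index level.
vecs_df = pd.concat(vecs_dflist, keys=aid_list, names=['aid', 'fx'])

print(vecs_df.shape)
# All descriptors of image 2, indexed by feature index.
print(vecs_df.loc[2].shape)
# A single descriptor: image 3, feature 1.
print(vecs_df.loc[(3, 1)].values[:5])
```

With this layout, selecting by the outer level returns the descriptors of a single image, and a full (image id, feature index) pair selects one vector.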

0 answers:

No answers