Python Pandas: combine multiple columns into a single column

Time: 2017-05-26 15:22:08

Tags: python-2.7 pandas

I have a Python Pandas DataFrame like this:

movie        unknown action adventure animation fantasy horror romance sci-fi

Toy Story    0       1      1          0        1       0      0       1              
Golden Eye   0       1      0          0        0       0      1       0      
Four Rooms   1       0      0          0        0       0      0       0    
Get Shorty   0       0      0          1        1       0      1       0
Copy Cat     0       0      1          0        0       1      0       0 

I want to merge the movie genres into a single column. The output should look like this:

movie       genre

Toy Story   action, adventure, fantasy, sci-fi
Golden Eye  action, romance
Four Rooms  unknown
Get Shorty  animation, fantasy, romance
Copy Cat    adventure, horror

1 Answer:

Answer 0: (score: 2)

You can do it like this:

In [171]: df['genre'] = df.iloc[:, 1:].apply(lambda x: df.iloc[:, 1:].columns[x.astype(bool)].tolist(), axis=1)

In [172]: df
Out[172]:
        movie  unknown  action  adventure  animation  fantasy  horror  romance  sci-fi                                 genre
0   Toy Story        0       1          1          0        1       0        0       1  [action, adventure, fantasy, sci-fi]
1  Golden Eye        0       1          0          0        0       0        1       0                     [action, romance]
2  Four Rooms        1       0          0          0        0       0        0       0                             [unknown]
3  Get Shorty        0       0          0          1        1       0        1       0         [animation, fantasy, romance]
4    Copy Cat        0       0          1          0        0       1        0       0                   [adventure, horror]
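
The snippet above stores each genre as a Python list, whereas the question asks for a comma-separated string. A minimal sketch of that variant follows, assuming the genre flags are 0/1 integers and that every column after movie is a genre; the name genre_cols is introduced here for illustration only.

```python
import pandas as pd

# Rebuild the sample frame from the question.
df = pd.DataFrame({
    "movie": ["Toy Story", "Golden Eye", "Four Rooms", "Get Shorty", "Copy Cat"],
    "unknown":   [0, 0, 1, 0, 0],
    "action":    [1, 1, 0, 0, 0],
    "adventure": [1, 0, 0, 0, 1],
    "animation": [0, 0, 0, 1, 0],
    "fantasy":   [1, 0, 0, 1, 0],
    "horror":    [0, 0, 0, 0, 1],
    "romance":   [0, 1, 0, 1, 0],
    "sci-fi":    [1, 0, 0, 0, 0],
})

# Assumption: every column except the first one is a genre flag.
genre_cols = df.columns[1:]

# For each row, keep the column names whose flag is set and join them.
df["genre"] = df[genre_cols].apply(
    lambda row: ", ".join(genre_cols[row.astype(bool).values]), axis=1
)

# Keep only the two columns the question asks for.
result = df[["movie", "genre"]]
print(result)
```

Capturing genre_cols before assigning the new column keeps the selection fixed, so re-running the assignment does not pick up genre itself as a flag column.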

PS: I don't see how this would help you, though. I don't see any advantage over a "one-hot encoded" matrix.
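
To illustrate that the two layouts carry the same information, here is a hedged sketch that turns a comma-separated genre column back into a one-hot matrix with Series.str.get_dummies. The frame named combined is a hypothetical sample built for this example, and it assumes the separator is exactly ", ".

```python
import pandas as pd

# Hypothetical sample in the combined layout the question asks for.
combined = pd.DataFrame({
    "movie": ["Toy Story", "Golden Eye", "Four Rooms"],
    "genre": [
        "action, adventure, fantasy, sci-fi",
        "action, romance",
        "unknown",
    ],
})

# Split each string on ", " and build one 0/1 indicator column per genre.
one_hot = combined["genre"].str.get_dummies(sep=", ")

# Put the movie titles back next to the indicator columns.
restored = pd.concat([combined[["movie"]], one_hot], axis=1)
print(restored)
```

Note that str.get_dummies returns the indicator columns in sorted order, so the column order may differ from the original frame, and genres that no movie has will not appear at all.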
