Convert a Python pandas DataFrame to a zero/one DataFrame

Time: 2017-05-22 12:57:49

Tags: python-2.7 pandas dataframe sklearn-pandas

Input

userID  col1    col2    col3    col4    col5    col6    col7    col8    col9            
1   Java    c   c++ php python  perl    html    hadoop  nodejs          
2   nodejs  c#  c++ oops    css html    angular java    php         
3   php python  html    java    angular hadoop  c   nodejs  c#          
4   python  php css perl    hadoop  c   nodejs  c#  html            
5   perl    css python  hadoop  c   nodejs  c#  java    php         
6   Java    python  css     perl    nodejs  c#  java    php hadoop          
7   javascript  java    perl    nodejs  angular php mysql   hadoop  html            
8   angular mysql   mongodb cs  hadoop  angular oops    html    perl            
9   nodejs  hadoop  mysql   mongodb angular oops    html    python  java

Desired output

userID  Java    C   C++ php python  perl    html    hadoop  nodejs  oops    mysql   mongo
1   1   1   1   1   1   1   1   1   1   0   0   0
2   1   0   1   1   0   0   1   0   1   0   0   0
3   1   1   0   1   1   1   1   1   1   0   0   0
4   0   0   0   0   1   1   1   0   1   1   1   1

1 answer:

Answer 0 (score: 0)

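Use get_dummies to build one indicator column per value, then groupby on the column names and aggregate with max, so that a value appearing in several source columns collapses into a single 0/1 column: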

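import pandas as pd
# One indicator column per value; the empty prefix keeps the bare value as the column name
df = pd.get_dummies(df.set_index('userID'), prefix='', prefix_sep='')
# Columns with the same name (same value, different source column) are merged with max
df = df.groupby(level=0, axis=1).max().reset_index()
print(df)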
   userID  Java  angular  c  c#  c++  cs  css  hadoop  html  java  javascript  \
0       1     1        0  1   0    1   0    0       1     1     0           0   
1       2     0        1  0   1    1   0    1       0     1     1           0   
2       3     0        1  1   1    0   0    0       1     1     1           0   
3       4     0        0  1   1    0   0    1       1     1     0           0   
4       5     0        0  1   1    0   0    1       1     0     1           0   
5       6     1        0  0   1    0   0    1       1     0     1           0   
6       7     0        1  0   0    0   0    0       1     1     1           1   
7       8     0        1  0   0    0   1    0       1     1     0           0   
8       9     0        1  0   0    0   0    0       1     1     1           0   

   mongodb  mysql  nodejs  oops  perl  php  python  
0        0      0       1     0     1    1       1  
1        0      0       1     1     0    1       0  
2        0      0       1     0     0    1       1  
3        0      0       1     0     1    1       1  
4        0      0       1     0     1    1       1  
5        0      0       1     0     1    1       1  
6        0      1       1     0     1    1       0  
7        1      1       0     1     1    0       0  
8        1      1       1     1     0    0       1
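
The answer's snippet targets the pandas of its time. In recent pandas releases, `groupby(..., axis=1)` is deprecated and `get_dummies` returns booleans by default, so the following is a minimal self-contained sketch of the same idea that should also run there. It assumes a small sample (the first three users and the first four columns of the input above); the transpose-based grouping and the `dtype=int` argument are substitutions for illustration, not part of the original answer.

```python
import pandas as pd

# Sample taken from the first three rows / four columns of the question's input
df = pd.DataFrame({
    'userID': [1, 2, 3],
    'col1': ['Java', 'nodejs', 'php'],
    'col2': ['c', 'c#', 'python'],
    'col3': ['c++', 'c++', 'html'],
    'col4': ['php', 'oops', 'java'],
})

# One 0/1 column per (source column, value) pair, named after the value only
dummies = pd.get_dummies(df.set_index('userID'), prefix='', prefix_sep='', dtype=int)

# 'php' now appears twice (from col1 and col4). Transpose so the duplicate
# names become index labels, group them, take the max, and transpose back.
result = dummies.T.groupby(level=0).max().T.reset_index()

print(result)
```

Note that the match is case-sensitive, so `Java` and `java` remain separate columns, as in the answer's output above. If the desired output is meant to treat them as one skill, lowercasing the values before calling `get_dummies` would merge them.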