Input
userID col1 col2 col3 col4 col5 col6 col7 col8 col9
1 Java c c++ php python perl html hadoop nodejs
2 nodejs c# c++ oops css html angular java php
3 php python html java angular hadoop c nodejs c#
4 python php css perl hadoop c nodejs c# html
5 perl css python hadoop c nodejs c# java php
6 Java python css perl nodejs c# java php hadoop
7 javascript java perl nodejs angular php mysql hadoop html
8 angular mysql mongodb cs hadoop angular oops html perl
9 nodejs hadoop mysql mongodb angular oops html python java
Desired output
userID Java C C++ php python perl html hadoop nodejs oops mysql mongo
1 1 1 1 1 1 1 1 1 1 0 0 0
2 1 0 1 1 0 0 1 0 1 0 0 0
3 1 1 0 1 1 1 1 1 1 0 0 0
4 0 0 0 0 1 1 1 0 1 1 1 1
Answer 0 (score: 0)
Use get_dummies, then groupby on the column names and aggregate with max:
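import pandas as pd
# One-hot encode every column; the empty prefix keeps the bare skill names as column labels
df = pd.get_dummies(df.set_index('userID'), prefix='', prefix_sep='')
# The same skill can appear under several source columns, so merge duplicate labels with max
df = df.groupby(level=0, axis=1).max().reset_index()
print(df)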
userID Java angular c c# c++ cs css hadoop html java javascript \
0 1 1 0 1 0 1 0 0 1 1 0 0
1 2 0 1 0 1 1 0 1 0 1 1 0
2 3 0 1 1 1 0 0 0 1 1 1 0
3 4 0 0 1 1 0 0 1 1 1 0 0
4 5 0 0 1 1 0 0 1 1 0 1 0
5 6 1 0 0 1 0 0 1 1 0 1 0
6 7 0 1 0 0 0 0 0 1 1 1 1
7 8 0 1 0 0 0 1 0 1 1 0 0
8 9 0 1 0 0 0 0 0 1 1 1 0
mongodb mysql nodejs oops perl php python
0 0 0 1 0 1 1 1
1 0 0 1 1 0 1 0
2 0 0 1 0 0 1 1
3 0 0 1 0 1 1 1
4 0 0 1 0 1 1 1
5 0 0 1 0 1 1 1
6 0 1 1 0 1 1 0
7 1 1 0 1 1 0 0
8 1 1 1 1 0 0 1
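Note that the output above keeps Java and java as separate columns, while the desired output has a single Java column. The following is a minimal, self-contained sketch of the same get_dummies + groupby + max idea, under a few assumptions that are not in the original answer: only the first three input rows are rebuilt by hand, skill names are lowercased so that Java and java collapse into one column, dtype=int is passed so the result prints 0/1 on pandas versions that default to booleans, and the frame is transposed before grouping because recent pandas releases deprecate the axis=1 argument of groupby. The names skills, dummies and result are illustrative only.

```python
import pandas as pd

# First three rows of the question's input, rebuilt by hand for illustration
df = pd.DataFrame(
    [
        [1, "Java", "c", "c++", "php", "python", "perl", "html", "hadoop", "nodejs"],
        [2, "nodejs", "c#", "c++", "oops", "css", "html", "angular", "java", "php"],
        [3, "php", "python", "html", "java", "angular", "hadoop", "c", "nodejs", "c#"],
    ],
    columns=["userID"] + [f"col{i}" for i in range(1, 10)],
)

skills = df.set_index("userID")

# Assumption: "Java" and "java" denote the same skill, so fold case first
skills = skills.apply(lambda col: col.str.lower())

# dtype=int asks for 0/1 integers instead of booleans
dummies = pd.get_dummies(skills, prefix="", prefix_sep="", dtype=int)

# Transpose, group the duplicate labels, take the max, transpose back;
# this mirrors groupby(level=0, axis=1).max() without the axis argument
result = dummies.T.groupby(level=0).max().T.reset_index()
print(result)
```

If the case folding is not wanted, dropping the apply line should give one column per distinct spelling, as in the answer's output.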