I have a data reshaping problem that I could use some help with.
ID X1 X2 X3 X4 X5
6001 Certificate Associate Bachelor's Master's Doctoral
5001 Certificate Associate Bachelor's
3311 Certificate Associate Bachelor's
1981 Certificate Associate Bachelor's Master's
4001 Associate Bachelor's Master's
2003 Associate Bachelor's Master's Doctoral
2017 Certificate Associate
1001 Associate Bachelor's Master's
5002 Bachelor's
I need to turn these into dummy variables, like this:
ID Certificate Associates Bachelor Master Doctoral
6001 1 1 1 1 1
5001 1 1 1 0 0
2017 1 1 0 0 0
Any suggestions?
Answer 0 (score: 2)
Try the reshape2 package. I'm assuming your dataset is named df:
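library(reshape2)
# First, melt your data to long format, using ID as the id variable
m.df = melt(df, id.vars="ID")
# Then `cast` it back to wide format, counting how often each value occurs per ID
dcast(m.df, ID ~ value, length)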
#     ID Var.2 Associate Bachelor's Certificate Doctoral Master's
# 1 1001     2         1          1           0        0        1
# 2 1981     1         1          1           1        0        1
# 3 2003     1         1          1           0        1        1
# 4 2017     3         1          0           1        0        0
# 5 3311     2         1          1           1        0        0
# 6 4001     2         1          1           0        0        1
# 7 5001     2         1          1           1        0        0
# 8 5002     4         0          1           0        0        0
# 9 6001     0         1          1           1        1        1
I haven't tested it, but if you set the factor levels in the order you want, that will probably control the order of the output columns.
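Here is a minimal sketch of that factor-ordering idea. The small `df` below is a hypothetical reconstruction of three rows from the question, and it assumes blank cells are stored as empty strings. If that is how your data looks, the `Var.2` column in the output above is most likely just a count of those blanks, so the sketch drops them before casting.

```r
library(reshape2)

# Hypothetical reconstruction of three rows of the question's data.
# Assumption: blank cells are empty strings.
df <- data.frame(
  ID = c(6001, 5001, 2017),
  X1 = c("Certificate", "Certificate", "Certificate"),
  X2 = c("Associate", "Associate", "Associate"),
  X3 = c("Bachelor's", "Bachelor's", ""),
  X4 = c("Master's", "", ""),
  X5 = c("Doctoral", "", ""),
  stringsAsFactors = FALSE
)

# Desired left-to-right order of the dummy columns
degree.levels <- c("Certificate", "Associate", "Bachelor's",
                   "Master's", "Doctoral")

# Long format: one row per (ID, column, value)
m.df <- melt(df, id.vars = "ID")

# Drop the blank cells so they do not become a column of their own
m.df <- m.df[m.df$value != "", ]

# Turn value into a factor whose level order is the column order we want
m.df$value <- factor(m.df$value, levels = degree.levels)

# Wide format: count occurrences of each degree per ID (0 or 1 here)
res <- dcast(m.df, ID ~ value, fun.aggregate = length, value.var = "value")
print(res)
```

Rows come back sorted by ID. If you need the exact headers from the question (Associates, Bachelor, and so on), you can rename the columns afterwards with `names(res) <- ...`.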