How do I create a new column containing percentages calculated from other columns?

Time: 2014-03-06 17:13:33

Tags: r

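Pardon the very novice question, but I am trying to create a new column in a data frame that contains percentages based on other columns. For example, the data I am working with looks like the following, where the That column is a binary factor (i.e. presence or absence of "that"), the Verb column is the individual verb (i.e. a verb that may or may not be followed by "that"), and the Freq column gives the frequency of each verb.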

     That    Verb Freq
1    That believe    3
2  NoThat   think    4
3    That     say    3
4    That believe    3
5    That   think    4
6  NoThat     say    3
7  NoThat believe    3
8  NoThat   think    4
9    That     say    3
10 NoThat   think    4

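What I want is to add another column that gives, for each distinct verb, the overall rate of "that" expression (coded as "That"). Something like the following: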

     That    Verb Freq Perc.That
1    That believe    3      33.3
2  NoThat   think    4      25.0
3    That     say    3      33.3
4    That believe    3      33.3
5    That   think    4      25.0
6  NoThat     say    3      33.3
7  NoThat believe    3      33.3
8  NoThat   think    4      25.0
9    That     say    3      33.3
10 NoThat   think    4      25.0

I may have missed a similar question elsewhere. If so, I apologize. Thanks in advance for your help.

1 answer:

Answer 0 (score: 1)

You want to use the ddply function from the plyr package:

#install.packages('plyr')
library(plyr)

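dat # your data frame (lowercase column names here: that, verb, freq)

# Split by verb, then add perc.that = each row's share of that verb's total freq
ddply(dat, .(verb), transform, perc.that = freq/sum(freq))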

#     that    verb freq perc.that
#1    That believe    3 0.3333333
#2    That believe    3 0.3333333
#3  NoThat believe    3 0.3333333
#4    That     say    3 0.3333333
#...
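
Note that freq/sum(freq) gives each row's share of its verb's total frequency, not the share of rows coded "That". If the second is what is wanted, a minimal base R sketch (no plyr needed) could look like the following. It assumes the column names from the question (That, Verb, Freq), rebuilds the sample rows by hand, and reuses the Perc.That name from the desired output; none of this comes from the original answer.

```r
# Rebuild the sample rows from the question (assumed column names: That, Verb, Freq).
dat <- data.frame(
  That = c("That", "NoThat", "That", "That", "That",
           "NoThat", "NoThat", "NoThat", "That", "NoThat"),
  Verb = c("believe", "think", "say", "believe", "think",
           "say", "believe", "think", "say", "think"),
  Freq = c(3, 4, 3, 3, 4, 3, 3, 4, 3, 4)
)

# 1 where the row is coded "That", 0 otherwise.
is_that <- as.numeric(dat$That == "That")

# ave() applies FUN within each Verb group and returns a vector aligned
# with the original rows; the group mean of a 0/1 vector is the proportion
# of "That" rows, scaled here to a percentage.
dat$Perc.That <- ave(is_that, dat$Verb, FUN = mean) * 100
```

On these sample rows, every "think" row gets 25, since one of its four rows is coded "That". Unlike ddply, ave() keeps the original row order.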