Creating dummy variables based on percentile rank

Time: 2019-05-08 23:00:51

Tags: r

I want to create dummy variables for a regression. The data looks roughly like this:

Year  Month Price  Volume  Return StockCode
1991  1       10     300     1.2  AAPL
1991  2       11     320     1.3  AAPL
1992  1       23     310     2.1  AMZN
1992  2       22     302     2.3  AMZN

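I want to rank the Price, Volume and Return variables by percentile and create a separate dummy variable for each of them, for each stock. The top 30% should be assigned 1, the middle 40% should be assigned 0, and the bottom 30% should be assigned -1. Ideally, the data frame should look like this: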

Year Month D_Price D_Volume D_Return StockCode
1991  1       -1     -1       -1      AAPL
1991  2       0       1        0      AAPL
1992  1       1       0        0      AMZN
1992  2       0       0        1      AMZN

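I have searched online resources and Stack Overflow, but I could not find an example that shows how to solve this. Any help is appreciated. Thanks!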

2 answers:

Answer 0: (score: 2)

You can use dplyr::percent_rank together with cut:

library(dplyr)

df %>%
  mutate_at(
    vars(Price, Volume, Return),
    # percent_rank() rescales ranks to [0, 1]; cut() bins them at 30% and 70%
    list(cut = function(x) cut(percent_rank(x), c(-Inf, .3, .7, Inf), labels = c(-1, 0, 1)))
  )

  Year Month Price Volume Return StockCode Price_cut Volume_cut Return_cut
1 1991     1    10    300    1.2      AAPL        -1         -1         -1
2 1991     2    11    320    1.3      AAPL         0          1          0
3 1992     1    23    310    2.1      AMZN         1          0          0
4 1992     2    22    302    2.3      AMZN         0          0          1
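
To make it clearer what the pipeline above computes, here is a minimal base R sketch of the same calculation for a single column. It assumes that, for data without missing values, percent_rank(x) is equivalent to (rank(x, ties.method = "min") - 1) / (n - 1); the names price, pr, d_factor and d_price are illustrative only. Note that cut() returns a factor, so the sketch converts the labels back to numbers, which may or may not be what you want in your regression.

```r
# Illustrative data: the Price column from the question
price <- c(10, 11, 23, 22)

# Percent rank rescaled to [0, 1]; assumes there are no NA values
pr <- (rank(price, ties.method = "min") - 1) / (length(price) - 1)

# Same breaks as in the answer:
# (-Inf, 0.3] -> -1, (0.3, 0.7] -> 0, (0.7, Inf] -> 1
d_factor <- cut(pr, breaks = c(-Inf, 0.3, 0.7, Inf), labels = c(-1, 0, 1))

# cut() yields a factor; go through character to recover the numeric labels
d_price <- as.numeric(as.character(d_factor))

print(d_price)  # -1 0 1 0 for this data
```

Converting with as.numeric() directly on the factor would return the level codes 1, 2, 3 instead of -1, 0, 1, which is why the sketch goes through as.character() first.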

Answer 1: (score: 2)

You can also use quantile from the stats package inside sapply, in base R.

Initialize the data.frame:

df <- data.frame(Year = c(1991, 1991, 1992, 1992),
                 Month = c(1, 2, 1, 2),
                 Price = c(10, 11, 23, 22),
                 Volume = c(300, 320, 310, 302),
                 Return = c(1.2, 1.3, 2.1, 2.3),
                 StockCode = c('AAPL', 'AAPL', 'AMZN', 'AMZN'))

Set up the dummy variables:

dummy <- data.frame(sapply(df[c('Price', 'Volume', 'Return')], function(x) {
  # 0.3 and 0.7 are your cut-off percentiles
  y <- quantile(x, probs = c(0.3, 0.7), type = 7)
  ifelse(x < y[1], -1, ifelse(x < y[2], 0, 1))
}))

Then bind dummy to your other columns of interest and rename the columns to get what you want.
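
The code for the final binding step is not shown in this answer, so the following is only a sketch of one way to do it, not necessarily the author's original. It assumes cbind for joining the columns and a D_ prefix taken from the desired output in the question; vars and result are hypothetical names.

```r
df <- data.frame(Year = c(1991, 1991, 1992, 1992),
                 Month = c(1, 2, 1, 2),
                 Price = c(10, 11, 23, 22),
                 Volume = c(300, 320, 310, 302),
                 Return = c(1.2, 1.3, 2.1, 2.3),
                 StockCode = c('AAPL', 'AAPL', 'AMZN', 'AMZN'))

# Columns to turn into dummies (hypothetical helper name)
vars <- c('Price', 'Volume', 'Return')

dummy <- data.frame(sapply(df[vars], function(x) {
  # 30th and 70th percentiles of the column
  y <- quantile(x, probs = c(0.3, 0.7), type = 7)
  ifelse(x < y[1], -1, ifelse(x < y[2], 0, 1))
}))

# Prefix the dummy columns with "D_" (naming taken from the question)
names(dummy) <- paste0('D_', vars)

# Identifier columns first, then the dummies, then the stock code
result <- cbind(df[c('Year', 'Month')], dummy, df['StockCode'])

print(result)
```

On the sample data this should reproduce the layout asked for in the question, with columns Year, Month, D_Price, D_Volume, D_Return and StockCode.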

Hope this helps!