Add a variable to a data frame in R and sort by that variable

Time: 2020-10-02 15:44:38

Tags: r dplyr

I am using the Cigarette data frame from the Ecdat library. First, I am trying to use the mutate function from dplyr to create a per-capita income variable in the data frame (i.e., income/pop). Then I want to rank the data by each state's per-capita personal income (i.e., income per head of state population), so that the row ranked 1 has the highest per-capita income.

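It seems I can create the variable with: mutate(Cigarette, income_population = income/pop). However, when I specify the new income_population column for ranking, the rank function does not seem to work.

Any suggestions?

2 answers: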

Answer 0: (score: 1)

Given the full Cigarette dataset (https://github.com/cran/Ecdat/blob/master/data/Cigarette.rda):

library(Ecdat)  # provides the Cigarette data frame
library(dplyr)
Cigarette %>%
  mutate(income_population = income / pop) %>%
  arrange(desc(income_population)) %>%
  head(.)
#   state year   cpi     pop   packpc    income   tax   avgprs     taxs income_population
# 1    CT 1995 1.524 3265293 79.47219 104315120 74.00 218.2805 86.35550          31.94663
# 2    CT 1994 1.482 3268346 77.62336  99787808 71.00 215.9573 83.22400          30.53159
# 3    CT 1993 1.445 3272325 79.79036  96866464 67.00 214.8885 79.16350          29.60172
# 4    NJ 1995 1.524 7965523 80.37137 233208576 64.00 203.0872 75.49550          29.27725
# 5    CT 1992 1.403 3274997 84.24435  93778704 63.75 209.2263 75.59300          28.63475
# 6    MA 1995 1.524 6062335 76.62064 170051568 75.00 217.1050 85.33833          28.05051

Smaller data:

# dput(head(Cigarette))
structure(list(state = structure(1:6, .Label = c("AL", "AR", "AZ", "CA", "CO", "CT", "DE", "FL", "GA", "IA", "ID", "IL", "IN", "KS", "KY", "LA", "MA", "MD", "ME", "MI", "MN", "MO", "MS", "MT", "NC", "ND", "NE", "NH", "NJ", "NM", "NV", "NY", "OH", "OK", "OR", "PA", "RI", "SC", "SD", "TN", "TX", "UT", "VA", "VT", "WA", "WI", "WV", "WY"), class = "factor"), year = c(1985L, 1985L, 1985L, 1985L, 1985L, 1985L), cpi = c(1.07599997520447, 1.07599997520447, 1.07599997520447, 1.07599997520447, 1.07599997520447, 1.07599997520447), pop = c(3973000L, 2327000L, 3184000L, 26444000L, 3209000L, 3201000L), packpc = c(116.486282348633, 128.534591674805, 104.522613525391, 100.363037109375, 112.963539123535, 109.278350830078), income = c(46014968L, 26210736L, 43956936L, 447102816L, 49466672L, 60063368L), tax = c(32.5000038146973, 37, 31, 26, 31, 42), avgprs = c(102.181671142578, 101.474998474121, 108.578750610352, 107.837341308594, 94.2666625976563, 128.024993896484), taxs = c(33.3483352661133, 37, 36.1704177856445, 32.1040000915527, 31, 51.4833335876465)), row.names = c("1", "2", "3", "4", "5", "6"), class = "data.frame")

And the result on the reduced data:

head(Cigarette) %>%
  mutate(income_population = income / pop) %>%
  arrange(desc(income_population))
#   state year   cpi      pop   packpc    income  tax    avgprs     taxs income_population
# 1    CT 1985 1.076  3201000 109.2784  60063368 42.0 128.02499 51.48333          18.76394
# 2    CA 1985 1.076 26444000 100.3630 447102816 26.0 107.83734 32.10400          16.90753
# 3    CO 1985 1.076  3209000 112.9635  49466672 31.0  94.26666 31.00000          15.41498
# 4    AZ 1985 1.076  3184000 104.5226  43956936 31.0 108.57875 36.17042          13.80557
# 5    AL 1985 1.076  3973000 116.4863  46014968 32.5 102.18167 33.34834          11.58192
# 6    AR 1985 1.076  2327000 128.5346  26210736 37.0 101.47500 37.00000          11.26375
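
One possible reason the ranking appeared not to work in the question is that mutate() returns a new data frame rather than modifying Cigarette in place, so the new column only exists if the result is assigned or piped onward. The following is a minimal sketch of creating the column and an explicit rank in one pipeline. It assumes a hand-built data frame named cig_small (a hypothetical name) that holds only the state, pop and income values from the six rows above, and it uses min_rank(desc(...)) as one reasonable choice of ranking function.

```r
library(dplyr)

# Hypothetical helper data: the six 1985 rows shown above, reduced to
# the three columns needed for this illustration.
cig_small <- data.frame(
  state  = c("AL", "AR", "AZ", "CA", "CO", "CT"),
  pop    = c(3973000, 2327000, 3184000, 26444000, 3209000, 3201000),
  income = c(46014968, 26210736, 43956936, 447102816, 49466672, 60063368)
)

ranked <- cig_small %>%
  mutate(
    income_population = income / pop,
    # desc() reverses the order, so the largest value gets rank 1;
    # min_rank() gives tied values the same (smallest) rank.
    inc_pop_rank = min_rank(desc(income_population))
  ) %>%
  arrange(inc_pop_rank)

print(ranked)
```

Because the result is assigned to ranked, the new columns remain available for later steps. If ties should not leave gaps in the numbering, dense_rank() could be used instead of min_rank().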

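Answer 1: (score: 0)

Assuming you actually want to add a variable that contains the rank, with 1 being the highest (for clarity, only some of the columns and only the first 10 rows are shown):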

library(Ecdat)
library(dplyr)

Cigarette %>% 
   mutate(income_population = income/pop) %>% 
   arrange(desc(income_population)) %>% 
   mutate(inc_pop_rank = row_number(-income_population)) %>%
   slice(1:10) %>%
   select(state, year, income_population, inc_pop_rank)

   state year income_population inc_pop_rank
1     CT 1995          31.94663            1
2     CT 1994          30.53159            2
3     CT 1993          29.60172            3
4     NJ 1995          29.27725            4
5     CT 1992          28.63475            5
6     MA 1995          28.05051            6
7     NJ 1994          27.88522            7
8     NY 1995          27.72108            8
9     NJ 1993          27.10118            9
10    MD 1995          26.89587           10
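
Since the data contain one row per state and year, the ranking above mixes years together. If a rank within each year is wanted instead, a sketch along these lines may help. It assumes a small data frame named top_rows (a hypothetical name) built from seven of the rows printed above, so the ranks it produces are relative to that subset only, not to all states.

```r
library(dplyr)

# Hypothetical helper data: seven of the rows printed above
# (the 1994 and 1995 entries), with income_population already computed.
top_rows <- data.frame(
  state = c("CT", "CT", "NJ", "MA", "NJ", "NY", "MD"),
  year  = c(1995, 1994, 1995, 1995, 1994, 1995, 1995),
  income_population = c(31.94663, 30.53159, 29.27725, 28.05051,
                        27.88522, 27.72108, 26.89587)
)

by_year <- top_rows %>%
  group_by(year) %>%                     # rank separately within each year
  mutate(rank_in_year = min_rank(desc(income_population))) %>%
  ungroup() %>%
  arrange(year, rank_in_year)

print(by_year)
```

The same group_by(year) step could be placed before the ranking mutate() in the pipeline on the full Cigarette data.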
