Hi, I'm trying to reduce 33 variables to a single indicator using the code below (I know this is extremely inefficient):
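How can I loop over these 11 sets of variables so that a measure is computed for each set ending in a particular number, such as "_1", and then sum those measures over 1:11 to get a single indicator?
In other words, how do I loop over all variables ending in a number, add them together to create a new variable ending in that number, and then aggregate all 11 of those sums into one indicator?
My current code: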
data_indicator <- data %>%
mutate(plot_1=(farm_sell_1+farm_lease_1+farm_bequeath_1)/3, na.rm=T) %>%
mutate(plot_sec_1=ifelse(plot_1>.5, 1, 0)) %>%
mutate(plot_2=(farm_sell_2+farm_lease_2+farm_bequeath_2)/3, na.rm=T) %>%
mutate(plot_sec_2=ifelse(plot_2>.5, 1, 0)) %>%
mutate(plot_3=(farm_sell_3+farm_lease_3+farm_bequeath_3)/3, na.rm=T) %>%
mutate(plot_sec_3=ifelse(plot_3>.5, 1, 0)) %>%
mutate(plot_4=(farm_sell_4+farm_lease_4+farm_bequeath_4)/3, na.rm=T) %>%
mutate(plot_sec_4=ifelse(plot_4>.5, 1, 0)) %>%
mutate(plot_5=(farm_sell_5+farm_lease_5+farm_bequeath_5)/3, na.rm=T) %>%
mutate(plot_sec_5=ifelse(plot_5>.5, 1, 0)) %>%
mutate(plot_6=(farm_sell_6+farm_lease_6+farm_bequeath_6)/3, na.rm=T) %>%
mutate(plot_sec_6=ifelse(plot_6>.5, 1, 0)) %>%
mutate(plot_7=(farm_sell_7+farm_lease_7+farm_bequeath_7)/3, na.rm=T) %>%
mutate(plot_sec_7=ifelse(plot_7>.5, 1, 0)) %>%
mutate(plot_8=(farm_sell_8+farm_lease_8+farm_bequeath_8)/3, na.rm=T) %>%
mutate(plot_sec_8=ifelse(plot_8>.5, 1, 0)) %>%
mutate(plot_9=(farm_sell_9+farm_lease_9+farm_bequeath_9)/3, na.rm=T) %>%
mutate(plot_sec_9=ifelse(plot_9>.5, 1, 0)) %>%
mutate(plot_10=(farm_sell_10+farm_lease_10+farm_bequeath_10)/3, na.rm=T) %>%
mutate(plot_sec_10=ifelse(plot_10>.5, 1, 0)) %>%
mutate(plot_11=(farm_sell_11+farm_lease_11+farm_bequeath_11)/3, na.rm=T) %>%
mutate(plot_sec_11=ifelse(plot_11>.5, 1, 0)) %>%
mutate(num_plots_sec = plot_sec_1+plot_sec_2+plot_sec_3+plot_sec_4+plot_sec_5+plot_sec_6+plot_sec_7+plot_sec_8+plot_sec_9+plot_sec_10+plot_sec_11, na.rm=T)
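Answer 0 (score: 0)
One way to make this more concise is to convert the data to long format. For more on wide versus long data, see: https://uc-r.github.io/tidyr
I'll walk through the process step by step so you can see how it works, then include a single chunk of code at the end that does everything at once.
First, some fake data to work with: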
fake.data <- data.frame(matrix(data = rbinom(1650, 1, 0.5), nrow = 50, ncol = 33))
colnames(fake.data) <- c(paste0("farm_sell_", 1:11), paste0("farm_lease_", 1:11),
paste0("farm_bequeath_", 1:11))
The above should look like your data:
'data.frame': 50 obs. of 33 variables:
$ farm_sell_1 : int 0 0 0 1 0 1 0 0 1 1 ...
$ farm_sell_2 : int 0 0 1 1 1 1 1 1 0 0 ...
$ farm_sell_3 : int 1 0 0 0 1 1 0 1 0 0 ...
$ farm_sell_4 : int 1 1 1 0 1 0 0 0 1 1 ...
$ farm_sell_5 : int 1 1 0 0 1 0 0 0 1 1 ...
$ farm_sell_6 : int 0 1 0 0 0 0 0 0 0 0 ...
$ farm_sell_7 : int 1 0 1 1 0 0 0 1 0 1 ...
$ farm_sell_8 : int 0 0 1 0 0 1 1 0 1 0 ...
$ farm_sell_9 : int 1 1 1 0 0 0 1 1 1 1 ...
$ farm_sell_10 : int 1 1 0 0 1 0 1 1 0 0 ...
$ farm_sell_11 : int 0 0 0 0 1 1 1 0 0 0 ...
$ farm_lease_1 : int 0 0 1 1 0 0 1 0 1 0 ...
$ farm_lease_2 : int 0 0 0 1 1 1 1 1 1 0 ...
$ farm_lease_3 : int 0 1 1 1 0 1 1 1 0 0 ...
$ farm_lease_4 : int 1 0 1 1 0 1 1 1 1 1 ...
$ farm_lease_5 : int 0 0 0 0 1 1 0 1 0 1 ...
$ farm_lease_6 : int 0 1 1 0 1 1 0 0 1 1 ...
$ farm_lease_7 : int 0 0 0 1 1 1 0 1 1 1 ...
$ farm_lease_8 : int 0 1 0 1 0 0 1 0 1 0 ...
$ farm_lease_9 : int 0 0 1 0 0 1 0 0 1 1 ...
$ farm_lease_10 : int 1 1 1 1 0 1 1 1 0 1 ...
$ farm_lease_11 : int 1 0 0 1 1 0 0 0 1 1 ...
$ farm_bequeath_1 : int 1 1 1 0 0 1 1 1 0 0 ...
$ farm_bequeath_2 : int 0 1 1 0 0 1 1 1 1 1 ...
$ farm_bequeath_3 : int 1 0 0 1 1 0 1 0 0 1 ...
$ farm_bequeath_4 : int 0 0 1 1 1 0 0 0 1 0 ...
$ farm_bequeath_5 : int 1 1 0 0 0 0 0 1 1 0 ...
$ farm_bequeath_6 : int 0 1 0 0 0 0 1 0 1 1 ...
$ farm_bequeath_7 : int 0 0 0 0 1 0 1 0 0 1 ...
$ farm_bequeath_8 : int 0 1 0 1 1 0 0 1 1 1 ...
$ farm_bequeath_9 : int 0 0 1 1 0 1 1 0 0 1 ...
$ farm_bequeath_10: int 0 0 0 1 1 1 0 0 1 1 ...
$ farm_bequeath_11: int 0 0 0 1 0 0 0 0 0 0 ...
You need the dplyr and tidyr packages to do all of this.
library(dplyr)
library(tidyr)
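Next, we use pivot_longer from tidyr to make the data long. I add a key here to record which farm each value belongs to. We'll need it for grouping later; it simply matches the row number in the original data.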
data.long <- fake.data %>%
#add a key to keep track of stuff
mutate(farm_key = 1:n()) %>%
pivot_longer(farm_sell_1:farm_bequeath_11, names_to = "variable", values_to = "value")
The result looks like this:
# A tibble: 6 x 3
farm_key variable value
<int> <chr> <int>
1 1 farm_sell_1 0
2 1 farm_sell_2 0
3 1 farm_sell_3 1
4 1 farm_sell_4 1
Next, we use separate to split variable names such as farm_sell_1 into something more machine-readable:
data.long2 <- data.long %>%
tidyr::separate(col = variable, into = c("farm", "var", "var_num"), sep = "_")
This yields data like:
# A tibble: 6 x 5
farm_key farm var var_num value
<int> <chr> <chr> <chr> <int>
1 1 farm sell 1 0
2 1 farm sell 2 0
3 1 farm sell 3 1
4 1 farm sell 4 1
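The pivot and the split can also be folded into a single call. The following is a minimal sketch, assuming tidyr 1.0.0 or later (where pivot_longer accepts names_pattern) and columns named exactly like the fake data; the regular expression and the name data.long.alt are my own assumptions, not part of the original approach:

```r
library(dplyr)
library(tidyr)

# Fake data with the same column names as above (values are random)
set.seed(1)
fake.data <- data.frame(matrix(data = rbinom(1650, 1, 0.5), nrow = 50, ncol = 33))
colnames(fake.data) <- c(paste0("farm_sell_", 1:11), paste0("farm_lease_", 1:11),
                         paste0("farm_bequeath_", 1:11))

data.long.alt <- fake.data %>%
  # add a key to keep track of the original row
  mutate(farm_key = 1:n()) %>%
  pivot_longer(
    cols = -farm_key,
    # first capture group -> "var" (sell/lease/bequeath),
    # second capture group -> "var_num" (the trailing number)
    names_to = c("var", "var_num"),
    names_pattern = "^farm_(.+)_(\\d+)$",
    values_to = "value"
  )

head(data.long.alt)
```

This skips the constant farm column that separate produces, and var_num is still a character column, so the grouping step that follows works the same way.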
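Then we do all of the additions you did above. First we group by farm_key and var_num and add up the three variables for each farm and plot number. When there are no missing values, this is the same as adding farm_sell_1 + farm_lease_1 + farm_bequeath_1 and dividing by three, as you did above. Then we compute plot_sec with an ifelse statement. Finally, we sum the 11 indicators for each farm (one each for _1, _2, _3, and so on), which gives one indicator value per farm.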
data.long3 <- data.long2 %>%
group_by(farm_key, var_num) %>%
summarise(plot_val = sum(value, na.rm = T)/3) %>% #same as plot_1, plot_2, etc.
ungroup() %>%
mutate(plot_sec = ifelse(plot_val>0.5,1,0)) %>%
#sum together to get one value for each farm_key
group_by(farm_key) %>%
summarise(num_plots_sec = sum(plot_sec)) %>%
ungroup()
The data then look like this:
# A tibble: 6 x 2
farm_key num_plots_sec
<int> <dbl>
1 1 4
2 2 4
3 3 4
4 4 8
5 5 7
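If you want to convince yourself that the long-format result matches the wide-format calculation, a quick check along these lines may help. This is only a sketch on seeded fake data, assuming dplyr 1.0.0 or later for the .groups argument; long.result and wide.num_plots_sec are hypothetical names:

```r
library(dplyr)
library(tidyr)

# Seeded fake data with the same column names as above
set.seed(1)
fake.data <- data.frame(matrix(data = rbinom(1650, 1, 0.5), nrow = 50, ncol = 33))
colnames(fake.data) <- c(paste0("farm_sell_", 1:11), paste0("farm_lease_", 1:11),
                         paste0("farm_bequeath_", 1:11))

# Long-format result, same steps as the pipeline above
long.result <- fake.data %>%
  mutate(farm_key = 1:n()) %>%
  pivot_longer(farm_sell_1:farm_bequeath_11, names_to = "variable", values_to = "value") %>%
  separate(col = variable, into = c("farm", "var", "var_num"), sep = "_") %>%
  group_by(farm_key, var_num) %>%
  summarise(plot_val = sum(value, na.rm = TRUE) / 3, .groups = "drop") %>%
  mutate(plot_sec = ifelse(plot_val > 0.5, 1, 0)) %>%
  group_by(farm_key) %>%
  summarise(num_plots_sec = sum(plot_sec), .groups = "drop")

# Wide-format check: for each plot number, average the three columns per row
# and flag rows above 0.5; the result is a 50 x 11 matrix of 0/1 flags
wide.flags <- sapply(1:11, function(i) {
  cols <- paste0(c("farm_sell_", "farm_lease_", "farm_bequeath_"), i)
  as.numeric(rowMeans(fake.data[, cols]) > 0.5)
})
wide.num_plots_sec <- rowSums(wide.flags)

# Compare the two results row by row
all.equal(long.result$num_plots_sec, wide.num_plots_sec)
```

The two agree here because the fake data contain no missing values; with NAs present, sum(value, na.rm = TRUE)/3 and a plain row mean would differ.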
And as promised, one chunk of code that does everything at once:
data.one.ind <- fake.data %>%
#add a key to keep track of stuff
mutate(farm_key = 1:n()) %>%
pivot_longer(farm_sell_1:farm_bequeath_11, names_to = "variable", values_to = "value") %>%
tidyr::separate(col = variable, into = c("farm", "var", "var_num"), sep = "_") %>%
group_by(farm_key, var_num) %>%
summarise(plot_val = sum(value, na.rm = T)/3) %>% #same as plot_1, plot_2, etc.
ungroup() %>%
mutate(plot_sec = ifelse(plot_val>0.5,1,0)) %>%
#sum together to get one value for each farm_key
group_by(farm_key) %>%
summarise(num_plots_sec = sum(plot_sec)) %>%
ungroup()
All in all, this may not actually save you much typing, but it adapts more easily if the variables change.
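To illustrate that last point, here is a hedged variation wrapped in a function. count_secure_plots and its threshold argument are hypothetical names of my own, and it assumes dplyr 1.0.0 or later, that every column follows the farm_<name>_<number> pattern, and that the data have no existing farm_key column. It uses mean(value) in place of sum(value)/3, so the number of variables per plot is not hard-coded; note that with missing values the mean divides by the number of non-missing entries, which is a different rule from dividing by three.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper: counts, per row, how many plots have a mean above `threshold`
count_secure_plots <- function(df, threshold = 0.5) {
  df %>%
    mutate(farm_key = 1:n()) %>%
    pivot_longer(cols = -farm_key, names_to = "variable", values_to = "value") %>%
    separate(col = variable, into = c("farm", "var", "var_num"), sep = "_") %>%
    group_by(farm_key, var_num) %>%
    # mean() adapts to however many variables share the same number
    summarise(plot_val = mean(value, na.rm = TRUE), .groups = "drop") %>%
    mutate(plot_sec = ifelse(plot_val > threshold, 1, 0)) %>%
    group_by(farm_key) %>%
    summarise(num_plots_sec = sum(plot_sec), .groups = "drop")
}

# Seeded fake data with the same column names as above
set.seed(1)
fake.data <- data.frame(matrix(data = rbinom(1650, 1, 0.5), nrow = 50, ncol = 33))
colnames(fake.data) <- c(paste0("farm_sell_", 1:11), paste0("farm_lease_", 1:11),
                         paste0("farm_bequeath_", 1:11))

result3 <- count_secure_plots(fake.data)

# Adding a fourth, made-up variable per plot (farm_rent_1 ... farm_rent_11)
# needs no change to the function
extra <- data.frame(matrix(data = rbinom(550, 1, 0.5), nrow = 50, ncol = 11))
colnames(extra) <- paste0("farm_rent_", 1:11)
result4 <- count_secure_plots(cbind(fake.data, extra))
```

In the wide-format version, the same change would mean editing all 11 mutate lines.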