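I'm having a problem with a loop. It should be easy to solve, but "R for Stata Users" (I coded in Stata for several years), Roger Peng's videos, and Google don't seem to be helping me. Could one of you explain what I'm doing wrong?
I'm trying to write a loop that runs through the 'thresholds' data frame and pulls information from three sets of columns. I can do what I want by writing the same block of code three times, but that will become very cumbersome as the code gets more complex.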
Here is a sample of 'thresholds' (see the dput output below, added by a friendly reader):
  threshold_1_name      threshold_1_dir threshold_1_value
1 overweight            >               25
2 possible malnutrition <               31
3 Q1                    >               998
4 Q1                    >               998
5 Q1                    >               998
6 Q1                    >               998
  threshold_1_units threshold_2_name threshold_2_dir threshold_2_value threshold_2_units
1 kg/m^2            obese            >               30                kg/m^2
2 cm                <NA>             >               NA
3 <NA>              Q3               >               998
4                   Q3               >               998
5                   Q3               >               998
6                   Q3               >               998
This code does what I want:
newvars1 <- paste(thresholds$varname, thresholds$threshold_1_name, sep = "_")
noval <- is.na(thresholds$threshold_1_value)
newvars1 <- newvars1[!noval]
newvars2 <- paste(thresholds$varname, thresholds$threshold_2_name, sep = "_")
noval <- is.na(thresholds$threshold_2_value)
newvars2 <- newvars2[!noval]
newvars3 <- paste(thresholds$varname, thresholds$threshold_3_name, sep = "_")
noval <- is.na(thresholds$threshold_3_value)
newvars3 <- newvars3[!noval]
Here is how I tried to turn it into a loop:
variables <- NULL
for (i in 1:3) {
  valuevar <- paste("threshold", i, "value", sep = "_")
  namevar <- paste("threshold", i, "name", sep = "_")
  newvar <- paste("varnames", i, sep = "")
  for (j in 1:length(thresholds$varname)) {
    check <- is.na(thresholds[valuevar[j]])
    if (check == FALSE) {
      newvars <- paste(thresholds$varname, thresholds[namevar], sep = "_")
    }
  }
  variables <- c(variables, newvars)
}
This is the error I get:
Error: unexpected '}' in "}"
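I think the way I'm referring to 'i' is what's messing things up, but I'm not sure how to do it properly. My Stata habit of using locals is really biting me now that I'm switching to R.
Edited by a friendly reader to add the dput output: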
thresholds <- structure(list(varname = structure(1:6, .Label = c("varA", "varB",
"varC", "varD", "varE", "varF"), class = "factor"), threshold_1_name = c("overweight",
"possible malnutrition", "Q1", "Q1", "Q1", "Q1"), threshold_1_dir = c(">",
"<", ">", ">", ">", ">"), threshold_1_value = c(25L, 31L, 998L,
998L, 998L, 998L), threshold_1_units = c("kg/m^2", "cm", NA,
NA, NA, NA), threshold_2_name = c("obese", "<NA>", "Q3", "Q3",
"Q3", "Q3"), threshold_2_dir = c(">", ">", ">", ">", ">", ">"
), threshold_2_value = c(30L, NA, 998L, 998L, 998L, 998L), threshold_2_units = c("kg/m^2",
"cm", NA, NA, NA, NA)), .Names = c("varname", "threshold_1_name",
"threshold_1_dir", "threshold_1_value", "threshold_1_units",
"threshold_2_name", "threshold_2_dir", "threshold_2_value", "threshold_2_units"
), row.names = c(NA, -6L), class = "data.frame")
Answer 0 (score: 6)
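The first problem I see is if(check = "FALSE"). That is an assignment (=); if you are testing a condition you need ==. Also, quoting the word "FALSE" means you are testing the variable against a string value (literally the word FALSE) rather than the logical value, which is FALSE without quotes.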
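A small sketch of the difference, using a made-up `check` value rather than the asker's data. One detail worth knowing: `==` coerces a logical to character when the other side is a string, so `check == "FALSE"` happens to come out TRUE here, but `identical()` shows the two values are not the same thing. As far as I know, R's parser also rejects `=` inside an `if ()` condition outright, which the last two lines try to demonstrate.

```r
check <- FALSE  # hypothetical value, standing in for the result of is.na()

identical(check, "FALSE")  # FALSE: a logical is not the string "FALSE"
check == "FALSE"           # TRUE: `==` coerces the logical to "FALSE" first
check == FALSE             # TRUE: compares logical with logical
!check                     # TRUE: the usual way to test for FALSE

# `=` inside an if() condition fails at parse time, before anything runs
bad <- try(parse(text = 'if (check = "FALSE") 1'), silent = TRUE)
inherits(bad, "try-error")  # TRUE
```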
The second problem, as @BlueMagister correctly pointed out, is that you are missing a ) at the end of for (j in 1:length(...)) {
for (j in 1:length(thresholds$varname)) {
  check <- is.na(thresholds[valuevar[j]])
  if (check = "FALSE") { # bad! `=` is assignment and "FALSE" is a string
    newvars <- paste(thresholds$varname, thresholds[namevar], sep = "_")
  }
}

for (j in 1:length(thresholds$varname)) {
  check <- is.na(thresholds[valuevar[j]])
  if (check == FALSE) { # good! `==` compares against the logical FALSE
    newvars <- paste(thresholds$varname, thresholds[namevar], sep = "_")
  }
}
But since this is an if statement, you can use much simpler logic, especially when the value is already a logical (TRUE / FALSE).
for (j in 1:length(thresholds$varname)) {
  check <- is.na(thresholds[valuevar[j]])
  if (!check) { # better!
    newvars <- paste(thresholds$varname, thresholds[namevar], sep = "_")
  }
}
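Separately from the comparison, the indexing in these snippets may also need attention: `valuevar` holds a single column name, so `valuevar[j]` is `NA` for any `j` above 1, and single brackets on a data frame return a one-column data frame rather than a vector. A minimal sketch on a cut-down, made-up data frame (`df` and its values are illustrative, not the asker's data) of picking out one element with `[[`:

```r
# Hypothetical two-row stand-in for the `thresholds` data frame
df <- data.frame(varname = c("varA", "varB"),
                 threshold_1_name = c("overweight", "Q1"),
                 threshold_1_value = c(25L, NA),
                 stringsAsFactors = FALSE)

valuevar <- paste("threshold", 1, "value", sep = "_")

valuevar[2]            # NA: valuevar has only one element
class(df[valuevar])    # "data.frame": single brackets keep the data frame
class(df[[valuevar]])  # "integer": double brackets extract the column vector

# Test one row at a time: select the column first, then row j
kept <- character(0)
for (j in seq_len(nrow(df))) {
  if (!is.na(df[[valuevar]][j])) {
    kept <- c(kept, paste(df$varname[j], df$threshold_1_name[j], sep = "_"))
  }
}
kept
```

Here only the first row survives, because the second row's threshold value is `NA`.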
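Answer 1 (score: 1)
Your loop is clearly missing a parenthesis. You should consider using an editor that supports brace matching to avoid errors like this.
Answer 2 (score: 0)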
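I think the simplest approach is to write a function that does what the non-loop code does. For reference, here is the output of that code, using the dput output from the edited question.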
> newvars1 <- paste(thresholds$varname, thresholds$threshold_1_name, sep = "_")
> newvars1 <- newvars1[!is.na(thresholds$threshold_1_value)]
> newvars2 <- paste(thresholds$varname, thresholds$threshold_2_name, sep = "_")
> newvars2 <- newvars2[!is.na(thresholds$threshold_2_value)]
> c(newvars1, newvars2)
[1] "varA_overweight" "varB_possible malnutrition"
[3] "varC_Q1" "varD_Q1"
[5] "varE_Q1" "varF_Q1"
[7] "varA_obese" "varC_Q3"
[9] "varD_Q3" "varE_Q3"
[11] "varF_Q3"
Here is what the function looks like:
unlist(lapply(1:2, function(k) {
  newvars <- paste(thresholds$varname,
                   thresholds[[paste("threshold", k, "name", sep = "_")]], sep = "_")
  newvars <- newvars[!is.na(thresholds[[paste("threshold", k, "value", sep = "_")]])]
}))
# [1] "varA_overweight" "varB_possible malnutrition"
# [3] "varC_Q1" "varD_Q1"
# [5] "varE_Q1" "varF_Q1"
# [7] "varA_obese" "varC_Q3"
# [9] "varD_Q3" "varE_Q3"
#[11] "varF_Q3"
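The same idea could also be wrapped in a named helper, so that the threshold set number becomes an ordinary argument. This is only a sketch and not part of the answer above: `threshold_names()` and its argument names are hypothetical, and the three-row `demo` data frame is made up for illustration.

```r
# Hypothetical helper: build "<varname>_<threshold name>" for threshold set k,
# keeping only the rows whose threshold value is not NA
threshold_names <- function(data, k) {
  name_col  <- paste("threshold", k, "name", sep = "_")
  value_col <- paste("threshold", k, "value", sep = "_")
  out <- paste(data$varname, data[[name_col]], sep = "_")
  out[!is.na(data[[value_col]])]
}

# Made-up sample data with two threshold sets
demo <- data.frame(varname = c("varA", "varB", "varC"),
                   threshold_1_name = c("overweight", "possible malnutrition", "Q1"),
                   threshold_1_value = c(25L, 31L, 998L),
                   threshold_2_name = c("obese", NA, "Q3"),
                   threshold_2_value = c(30L, NA, 998L),
                   stringsAsFactors = FALSE)

result <- unlist(lapply(1:2, function(k) threshold_names(demo, k)))
result
```

Building the column name as a string and extracting it with `[[` plays roughly the role that a local macro inside a variable name would play in Stata.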
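I tried to work out what was going on in your loop, but there is a lot in there that doesn't make sense to me; if I were going to loop this way, this is how I would write it.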
variables <- NULL
for (i in 1:2) {
  valuevar <- paste("threshold", i, "value", sep = "_")
  namevar <- paste("threshold", i, "name", sep = "_")
  newvars <- c()
  for (j in 1:nrow(thresholds)) {
    if (!is.na(thresholds[[valuevar]][j])) {
      newvars <- c(newvars, paste(thresholds$varname[j],
                                  thresholds[[namevar]][j], sep = "_"))
    }
  }
  variables <- c(variables, newvars)
}
variables
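If the inner loop is only filtering rows, it could probably be replaced by logical indexing, and the set numbers could be read off the column names instead of being hard-coded. A sketch under those assumptions (the `demo` data frame and the `value_cols`, `sets`, and `keep` variables are illustrative names, not from the question):

```r
# Made-up sample data with two threshold sets
demo <- data.frame(varname = c("varA", "varB"),
                   threshold_1_name = c("overweight", "possible malnutrition"),
                   threshold_1_value = c(25L, 31L),
                   threshold_2_name = c("obese", NA),
                   threshold_2_value = c(30L, NA),
                   stringsAsFactors = FALSE)

# Find which threshold sets exist, e.g. "threshold_2_value" -> 2
value_cols <- grep("^threshold_[0-9]+_value$", names(demo), value = TRUE)
sets <- as.integer(sub("^threshold_([0-9]+)_value$", "\\1", value_cols))

variables <- character(0)
for (i in sets) {
  valuevar <- paste("threshold", i, "value", sep = "_")
  namevar  <- paste("threshold", i, "name", sep = "_")
  keep <- !is.na(demo[[valuevar]])  # logical vector, one entry per row
  variables <- c(variables,
                 paste(demo$varname[keep], demo[[namevar]][keep], sep = "_"))
}
variables
```

With this layout, adding a third set of threshold columns to the data would not require touching the loop.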