Creating a single for loop from multiple for loops in R

Time: 2021-04-08 17:22:51

Tags: r for-loop

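I am working with some data I collected. My work usually involves several levels of the same treatment, and to make visualization easier I normalize the values to the initial value.

To do this, I wrote a simple script: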

var1 = c(200:223) # These numbers represent the measured response variable (RV).
var2 = rep(c("A", "A.1", "B", "B.1", "C", "C.1"), each = 1, times = 4) # These are experimental unit identifiers.
var3 = rep(c("37ºC", "45ºC"), each = 6, times = 2) # These are the levels of the treatment (temperature)
var4 = rep(c("First time", "Second time"), each = 12) #These are the times at which the RV was measured.

df = data.frame(var1, var2, var3, var4) #Data frame of the vars

This is what the data frame looks like:

ID  var1  var2 var3     var4
1   200    A 37ºC  First time
2   201  A.1 37ºC  First time
3   202    B 37ºC  First time
4   203  B.1 37ºC  First time
5   204    C 37ºC  First time
6   205  C.1 37ºC  First time

7   206    A 45ºC  First time
8   207  A.1 45ºC  First time
9   208    B 45ºC  First time
10  209  B.1 45ºC  First time
11  210    C 45ºC  First time
12  211  C.1 45ºC  First time

13  212    A 37ºC Second time
14  213  A.1 37ºC Second time
15  214    B 37ºC Second time
16  215  B.1 37ºC Second time
17  216    C 37ºC Second time
18  217  C.1 37ºC Second time

19  218    A 45ºC Second time
20  219  A.1 45ºC Second time
21  220    B 45ºC Second time
22  221  B.1 45ºC Second time
23  222    C 45ºC Second time
24  223  C.1 45ºC Second time

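The idea of the normalization is to compare each "Second time" value with the "First time" value of the same experimental unit (var2) at the same var3 level. For example, ID 13 is compared with ID 1, and ID 1 is compared with itself.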
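The normalization described above can be sketched without splitting the data by hand. This is a minimal base R sketch, not the script from the question: it assumes that within each var2/var3 group the "First time" row comes before the "Second time" row, and the column name `relat` is introduced here only for illustration.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  var1 = 200:223,
  var2 = rep(c("A", "A.1", "B", "B.1", "C", "C.1"), times = 4),
  var3 = rep(c("37ºC", "45ºC"), each = 6, times = 2),
  var4 = rep(c("First time", "Second time"), each = 12)
)

# Within each var2/var3 group, divide every value by the group's first value.
# Assumption: the "First time" row is the first row of each group.
df$relat <- ave(as.numeric(df$var1), df$var2, df$var3,
                FUN = function(x) x / x[1])

# ID 1 is normalized to itself, ID 13 is normalized to ID 1.
df[c(1, 13), ]
```

Because `ave()` returns a vector in the original row order, the result can be stored next to the raw values, whatever the number of var3 levels.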

library(dplyr) # filter() below is dplyr::filter()

df_37 = subset(df, var3 %in% "37ºC") # Subset df by the value in var3
df_45 = subset(df, var3 %in% "45ºC")

relat37 = c() # Vectors to store the normalized numbers for each level of var3
relat45 = c()

for (j in df_37$var2) { # Iterate through the values of var2, the identifiers of each experimental unit
  for (i in filter(df_37, var2 == j)[1]) { # [1] keeps the first column (var1) of the rows where var2 == j; after the loop, i holds that column
  }
  result = (i/i[1]) # Divide the first-time and second-time values by the first-time value
  relat37 = append(relat37, result) # Append the results
}
for (j in df_45$var2) {
  for (i in filter(df_45, var2 == j)[1]) {
  }
  result = (i/i[1])
  relat45 = append(relat45, result)
}

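When I call the stored variables "relat37" or "relat45", I get the first-time value of each var2 normalized to itself, followed by the second-time value normalized to the first-time value of the same var2. Then it moves on to the next var2. (The output below matches a run with var1 = 1:24, for example 13/1 = 13 and 14/2 = 7, rather than the 200:223 defined above.)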

[1]  1.0 13.0  1.0  7.0  1.0  5.0  1.0  4.0  1.0  3.4  1.0  3.0  1.0 13.0  1.0  7.0  1.0  5.0  1.0  4.0  1.0
[22]  3.4  1.0  3.0

The problem is that, for some reason, the results are repeated twice, so I get 24 normalized values instead of 12.

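First question: Is there a way to write a for loop that does not require me to create 2 new data frames and 2 separate for loops? I want a single for loop that does what needs to be done for every level of the treatment. In this case there are only two levels, but I usually work with 6 or more, and the code becomes too long.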
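One possible shape for such a single loop is sketched below. It is an illustration under assumptions, not a confirmed answer: it loops over `unique(df$var3)`, uses base R subsetting and `match()` in place of `filter()`, and stores one vector per level in a list named `relat` (a name chosen here for illustration).

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  var1 = 200:223,
  var2 = rep(c("A", "A.1", "B", "B.1", "C", "C.1"), times = 4),
  var3 = rep(c("37ºC", "45ºC"), each = 6, times = 2),
  var4 = rep(c("First time", "Second time"), each = 12)
)

relat <- list() # one element per level of var3

for (lvl in unique(df$var3)) {
  sub   <- df[df$var3 == lvl, ]             # rows for this treatment level
  first <- sub[sub$var4 == "First time", ]  # baseline rows for this level
  # For every row, look up the baseline value of the same experimental unit.
  baseline <- first$var1[match(sub$var2, first$var2)]
  relat[[lvl]] <- sub$var1 / baseline
}

lengths(relat) # 12 normalized values per level
```

The values come out in row order (all first-time values, then all second-time values) rather than interleaved per unit, and adding more var3 levels requires no extra code.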

Second question: Why do I get 24 results? The last 12 results are just a repetition of the first 12. How can I get only the 12 results normalized for each var2?
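A likely explanation for the duplication: `df_37$var2` has 12 entries because each of the 6 identifiers appears once per measurement time, so the outer loop runs 12 times and each pass appends 2 values. The sketch below, which assumes base R subsetting as a stand-in for the `filter()` call, iterates over `unique(df_37$var2)` instead.

```r
# Rebuild the example data frame from the question.
df <- data.frame(
  var1 = 200:223,
  var2 = rep(c("A", "A.1", "B", "B.1", "C", "C.1"), times = 4),
  var3 = rep(c("37ºC", "45ºC"), each = 6, times = 2),
  var4 = rep(c("First time", "Second time"), each = 12)
)
df_37 <- subset(df, var3 == "37ºC")

length(df_37$var2)         # 12: every identifier is listed twice
length(unique(df_37$var2)) # 6: one entry per experimental unit

relat37 <- c()
for (j in unique(df_37$var2)) {
  vals <- df_37$var1[df_37$var2 == j]        # first-time and second-time values for j
  relat37 <- append(relat37, vals / vals[1]) # normalize to the first-time value
}

length(relat37) # 6 units x 2 times = 12
```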
