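I am processing some data I collected. My work usually involves several levels of the same treatment, and to make visualization easier I normalize the values to the initial value.
To do this, I wrote a simple script: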
var1 = c(200:223) # These numbers represent the measured response variable (RV).
var2 = rep(c("A", "A.1", "B", "B.1", "C", "C.1"), each = 1, times = 4) # These are experimental unit identifiers.
var3 = rep(c("37ºC", "45ºC"), each = 6, times = 2) # These are the levels of the treatment (temperature)
var4 = rep(c("First time", "Second time"), each = 12) #These are the times at which the RV was measured.
df = data.frame(var1, var2, var3, var4) #Data frame of the vars
This is what the data frame looks like:
ID var1 var2 var3 var4
1 200 A 37ºC First time
2 201 A.1 37ºC First time
3 202 B 37ºC First time
4 203 B.1 37ºC First time
5 204 C 37ºC First time
6 205 C.1 37ºC First time
7 206 A 45ºC First time
8 207 A.1 45ºC First time
9 208 B 45ºC First time
10 209 B.1 45ºC First time
11 210 C 45ºC First time
12 211 C.1 45ºC First time
13 212 A 37ºC Second time
14 213 A.1 37ºC Second time
15 214 B 37ºC Second time
16 215 B.1 37ºC Second time
17 216 C 37ºC Second time
18 217 C.1 37ºC Second time
19 218 A 45ºC Second time
20 219 A.1 45ºC Second time
21 220 B 45ºC Second time
22 221 B.1 45ºC Second time
23 222 C 45ºC Second time
24 223 C.1 45ºC Second time
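The idea of the normalization is to compare the value at Second time with the value at First time for the same var3 level and the same var2 identifier. For example, compare ID 13 with ID 1, and ID 1 with itself.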
library(dplyr) # filter() on a data frame below is dplyr::filter
df_37 = subset(df, var3 %in% "37ºC") #Subset df by value in var3
df_45 = subset(df, var3 %in% "45ºC")
relat37 = c() #Create vectors where to store normalized numbers for the corresponding var3
relat45 = c()
for (j in df_37$var2) { #Iterate through the values of var2, the identifiers of each experimental unit
for (i in filter(df_37, var2 == j)[1]) { #Take the first column (var1) of the rows where the condition is met
}
result = (i/i[1]) #divide the values of the second time by those of the first time.
relat37 = append(relat37, result) #append the results.
}
for (j in df_45$var2) {
for (i in filter(df_45, var2 == j)[1]) {
}
result = (i/i[1])
relat45 = append(relat45, result)
}
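When I print the stored variable "relat37" or "relat45", I get the initial value of each var2 normalized to itself, followed by the second-time value normalized to the first-time value of the same var2. Then it moves on to the next var2: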
[1] 1.0 13.0 1.0 7.0 1.0 5.0 1.0 4.0 1.0 3.4 1.0 3.0 1.0 13.0 1.0 7.0 1.0 5.0 1.0 4.0 1.0
[22] 3.4 1.0 3.0
The problem is that, for some reason, it repeats the answers twice, so I get 24 normalized values instead of 12.
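First question: Is there a way to write a for loop that does not require me to create 2 new data frames and 2 separate for loops? I just want a single for loop that does what is needed for every level of the treatment. There are only two levels here, but I usually work with 6 or more, and the code gets too long.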
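One possible shape for such a loop, as a minimal base R sketch rather than a definitive answer: iterate over unique(df$var3) and store each level's result in a named list, so no per-level data frame or per-level loop is needed. The names relat, lvl, rows, first and baseline are made up for illustration. The loop-free ave() variant at the end additionally assumes that, within each var2/var3 group, the "First time" row comes first, as it does in the example data.

```r
# Rebuild the example data from the question.
var1 <- 200:223
var2 <- rep(c("A", "A.1", "B", "B.1", "C", "C.1"), times = 4)
var3 <- rep(c("37ºC", "45ºC"), each = 6, times = 2)
var4 <- rep(c("First time", "Second time"), each = 12)
df <- data.frame(var1, var2, var3, var4)

# One loop over every treatment level, however many there are.
relat <- list()  # hypothetical container: one numeric vector per level
for (lvl in unique(df$var3)) {
  rows  <- df[df$var3 == lvl, ]               # all rows of this level
  first <- rows[rows$var4 == "First time", ]  # baseline rows of this level
  # For each row, look up the first-time value of the same experimental unit.
  baseline <- first$var1[match(rows$var2, first$var2)]
  relat[[lvl]] <- rows$var1 / baseline
}

# Loop-free alternative: normalize within each var2/var3 group.
# Assumes the first row of each group is the "First time" measurement.
df$relat <- ave(df$var1, df$var2, df$var3, FUN = function(x) x / x[1])
```

Each element of relat then holds 12 values for its level, in the row order of df, and df$relat holds the same ratios next to the original rows.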
Second question: Why do I get 24 results? The last 12 are just the first 12 repeated. How can I get only the 12 results normalized to each var2?
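A likely explanation for the duplication: df_37$var2 has 12 entries, because each of the 6 identifiers appears once per measurement time, and every pass of the outer loop appends 2 values (first time and second time), giving 12 x 2 = 24. Looping over unique(df_37$var2) visits each identifier once. The sketch below is one way to write that in base R, without the empty inner loop; vals is a hypothetical helper name, and it assumes the "First time" row precedes the "Second time" row for each identifier.

```r
# Rebuild the example data from the question.
var1 <- 200:223
var2 <- rep(c("A", "A.1", "B", "B.1", "C", "C.1"), times = 4)
var3 <- rep(c("37ºC", "45ºC"), each = 6, times = 2)
var4 <- rep(c("First time", "Second time"), each = 12)
df <- data.frame(var1, var2, var3, var4)

df_37 <- subset(df, var3 == "37ºC")

relat37 <- c()
for (j in unique(df_37$var2)) {        # 6 identifiers, each visited once
  vals <- df_37$var1[df_37$var2 == j]  # first-time and second-time values
  relat37 <- c(relat37, vals / vals[1])  # e.g. 200/200 and 212/200 for "A"
}

length(relat37)  # 12
```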