Finding the average

Time: 2016-02-28 10:16:25

Tags: mean

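I'm trying to create a data frame (df1) with n columns (3 in this example). Column 1 should be a random column from the data frame df0. Column 2 should be the mean of that same random column plus four other random columns from df0. Column 3 should be the mean of those first five plus another five random columns.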

1 answer:

Answer 0: (score: 1)

I'll try to answer your questions one at a time. Let's start with the first one.

total <- 15     # total number of columns in df0
n_sample <- 10  # number of columns to extract from df0
values <- 4     # number of rows
df0 <- data.frame(matrix(data = rexp(values * total, rate = total), nrow = values, ncol = total))

# First, draw 10 random column indices (without replacement) and select those columns from df0
random <- sample(total, n_sample, replace = FALSE)
df1 <- df0[, random]


# Create an empty (all-NA) data frame to hold the output
df2 <- data.frame(matrix(NA, nrow = values, ncol = 3))

# Assign the first column of df1 to the first output column
df2$X1 <- df1[, 1]

# Row-wise mean of the first five randomly selected columns goes into the second column
df2$X2 <- rowMeans(df1[, 1:5])

# Row-wise mean of all ten selected columns goes into the third column
df2$X3 <- rowMeans(df1[, 1:10])


> df2  # values differ between runs because no seed is set
#         X1         X2         X3
#1 0.18816542 0.12617238 0.08728368
#2 0.09855574 0.07592763 0.06069351
#3 0.12022571 0.06045562 0.07964574
#4 0.00260806 0.06172300 0.06225859
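
The question asks for n output columns in general, so the same idea can be written without hard-coding three assignments. The following is only a sketch under assumptions not stated in the original post: the names `sizes`, `idx` and `result` are made up for illustration, a seed is fixed so the run is reproducible, and each output column is taken to be the row-wise mean of the first `sizes[k]` randomly drawn columns.

```r
set.seed(1)  # assumption: fixed seed so the random draw is reproducible

total  <- 15  # number of columns in df0
values <- 4   # number of rows
df0 <- data.frame(matrix(rexp(values * total, rate = total),
                         nrow = values, ncol = total))

sizes <- c(1, 5, 10)                  # how many random columns feed each output column
idx <- sample(ncol(df0), max(sizes))  # random column indices, drawn without replacement

# Output column k is the row-wise mean of the first sizes[k] drawn columns.
# drop = FALSE keeps a single column as a data frame so rowMeans() still works.
result <- as.data.frame(
  sapply(sizes, function(k) rowMeans(df0[, idx[seq_len(k)], drop = FALSE]))
)
names(result) <- paste0("X", seq_along(sizes))
```

Changing `sizes` (for example to `c(1, 5, 10, 15)`) adds further columns without touching the rest of the code, as long as the largest value does not exceed `ncol(df0)`.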

To drop all the columns I don't need, I personally use something like the following, though I'm sure there are other ways to do it.

# For example, to keep only columns X3 and X5:
col_list <- c("X3", "X5")
dfm <- df0[, col_list]
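
As for other ways to do this, a few base R alternatives are sketched below. The toy data frame and the names `keep`, `kept_a`, `kept_b` and `dropped` are assumptions for illustration only and are not part of the original answer.

```r
# Toy data with columns X1..X15 (illustrative, not the random df0 from above)
df0 <- data.frame(matrix(seq_len(4 * 15), nrow = 4, ncol = 15))

keep <- c("X3", "X5")

# Keep columns by name; drop = FALSE stops the result from collapsing
# to a plain vector when only one column is kept
kept_a <- df0[, keep, drop = FALSE]

# The same selection with subset(), using unquoted column names
kept_b <- subset(df0, select = c(X3, X5))

# Drop columns by name instead of listing the ones to keep
dropped <- df0[, setdiff(names(df0), c("X1", "X2")), drop = FALSE]
```

Indexing with `[` tends to be the safer choice inside functions, while `subset()` is convenient for interactive work.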