Question

I am trying to compute the upper and lower quartiles of the two variables in my data.frame over a time period I am interested in. The code below gives me only a single number for the upper and lower quantile.

library(dplyr)
library(tidyr)
library(lubridate)

set.seed(50)
FakeData <- data.frame(seq(as.Date("2001-01-01"), to = as.Date("2003-12-31"), by = "day"),
                       A = runif(1095, 0, 10),
                       D = runif(1095, 5, 15))
colnames(FakeData) <- c("Date", "A", "D")
statistics <- FakeData %>%
  gather(-Date, key = "Variable", value = "Value") %>%
  mutate(Year = year(Date), Month = month(Date)) %>%
  filter(between(Month, 3, 5)) %>%
  mutate(NewDate = ymd(paste("2020", Month, day(Date), sep = "-"))) %>%
  group_by(Variable, NewDate) %>%
  summarise(Upper = quantile(Value, 0.75, na.rm = TRUE),
            Lower = quantile(Value, 0.25, na.rm = TRUE))

I would like output similar to the following (Final_Output is the part I am interested in):

Output1 <- data.frame(seq(as.Date("2000-03-01"), to= as.Date("2000-05-31"), by="day"),
                       Upper = runif(92, 0,10), lower = runif(92,5,15), Variable = rep("A",92))
colnames(Output1)[1] <- "Date"
Output2 <- data.frame(seq(as.Date("2000-03-01"), to= as.Date("2000-05-31"), by="day"),
                      Upper = runif(92, 2,10), lower = runif(92,5,15), Variable = rep("D",92))
colnames(Output2)[1] <- "Date"
Final_Output<- bind_rows(Output1,Output2)

Answer 1

I can suggest a data.table solution. There are actually several ways to do this.

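The last step (applying quantiles by group to the Value variable) can be translated as follows if you want two columns, as in your example. This assumes statistics still contains the Value and NewDate columns (that is, the pipeline up to but not including summarise) and has already been converted with library(data.table) and setDT(statistics):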

statistics[, .(p25 = quantile(Value, probs = 0.25), p75 = quantile(Value, probs = 0.75)),
           by = c("Variable", "NewDate")]

If you prefer the output in long format:

library(data.table)
setDT(statistics)

statistics[, .(Quantile = c("p25", "p75"),
               Value = quantile(Value, probs = c(0.25, 0.75))),
           by = c("Variable", "NewDate")]
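The difference between the two layouts may be easier to see on a small table. The sketch below uses a made-up data.table called toy (not part of the question) with two groups of five values, computes the quartiles in wide and in long form, and shows that melt() turns the wide result into the long layout.

```r
library(data.table)

# Hypothetical toy data: two groups, five values each
toy <- data.table(Variable = rep(c("A", "D"), each = 5),
                  Value    = c(1:5, seq(10, 50, by = 10)))

# Wide: one row per group, one column per quantile
wide <- toy[, .(p25 = quantile(Value, 0.25),
                p75 = quantile(Value, 0.75)),
            by = Variable]

# Long: one row per group and quantile
long <- toy[, .(Quantile = c("p25", "p75"),
                Value    = quantile(Value, c(0.25, 0.75))),
            by = Variable]

# melt() reshapes the wide result into the long layout
melted <- melt(wide, id.vars = "Variable",
               variable.name = "Quantile", value.name = "Value")
```

The wide form is closer to the Upper/Lower columns requested in the question; the long form is usually more convenient when the quantile level itself is used for grouping or plotting.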

All steps together

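If you go with data.table, it is probably better to perform every step with data.table verbs. I will assume your data has a structure similar to the data frame you generated and reshaped, i.e.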

statistics <- FakeData %>% 
  gather(-Date, key = "Variable", value = "Value")

In that case, the mutate and filter steps become

setDT(statistics)  # convert to a data.table in place so that := works
statistics[, `:=`(Year = year(Date), Month = month(Date))]
statistics <- statistics[Month %between% c(3, 5)]
statistics[, NewDate := ymd(paste("2020", Month, day(Date), sep = "-"))]

Then pick whichever final step you prefer, for example

statistics[, .(p25 = quantile(Value, probs = 0.25), p75 = quantile(Value, probs = 0.75)),
           by = c("Variable", "NewDate")]
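For reference, here is a minimal end-to-end sketch that relies on data.table alone. It is one possible arrangement rather than the only one: it assumes data shaped like FakeData from the question, swaps gather() for melt(), and builds NewDate with sprintf() and as.Date() instead of lubridate's ymd(). The names fake, long and result are chosen only for this illustration.

```r
library(data.table)

set.seed(50)
# Same shape as FakeData in the question: one row per day, two numeric columns
dates <- seq(as.Date("2001-01-01"), as.Date("2003-12-31"), by = "day")
fake <- data.table(Date = dates,
                   A = runif(length(dates), 0, 10),
                   D = runif(length(dates), 5, 15))

# Wide to long: data.table's counterpart of gather()
long <- melt(fake, id.vars = "Date",
             variable.name = "Variable", value.name = "Value")

# Keep March to May and map every date onto one reference year
long <- long[month(Date) %between% c(3, 5)]
long[, NewDate := as.Date(sprintf("2020-%02d-%02d", month(Date), mday(Date)))]

# Quartiles per variable and calendar day, pooled over the three years
result <- long[, .(Lower = quantile(Value, 0.25, na.rm = TRUE),
                   Upper = quantile(Value, 0.75, na.rm = TRUE)),
               by = .(Variable, NewDate)]
```

With three years of data, each (Variable, NewDate) group holds three values, so result has one row per variable and calendar day between March 1 and May 31.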

Summarize data of multiple variables of a data.frame in R?
