Question

我有一个长格式的逐年逐月数据，我试图用两列来spread。我看到的唯一示例包括一个key。

> dput(df)
structure(list(ID = c("a", "a", "a", "a", "a", "a", "a", "a", 
"a", "b", "b", "b", "b", "b", "b", "b", "b", "b"), Year = c(2015L, 
2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2017L, 2015L, 
2015L, 2015L, 2016L, 2016L, 2016L, 2017L, 2017L, 2017L), Month = c(1L, 
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 
3L), Value = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 6L, 7L, 8L, 
9L, 10L, 11L, 12L, 13L, 14L)), .Names = c("ID", "Year", "Month", 
"Value"), class = "data.frame", row.names = c(NA, -18L))

我正在尝试将其转换成一种数据格式，其中年份为2：5列，每month每ID一行

ID  Month   2015    2016    2017
a   1         1     2       3
a   2         1     2       3
a   3         1     2       3
a   1         6     9       12
a   2         7     10      13
a   3         8     11      14

我尝试了以下错误，并且出现以下错误：

by_month_over_years = spread(df,key = c(Year,Month), Value)
Error: `var` must evaluate to a single number or a column name, not an integer vector

Answer 1

library(tidyr)
library(dplyr)
df %>% group_by(ID) %>% spread(Year, Value)

# A tibble: 6 x 5
# Groups:   ID [2]
  ID    Month `2015` `2016` `2017`
  <chr> <int>  <int>  <int>  <int>
 1 a         1      1      2      3
 2 a         2      1      2      3
 3 a         3      1      2      3
 4 b         1      6      9     12
 5 b         2      7     10     13
 6 b         3      8     11     14

Answer 2

library(reshape2) # or data.table, for dcast

dcast(df, ID + Month ~ Year)

#   ID Month 2015 2016 2017
# 1  a     1    1    2    3
# 2  a     2    1    2    3
# 3  a     3    1    2    3
# 4  b     1    6    9   12
# 5  b     2    7   10   13
# 6  b     3    8   11   14

Answer 3

这里是base R的{{1}}选项

reshape

分散在R-dplyr tidyr解决方案中的多列上

3 个答案: