Question

Suppose you have this data frame:

x <- c("a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4")

y <- c("red", "yellow", "blue", "green", "black", "pink", "purple", 
"orange")

df <- data.frame(x, y, stringsAsFactors = FALSE)

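I can't figure out a way, preferably using dplyr, to extract the y column after grouping the data frame. Essentially, I want to know which colors are in a1, a2, a3, and a4, and to store those results as separate vectors, preferably in a list.

I can do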

colors.in.a1 <- df %>% filter(x == "a1") %>% pull(y)

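separately for a1, a2, a3, and a4, but that would take forever on my real data. I was hoping pull() would behave like tally() and perhaps return a list of vectors named after the grouping variable, but it doesn't.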

Answer 1

Using base R only (thanks to @thelatemail's comment):

split(df$y, df$x)
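
As a rough sketch of how the resulting list might be used, assuming the df from the question (the name colors_by_group is only illustrative and not part of the original answer):

```r
# Data frame from the question
x <- c("a1", "a2", "a3", "a4", "a1", "a2", "a3", "a4")
y <- c("red", "yellow", "blue", "green", "black", "pink", "purple", "orange")
df <- data.frame(x, y, stringsAsFactors = FALSE)

# split() returns a named list with one character vector per group
colors_by_group <- split(df$y, df$x)

# Individual vectors can be looked up by group name
colors_by_group[["a1"]]

# The list names are the group labels
names(colors_by_group)
```

This gives the same vector as the filter(x == "a1") %>% pull(y) call in the question, but for every group at once.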

Or we can use nest:

library(tidyverse)

df %>%
  group_by(x) %>%
  nest() %>%
  mutate(data = data %>% map(pull, y)) %>%
  pull(data) %>%
  setNames(unique(df$x))

Result:

$a1
[1] "red"   "black"

$a2
[1] "yellow" "pink"  

$a3
[1] "blue"   "purple"

$a4
[1] "green"  "orange"

Answer 2

Another solution using dplyr and purrr:

library(dplyr)
library(purrr)

df %>% 
  split(.$x) %>% 
  map(pull, y)

Result:

$a1
[1] "red"   "black"

$a2
[1] "yellow" "pink"  

$a3
[1] "blue"   "purple"

$a4
[1] "green"  "orange"
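
One thing that may be worth checking on real data: split() orders its output by the sorted levels of the grouping variable, not by order of first appearance. The example data happens to be sorted already, so the difference does not show there. A small base R sketch with made-up data (groups, values, and by_group are hypothetical names):

```r
# Hypothetical data where the groups do not first appear in sorted order
groups <- c("b", "a", "b", "a")
values <- c("red", "blue", "green", "pink")

by_group <- split(values, groups)

# List names follow the sorted factor levels
names(by_group)   # "a" "b"

# unique() follows order of first appearance instead
unique(groups)    # "b" "a"
```

For that reason it seems safer to read the names from the list itself than to attach them separately with unique().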

R: extract a column as a vector from a grouped data frame
