
Question

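I have a data frame with a list column of university affiliations. Each element is a character vector whose entries are the parts of one affiliation (department, university, city, and so on). The vectors are not all the same length, and I only want to keep the first three entries of each. I was hoping for something like: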

library(tidyverse)

universities %>%
  mutate(Affiliations = map(Affiliations, pluck, 1:3))

But pluck only selects a single element. Any ideas?

Here is the dput output:

structure(list(Affiliations = list(c("center for advancing electronics dresden (cfaed) tu dresden", 
" dresden", " 01062", " germany"), c("roxelyn and richard pepper department of communication sciences and disorders", 
" northwestern university", " evanston", " il  60208", " united states"
), c("the hugh knowles hearing research center", " northwestern university", 
" evanston", " il  60208", " united states"), c("lodz university", 
" lodz", " poland"), c("cad department", " l'viv polytechnic national university", 
" l'viv", " ukraine"))), row.names = c(NA, -5L), class = c("tbl_df", 
"tbl", "data.frame"))

Answer 1

You can use a custom lambda function:

Sample data

universities

# # A tibble: 5 x 1
#   Affiliations
#   <list>      
# 1 <chr [4]>   
# 2 <chr [5]>   
# 3 <chr [5]>   
# 4 <chr [3]>   
# 5 <chr [4]>

Custom lambda function

universities %>% 
  mutate(Affiliations = map(Affiliations, ~ .[1:3]))

# # A tibble: 5 x 1
#   Affiliations
#   <list>      
# 1 <chr [3]>   
# 2 <chr [3]>   
# 3 <chr [3]>   
# 4 <chr [3]>   
# 5 <chr [3]>
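One detail worth knowing about the `.[1:3]` approach: integer indexing pads with NA when a vector has fewer than three entries. Every vector in the sample data has at least three, so it makes no difference there. The following is a minimal base R sketch on made-up toy data (the `affils` list and its two-entry vector are assumptions for illustration, not part of the question) comparing it with `head()`:

```r
# Toy affiliation vectors; the second is deliberately shorter than three.
affils <- list(
  c("cad department", " l'viv polytechnic national university", " l'viv", " ukraine"),
  c("lodz university", " lodz")
)

# Integer indexing always returns three entries, padding with NA.
padded <- lapply(affils, function(x) x[1:3])

# head() returns at most three entries and never pads.
trimmed <- lapply(affils, head, 3)

lengths(padded)   # 3 3
lengths(trimmed)  # 3 2
```

If the result is going to be unnested into a fixed set of columns, the NA padding may be exactly what you want; if the list column is kept as is, `head()` avoids introducing missing values.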

Unnest wider (if needed)

universities %>% 
  mutate(Affiliations = map(Affiliations, ~ .[1:3])) %>% 
  unnest_wider(Affiliations, names_repair = ~ c("v1", "v2", "v3"))

# # A tibble: 5 x 3
#   v1                                           v2                       v3      
#   <chr>                                        <chr>                    <chr>   
# 1 center for advancing electronics dresden (c~ " dresden"               " 01062"
# 2 roxelyn and richard pepper department of co~ " northwestern universi~ " evans~
# 3 the hugh knowles hearing research center     " northwestern universi~ " evans~
# 4 lodz university                              " lodz"                  " polan~
# 5 cad department                               " l'viv polytechnic nat~ " l'viv"

Answer 2

Just use lapply with the bracket function `[`.

res <- lapply(universities$Affiliations, `[`, 1:3)
res
# [[1]]
# [1] "center for advancing electronics dresden (cfaed) tu dresden" " dresden"                                                   
# [3] " 01062"                                                     
# 
# [[2]]
# [1] "roxelyn and richard pepper department of communication sciences and disorders"
# [2] " northwestern university"                                                     
# [3] " evanston"                                                                    
# 
# [[3]]
# [1] "the hugh knowles hearing research center" " northwestern university"                 " evanston"                               
# 
# [[4]]
# [1] "lodz university" " lodz"           " poland"        
# 
# [[5]]
# [1] "cad department"                         " l'viv polytechnic national university" " l'viv"
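This works because `[` is an ordinary function in R, and lapply passes any extra arguments on to it, so each call becomes `[`(x, 1:3), which is the same as x[1:3]. A small sketch on toy data (the `affils` list is an assumption for illustration) showing that the bracket form and an explicit anonymous function give the same result:

```r
affils <- list(
  c("lodz university", " lodz", " poland"),
  c("cad department", " l'viv polytechnic national university", " l'viv", " ukraine")
)

# Extra arguments to lapply are forwarded to the function, here `[`.
via_bracket <- lapply(affils, `[`, 1:3)

# The same operation written as an explicit anonymous function.
via_lambda <- lapply(affils, function(x) x[1:3])

identical(via_bracket, via_lambda)  # TRUE
```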

If you like, the result can then be combined into a data frame with rbind.data.frame.

res.df <- setNames(do.call(rbind.data.frame, res), c("V1", "V2", "V3"))
res.df
#                                                                              V1                                     V2        V3
# 1                   center for advancing electronics dresden (cfaed) tu dresden                                dresden     01062
# 2 roxelyn and richard pepper department of communication sciences and disorders                northwestern university  evanston
# 3                                      the hugh knowles hearing research center                northwestern university  evanston
# 4                                                               lodz university                                   lodz    poland
# 5                                                                cad department  l'viv polytechnic national university     l'viv
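The entries in the dput output carry a leading space (for example " dresden"), which survives into the combined data frame. If that is unwanted, one possible cleanup is to trim each vector before binding. This is a sketch on a two-row toy version of `res`, not something the answer itself proposes; it also binds through a character matrix rather than rbind.data.frame:

```r
# Toy stand-in for the list produced by lapply above.
res <- list(
  c("lodz university", " lodz", " poland"),
  c("cad department", " l'viv polytechnic national university", " l'viv")
)

# Strip leading and trailing whitespace from every entry.
res_clean <- lapply(res, trimws)

# Stack the equal-length vectors into a character matrix, then convert.
res_df <- as.data.frame(do.call(rbind, res_clean), stringsAsFactors = FALSE)
names(res_df) <- c("V1", "V2", "V3")
res_df
```

Binding through a matrix only works when every vector has the same length, which is the case here after taking the first three entries.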

Extracting the first x elements from a list column in R?
