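I am working with git2r and want to produce some basic statistics about project activity. git2r returns all commits as a list of S4 objects. Below is the structure of the first object: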
> library(git2r)
> repo <- repository('/Users/swain/Dropbox/projects/from-github/brakeman')
> last3 <- commits(repo, n=3)
> str(last3)
List of 3
$ :Formal class 'git_commit' [package "git2r"] with 6 slots
.. ..@ sha : chr "f7746c21846d895bd90632df5a2366381ced77d9"
.. ..@ author :Formal class 'git_signature' [package "git2r"] with 3 slots
.. .. .. ..@ name : chr "Justin"
.. .. .. ..@ email: chr "presidentbeef@users.noreply.github.com"
.. .. .. ..@ when :Formal class 'git_time' [package "git2r"] with 2 slots
.. .. .. .. .. ..@ time : num 1.5e+09
.. .. .. .. .. ..@ offset: num -420
.. ..@ committer:Formal class 'git_signature' [package "git2r"] with 3 slots
.. .. .. ..@ name : chr "GitHub"
.. .. .. ..@ email: chr "noreply@github.com"
.. .. .. ..@ when :Formal class 'git_time' [package "git2r"] with 2 slots
.. .. .. .. .. ..@ time : num 1.5e+09
.. .. .. .. .. ..@ offset: num -420
.. ..@ summary : chr "Merge pull request #1056 from presidentbeef/hash_access_interpolation_performance_improvements"
.. ..@ message : chr "Merge pull request #1056 from presidentbeef/hash_access_interpolation_performance_improvements\n\nHash access i"| __truncated__
.. ..@ repo :Formal class 'git_repository' [package "git2r"] with 1 slot
.. .. .. ..@ path: chr "/Users/swain/Dropbox/projects/from-github/brakeman"
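I have searched high and low for a way to extract one slot from every object into a list. For example, for all the S4 objects in the list last3, I would like to pull author into a new list. Note that the objects are nested, so I may also want to build a list from an object that sits inside a slot of the top-level object.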
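Ultimately, I want to start creating plots and summaries of the various fields. For example, a bar chart of commits by day of the week, box plots of message length by committer, that sort of thing. Is converting slots to lists or vectors the wrong way to go about this? (Edit: s/histogram/bar chart/, doh)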
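Answer 0 (score: 2)
Here is a tidyverse solution for what you are trying to achieve. Jenny Bryan has a nice set of introductory documents on using purrr (and other packages) for this kind of task: https://jennybc.github.io/purrr-tutorial/.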
library(git2r)
library(dplyr)
library(ggplot2)
library(purrr)
library(lubridate)
options(stringsAsFactors = FALSE)
repo <- repository("/git-repos/brakeman/")
# Get the relevant bits out of the list of commits
analysis_df <-
  repo %>%
  commits(n = 50) %>%
  map_df(
    ~ data.frame(
      name = .@author@name,
      date = .@author@when@time %>% as.POSIXct(origin = "1970-01-01"),
      message = .@message
    )
  )
# Bar chart of commits by day of the week
analysis_df %>%
  mutate(weekday = weekdays(date)) %>%
  group_by(weekday) %>%
  tally() %>%
  ggplot(aes(x = weekday, y = n)) +
  geom_bar(stat = "identity")
# Box plots of the message length by committer
analysis_df %>%
  mutate(message_length = nchar(message)) %>%
  ggplot(aes(x = name, y = message_length)) +
  geom_boxplot()
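One thing to watch for in the weekday plot: ggplot2 orders a character x axis alphabetically, so the days will not appear in calendar order. The following is a minimal base R sketch of one way around that, assuming English day names are acceptable. The `dates` vector and `day_levels` are made up for illustration and stand in for `analysis_df$date`.

```r
# Three made-up commit timestamps (seconds since the epoch), one day apart.
dates <- as.POSIXct(c(1500000000, 1500086400, 1500172800),
                    origin = "1970-01-01", tz = "UTC")

# Calendar order for the factor levels (hypothetical helper vector).
day_levels <- c("Sunday", "Monday", "Tuesday", "Wednesday",
                "Thursday", "Friday", "Saturday")

# as.POSIXlt()$wday is 0 for Sunday through 6 for Saturday and does not
# depend on the locale, unlike weekdays().
weekday <- factor(day_levels[as.POSIXlt(dates)$wday + 1], levels = day_levels)

# Counts per day, reported in calendar order (empty days included).
table(weekday)
```

Using a factor like this as the x aesthetic keeps the bars in the order of the levels; adding `scale_x_discrete(drop = FALSE)` should also keep days with no commits on the axis.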
Answer 1 (score: 1)
How about:
lapply(last3,function(x) data.frame(author = x@author@name, email = x@author@email))
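The call above returns a list with one data frame per commit. As a minimal sketch of how the same idea extends to pulling out a whole slot, a nested slot, or a single combined data frame, here is a self-contained example. The `toy_signature` and `toy_commit` classes are hypothetical stand-ins for the nested S4 structure shown in the question, not git2r's own classes, so the code runs without a repository.

```r
library(methods)

# Toy stand-ins for the nested S4 structure (hypothetical, not git2r classes).
setClass("toy_signature", slots = c(name = "character", email = "character"))
setClass("toy_commit", slots = c(sha = "character", author = "toy_signature"))

commits_list <- list(
  new("toy_commit", sha = "a1",
      author = new("toy_signature", name = "Ann", email = "ann@example.com")),
  new("toy_commit", sha = "b2",
      author = new("toy_signature", name = "Bob", email = "bob@example.com"))
)

# One slot from every object, as a list of S4 objects.
authors <- lapply(commits_list, function(x) x@author)

# A nested slot from every object, as a character vector.
author_names <- vapply(commits_list, function(x) x@author@name, character(1))

# One data frame per commit, stacked into a single data frame.
author_df <- do.call(rbind, lapply(commits_list, function(x) {
  data.frame(author = x@author@name, email = x@author@email,
             stringsAsFactors = FALSE)
}))
```

`vapply` is used rather than `sapply` so that the result is guaranteed to be a character vector with one entry per commit.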