Social activities on GitHub
In my last post I explored the Bioconductor submissions, using {gh} to retrieve some data.
After some feedback from the Bioconductor community I realized I should download other kinds of data to improve my analysis of the reviews.
To do this I developed a new package to retrieve information from GitHub.
Learning
While developing this package I learned more about the {gh} package (in the previous post I wrote the calls to the different pages manually, and later discovered that {gh} handles this automatically).
I also learned that the accept header influences how much information is returned (and that you cannot pass several accept headers at the same time).
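As a rough sketch of both points, this is how a single accept header and automatic pagination can be passed to {gh}. The helper name fetch_issue_comments is made up for illustration and is not a function of any package. The "full" media type is one of GitHub's documented options that returns the raw, text and HTML versions of a comment body. Running it needs the {gh} package and network access.

```r
# Hypothetical helper (not part of {gh} or socialGH): fetch every comment
# of one issue, asking GitHub for a single media type.
fetch_issue_comments <- function(owner, repo, issue,
                                 accept = "application/vnd.github.v3.full+json") {
  # Only one accept header can be sent per request
  stopifnot(is.character(accept), length(accept) == 1)
  gh::gh(
    "GET /repos/{owner}/{repo}/issues/{issue_number}/comments",
    owner = owner, repo = repo, issue_number = issue,
    .accept = accept, # media type that decides which fields come back
    .limit = Inf      # let {gh} follow the pagination for us
  )
}

# Only run the request in an interactive session (it needs network access)
if (interactive()) {
  comments <- fetch_issue_comments("llrs", "blogR", 23)
  length(comments)
}
```

To compare two media types, the function has to be called once per accept header and the results merged afterwards.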
I hope to learn more about the R community that uses GitHub as a way to help each other and to improve packages and processes.
Reproducibility
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.1 (2020-06-06)
## os Ubuntu 20.04.1 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Europe/Madrid
## date 2021-01-08
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.1)
## backports 1.2.1 2020-12-09 [1] CRAN (R 4.0.1)
## blogdown 0.21.84 2021-01-07 [1] Github (rstudio/blogdown@c4fbb58)
## bookdown 0.21 2020-10-13 [1] CRAN (R 4.0.1)
## broom 0.7.3 2020-12-16 [1] CRAN (R 4.0.1)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.0.1)
## cli 2.2.0 2020-11-20 [1] CRAN (R 4.0.1)
## colorspace 2.0-0 2020-11-11 [1] CRAN (R 4.0.1)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.1)
## curl 4.3 2019-12-02 [1] CRAN (R 4.0.1)
## DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.1)
## dbplyr 2.0.0 2020-11-03 [1] CRAN (R 4.0.1)
## digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.1)
## dplyr * 1.0.2 2020-08-18 [1] CRAN (R 4.0.1)
## ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.1)
## evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.1)
## fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.1)
## forcats * 0.5.0 2020-03-01 [1] CRAN (R 4.0.1)
## fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.1)
## generics 0.1.0 2020-10-31 [1] CRAN (R 4.0.1)
## ggplot2 * 3.3.3 2020-12-30 [1] CRAN (R 4.0.1)
## gh 1.2.0 2020-11-27 [1] CRAN (R 4.0.1)
## gitcreds 0.1.1 2020-12-04 [1] CRAN (R 4.0.1)
## glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.1)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.1)
## haven 2.3.1 2020-06-01 [1] CRAN (R 4.0.1)
## hms 0.5.3 2020-01-08 [1] CRAN (R 4.0.1)
## htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.1)
## httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.1)
## jsonlite 1.7.2 2020-12-09 [1] CRAN (R 4.0.1)
## knitr 1.30 2020-09-22 [1] CRAN (R 4.0.1)
## lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.1)
## lubridate 1.7.9.2 2020-11-13 [1] CRAN (R 4.0.1)
## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.1)
## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.0.1)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.1)
## pillar 1.4.7 2020-11-20 [1] CRAN (R 4.0.1)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.1)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.0.1)
## R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.1)
## Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.1)
## readr * 1.4.0 2020-10-05 [1] CRAN (R 4.0.1)
## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.0.1)
## reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.1)
## rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.1)
## rmarkdown 2.6 2020-12-14 [1] CRAN (R 4.0.1)
## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.0.1)
## rvest 0.3.6 2020-07-25 [1] CRAN (R 4.0.1)
## scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.1)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.1)
## socialGH * 0.0.3 2020-08-17 [1] local
## stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.1)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.0.1)
## tibble * 3.0.4 2020-10-12 [1] CRAN (R 4.0.1)
## tidyr * 1.1.2 2020-08-27 [1] CRAN (R 4.0.1)
## tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.1)
## tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 4.0.1)
## vctrs 0.3.6 2020-12-17 [1] CRAN (R 4.0.1)
## withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.1)
## xfun 0.20 2021-01-06 [1] CRAN (R 4.0.1)
## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.1)
## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.1)
##
## [1] /home/lluis/bin/R/4.0.1/lib/R/library
socialGH
This package, based on {gh}, allows you to retrieve data from GitHub. You can install it with:
Basically, it pulls the data in list format and transforms it into a data.frame so that it can be filtered and analyzed.
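To give an idea of that transformation, here is a minimal sketch in base R. It is not the code of socialGH. The sample list only mimics the nested shape of what the GitHub API returns for issues, with made-up values, and issue_to_row is a hypothetical helper.

```r
# Made-up sample with the nested shape of the GitHub API response for issues
issues <- list(
  list(number = 1L, title = "First issue", state = "open",
       user = list(login = "alice"), comments = 2L,
       labels = list(list(name = "bug"), list(name = "help wanted"))),
  list(number = 2L, title = "Second issue", state = "closed",
       user = list(login = "bob"), comments = 0L,
       labels = list())
)

# Hypothetical helper: flatten one issue into a one-row data.frame
issue_to_row <- function(x) {
  # Collapse the label names into a single string ("" when there are none)
  label_names <- vapply(x$labels, function(l) l$name, character(1))
  data.frame(
    number = x$number,
    title = x$title,
    state = x$state,
    user = x$user$login,
    n_comments = x$comments,
    labels = paste(label_names, collapse = ", "),
    stringsAsFactors = FALSE
  )
}

# One row per issue
issues_df <- do.call(rbind, lapply(issues, issue_to_row))

# Once it is a data.frame, filtering is straightforward
open_issues <- issues_df[issues_df$state == "open", ]
```

Collapsing the labels into one string keeps the example simple; a list-column would preserve them as separate values.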
It allows you to selectively download comments, pull requests, issues, events, labels and the timeline of an issue.
With the issues we can see the labels, the number of comments and much more information:
However, it doesn’t retrieve each comment of an issue.
We can see that I was the only one writing on the issues and we already retrieved the text of the comments.
We can also look for events on issues:
In all the functions you can provide an issue number and you’ll retrieve the information just for that issue. If you don’t provide one, it will search the whole repository:
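One way such a switch could be written on top of {gh} is sketched below. This is an assumption about the general approach, not the actual implementation of socialGH. The helpers events_endpoint and fetch_events are hypothetical names, while the two endpoints are the ones GitHub documents for issue events.

```r
# Hypothetical helper: choose the endpoint template for issue events.
# With issue = NULL the request covers the whole repository.
events_endpoint <- function(issue = NULL) {
  if (is.null(issue)) {
    "GET /repos/{owner}/{repo}/issues/events"
  } else {
    stopifnot(is.numeric(issue), length(issue) == 1)
    "GET /repos/{owner}/{repo}/issues/{issue_number}/events"
  }
}

# Hypothetical wrapper around gh::gh(); needs {gh} and network access
fetch_events <- function(repository, issue = NULL) {
  # Split "owner/repo" into its two parts
  parts <- strsplit(repository, "/", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  args <- list(events_endpoint(issue),
               owner = parts[1], repo = parts[2], .limit = Inf)
  # Only add the issue number when a single issue was requested
  if (!is.null(issue)) args$issue_number <- issue
  do.call(gh::gh, args)
}
```

With this sketch, fetch_events("llrs/blogR") would ask for the events of the whole repository and fetch_events("llrs/blogR", 23) only for that issue.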
However, it is better to look at the timeline of an issue, which downloads each comment of the issue:
With the timeline we don’t get the initial information about when the issue was created, so we need to call
get_issue("llrs/blogR", 23)
to know that. Here I omitted the text of the comments to make it readable, but we can see what has been happening, who did it and who was affected.