R quarantine house
So I found this funny tweet
What's your R quarantine house? I'm definitely 5 pic.twitter.com/h7aiijOqK0
— Jacqueline Nolis (@skyetetra) April 24, 2020
And Tyler Morgan made the “joke” to check the dependencies. So, let’s check them:
List libraries
First we set up the original choices:
env1 <- c("ggplot2", "dplyr", "data.table", "purrr")
env2 <- c("forecats", "glue", "jsonlite", "rmarkdown")
env3 <- c("shiny", "rayshader", "stringr", "tidytext")
env4 <- c("devtools", "xml2", "tidyr", "tibble")
env5 <- c("reticulate", "keras", "plumber", "usethis")
env6 <- c("blogdown", "brickr", "lubridate", "igraph")
quarantines <- list(env1 = env1, env2 = env2,
env3 = env3, env4 = env4,
env5 = env5, env6 = env6)
Dependencies
All of them are on CRAN (and I don’t have them installed on my computer) so let’s retrieve the available packages from CRAN. Then we can check how many unique packages are needed for each one:
library("tools")
ap <- available.packages()
unique_dep <- function(sets, db) {
pd <- package_dependencies(packages = sets, recursive = TRUE, db = db)
unique(unlist(pd))
}
uniq_p <- lapply(quarantines, unique_dep, db = ap)
sort(lengths(uniq_p))
## env2 env1 env5 env3 env4 env6
## 22 59 63 89 91 96
So the environment with more dependencies is the third and the second is the one with least dependencies.
Similarity of the environments
We’ve seen that the number of package is quite different. But how many of them is shared?
A little time ago I wrote a package aimed to this: {BioCor
} you can install it from Bioconductor. I’ll use it now:
library("BioCor")
similarity <- mpathSim(names(uniq_p), inverseList(uniq_p), method = NULL)
similarity
## env1 env2 env3 env4 env5 env6
## env1 1.0000000 0.2716049 0.5675676 0.5733333 0.5081967 0.7612903
## env2 0.2716049 1.0000000 0.3783784 0.3716814 0.3294118 0.3728814
## env3 0.5675676 0.3783784 1.0000000 0.5666667 0.4868421 0.7783784
## env4 0.5733333 0.3716814 0.5666667 1.0000000 0.6623377 0.6737968
## env5 0.5081967 0.3294118 0.4868421 0.6623377 1.0000000 0.5031447
## env6 0.7612903 0.3728814 0.7783784 0.6737968 0.5031447 1.0000000
The closer to 1 it means that they share more dependencies, so the most different are the environment 1 and the environment 2 We can see that the most similar packages are the environment 1 and environment 6 and that the environment 6 is the one with higher similarity to the other sets.
Which quarantine environment has some of the others?
So some of these environments call other packages from the other environments as dependencies. We can now look for how many of them:
inside_calls <- lapply(uniq_p, function(x, y) {
# Look how many packages of each set is on the dependencies of this set
vapply(y, function(z, x) {
sum(z %in% x)
}, x = x, numeric(1L))
}, y = quarantines)
# Simplify and name for easier understanding
inside <- simplify2array(inside_calls)
names(dimnames(inside)) <- list("Package of", "Inside of")
inside
## Inside of
## Package of env1 env2 env3 env4 env5 env6
## env1 1 0 2 2 1 3
## env2 2 2 2 2 2 3
## env3 0 1 2 1 0 2
## env4 1 0 1 2 0 2
## env5 0 0 0 1 1 0
## env6 0 0 0 0 0 0
colSums(inside)-diag(inside) # To avoid counting self-dependencies
## env1 env2 env3 env4 env5 env6
## 3 1 5 6 3 10
We can see that environment 6 has more packages from the other environments.
Chances of survival:
Someone mentioned that the {survival}
package wasn’t on any environment.
But it might be on the dependencies:
vapply(uniq_p, function(x){"survival" %in% x}, logical(1L))
## env1 env2 env3 env4 env5 env6
## FALSE FALSE FALSE FALSE FALSE FALSE
No, it seems like we won’t survive well with this environments :)
Conclusions
Environment 6 is the one with more packages from the other environments, but if you want to have more packages use the second one. What you can do with these packages on a quarantine is harder to say :D
Reproducibility
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.0.1 (2020-06-06)
## os Ubuntu 20.04.1 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Europe/Madrid
## date 2021-01-08
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date lib source
## annotate 1.68.0 2020-10-27 [1] Bioconductor
## AnnotationDbi 1.52.0 2020-10-27 [1] Bioconductor
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.1)
## Biobase 2.50.0 2020-10-27 [1] Bioconductor
## BiocGenerics 0.36.0 2020-10-27 [1] Bioconductor
## BioCor * 1.14.0 2020-10-27 [1] Bioconductor
## BiocParallel 1.24.1 2020-11-06 [1] Bioconductor
## bit 4.0.4 2020-08-04 [1] CRAN (R 4.0.1)
## bit64 4.0.5 2020-08-30 [1] CRAN (R 4.0.1)
## blob 1.2.1 2020-01-20 [1] CRAN (R 4.0.1)
## blogdown 0.21.84 2021-01-07 [1] Github (rstudio/blogdown@c4fbb58)
## bookdown 0.21 2020-10-13 [1] CRAN (R 4.0.1)
## cli 2.2.0 2020-11-20 [1] CRAN (R 4.0.1)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.1)
## DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.1)
## digest 0.6.27 2020-10-24 [1] CRAN (R 4.0.1)
## evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.1)
## fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.1)
## glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.1)
## graph 1.68.0 2020-10-27 [1] Bioconductor
## GSEABase 1.52.1 2020-12-11 [1] Bioconductor
## htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.1)
## httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.1)
## IRanges 2.24.1 2020-12-12 [1] Bioconductor
## knitr 1.30 2020-09-22 [1] CRAN (R 4.0.1)
## lattice 0.20-41 2020-04-02 [1] CRAN (R 4.0.1)
## magrittr 2.0.1 2020-11-17 [1] CRAN (R 4.0.1)
## Matrix 1.3-2 2021-01-06 [1] CRAN (R 4.0.1)
## memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.1)
## R6 2.5.0 2020-10-28 [1] CRAN (R 4.0.1)
## Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.1)
## rlang 0.4.10 2020-12-30 [1] CRAN (R 4.0.1)
## rmarkdown 2.6 2020-12-14 [1] CRAN (R 4.0.1)
## RSQLite 2.2.1 2020-09-30 [1] CRAN (R 4.0.1)
## S4Vectors 0.28.1 2020-12-09 [1] Bioconductor
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.1)
## stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.1)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 4.0.1)
## vctrs 0.3.6 2020-12-17 [1] CRAN (R 4.0.1)
## withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.1)
## xfun 0.20 2021-01-06 [1] CRAN (R 4.0.1)
## XML 3.99-0.5 2020-07-23 [1] CRAN (R 4.0.1)
## xtable 1.8-4 2019-04-21 [1] CRAN (R 4.0.1)
## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.1)
##
## [1] /home/lluis/bin/R/4.0.1/lib/R/library