Concepts around open source/free software

This post lays out some concepts I picked up after reading “Working in Public: The Making and Maintenance of Open Source Software”. Having these concepts in mind might help me with my contributions to R and OSS in general. I am writing these thoughts down so I can come back to them in future posts.

The book classifies projects along two axes, contributor growth and user growth:

|                         | High user growth | Low user growth |
|-------------------------|------------------|-----------------|
| High contributor growth | Federations      | Clubs           |
| Low contributor growth  | Stadiums         | Toys            |
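
To make the two axes concrete, the classification can be sketched as a small lookup. This is only an illustration in Python: the function name `classify_project` and the idea of encoding each axis as a high/low boolean are my own assumptions, not something the book defines.

```python
def classify_project(high_contributor_growth: bool, high_user_growth: bool) -> str:
    """Return the project type for a pair of growth levels.

    The boolean encoding of "high" vs "low" growth is an assumption made
    for this sketch; the book does not give numeric thresholds here.
    """
    project_types = {
        (True, True): "Federation",
        (True, False): "Club",
        (False, True): "Stadium",
        (False, False): "Toy",
    }
    return project_types[(high_contributor_growth, high_user_growth)]


# A project with many users but few new contributors:
print(classify_project(high_contributor_growth=False, high_user_growth=True))
```

Where to draw the line between high and low growth is left open on purpose, since it probably differs between communities.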

It also characterizes projects according to the following aspects:

  • Technical

  • Support

  • Ease of participation

  • User adoption

  • Contributor growth

Code seems like a common good, which requires the following characteristics:

  • Intrinsic motivation

  • Modular

  • Granular

  • Low cost of coordination

In the author’s opinion, only maintainers are interested in the success of the whole community, and they need to make trade-offs between the different interests of the community around the project.

Motivation is very important, and I classify it by its source and its sign:

|           | Positive        | Negative                      |
|-----------|-----------------|-------------------------------|
| Intrinsic | Learn skills    | Burn out                      |
| Extrinsic | Social benefits | Friction, or lack of feedback |

Following the book, contributors can be grouped into two types:

  • Invested: Lurk before making a contribution, learn the quirks of the community

  • Casual: Adding value to themselves and others

Contributors might spend a lot of time learning about the community before making their first contribution (or showing themselves). That’s why knowing only that this is someone’s first contribution doesn’t tell us whether they will continue contributing to the project.

Users can be classified into two groups: passive users, who use the software and nothing else, and active users. Active users might do one of the following:

  • Educate others: write a blog post, or material

  • Spread the word: Announce they use the software

  • Support: Answer other users’ questions

  • File bug reports

The health of the project depends on its popularity, its dependencies, and the active and future maintenance of the software.

However, the book says that not all contributors are the same. For instance, losing a maintainer causes more harm than losing a casual contributor.

The source of this is that software is like a puppy. The value of the code lies in how alive it is: static code has no value, but once it is being used it becomes very valuable.

For this reason, maintenance costs are very high once there are users. However, in general there are few ways to know how many users a piece of software has.

This produces marginal costs for maintainers, which depend on what kind of good the software is:

|               | Excludable    | Non-excludable |
|---------------|---------------|----------------|
| Rivalrous     | Private goods | Common goods   |
| Non-rivalrous | Club goods    | Public goods   |
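
The same two-way table of goods can be written as a lookup, which makes it easy to check where a given resource falls. This is a minimal sketch in Python; the function name `classify_good` and the boolean flags are hypothetical names chosen for illustration.

```python
def classify_good(rivalrous: bool, excludable: bool) -> str:
    """Return the kind of good for a (rivalrous, excludable) pair.

    rivalrous:  one person's use reduces what is left for others.
    excludable: people can be prevented from using it.
    """
    kinds_of_goods = {
        (True, True): "Private good",
        (True, False): "Common good",
        (False, True): "Club good",
        (False, False): "Public good",
    }
    return kinds_of_goods[(rivalrous, excludable)]


# Something that gets used up but that nobody can be kept away from:
print(classify_good(rivalrous=True, excludable=False))
```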

Costs are mainly the attention that users and contributors demand from maintainers. Users are like cars on a highway: initially there is no problem, but at high levels of traffic adding new lanes doesn’t solve the traffic jam.

However, costs also increase with the number of requests, the bandwidth needed to download the software, and hosting.

More users lead to more requests, which compete for maintainers’ time and push them to do less proactive work and more reactive work.

This leads projects to start with a very simple organization, evolve into a disorganized complexity, and then into an organized complexity just to cope with the costs of the project. In this organized complexity, relationships with maintainers become important.

Value = usage + dependencies - maintenance + substitutability + switching cost + enabling, while Cost = development + maintenance + attention.
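
As a rough sketch, the two sums can be written out in Python. Everything beyond the terms and signs listed above is an assumption for illustration: the class name `ProjectEstimate`, the idea that each term can be scored on a common numeric scale, and the example numbers, which are arbitrary and not measurements of any real project.

```python
from dataclasses import dataclass


@dataclass
class ProjectEstimate:
    """Hypothetical scores for each term, all on the same arbitrary scale."""

    usage: float
    dependencies: float
    maintenance: float
    substitutability: float
    switching_cost: float
    enabling: float
    development: float
    attention: float

    def value(self) -> float:
        # Terms and signs follow the Value expression as written in the text.
        return (
            self.usage
            + self.dependencies
            - self.maintenance
            + self.substitutability
            + self.switching_cost
            + self.enabling
        )

    def cost(self) -> float:
        # Terms follow the Cost expression as written in the text.
        return self.development + self.maintenance + self.attention


# Arbitrary example scores, only to show how the sums combine.
estimate = ProjectEstimate(
    usage=5, dependencies=3, maintenance=2, substitutability=1,
    switching_cost=1, enabling=2, development=4, attention=3,
)
print(estimate.value(), estimate.cost())
```

Note that maintenance appears in both expressions: it is subtracted from the value and added to the cost.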

The common (and scarce) good is the attention of both maintainers and developers. This requires a judgement call about which kind of requests they dedicate their time to: extractive or non-extractive requests.

Once maintainers have enough reputation/recognition, the additional benefit for them is almost nonexistent.

The book cites several communities (Python, Ruby, Linux, JavaScript, Java), but I don’t think it used the R community as a source. So what are the implications of these concepts for the R community? How do we help maintainers keep up with their work or let in new maintainers?

Reproducibility

## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       Ubuntu 20.04.4 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Europe/Madrid
##  date     2022-03-16
##  pandoc   2.17.1.1 @ /usr/lib/rstudio/bin/quarto/bin/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  blogdown      1.8.1   2022-02-19 [1] Github (rstudio/blogdown@9af7733)
##  bookdown      0.24    2021-09-02 [1] CRAN (R 4.1.2)
##  bslib         0.3.1   2021-10-06 [1] CRAN (R 4.1.2)
##  cli           3.2.0   2022-02-14 [1] CRAN (R 4.1.2)
##  digest        0.6.29  2021-12-01 [1] CRAN (R 4.1.2)
##  evaluate      0.15    2022-02-18 [1] CRAN (R 4.1.2)
##  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.2)
##  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.2)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.1.2)
##  jsonlite      1.8.0   2022-02-22 [1] CRAN (R 4.1.2)
##  knitr         1.37    2021-12-16 [1] CRAN (R 4.1.2)
##  magrittr      2.0.2   2022-01-26 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.2)
##  rlang         1.0.2   2022-03-04 [1] CRAN (R 4.1.2)
##  rmarkdown     2.13    2022-03-10 [1] CRAN (R 4.1.2)
##  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.2)
##  sass          0.4.0   2021-05-12 [1] CRAN (R 4.1.2)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
##  stringi       1.7.6   2021-11-29 [1] CRAN (R 4.1.2)
##  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.2)
##  xfun          0.30    2022-03-02 [1] CRAN (R 4.1.2)
##  yaml          2.3.5   2022-02-21 [1] CRAN (R 4.1.2)
## 
##  [1] /home/lluis/bin/R/4.1.2/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Lluís Revilla Sancho
Bioinformatician

Bioinformatician with interests in functional enrichment, data integration and transcriptomics.