Concepts around open source/free software
This post lays out some concepts I picked up from reading “Working in Public: The Making and Maintenance of Open Source Software”. Keeping these concepts in mind might help with my contributions to R and to OSS in general. I am writing these thoughts down so I can come back to them in future posts.
The book classifies projects along two axes, contributor growth and user growth:
| | High user growth | Low user growth |
|---|---|---|
| High contributor growth | Federations | Clubs |
| Low contributor growth | Stadiums | Toys |
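
As a minimal sketch, the table above can be read as a lookup keyed on the two growth axes. The function name `classify_project` and its boolean arguments are hypothetical choices for illustration. Nothing here says where "high" growth starts, so that judgement is left to the caller.

```python
def classify_project(high_contributor_growth: bool, high_user_growth: bool) -> str:
    """Return the project type for a combination of the two growth axes."""
    # Keys are (high contributor growth, high user growth), as in the table.
    project_types = {
        (True, True): "Federation",
        (True, False): "Club",
        (False, True): "Stadium",
        (False, False): "Toy",
    }
    return project_types[(high_contributor_growth, high_user_growth)]


# A project with few contributors but many users would be a stadium.
print(classify_project(high_contributor_growth=False, high_user_growth=True))
```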
It also describes projects according to the following characteristics:
Technical
Support
Ease of participation
User adoption
Contributor growth
Code seems like a common good, which requires the following characteristics:
Intrinsic motivation
Modular
Granular
Low cost of coordination
In the author’s opinion, only maintainers are interested in the success of the whole community, and they need to make trade-offs between the different interests of the community around the project.
Motivation is very important, and I classify it by its source and its sign:
| | Positive | Negative |
|---|---|---|
| Intrinsic | Learn skills | Burnout |
| Extrinsic | Social benefits | Friction, or lack of feedback |
Following the book, contributors can be grouped into two types:
Invested: They lurk before making a contribution and learn the quirks of the community
Casual: They add value for themselves and others
Contributors might spend a lot of time learning about the community before making their first contribution (or showing themselves). That’s why knowing only that this is someone’s first contribution says little about whether they will continue contributing to the project.
Users can be classified into two groups: passive users, who use the software and nothing else, and active users. Active users might do one of the following:
Educate others: Write a blog post or other material
Spread the word: Announce that they use the software
Support: Answer other users’ questions
File bug reports
The health of the project depends on its popularity, its dependencies, and the active and future maintenance of the software.
However, the book says that not all contributors are equal. For instance, losing a maintainer causes more harm than losing a casual contributor.
The reason is that software is like a puppy. The value of code lies in how alive it is: static code has no value, but once it is being used it is very valuable.
For this reason, maintenance costs are very high once there are users. However, in general there are few ways to know how many users a piece of software has.
This produces marginal costs for maintainers, which depend on the kind of good involved:
| | Excludable | Non-excludable |
|---|---|---|
| Rivalrous | Private goods | Common goods |
| Non-rivalrous | Club goods | Public goods |
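
The table above can be sketched as a small lookup on the two properties. This is only an illustration: `classify_good` and its boolean arguments are hypothetical names, and treating maintainer attention as rivalrous and non-excludable in the usage line is an assumption about where attention sits in the table.

```python
def classify_good(excludable: bool, rivalrous: bool) -> str:
    """Return the kind of good for a combination of the two properties."""
    # Keys are (excludable, rivalrous), matching the columns and rows of the table.
    kinds = {
        (True, True): "Private good",
        (False, True): "Common good",
        (True, False): "Club good",
        (False, False): "Public good",
    }
    return kinds[(excludable, rivalrous)]


# Assumption: anyone can ask for a maintainer's attention (non-excludable),
# but attention spent on one request is not available for another (rivalrous).
print(classify_good(excludable=False, rivalrous=True))
```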
The main cost is the attention that users and contributors demand from maintainers. Users are like cars on a highway: initially there is no problem, but at high levels of traffic adding new lanes doesn’t solve the traffic jam.
However, costs also increase with the number of requests, the bandwidth needed to download the software, and hosting.
More users lead to more requests, which compete for maintainers’ time and push them to do less proactive work and more reactive work.
As a result, projects start with a very simple organization, evolve into disorganized complexity, and then into organized complexity just to cope with the costs of the project. In this organized complexity, relationships with maintainers become important.
Value = usage + dependencies - maintenance + substitutability + switching cost + enabling, while Cost = development + maintenance + attention.
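
To make the two sums concrete, here is a small sketch that computes them from made-up scores. The class `ProjectScores`, its field names, and the unitless numbers are all hypothetical. The signs follow the expressions as written above.

```python
from dataclasses import dataclass


@dataclass
class ProjectScores:
    """Hypothetical, unitless scores for each term of the two sums."""
    usage: float
    dependencies: float
    maintenance: float
    substitutability: float
    switching_cost: float
    enabling: float
    development: float
    attention: float

    def value(self) -> float:
        # Maintenance is the only term subtracted from the value.
        return (self.usage + self.dependencies - self.maintenance
                + self.substitutability + self.switching_cost + self.enabling)

    def cost(self) -> float:
        # Maintenance appears again here, this time as a cost.
        return self.development + self.maintenance + self.attention


# Made-up numbers, only to show how the terms combine.
scores = ProjectScores(usage=5, dependencies=3, maintenance=2, substitutability=1,
                       switching_cost=1, enabling=2, development=4, attention=3)
print(scores.value(), scores.cost())
```

Note that maintenance counts twice, lowering the value and raising the cost, so a project that is expensive to maintain is penalized on both sides.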
The common (and scarce) good is the attention of both maintainers and developers, who must make judgement calls about which kind of requests to dedicate their time to: extractive or non-extractive.
Once a maintainer has gained enough reputation or recognition, the additional benefit is almost nonexistent.
The book cites several communities (Python, Ruby, Linux, JavaScript, Java), but I don’t think it used the R community as a source. So what are the implications of these concepts for the R community? How do we help maintainers keep up with their work or bring in new maintainers?
Reproducibility
## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
## setting value
## version R version 4.1.2 (2021-11-01)
## os Ubuntu 20.04.4 LTS
## system x86_64, linux-gnu
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Europe/Madrid
## date 2022-03-16
## pandoc 2.17.1.1 @ /usr/lib/rstudio/bin/quarto/bin/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## blogdown 1.8.1 2022-02-19 [1] Github (rstudio/blogdown@9af7733)
## bookdown 0.24 2021-09-02 [1] CRAN (R 4.1.2)
## bslib 0.3.1 2021-10-06 [1] CRAN (R 4.1.2)
## cli 3.2.0 2022-02-14 [1] CRAN (R 4.1.2)
## digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.2)
## evaluate 0.15 2022-02-18 [1] CRAN (R 4.1.2)
## fastmap 1.1.0 2021-01-25 [1] CRAN (R 4.1.2)
## htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.2)
## jquerylib 0.1.4 2021-04-26 [1] CRAN (R 4.1.2)
## jsonlite 1.8.0 2022-02-22 [1] CRAN (R 4.1.2)
## knitr 1.37 2021-12-16 [1] CRAN (R 4.1.2)
## magrittr 2.0.2 2022-01-26 [1] CRAN (R 4.1.2)
## R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.2)
## rlang 1.0.2 2022-03-04 [1] CRAN (R 4.1.2)
## rmarkdown 2.13 2022-03-10 [1] CRAN (R 4.1.2)
## rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.2)
## sass 0.4.0 2021-05-12 [1] CRAN (R 4.1.2)
## sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2)
## stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.2)
## stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.2)
## xfun 0.30 2022-03-02 [1] CRAN (R 4.1.2)
## yaml 2.3.5 2022-02-21 [1] CRAN (R 4.1.2)
##
## [1] /home/lluis/bin/R/4.1.2/lib/R/library
##
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────