BaseSet 0.9.0

I’m excited to provide a new release of BaseSet, the package implementing a class and methods to work with (fuzzy) sets.

This release focused on making the package easier to work with.

From the beginning it was engineered towards the tidyverse; this time I focused on general R methods like $, [ and c.

New methods

First we can create a TidySet or TS for short:

library("BaseSet", warn.conflicts = FALSE)
packageVersion("BaseSet")
## [1] '0.9.0'
l <- list(A = "1",
     B = c("1", "2"),
     C = c("2", "3", "4"),
     D = c("1", "2", "3", "4")
)
TS <- tidySet(l)

Up till now the class was compatible with the tidyverse but not with the base R methods. Now we can extract single sets with [[:

TSa <- TS[["A"]]
TSb <- TS[["B"]]

This may not look like much, but previously it wasn’t possible to subset the class. Initially I thought that working with a single object per session would be enough. Later I realized that people might have good reasons to split or combine multiple objects:

TSab <- c(TSa, TSb)
TSab
##   elements sets fuzzy
## 1        1    A     1
## 2        1    B     1
## 3        2    B     1

Note that subsetting by sets does not produce the same object, because all the elements are kept:

dim(TSab)
##  Elements Relations      Sets 
##         2         3         2
dim(TS[1:2, "sets"])
##  Elements Relations      Sets 
##         4         3         2

You’ll need to drop the unused elements:

dim(droplevels(TS[1:2, "sets"]))
##  Elements Relations      Sets 
##         2         3         2
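
This is similar to how base R factors keep unused levels after subsetting. A minimal sketch of that analogy using only base R; the factor f is a hypothetical stand-in for the elements, meant as an illustration and not as a description of BaseSet internals:

```r
# Hypothetical stand-in for the four elements of TS
f <- factor(c("1", "2", "3", "4"))

# Subsetting keeps all the levels, even those no longer in use
f_sub <- f[1:2]
nlevels(f_sub)
## [1] 4

# droplevels() removes the levels that are not used anymore
nlevels(droplevels(f_sub))
## [1] 2
```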

We can add more information to the relations and to the sets like this:

TSab[1:2, "relations", "type"] <- c("new", "addition")
TSab[1:2, "sets", "origin"] <- c("fake", "real")
TSab
##   elements sets fuzzy     type origin
## 1        1    A     1      new   fake
## 2        1    B     1 addition   real
## 3        2    B     1     <NA>   real

With this release it is easier to access the columns of the TidySet:

TSab$type
## [1] "new"      "addition" NA
TSab$origin
## [1] "fake" "real"
TS$sets
##  [1] "A" "B" "B" "C" "C" "C" "D" "D" "D" "D"

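If you pay attention you’ll notice that $ returns only the minimum information required: TSab$origin has one value per set, not one per relation. But if the column is present in the relations slot and also in the elements or sets slots, it will pick the first one.

You can also assign to and extract a column of a specific slot: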

TS[, "sets", "new"] <- "a"
TS[, "sets", "new"]

I recommend carefully reading the help page ?`extract-TidySet` and running some tests based on the examples. I might have created some bugs or friction points with the extraction operations; let me know and I’ll address them (that’s the reason why I kept it below a 1.0 release).

More usable

Another usability addition to the class is autocompletion.

Now if you type TSab$ty and press TAB it should complete to TSab$type because there is a column called type. This will make it easier to use $.
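
In R, completion after $ is usually provided by a method for utils::.DollarNames. The sketch below shows the general mechanism on a toy S3 class; the class my_set and its data field are hypothetical and only for illustration, not how BaseSet implements it:

```r
library("utils")

# Hypothetical toy object holding a table with two columns
obj <- structure(
  list(data = data.frame(type = "new", origin = "fake")),
  class = "my_set"
)

# Method consulted by the completion machinery: it returns the
# column names that match the text typed so far
.DollarNames.my_set <- function(x, pattern = "") {
  grep(pattern, names(unclass(x)$data), value = TRUE)
}

# What completing obj$ty would offer
.DollarNames(obj, "^ty")
## [1] "type"
```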

With this release, we can now check the number of sets and the number of relations each set has:

length(TS)
## [1] 4
lengths(TS)
## A B C D 
## 1 2 3 4
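
For comparison, these are the same values base R returns for the list the TidySet was created from. The snippet uses only base R and repeats the definition of l from above:

```r
l <- list(A = "1",
     B = c("1", "2"),
     C = c("2", "3", "4"),
     D = c("1", "2", "3", "4")
)
# Number of sets in the list
length(l)
## [1] 4
# Number of elements in each set
lengths(l)
## A B C D 
## 1 2 3 4
```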

New function

The new function union_closed checks whether the union of any combination of sets is itself one of the existing sets.

union_closed(TS, sets = c("A", "B", "C"))
## [1] FALSE
union_closed(TS)
## [1] TRUE
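
To make the idea concrete, here is a minimal sketch in base R of what being union-closed means for a plain list of sets. The helper is_union_closed is hypothetical and is not the implementation used by BaseSet. It only checks pairs of sets, which is enough because closure under pairwise unions implies closure under unions of any finite number of sets:

```r
# Hypothetical helper, not part of BaseSet: a family of sets is union-closed
# if the union of every pair of sets is also a member of the family.
is_union_closed <- function(family) {
  # Normalise each set so that comparisons ignore order and duplicates
  family <- lapply(family, function(s) sort(unique(s)))
  if (length(family) < 2) {
    return(TRUE)
  }
  # Each column holds the positions of one pair of sets
  pairs <- combn(seq_along(family), 2)
  for (k in seq_len(ncol(pairs))) {
    u <- sort(union(family[[pairs[1, k]]], family[[pairs[2, k]]]))
    # Is the union identical to one of the existing sets?
    found <- any(vapply(family, identical, logical(1), y = u))
    if (!found) {
      return(FALSE)
    }
  }
  TRUE
}

l <- list(A = "1",
     B = c("1", "2"),
     C = c("2", "3", "4"),
     D = c("1", "2", "3", "4")
)
is_union_closed(l[c("A", "B", "C")])
## [1] FALSE
is_union_closed(l)
## [1] TRUE
```

With only A, B and C the check fails because the union of A and C contains 1, 2, 3 and 4, which is set D and not one of the three sets considered.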

Next steps

I hope this makes it even easier to work with the class, to combine different objects, and to manipulate them more intuitively.

While creating this document I realized it still has some friction points.
The next release will bring:

  1. Subsetting the object by element or set name, if only querying the elements and sets slots. For example TS[c("3", "4"), "elements", "NEWS"] <- TRUE
  2. Using names and dimnames to discover which data is in the object.
  3. Some bug fixes for these methods.

Enjoy!

I would also appreciate hearing some feedback about how you are using the package. It will help me direct the development and maintenance of the package to where it is most useful.

Reproducibility

## ─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.3.1 (2023-06-16)
##  os       Ubuntu 22.04.3 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Europe/Madrid
##  date     2023-12-18
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  BaseSet     * 0.9.0   2023-08-23 [1] local
##  blogdown      1.18    2023-06-19 [1] CRAN (R 4.3.1)
##  bookdown      0.37    2023-12-01 [1] CRAN (R 4.3.1)
##  bslib         0.6.1   2023-11-28 [1] CRAN (R 4.3.1)
##  cachem        1.0.8   2023-05-01 [1] CRAN (R 4.3.1)
##  cli           3.6.2   2023-12-11 [1] CRAN (R 4.3.1)
##  digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.1)
##  dplyr         1.1.4   2023-11-17 [1] CRAN (R 4.3.1)
##  evaluate      0.23    2023-11-01 [1] CRAN (R 4.3.2)
##  fansi         1.0.6   2023-12-08 [1] CRAN (R 4.3.1)
##  fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.1)
##  generics      0.1.3   2022-07-05 [1] CRAN (R 4.3.1)
##  glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.1)
##  htmltools     0.5.7   2023-11-03 [1] CRAN (R 4.3.2)
##  jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.3.1)
##  jsonlite      1.8.8   2023-12-04 [1] CRAN (R 4.3.1)
##  knitr         1.45    2023-10-30 [1] CRAN (R 4.3.2)
##  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.3.2)
##  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.1)
##  pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.1)
##  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.1)
##  R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.1)
##  rlang         1.1.2   2023-11-04 [1] CRAN (R 4.3.1)
##  rmarkdown     2.25    2023-09-18 [1] CRAN (R 4.3.1)
##  rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.1)
##  sass          0.4.8   2023-12-06 [1] CRAN (R 4.3.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.1)
##  tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.1)
##  tidyselect    1.2.0   2022-10-10 [1] CRAN (R 4.3.1)
##  utf8          1.2.4   2023-10-22 [1] CRAN (R 4.3.2)
##  vctrs         0.6.5   2023-12-01 [1] CRAN (R 4.3.1)
##  xfun          0.41    2023-11-01 [1] CRAN (R 4.3.2)
##  yaml          2.3.8   2023-12-11 [1] CRAN (R 4.3.1)
## 
##  [1] /home/lluis/bin/R/4.3.1
##  [2] /opt/R/4.3.1/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Lluís Revilla Sancho
Data scientist

Data scientist with interests in software quality, mostly R.