`r knitr::opts_chunk$set(tidy=FALSE)`

# Objects in R

Motivation and relevance

- "Objects" provide a way to encapsulate complex inter-related data into manageable structures that are accessed in a consistent way. This reduces book-keeping errors while enabling advanced analysis.
- Objects are pervasive in R (e.g., `data.frame()` and `lm()` return objects).
- Using formally defined objects increases interoperability between packages, while making it relatively easy to apply established concepts to derived (new but similar) data types.
- Bioconductor makes extensive use of formal objects. These are reviewed with an eye to understanding the design and principles of their implementation.

## S3

- Instance-based
- Single-inheritance
- Single-dispatch

**Example** `stats::lm()`

Objects

- Construct objects by fitting a linear model. The constructor is the function `lm()`.

```{r lm-constructor}
x <- rnorm(1000)
df <- data.frame(x=x, y=x + rnorm(length(x), sd=.5))
fit <- lm(y ~ x, df)
fit
```

- Class usually based on a `list()` with a `class` attribute.

```{r lm-properties}
is.list(fit)
names(fit)
str(head(fit, 3))
```

- Class inheritance via a vector of named classes, `class=c("derived", "base")`.
- No independent class definition or guarantee of object structure

Generic and methods

- Generic defined by a plain-old-function whose body calls `UseMethod()`, e.g.,

```{r}
print
```

- Method defined by a function whose name is composed of the generic + "." + class, e.g., the `print` method for objects of class `lm` (not exported from *stats*, hence the triple colon)

```{r}
head(stats:::print.lm)
```

Discovery

- Methods for class `lm`: `methods(class="lm")`
- Classes for which a method is defined: `methods("anova")`
- "Hidden" methods are registered with the generic, but not on the search path; use `getAnywhere()` or triple-colon resolution `stats:::anova.loess`

**Exercise**: a minimal class. Implement a minimal class `MyClass` and a `print` method that provides a tidy summary of the object. Hints: `structure()` with argument `class`, `list()`, `class()`, `cat()` to print.
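
One possible approach to the minimal-class exercise, offered as a sketch rather than a reference solution. The field names `name` and `values` are hypothetical, chosen only for illustration; the point is the pattern of a list tagged with a `class` attribute plus a function named generic + "." + class.

```{r myclass-sketch}
## Constructor: a plain list tagged with a class attribute.
## The fields 'name' and 'values' are illustrative assumptions.
MyClass <- function(name, values) {
    structure(list(name=name, values=values), class="MyClass")
}

## print method: named <generic>.<class>, found by UseMethod("print")
print.MyClass <- function(x, ...) {
    cat("class:", class(x), "\n")
    cat("name:", x$name, "\n")
    cat("values:", length(x$values), "element(s)\n")
    invisible(x)    # print methods conventionally return their argument
}

obj <- MyClass("demo", 1:10)
obj                 # auto-printing dispatches to print.MyClass
```

Because nothing enforces the structure of an S3 object, routing all construction through a single function such as `MyClass()` is one way to keep instances consistent.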

**Exercise**: a simple database. Implement a minimal class containing people and their occupations. Implement functions and methods to create and print the class, and to subset the class.

## S4

- Defined classes
- Multiple inheritance
- Multiple dispatch

**Exercise**: a minimal class. Use `setClass()` to implement a simple class that represents people and their occupations. Use `setMethod()` to write a `show` method that displays the class in a reasonable way, including when the class has a very large number of elements.

```{r setClass}
## setClass() returns a generator function, used below as the constructor
.Empl <- setClass("Empl", representation(person="character", job="character"))

## show() is the S4 analogue of print(); display at most the first 6 elements
setMethod("show", "Empl", function(object) {
    len <- length(object@person)
    cat("class: ", class(object), " (n = ", len, ")\n", sep="")
    cat("person:", head(object@person), if (len > 6) "...", "\n")
    cat("job:", head(object@job), if (len > 6) "...", "\n")
})

.Empl()
.Empl(person=c("Xavier", "Melanie", "Octavio"),
      job=c("Leader", "Innovator", "Doer"))
.Empl(person=LETTERS, job=letters)
```

- More in a later section!

## Reference classes

- Mutable
- Defined classes
- Multiple inheritance
- Single dispatch

**Example**: `ShortRead::FastqStreamer()`

```{r FastqStreamer, eval=FALSE}
library(ShortRead)
example(FastqStreamer)
```

Why a reference class?

- External data resource with persistent state

Some design decisions:

- Simple interface: construct with `FastqStreamer()`, invoke with `yield()`.
- Class methods are not meant for end-user access
    - Avoid introducing yet another way of interacting with R objects
    - Avoid exposing internal detail to user

## And...

- [R.oo][]
- [proto][]: [prototype][]-based objects

[R.oo]: http://cran.r-project.org/web/packages/R.oo/index.html
[proto]: http://cran.r-project.org/web/packages/proto/index.html
[prototype]: http://en.wikipedia.org/wiki/Prototype-based_programming