---
title: "Interactive exploration of design matrices with ExploreModelMatrix"
author: "Charlotte Soneson, Federico Marini, Michael I Love, Florian Geier and Michael B Stadler"
date: "`r Sys.Date()`"
output: 
  html_vignette
vignette: >
  %\VignetteIndexEntry{ExploreModelMatrix}
  %\VignetteEncoding{UTF-8} 
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  chunk_output_type: console
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
stopifnot(requireNamespace("htmltools"))
htmltools::tagList(rmarkdown::html_dependency_font_awesome())
```

# Introduction

`ExploreModelMatrix` is an R package for visualizing design matrices generated
by the `model.matrix()` R function. Provided with a sample information table 
and a design formula, the `ExploreModelMatrix()` function launches a shiny 
app where the user can explore the fitted values (in terms of the model 
coefficients) for each combination of predictor values. In addition, the app 
allows the user to interactively change the design formula and the reference 
levels of factor variables as well as drop unwanted columns from the design 
matrix, in order to explore the effect on the composition of the fitted values.
Note that `ExploreModelMatrix` is not intended to be used to determine _which_ 
design formula should be used for analyzing a data set. Instead, its 
purpose is to assist in the interpretation of the coefficients in a given 
model.

In addition to the interactive visualization, `ExploreModelMatrix` also 
provides a function, `VisualizeDesign()`, for generating static 
visualizations.
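
As a minimal base R sketch of what is being visualized, the chunk below builds
a design matrix with `model.matrix()` for a small made-up sample table, and
shows how changing the reference level of a factor changes the columns. The
object names (`toyData`, `mmDefault`, `mmReleveled`) and the data are
illustrative assumptions and not part of the package.

```{r}
# Toy sample table (hypothetical data, for illustration only)
toyData <- data.frame(group = factor(rep(c("ctrl", "trtA", "trtB"), each = 2)))

# By default, the first factor level ("ctrl") is the reference level
mmDefault <- model.matrix(~ group, data = toyData)
colnames(mmDefault)

# Changing the reference level changes which coefficients appear
toyData$group <- relevel(toyData$group, ref = "trtA")
mmReleveled <- model.matrix(~ group, data = toyData)
colnames(mmReleveled)
```

With `trtA` as the reference level, the intercept represents the `trtA` group,
and the `groupctrl` coefficient represents the difference between `ctrl` and
`trtA`.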

In this vignette, we illustrate how the package can be used by showing 
examples of applying the functions to various experimental design setups. 
Many examples are taken from questions raised at the 
[Bioconductor support site](https://support.bioconductor.org/).

```{r setup}
library(ExploreModelMatrix)
```

# Interface

The `ExploreModelMatrix()` function opens a graphical interface where the user
can interactively explore the provided design. This section gives an overview
of what is shown in the graphical interface. A step-by-step tour is also
available by clicking on the <i class="fa fa-question-circle"></i> icon in
the top right of the application.

<img src=`r system.file(package='ExploreModelMatrix', 'www/ExploreModelMatrix.jpg')` width="600"/>

The sidebar contains the input controls. The design formula of interest is 
typed into the `Design formula` text box, and must start with the `~` symbol. 
It can be changed interactively while using the application. If the
`ExploreModelMatrix()` function is called with `sampleData=NULL`, there will
also be an input control allowing a tab-delimited text file with sample
information to be uploaded to the application. The package also contains a
collection of example designs, suitable for teaching, exploration and
illustration. The remaining input controls allow the user to change the
reference levels of the factor variables, to drop specific columns from the
design matrix, and to change the display settings of the plots.
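
A sample table can be written to a tab-delimited text file with base R, as
sketched below. The table contents and the temporary file location are made up
for illustration, and the exact layout expected by the upload control (here
assumed to be a header line and no row names) should be checked in the app.

```{r}
# Hypothetical sample table
uploadData <- data.frame(genotype = rep(c("A", "B"), each = 2),
                         treatment = rep(c("ctrl", "trt"), 2))

# Write a tab-delimited text file (temporary location, for illustration)
uploadFile <- tempfile(fileext = ".txt")
write.table(uploadData, file = uploadFile, sep = "\t",
            quote = FALSE, row.names = FALSE)

# Read the file back to confirm that the table survives the round trip
read.delim(uploadFile)
```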

The first row of the main body of the application displays the fitted values 
(expressed in terms of the model coefficients) for each combination of 
predictor values, in both figure and table form. The second row shows the 
full sample table together with a summary, and the third row 
displays the full design matrix as well as its rank. Panels below this 
display the pseudoinverse of the design matrix, a visualization of variance 
inflation factors, a co-occurrence matrix and the correlation among the 
model coefficients.
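
Several of the quantities listed above can be computed in base R for a
full-rank design matrix, as in the sketch below. This is an illustration under
stated assumptions (a made-up sample table and textbook least-squares
formulas), and not necessarily how the app computes them.

```{r}
# Hypothetical sample table and design (names chosen for illustration)
panelData <- data.frame(genotype = rep(c("A", "B"), each = 4),
                        treatment = rep(c("ctrl", "trt"), 4))
panelX <- model.matrix(~ genotype + treatment, data = panelData)

# Rank of the design matrix; equal to ncol(panelX) for a full-rank design
qr(panelX)$rank

# Pseudoinverse of a full-rank design matrix: (X'X)^{-1} X'
panelXtXinv <- solve(crossprod(panelX))
panelPinv <- panelXtXinv %*% t(panelX)
round(panelPinv, 2)

# Correlation among the coefficient estimates, derived from (X'X)^{-1}
cov2cor(panelXtXinv)
```

Each row of the pseudoinverse contains the weights with which the observations
are combined to obtain the least-squares estimate of the corresponding
coefficient.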

# Examples

This section contains a number of examples of real designs, and shows how they 
can be explored with `ExploreModelMatrix`. For each example, the sample 
information table is printed out. Next, the `VisualizeDesign()` function is 
called to generate a static plot of the fitted values, in terms of the model 
coefficients. This is the same plot that is displayed in the top left panel 
of the interactive interface generated by `ExploreModelMatrix()`. We also 
provide the code for generating and (for interactive sessions) opening 
the interactive application with `ExploreModelMatrix()`.

## Example 1

This example illustrates a two-factor design (genotype and treatment), where the
effects of genotype and treatment are assumed to be additive. For each
genotype, two treated and two control individuals are studied. The design
formula is `~ genotype + treatment`, reflecting the assumption of additivity
between the two predictors. The figure generated by the `VisualizeDesign()`
function, displayed below, shows the value of the linear predictor (or, for a
regular linear model, the fitted values) for observations with a given
combination of predictor values, in terms of the model coefficients. This can be
useful in order to set up suitable contrasts. For example, we can see that
testing the null hypothesis that the `genotypeB` coefficient is zero would
correspond to comparing observations with genotype B and those with genotype A.

```{r, fig.width = 5}
(sampleData <- data.frame(genotype = rep(c("A", "B"), each = 4),
                          treatment = rep(c("ctrl", "trt"), 4)))
vd <- VisualizeDesign(sampleData = sampleData, 
                      designFormula = ~ genotype + treatment, 
                      textSizeFitted = 4)
cowplot::plot_grid(plotlist = vd$plotlist)
app <- ExploreModelMatrix(sampleData = sampleData,
                          designFormula = ~ genotype + treatment)
if (interactive()) {
  shiny::runApp(app)
}
```
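
The content of the plot can be cross-checked with base R. The sketch below
re-creates the sample table from this example under a different name (`ex1`,
chosen here to avoid overwriting `sampleData`) and lists, for each combination
of predictor values, which coefficients enter the fitted value.

```{r}
# Same table as in Example 1, re-created so that the chunk is self-contained
ex1 <- data.frame(genotype = rep(c("A", "B"), each = 4),
                  treatment = rep(c("ctrl", "trt"), 4))
mm1 <- model.matrix(~ genotype + treatment, data = ex1)

# One row per distinct combination of predictor values; a 1 marks a
# coefficient that is part of the fitted value for that combination
(combos <- unique(cbind(ex1, mm1)))

# Genotype B minus genotype A (both ctrl): only genotypeB remains
(diffBA <- mm1[ex1$genotype == "B" & ex1$treatment == "ctrl", ][1, ] -
   mm1[ex1$genotype == "A" & ex1$treatment == "ctrl", ][1, ])
```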

## Example 2

From https://support.bioconductor.org/p/121132/. In this example we are
considering a set of patients, each being either Resistant or Sensitive to a
treatment, and each studied before (pre) and after (post) treatment. Patients
have been renumbered within each response group, and patients with only pre- or
post-measurements are removed. We use the design 
`~ Response + Response:ind.n + Response:Treatment`. 
As can be seen from the visualization below, this lets us easily compare e.g.
pre- vs post-treatment observations within the Sensitive group (via the
`ResponseSensitive.Treatmentpre` coefficient; `post` is the reference level of
`Treatment`).

```{r, fig.width = 5, fig.height = 12}
(sampleData <- data.frame(
  Response = rep(c("Resistant", "Sensitive"), c(12, 18)),
  Patient = factor(rep(c(1:6, 8, 11:18), each = 2)),
  Treatment = factor(rep(c("pre","post"), 15)), 
  ind.n = factor(rep(c(1:6, 2, 5:12), each = 2))))
vd <- VisualizeDesign(
  sampleData = sampleData,
  designFormula = ~ Response + Response:ind.n + Response:Treatment,
  textSizeFitted = 3
)
cowplot::plot_grid(plotlist = vd$plotlist, ncol = 1)
app <- ExploreModelMatrix(
  sampleData = sampleData,
  designFormula = ~ Response + Response:ind.n + Response:Treatment
)
if (interactive()) {
  shiny::runApp(app)
}
```
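
The interpretation of the treatment coefficient can be checked with base R, as
in the sketch below. The table is re-created under the name `ex2` (an
illustrative name, chosen to avoid overwriting `sampleData`). Note that
`model.matrix()` labels the interaction column with a colon
(`ResponseSensitive:Treatmentpre`), whereas the text above writes it with a
dot.

```{r}
# Same table as in Example 2 (without the Patient column, which is not used
# in the design), re-created so that the chunk is self-contained
ex2 <- data.frame(
  Response = rep(c("Resistant", "Sensitive"), c(12, 18)),
  Treatment = factor(rep(c("pre", "post"), 15)),
  ind.n = factor(rep(c(1:6, 2, 5:12), each = 2)))
mm2 <- model.matrix(~ Response + Response:ind.n + Response:Treatment,
                    data = ex2)

# The alphabetically first level ("post") is the reference for Treatment
levels(ex2$Treatment)

# Pre minus post for one Sensitive patient (ind.n == 5): all patient and
# response terms cancel, leaving a single coefficient
onePatient <- ex2$Response == "Sensitive" & ex2$ind.n == "5"
preMinusPost <- mm2[onePatient & ex2$Treatment == "pre", ] -
  mm2[onePatient & ex2$Treatment == "post", ]
preMinusPost[preMinusPost != 0]
```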

The design above doesn't allow comparison between Resistant and Sensitive
patients while accounting for the patient effect, since the patient is nested
within the response group. If we choose to ignore the patient effect, we can fit
an additive two-factor model with the design formula `~ Treatment + Response`, as
illustrated below.

```{r, fig.width = 5}
vd <- VisualizeDesign(sampleData = sampleData,
                      designFormula = ~ Treatment + Response, 
                      textSizeFitted = 4)
cowplot::plot_grid(plotlist = vd$plotlist, ncol = 1)
```

## Example 3

From https://support.bioconductor.org/p/80408/. Here we are considering mice
from two conditions (ctrl/ko), each measured with and without treatment with a
drug (plus/minus). We use the design `~ 0 + batch + condition` (where `batch`
corresponds to the mouse ID), and drop the column corresponding to
`conditionko_minus` to get a full-rank design matrix.

```{r, fig.height = 4, fig.width = 6}
(sampleData <- data.frame(
  condition = factor(rep(c("ctrl_minus", "ctrl_plus", 
                           "ko_minus", "ko_plus"), 3)),
  batch = factor(rep(1:6, each = 2))))
vd <- VisualizeDesign(sampleData = sampleData,
                      designFormula = ~ 0 + batch + condition, 
                      textSizeFitted = 4, lineWidthFitted = 20, 
                      dropCols = "conditionko_minus")
cowplot::plot_grid(plotlist = vd$plotlist, ncol = 1)
app <- ExploreModelMatrix(sampleData = sampleData,
                          designFormula = ~ 0 + batch + condition)
if (interactive()) {
  shiny::runApp(app)
}
```
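
That dropping `conditionko_minus` gives a full-rank design matrix can be
verified with base R, as in the sketch below. The table is re-created under
the name `ex3` (an illustrative name, chosen to avoid overwriting
`sampleData`), and the column is removed by plain matrix indexing rather than
through the package.

```{r}
# Same table as in Example 3, re-created so that the chunk is self-contained
ex3 <- data.frame(
  condition = factor(rep(c("ctrl_minus", "ctrl_plus",
                           "ko_minus", "ko_plus"), 3)),
  batch = factor(rep(1:6, each = 2)))
mm3 <- model.matrix(~ 0 + batch + condition, data = ex3)

# The full design matrix has more columns than its rank
c(columns = ncol(mm3), rank = qr(mm3)$rank)

# After dropping conditionko_minus, the number of columns equals the rank
mm3drop <- mm3[, colnames(mm3) != "conditionko_minus"]
c(columns = ncol(mm3drop), rank = qr(mm3drop)$rank)
```

The rank deficiency arises because the `ko_minus` and `ko_plus` columns
together add up to the sum of the batch columns of the ko mice.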

# Exporting data from the app

It is possible to export the data used internally by the interactive 
application (in effect, the output from the internal call to 
`VisualizeDesign()`). To enable such export, first generate the `app` object 
as in the examples above, and then assign the call to `shiny::runApp()` 
to a variable to capture the output. For example: 

```{r}
if (interactive()) {
  out <- shiny::runApp(app)
}
```

To activate the export, make sure to click the button 'Close app' in order to 
close the application (don't just close the window). This will take you back to 
your R session, where the variable `out` will be populated with the data used 
in the app (in the form of a list generated by `VisualizeDesign()`).

# Session info

```{r}
sessionInfo()
```