---
title: "SEtools"
author:
- name: Pierre-Luc Germain
  affiliation:
  - D-HEST Institute for Neurosciences, ETH Zürich
  - Laboratory of Statistical Bioinformatics, University Zürich
package: SEtools
output:
  BiocStyle::html_document:
    fig_height: 3.5
abstract: |
  Showcases the use of SEtools to merge objects of the SummarizedExperiment class.
vignette: |
  %\VignetteIndexEntry{SEtools}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
library(BiocStyle)
```

# Getting started

The `r Rpackage("SEtools")` package is a set of convenience functions for the _Bioconductor_ class `r Biocpkg("SummarizedExperiment")`. It facilitates merging, melting, and plotting `SummarizedExperiment` objects.

**NOTE that the heatmap-related and melting functions have been moved to a standalone package, `r Biocpkg("sechm")`.**
The old `sehm` function of `SEtools` should be considered deprecated, and most `SEtools` functions are conserved for legacy/reproducibility reasons (or until they find a better home).

## Package installation

```{r, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("SEtools")
```

Or, to install the latest development version:

```{r, eval=FALSE}
BiocManager::install("plger/SEtools")
```

## Example data

To showcase the main functions, we will use an example object which contains (a subset of) whole-hippocampus RNAseq of mice after different stressors:

```{r}
suppressPackageStartupMessages({
  library(SummarizedExperiment)
  library(SEtools)
})
data("SE", package="SEtools")
SE
```

This is taken from [Floriou-Servou et al., Biol Psychiatry 2018](https://doi.org/10.1016/j.biopsych.2018.02.003).

## Merging and aggregating SEs

```{r}
se1 <- SE[,1:10]
se2 <- SE[,11:20]
se3 <- mergeSEs( list(se1=se1, se2=se2) )
se3
```

All assays were merged, along with rowData and colData slots.

By default, row z-scores are calculated for each object when merging.
This can be prevented with:

```{r}
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE)
```

If more than one assay is present, one can specify a different scaling behavior for each assay:

```{r}
se3 <- mergeSEs( list(se1=se1, se2=se2), use.assays=c("counts", "logcpm"), do.scale=c(FALSE, TRUE))
```

Differences from the `cbind` method include prefixes added to column names, optional scaling, and handling of metadata (e.g. for `sechm`).

### Merging by rowData columns

It is also possible to merge by rowData columns, which are specified through the `mergeBy` argument.
This allows one-to-many and many-to-many mappings, for which two behaviors are possible:

* By default, all combinations will be reported, which means that the same feature of one object might appear multiple times in the output because it matches multiple features of another object.
* If a function is passed through `aggFun`, the features of each object will be aggregated by `mergeBy` using this function before merging.

```{r merging}
rowData(se1)$metafeature <- sample(LETTERS,nrow(se1),replace = TRUE)
rowData(se2)$metafeature <- sample(LETTERS,nrow(se2),replace = TRUE)
se3 <- mergeSEs( list(se1=se1, se2=se2), do.scale=FALSE, mergeBy="metafeature", aggFun=median)
sechm::sechm(se3, features=row.names(se3))
```

### Aggregating a SE

A single SE can also be aggregated by using the `aggSE` function:

```{r aggregating}
se1b <- aggSE(se1, by = "metafeature")
se1b
```

If the aggregation function(s) are not specified, `aggSE` will try to guess decent aggregation functions from the assay names.

This is similar to `scuttle::sumCountsAcrossFeatures`, but preserves other SE slots.

***

## Other convenience functions

Calculate an assay of log-foldchanges to the controls:

```{r}
SE <- log2FC(SE, fromAssay="logcpm", controls=SE$Condition=="Homecage")
```
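
To make the notion of a log-fold change to the controls concrete, here is a minimal base-R sketch on a made-up matrix. It is not the `log2FC` implementation of SEtools, which may differ (for instance in how it handles pseudocounts or averages the controls). The matrix `logexpr` and the vector `is_control` are hypothetical stand-ins for a log-scale assay and for `SE$Condition=="Homecage"`.

```r
# Toy log-scale expression matrix: 3 features x 4 samples (made-up values).
# matrix() fills column by column, so each group of three values is one sample.
logexpr <- matrix(c(1, 2, 3,
                    3, 2, 1,
                    2, 4, 6,
                    4, 4, 2),
                  nrow = 3,
                  dimnames = list(paste0("gene", 1:3), paste0("s", 1:4)))

# Hypothetical logical vector flagging the control samples
is_control <- c(TRUE, TRUE, FALSE, FALSE)

# Per-feature baseline: mean of the control samples on the log scale
baseline <- rowMeans(logexpr[, is_control, drop = FALSE])

# Subtracting the baseline on the log scale gives, for each feature,
# a log-fold change of every sample relative to the controls
lfc <- logexpr - baseline
lfc
```

By construction, the control samples of each feature average to zero in `lfc`, so the remaining samples read directly as deviations from the control baseline.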

# Session info {.unnumbered}

```{r sessionInfo, echo=FALSE}
sessionInfo()
```