---
title: "Lefser finds features that have greatest differences between classes."
author: |
  | Asya Khleborodova, Ludwig Geistlinger, and Levi Waldron
  | School of Public Health, City University of New York
date: "`r Sys.Date()`"
abstract: ""
email:
output:
  BiocStyle::html_document:
    toc: true
    toc_depth: 2
vignette: >
  %\VignetteIndexEntry{Introduction to the lefser R implementation of the popular LEfSE software for biomarker discovery in microbiome analysis.}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

Lefser is a metagenomic biomarker discovery tool based on the
[LEfSe](https://huttenhower.sph.harvard.edu/galaxy/) tool, published by
[Huttenhower et al. 2011](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3218848/).
`Lefser` is the R implementation of the `LEfSe` method.

Using statistical analyses, `lefser` compares the microbial populations of
healthy and diseased subjects to discover differentially expressed
microorganisms. `Lefser` then computes an effect size, which estimates the
magnitude of differential expression between the populations for each
differentially expressed microorganism. Subclasses of classes can also be
assigned and used within the analysis.

```{r style, echo = FALSE, results = 'asis'}
knitr::opts_chunk$set(fig.align = "center")
```

# Installation

To install Bioconductor and the `lefser` package, run the following
commands.

```{r, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("lefser")
```

Then load the `lefser` package.

```{r,include=TRUE,results="hide",message=FALSE,warning=FALSE}
library(lefser)
```

# Overview and example use of `lefser`

The `lefser` function can be used with a `SummarizedExperiment`.

Load the `zeller14` example dataset and exclude the 'adenoma' condition.

```{r}
data(zeller14)
zeller14 <- zeller14[, zeller14$study_condition != "adenoma"]
```

Note: `lefser` supports only two-group contrasts.

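
The two-group restriction can be checked before calling `lefser`. The following is a minimal sketch in base R on a small, made-up metadata table. The `meta` data frame and its values are hypothetical and are not taken from `zeller14`. It counts the groups, drops the extra one, and stops early if the contrast does not contain exactly two groups.

```{r}
# Toy sample metadata (hypothetical values, for illustration only)
meta <- data.frame(
    sample = paste0("s", 1:6),
    study_condition = c("control", "CRC", "adenoma",
                        "control", "CRC", "adenoma"),
    stringsAsFactors = FALSE
)

# Count the distinct groups before filtering (three in this toy table)
n_before <- length(unique(meta$study_condition))

# Keep only the two groups to be contrasted
meta2 <- meta[meta$study_condition != "adenoma", ]
groups_after <- unique(meta2$study_condition)

# Fail early if the contrast is not exactly two groups
stopifnot(length(groups_after) == 2)
table(meta2$study_condition)
```

The same kind of check can be applied to a grouping column in `colData` before running the analysis.
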
The `colData` of the `SummarizedExperiment` dataset contains the grouping
column `study_condition`, which includes the 'control' and 'CRC' groups.

```{r}
table(zeller14$study_condition)
```

Each group can contain subclasses. In the example dataset we use
`age_category` as a subclass of `study_condition`; it includes 'adults' and
'seniors'. This variable corresponds to the `blockCol` input argument.

```{r}
table(zeller14$age_category)
```

We can create a contingency table for the two categorical variables.

```{r}
table(zeller14$age_category, zeller14$study_condition)
```

We can now use the `lefser` function. It returns the results as a
`data.frame` with the names of the selected microorganisms and their effect
sizes.

```{r}
res <- lefser(zeller14, groupCol = "study_condition", blockCol = "age_category")
head(res)
```

# Visualizing results with `lefserPlot`

```{r}
lefserPlot(res)
```

# Interoperating with phyloseq

When using `phyloseq` objects, we recommend extracting the data and creating
a `SummarizedExperiment` object as follows:

```{r}
library(phyloseq)
## locate the example dataset shipped with phyloseq
fp <- system.file(
    "extdata", "study_1457_split_library_seqs_and_mapping.zip",
    package = "phyloseq"
)
kostic <- suppressWarnings({
    microbio_me_qiime(fp)
})
## extract the count matrix and the sample metadata
counts <- unclass(otu_table(kostic))
colData <- as(sample_data(kostic), "data.frame")
## create a SummarizedExperiment object
SummarizedExperiment(
    assays = list(counts = counts), colData = colData
)
```

You may also consider using `makeTreeSummarizedExperimentFromPhyloseq` from
the `mia` package (example not shown).

## sessionInfo

```{r}
sessionInfo()
```