---
title: "Lefser finds features that have greatest differences between classes."
author: |
| Asya Khleborodova, Ludwig Geistlinger, and Levi Waldron
| School of Public Health, City University of New York
date: "`r Sys.Date()`"
abstract: ""
email:
output:
BiocStyle::html_document:
toc: true
toc_depth: 2
vignette: >
%\VignetteIndexEntry{Introduction to the lefser R implementation of the popular LEfSE software for biomarker discovery in microbiome analysis.}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
# Introduction
Lefser is metagenomic biomarker discovery tool that is based on
[LEfSe](https://huttenhower.sph.harvard.edu/galaxy/) tool and is published by
[Huttenhower et al. 2011](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3218848/).
`Lefser` is the R implementation of the `LEfSe` method.
Using statistical analyses, `lefser` compares microbial populations of healthy
and diseased subjects to discover differencially expressed microorganisms.
`Lefser` than computes effect size, which estimates magnitude of differential
expression between the populations for each differentially expressed
microorganism. Subclasses of classes can also be assigned and used within the
analysis.
```{r style, echo = FALSE, results = 'asis'}
knitr::opts_chunk$set(fig.align = "center")
```
# Installation
To install Bioconductor and the `lefser` package, run the following
commands.
```{r, eval = FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
install.packages("BiocManager")
BiocManager::install("lefser")
```
Then load the `lefser` package.
```{r,include=TRUE,results="hide",message=FALSE,warning=FALSE}
library(lefser)
```
# Overview and example use of `lefser`
The `lefser` function can be used with a `SummarizedExperiment`.
Load the `zeller14` example dataset and exclude 'adenoma' conditions.
```{r}
data(zeller14)
zeller14 <- zeller14[, zeller14$study_condition != "adenoma"]
```
Note. `lefser` supports only two-group contrasts.
The `colData` in the `SummarizedExperiment` dataset contains the grouping
column `study_condition` which includes the 'control' and 'CRC' groups.
```{r}
table(zeller14$study_condition)
```
There can be subclasses in each group condition. In the example dataset
we include `age_category` as a subclass of `study_condition` which includes
'adults' and 'seniors'. This variable will correspond to the `blockCol`
input argument.
```{r}
table(zeller14$age_category)
```
We can create a contingency table for the two categorical variables.
```{r}
table(zeller14$age_category, zeller14$study_condition)
```
We can now use the `lefser` function. It provides results as a `data.frame`
with the names of selected microorganisms and their effect size.
```{r}
res <- lefser(zeller14, groupCol = "study_condition", blockCol = "age_category")
head(res)
```
# Visualizing results with `lefserPlot`
```{r}
lefserPlot(res)
```
# Interoperating with phyloseq
When using `phyloseq` objects, we recommend to extract the data and create a
`SummarizedExperiment` object as follows:
```{r}
library(phyloseq)
fp <- system.file(
"extdata", "study_1457_split_library_seqs_and_mapping.zip",
package = "phyloseq"
)
kostic <- suppressWarnings({
microbio_me_qiime(fp)
})
counts <- unclass(otu_table(kostic))
colData <- as(sample_data(kostic), "data.frame")
## create a SummarizedExperiment object
SummarizedExperiment(
assays = list(counts = counts), colData = colData
)
```
You may also consider using `makeTreeSummarizedExperimentFromPhyloseq` from the
`mia` package (example not shown).
## sessionInfo
```{r}
sessionInfo()
```