---
title: "Example 16S Annotation Workflow"
author: "Nathan D. Olson"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Example 16S Annotation Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Overview

The metagenomeFeatures package and associated annotation packages (e.g. greengenes13.5MgDb) can be used to add taxonomic annotations to MRexperiment objects. The metagenomeSeq package includes the MRexperiment class and a number of methods for analyzing marker-gene metagenome datasets. This vignette demonstrates how to add taxonomic annotation to an MRexperiment-class object using metagenomeFeatures.

To generate an MRexperiment object with taxonomic information you need:

* assayData - matrix with count data for each OTU and sample
* phenoData - an AnnotatedDataFrame with sample metadata
* featureData - an mgFeatures object with taxonomic information for each OTU

For MRexperiment objects, the assayData and phenoData need to have the same sample order; similarly, the assayData and featureData need to have the same OTU order.

__Required Packages__

```{r message=FALSE}
library(Biostrings)
library(metagenomeFeatures)
library(metagenomeSeq)
```

The `magrittr` pipe operator `%>%`, which is exported by metagenomeFeatures, is used throughout this vignette. See the magrittr package vignette for more information about this handy operator (https://cran.r-project.org/web/packages/magrittr/index.html).

## Creating Initial MRexperiment Object

The assayData and phenoData we will use are in a file included in the metagenomeFeatures package, with count data derived from the msd16s study. We first create an initial MRexperiment with the count data and phenoData.
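The ordering requirement stated in the Overview can be checked before an object is built. The following is a minimal sketch in base R only; `toy_counts` and `toy_pheno` are hypothetical stand-ins rather than the vignette's data, and reordering with `match()` is one common approach, not something the package prescribes.

```r
# Hypothetical toy data: 3 OTUs (rows) by 3 samples (columns)
toy_counts <- matrix(
  c(5, 0, 2,
    1, 7, 0,
    3, 3, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU_1", "OTU_2", "OTU_3"),
                  c("sampleA", "sampleB", "sampleC"))
)

# Hypothetical sample metadata, deliberately listed in a different order
toy_pheno <- data.frame(
  group = c("case", "control", "case"),
  row.names = c("sampleC", "sampleA", "sampleB")
)

# Reorder the metadata rows so they follow the count matrix columns
sample_idx <- match(colnames(toy_counts), rownames(toy_pheno))
toy_pheno <- toy_pheno[sample_idx, , drop = FALSE]

# Stop early if any sample is missing or still out of order
stopifnot(identical(rownames(toy_pheno), colnames(toy_counts)))
```

The same pattern applies to the OTU order shared by the assayData and featureData, with `rownames(toy_counts)` as the reference. The MRexperiment itself is then built from the packaged data: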
```{r}
# Load the count data and sample metadata shipped with the package
assay_data <- metagenomeFeatures:::vignette_assay_data()
pheno_data <- metagenomeFeatures:::vignette_pheno_data()

# Build the MRexperiment; featureData is added in a later step
demoMRexp <- newMRexperiment(assay_data$counts, phenoData = pheno_data)
demoMRexp
```

The `featureData` slot is currently empty. We will now use the `annotateMRexp_fData` function in the metagenomeFeatures package to load the `featureData`.

## Loading featureData with metagenomeFeatures

You first need a `MgDb` object for the database used to classify the OTUs. A subset of the greengenes database is used to generate `mock_MgDb`, the MgDb object for annotating the msd16s data. This database is used here for demonstration purposes only; the `gg13.5MgDb` database in the greengenes13.5MgDb package can also be used to generate the mgFeatures object.

```{r echo=FALSE, message=FALSE}
mock_MgDb <- metagenomeFeatures:::vignette_mock_MgDb()
```

### annotateMRexp_fData

The `annotateMRexp_fData` function takes `MgDb` and `MRexperiment` objects as input and defines the `featureData` slot using the OTU ids from the `assayData` slot. The slot is defined with a `mgFeatures` object, which extends the `AnnotatedDataFrame` object with additional slots for the database reference sequences (`refDbSeq`), the phylogenetic tree (`refDbtree`, if available), and metadata about the reference database used.

```{r}
demoMRexp <- annotateMRexp_fData(mgdb = mock_MgDb, MRobj = demoMRexp)
demoMRexp
```

The `featureData` slot in the `demoMRexp` object now contains the taxonomic data for the OTUs along with the database reference sequences and, when available, the phylogenetic tree.

## Accessing mgFeatures Slots

Each mgFeatures slot can be accessed using the appropriate accessor function.

__Seq Data__

```{r}
mgF_seq(demoMRexp@featureData) %>% head()
```

__Taxonomy__

```{r}
mgF_taxa(demoMRexp@featureData) %>% head()
```

__Metadata__

```{r}
mgF_meta(demoMRexp@featureData) %>% head()
```

__Phylogenetic Tree__

The tree slot is not defined for `demoMRexp`.
If available, the tree would be accessed using the following command.

```{r eval=FALSE}
mgF_tree(demoMRexp@featureData)
```

```{r}
sessionInfo()
```