---
title: "Get Started"
author:
- name: Pol Castellano-Escuder, Ph.D.
  affiliation: Duke University
  email: polcaes@gmail.com
date: "`r BiocStyle::doc_date()`"
output:
  BiocStyle::html_document
vignette: >
  %\VignetteIndexEntry{Get Started}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
  %\VignetteEncoding{UTF-8}
bibliography: ["POMA.bib"]
biblio-style: apalike
link-citations: true
---

**Compiled date**: `r Sys.Date()`

**Last edited**: 2024-01-21

**License**: `r packageDescription("POMA")[["License"]]`

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  fig.align = "center",
  comment = ">"
)
```

# Installation

To install the Bioconductor version of the POMA package, run the following code:

```{r, eval = FALSE}
# install.packages("BiocManager")
BiocManager::install("POMA")
```

# Load POMA

```{r, warning = FALSE, message = FALSE}
library(POMA)
library(ggtext)
library(magrittr)
```

# The POMA Workflow

The `POMA` package functions are organized into three sequential, distinct blocks: Data Preparation, Pre-processing, and Statistical Analysis.

## Data Preparation

The `SummarizedExperiment` package from Bioconductor offers well-defined computational data structures for representing various types of omics experiment data [@SummarizedExperiment]. Utilizing these data structures can significantly improve data analysis. `POMA` leverages `SummarizedExperiment` objects, enhancing the reusability of existing methods for this class and contributing to more robust and reproducible workflows.

The workflow begins with either loading or creating a `SummarizedExperiment` object. Typically, your data might be stored in separate matrices and/or data frames. The `PomaCreateObject` function simplifies this step by quickly building a `SummarizedExperiment` object for you.
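
As a minimal sketch of the expected input, `PomaCreateObject` can be tried on small in-memory data frames. The sample names, feature names, and values below are hypothetical. The layout (sample IDs in the first metadata column, one row per sample and one column per feature in the features table, rows in the same order in both) is an assumption based on the function's documentation, so check `?PomaCreateObject` for your installed version.

```{r, eval = FALSE}
library(POMA)

# Hypothetical toy data: 4 samples, 3 features
target <- data.frame(
  sample_id = paste0("sample_", 1:4), # assumed: sample IDs go in the first column
  group = c("control", "control", "case", "case")
)

features <- data.frame(
  feature_a = c(1.2, 0.8, 3.1, 2.9),
  feature_b = c(10.5, 11.0, 9.7, 8.8),
  feature_c = c(0.0, 0.4, 0.3, 0.0)
)

# Assumed: rows of `features` correspond to rows of `target`, in the same order
stopifnot(nrow(target) == nrow(features))

toy_data <- PomaCreateObject(metadata = target, features = features)
toy_data
```

Printing the returned object is a quick way to confirm that the numbers of samples and features match what you supplied.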

```{r, eval = FALSE}
# create a SummarizedExperiment object from two separate data frames
target <- readr::read_csv("your_target.csv")
features <- readr::read_csv("your_features.csv")

data <- PomaCreateObject(metadata = target, features = features)
```

Alternatively, if your data is already in a `SummarizedExperiment` object, you can proceed directly to the pre-processing step.

This vignette uses example data provided in `POMA`.

```{r, warning = FALSE, message = FALSE}
# load example data
data("st000336")
```

```{r, warning = FALSE, message = FALSE}
st000336
```

## Pre-processing

### Missing Value Imputation

```{r}
imputed <- st000336 %>%
  PomaImpute(method = "knn", zeros_as_na = TRUE, remove_na = TRUE, cutoff = 20)

imputed
```

### Normalization

```{r}
normalized <- imputed %>%
  PomaNorm(method = "log_pareto")

normalized
```

#### Normalization effect

```{r, message = FALSE}
PomaBoxplots(imputed, x = "samples") # data before normalization
```

```{r, message = FALSE}
PomaBoxplots(normalized, x = "samples") # data after normalization
```

```{r, message = FALSE}
PomaDensity(imputed, x = "features") # data before normalization
```

```{r, message = FALSE}
PomaDensity(normalized, x = "features") # data after normalization
```

### Outlier Detection

```{r}
PomaOutliers(normalized)$polygon_plot
pre_processed <- PomaOutliers(normalized)$data
pre_processed
```

```{r}
# pre_processed %>%
#   PomaUnivariate(method = "ttest") %>%
#   magrittr::extract2("result")
```

```{r}
# imputed %>%
#   PomaVolcano(pval = "adjusted", labels = TRUE)
```

```{r, warning = FALSE}
# pre_processed %>%
#   PomaUnivariate(method = "mann") %>%
#   magrittr::extract2("result")
```

```{r}
# PomaLimma(pre_processed, contrast = "Controls-DMD", adjust = "fdr")
```

```{r}
# poma_pca <- PomaMultivariate(pre_processed, method = "pca")
```

```{r}
# poma_pca$scoresplot +
#   ggplot2::ggtitle("Scores Plot")
```

```{r, warning = FALSE, message = FALSE, results = 'hide'}
# poma_plsda <- PomaMultivariate(pre_processed, method = "plsda")
```

```{r}
# poma_plsda$scoresplot +
#   ggplot2::ggtitle("Scores Plot")
```

```{r}
# poma_plsda$errors_plsda_plot +
#   ggplot2::ggtitle("Error Plot")
```

```{r}
# poma_cor <- PomaCorr(pre_processed, label_size = 8, coeff = 0.6)
# poma_cor$correlations
# poma_cor$corrplot
# poma_cor$graph
```

```{r}
# PomaCorr(pre_processed, corr_type = "glasso", coeff = 0.6)$graph
```

```{r}
# alpha = 1 for Lasso
# PomaLasso(pre_processed, alpha = 1, labels = TRUE)$coefficientPlot
```

```{r}
# poma_rf <- PomaRandForest(pre_processed, ntest = 10, nvar = 10)
# poma_rf$error_tree
```

```{r}
# poma_rf$confusionMatrix$table
```

```{r}
# poma_rf$MeanDecreaseGini_plot
```

# Session Information

```{r}
sessionInfo()
```

# References