---
title: "Introduction to decoupleR"
author:
  - name: Jesús Vélez
    affiliation:
    - National Autonomous University of Mexico
    email: jvelezmagic@gmail.com
output:
  BiocStyle::html_document:
    self_contained: true
    toc: true
    toc_float: true
    toc_depth: 3
    code_folding: show
package: "`r pkg_ver('decoupleR')`"
vignette: >
  %\VignetteIndexEntry{Introduction to decoupleR}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r chunk_setup, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)
```

```{r vignette_setup, echo=FALSE, message=FALSE, warning = FALSE}
# Track time spent on making the vignette.
start_time <- Sys.time()

# Bib setup.
library(RefManageR)

# Write bibliography information
bib <- c(
    R = citation(),
    BiocStyle = citation("BiocStyle")[1],
    knitr = citation("knitr")[1],
    rmarkdown = citation("rmarkdown")[1],
    sessioninfo = citation("sessioninfo")[1],
    testthat = citation("testthat")[1],
    RefManageR = citation("RefManageR")[1],
    decoupleR = citation("decoupleR")[1],
    GSVA = citation("GSVA")[1],
    viper = citation("viper")[1]
)
```

# Basics

## Install `decoupleR`

`R` is an open-source statistical environment which can be easily modified to enhance its functionality via packages. `r Biocpkg("decoupleR")` is an `R` package available via the [Bioconductor](http://bioconductor.org) repository for packages.
`R` can be installed on any operating system from [CRAN](https://cran.r-project.org/), after which you can install `r Biocpkg("decoupleR")` by using the following commands in your `R` session:

```{r bioconductor_install, eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}

BiocManager::install("decoupleR")

# Check that you have a valid Bioconductor installation
BiocManager::valid()
```

You can install the development version from [GitHub](https://github.com/) with:

```{r github_install, eval=FALSE}
# install.packages("devtools")
devtools::install_github("saezlab/decoupleR")
```

## Required knowledge

`r Biocpkg("decoupleR")` builds on many other packages, in particular those that implement the infrastructure needed for functional genomic analysis, such as `r Biocpkg("viper")` or `r Biocpkg("GSVA")`. The aim is to offer a centralized place from which to apply different statistics to the same data set, without the effort of testing each method in isolation. This also makes it possible to develop benchmarks that can grow easily.

## Asking for help

As package developers, we try to explain clearly how to use our packages and in which order to use the functions. But `R` and `Bioconductor` have a steep learning curve, so it is critical to learn where to ask for help. We would like to highlight the [Bioconductor support site](https://support.bioconductor.org/) as the main resource for getting help: remember to use the `decoupleR` tag and check [the older posts](https://support.bioconductor.org/t/decoupleR/). Other alternatives are available, such as creating GitHub issues and tweeting. However, please note that if you want to receive help you should adhere to the [posting guidelines](http://www.bioconductor.org/help/support/posting-guide/).
It is particularly critical that you provide a small reproducible example and your session information so package developers can track down the source of the error.

## Citing `decoupleR`

We hope that `r Biocpkg("decoupleR")` will be useful for your research. Please use the following information to cite the package and the overall approach. Thank you!

```{r decoupleR citation}
citation("decoupleR")
```

# Quick start to using `decoupleR`

## Libraries

`r Biocpkg("decoupleR")` provides different statistics to calculate regulatory activity given an expression `matrix` and a `network`. It incorporates pre-existing methods to avoid reinventing the wheel, while implementing its own methods under a common evaluation standard. This gives you flexibility when evaluating a data set with different statistics.

Since inputs and outputs are always tibbles (i.e. special data frames), incorporating `r Githubpkg("tidyverse/dplyr","dplyr")` into your workflow can be useful for manipulating results, but it is not necessary.

```{r load_library, message=FALSE}
library(decoupleR)
library(dplyr)
```

## Input data

To use `decoupleR`, you first need a `matrix` whose rows represent the target nodes and whose columns represent the different conditions in which they were evaluated. You also need to provide a `network` that contains at least two columns, corresponding to the source and target nodes. Note that certain methods require additional metadata columns, for instance the mode of regulation (MoR) or the likelihood of the interaction.

```{r read_example_data}
inputs_dir <- system.file("testdata", "inputs", package = "decoupleR")

mat <- file.path(inputs_dir, "input-expr_matrix.rds") %>%
    readRDS() %>%
    glimpse()

network <- file.path(inputs_dir, "input-dorothea_genesets.rds") %>%
    readRDS() %>%
    glimpse()
```

## How to decouple?

Once the data is loaded, you are one step away from achieving decoupling.
This step corresponds to specifying which statistics you want to run. For more information about the defined statistics and their parameters, you can execute `?decouple`.

```{r usage-decouple_function, message=TRUE}
decouple(
    mat = mat,
    network = network,
    .source = "tf",
    .target = "target",
    statistics = c("gsva", "mean", "pscira", "scira", "viper", "ora"),
    args = list(
        gsva = list(verbose = FALSE),
        mean = list(.mor = "mor", .likelihood = "likelihood"),
        pscira = list(.mor = "mor"),
        scira = list(.mor = "mor"),
        viper = list(
            .mor = "mor",
            .likelihood = "likelihood",
            verbose = FALSE
        ),
        ora = list()
    )
) %>%
    glimpse()
```

Done! You have applied different statistics to the same data set and can now analyze them at your convenience, for example, by performing a [benchmark]().

## How does it work?

### Mapping statistics with arguments

Internally, `decouple()` works through `purrr::map2_dfr()` to map statistics to their arguments. This has some important consequences:

- `statistics` and `args` must be vectors of the same length, except that a vector of length 1 is recycled. The **match** is therefore performed by **position**, not by name. Using named vectors is still a good idea, because it makes your intentions clear.
- It is easy to lose track of which statistic is being run with which arguments. To get around this, `decouple()` works with expressions that are evaluated later. In particular, it generates a toy call that represents what it is trying to run. You can display it with the option `show_toy_call = TRUE`.
- If an error occurs, copy the last line that was displayed with `show_toy_call = TRUE` and execute it locally. Fix the problem there and then correct the original call.

Based on the previous points, you can take the generated toy calls and execute them independently, obtaining the same results as if you were executing `decouple()`. As an example, look at the internal GSVA calls and save the results.
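Before running the GSVA calls, the positional matching described above can be sketched with `purrr::map2_chr()` on its own. This is only an illustration and not the code inside `decouple()`: the anonymous helper that pastes the call text together is hypothetical, and no statistic is actually executed.

```{r positional_matching_sketch}
library(purrr)

# One statistic and two argument lists, as in the GSVA example.
statistics <- c("gsva")
args <- list(
    gsva_default = list(verbose = FALSE),
    gsva_minsize = list(verbose = FALSE, ssgsea.norm = FALSE)
)

# map2_chr() pairs elements by position and recycles the length-1 vector.
# The helper below (hypothetical) only builds a text version of each call.
toy_calls <- map2_chr(statistics, args, function(stat, arg) {
    arg_text <- paste(names(arg), unlist(arg), sep = " = ", collapse = ", ")
    paste0("run_", stat, "(", arg_text, ")")
})

unname(toy_calls)
```

Both strings start with `run_gsva(` because the single statistic is recycled across the two argument lists. The names `gsva_default` and `gsva_minsize` play no role in the pairing. The `decouple()` call below relies on the same behaviour.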
```{r see gsvas individual calls}
gsvas_res <- decouple(
    mat = head(mat, 5000),
    network = network,
    .source = "tf",
    .target = "target",
    statistics = c("gsva"),
    args = list(
        gsva_default = list(verbose = FALSE),
        gsva_minsize = list(verbose = FALSE, ssgsea.norm = FALSE)
    ),
    show_toy_call = TRUE
)
```

Run the same calls as the ones displayed by setting `show_toy_call = TRUE`.

```{r run_individual_gsvas}
gsva_1 <- run_gsva(
    mat = head(mat, 5000),
    network = network,
    .source = "tf",
    .target = "target",
    verbose = FALSE
)

gsva_2 <- run_gsva(
    mat = head(mat, 5000),
    network = network,
    .source = "tf",
    .target = "target",
    verbose = FALSE,
    ssgsea.norm = FALSE
)

gsvas_res_2 <- bind_rows(gsva_1, gsva_2, .id = "run_id")
```

Now compare the results and confirm that there is no difference.

```{r see_not_differences}
all.equal(gsvas_res, gsvas_res_2)
```

### Mapping network columns

To carry out the column mapping, `decoupleR` relies on the selection helpers provided by the `r Githubpkg("tidyverse/tidyselect","tidyselect")` package. Some of the selection styles it provides are:

- Symbols
- Strings
- Position

Let's see an example. The input network has the following columns:

```{r show_columns}
network %>%
    colnames()
```

You can use whichever style you prefer to do the mapping, or even a combination of them. This applies not only to the `decouple()` function, but to all functions of the `decoupleR statistics` family, identifiable by the `run_` prefix.
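The three selection styles listed above can be tried on a small table before using them in `decouple()`. The sketch below uses `dplyr::select()`, which follows the same tidyselect rules. Note that `toy_network` and its values are made up for illustration and are not part of the package data.

```{r tidyselect_sketch}
library(dplyr)

# Hypothetical toy network, not the example data shipped with the package.
toy_network <- tibble(
    tf = c("TF1", "TF1", "TF2"),
    target = c("G1", "G2", "G1"),
    mor = c(1, -1, 1)
)

by_symbol <- select(toy_network, tf) # bare column name
by_string <- select(toy_network, "tf") # column name as a string
by_position <- select(toy_network, 1) # column index

# A column name stored in a variable can be unquoted with `!!`.
column_name <- "target"
by_variable <- select(toy_network, !!column_name)

# All three styles pick the same column.
identical(by_symbol, by_string)
identical(by_symbol, by_position)
colnames(by_variable)
```

The `decouple()` call below mixes these styles in a single call.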
```{r}
this_column <- "target"

viper_res <- decouple(
    mat = mat,
    network = network,
    .source = tf, # selected by symbol
    .target = !!this_column, # selected by a string stored in a variable
    statistics = c("viper"),
    args = list(
        viper = list(
            .mor = 4, # selected by position
            .likelihood = "likelihood", # selected by string
            verbose = FALSE
        )
    ),
    show_toy_call = TRUE
)
```

# Reproducibility

## Special thanks

The `r Biocpkg("decoupleR")` package `r Citep(bib[["decoupleR"]])` was made possible thanks to:

* R `r Citep(bib[["R"]])`
* `r Biocpkg("BiocStyle")` `r Citep(bib[["BiocStyle"]])`
* `r CRANpkg("RefManageR")` `r Citep(bib[["RefManageR"]])`
* `r CRANpkg("knitr")` `r Citep(bib[["knitr"]])`
* `r CRANpkg("rmarkdown")` `r Citep(bib[["rmarkdown"]])`
* `r CRANpkg("sessioninfo")` `r Citep(bib[["sessioninfo"]])`
* `r CRANpkg("testthat")` `r Citep(bib[["testthat"]])`
* `r Biocpkg("GSVA")` `r Citep(bib[["GSVA"]])`
* `r Biocpkg("viper")` `r Citep(bib[["viper"]])`

This package was developed using `r BiocStyle::Githubpkg("lcolladotor/biocthis")`.

## Vignette

### Create

```{r create_vignette, eval=FALSE}
# Create the vignette
library(rmarkdown)
system.time(render("decoupleR.Rmd", "BiocStyle::html_document"))

# Extract the R code
library(knitr)
knit("decoupleR.Rmd", tangle = TRUE)
```

### Wallclock time spent generating the vignette

```{r reproduce_time, echo=FALSE}
# Processing time in seconds
total_time <- diff(c(start_time, Sys.time()))
round(total_time, digits = 3)
```

## Session information

```{r session_info, echo=FALSE}
options(width = 120)
sessioninfo::session_info()
```

# Bibliography

This vignette was generated using `r Biocpkg("BiocStyle")` `r Citep(bib[["BiocStyle"]])` with `r CRANpkg("knitr")` `r Citep(bib[["knitr"]])` and `r CRANpkg("rmarkdown")` `r Citep(bib[["rmarkdown"]])` running behind the scenes.

Citations made with `r CRANpkg('RefManageR')` `r Citep(bib[['RefManageR']])`.

```{r vignetteBiblio, results = "asis", echo = FALSE, warning = FALSE, message = FALSE}
## Print bibliography
PrintBibliography(bib, .opts = list(hyperlink = "to.doc", style = "html"))
```