---
title: "Working with the Gene Ontology"
author:
  - name: Kevin Rue-Albrecht
    affiliation:
    - University of Oxford
    email: kevin.rue-albrecht@imm.ox.ac.uk
output:
  BiocStyle::html_document:
    self_contained: yes
    toc: true
    toc_float: true
    toc_depth: 2
    code_folding: show
date: "`r doc_date()`"
package: "`r pkg_ver('iSEEpathways')`"
vignette: >
  %\VignetteIndexEntry{Working with the Gene Ontology}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    crop = NULL ## Related to https://stat.ethz.ch/pipermail/bioc-devel/2020-April/016656.html
)
```

```{r, eval=!exists("SCREENSHOT"), include=FALSE}
SCREENSHOT <- function(x, ...) knitr::include_graphics(x)
```

```{r vignetteSetup, echo=FALSE, message=FALSE, warning = FALSE}
## Track time spent on making the vignette
startTime <- Sys.time()

## Bib setup
library("RefManageR")

## Write bibliography information
bib <- c(
    R = citation(),
    BiocStyle = citation("BiocStyle")[1],
    knitr = citation("knitr")[1],
    RefManageR = citation("RefManageR")[1],
    rmarkdown = citation("rmarkdown")[1],
    sessioninfo = citation("sessioninfo")[1],
    testthat = citation("testthat")[1],
    iSEEpathways = citation("iSEEpathways")[1]
)
```

# Scenario

In this vignette, we demonstrate how one may use the package `r BiocStyle::Biocpkg("GO.db")` to dynamically display additional information about selected pathways in the interactive user interface.

# Demonstration

## Example data

First, we generate pathway analysis results for simulated data using `r BiocStyle::Biocpkg("fgsea")`.

In particular, we use the package `r BiocStyle::Biocpkg("org.Hs.eg.db")` to fetch real gene sets.
To reduce the memory footprint, we retain only the gene sets associated with 15 to 500 genes.

Then, we simulate a score for each gene present in any of the remaining gene sets.
In practice, that score could be the log~2~ fold-change of the gene in a differential expression analysis (among other possibilities).

Finally, we perform an FGSEA on the simulated data.

```{r "start", message=FALSE, warning=FALSE}
library("org.Hs.eg.db")
library("fgsea")

# Example data ----

## Pathways
pathways <- select(org.Hs.eg.db, keys(org.Hs.eg.db, "SYMBOL"), c("GOALL"), keytype = "SYMBOL")
pathways <- subset(pathways, ONTOLOGYALL == "BP")
pathways <- unique(pathways[, c("SYMBOL", "GOALL")])
pathways <- split(pathways$SYMBOL, pathways$GOALL)
len_pathways <- lengths(pathways)
# keep gene sets with 15 to 500 genes (inclusive), matching minSize/maxSize below
pathways <- pathways[len_pathways >= 15 & len_pathways <= 500]

## Features
set.seed(1)
# simulate a score for all genes found across all pathways
feature_stats <- rnorm(length(unique(unlist(pathways))))
names(feature_stats) <- unique(unlist(pathways))
# arbitrarily select a pathway to simulate enrichment
pathway_id <- "GO:0046324"
pathway_genes <- pathways[[pathway_id]]
# increase score of genes in the selected pathway to simulate enrichment
feature_stats[pathway_genes] <- feature_stats[pathway_genes] + 1

# fgsea ----

set.seed(42)
fgseaRes <- fgsea(pathways = pathways, 
                  stats    = feature_stats,
                  minSize  = 15,
                  maxSize  = 500)
head(fgseaRes[order(pval), ])
```

Then, we embed the `r BiocStyle::Biocpkg("fgsea")` results in a `r BiocStyle::Biocpkg("SummarizedExperiment")` object.

In this case, we create an empty `?SummarizedExperiment-class` object, without any simulated count data or metadata, as we will not be using any of those data in this example.
We then embed the pathway analysis results in the newly created `?SummarizedExperiment-class` object.

Before embedding, we reorder the results by increasing p-value.
Although not essential, this implicitly defines the default ordering of the table in the live app.

```{r, message=FALSE, warning=FALSE}
library("SummarizedExperiment")
library("iSEEpathways")
se <- SummarizedExperiment()
fgseaRes <- fgseaRes[order(pval), ]
se <- embedPathwaysResults(fgseaRes, se, name = "fgsea", class = "fgsea", pathwayType = "GO")
```

## Pathway information

In this example, we configure the app option `PathwaysTable.select.details` to define a function that, given the identifier of the GO term currently selected in a panel, displays information about that GO term.

Although not essential, this is a user-friendly and immediate way to 'translate' machine-friendly database identifiers into human-friendly descriptions.

```{r, message=FALSE, warning=FALSE}
library("iSEE")
library("GO.db")
library("shiny")
go_details <- function(x) {
    # fetch the term name, ontology and definition of the selected GO identifier
    info <- select(GO.db, x, c("TERM", "ONTOLOGY", "DEFINITION"), "GOID")
    # first paragraph: identifier, term name and ontology
    html <- list(p(strong(info$GOID), ":", info$TERM, paste0("(", info$ONTOLOGY, ")")))
    # second paragraph: definition, only if one is available
    if (!is.na(info$DEFINITION)) {
        html <- append(html, list(p(info$DEFINITION)))
    }
    tagList(html)
}
se <- registerAppOptions(se, PathwaysTable.select.details = go_details)
```

## Live app

Finally, we configure the initial state of the app and launch the live app.

```{r, message=FALSE}
app <- iSEE(se, initial = list(
  PathwaysTable(ResultName="fgsea", Selected = "GO:0046324", PanelWidth = 12L)
))

if (interactive()) {
  shiny::runApp(app)
}
```

```{r, echo=FALSE, out.width="100%"}
SCREENSHOT("screenshots/gene_ontology.png", delay=20)
```

# Reproducibility

The `r Biocpkg("iSEEpathways")` package `r Citep(bib[["iSEEpathways"]])` was made possible thanks to:

* R `r Citep(bib[["R"]])`
* `r Biocpkg("BiocStyle")` `r Citep(bib[["BiocStyle"]])`
* `r CRANpkg("knitr")` `r Citep(bib[["knitr"]])`
* `r CRANpkg("RefManageR")` `r Citep(bib[["RefManageR"]])`
* `r CRANpkg("rmarkdown")` `r Citep(bib[["rmarkdown"]])`
* `r CRANpkg("sessioninfo")` `r Citep(bib[["sessioninfo"]])`
* `r CRANpkg("testthat")` `r Citep(bib[["testthat"]])`

This package was developed using `r BiocStyle::Biocpkg("biocthis")`.

Code for creating the vignette

```{r createVignette, eval=FALSE}
## Create the vignette
library("rmarkdown")
system.time(render("gene-ontology.Rmd", "BiocStyle::html_document"))

## Extract the R code
library("knitr")
knit("gene-ontology.Rmd", tangle = TRUE)
```

Date the vignette was generated.

```{r reproduce1, echo=FALSE}
## Date the vignette was generated
Sys.time()
```

Wallclock time spent generating the vignette.

```{r reproduce2, echo=FALSE}
## Processing time in seconds
totalTime <- diff(c(startTime, Sys.time()))
round(totalTime, digits = 3)
```

`R` session information.

```{r reproduce3, echo=FALSE}
## Session info
library("sessioninfo")
options(width = 120)
session_info()
```

# Bibliography

This vignette was generated using `r Biocpkg("BiocStyle")` `r Citep(bib[["BiocStyle"]])`
with `r CRANpkg("knitr")` `r Citep(bib[["knitr"]])` and `r CRANpkg("rmarkdown")` `r Citep(bib[["rmarkdown"]])` running behind the scenes.

Citations made with `r CRANpkg("RefManageR")` `r Citep(bib[["RefManageR"]])`.

```{r vignetteBiblio, results = "asis", echo = FALSE, warning = FALSE, message = FALSE}
## Print bibliography
PrintBibliography(bib, .opts = list(hyperlink = "to.doc", style = "html"))
```