--- title: "GWAS catalog: Phenotypes systematized by the experimental factor ontology" author: "Vincent J. Carey, stvjc at channing.harvard.edu" date: "August 2015" output: BiocStyle::html_document: highlight: pygments number_sections: yes toc: yes vignette: > %\VignetteEngine{knitr::rmarkdown} %\VignetteIndexEntry{gwascat -- GRanges for GWAS hits in EBI catalog} %\VignetteEncoding{UTF-8} --- ```{r style, echo = FALSE, results = 'asis'} suppressPackageStartupMessages({ library(gwascat) }) ``` # Introduction The EMBL-EBI curation of the GWAS catalog (originated at NHGRI) includes labelings of GWAS hit records with terms from the EBI Experimental Factor Ontology (EFO). The Bioconductor `r Biocpkg("gwascat")` package includes a `r Biocpkg("graph")` representation of the ontology and records the EFO assignments of GWAS results in its basic representations of the catalog. # Views of the EFO Term names are regimented. ```{r getg} library(gwascat) requireNamespace("graph") data(efo.obo.g) efo.obo.g sn = head(graph::nodes(efo.obo.g)) sn ``` The nodeData of the graph includes a `def` field. We will process that to create a data.frame. ```{r procdef} nd = graph::nodeData(efo.obo.g) alldef = sapply(nd, function(x) unlist(x[["def"]])) allnames = sapply(nd, function(x) unlist(x[["name"]])) alld2 = sapply(alldef, function(x) if(is.null(x)) return(" ") else x[1]) mydf = data.frame(id = names(allnames), concept=as.character(allnames), def=unlist(alld2)) ``` We can create an interactive data table for all terms, but for performance we limit the table size to terms involving the string 'autoimm'. ```{r limtab} limdf = mydf[ grep("autoimm", mydf$def, ignore.case=TRUE), ] requireNamespace("DT") suppressWarnings({ DT::datatable(limdf, rownames=FALSE, options=list(pageLength=5)) }) ``` # Graph operations The use of the graph representation allows various approaches to traversal and selection. Here we examine metadata for a term of interest, transform to an undirected graph, and obtain the adjacency list for that term. ```{r lkg} graph::nodeData(efo.obo.g, "EFO:0000540") ue = graph::ugraph(efo.obo.g) neighISD = graph::adj(ue, "EFO:0000540")[[1]] sapply(graph::nodeData(graph::subGraph(neighISD, efo.obo.g)), "[[", "name") ``` With RBGL we can compute paths to terms from root. ```{r lkggg} requireNamespace("RBGL") p = RBGL::sp.between( efo.obo.g, "EFO:0000685", "EFO:0000001") sapply(graph::nodeData(graph::subGraph(p[[1]]$path_detail, efo.obo.g)), "[[", "name") ``` # Connections to the GWAS catalog The `mcols` element of the `GRanges` instances provided by gwascat include mapped EFO terms and EFO URIs. ```{r lkef} data(ebicat_2020_04_30) names(S4Vectors::mcols(ebicat_2020_04_30)) adinds = grep("autoimmu", ebicat_2020_04_30$MAPPED_TRAIT) adgr = ebicat_2020_04_30[adinds] adgr S4Vectors::mcols(adgr)[, c("MAPPED_TRAIT", "MAPPED_TRAIT_URI")] ```