--- title: "GenomicInteractionNodes Guide" author: "Jianhong Ou, Yu Xiang, Xiaolin Wei, and Yarui Diao" date: "`r doc_date()`" package: "`r pkg_ver('GenomicInteractionNodes')`" vignette: > %\VignetteIndexEntry{GenomicInteractionNodes Vignette} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} output: BiocStyle::html_document --- ```{r, echo=FALSE, results="hide", warning=FALSE, message=FALSE} library(GenomicInteractionNodes) library(rtracklayer) library(TxDb.Hsapiens.UCSC.hg19.knownGene) library(org.Hs.eg.db) knitr::opts_chunk$set(warning=FALSE, message=FALSE) ``` # Introduction The profiles of genome interactions can be categorized into three levels: loops, domains and compartments. Cis-regulatory elements (cREs) regulate the distal target gene expression through three-dimensional chromatin looping. Most of these loops occur within the boundaries of topologically associating domains (TADs). The TAD boundaries are enriched for insulator binding proteins. HiC analysis revealed the TADs can be segregated into an active (A) and inactive (B) compartment. TADs are megabase-sized chromosomal regions where the loop sizes may very from thousand bases to mega bases. We know that not all the loops within one TADs are involved in single binding platform. There are hundreds or even thousands binding platforms even in a single TAD. Batut et al. report the 'tethering element' help to establish long-range enhancer-promoter interactions. Tethering elements are the one kind of the binding platform which bring together enhancers and promoters for rapid gene activation. For each binding platform, multiple loops may work together to work as a single function, such as repression or promotion one or more distal target gene expression. That means multiple promoters and enhancers may bind together to perform as super enhancer or regulatory elements. We defined such kind of binding platform as genomic interaction node or interaction hot spot. The interaction node can be a tethering element working as node, or a super-enhancer (or super regulatory element region). It is a level of genome organization higher than loops but smaller than TADs. It is a kind of tethering element with multiple interaction loops. The interaction node has two attributes: 1. it must contain multiple interaction loops, 2. it regulates one or more target genes. To help users to define the interaction nodes, we developed the `Bioconductor` package: `GenomicInteractionNodes`. The `GenomicInteractionNodes` package will define the interaction node by testing the involved loops within one connected component by Poisson distribution. The annotated loops will be used for enrichment analysis. # Installation You can install the package via `devtools::install_github` from `github`. ```{r devtoolsInstallation, eval=FALSE} library(devtools) install_github("jianhong/GenomicInteractionNodes") ``` You can also try to install it via `BiocManager::install` when it is ready in Bioconductor. ```{r BiocManagerInstallation, eval=FALSE} library(BiocManager) install("GenomicInteractionNodes") ``` # Quick start Here is an example to use `GenomicInteractionNodes` to define interaction nodes and do downstream enrichment analysis. There are three steps, 1. `detectNodes`, define the nodes. 2. `annotateNodes`, annotate the nodes for downstream analysis. 3. `enrichmentAnalysis`, Gene Ontology enrichment analysis. ```{r quickStart} ## load library library(GenomicInteractionNodes) library(rtracklayer) library(TxDb.Hsapiens.UCSC.hg19.knownGene) ## for human hg19 library(org.Hs.eg.db) ## used to convert gene_id to gene_symbol library(GO.db) ## used for enrichment analysis ## import the interactions from bedpe file p <- system.file("extdata", "WT.2.bedpe", package = "GenomicInteractionNodes") #### please try to replace the connection to your file path interactions <- import(con=p, format="bedpe") ## define the nodes nodes <- detectNodes(interactions) names(nodes) lapply(nodes, head, n=2) ## annotate the nodes by gene promoters node_regions <- nodes$node_regions node_regions <- annotateNodes(node_regions, txdb=TxDb.Hsapiens.UCSC.hg19.knownGene, orgDb=org.Hs.eg.db, upstream=2000, downstream=500) head(node_regions, n=2) ## Gene Ontology enrichment analysis enrich <- enrichmentAnalysis(node_regions, orgDb=org.Hs.eg.db) names(enrich$enriched) names(enrich$enriched_in_compound) res <- enrich$enriched$BP res <- res[order(res$fdr), ] head(res, n=2) ``` # Session Info ```{r sessionInfo} sessionInfo() ```