--- title: "miRsponge: identification and analysis of miRNA sponge interaction networks and modules" author: "\\ Junpeng Zhang ()\\ School of Engineering, Dali University" date: '`r Sys.Date()`' output: BiocStyle::html_document: toc: yes BiocStyle::pdf_document: toc: yes vignette: > %\VignetteIndexEntry{miRsponge: identification and analysis of miRNA sponge interaction networks and modules} %\VignettePackage{miRsponge} % \VignetteEngine{knitr::rmarkdown} % \usepackage[utf8]{inputenc} % \VignetteEncoding{UTF-8} --- ```{r style, echo=FALSE, results="asis", message=FALSE} BiocStyle::markdown() knitr::opts_chunk$set(tidy = FALSE, warning = FALSE, message = FALSE) ``` ```{r echo=FALSE, results='hide', message=FALSE} library(miRsponge) ``` # Introduction MicroRNAs (miRNAs) are small non-coding RNAs (~22 nt) that control a wide range of biological processes including cancers via regulating target genes [1-5]. Therefore, it is important to uncover miRNA functions and regulatory mechanisms in cancers. The emergence of competing endogenous RNA (ceRNA) hypothesis [6] has challenged the traditional knowledge that coding RNAs only act as targets of miRNAs. Actually, a pool of coding and non-coding RNAs that shares common miRNA biding sites competes with each other, thus act as ceRNAs to release coding RNAs from miRNAs control. These ceRNAs are also known as miRNA sponges or miRNA decoys, and include long non-coding RNAs (lncRNAs), pseudogenes, circular RNAs (circRNAs) and messenger RNAs (mRNAs), etc [7-10]. Recent studies [11, 12] have shown that miRNA sponge network and module can help to reveal the biological mechanism in cancer. To accelerate the research of miRNA sponge, we develop an R package 'miRsponge' to implement popular methods in the identification and analysis of miRNA sponge network and module. # Identification of miRNA sponge interactions In 'spongeMethod' function, We implement seven popular methods (miRHomology [13, 14], pc [15, 16], sppc [17], ppc [11], hermes [18], muTaME [19], and cernia [20]) to identify miRNA sponge interactions. The seven methods should meet a basic condition: the significance of sharing of miRNAs by each RNA-RNA pair (e.g. adjusted p-value < 0.01). Each method has its own merit due to different evaluating indicators. Thus, we present an integrate method to combine predicted miRNA sponge interactions from different methods. ## miRHomology We implement miRHomology method based on the homology of miRNA binding sites present in the 3'UTR sequence. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") head(miRHomologyceRInt) ``` ## pc The pc function considers expression data based on miRHomology method. Significantly positive miRNA sponge interaction pairs are regarded as output. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRsponge") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",") pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc") head(pcceRInt) ``` ## sppc We implement sppc method based on sensitivity correlation (difference between the Pearson and partial correlation coefficients). ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRsponge") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",") sppcceRInt <- spongeMethod(miRTarget, ExpData, senscorcutoff = 0.1, method = "sppc") head(sppcceRInt) ``` ## hermes The hermes method predicts competing endogenous RNAs via evidence for competition for miRNA regulation based on conditional mutual information. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRsponge") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",") hermesceRInt <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "hermes") head(hermesceRInt) ``` Parameter 'num_perm' is used to set the number of permutations of input expression data. The larger the number is, the slower the calculation is. ## ppc The ppc method is a variant of the hermes method. However, it predicts competing endogenous RNAs via evidence for competition for miRNA regulation based on Partial Pearson Correlation. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRsponge") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",") ppcceRInt <- spongeMethod(miRTarget, ExpData, num_perm = 10, method = "ppc") head(ppcceRInt) ``` Parameter 'num_perm' is used to set the number of permutations of input expression data. The larger the number is, the slower the calculation is. ## muTaME We implement the muTaME method based on the logarithm of four scores: (1) the fraction of common miRNAs, (2) the density of the MREs for all shared miRNAs, (3) the distribution of MREs of the putative RNA-RNA pairs and (4) the relation between the overall number of MREs for a putative miRNA sponge compared with the number of miRNAs that yield these MREs. There is no reason to decide which score has more contribution than the rest. Thus, we calculate a combined score by adding these four scores. To evaluate the strength of each RNA-RNA pair, we further normalize the combined scores to obtain normalized scores with interval [0 1]. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") MREs <- system.file("extdata", "MREs.csv", package="miRsponge") mres <- read.csv(MREs, header=TRUE, sep=",") muTaMEceRInt <- spongeMethod(miRTarget, mres = mres, method = "muTaME") head(muTaMEceRInt) ``` ## cernia We implement the cernia method based on the logarithm of seven scores: (1) the fraction of common miRNAs, (2) the density of the MREs for all shared miRNAs, (3) the distribution of MREs of the putative RNA-RNA pairs, (4) the relation between the overall number of MREs for a putative miRNA sponge compared with the number of miRNAs that yield these MREs, (5) the density of the hybridization energies related to MREs for all shared miRNAs, (6) the DT-Hybrid recommendation scores and (7) the pairwise Peason correlation between putative RNA-RNA pair expression data. There is no reason to decide which score has more contribution than the rest. Thus, we calculate a combined score by adding these seven scores. To evaluate the strength of each RNA-RNA pair, we further normalize the combined scores to obtain normalized scores with interval [0 1]. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") MREs <- system.file("extdata", "MREs.csv", package="miRsponge") mres <- read.csv(MREs, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRsponge") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",") cerniaceRInt <- spongeMethod(miRTarget, ExpData, mres, method = "cernia") head(cerniaceRInt) ``` ## integrateMethod To obtain a list of high-confidence miRNA sponge interactions, we implement 'integrateMethod' function to integrate results of different methods. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRsponge") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc") sppcceRInt <- spongeMethod(miRTarget, ExpData, method = "sppc") Interlist <- list(miRHomologyceRInt[, 1:2], pcceRInt[, 1:2], sppcceRInt[, 1:2]) IntegrateceRInt <- integrateMethod(Interlist, Intersect_num = 2) head(IntegrateceRInt) ``` Parameter 'Intersect_num' is used to set the least number of methods intersected for integration. That is to say, we only reserve those miRNA sponge interactions predicted by at least 'Intersect_num' methods. # Validation of miRNA sponge interactions To validate the predicted miRNA sponge interactions, we implement 'spongeValidate' function. The groundtruth of miRNA sponge interactions are from miRSponge () and the experimentally validated miRNA sponge interactions of related literatures. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") Groundtruthcsv <- system.file("extdata", "Groundtruth.csv", package="miRsponge") Groundtruth <- read.csv(Groundtruthcsv, header=TRUE, sep=",") spongenetwork_validated <- spongeValidate(miRHomologyceRInt[, 1:2], directed = FALSE, Groundtruth) spongenetwork_validated ``` # Module identification from miRNA sponge interaction network To further understand the module-level properties of miRNA sponges in cancer, we implement 'netModule' function to identify miRNA sponge modules. Users can choose FN [21], MCL [22], LINKCOMM [23] and MCODE [24] for module identification. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") spongenetwork_Cluster <- netModule(miRHomologyceRInt[, 1:2], modulesize = 2) spongenetwork_Cluster ``` # Disease and functional enrichment analysis of miRNA sponge modules We implement 'moduleDEA' function to make disease enrichment analysis of modules. The disease databases used include DO: Disease Ontology database (), DGN: DisGeNET database () and NCG: Network of Cancer Genes database (). Moreover, 'moduleFEA' function is implemented to conduct functional GO, KEGG and Reactome enrichment analysis of modules. The ontology databases used contain GO: Gene Ontology database (), KEGG: Kyoto Encyclopedia of Genes and Genomes Pathway Database (), and Reactome: Reactome Pathway Database (). ```{r, eval=FALSE, include=TRUE} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") miRHomologyceRInt <- spongeMethod(miRTarget, method = "miRHomology") spongenetwork_Cluster <- netModule(miRHomologyceRInt[, 1:2], modulesize = 2) sponge_Module_DEA <- moduleDEA(spongenetwork_Cluster) sponge_Module_FEA <- moduleFEA(spongenetwork_Cluster) ``` # Survival analysis of miRNA sponge modules To make survival analysis of miRNA sponge modules, we implement 'moduleSurvival' function. ```{r} miR2Target <- system.file("extdata", "miR2Target.csv", package="miRsponge") miRTarget <- read.csv(miR2Target, header=TRUE, sep=",") ExpDatacsv <- system.file("extdata", "ExpData.csv", package="miRsponge") ExpData <- read.csv(ExpDatacsv, header=FALSE, sep=",") SurvDatacsv <- system.file("extdata", "SurvData.csv", package="miRsponge") SurvData <- read.csv(SurvDatacsv, header=TRUE, sep=",") pcceRInt <- spongeMethod(miRTarget, ExpData, method = "pc") spongenetwork_Cluster <- netModule(pcceRInt[, 1:2], modulesize = 2) sponge_Module_Survival <- moduleSurvival(spongenetwork_Cluster, ExpData, SurvData, devidePercentage=.5) sponge_Module_Survival ``` Parameter 'devidePercentage' is used to set the percentage of high risk group. # Conclusions miRsponge provides several functions to study miRNA sponge (also called ceRNA or miRNA decoy), including popular methods for identifying miRNA sponge interactions, and the integrative method to integrate miRNA sponge interactions from different methods, as well as the functions to validate miRNA sponge interactions, and infer miRNA sponge modules, conduct enrichment analysis of modules, and conduct survival analysis of modules. It could provide a useful tool for the research of miRNA sponges. # References [1] Ambros V. microRNAs: tiny regulators with great potential. Cell, 2001, 107:823–6. [2] Bartel DP. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell, 2004, 116:281–97. [3] Du T, Zamore PD. Beginning to understand microRNA function. Cell Research, 2007, 17:661–3. [4] Esquela-Kerscher A, Slack FJ. Oncomirs—microRNAs with a role in cancer. Nature Reviews Cancer, 2006, 6:259–69. [5] Lin S, Gregory RI. MicroRNA biogenesis pathways in cancer. Nature Reviews Cancer, 2015, 15:321–33. [6] Salmena L, Poliseno L, Tay Y, et al. A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language? Cell, 2011, 146(3):353-8. [7] Cesana M, Cacchiarelli D, Legnini I, et al. A long noncoding RNA controls muscle differentiation by functioning as a competing endogenous RNA. Cell, 2011, 147:358–69. [8] Poliseno L, Salmena L, Zhang J, et al. A coding-independent function of gene and pseudogene mRNAs regulates tumour biology. Nature, 2010, 465:1033–8. [9] Hansen TB, Jensen TI, Clausen BH, et al. Natural RNA circles function as efficient microRNA sponges. Nature, 2013, 495:384–8. [10] Memczak S, Jens M, Elefsinioti A, et al. Circular RNAs are a large class of animal RNAs with regulatory potency. Nature, 2013, 495:333–8. [11] Le TD, Zhang J, Liu L, et al. Computational methods for identifying miRNA sponge interactions. Brief Bioinform., 2017, 18(4):577-590. [12] Tay Y, Rinn J, Pandolfi PP. The multilayered complexity of ceRNA crosstalk and competition. Nature, 2014, 505:344–52. [13] Li JH, Liu S, Zhou H, et al. starBase v2.0: decoding miRNA-ceRNA, miRNA-ncRNA and protein-RNA interaction networks from large-scale CLIP-Seq data. Nucleic Acids Res., 2014, 42(Database issue):D92-7. [14] Sarver AL, Subramanian S. Competing endogenous RNA database. Bioinformation, 2012, 8(15):731-3. [15] Zhou X, Liu J, Wang W, Construction and investigation of breast-cancer-specific ceRNA network based on the mRNA and miRNA expression data. IET Syst Biol., 2014, 8(3):96-103. [16] Xu J, Li Y, Lu J, et al. The mRNA related ceRNA-ceRNA landscape and significance across 20 major cancer types. Nucleic Acids Res., 2015, 43(17):8169-82. [17] Paci P, Colombo T, Farina L, Computational analysis identifies a sponge interaction network between long non-coding RNAs and messenger RNAs in human breast cancer. BMC Syst Biol., 2014, 8:83. [18] Sumazin P, Yang X, Chiu HS, et al. An extensive microRNA-mediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma. Cell, 2011, 147(2):370-81. [19] Tay Y, Kats L, Salmena L, et al. Coding-independent regulation of the tumor suppressor PTEN by competing endogenous mRNAs. Cell, 2011, 147(2):344-57. [20] Sardina DS, Alaimo S, Ferro A, et al. A novel computational method for inferring competing endogenous interactions. Brief Bioinform., 2017, 18(6):1071-1081. [21] Clauset A, Newman ME, Moore C. Finding community structure in very large networks. Phys Rev E Stat Nonlin Soft Matter Phys., 2004, 70(6 Pt 2):066111. [22] Enright AJ, Van Dongen S, Ouzounis CA. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res., 2002, 30(7):1575-84. [23] Kalinka AT, Tomancak P. linkcomm: an R package for the generation, visualization, and analysis of link communities in networks of arbitrary size and type. Bioinformatics, 2011, 27(14):2011-2. [24] Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 2003, 4:2. # Session information ```{r} sessionInfo() ```