SpidermiRquery: Searching network
SpidermiRdownload: Downloading network data
SpidermiRdownload_net: Download network
SpidermiRdownload_miRNAprediction: Downloading miRNA predicted data target
SpidermiRdownload_miRNAvalidate: Downloading miRNA validated data target
SpidermiRdownload_miRNAextra_cir: Download Extracellular Circulating microRNAs
SpidermiRdownload_pharmacomir: Download Pharmaco-miR Verified Sets from PharmacomiR database
SpidermiRprepare: Preparing the data
SpidermiRanalyze: Analyze data from network data
SpidermiRanalyze_mirnanet_pharm: Integration of pharmacomiR in the network
SpidermiRanalyze_direct_net: Searching by biomarkers of interest with direct interaction
SpidermiRanalyze_direct_subnetwork: Network composed by only the nodes in a set of biomarkers of interest
SpidermiRanalyze_subnetwork_neigh: Network composed by the nodes in the list of BI and all the edges among this bunch of nodes
SpidermiRanalyze_degree_centrality: Ranking degree centrality genes
SpidermiRanalyze_Community_detection: Find community detection
SpidermiRanalyze_Community_detection_net: Community detection
SpidermiRanalyze_Community_detection_bi: Community detection from a set of biomarkers of interest
SpidermiRvisualize: To visualize the network
SpidermiRvisualize_mirnanet: To visualize the network
SpidermiRvisualize_BI: To visualize the network from a set of BI
SpidermiRvisualize_direction: To visualize the network
SpidermiRvisualize_plot_target: Visualize the plot with miRNAs and the number of their targets in the network
SpidermiRvisualize_degree_dist: Plots the degree distribution of the network
SpidermiRvisualize_adj_matrix: Plots the adjacency matrix of the network
SpidermiRvisualize_3Dbarplot: 3D barplot
Features databases SpidermiR
Biological systems are composed of multiple layers of dynamic interaction networks. These networks can be decomposed, for example, into co-expression, physical, co-localization, genetic, pathway, and shared protein domain networks.
GeneMania provides a large collection of data sets for interaction network studies (Warde-Farley D, Donaldson S, Comes O, Zuberi K, Badrawi R, and others 2010). The data, drawn from different databases, can be accessed and downloaded through a web portal, but there is currently no R package to query and download them.
An important regulatory mechanism in these networks involves microRNAs (miRNAs). miRNAs are involved in various cellular functions, such as differentiation, proliferation, and tumourigenesis. However, our understanding of the processes regulated by miRNAs is currently limited, and integrating miRNA data into these networks enables a comprehensive genome-scale analysis of miRNA regulatory networks. At present, GeneMania does not integrate information on miRNAs and their interactions into the network.
SpidermiR allows the user to query, prepare, and download network data (e.g. from GeneMania), to integrate this information with miRNA data, and to analyze the downloaded data, all within a single R package.
This technical report gives a short overview of the essential SpidermiR methods and their application.
To install the package, use the code below.
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("SpidermiR")
SpidermiRquery: Searching network
You can easily search GeneMania data using the SpidermiRquery functions.
SpidermiRquery_species: Searching by species
The user can query the species supported by GeneMania using the function SpidermiRquery_species:
org<-SpidermiRquery_species(species)
The list of species is shown below:
Index | Species |
---|---|
1 | Arabidopsis_thaliana |
2 | Caenorhabditis_elegans |
3 | Danio_rerio |
4 | Drosophila_melanogaster |
5 | Escherichia_coli |
6 | Homo_sapiens |
7 | Mus_musculus |
8 | Rattus_norvegicus |
9 | Saccharomyces_cerevisiae |
SpidermiRquery_networks_type: Searching by network categories
The user can query the network types supported by GeneMania for a specific species using the function SpidermiRquery_networks_type. The species is selected with an index obtained from SpidermiRquery_species (e.g. organismID=org[6,] is the input for Homo_sapiens, and organismID=org[9,] is the input for Saccharomyces_cerevisiae).
net_type<-SpidermiRquery_networks_type(organismID=org[9,])
The list of network categories in Saccharomyces cerevisiae is shown below:
## [1] "Co-localization" "Predicted"
## [3] "Co-expression" "Physical Interactions"
## [5] "Genetic Interactions" "Shared protein domains"
## [7] "Other"
SpidermiRquery_spec_networks: Searching by species and network categories
You can filter the search by species, using the organism ID reported above, and by network category. The network category is selected with the network parameter (here "SHpd" for shared protein domains):
net_shar_prot<-SpidermiRquery_spec_networks(organismID = org[9,],
                                            network = "SHpd")
The output of this step is the list of databases from which the data are collected. An example is shown below (for shared protein domains in Saccharomyces_cerevisiae, the data are collected from INTERPRO and PFAM):
## [1] "http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.INTERPRO.txt"
## [2] "http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.PFAM.txt"
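The two URLs above share a visible pattern: a base address, the species, and then the network category and source database joined by a dot. As a rough sketch only, the pattern could be written as follows. The helper `build_network_url` is hypothetical and not part of SpidermiR, and the pattern is inferred solely from the two URLs shown, so other categories may not follow it.

```r
# Hypothetical helper (not a SpidermiR function): composes a GeneMania file
# URL following the pattern seen in the query output above.
build_network_url <- function(species, category, database,
                              base = "http://genemania.org/data/current") {
  paste0(base, "/", species, "/", category, ".", database, ".txt")
}

# paste0 is vectorised, so two databases give two URLs
urls <- build_network_url("Saccharomyces_cerevisiae",
                          "Shared_protein_domains",
                          c("INTERPRO", "PFAM"))
print(urls)
```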
SpidermiRdownload: Downloading network data
In this step the user can download the data previously queried.
SpidermiRdownload_net: Download network
The user can download the previously queried data with SpidermiRdownload_net.
out_net<-SpidermiRdownload_net(net_shar_prot)
## [1] "Downloading: http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.INTERPRO.txt ... reference n. 1 of 2"
## [1] "Downloading: http://genemania.org/data/current/Saccharomyces_cerevisiae/Shared_protein_domains.PFAM.txt ... reference n. 2 of 2"
The list returned by SpidermiRdownload_net is shown below:
## List of 2
## $ :'data.frame': 47523 obs. of 3 variables:
## ..$ Gene_A: chr [1:47523] "Q0050" "Q0050" "Q0055" "Q0050" ...
## ..$ Gene_B: chr [1:47523] "Q0055" "Q0060" "Q0060" "Q0065" ...
## ..$ Weight: num [1:47523] 0.39 0.09 0.15 0.09 0.15 0.23 0.1 0.17 0.18 0.18 ...
## $ :'data.frame': 30228 obs. of 3 variables:
## ..$ Gene_A: chr [1:30228] "Q0050" "Q0055" "Q0055" "Q0060" ...
## ..$ Gene_B: chr [1:30228] "Q0055" "Q0060" "Q0065" "Q0065" ...
## ..$ Weight: num [1:30228] 0.39 0.12 0.12 0.22 0.14 0.14 0.14 0.15 0.15 0.34 ...
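Since the result is a plain list with one data frame per source database, it can be handled with ordinary base R tools. The sketch below uses a small toy list that only mimics the structure printed above (the name `out_net_toy` and the `source` column are illustrative assumptions); it counts the interactions per network and stacks them into a single edge table.

```r
# Toy stand-in for the list returned by SpidermiRdownload_net:
# one data frame per source database, each with Gene_A, Gene_B, Weight.
out_net_toy <- list(
  data.frame(Gene_A = c("Q0050", "Q0050"), Gene_B = c("Q0055", "Q0060"),
             Weight = c(0.39, 0.09), stringsAsFactors = FALSE),
  data.frame(Gene_A = c("Q0050", "Q0055"), Gene_B = c("Q0055", "Q0060"),
             Weight = c(0.39, 0.12), stringsAsFactors = FALSE)
)

# Number of interactions in each downloaded network
n_edges <- vapply(out_net_toy, nrow, integer(1))

# Stack all networks into one table, tagging each row with the index
# of the list element it came from (illustrative 'source' column)
all_edges <- do.call(rbind, Map(function(df, i) cbind(df, source = i),
                                out_net_toy, seq_along(out_net_toy)))
print(n_edges)
print(all_edges)
```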
SpidermiRdownload_miRNAprediction: Downloading miRNA predicted data target
The user can download predicted miRNA-gene targets from 4 databases: DIANA, Miranda, PicTar and TargetScan.
mirna<-c('hsa-miR-567','hsa-miR-566')
SpidermiRdownload_miRNAprediction(mirna_list=mirna)
SpidermiRdownload_miRNAextra_cir: Download Extracellular Circulating microRNAs
The user can download extracellular circulating miRNAs from the miRandola database.
list_circ<-SpidermiRdownload_miRNAextra_cir(miRNAextra_cir)
SpidermiRdownload_pharmacomir: Download Pharmaco-miR Verified Sets from PharmacomiR database
The user can download Pharmaco-miR Verified Sets from the PharmacomiR database (Rukov J, Wilentzik R, Jaffe I, Vinther J, Shomron N. 2013).
mir_pharmaco<-SpidermiRdownload_pharmacomir(pharmacomir=pharmacomir)
SpidermiRprepare: Preparing the data
SpidermiRprepare_NET: Prepare matrix of gene network with Ensembl Gene ID, and gene symbols
SpidermiRprepare_NET reads the network data from SpidermiRdownload_net and prepares them for downstream analysis. In particular, it builds the gene network matrix by mapping Ensembl Gene IDs to gene symbols. Gene symbols are needed to integrate miRNA data.
geneSymb_net<-SpidermiRprepare_NET(organismID = org[9,],
data = out_net)
## [1] "Preprocessing of the network n. 1 of 2"
## [1] "Preprocessing of the network n. 2 of 2"
The network with gene symbols is shown below:
Gene_A | Gene_B | Weight | gene_symbolA | gene_symbolB |
---|---|---|---|---|
Q0050 | Q0055 | 0.39 | AI1 | AI2 |
Q0050 | Q0060 | 0.09 | AI1 | AI3 |
Q0055 | Q0060 | 0.15 | AI2 | AI3 |
Q0050 | Q0065 | 0.09 | AI1 | AI4 |
Q0055 | Q0065 | 0.15 | AI2 | AI4 |
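The mapping step can be pictured as a table lookup. The following base R sketch is not how SpidermiRprepare_NET is implemented; it only illustrates the idea with a toy lookup table (`id_map`, a hypothetical name) built from the identifiers and symbols shown in the table above.

```r
# Toy edge table using identifiers from the table above
net <- data.frame(Gene_A = c("Q0050", "Q0050", "Q0055"),
                  Gene_B = c("Q0055", "Q0060", "Q0060"),
                  Weight = c(0.39, 0.09, 0.15),
                  stringsAsFactors = FALSE)

# Toy lookup table (assumption: the real mapping is supplied by the package)
id_map <- data.frame(id     = c("Q0050", "Q0055", "Q0060"),
                     symbol = c("AI1", "AI2", "AI3"),
                     stringsAsFactors = FALSE)

# match() gives, for each ID, its row in the lookup table (NA if unmapped)
net$gene_symbolA <- id_map$symbol[match(net$Gene_A, id_map$id)]
net$gene_symbolB <- id_map$symbol[match(net$Gene_B, id_map$id)]
print(net)
```

Identifiers without an entry in the lookup table would receive NA, which makes unmapped genes easy to spot before integrating miRNA data.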
SpidermiRanalyze: Analyze data from network data
SpidermiRanalyze_mirnanet_pharm: Integration of pharmacomiR in the network
In this step the user can integrate the pharmacomiR database in order to link miRNAs and drug effects in a specific network.
miRNA_NET <-data.frame(V1=c('hsa-let-7a','CASP3'),V2=c('CASP3','TAMOXIFEN'),stringsAsFactors=FALSE)
mir_pharmnet<-SpidermiRanalyze_mirnanet_pharm(mir_ph=mir_pharmaco,net=miRNA_NET)
SpidermiRanalyze_direct_net: Searching by biomarkers of interest with direct interaction
Starting from a set of biomarkers of interest (BI) given by the user (genes, miRNAs, or both), this function finds the sub-network that includes all direct interactions involving at least one of the BI.
biomark_of_interest<-c("hsa-let-7a","CDC34","hsa-miR-27a","PEX7","EPT1","FOX","hsa-miR-5a")
miRNA_NET <-data.frame(V1=c('hsa-let-7a','CASP3','BRCA','hsa-miR-7a','hsa-miR-5a','SMAD','SOX'),V2=c('CASP3','TAMOXIFEN','MYC','PTEN','FOX','HIF1','P53'),stringsAsFactors=FALSE)
GIdirect_net<-SpidermiRanalyze_direct_net(data=miRNA_NET,BI=biomark_of_interest)
## [1] "CDC34 is not in the network or please check the correct name"
## [1] "hsa-miR-27a is not in the network or please check the correct name"
## [1] "PEX7 is not in the network or please check the correct name"
## [1] "EPT1 is not in the network or please check the correct name"
The data frame returned by SpidermiRanalyze_direct_net, GIdirect_net, is shown below:
## 'data.frame': 2 obs. of 2 variables:
## $ V1: chr "hsa-let-7a" "hsa-miR-5a"
## $ V2: chr "CASP3" "FOX"
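The selection rule can be reproduced in a few lines of base R. This is only an illustration of the filtering logic, assuming the network is a two-column edge list as in the example; it is not the package's implementation.

```r
biomark_of_interest <- c("hsa-let-7a", "CDC34", "hsa-miR-27a", "PEX7",
                         "EPT1", "FOX", "hsa-miR-5a")
miRNA_NET <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE)

# Keep every edge with at least one endpoint among the biomarkers
keep <- miRNA_NET$V1 %in% biomark_of_interest |
        miRNA_NET$V2 %in% biomark_of_interest
direct_edges <- miRNA_NET[keep, ]

# Biomarkers that never appear as a node in the network
missing_bi <- setdiff(biomark_of_interest, c(miRNA_NET$V1, miRNA_NET$V2))
print(direct_edges)
print(missing_bi)
```

The two retained edges and the four biomarkers absent from the network agree with the data frame and the messages shown above.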
SpidermiRanalyze_direct_subnetwork: Network composed by only the nodes in a set of biomarkers of interest
Starting from the BI, this function finds the sub-network that includes only the direct interactions in which both nodes are BI.
subnet<-SpidermiRanalyze_direct_subnetwork(data=miRNA_NET,BI=biomark_of_interest)
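Compared with the "at least one BI" rule, this stricter rule keeps an edge only when both of its endpoints are biomarkers. A base R sketch of that condition, again assuming a two-column edge list and not reflecting the package internals:

```r
biomark_of_interest <- c("hsa-let-7a", "CDC34", "hsa-miR-27a", "PEX7",
                         "EPT1", "FOX", "hsa-miR-5a")
miRNA_NET <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE)

# Keep only the edges whose two endpoints are both biomarkers
both_bi <- miRNA_NET$V1 %in% biomark_of_interest &
           miRNA_NET$V2 %in% biomark_of_interest
bi_only_edges <- miRNA_NET[both_bi, ]
print(bi_only_edges)
```

With this toy network only the hsa-miR-5a / FOX edge survives, because CASP3 is not in the biomarker list.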
SpidermiRanalyze_subnetwork_neigh: Network composed by the nodes in the list of BI and all the edges among this bunch of nodes
Starting from the BI, this function finds the sub-network that includes all direct and indirect interactions involving at least one of the BI.
GIdirect_net_neigh<-SpidermiRanalyze_subnetwork_neigh(data=miRNA_NET,BI=biomark_of_interest)
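One possible reading of this description is: collect the BI together with their direct neighbours, then keep every edge that runs between two nodes of that set. The sketch below implements that reading in base R. It is an assumption about the intended behaviour, not the package's code, and the toy edges are invented for illustration and are not real interactions.

```r
# Toy edge list (illustrative only, not real biological interactions)
toy_net <- data.frame(
  V1 = c("hsa-let-7a", "hsa-let-7a", "CASP3", "MYC"),
  V2 = c("CASP3", "MYC", "MYC", "TAMOXIFEN"),
  stringsAsFactors = FALSE)
bi <- "hsa-let-7a"

# Step 1: edges that touch a biomarker give its direct neighbours
touches_bi <- toy_net$V1 %in% bi | toy_net$V2 %in% bi
node_set <- unique(c(bi, toy_net$V1[touches_bi], toy_net$V2[touches_bi]))

# Step 2: keep every edge whose two endpoints both lie in that node set
neigh_edges <- toy_net[toy_net$V1 %in% node_set & toy_net$V2 %in% node_set, ]
print(node_set)
print(neigh_edges)
```

Here the CASP3 / MYC edge is retained even though neither endpoint is a biomarker, which is what distinguishes this sub-network from the direct one; the MYC / TAMOXIFEN edge is dropped because TAMOXIFEN is not a neighbour of the BI.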
SpidermiRanalyze_degree_centrality: Ranking degree centrality genes
This function counts the direct neighbours of each node in a network and allows the selection of the nodes whose number of direct neighbours is higher than a chosen cut-off.
top10_cent_gene<-SpidermiRanalyze_degree_centrality(miRNA_NET)
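Degree centrality itself is simple to compute from an edge list: each edge contributes one to the count of both of its endpoints. The base R sketch below illustrates the counting and the cut-off filter; the variable `cut_off` and its value are illustrative assumptions, and the code is not the package's implementation.

```r
miRNA_NET <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE)

# Degree = number of edges touching a node; with no duplicated edges this
# equals the number of direct neighbours
degree <- sort(table(c(miRNA_NET$V1, miRNA_NET$V2)), decreasing = TRUE)

# Select the nodes whose degree exceeds a chosen cut-off (illustrative value)
cut_off <- 1
hubs <- names(degree)[as.vector(degree) > cut_off]
print(degree)
print(hubs)
```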
SpidermiRanalyze_Community_detection: Find community detection
This function finds the communities in the network and describes them in terms of the number of community elements (both genes and miRNAs). The function uses one of the algorithms currently implemented in igraph (Csardi G, Nepusz T. 2006), selected by the user according to need.
The user chooses the algorithm used to calculate the community structure with the type parameter:
comm<- SpidermiRanalyze_Community_detection(data=miRNA_NET,type="FC")
SpidermiRanalyze_Community_detection_net: Community detection
Starting from one community to which some BI belong (the output of the previously described function), this function describes the community as a network of elements (both genes and miRNAs).
cd_net<-SpidermiRanalyze_Community_detection_net(data=miRNA_NET,comm_det=comm,size=1)
SpidermiRanalyze_Community_detection_bi: Community detection from a set of biomarkers of interest
Starting from the community to which the BI belong (the output of the previously described function), this function indicates whether a set of BI is included within that community.
gi=c("P53","PTEN","KIT","CCND2")
mol<-SpidermiRanalyze_Community_detection_bi(data=comm,BI=gi)
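Conceptually, this check is a set-membership test. The sketch below assumes, purely for illustration, that a community can be written as a plain character vector of member names; the actual structure returned by SpidermiRanalyze_Community_detection may differ, and `community_members` is a hypothetical example.

```r
# Hypothetical community, written as a plain vector of member names
community_members <- c("P53", "SOX", "PTEN", "hsa-miR-7a")
gi <- c("P53", "PTEN", "KIT", "CCND2")

# TRUE for each biomarker that belongs to the community
in_community <- gi %in% community_members
names(in_community) <- gi
found <- gi[in_community]
print(in_community)
print(found)
```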
SpidermiRvisualize: To visualize the network
SpidermiRvisualize_mirnanet: To visualize the network
The user can visualize a 3D representation of the network, with different colours for miRNA, gene, and pharmaco nodes. The user can manipulate the network directly by moving the nodes and the edges, in order to interpret the results graphically.
library(networkD3)
SpidermiRvisualize_mirnanet(data=mir_pharmnet[sample(nrow(mir_pharmnet), 100), ] )
SpidermiRvisualize_BI: To visualize the network from a set of BI
Starting from a graphical representation of a network, the user can highlight specific BI with a different colour.
biomark_of_interest<-c("hsa-let-7b","MUC1","PEX7","hsa-miR-222")
SpidermiRvisualize_BI(data=mir_pharmnet[sample(nrow(mir_pharmnet), 100), ],BI=biomark_of_interest)
SpidermiRvisualize_direction: To visualize the network
library(visNetwork)
SpidermiRvisualize_direction(data=mir_pharmnet[sample(nrow(mir_pharmnet), 100), ] )
SpidermiRvisualize_plot_target: Visualize the plot with miRNAs and the number of their targets in the network
For each BI of a community, the user can visualize a plot showing the number of direct neighbours of that BI (its degree centrality).
SpidermiRvisualize_plot_target(data=miRNA_NET)
## NULL
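The quantity behind such a plot is a count of targets per miRNA. A base R sketch of that count is given below, under two stated assumptions that are not guaranteed by the package: miRNA names start with "hsa-", and miRNAs occupy the first column of the edge list.

```r
miRNA_NET <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE)

# Assumption: miRNAs are recognised by the "hsa-" prefix in column V1
is_mirna <- grepl("^hsa-", miRNA_NET$V1)

# Number of targets listed for each miRNA
targets_per_mirna <- table(miRNA_NET$V1[is_mirna])
print(targets_per_mirna)
```

Passing `targets_per_mirna` to the base graphics function `barplot()` would give a simple bar chart of the same information.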
SpidermiRvisualize_degree_dist: Plots the degree distribution of the network
This function plots the cumulative frequency distribution of the degree centrality of a community.
SpidermiRvisualize_degree_dist(data=miRNA_NET)
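A cumulative degree distribution can be computed by hand from the edge list. The sketch below uses the convention "fraction of nodes with degree at least k", which is common for degree distributions; whether the package uses this convention or the "at most k" one is not stated here, so treat that choice as an assumption.

```r
miRNA_NET <- data.frame(
  V1 = c("hsa-let-7a", "CASP3", "BRCA", "hsa-miR-7a", "hsa-miR-5a", "SMAD", "SOX"),
  V2 = c("CASP3", "TAMOXIFEN", "MYC", "PTEN", "FOX", "HIF1", "P53"),
  stringsAsFactors = FALSE)

# Degree of each node, then the number of nodes having each degree value
deg <- table(c(miRNA_NET$V1, miRNA_NET$V2))
deg_freq <- table(as.vector(deg))

# Fraction of nodes with degree >= k, for each observed degree k
cum_frac <- rev(cumsum(rev(as.vector(deg_freq)))) / length(deg)
names(cum_frac) <- names(deg_freq)
print(cum_frac)
```

In this toy network 12 of the 13 nodes have degree 1 and one node (CASP3) has degree 2, so the fraction drops from 1 at k = 1 to 1/13 at k = 2.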
SpidermiRvisualize_adj_matrix: Plots the adjacency matrix of the network
This function plots the adjacency matrix of the community, representing the degree of connection among the nodes.
SpidermiRvisualize_adj_matrix(data=miRNA_NET)
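For readers unfamiliar with the representation, an adjacency matrix can be built directly from an edge list in base R. The sketch below treats the network as undirected and unweighted, which is an assumption made for illustration rather than a statement about how the package builds its plot.

```r
# Three edges taken from the example network above
edges <- data.frame(V1 = c("hsa-let-7a", "CASP3", "hsa-miR-5a"),
                    V2 = c("CASP3", "TAMOXIFEN", "FOX"),
                    stringsAsFactors = FALSE)

# Square matrix of zeros with one row and one column per node
nodes <- sort(unique(c(edges$V1, edges$V2)))
adj <- matrix(0L, length(nodes), length(nodes),
              dimnames = list(nodes, nodes))

# A two-column character matrix indexes cells by row and column name;
# filling both directions makes the matrix symmetric (undirected network)
adj[cbind(edges$V1, edges$V2)] <- 1L
adj[cbind(edges$V2, edges$V1)] <- 1L
print(adj)
```

Row sums of this matrix give the degree of each node, which links this view back to the degree centrality described earlier.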
SpidermiRvisualize_3Dbarplot: 3D barplot
This function plots a summary representation of the networks, showing the number of edges, nodes and miRNAs.
SpidermiRvisualize_3Dbarplot(Edges_1net=1041003,Edges_2net=100016,Edges_3net=3008,Edges_4net=1493,Edges_5net=1598,
                             NODES_1net=16502,NODES_2net=13338,NODES_3net=1429,NODES_4net=675,NODES_5net=712,
                             nmiRNAs_1net=0,nmiRNAs_2net=74,nmiRNAs_3net=0,nmiRNAs_4net=0,nmiRNAs_5net=37)
Features databases SpidermiR
The features of the databases integrated in SpidermiR are:
CATEGORY | EXTERNAL DATABASE | VERSION | LAST UPDATE | LINK |
---|---|---|---|---|
Gene network | GeneMania | Current | 2016 | http://genemania.org/data/current/ |
Validated miRNA-target | miRwalk | miRwalk2 | 2015 | http://zmf.umm.uni-heidelberg.de/apps/zmf/mirwalk2/downloads/vtm/hsa-vtm-gene.rdata.zip |
Validated miRNA-target | miRTarBase | miRTarBase 7 | 2017 | mirtarbase.mbc.nctu.edu.tw/cache/download/7.0/miRTarBase_SE_WR.xls |
Predicted miRNA-target | DIANA | DIANA- 5.0 | 2013 | https://bioconductor.org/packages/release/bioc/html/miRNAtap.html |
Predicted miRNA-target | Miranda | N/A | 2010 | https://bioconductor.org/packages/release/bioc/html/miRNAtap.html |
Predicted miRNA-target | PicTar | N/A | N/A | https://bioconductor.org/packages/release/bioc/html/miRNAtap.html |
Predicted miRNA-target | TargetScan | TargetScan7.1 | 2016 | https://bioconductor.org/packages/release/bioc/html/miRNAtap.html |
Extracellular Circulating microRNAs | miRandola | miRandola v 02/2017 | 2017 | http://mirandola.iit.cnr.it/download/miRandola_version_02_2017.txt |
Drug Associations | Pharmaco-miR | N/A | N/A | http://pharmaco-mir.org/home/download_VERSE_db/pharmacomir_VERSE_DB.csv |
Session Information
sessionInfo()
## R version 3.5.2 (2018-12-20)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.8-bioc/R/lib/libRblas.so
## LAPACK: /home/biocbuild/bbs-3.8-bioc/R/lib/libRlapack.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats4 stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] visNetwork_2.0.5 networkD3_0.4 SpidermiR_1.12.1
## [4] testthat_2.0.1 miRNAtap_1.16.0 AnnotationDbi_1.44.0
## [7] IRanges_2.16.0 S4Vectors_0.20.1 Biobase_2.42.0
## [10] BiocGenerics_0.28.0 BiocStyle_2.10.0
##
## loaded via a namespace (and not attached):
## [1] backports_1.1.3 circlize_0.4.5
## [3] aroma.light_3.12.0 plyr_1.8.4
## [5] igraph_1.2.2 selectr_0.4-1
## [7] ConsensusClusterPlus_1.46.0 lazyeval_0.2.1
## [9] splines_3.5.2 BiocParallel_1.16.5
## [11] usethis_1.4.0 GenomeInfoDb_1.18.1
## [13] ggplot2_3.1.0 sva_3.30.1
## [15] digest_0.6.18 foreach_1.4.4
## [17] htmltools_0.3.6 gdata_2.18.0
## [19] magrittr_1.5 memoise_1.1.0
## [21] cluster_2.0.7-1 doParallel_1.0.14
## [23] limma_3.38.3 remotes_2.0.2
## [25] ComplexHeatmap_1.20.0 Biostrings_2.50.2
## [27] readr_1.3.1 annotate_1.60.0
## [29] matrixStats_0.54.0 R.utils_2.7.0
## [31] prettyunits_1.0.2 colorspace_1.4-0
## [33] rvest_0.3.2 ggrepel_0.8.0
## [35] blob_1.1.1 xfun_0.4
## [37] dplyr_0.7.8 callr_3.1.1
## [39] crayon_1.3.4 RCurl_1.95-4.11
## [41] jsonlite_1.6 genefilter_1.64.0
## [43] bindr_0.1.1 zoo_1.8-4
## [45] iterators_1.0.10 survival_2.43-3
## [47] miRNAtap.db_0.99.10 glue_1.3.0
## [49] survminer_0.4.3 gtable_0.2.0
## [51] zlibbioc_1.28.0 XVector_0.22.0
## [53] GetoptLong_0.1.7 DelayedArray_0.8.0
## [55] pkgbuild_1.0.2 shape_1.4.4
## [57] scales_1.0.0 DESeq_1.34.1
## [59] edgeR_3.24.3 DBI_1.0.0
## [61] ggthemes_4.0.1 Rcpp_1.0.0
## [63] cmprsk_2.2-7 xtable_1.8-3
## [65] progress_1.2.0 matlab_1.0.2
## [67] bit_1.1-14 km.ci_0.5-2
## [69] sqldf_0.4-11 htmlwidgets_1.3
## [71] httr_1.4.0 gplots_3.0.1.1
## [73] RColorBrewer_1.1-2 pkgconfig_2.0.2
## [75] XML_3.98-1.16 R.methodsS3_1.7.1
## [77] locfit_1.5-9.1 labeling_0.3
## [79] tidyselect_0.2.5 rlang_0.3.1
## [81] munsell_0.5.0 tools_3.5.2
## [83] downloader_0.4 cli_1.0.1
## [85] generics_0.0.2 gsubfn_0.7
## [87] RSQLite_2.1.1 broom_0.5.1
## [89] devtools_2.0.1 evaluate_0.12
## [91] stringr_1.3.1 yaml_2.2.0
## [93] processx_3.2.1 org.Hs.eg.db_3.7.0
## [95] knitr_1.21 bit64_0.9-7
## [97] fs_1.2.6 survMisc_0.5.5
## [99] caTools_1.17.1.1 purrr_0.3.0
## [101] bindrcpp_0.2.2 TCGAbiolinks_2.10.3
## [103] nlme_3.1-137 EDASeq_2.16.3
## [105] R.oo_1.22.0 xml2_1.2.0
## [107] biomaRt_2.38.0 compiler_3.5.2
## [109] rstudioapi_0.9.0 curl_3.3
## [111] tibble_2.0.1 geneplotter_1.60.0
## [113] stringi_1.2.4 highr_0.7
## [115] ps_1.3.0 GenomicFeatures_1.34.3
## [117] desc_1.2.0 lattice_0.20-38
## [119] Matrix_1.2-15 KMsurv_0.1-5
## [121] pillar_1.3.1 BiocManager_1.30.4
## [123] GlobalOptions_0.1.0 data.table_1.12.0
## [125] bitops_1.0-6 rtracklayer_1.42.1
## [127] GenomicRanges_1.34.0 R6_2.3.0
## [129] latticeExtra_0.6-28 hwriter_1.3.2
## [131] bookdown_0.9 ShortRead_1.40.0
## [133] KernSmooth_2.23-15 gridExtra_2.3
## [135] codetools_0.2-16 sessioninfo_1.1.1
## [137] gtools_3.8.1 assertthat_0.2.0
## [139] pkgload_1.0.2 chron_2.3-53
## [141] SummarizedExperiment_1.12.0 proto_1.0.0
## [143] rprojroot_1.3-2 rjson_0.2.20
## [145] withr_2.1.2 GenomicAlignments_1.18.1
## [147] Rsamtools_1.34.1 GenomeInfoDbData_1.2.0
## [149] mgcv_1.8-26 hms_0.4.2
## [151] grid_3.5.2 tidyr_0.8.2
## [153] rmarkdown_1.11 ggpubr_0.2
References
Csardi G, Nepusz T. 2006. “The Igraph Software Package for Complex Network Research.”
Dweep H, Sticht C, Pandey P, Gretz N. 2011. “miRWalk - Database Prediction of Possible miRNA Binding Sites by ‘Walking’ the Genes of 3 Genomes.”
Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, Li M, Wang G, Liu Y. 2009. “miR2Disease a Manually Curated Database for microRNA Deregulation in Human Disease.”
Rukov J, Wilentzik R, Jaffe I, Vinther J, Shomron N. 2013. “Pharmaco miR Linking microRNAs and Drug Effects.”
Warde-Farley D, Donaldson S, Comes O, Zuberi K, Badrawi R, and others. 2010. “The Gene Mania Prediction Server Biological Network Integration for Gene Prioritization and Predicting Gene Function.”