--- title: "Working with SpidermiR package" author: " Claudia Cava, Antonio Colaprico, Alex Graudenzi, Gloria Bertoli,Tiago C. Silva,Catharina Olsen,Houtan Noushmehr, Gianluca Bontempi, Giancarlo Mauri, Isabella Castiglioni" date: "`r Sys.Date()`" output: BiocStyle::html_document: toc: true number_sections: false toc_depth: 2 highlight: haddock references: - id: ref1 title: The Gene Mania prediction server biological network integration for gene prioritization and predicting gene function author: - family: Warde-Farley D, Donaldson S, Comes O, Zuberi K, Badrawi R, and others given: journal: Nucleic Acids Res. volume: 38 number: 2 pages: 214-220 issued: year: 2010 - id: ref2 title: miR2Disease a manually curated database for microRNA deregulation in human disease. author: - family: Jiang Q, Wang Y, Hao Y, Juan L, Teng M, Zhang X, Li M, Wang G, Liu Y. given: journal: Nucleic Acids Res. volume: 37 number: 1 pages: 98-104 issued: year: 2009 - id: ref3 title: miRWalk - database prediction of possible miRNA binding sites by "walking" the genes of 3 genomes. author: - family: Dweep H, Sticht C, Pandey P, Gretz N. given: journal: Journal of Biomedical Informatics volume: 44 number: 1 pages: 839-7 issued: year: 2011 - id: ref4 title: miRandola Extracellular Circulating microRNAs Database. author: - family: Russo F, Di Bella S, Nigita G, Macca V, Lagana A, Giugno R, Pulvirenti A, Ferro A. given: journal: PLoS ONE volume: 7 number: 10 pages: e47786 issued: year: 2012 - id: ref5 title: The igraph software package for complex network research. author: - family: Csardi G, Nepusz T. given: journal: InterJournal volume: Complex Systems number: pages: 1695 issued: year: 2006 - id: ref6 title: Pharmaco miR linking microRNAs and drug effects. author: - family: Rukov J, Wilentzik R, Jaffe I, Vinther J, Shomron N. given: journal: Briefings in Bioinformatics volume: 15 number: 4 pages: 648-59 issued: year: 2013 - id: ref7 title: DGIdb 3.0 a redesign and expansion of the drug gene interaction database. author: - family: Cotto KC, Wagner AH, Feng YY, Kiwala S, Coffman AC, Spies G, Wollam A, Spies NC, Griffith OL, Griffith M. given: journal: Nucleic Acids Res volume: 46 number: D1 pages: 1068-1073 issued: year: 2018 - id: ref8 title: SuperTarget and Matador resources for exploring drug-target relationships. author: - family: GUnther S, Kuhn M, Dunkel M, Campillos M, Senger C, Petsalaki E, Ahmed J, Urdiales EG, Gewiess A, Jensen LJ, Schneider R, Skoblo R, Russell RB, Bourne PE, Bork P, Preissner R. given: journal: Nucleic Acids Res volume: 36 number: D1 pages: 919-22 issued: year: 2008 - id: ref9 title: miRNAtap microRNA Targets - Aggregated Predictions. author: - family: Pajak M, Simpson TI given: journal: Nucleic Acids Res volume: R package version 1.20.0 number: pages: issued: year: 2019 vignette: > %\VignetteIndexEntry{Working with SpidermiR package} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(dpi = 300) knitr::opts_chunk$set(cache=FALSE) ``` ```{r, eval = TRUE, echo = FALSE,hide=TRUE, message=FALSE,warning=FALSE} devtools::load_all() ``` # Introduction Biological systems are composed of multiple layers of dynamic interaction networks. These networks can be decomposed, for example, into: co-expression, physical, co-localization, genetic, pathway, and shared protein domains. GeneMania provides us with an enormous collection of data sets for interaction network studies [@ref1]. The data can be accessed and downloaded from different database, using a web portal. But currently, there is not a R-package to query and download these data. An important regulatory mechanism of these network data involves microRNAs (miRNAs). miRNAs are involved in various cellular functions, such as differentiation, proliferation, and tumourigenesis. However, our understanding of the processes regulated by miRNAs is currently limited and the integration of miRNA data in these networks provides a comprehensive genome-scale analysis of miRNA regulatory networks.Actually, GeneMania doesn't integrate the information of miRNAs and their interactions in the network. `SpidermiR` allows the user to query, prepare, download network data (e.g. from GeneMania), and to integrate this information with miRNA data with the possibility to analyze these downloaded data directly in one single R package. This techincal report gives a short overview of the essential `SpidermiR` methods and their application. # Installation To install use the code below. ```{r, eval = FALSE} if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install("SpidermiR") ``` # `SpidermiRquery`: Searching network You can easily search GeneMania data using the `SpidermiRquery` function. ## `SpidermiRquery_species`: Searching by species The user can query the species supported by GeneMania, using the function SpidermiRquery_species: ```{r, eval = TRUE} org<-SpidermiRquery_species(species) ``` The list of species is shown below: ```{r, eval = TRUE, echo = FALSE} knitr::kable(org, digits = 2, caption = "List of species",row.names = TRUE) ``` ## `SpidermiRquery_networks_type`: Searching by network categories The user can query the network types supported by GeneMania for a specific specie, using the function `SpidermiRquery_networks_type`. The user can select a specific specie using an index obtained by the function `SpidermiRquery_species` (e.g. organismID=org[6,] is the input for Homo_sapiens,organismID=org[9,] is the input for Saccharomyces cerevisiae ) ```{r, eval = TRUE} net_type<-SpidermiRquery_networks_type(organismID=org[9,]) ``` The list of network categories in Saccharomyces cerevisiae is shown below: ```{r, eval = TRUE, echo = FALSE} net_type ``` ## `SpidermiRquery_spec_networks`: Searching by species, and network categories You can filter the search by species using organism ID (above reported), and the network category. The network category can be filtered using the following parameters: * **COexp** Co-expression * **PHint** Physical_interactions * **COloc** Co-localization * **GENint** Genetic_interactions * **PATH** Pathway * **SHpd** Shared_protein_domains * **pred** predicted ```{r, eval = TRUE} net_shar_prot<-SpidermiRquery_spec_networks(organismID = org[9,], network = "SHpd") ``` The databases, which data are collected, are the output of this step. An example is shown below ( for Shared protein domains in Saccharomyces_cerevisiae data are collected in INTERPRO, and PFAM): ```{r, eval = TRUE, echo = FALSE} net_shar_prot ``` # `SpidermiRdownload`: Downloading network data The user in this step can download the data, as previously queried. ## `SpidermiRdownload_net`: Download network The user can download the data (previously queried) with `SpidermiRdownload_net`. ```{r, eval = TRUE} out_net<-SpidermiRdownload_net(net_shar_prot) ``` The list of SpidermiRdownload_net is shown below: ```{r, eval = TRUE, echo = FALSE} str(out_net) ``` ## `SpidermiRdownload_miRNAprediction`: Downloading miRNA predicted data target The user can download the predicted miRNA-gene from 4 databases:DIANA, Miranda, PicTar and TargetScan using miRNAtap [@ref9]. ```{r, eval = FALSE} mirna<-c('hsa-miR-567','hsa-miR-566') SpidermiRdownload_miRNAprediction(mirna_list=mirna) ``` ## `SpidermiRdownload_miRNAvalidate`: Downloading miRNA validated data target The user can download the validated miRNA-gene from: miRTAR and miRwalk [@ref2] [@ref3]. ```{r, eval = FALSE} list<-SpidermiRdownload_miRNAvalidate(validated) ``` ## `SpidermiRdownload_miRNAextra_cir`:Download Extracellular Circulating microRNAs The user can download extracellular circulating miRNAs from miRandola database ```{r, eval = FALSE} list_circ<-SpidermiRdownload_miRNAextra_cir(miRNAextra_cir) ``` # `SpidermiRprepare`: Preparing the data ## `SpidermiRprepare_NET`: Prepare matrix of gene network with Ensembl Gene ID, and gene symbols `SpidermiRprepare_NET` reads network data from `SpidermiRdownload_net` and enables user to prepare them for downstream analysis. In particular, it prepares matrix of gene network mapping Ensembl Gene ID to gene symbols. Gene symbols are needed to integrate miRNAdata. ```{r, eval = TRUE} geneSymb_net<-SpidermiRprepare_NET(organismID = org[9,], data = out_net) ``` The network with gene symbols ID is shown below: ```{r, eval = TRUE, echo = FALSE} knitr::kable(geneSymb_net[[1]][1:5,c(1,2,3,5,8)], digits = 2, caption = "shared protein domain",row.names = FALSE) ``` # `SpidermiRanalyze`: : Analyze data from network data ## `SpidermiRanalyze_direct_net`: Searching by biomarkers of interest with direct interaction Starting from a set of biomarkers of interest (BI), genes, miRNA or both, given by the user, this function finds sub-networks including all direct interactions involving at least one of the BI. ```{r, eval = TRUE} biomark_of_interest<-c("hsa-let-7a","CDC34","hsa-miR-27a","PEX7","EPT1","FOX","hsa-miR-5a") miRNA_NET <-data.frame(V1=c('hsa-let-7a','CASP3','BRCA','hsa-miR-7a','hsa-miR-5a','SMAD','SOX'),V2=c('CASP3','TAMOXIFEN','MYC','PTEN','FOX','HIF1','P53'),stringsAsFactors=FALSE) GIdirect_net<-SpidermiRanalyze_direct_net(data=miRNA_NET,BI=biomark_of_interest) ``` The data frame of `SpidermiRanalyze_direct_net`, GIdirect_net, is shown below: ```{r, eval = TRUE, echo = FALSE} str(GIdirect_net) ``` ## `SpidermiRanalyze_direct_subnetwork`: Network composed by only the nodes in a set of biomarkers of interest Starting from BI, this function finds sub-networks including all direct interactions involving only BI. ```{r, eval = FALSE} subnet<-SpidermiRanalyze_direct_subnetwork(data=miRNA_NET,BI=biomark_of_interest) ``` ## `SpidermiRanalyze_subnetwork_neigh`: Network composed by the nodes in the list of BI and all the edges among this brunch of nodes. Starting from BI, this function finds sub-networks including all direct and indirect interactions involving at least one of BI. ```{r, eval = FALSE} GIdirect_net_neigh<-SpidermiRanalyze_subnetwork_neigh(data=miRNA_NET,BI=biomark_of_interest) ``` ## `SpidermiRanalyze_degree_centrality`: Ranking degree centrality genes This function finds the number of direct neighbours of a node in a network and allows the selection of those nodes with a number of direct neighbours higher than a selected cut-off. ```{r, eval = FALSE} top10_cent_gene<-SpidermiRanalyze_degree_centrality(miRNA_NET) ``` # `Features databases SpidermiR`: Features of databases integrated in `SpidermiR` are: ```{r, eval = TRUE,echo = FALSE} B<-matrix( c("Gene network", "Validated miRNA-target","", "Predicted miRNA-target","","","", "Extracellular Circulating microRNAs", "Pharmaco-miR","", "GeneMania", "miRwalk","miRTarBase", "DIANA", "Miranda", "PicTar","TargetScan","miRandola","DGIdb","MATADOR", "Current","miRwalk2","miRTarBase 7","DIANA- 5.0","N/A","N/A","TargetScan7.1","miRandola v 02/2017","N/A","N/A", 2017,2015,2017,2013,2010,"N/A","2016",2017,2018,"N/A", "http://genemania.org/data/current/","http://zmf.umm.uni-heidelberg.de/apps/zmf/mirwalk2/downloads/vtm/hsa-vtm-gene.rdata.zip","http://mirtarbase.mbc.nctu.edu.tw/cache/download/7.0/miRTarBase_SE_WR.xls","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","https://bioconductor.org/packages/release/bioc/html/miRNAtap.html","http://mirandola.iit.cnr.it/download/miRandola_version_02_2017.txt","http://dgidb.org/data/interactions.tsv", "http://matador.embl.de/media/download/matador.tsv.gz" ), nrow=10, ncol=5) colnames(B)<-c("CATEGORY","EXTERNAL DATABASE","VERSION","LAST UPDATE","LINK") ``` ```{r, eval = TRUE, echo = FALSE} knitr::kable(B, digits = 2, caption = "Features",row.names = FALSE) ``` ****** Session Information ****** ```{r sessionInfo} sessionInfo() ``` # References