The R package to infer and plot Bayesian networks. The network are inferred from expression data based on clusterProfiler or ReactomePA results. It makes use of libraries including clusterProfiler, ReactomePA, bnlearn, graphite and depmap. In this vignette, the description of functions and several use cases are depicted using GSE133624, which contains RNA-Seq data of bladder cancer. The more detail can be found on the book (https://noriakis.github.io/CBNplot/).
BiocManager::install("CBNplot")
library(CBNplot)
library(bnlearn)
library(DESeq2)
library(org.Hs.eg.db)
library(GEOquery)
## Load dataset and make metadata
filePaths <- getGEOSuppFiles("GSE133624")
counts = read.table(rownames(filePaths)[1], header=1, row.names=1)
meta = sapply(colnames(counts), function (x) substring(x,1,1))
meta = data.frame(meta)
colnames(meta) = c("Condition")
dds <- DESeqDataSetFromMatrix(countData = counts,
colData = meta,
design= ~ Condition)
## Prefiltering
filt <- rowSums(counts(dds) < 10) > dim(meta)[1]*0.9
dds <- dds[!filt,]
## Perform DESeq2()
dds = DESeq(dds)
res = results(dds, pAdjustMethod = "bonferroni")
## apply variance stabilizing transformation
v = vst(dds, blind=FALSE)
vsted = assay(v)
## Define the input genes, and use clusterProfiler::bitr to convert the ID.
sig = subset(res, padj<0.05)
cand.entrez = clusterProfiler::bitr(rownames(sig),
fromType="ENSEMBL", toType="ENTREZID", OrgDb=org.Hs.eg.db)$ENTREZID
## Perform enrichment analysis
pway = ReactomePA::enrichPathway(gene = cand.entrez)
pway = clusterProfiler::setReadable(pway, org.Hs.eg.db)
## Define including samples
incSample = rownames(subset(meta, Condition=="T"))
Then use CBNplot. Basically, you need to supply the enrichment analysis result, normalized expression value and samples to be included. For bngeneplot
, the pathway number in the result
slot of enrichment analysis results must be given.
bngeneplot(results = pway,exp = vsted,
expSample = incSample, pathNum = 15)
Data frame of raw values used in the inference, data frame containing strength and direction, averaged network, and plot can be obtained by specifying returnNet=TRUE
ret <- bngeneplot(results = pway,exp = vsted,
expSample = incSample, pathNum = 15, returnNet=TRUE)
ret$str |> head()
FALSE from to strength direction
FALSE 1 RFC2 MCM2 0.25 0.4000000
FALSE 2 RFC2 GSK3B 0.30 0.3333333
FALSE 3 RFC2 ORC1 0.15 0.3333333
FALSE 4 RFC2 ORC6 0.45 0.6666667
FALSE 5 RFC2 CDC45 0.45 0.8888889
FALSE 6 RFC2 CDC6 0.10 0.5000000
The resulting network can be converted to igraph
object using bnlearn::as.igraph()
.
g <- bnlearn::as.igraph(ret$av)
igraph::evcent(g)$vector
## Warning: `evcent()` was deprecated in igraph 2.0.0.
## ℹ Please use `eigen_centrality()` instead.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.
## RFC2 MCM2 GSK3B ORC1 ORC6 CDC45 CDC6 POLE2
## 0.4038410 0.3542079 0.2540685 0.2754006 0.1817224 0.4526096 0.4572003 0.6733291
## PSMB5 PSMA6 GINS1 PSMA7 E2F1 MCM4 CCNE1 LIG1
## 0.3368666 0.2246280 0.2630526 0.6139067 0.3264540 0.1235753 0.5547492 0.3337290
## PSMA2 POLD2 UBE2S RFC5 PSMD14 ORC2 ORC4 PDS5A
## 0.5090758 0.2467986 0.6855119 0.3702360 0.2703286 0.3843402 0.3641142 0.5294996
## MCM8 GINS2 PCNA RFC3 CDK7 DNA2 CCNA2 SKP2
## 0.4932056 0.1785340 0.7576684 0.3115340 0.5354951 0.4702573 1.0000000 0.5002437
## PRIM2 CDCA5 GINS4 ANAPC1 PSMD4 RBBP4 RFC4 CDC25A
## 0.4951122 0.4493055 0.7748917 0.3360281 0.3483747 0.6780029 0.4904165 0.3846575
## RAD21 ORC5 MCM7 CDT1 FEN1 ESCO2 CKS1B UBE2C
## 0.2053328 0.3697173 0.4129129 0.7970267 0.4988884 0.3111688 0.4244616 0.4059801
## CCNE2 POLE GINS3 LIN9 PRIM1
## 0.6505500 0.4792401 0.4231692 0.3605505 0.5103124
The relationship between pathways can be drawn by bnpathplot
. The number to be included in the inference can be specified by nCategory
.
bnpathplot(results = pway,exp = vsted,
expSample = incSample, nCategory=10, shadowText = TRUE)
bngeneplotCustom
and bnpathplotCustom
can be used to customize visualization with more flexibility, like highlighting the nodes and edges of interest by glowEdgeNum
and hub
.
bnpathplotCustom(results = pway, exp = vsted, expSample = incSample,
fontFamily="serif", glowEdgeNum=3, hub=3)
bngeneplotCustom(results = pway, exp = vsted, expSample = incSample,
pathNum=15, fontFamily="sans", glowEdgeNum=NULL, hub=3)
The detailed usage for the package, like including covariates to the plot and probabilistic reasoning is available in the package documentation (https://noriakis.github.io/CBNplot/).
sessionInfo()
## R version 4.4.0 RC (2024-04-16 r86468)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 22.04.4 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.20-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] GEOquery_2.73.0 org.Hs.eg.db_3.19.1
## [3] AnnotationDbi_1.67.0 DESeq2_1.45.0
## [5] SummarizedExperiment_1.35.0 Biobase_2.65.0
## [7] MatrixGenerics_1.17.0 matrixStats_1.3.0
## [9] GenomicRanges_1.57.0 GenomeInfoDb_1.41.0
## [11] IRanges_2.39.0 S4Vectors_0.43.0
## [13] BiocGenerics_0.51.0 bnlearn_4.9.3
## [15] CBNplot_1.5.0 BiocStyle_2.33.0
##
## loaded via a namespace (and not attached):
## [1] splines_4.4.0 ggplotify_0.1.2 filelock_1.0.3
## [4] tibble_3.2.1 oaqc_1.0 polyclip_1.10-6
## [7] graph_1.83.0 lifecycle_1.0.4 lattice_0.22-6
## [10] MASS_7.3-60.2 ggdist_3.3.2 magrittr_2.0.3
## [13] limma_3.61.0 sass_0.4.9 rmarkdown_2.26
## [16] jquerylib_0.1.4 yaml_2.3.8 cowplot_1.1.3
## [19] DBI_1.2.2 RColorBrewer_1.1-3 abind_1.4-5
## [22] zlibbioc_1.51.0 purrr_1.0.2 ggraph_2.2.1
## [25] yulab.utils_0.1.4 tweenr_2.0.3 rappdirs_0.3.3
## [28] pvclust_2.2-0 GenomeInfoDbData_1.2.12 enrichplot_1.25.0
## [31] ggrepel_0.9.5 tidytree_0.4.6 reactome.db_1.88.0
## [34] codetools_0.2-20 DelayedArray_0.31.0 DOSE_3.31.0
## [37] xml2_1.3.6 ggforce_0.4.2 tidyselect_1.2.1
## [40] aplot_0.2.2 UCSC.utils_1.1.0 farver_2.1.1
## [43] gmp_0.7-4 viridis_0.6.5 BiocFileCache_2.13.0
## [46] jsonlite_1.8.8 tidygraph_1.3.1 tools_4.4.0
## [49] treeio_1.29.0 Rcpp_1.0.12 glue_1.7.0
## [52] gridExtra_2.3 SparseArray_1.5.0 xfun_0.43
## [55] qvalue_2.37.0 distributional_0.4.0 dplyr_1.1.4
## [58] withr_3.0.0 BiocManager_1.30.22 fastmap_1.1.1
## [61] fansi_1.0.6 digest_0.6.35 R6_2.5.1
## [64] gridGraphics_0.5-1 colorspace_2.1-0 GO.db_3.19.1
## [67] RSQLite_2.3.6 utf8_1.2.4 tidyr_1.3.1
## [70] generics_0.1.3 data.table_1.15.4 graphlayouts_1.1.1
## [73] httr_1.4.7 S4Arrays_1.5.0 scatterpie_0.2.2
## [76] graphite_1.51.0 pkgconfig_2.0.3 gtable_0.3.5
## [79] Rmpfr_0.9-5 blob_1.2.4 XVector_0.45.0
## [82] clusterProfiler_4.13.0 shadowtext_0.1.3 htmltools_0.5.8.1
## [85] bookdown_0.39 fgsea_1.31.0 scales_1.3.0
## [88] png_0.1-8 ggfun_0.1.4 knitr_1.46
## [91] tzdb_0.4.0 reshape2_1.4.4 nlme_3.1-164
## [94] curl_5.2.1 cachem_1.0.8 stringr_1.5.1
## [97] BiocVersion_3.20.0 parallel_4.4.0 HDO.db_0.99.1
## [100] ReactomePA_1.49.0 pillar_1.9.0 grid_4.4.0
## [103] vctrs_0.6.5 dbplyr_2.5.0 evaluate_0.23
## [106] magick_2.8.3 readr_2.1.5 tinytex_0.50
## [109] cli_3.6.2 locfit_1.5-9.9 compiler_4.4.0
## [112] rlang_1.1.3 crayon_1.5.2 labeling_0.4.3
## [115] plyr_1.8.9 fs_1.6.4 stringi_1.8.3
## [118] viridisLite_0.4.2 BiocParallel_1.39.0 munsell_0.5.1
## [121] Biostrings_2.73.0 lazyeval_0.2.2 GOSemSim_2.31.0
## [124] Matrix_1.7-0 ExperimentHub_2.13.0 hms_1.1.3
## [127] patchwork_1.2.0 bit64_4.0.5 ggplot2_3.5.1
## [130] KEGGREST_1.45.0 statmod_1.5.0 highr_0.10
## [133] AnnotationHub_3.13.0 igraph_2.0.3 memoise_2.0.1
## [136] bslib_0.7.0 ggtree_3.13.0 fastmatch_1.1-4
## [139] bit_4.0.5 ape_5.8 gson_0.1.0