mist (Methylation Inference for Single-cell along Trajectory) is an R
package for differential methylation (DM) analysis of single-cell DNA
methylation (scDNAm) data. The package employs a Bayesian approach to
model methylation changes along pseudotime and estimates
developmental-stage-specific biological variations. It supports both
single-group and two-group analyses, enabling users to identify genomic
features exhibiting temporal changes in methylation levels or different
methylation patterns between groups.
This vignette demonstrates how to use mist for:
1. Single-group analysis.
2. Two-group analysis.
To install the latest version of mist, run the following commands:
# Install devtools if you don't have it installed already
install.packages("devtools")
# Install mist from GitHub
devtools::install_github("https://github.com/dxd429/mist")
From Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("mist")
To view the package vignette in HTML format, run the following lines in R:
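A minimal sketch using base R's vignette helpers; it assumes mist is already installed and only lists whatever vignettes the installed package registers, without assuming a particular vignette name.

```r
# List the vignettes shipped with mist (skipped if the package is not installed)
if (requireNamespace("mist", quietly = TRUE)) {
    vigs <- vignette(package = "mist")$results
    print(vigs[, c("Item", "Title"), drop = FALSE])
    # For interactive use, open the HTML versions in a browser:
    # browseVignettes("mist")
}
```

The `Item` column gives the name to pass to `vignette()` if you want to open one vignette directly.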
In this section, we will estimate parameters and perform differential methylation analysis using single-group data.
Here we load the example data from GSE121708.
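The calls in this vignette refer to the data by name: "Methy_level_group1" for the methylation levels and "pseudotime" for the cell ordering. As a rough sketch of what such an object could look like, the toy example below builds a small SingleCellExperiment with random values. It assumes the methylation matrix is stored as an assay and the pseudotime as a colData column; the object name `toy_sce`, the dimensions, and the values are made up for illustration and are not the GSE121708 data.

```r
library(SingleCellExperiment)

set.seed(1)
n_genes <- 5
n_cells <- 20

# Hypothetical methylation levels: genomic features in rows, cells in columns
methy <- matrix(
    runif(n_genes * n_cells),
    nrow = n_genes,
    dimnames = list(paste0("gene", seq_len(n_genes)),
                    paste0("cell", seq_len(n_cells)))
)

# Assumed layout: assay named "Methy_level_group1", colData column "pseudotime"
toy_sce <- SingleCellExperiment(
    assays = list(Methy_level_group1 = methy),
    colData = DataFrame(pseudotime = runif(n_cells))
)

assayNames(toy_sce)
head(colData(toy_sce)$pseudotime)
```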
estiParamSingle
# Estimate parameters for single-group
beta_sigma_list <- estiParamSingle(
    Dat_sce = Dat_sce,
    Dat_name = "Methy_level_group1",
    ptime_name = "pseudotime"
)
# Check the output
head(beta_sigma_list)
## $ENSMUSG00000000001
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1
## 1.24065094 -0.60122344 0.53793840 0.32636368 0.01947859 5.64780093
## Sigma2_2 Sigma2_3 Sigma2_4
## 11.92614012 3.81190530 2.06809942
##
## $ENSMUSG00000000003
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1 Sigma2_2
## 1.5375190 0.4479362 8.6940476 -9.5927661 0.1342782 23.5911931 5.5250254
## Sigma2_3 Sigma2_4
## 6.4150751 9.2355445
##
## $ENSMUSG00000000028
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1
## 1.296064866 -0.001949552 0.058216969 0.045935974 0.009727932 8.339542830
## Sigma2_2 Sigma2_3 Sigma2_4
## 7.321258910 3.423641306 2.193556354
##
## $ENSMUSG00000000037
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1 Sigma2_2 Sigma2_3
## 1.033106 -4.315972 11.693381 -4.172590 -3.217091 8.652779 14.874852 8.460898
## Sigma2_4
## 2.384070
##
## $ENSMUSG00000000049
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1
## 1.02050932 -0.10267896 0.13920771 0.08535147 0.04308801 5.84699324
## Sigma2_2 Sigma2_3 Sigma2_4
## 7.95875841 2.93852958 1.20037150
##
## $ENSMUSG00000000056
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1
## 1.28615339 -0.01421717 0.09391707 0.05109912 0.03787798 9.18374844
## Sigma2_2 Sigma2_3 Sigma2_4
## 12.01570920 5.26560686 3.23271594
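As printed, the result is a list with one named numeric vector per genomic feature, each holding five Beta_* entries and four Sigma2_* entries. For browsing many features at once it can be convenient to stack these vectors into a matrix. The sketch below does this with base R on two entries copied from the output above; `beta_sigma_toy` is a stand-in for the real `beta_sigma_list`, and the assumption is that every element carries the same names in the same order.

```r
# Two entries copied from the printed output (stand-in for beta_sigma_list)
beta_sigma_toy <- list(
    ENSMUSG00000000001 = c(
        Beta_0 = 1.24065094, Beta_1 = -0.60122344, Beta_2 = 0.53793840,
        Beta_3 = 0.32636368, Beta_4 = 0.01947859, Sigma2_1 = 5.64780093,
        Sigma2_2 = 11.92614012, Sigma2_3 = 3.81190530, Sigma2_4 = 2.06809942
    ),
    ENSMUSG00000000003 = c(
        Beta_0 = 1.5375190, Beta_1 = 0.4479362, Beta_2 = 8.6940476,
        Beta_3 = -9.5927661, Beta_4 = 0.1342782, Sigma2_1 = 23.5911931,
        Sigma2_2 = 5.5250254, Sigma2_3 = 6.4150751, Sigma2_4 = 9.2355445
    )
)

# One row per genomic feature, one column per parameter
param_mat <- do.call(rbind, beta_sigma_toy)

# Split the columns into the Beta_* and Sigma2_* blocks
beta_cols <- grep("^Beta_", colnames(param_mat), value = TRUE)
sigma_cols <- grep("^Sigma2_", colnames(param_mat), value = TRUE)

print(round(param_mat[, beta_cols], 3))
print(round(param_mat[, sigma_cols], 3))
```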
dmSingle
# Perform differential methylation analysis for the single-group
dm_results <- dmSingle(beta_sigma_list)
# View the top genomic features with drastic methylation changes
head(dm_results)
## ENSMUSG00000000568 ENSMUSG00000000486 ENSMUSG00000000282 ENSMUSG00000000223
## 0.11442048 0.07927437 0.07071541 0.06252971
## ENSMUSG00000000037 ENSMUSG00000000359
## 0.06159561 0.05612552
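The printed result looks like a named numeric vector with the largest values first. Under that assumption, the top features can be pulled out by position and the scores turned into a table for export. The sketch below uses the six values shown above; `dm_toy` is a stand-in for the real `dm_results`.

```r
# Values copied from the printed output (stand-in for dm_results)
dm_toy <- c(
    ENSMUSG00000000568 = 0.11442048, ENSMUSG00000000486 = 0.07927437,
    ENSMUSG00000000282 = 0.07071541, ENSMUSG00000000223 = 0.06252971,
    ENSMUSG00000000037 = 0.06159561, ENSMUSG00000000359 = 0.05612552
)

# Confirm the values are in decreasing order before relying on position
stopifnot(!is.unsorted(rev(dm_toy)))

# Names of the three top-ranked features
top3 <- names(dm_toy)[seq_len(3)]
print(top3)

# Tabular form, convenient for write.csv() or joining with annotation
dm_table <- data.frame(feature = names(dm_toy), score = unname(dm_toy))
print(dm_table)
```

If the real vector turns out not to be sorted, `sort(dm_results, decreasing = TRUE)` restores this ordering.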
plotGene
# Produce scatterplot with fitted curve of a specific gene
plotGene(Dat_sce = Dat_sce,
         Dat_name = "Methy_level_group1",
         ptime_name = "pseudotime",
         beta_sigma_list,
         gene_name = "ENSMUSG00000000037")
In this section, we will estimate parameters and perform DM analysis using data from two phenotypic groups.
estiParamTwo
# Estimate parameters for both groups
beta_sigma_list_group <- estiParamTwo(
    Dat_sce = Dat_sce,
    Dat_name_g1 = "Methy_level_group1",
    Dat_name_g2 = "Methy_level_group2",
    ptime_name_g1 = "pseudotime",
    ptime_name_g2 = "pseudotime_g2"
)
# Check the output
names(beta_sigma_list_group)
## [1] "Group1" "Group2"
## $ENSMUSG00000000001
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1
## 1.25809847 -0.36104585 0.33367033 0.21237691 0.03162836 5.29707820
## Sigma2_2 Sigma2_3 Sigma2_4
## 12.89618178 4.60961402 1.73212107
##
## $ENSMUSG00000000003
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1 Sigma2_2 Sigma2_3
## 1.596588 1.609934 2.711167 -1.647963 -2.964354 24.419095 2.403587 7.182017
## Sigma2_4
## 9.128952
##
## $ENSMUSG00000000028
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1
## 1.30065755 -0.01271698 0.07336106 0.05571895 0.01570911 8.09229936
## Sigma2_2 Sigma2_3 Sigma2_4
## 7.15913952 3.22609158 2.31954314
## $ENSMUSG00000000001
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1 Sigma2_2 Sigma2_3
## 1.915505 -1.088066 5.828407 -3.343740 -1.579720 5.416206 6.306331 3.308247
## Sigma2_4
## 1.524147
##
## $ENSMUSG00000000003
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1 Sigma2_2
## -0.8196619 -0.6818815 2.1500895 -0.9999998 -0.4106212 6.8252274 10.1149136
## Sigma2_3 Sigma2_4
## 4.6254361 2.7659230
##
## $ENSMUSG00000000028
## Beta_0 Beta_1 Beta_2 Beta_3 Beta_4 Sigma2_1 Sigma2_2
## 2.2936810 -0.6681344 2.7488722 -0.9902494 -1.0090488 10.8988700 6.4611817
## Sigma2_3 Sigma2_4
## 3.8892465 3.2592253
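The two listings above appear to be the first entries of the Group1 and Group2 elements, in that order. Assuming that layout (a list of two per-feature lists), the estimates for one feature can be placed side by side. The sketch below does this with the Beta_* values printed for ENSMUSG00000000001; `group_toy` is a stand-in for the real `beta_sigma_list_group`.

```r
# Beta_* values copied from the two listings (stand-in for beta_sigma_list_group)
group_toy <- list(
    Group1 = list(ENSMUSG00000000001 = c(
        Beta_0 = 1.25809847, Beta_1 = -0.36104585, Beta_2 = 0.33367033,
        Beta_3 = 0.21237691, Beta_4 = 0.03162836
    )),
    Group2 = list(ENSMUSG00000000001 = c(
        Beta_0 = 1.915505, Beta_1 = -1.088066, Beta_2 = 5.828407,
        Beta_3 = -3.343740, Beta_4 = -1.579720
    ))
)

gene <- "ENSMUSG00000000001"

# One column per group, one row per coefficient
coef_by_group <- sapply(group_toy, function(g) g[[gene]])
print(round(coef_by_group, 3))

# Features present in both groups, i.e. those that can be compared
shared <- intersect(names(group_toy$Group1), names(group_toy$Group2))
print(shared)
```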
dmTwoGroups
# Perform DM analysis to compare the two groups
dm_results_two <- dmTwoGroups(beta_sigma_list_group)
# View the top genomic features with different temporal patterns between groups
head(dm_results_two)
## ENSMUSG00000000568 ENSMUSG00000000392 ENSMUSG00000000326 ENSMUSG00000000295
## 0.14312295 0.11152751 0.09498556 0.09017207
## ENSMUSG00000000216 ENSMUSG00000000555
## 0.07645545 0.07339488
mist provides a comprehensive suite of tools for analyzing scDNAm data
along pseudotime, whether you are working with a single group or
comparing two phenotypic groups. With the combination of Bayesian
modeling and differential methylation analysis, mist is a powerful tool
for identifying significant genomic features in scDNAm data.
## R Under development (unstable) (2024-10-21 r87258)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] SingleCellExperiment_1.29.1 SummarizedExperiment_1.37.0
## [3] Biobase_2.67.0 GenomicRanges_1.59.1
## [5] GenomeInfoDb_1.43.2 IRanges_2.41.1
## [7] S4Vectors_0.45.2 BiocGenerics_0.53.3
## [9] generics_0.1.3 MatrixGenerics_1.19.0
## [11] matrixStats_1.4.1 mist_0.99.4
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 farver_2.1.2 dplyr_1.1.4
## [4] Biostrings_2.75.1 bitops_1.0-9 fastmap_1.2.0
## [7] RCurl_1.98-1.16 GenomicAlignments_1.43.0 XML_3.99-0.17
## [10] digest_0.6.37 lifecycle_1.0.4 survival_3.7-0
## [13] magrittr_2.0.3 compiler_4.5.0 rlang_1.1.4
## [16] sass_0.4.9 tools_4.5.0 utf8_1.2.4
## [19] yaml_2.3.10 rtracklayer_1.67.0 knitr_1.49
## [22] labeling_0.4.3 S4Arrays_1.7.1 curl_6.0.1
## [25] DelayedArray_0.33.3 abind_1.4-8 BiocParallel_1.41.0
## [28] withr_3.0.2 grid_4.5.0 fansi_1.0.6
## [31] colorspace_2.1-1 ggplot2_3.5.1 scales_1.3.0
## [34] MASS_7.3-61 mcmc_0.9-8 cli_3.6.3
## [37] mvtnorm_1.3-2 rmarkdown_2.29 crayon_1.5.3
## [40] httr_1.4.7 rjson_0.2.23 cachem_1.1.0
## [43] zlibbioc_1.53.0 splines_4.5.0 parallel_4.5.0
## [46] XVector_0.47.0 restfulr_0.0.15 vctrs_0.6.5
## [49] Matrix_1.7-1 jsonlite_1.8.9 SparseM_1.84-2
## [52] carData_3.0-5 car_3.1-3 MCMCpack_1.7-1
## [55] Formula_1.2-5 jquerylib_0.1.4 glue_1.8.0
## [58] codetools_0.2-20 gtable_0.3.6 BiocIO_1.17.1
## [61] UCSC.utils_1.3.0 munsell_0.5.1 tibble_3.2.1
## [64] pillar_1.9.0 htmltools_0.5.8.1 quantreg_5.99.1
## [67] GenomeInfoDbData_1.2.13 R6_2.5.1 evaluate_1.0.1
## [70] lattice_0.22-6 Rsamtools_2.23.1 bslib_0.8.0
## [73] MatrixModels_0.5-3 coda_0.19-4.1 SparseArray_1.7.2
## [76] xfun_0.49 pkgconfig_2.0.3