--- title: "Introduction to spatialDE" author: - name: Davide Corso affiliation: - University of Padova email: davide.corso.2@phd.unipd.it - name: Milan Malfait affiliation: - Ghent University email: milan.malfait94@gmail.com - name: Lambda Moses affiliation: - California Institute of Technology email: dlu2@caltech.edu output: BiocStyle::html_document: self_contained: yes toc: true toc_float: true toc_depth: 2 code_folding: show df_print: paged date: "`r BiocStyle::doc_date()`" vignette: > %\VignetteIndexEntry{Introduction to spatialDE} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` # Introduction [SpatialDE](https://github.com/Teichlab/SpatialDE) by [Svensson et al., 2018](https://www.nature.com/articles/nmeth.4636), is a method to identify spatially variable genes (SVGs) in spatially resolved transcriptomics data. # Installation You can install `r BiocStyle::Biocpkg("spatialDE")` from *Bioconductor* with the following code: ```{r, eval=FALSE} if (!requireNamespace("BiocManager", quietly = TRUE)) { install.packages("BiocManager") } BiocManager::install("spatialDE") ``` # Example: [Mouse Olfactory Bulb](https://github.com/Teichlab/SpatialDE/tree/master/Analysis/MouseOB) Reproducing the [example analysis](https://github.com/Teichlab/SpatialDE#spatialde-significance-test-example-use) from the original `r basilisk::PyPiLink("SpatialDE")` Python package. ```{r setup} library(spatialDE) library(ggplot2) ``` ## Load data Files originally retrieved from SpatialDE GitHub repository from the following links: https://github.com/Teichlab/SpatialDE/blob/master/Analysis/MouseOB/data/Rep11_MOB_0.csv https://github.com/Teichlab/SpatialDE/blob/master/Analysis/MouseOB/MOB_sample_info.csv ```{r load-data} # Expression file used in python SpatialDE. data("Rep11_MOB_0") # Sample Info file used in python SpatialDE data("MOB_sample_info") ``` The `Rep11_MOB_0` object contains spatial expression data for `r nrow(Rep11_MOB_0)` genes on `r ncol(Rep11_MOB_0)` spots, with genes as rows and spots as columns. ```{r} Rep11_MOB_0[1:5, 1:5] dim(Rep11_MOB_0) ``` The `MOB_sample_info` object contains a `data.frame` with coordinates for each spot. ```{r} head(MOB_sample_info) ``` ### Filter out pratically unobserved genes ```{r} Rep11_MOB_0 <- Rep11_MOB_0[rowSums(Rep11_MOB_0) >= 3, ] ``` ### Get total_counts for every spot ```{r} Rep11_MOB_0 <- Rep11_MOB_0[, row.names(MOB_sample_info)] MOB_sample_info$total_counts <- colSums(Rep11_MOB_0) head(MOB_sample_info) ``` ### Get coordinates from `MOB_sample_info` ```{r} X <- MOB_sample_info[, c("x", "y")] head(X) ``` ## `stabilize` The SpatialDE method assumes normally distributed data, so we stabilize the variance of the negative binomial distributed counts data using Anscombe's approximation. The `stabilize()` function takes as input a `data.frame` of expression values with samples in columns and genes in rows. Thus, in this case, we have to transpose the data. ```{r stabilize} norm_expr <- stabilize(Rep11_MOB_0) norm_expr[1:5, 1:5] ``` ## `regress_out` Next, we account for differences in library size between the samples by regressing out the effect of the total counts for each gene using linear regression. ```{r regres_out} resid_expr <- regress_out(norm_expr, sample_info = MOB_sample_info) resid_expr[1:5, 1:5] ``` ## `run` To reduce running time, the SpatialDE test is run on a subset of 1000 genes. Running it on the complete data set takes about 10 minutes. ```{r run-spatialDE} # For this example, run spatialDE on the first 1000 genes sample_resid_expr <- head(resid_expr, 1000) results <- spatialDE::run(sample_resid_expr, coordinates = X) head(results[order(results$qval), ]) ``` ## `model_search` Finally, we can classify the DE genes to interpetable DE classes using the `model_search` function. We apply the model search on filtered DE results, using a threshold of 0.05 for the Q-values. ```{r model_search} de_results <- results[results$qval < 0.05, ] ms_results <- model_search( sample_resid_expr, coordinates = X, de_results = de_results ) # To show ms_results sorted on qvalue, uncomment the following line # head(ms_results[order(ms_results$qval), ]) head(ms_results) ``` ## `spatial_patterns` Furthermore, we can group spatially variable genes (SVGs) into spatial patterns using automatic expression histology (AEH). ```{r spatial_patterns} sp <- spatial_patterns( sample_resid_expr, coordinates = X, de_results = de_results, n_patterns = 4L, length = 1.5 ) sp$pattern_results ``` ## Plots Visualizing one of the most significant genes. ```{r fig1} gene <- "Pcp4" ggplot(data = MOB_sample_info, aes(x = x, y = y, color = norm_expr[gene, ])) + geom_point(size = 7) + ggtitle(gene) + scale_color_viridis_c() + labs(color = gene) ``` ### Plot Spatial Patterns of Multiple Genes As an alternative to the previous figure, we can plot multiple genes using the normalized expression values. ```{r fig2, fig.height = 10, fig.width = 10} ordered_de_results <- de_results[order(de_results$qval), ] multiGenePlots(norm_expr, coordinates = X, ordered_de_results[1:6, ]$g, point_size = 4, viridis_option = "D", dark_theme = FALSE ) ``` ## Plot Fraction Spatial Variance vs Q-value ```{r} FSV_sig(results, ms_results) ``` # SpatialExperiment integration The SpatialDE workflow can also be executed with a `r BiocStyle::Biocpkg("SpatialExperiment")` object as input. ```{r, message=FALSE} library(SpatialExperiment) # For SpatialExperiment object, we neeed to transpose the counts matrix in order # to have genes on rows and spot on columns. # For this example, run spatialDE on the first 1000 genes partial_counts <- head(Rep11_MOB_0, 1000) spe <- SpatialExperiment( assays = list(counts = partial_counts), spatialData = DataFrame(MOB_sample_info[, c("x", "y")]), spatialCoordsNames = c("x", "y") ) out <- spatialDE(spe, assay_type = "counts", verbose = FALSE) head(out[order(out$qval), ]) ``` ## Plot Spatial Patterns of Multiple Genes (using SpatialExperiment object) We can plot spatial patterns of multiples genes using the `spe` object. ```{r fig3, fig.height = 10, fig.width = 10} spe_results <- out[out$qval < 0.05, ] ordered_spe_results <- spe_results[order(spe_results$qval), ] multiGenePlots(spe, assay_type = "counts", ordered_spe_results[1:6, ]$g, point_size = 4, viridis_option = "D", dark_theme = FALSE ) ``` ## Classify spatially variable genes with `model_search` and `spatial_patterns` ```{r} msearch <- modelSearch(spe, de_results = out, qval_thresh = 0.05, verbose = FALSE ) head(msearch) ``` ```{r} spatterns <- spatialPatterns(spe, de_results = de_results, qval_thresh = 0.05, n_patterns = 4L, length = 1.5, verbose = FALSE ) spatterns$pattern_results ``` # `sessionInfo` {-}
Session info ```{r session_info, echo=FALSE, cache=FALSE} Sys.time() sessionInfo() ```