--- title: "Overview" author: - name: William Hutchison affiliation: WEHI - Walter and Eliza Hall Institute of Medical Research email: hutchison.w@wehi.edu.au - name: Stefano Mangiola affiliation: WEHI - Walter and Eliza Hall Institute of Medical Research package: tidySpatialExperiment output: BiocStyle::html_document abstract: | A brief overview of the tidySpatialExperiment package - demonstrating the SpatialExperiment-tibble abstraction, compatibility with the *tidyverse* ecosystem, compatibility with the *tidyomics* ecosystem and a few helpful utility functions. vignette: | %\VignetteIndexEntry{Overview} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- # tidySpatialExperiment - part of *tidyomics* Resources to help you get started with tidySpatialExperiment and *tidyomics*: - [The tidySpatialExperiment website](http://william-hutchison.github.io/tidySpatialExperiment/) - [The tidyomics blog](https://tidyomics.github.io/tidyomicsBlog/) - [Third party tutorials](https://rstudio-pubs-static.s3.amazonaws.com/792462_f948e766b15d4ee5be5c860493bda0b3.html) The *tidyomics* ecosystem includes packages for: - Working with genomic features: - [plyranges](https://github.com/sa-lee/plyranges), for tidy manipulation of genomic range data. - [nullranges](https://github.com/nullranges/nullranges), for tidy generation of genomic ranges representing the null hypothesis. - [plyinteractions](https://github.com/tidyomics/plyinteractions), for tidy manipulation of genomic interaction data. - Working with transcriptomic features: - [tidySummarizedExperiment](https://github.com/stemangiola/tidySummarizedExperiment), for tidy manipulation of SummarizedExperiment objects. - [tidySingleCellExperiment](https://github.com/stemangiola/tidySingleCellExperiment), for tidy manipulation of SingleCellExperiment objects. - [tidyseurat](https://github.com/stemangiola/tidyseurat), for tidy manipulation of Seurat objects. - [tidybulk](https://github.com/stemangiola/tidybulk), for bulk RNA-seq analysis. - Working with cytometry features: - [tidytof](https://github.com/keyes-timothy/tidytof), for tidy manipulation of high-dimensional cytometry data. # Introduction tidySpatialExperiment provides a bridge between the [SpatialExperiment](https://github.com/drighelli/SpatialExperiment) [@righelli2022spatialexperiment] package and the [*tidyverse*](https://www.tidyverse.org) [@wickham2019welcome] ecosystem. It creates an invisible layer that allows you to interact with a SpatialExperiment object as if it were a tibble; enabling the use of functions from [dplyr](https://github.com/tidyverse/dplyr), [tidyr](https://github.com/tidyverse/tidyr), [ggplot2](https://github.com/tidyverse/ggplot2) and [plotly](https://github.com/plotly/plotly.R). But, underneath, your data remains a SpatialExperiment object. tidySpatialExperiment also provides five additional utility functions. ## Functions and utilities | Package | Functions available | |-----------------------------------|-------------------------------------| | `SpatialExperiment` | All | | `dplyr` | `arrange`,`bind_rows`, `bind_cols`, `distinct`, `filter`, `group_by`, `summarise`, `select`, `mutate`, `rename`, `left_join`, `right_join`, `inner_join`, `slice`, `sample_n`, `sample_frac`, `count`, `add_count` | | `tidyr` | `nest`, `unnest`, `unite`, `separate`, `extract`, `pivot_longer` | | `ggplot2` | `ggplot` | | `plotly` | `plot_ly` | | Utility | Description | |-----------------------------------|-------------------------------------| | `as_tibble` | Convert cell data to a `tbl_df` | | `join_features` | Append feature data to cell data | | `aggregate_cells` | Aggregate cell-feature abundance into a pseudobulk `SummarizedExperiment` object | | `rectangle` | Select rectangular region of space | | `ellipse` | Select elliptical region of space | ## Installation You can install the stable version of tidySpatialExperiment from Bioconductor with: ```{r, eval=FALSE} if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install("tidySpatialExperiment") ``` You can install the development version of tidySpatialExperiment from GitHub with: ```{r, eval=FALSE} if (!requireNamespace("devtools", quietly=TRUE)) install.packages("devtools") devtools::install_github("william-hutchison/tidySpatialExperiment") ``` ## Load data Here, we attach tidySpatialExperiment and an example SpatialExperiment object. ```{r, message=FALSE, results=FALSE} # Load example SpatialExperiment object library(tidySpatialExperiment) example(read10xVisium) ``` ## SpatialExperiment-tibble abstraction A SpatialExperiment object represents observations (cells) as columns and variables (features) as rows, as is the Bioconductor convention. Additional information about the cells is accessed through the `reducedDims`, `colData` and `spatialCoords` functions. tidySpatialExperiment provides a SpatialExperiment-tibble abstraction, representing cells as rows and features as columns, as is the *tidyverse* convention. `colData` and `spatialCoords` are appended as columns to the same abstraction, allowing easy interaction with this additional data. The default view is now of the SpatialExperiment-tibble abstraction. ```{r} spe ``` However, our data maintains its status as a SpatialExperiment object. Therefore, we have access to all SpatialExperiment functions. ```{r} spe |> colData() |> head() spe |> spatialCoords() |> head() spe |> imgData() ``` # Integration with the *tidyverse* ecosystem ## Manipulate with dplyr Most functions from dplyr are available for use with the SpatialExperiment-tibble abstraction. For example, `filter` can be used to select cells by a variable of interest. ```{r} spe |> filter(array_col < 5) ``` And `mutate` can be used to add new variables, or modify the value of an existing variable. ```{r} spe |> mutate(in_region = c(in_tissue & array_row < 10)) ``` ## Tidy with tidyr Most functions from tidyr are also available. Here, `nest` is used to group the data by `sample_id`, and `unnest` is used to ungroup the data. ```{r} # Nest the SpatialExperiment object by sample_id spe_nested <- spe |> nest(data = -sample_id) # View the nested SpatialExperiment object spe_nested # Unnest the nested SpatialExperiment objects spe_nested |> unnest(data) ``` ## Plot with ggplot2 The `ggplot` function can be used to create a plot from a SpatialExperiment object. This example also demonstrates how tidy operations can be combined to build up more complex analysis. It should be noted that helper functions such `aes` are not included and should be imported from ggplot2. ```{r} spe |> filter(sample_id == "section1" & in_tissue) |> # Add a column with the sum of feature counts per cell mutate(count_sum = purrr::map_int(.cell, ~ spe[, .x] |> counts() |> sum() )) |> # Plot with tidySpatialExperiment and ggplot2 ggplot(ggplot2::aes(x = reorder(.cell, count_sum), y = count_sum)) + ggplot2::geom_point() + ggplot2::coord_flip() ``` ## Plot with plotly The `plot_ly` function can also be used to create a plot from a SpatialExperiment object. ```{r, eval=FALSE, message=FALSE, warning=FALSE} spe |> filter(sample_id == "section1") |> plot_ly( x = ~ array_col, y = ~ array_row, color = ~ in_tissue, type = "scatter" ) ``` ![plotly demonstration](../man/figures/plotly_demo.png) # Integration with the *tidyomics* ecosystem ## Interactively select cells with tidygate Different packages from the *tidyomics* ecosystem are easy to use together. Here, tidygate is used to interactively gate cells based on their array location. ```{r, eval=FALSE} spe_regions <- spe |> filter(sample_id == "section1") |> mutate(region = tidygate::gate_chr(array_col, array_row)) ``` ![tidygate demonstration](../man/figures/tidygate_demo.gif) ```{r, echo=FALSE} # Manually set gate information to match demonstration spe_regions <- spe |> filter(sample_id == "section1") |> mutate(region = ifelse( array_row < 48 & array_row > 20 & array_col < 80 & array_col > 60, 1, 0)) ``` The gated cells can then be divided into pseudobulks within a SummarizedExperiment object using tidySpatialExperiment's `aggregate_cells` utility function. ```{r} spe_regions_aggregated <- spe_regions |> aggregate_cells(region) ``` # Utilities ## Append feature data to cell data The *tidyomics* ecosystem places the emphasis on interacting with cell data. To interact with feature data, the `join_feature` function can be used to append feature values to cell data. ```{r} # Join feature data in wide format, preserving the SpatialExperiment object spe |> join_features(features = c("ENSMUSG00000025915", "ENSMUSG00000042501"), shape = "wide") |> head() # Join feature data in long format, discarding the SpatialExperiment object spe |> join_features(features = c("ENSMUSG00000025915", "ENSMUSG00000042501"), shape = "long") |> head() ``` ## Aggregate cells Sometimes, it is necessary to aggregate the gene-transcript abundance from a group of cells into a single value. For example, when comparing groups of cells across different samples with fixed-effect models. Cell aggregation can be achieved using the `aggregate_cells` function. ```{r} spe |> aggregate_cells(in_tissue, assays = "counts") ``` ## Elliptical and rectangular region selection To select cells by their geometric region in space, the `ellipse` and `rectangle` functions can be used. ```{r} spe |> filter(sample_id == "section1") |> mutate(in_ellipse = ellipse(array_col, array_row, c(20, 40), c(20, 20))) |> ggplot(aes(x = array_col, y = array_row, colour = in_ellipse)) + geom_point() ``` # Important considerations ## Read-only columns Removing the `.cell` column will return a tibble. This is consistent with the behaviour in other *tidyomics* packages. ```{r} spe |> select(-.cell) |> head() ``` The sample_id column cannot be removed with *tidyverse* functions, and can only be modified if the changes are accepted by SpatialExperiment's `colData` function. ```{r, error=TRUE} # sample_id is not removed, despite the user's request spe |> select(-sample_id) # This change maintains separation of sample_ids and is permitted spe |> mutate(sample_id = stringr::str_c(sample_id, "_modified")) |> head() # This change does not maintain separation of sample_ids and produces an error spe |> mutate(sample_id = "new_sample") ``` The `pxl_col_in_fullres` and `px_row_in_fullres` columns cannot be removed or modified with *tidyverse* functions. This is consistent with the behaviour of dimension reduction data in other *tidyomics* packages. ```{r, error=TRUE} # Attempting to remove pxl_col_in_fullres produces an error spe |> select(-pxl_col_in_fullres) # Attempting to modify pxl_col_in_fullres produces an error spe |> mutate(pxl_col_in_fullres) ``` ```{r} sessionInfo() ```