--- title: Saving arrays to artifacts and back again author: - name: Aaron Lun email: infinite.monkeys.with.keyboards@gmail.com package: alabaster.matrix date: "Revised: November 28, 2023" output: BiocStyle::html_document vignette: > %\VignetteIndexEntry{Saving and loading arrays} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, echo=FALSE} library(BiocStyle) self <- Githubpkg("ArtifactDB/alabaster.matrix") knitr::opts_chunk$set(error=FALSE, warning=FALSE, message=FALSE) ``` # Overview The `r self` package implements methods to save matrix-like objects to file artifacts and load them back into R. Check out the `r Githubpkg("ArtifactDB/alabaster.base")` for more details on the motivation and the **alabaster** framework. # Quick start Given an array-like object, we can use `saveObject()` to save it inside a staging directory: ```{r} library(Matrix) y <- rsparsematrix(1000, 100, density=0.05) library(alabaster.matrix) tmp <- tempfile() saveObject(y, tmp) list.files(tmp, recursive=TRUE) ``` We then load it back into our R session with `loadObject()`. This creates a HDF5-backed S4 array that can be easily coerced into the desired format, e.g., a `dgCMatrix`. ```{r} roundtrip <- readObject(tmp) class(roundtrip) ``` This process is supported for all base arrays, `r CRANpkg("Matrix")` objects and `r Biocpkg("DelayedArray")` objects. # Saving delayed operations For `DelayedArray`s, we may instead choose to save the delayed operations themselves to file. This creates a HDF5 file following the [**chihaya**](https://ltla.github.io/chihaya) format, containing the delayed operations rather than the results of their evaluation. ```{r} library(DelayedArray) y <- DelayedArray(rsparsematrix(1000, 100, 0.05)) y <- log1p(abs(y) / 1:100) # adding some delayed ops. tmp <- tempfile() saveObject(y, tmp, DelayedArray.preserve.ops=TRUE) # Inspecting the HDF5 file reveals many delayed operations: rhdf5::h5ls(file.path(tmp, "array.h5")) # And indeed, we can recover those same operations. readObject(tmp) ``` This allows users to avoid evaluation of the operations when saving objects, which may improve efficiency, e.g., by avoiding loss of sparsity or casting to a larger type. # Session information {-} ```{r} sessionInfo() ```