TileDB implements a framework for local and remote storage of dense
and sparse arrays. We can use this as a DelayedArray
backend to provide an array-level abstraction, thus allowing the data to
be used in many places where an ordinary array or matrix might be used.
The TileDBArray
package implements the necessary wrappers around TileDB-R to
support read/write operations on TileDB arrays within the DelayedArray
framework.
Creating a TileDBArray is as easy as:
## <100 x 10> TileDBMatrix object of type "double":
## [,1] [,2] [,3] ... [,9] [,10]
## [1,] -0.0443744170 -0.0008111017 0.0169970400 . 0.4168291 -0.4952206
## [2,] -0.7332672233 -0.7234585219 0.1431904752 . 0.8931815 1.1322358
## [3,] -1.4348455193 1.6222740169 -1.9961944497 . -1.1154539 1.7740424
## [4,] 0.5707620828 -0.7208047650 -0.4982973708 . 1.4941551 1.4263812
## [5,] -0.6291573841 -0.2637663312 -0.6705351446 . 0.3968626 1.4622191
## ... . . . . . .
## [96,] 0.2845708 -1.6341036 -0.9742882 . 1.36407753 1.21734108
## [97,] 0.5051839 1.0933622 -0.2515511 . 0.92543553 0.15657597
## [98,] 0.8392054 -1.1410098 -0.1714934 . 0.71158522 -1.43986011
## [99,] 1.6705225 -1.8326691 -0.6687744 . -0.08071136 0.03484107
## [100,] 1.3067771 -0.4008141 -0.5959631 . 0.92578590 -1.23605077
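A printout of this shape can be obtained with a call along the following lines. This is a minimal sketch that assumes a 100-by-10 matrix of standard normal values; the seed and the variable names `X` and `tdb` are illustrative choices.

```r
library(TileDBArray)

# Illustrative input: a 100 x 10 dense matrix of random normal values.
set.seed(100)  # arbitrary seed, only for reproducibility of this sketch
X <- matrix(rnorm(1000), ncol = 10)

# Write the matrix to a TileDB backend (a temporary location by default)
# and get back a TileDBMatrix that refers to it.
tdb <- writeTileDBArray(X)
tdb
```

The returned object holds a reference to the on-disk array rather than a copy of the values in memory.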
Alternatively, we can use coercion methods:
## <100 x 10> TileDBMatrix object of type "double":
## [,1] [,2] [,3] ... [,9] [,10]
## [1,] -0.0443744170 -0.0008111017 0.0169970400 . 0.4168291 -0.4952206
## [2,] -0.7332672233 -0.7234585219 0.1431904752 . 0.8931815 1.1322358
## [3,] -1.4348455193 1.6222740169 -1.9961944497 . -1.1154539 1.7740424
## [4,] 0.5707620828 -0.7208047650 -0.4982973708 . 1.4941551 1.4263812
## [5,] -0.6291573841 -0.2637663312 -0.6705351446 . 0.3968626 1.4622191
## ... . . . . . .
## [96,] 0.2845708 -1.6341036 -0.9742882 . 1.36407753 1.21734108
## [97,] 0.5051839 1.0933622 -0.2515511 . 0.92543553 0.15657597
## [98,] 0.8392054 -1.1410098 -0.1714934 . 0.71158522 -1.43986011
## [99,] 1.6705225 -1.8326691 -0.6687744 . -0.08071136 0.03484107
## [100,] 1.3067771 -0.4008141 -0.5959631 . 0.92578590 -1.23605077
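The coercion route can be sketched as follows, assuming the same kind of 100-by-10 numeric matrix; the names `X` and `tdb` are illustrative.

```r
library(TileDBArray)

# Illustrative input matrix.
X <- matrix(rnorm(1000), ncol = 10)

# Coercion writes X to a new TileDB backend and returns a TileDBMatrix,
# equivalent in effect to calling writeTileDBArray(X) with default settings.
tdb <- as(X, "TileDBArray")
tdb
```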
This process also works for sparse matrices:
## <1000 x 1000> sparse TileDBMatrix object of type "double":
## [,1] [,2] [,3] ... [,999] [,1000]
## [1,] 0.59 0.00 0.00 . 0 0
## [2,] 0.00 0.00 0.00 . 0 0
## [3,] 0.00 0.00 0.00 . 0 0
## [4,] 0.00 0.00 0.00 . 0 0
## [5,] 0.00 0.00 0.00 . 0 0
## ... . . . . . .
## [996,] 0 0 0 . 0 0
## [997,] 0 0 0 . 0 0
## [998,] 0 0 0 . 0 0
## [999,] 0 0 0 . 0 0
## [1000,] 0 0 0 . 0 0
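A sparse example of this kind might look like the sketch below. It assumes the input is generated with `Matrix::rsparsematrix()` at a density of 1%; the density and the names `Y` and `tdb_sparse` are assumptions made for illustration.

```r
library(TileDBArray)

# Illustrative sparse input: 1000 x 1000 with roughly 1% non-zero entries.
Y <- Matrix::rsparsematrix(1000, 1000, density = 0.01)

# Sparse inputs are written to a sparse TileDB array.
tdb_sparse <- writeTileDBArray(Y)
tdb_sparse

# The result reports itself as sparse to the DelayedArray framework.
is_sparse(tdb_sparse)
```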
Logical and integer matrices are supported:
## <1000 x 1000> sparse TileDBMatrix object of type "logical":
## [,1] [,2] [,3] ... [,999] [,1000]
## [1,] TRUE FALSE FALSE . FALSE FALSE
## [2,] FALSE FALSE FALSE . FALSE FALSE
## [3,] FALSE FALSE FALSE . FALSE FALSE
## [4,] FALSE FALSE FALSE . FALSE FALSE
## [5,] FALSE FALSE FALSE . FALSE FALSE
## ... . . . . . .
## [996,] FALSE FALSE FALSE . FALSE FALSE
## [997,] FALSE FALSE FALSE . FALSE FALSE
## [998,] FALSE FALSE FALSE . FALSE FALSE
## [999,] FALSE FALSE FALSE . FALSE FALSE
## [1000,] FALSE FALSE FALSE . FALSE FALSE
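As a sketch of both cases, the code below writes a sparse logical matrix and a dense integer matrix. The inputs are assumptions for illustration: the logical matrix is derived by thresholding a random sparse matrix, and the integer matrix is filled with Poisson counts.

```r
library(TileDBArray)

# Logical: thresholding a sparse numeric matrix gives a sparse logical matrix.
Y <- Matrix::rsparsematrix(1000, 1000, density = 0.01)
tdb_lgl <- writeTileDBArray(Y > 0)
type(tdb_lgl)

# Integer: rpois() returns integers, so Z is an integer matrix.
Z <- matrix(rpois(1000, lambda = 5), ncol = 10)
tdb_int <- writeTileDBArray(Z)
type(tdb_int)
```

`type()` reports the storage type seen by DelayedArray, which is a convenient way to confirm that the type survived the round trip.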
As are matrices with dimension names:
rownames(X) <- sprintf("GENE_%i", seq_len(nrow(X)))
colnames(X) <- sprintf("SAMP_%i", seq_len(ncol(X)))
writeTileDBArray(X)
## <100 x 10> TileDBMatrix object of type "double":
## SAMP_1 SAMP_2 SAMP_3 ... SAMP_9 SAMP_10
## GENE_1 -0.0443744170 -0.0008111017 0.0169970400 . 0.4168291 -0.4952206
## GENE_2 -0.7332672233 -0.7234585219 0.1431904752 . 0.8931815 1.1322358
## GENE_3 -1.4348455193 1.6222740169 -1.9961944497 . -1.1154539 1.7740424
## GENE_4 0.5707620828 -0.7208047650 -0.4982973708 . 1.4941551 1.4263812
## GENE_5 -0.6291573841 -0.2637663312 -0.6705351446 . 0.3968626 1.4622191
## ... . . . . . .
## GENE_96 0.2845708 -1.6341036 -0.9742882 . 1.36407753 1.21734108
## GENE_97 0.5051839 1.0933622 -0.2515511 . 0.92543553 0.15657597
## GENE_98 0.8392054 -1.1410098 -0.1714934 . 0.71158522 -1.43986011
## GENE_99 1.6705225 -1.8326691 -0.6687744 . -0.08071136 0.03484107
## GENE_100 1.3067771 -0.4008141 -0.5959631 . 0.92578590 -1.23605077
TileDBArrays are simply DelayedArray
objects and can be manipulated as such. The usual conventions for
extracting data from matrix-like objects work as expected:
## [1] 100 10
## [1] "GENE_1" "GENE_2" "GENE_3" "GENE_4" "GENE_5" "GENE_6"
## GENE_1 GENE_2 GENE_3 GENE_4 GENE_5 GENE_6
## -0.04437442 -0.73326722 -1.43484552 0.57076208 -0.62915738 0.15833265
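The three results above (dimensions, leading row names, leading values of the first column) correspond to accessors like those in this sketch. It assumes a named 100-by-10 matrix; `out` is an illustrative variable name.

```r
library(TileDBArray)

# Illustrative input with dimension names.
X <- matrix(rnorm(1000), ncol = 10)
rownames(X) <- sprintf("GENE_%i", seq_len(nrow(X)))
colnames(X) <- sprintf("SAMP_%i", seq_len(ncol(X)))
out <- as(X, "TileDBArray")

dim(out)             # dimensions of the array
head(rownames(out))  # first few row names
head(out[, 1])       # first few values of column 1, as an ordinary named vector
```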
We can also perform manipulations like subsetting and arithmetic.
Note that these operations do not affect the data in the TileDB backend;
rather, they are delayed until the values are explicitly required, hence
the creation of the DelayedMatrix object.
## <5 x 5> DelayedMatrix object of type "double":
## SAMP_1 SAMP_2 SAMP_3 SAMP_4 SAMP_5
## GENE_1 -0.0443744170 -0.0008111017 0.0169970400 -0.1678368331 0.3986601738
## GENE_2 -0.7332672233 -0.7234585219 0.1431904752 0.3709619085 0.7350312726
## GENE_3 -1.4348455193 1.6222740169 -1.9961944497 -0.7256958093 -0.2182705845
## GENE_4 0.5707620828 -0.7208047650 -0.4982973708 1.1190370694 1.0282982457
## GENE_5 -0.6291573841 -0.2637663312 -0.6705351446 0.3926805731 -0.3727408210
## <100 x 10> DelayedMatrix object of type "double":
## SAMP_1 SAMP_2 SAMP_3 ... SAMP_9 SAMP_10
## GENE_1 -0.088748834 -0.001622203 0.033994080 . 0.8336581 -0.9904412
## GENE_2 -1.466534447 -1.446917044 0.286380950 . 1.7863630 2.2644717
## GENE_3 -2.869691039 3.244548034 -3.992388899 . -2.2309079 3.5480848
## GENE_4 1.141524166 -1.441609530 -0.996594742 . 2.9883101 2.8527623
## GENE_5 -1.258314768 -0.527532662 -1.341070289 . 0.7937252 2.9244383
## ... . . . . . .
## GENE_96 0.5691416 -3.2682073 -1.9485764 . 2.72815505 2.43468216
## GENE_97 1.0103678 2.1867244 -0.5031021 . 1.85087107 0.31315195
## GENE_98 1.6784107 -2.2820196 -0.3429868 . 1.42317044 -2.87972021
## GENE_99 3.3410449 -3.6653382 -1.3375487 . -0.16142271 0.06968214
## GENE_100 2.6135542 -0.8016283 -1.1919262 . 1.85157179 -2.47210154
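A sketch of delayed subsetting and arithmetic, assuming a named 100-by-10 input; the names `out`, `sub` and `doubled` are illustrative.

```r
library(TileDBArray)

# Illustrative input with dimension names.
X <- matrix(rnorm(1000), ncol = 10)
rownames(X) <- sprintf("GENE_%i", seq_len(nrow(X)))
colnames(X) <- sprintf("SAMP_%i", seq_len(ncol(X)))
out <- as(X, "TileDBArray")

# Both operations are recorded rather than executed, so each result is a
# DelayedMatrix that still points at the unchanged TileDB backend.
sub <- out[1:5, 1:5]
doubled <- out * 2
sub
doubled

# Values are only computed when they are explicitly requested.
realized <- as.matrix(doubled)
```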
We can also do more complex matrix operations that are supported by DelayedArray:
## SAMP_1 SAMP_2 SAMP_3 SAMP_4 SAMP_5 SAMP_6 SAMP_7
## -6.030938 -19.186307 -14.835407 -6.850979 7.077898 -2.395486 14.760276
## SAMP_8 SAMP_9 SAMP_10
## -6.828849 5.511474 -6.786565
## [,1]
## GENE_1 -2.74656763
## GENE_2 1.15488743
## GENE_3 0.44938973
## GENE_4 4.53727101
## GENE_5 2.86414130
## GENE_6 -0.02498635
## GENE_7 1.81574661
## GENE_8 1.73645892
## GENE_9 2.65698572
## GENE_10 -2.41173782
## GENE_11 0.76904234
## GENE_12 4.22326119
## GENE_13 -4.34820670
## GENE_14 0.95670568
## GENE_15 -0.79595637
## GENE_16 -1.79797402
## GENE_17 1.63145200
## GENE_18 2.70077412
## GENE_19 2.39853894
## GENE_20 -1.17220350
## GENE_21 -3.87533526
## GENE_22 -0.71093994
## GENE_23 -3.09942375
## GENE_24 0.24626698
## GENE_25 -2.16143750
## GENE_26 -0.89970763
## GENE_27 0.19646400
## GENE_28 -3.15658279
## GENE_29 3.86931217
## GENE_30 -1.35089429
## GENE_31 -0.24127056
## GENE_32 2.61069687
## GENE_33 -1.25423423
## GENE_34 -2.82593729
## GENE_35 -0.50807143
## GENE_36 -2.45664957
## GENE_37 -0.41006145
## GENE_38 0.33441769
## GENE_39 -2.52427880
## GENE_40 -0.26549764
## GENE_41 -0.67851637
## GENE_42 0.15943554
## GENE_43 0.73758565
## GENE_44 -1.13038061
## GENE_45 -0.47033188
## GENE_46 0.22689398
## GENE_47 -0.11070914
## GENE_48 1.19633891
## GENE_49 -0.67154603
## GENE_50 0.33634914
## GENE_51 1.12637376
## GENE_52 -1.54772574
## GENE_53 -1.53760827
## GENE_54 -1.30752925
## GENE_55 -3.07390264
## GENE_56 1.38358641
## GENE_57 -3.71958148
## GENE_58 1.51789394
## GENE_59 -1.36210496
## GENE_60 4.09332506
## GENE_61 0.87431073
## GENE_62 -2.79984054
## GENE_63 0.39424498
## GENE_64 -3.79831613
## GENE_65 -1.12788789
## GENE_66 -0.47716881
## GENE_67 -0.33322544
## GENE_68 -3.60177252
## GENE_69 1.19082451
## GENE_70 -0.02056048
## GENE_71 -3.11812543
## GENE_72 2.56421196
## GENE_73 -1.95891252
## GENE_74 1.16679386
## GENE_75 -2.74524119
## GENE_76 0.34054103
## GENE_77 -2.04091567
## GENE_78 -2.54486939
## GENE_79 3.77812997
## GENE_80 -0.96773864
## GENE_81 -0.90746205
## GENE_82 1.78830899
## GENE_83 -0.60718104
## GENE_84 3.07986600
## GENE_85 0.56532383
## GENE_86 -2.70380210
## GENE_87 2.22707470
## GENE_88 -1.44938195
## GENE_89 2.18620370
## GENE_90 0.59243013
## GENE_91 0.79868100
## GENE_92 2.29998103
## GENE_93 -3.72172638
## GENE_94 0.56799083
## GENE_95 1.88639217
## GENE_96 1.47013702
## GENE_97 3.07303511
## GENE_98 -1.00165298
## GENE_99 -0.18191041
## GENE_100 0.31034837
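The two results above have the shape of column sums and a matrix-vector product. A sketch of such operations follows, assuming a named 100-by-10 input and a random vector of length 10; the names `out`, `cs`, `v` and `res` are illustrative.

```r
library(TileDBArray)

# Illustrative input with dimension names.
X <- matrix(rnorm(1000), ncol = 10)
rownames(X) <- sprintf("GENE_%i", seq_len(nrow(X)))
colnames(X) <- sprintf("SAMP_%i", seq_len(ncol(X)))
out <- as(X, "TileDBArray")

# Column sums, computed block by block over the backend.
cs <- colSums(out)
cs

# Matrix multiplication with an ordinary vector gives a 100 x 1 result.
v <- runif(ncol(out))
res <- out %*% v
head(res)
```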
We can adjust some parameters for creating the backend with
appropriate arguments to writeTileDBArray(). For example, we can
control the path to the backend as well as the name of the attribute
containing the data.
## <100 x 10> TileDBMatrix object of type "double":
## [,1] [,2] [,3] ... [,9] [,10]
## [1,] -0.02458814 1.52359084 0.63182124 . 0.4350149 -1.0125840
## [2,] -0.85877314 -0.74263880 -0.09237386 . 1.3393936 -1.5874116
## [3,] -0.10177113 0.39880655 0.97720153 . -0.2730836 -0.9345021
## [4,] 0.40619685 -0.38655532 1.53986379 . 0.4989098 0.1151319
## [5,] -1.57161553 -2.16554490 -1.41390699 . -0.2793398 0.4987026
## ... . . . . . .
## [96,] -2.063103850 0.732180804 1.304533995 . 2.2218856 -1.0474822
## [97,] -0.471793534 0.005793078 0.879812409 . 1.7921306 0.2589382
## [98,] -1.554407209 -0.100437015 0.625380320 . 0.6496302 -0.9340455
## [99,] -0.682782293 0.086042302 1.948936745 . -0.1578768 -1.5228820
## [100,] 0.484096187 -1.589224991 -0.136899694 . -0.9773228 0.6771734
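A sketch of setting both options through the `path` and `attr` arguments of `writeTileDBArray()`. The temporary path and the attribute name `"counts"` are arbitrary choices for illustration, and reopening the array through `TileDBArray()` with a matching `attr` is shown on the assumption that the constructor accepts a path in this way.

```r
library(TileDBArray)

# Illustrative input matrix.
X <- matrix(rnorm(1000), ncol = 10)

# Choose where the backend lives and what its data attribute is called.
path <- tempfile()
tdb <- writeTileDBArray(X, path = path, attr = "counts")
tdb

# The backend is a directory on disk at the requested location.
dir.exists(path)

# Reopen the same backend later by pointing at its path and attribute.
reopened <- TileDBArray(path, attr = "counts")
```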
As these arguments cannot be passed during coercion, we instead provide global variables that can be set or unset to affect the outcome.
## <100 x 10> TileDBMatrix object of type "double":
## [,1] [,2] [,3] ... [,9] [,10]
## [1,] -0.02458814 1.52359084 0.63182124 . 0.4350149 -1.0125840
## [2,] -0.85877314 -0.74263880 -0.09237386 . 1.3393936 -1.5874116
## [3,] -0.10177113 0.39880655 0.97720153 . -0.2730836 -0.9345021
## [4,] 0.40619685 -0.38655532 1.53986379 . 0.4989098 0.1151319
## [5,] -1.57161553 -2.16554490 -1.41390699 . -0.2793398 0.4987026
## ... . . . . . .
## [96,] -2.063103850 0.732180804 1.304533995 . 2.2218856 -1.0474822
## [97,] -0.471793534 0.005793078 0.879812409 . 1.7921306 0.2589382
## [98,] -1.554407209 -0.100437015 0.625380320 . 0.6496302 -0.9340455
## [99,] -0.682782293 0.086042302 1.948936745 . -0.1578768 -1.5228820
## [100,] 0.484096187 -1.589224991 -0.136899694 . -0.9773228 0.6771734
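A sketch of the global-variable route, using `setTileDBPath()` to fix the backend location before coercion. The name `path2` is illustrative, and passing `NULL` to restore the default is used here on the assumption that this is how the setting is unset.

```r
library(TileDBArray)

# Illustrative input matrix.
X <- matrix(rnorm(1000), ncol = 10)

# Set the global path, then coerce; the backend is created at path2.
path2 <- tempfile()
setTileDBPath(path2)
tdb <- as(X, "TileDBArray")
tdb

# Unset the global path so that later writes fall back to the default
# location instead of colliding with the array that now exists at path2.
setTileDBPath(NULL)
```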
## R version 4.5.2 (2025-10-31)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.3 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats4 stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] RcppSpdlog_0.0.23 TileDBArray_1.20.0 DelayedArray_0.36.0
## [4] SparseArray_1.10.1 S4Arrays_1.10.0 IRanges_2.44.0
## [7] abind_1.4-8 S4Vectors_0.48.0 MatrixGenerics_1.22.0
## [10] matrixStats_1.5.0 BiocGenerics_0.56.0 generics_0.1.4
## [13] Matrix_1.7-4 BiocStyle_2.38.0
##
## loaded via a namespace (and not attached):
## [1] bit_4.6.0 jsonlite_2.0.0 compiler_4.5.2
## [4] BiocManager_1.30.26 Rcpp_1.1.0 nanoarrow_0.7.0-1
## [7] jquerylib_0.1.4 yaml_2.3.10 fastmap_1.2.0
## [10] lattice_0.22-7 RcppCCTZ_0.2.13 R6_2.6.1
## [13] XVector_0.50.0 tiledb_0.33.0 knitr_1.50
## [16] maketools_1.3.2 bslib_0.9.0 rlang_1.1.6
## [19] cachem_1.1.0 xfun_0.54 sass_0.4.10
## [22] sys_3.4.3 bit64_4.6.0-1 cli_3.6.5
## [25] spdl_0.0.5 digest_0.6.37 grid_4.5.2
## [28] lifecycle_1.0.4 data.table_1.17.8 evaluate_1.0.5
## [31] nanotime_0.3.12 zoo_1.8-14 buildtools_1.0.0
## [34] rmarkdown_2.30 tools_4.5.2 htmltools_0.5.8.1