---
title: "3. Tensor decomposition by DelayedTensor"
author:
- name: Koki Tsuyuzaki
affiliation: Laboratory for Bioinformatics Research,
RIKEN Center for Biosystems Dynamics Research
- name: Itoshi Nikaido
affiliation: Laboratory for Bioinformatics Research,
RIKEN Center for Biosystems Dynamics Research
email: k.t.the-answer@hotmail.co.jp
graphics: no
package: DelayedTensor
output:
BiocStyle::html_document:
toc_float: true
vignette: |
%\VignetteIndexEntry{TensorDecomposition}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r style, echo = FALSE, results = 'asis', message=FALSE}
BiocStyle::markdown()
```
**Authors**: `r packageDescription("DelayedTensor")[["Author"]] `
**Last modified:** `r file.info("DelayedTensor_3.Rmd")$mtime`
**Compiled**: `r date()`
# Setting
```{r Setting, echo=TRUE}
suppressPackageStartupMessages(library("DelayedTensor"))
suppressPackageStartupMessages(library("DelayedArray"))
suppressPackageStartupMessages(library("HDF5Array"))
suppressPackageStartupMessages(library("DelayedRandomArray"))
darr <- RandomUnifArray(c(3,4,5))
setVerbose(FALSE)
setSparse(FALSE)
setAutoBlockSize(1E+8)
tmpdir <- tempdir()
setHDF5DumpDir(tmpdir)
```
# Tensor Decomposition
Tensor decomposition models decompose multiple factor matrices and core tensor.
Each factor matrix means the patterns of each mode
and is used for the visualization and the downstream analysis.
Core tensor means the intensity of the patterns
and is used to decide which patterns are informative.
We reimplemented some of the tensor decomposition functions of
`r CRANpkg("rTensor")` using block processing of
`r Biocpkg("DelayedArray")`.
Only tensor decomposition algorithms and utility functions that require
Fast Fourier Transform (e.g., `t_mult`, `t_svd`, and `t_svd_reconstruct`)
are exceptions and have not yet been implemented in `r Biocpkg("DelayedArray")`
because we are still investigating how to calculate them with
out-of-core manner.
## Tucker Decomposition
Suppose a tensor $\mathcal{X} \in \Re^{I \times J \times K}$.
Tucker decomposition models decomposes a tensor $\mathcal{X}$ into a core tensor
$\mathcal{G} \in \Re^{p \times q \times r}$,
and multiple factor matrices $A \in \Re^{I \times p}$,
$B \in \Re^{J \times q}$, and $C \in \Re^{K \times r}$
($p \leq I, q \leq J, \textrm{and}\ r \leq K$).
$$
\mathcal{X} = \mathcal{G} \times_{1} A \times_{2} B \times_{3} C
$$
For simplicity, here we will use a third-order tensor
but the Tucker decomposition can be applied to tensors of larger orders.
There are well-known two algorithms;
Higher-Order Singular Value Decomposition (HOSVD) and
Higher-order Orthogonal Iteration (HOOI).
`hosvd` performs the HOSVD Tucker decomposition
and `tucker` performs HOOI Tucker decomposition.
For the details, check the `hosvd` and `tucker` functions of
`r CRANpkg("rTensor")`.
```{r Tensor Decomposition 1, echo=TRUE}
out_hosvd <- hosvd(darr, ranks=c(2,1,3))
str(out_hosvd)
out_tucker <- tucker(darr, ranks=c(2,3,2))
str(out_tucker)
```
## CANDECOMP/PARAFAC (CP) Decomposition
Suppose a tensor $\mathcal{X} \in \Re^{I \times J \times K}$.
CP decomposition models decomposes a tensor $X$ into a core diagonal tensor
$\mathcal{G} \in \Re^{r \times r \times r}$,
and multiple factor matrices $A \in \Re^{I \times r}$,
$B \in \Re^{J \times r}$, and $C \in \Re^{K \times r}$
($r \leq \min{\left(I,J,K\right)}$).
$$
\mathcal{X} = \mathcal{G} \times_{1} A \times_{2} B \times_{3} C
$$
For simplicity, here we will use a third-order tensor
but the CP decomposition can be applied to tensors of larger orders.
Alternating least squares (ALS) is a well-known algorithm for CP decomposition
and `cp` performs the ALS CP decomposition.
For the details, check the `cp` function of `r CRANpkg("rTensor")`.
```{r Tensor Decomposition 2, echo=TRUE}
out_cp <- cp(darr, num_components=2)
str(out_cp)
```
## Multilinear Principal Component Analysis (MPCA)
MPCA is a kind of Tucker decomposition and when the order of tensor is 3,
this is also known as the
Generalized Low-Rank Approximation of Matrices (GLRAM).
For the details, check the `mpca` function of `r CRANpkg("rTensor")`.
```{r Tensor Decomposition 3, echo=TRUE}
out_mpca <- mpca(darr, ranks=c(2,2))
str(out_mpca)
```
## Population Value Decomposition (PVD)
Suppose a series of 2D data X_{j}, where $X_{j} \in \Re^{n_{1} \times n_{2}}$,
and $j = 1, \ldots , n_{3}$.
PVD models decomposes a tensor $X_{j}$
into two common factor matrices across all $X_{j}$,
$P \in \Re^{n_{1} \times r_{1}}$ and $D \in \Re^{n_{2} \times r_{2}}$,
and a $X_{j}$ specific factor matrix $V_{j} \in \Re^{r_{1} \times r_{2}}$
($r_{1} \leq n_{1}, \textrm{and}\ r_{2} \leq n_{2}$).
$$
X_{j} = P \times V_{j} \times D
$$
For the details, check the `pvd` function of `r CRANpkg("rTensor")`.
```{r Tensor Decomposition 4, echo=TRUE}
out_pvd <- pvd(darr,
uranks=rep(2,dim(darr)[3]), wranks=rep(3,dim(darr)[3]), a=2, b=3)
str(out_pvd)
```
# Session information {.unnumbered}
```{r sessionInfo, echo=FALSE}
sessionInfo()
```