<!--
%\VignetteEngine{knitr}
%\VignetteIndexEntry{Make Enriched Heatmaps}
-->

Make Enriched Heatmaps
========================================

**Author**: Zuguang Gu ( z.gu@dkfz.de )

**Date**: `r Sys.Date()`

-------------------------------------------------------------

```{r, echo = FALSE, message = FALSE}
library(markdown)
options(markdown.HTML.options = c(options('markdown.HTML.options')[[1]], "toc"))

library(knitr)
knitr::opts_chunk$set(
    error = FALSE,
    tidy  = FALSE,
    message = FALSE,
    fig.align = "center")
options(markdown.HTML.stylesheet = "custom.css")

options(width = 100)
```

Enriched heatmap is a special type of heatmap which visualizes the enrichment of genomic signals
to specific target regions. It is broadly used to 
visualize e.g. how histone modifications are enriched to transcription start sites.

There are several tools that can make such heatmap (e.g. [ngs.plot](https://github.com/shenlab-sinai/ngsplot), [deepTools](https://github.com/fidelram/deepTools) or [ChIPseeker](https://bioconductor.org/packages/release/bioc/html/ChIPseeker.html)).
Here we implement Enriched heatmap by **ComplexHeatmap** package. Since this type of heatmap is just a
normal heatmap but with some special settings, with the functionality of **ComplexHeatmap**, it would be much easier
to customize the heatmap as well as concatenating to a list of heatmaps to show correspondance between different data sources.

```{r, echo = FALSE, message = FALSE}
suppressPackageStartupMessages(library(EnrichedHeatmap))
```

## Basic

```{r, eval = FALSE}
library(EnrichedHeatmap)
```

Load the example data that we will use for demostration:

```{r}
set.seed(123)
load(paste0(system.file("extdata", "chr21_test_data.RData", package = "EnrichedHeatmap")))
ls()
```

The example data are all `GRanges` objects:

- `H3K4me3`: coverage for H3K4me3 histone marks
- `cgi`: CpG islands
- `genes`: genes
- `meth`: methylation
- `rpkm`: gene expression

In order to build the vignette fast, the data only includes chromosome 21. Also we downsampled 100000 CpG
sites for methylation data.

We first visualize how H3K4me3 histone modification is enriched around TSS.
First we extract TSS of genes (note `tss` has strand information):

```{r, fig.width = 3}
tss = promoters(genes, upstream = 0, downstream = 1)
tss[1:5]
H3K4me3[1:5]
```

Similar as other tools, the task of visualization are separated into two steps:

1. get association between genomic signals and targets by normalizing to a matrix.
2. visualize the matrix by heatmap.

```{r}
mat1 = normalizeToMatrix(H3K4me3, tss, value_column = "coverage", 
    extend = 5000, mean_mode = "w0", w = 50)
mat1
class(mat1)
```

`normalizeToMatrix()` converts the association between genomic signals (`H3K4me3`) and targets(`tss`) in to a matrix (actually `mat1` is just a normal matrix with several additional attributes). 
It first splits the extended targets regions (the extension to upstream and downstream is controlled by `extend` argument)
into a list of small windows (the width of the windows is controlled by `w`), then overlaps genomic signals to these small windows and calculates
the value for every small window which is the mean value of genomic signals that intersects with the window (the value
corresponds to genomic signals are controlled by `value_column` and how to calcualte the mean value is controlled by `mean_mode`).

There are several modes for `mean_mode` according to different types of genomic signals. It will be explained in later sections. 

With `mat1`, we can visualize it as a heatmap:

```{r, fig.width = 3}
EnrichedHeatmap(mat1, name = "H3K4me3")
```

`EnrichedHeatmap()` returns an `EnrichedHeatmap` class instance which is inherited from `Heatmap` class,
so parameters and methods for `Heatmap` class can be directly applied to `EnrichedHeatmap` class. Users can
go to the [**ComplexHeatmap** package](http://www.bioconductor.org/packages/devel/bioc/html/ComplexHeatmap.html) to get a more comprehensive help.

### Colors

Similar as the normal heatmap, colors can be controlled by a vector.

```{r, fig.width = 3}
EnrichedHeatmap(mat1, col = c("white", "red"), name = "H3K4me3")
```

You may wonder why the color looks so light. The reason is in coverage values in `H3K4me3`, there exist
some extreme values, which results in extreme value in `mat1`.

```{r}
quantile(H3K4me3$coverage, c(0, 0.25, 0.5, 0.75, 0.99, 1))
quantile(mat1, c(0, 0.25, 0.5, 0.75, 0.99, 1))
```

If a vector of colors is specified, sequential values from minimal to maximal are mapped to the colors,
and other values are linearly interpolated. To get rid of such extreme values, there are two ways. 
The first is to specify `trim` option which trims extreme values both at lower and upper bounds.
(In following, it means only to trim values larger than 99th quantile.)

```{r, fig.width = 3}
mat1_trim = normalizeToMatrix(H3K4me3, tss, value_column = "coverage", 
    extend = 5000, mean_mode = "w0", w = 50, trim = c(0, 0.01))
EnrichedHeatmap(mat1_trim, col = c("white", "red"), name = "H3K4me3")
```

The second way is to define a color mapping function which is robust to extreme values.
Another advantage of using a color mapping function is that if you have more than one heatmaps to make, it
makes colors in heatmaps comparable.

```{r, fig.width = 3}
library(circlize)
col_fun = colorRamp2(quantile(mat1, c(0, 0.99)), c("white", "red"))
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3")
```

To sum it up, the first way directly modified values in `mat1` while the second way keeps the original values
but using modified color mappings.

If `col` is not specified in `EnrichedHeatmap()`, blue-white-red is mapped to 1st quantile, median and 99th quantile by default.

In following sections, we will also use the matrix to do row-clustering, thus we directly use the trimmed matrix.

```{r}
mat1 = mat1_trim
```

### Split on rows

Split rows by a vector or a data frame by specifying `split` option.

```{r, fig.width = 3}
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", 
    split = sample(c("A", "B"), length(genes), replace = TRUE),
    column_title = "Enrichment of H3K4me3") 
```

Split rows by k-means clustering by specifying `km` option.

```{r, fig.width = 4}
set.seed(123)
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", km = 3,
    column_title = "Enrichment of H3K4me3", row_title_rot = 0)
```

### Cluster on rows

Cluster on rows. By default `show_row_dend` is turned off, so you don't need to specify it here.
More options for row clustering can be found in the help page of `Heatmap()`.

```{r, fig.width = 3}
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", 
    cluster_rows = TRUE, column_title = "Enrichment of H3K4me3")     
```

### Column annotation

There is a special column annotation function `anno_enriched()` which shows mean values of columns 
(i.e. mean signals across target regions) in the normalized matrix.

```{r, fig.width = 3}
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
    top_annotation = HeatmapAnnotation(lines = anno_enriched()), 
    top_annotation_height = unit(2, "cm"))
```

If rows are split, the column annotation will show enrichment lines for all row clusters.

```{r, kmeans_anno, fig.width = 4}
set.seed(123)
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
    # note we have three row-clusters, so we assign three colors for the annotation lines
    top_annotation = HeatmapAnnotation(lines = anno_enriched(gp = gpar(col = 2:4))), 
    top_annotation_height = unit(2, "cm"),
    km = 3, row_title_rot = 0)
```

You can also add error areas (1 se) for each line. The color of error areas always have the same 
color of corresponding line but with 75 percent transparency.

Users should be careful with `show_error`. It only makes sense when patterns in heatmap or row clusters
are homogeneous.

Actually I am not quite sure whether we should visualize the errors because the spreadness of the data
can already be seen in the heatmaps.

Also we demonstrate how to add legend for the annotation lines by `Legend()` function from **ComplexHeatmap** package.

```{r,, fig.width = 4}
set.seed(123)
lgd = Legend(at = c("cluster1", "cluster2", "cluster3"), title = "Clusters", 
    type = "lines", legend_gp = gpar(col = 2:4))
ht = EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3",
    # note we have three row-clusters, so we assign three colors for the annotation lines
    top_annotation = HeatmapAnnotation(lines = anno_enriched(gp = gpar(col = 2:4), show_error = TRUE)), 
    top_annotation_height = unit(2, "cm"),
    km = 3, row_title_rot = 0)
draw(ht, annotation_legend_list = list(lgd))
```

### Smoothing

Rows can be smoothed by setting `smooth` to `TRUE` when generating the matrix. 
Later we will demonstrate smoothing can also help to impute `NA` values.

As smoothing may change the original data range, the color mapping function `col_fun`
here ensures that the color palette is still the same as the unsmoothed one.

`empty_value` corresponds to the regions that have no signal overlapped. The proper value
depends on specific scenarios. Here since we visualize coverage from ChIP-Seq data, it is reasonable
to assign 0 to regions with no H3K4me3 signal.

```{r smooth, fig.width = 3}
mat1_smoothed = normalizeToMatrix(H3K4me3, tss, value_column = "coverage", 
    extend = 5000, mean_mode = "w0", w = 50, empty_value = 0, smooth = TRUE)
EnrichedHeatmap(mat1_smoothed, col = col_fun, name = "H3K4me3")
```

### Extensions to target regions

Extension to upstream and downstream can be controled by `extend` either by a single value
or a vector of length 2.

```{r, fig.width = 3}
# upstream 1kb, downstream 2kb
mat12 = normalizeToMatrix(H3K4me3, tss, value_column = "coverage", 
    extend = c(1000, 2000), mean_mode = "w0", w = 50)
EnrichedHeatmap(mat12, name = "H3K4me3", col = col_fun)
```

Either upstream or downstream can be set to 0.

```{r, fig.width = 3}
mat12 = normalizeToMatrix(H3K4me3, tss, value_column = "coverage", 
    extend = c(0, 2000), mean_mode = "w0", w = 50)
EnrichedHeatmap(mat12, name = "H3K4me3", col = col_fun)
mat12 = normalizeToMatrix(H3K4me3, tss, value_column = "coverage", 
    extend = c(1000, 0), mean_mode = "w0", w = 50)
EnrichedHeatmap(mat12, name = "H3K4me3", col = col_fun)
```

### Other graphic settings

By default there is an axis on the bottom border of the heatmap and a vertical line which represents
the targets. There are several arguments which can be used to customize:

```{r, fig.width = 3}
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", 
    pos_line_gp = gpar(col = "blue", lwd = 2), axis_name = c("-5kb", "TSS", "5kb"), 
    axis_name_rot = -45, border = FALSE)
```

### Mean mode

Upstream and downstream also the target body are segmented into a list of small windows and overlap
to signal regions. Since signal regions and small windows do not always 100 percent overlap, 
to summarize values in small windows, there are four different average modes:
 
Following illustrates different settings for ``mean_mode`` (note there is a signal region overlapping with other signal regions):

       40      50     20     values in signal
     ++++++   +++    +++++   signal
            30               values in signal
          ++++++             signal
       =================     window (17bp), there are 4bp not overlapping to any signal region.
         4  6  3      3      overlap

     absolute: (40 + 30 + 50 + 20)/4
     weighted: (40*4 + 30*6 + 50*3 + 20*3)/(4 + 6 + 3 + 3)
     w0:       (40*4 + 30*6 + 50*3 + 20*3)/(4 + 6 + 3 + 3 + 4)
     coverage: (40*4 + 30*6 + 50*3 + 20*3)/17

To explain it more clearly, let's consider two scenarios:

First, we want to calculate mean methylation from 3 CpG sites in a 20bp window. Since methylation
is only measured at CpG site level, the mean value should only be calculated from the 3 CpG sites while not the non-CpG sites. In this
case, `absolute` mode should be used here.

Second, we want to calculate mean coverage in a 20bp window. Let's assume coverage is 5 in 1bp ~ 5bp, 10 in 11bp ~ 15bp and 20 in 16bp ~ 20bp.
Since converage is kind of attribute for all bases, all 20 bp should be taken in account. Thus, here `w0` mode should be used
which also takes account of the 0 coverage in 6bp ~ 10bp. The mean coverage will be caculated as `(5*5 + 10*5 + 20*5)/(5+5+5+5)`.

Third, genes have multiple transcripts and we want to calculate how many transcripts eixst in a certain position in the gene body.
In this case, values associated to each transcript are binary (either 1 or 0) and `coverage` mean mode should be used.

### Smoothing improves visualization on methylation profiles

Following heatmap visualizes the enrichment of low methylated regions on TSS. The grey colors
represent the windows with no CpG sites (note we set `NA` to `empty_value` and grey is the default color
for `NA` values by **ComplexHeatmap**).

```{r, fig.width = 3}
meth[1:5]
mat2 = normalizeToMatrix(meth, tss, value_column = "meth", mean_mode = "absolute",
    extend = 5000, w = 50, empty_value = NA)
meth_col_fun = colorRamp2(c(0, 0.5, 1), c("blue", "white", "red"))
EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation", column_title = "methylation near TSS")
```

When overlapping CpG positions to segmented target regions, it is possible that there is no CpG site in some windows. 
The values for these windows which contain no CpG sites can be imputed by smoothing. Although it seems not proper
to assign methylation values to non-CpG windows, but it will enhance the effect of visualization a lot.

```{r, fig.width = 3}
mat2 = normalizeToMatrix(meth, tss, value_column = "meth", mean_mode = "absolute",
    extend = 5000, w = 50, empty_value = NA, smooth = TRUE)
EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation", column_title = "methylation near TSS")
```

To do the smoothing, by default, `locfit()` is first applied to each row in the original matrix. If it is failed, `loess()` smoothing
is applied afterwards. If both smoothing methods are failed, there will be a warning and the original value is kept.

Users can provides their own smoothing function by `smooth_fun` argument. This self-defined function accepts a numeric
vector (may contains `NA` values) and returns a vector with same length. If the smoothing is failed, the function
should call `stop()` to throw errors so that `normalizeToMatrix()` can catch how many rows are failed in smoothing. 
Take a look at the source code of `default_smooth_fun()` to get an example.

## Targets are regions

In the example of H3K4me3, the target regions are single points. 
The target can also be regions with width larger than 1.
Following heatmap visualizes the enrichment of low methylation on CpG islands:

```{r, fig.width = 3}
mat3 = normalizeToMatrix(meth, cgi, value_column = "meth", mean_mode = "absolute",
    extend = 5000, w = 50, empty_value = NA, smooth = TRUE)
EnrichedHeatmap(mat3, col = meth_col_fun, name = "methylation", column_title = "methylation near CGI")
```

Width of the target regions shown on heatmap can be controlled by `target_ratio` which is relative to
the width of the complete heatmap.

Target regions are also splitted into small windows, but since width of the target regions are different from each other, 
they are splitted by percent to their full width (the percent value is calculated automatically).

```{r, fig.width = 3}
mat3 = normalizeToMatrix(meth, cgi, value_column = "meth", mean_mode = "absolute",
    extend = 5000, w = 50, empty_value = NA, smooth = TRUE, target_ratio = 0.3)
EnrichedHeatmap(mat3, col = meth_col_fun, name = "methylation", axis_name_rot = 90,
    column_title = "methylation near CGI")
```

When genomic targets are regions, upstream and/or downstream can be excluded in the heatmap.

```{r extension, fig.width = 3}
mat3 = normalizeToMatrix(meth, cgi, value_column = "meth", mean_mode = "absolute",
    extend = c(0, 5000), w = 50, empty_value = NA, smooth = TRUE, target_ratio = 0.5)
EnrichedHeatmap(mat3, col = meth_col_fun, name = "methylation", axis_name_rot = 90,
    column_title = "methylation near CGI")
mat3 = normalizeToMatrix(meth, cgi, value_column = "meth", mean_mode = "absolute",
    extend = c(5000, 0), w = 50, empty_value = NA, smooth = TRUE, target_ratio = 0.5)
EnrichedHeatmap(mat3, col = meth_col_fun, name = "methylation", axis_name_rot = 90,
    column_title = "methylation near CGI")
# since there is not upstream and downstream, the number of columns is controlled by `k` argument
mat3 = normalizeToMatrix(meth, cgi, value_column = "meth", mean_mode = "absolute",
    extend = 0, k = 20, empty_value = NA, smooth = TRUE, target_ratio = 1)
EnrichedHeatmap(mat3, col = meth_col_fun, name = "methylation", axis_name_rot = 90,
    column_title = "methylation near CGI")
```

You may notice there are warnings when executing above code, that is because there are very few signals overlapped to some rows,
which means there are too many `NA` to do the smoothing. Corresponding index for failed rows can be get by :

```{r}
attr(mat3, "failed_rows")
```

and maybe you can remove them beforehand.

## Multiple heatmaps

The power of **EnrichedHeatmap** package is that parallel heatmaps can be concatenated,
both for enriched heatmap, normal heatmap as well the normal row annotations, which
provides a very efficient way to visualize multiple sources of information.

With the functionality of **ComplexHeatmap** package, heatmaps can be concatenated
by `+` operator. `EnrichedHeatmap` objects, `Heatmap` objects and `HeatmapAnnotation` objects can be mixed.

Following heatmaps visualizes correspondance between H3K4me3 modification, methylation and
gene expression. It is quite straightforward to see high expression correlates
with low methylation and high H3K4me3 signal around TSS.

```{r, fig.width = 6}
EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", width = 1) + 
EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation", width = 1) +
Heatmap(log2(rpkm+1), col = c("white", "orange"), name = "log2(rpkm+1)", 
    show_row_names = FALSE, width = unit(5, "mm"))
```

Of course you can split rows by splitting the main heatmap. In following heatmaps,
the most right color bar can be corresponded to the colors in column annotation on both
histone modification heatmap and methylation heatmap.

Here we emphasize again, proper trimming on the matrix will greatly help to reveal the patterns. 
You can try replace `mat1` to a un-trimmed matrix and see whether this patterns shown below still preserve.

```{r, fig.width = 7, fig.height = 8}
set.seed(123)
partition = kmeans(mat1, centers = 3)$cluster
lgd = Legend(at = c("cluster1", "cluster2", "cluster3"), title = "Clusters", 
    type = "lines", legend_gp = gpar(col = 2:4))
ht_list = Heatmap(partition, col = structure(2:4, names = as.character(1:3)), name = "partition",
              show_row_names = FALSE, width = unit(3, "mm")) +
          EnrichedHeatmap(mat1, col = col_fun, name = "H3K4me3", split = partition, width = 1,
              top_annotation = HeatmapAnnotation(lines = anno_enriched(gp = gpar(col = 2:4))), 
              top_annotation_height = unit(2, "cm"), row_title_rot = 0,
              column_title = "H3K4me3", combined_name_fun = NULL) + 
          EnrichedHeatmap(mat2, col = meth_col_fun, name = "methylation", width = 1,
              top_annotation = HeatmapAnnotation(lines = anno_enriched(gp = gpar(col = 2:4))), 
              top_annotation_height = unit(2, "cm"),
              column_title = "Methylation") +
          Heatmap(log2(rpkm+1), col = c("white", "orange"), name = "log2(rpkm+1)", 
              show_row_names = FALSE, width = unit(5, "mm"))
draw(ht_list, main_heatmap = "H3K4me3", gap = unit(c(2, 10, 2), "mm"),
    annotation_legend_list = list(lgd))
```

## Visualize positive signals and negative signals separatedly

Sometimes we visualize the general correlation around certain genomic targets or the difference
between two subgroups. In this case, it makes more sense to visualize the enrichment for the 
positive signals and negative signals separatedly. In following example, variable `mat_H3K4me1`
contains correlation between H3K4me1 signal and expression of corresponding genes in (-5kb, 10kb)
of the gene TSS.

```{r}
load(paste0(system.file("extdata", "H3K4me1_corr_normalize_to_tss.RData", package = "EnrichedHeatmap")))
mat_H3K4me1
```

In `anno_enriched()`, there are two non-standard parameters `neg_col` and `pos_col` for `gp`. If these
two are set, the enrichment lines are drawn for the positive signals and negative signals in the matrix
separatedly.

```{r, fig.width = 4}
EnrichedHeatmap(mat_H3K4me1, col = colorRamp2(c(-1, 0, 1), c("darkgreen", "white", "red")), name = "corr_H3K4me1",
    top_annotation = HeatmapAnnotation(line = anno_enriched(gp = gpar(neg_col = "darkgreen", pos_col = "red"))),
    top_annotation_height = unit(2, "cm"))
```

## Restrict overlapping by providing mapping

By default every genomic signal tries to intersect to every target region, but if mapping is
provided, only those genomic signals that are mapped to the corresponding target region will be overlapped.

To illustrate it more clearly, we load the example data. 
`gene` column in `neg_cr` is used to map to the names of `all_tss`.
In following example, `neg_cr` is the signal and `all_tss` is the target.

```{r}
load(paste0(system.file("extdata", "neg_cr.RData", package = "EnrichedHeatmap")))
all_tss = promoters(all_genes, upstream = 0, downstream = 1)
all_tss = all_tss[unique(neg_cr$gene)]
neg_cr[1:2]
all_tss[1:2]
```

In this example, `neg_cr` contains regions that show negative correlation between methylation
and expression for the genes. The negative correlated regions are detected as:

1. extend gene to upstream 5kb and downtream 5kb;
2. for every gene, use a sliding window to go through left to right
    and find correlated regions by looking at the correlation between methylation in the window and expression for the gene.

Since genes may be close to each other, it is possible that one correlated region for gene A
overlaps with gene B, and actually we only want to overlap this correlated regions to gene A while not gene B. 
By specifying the mapping, we can correspond correlated regions to the correct genes.

```{r, fig.width = 3}
mat4 = normalizeToMatrix(neg_cr, all_tss, mapping_column = "gene", w = 50, mean_mode = "w0")
EnrichedHeatmap(mat4, col = c("white", "green"), name = "neg_cr", cluster_rows = TRUE,
    top_annotation = HeatmapAnnotation(lines = anno_enriched(gp = gpar(col = "green"))), 
    top_annotation_height = unit(2, "cm"))
```

Above heatmap shows negative correlated regions are enriched at some distance downstream of the TSS.
We guess it is because genes have alternative transcripts and negative correlated regions are
actually enriched at the start sites of transcripts.

Next we add another heatmap showing how transcripts are distributed to gene TSS. Note here `mean_mode`
is set to `coverage` when normalizing the matrix.

Maybe here the heatmap
is not a nice way for showing transcripts, but according to the annotation graphs on the both top, 
we can see there is a perfect fitting for the peaks of negative correlated regions and transcripts.

```{r, fig.width = 6}
mat5 = normalizeToMatrix(tx, all_tss, mapping_column="gene", extend = c(5000, 10000), w = 50, 
    mean_mode = "coverage")
ht_list = EnrichedHeatmap(mat4, col = c("white", "green"), name = "neg_cr", cluster_rows = TRUE,
              top_annotation = HeatmapAnnotation(lines1 = anno_enriched(gp = gpar(col = "green"))), 
              top_annotation_height = unit(2, "cm")) +
          EnrichedHeatmap(mat5, col = c("white", "black"), name = "tx",
              top_annotation = HeatmapAnnotation(lines2 = anno_enriched(gp = gpar(col = "black"))), 
              top_annotation_height = unit(2, "cm"))
draw(ht_list, gap = unit(1, "cm"))
```

## Features coming from ComplexHeatmap package

Since **EnrichedHeatmap** is built upon the **ComplexHeatmap** package, features in **ComplexHeatmap** can be
used directly for **EnrichedHeatmap**. As shown before, heatmaps can be split either by `km` or `spilt` arguments.

The order of rows can be retrieved by `row_order()`.

```{r, eval = FALSE}
# code not run
ht_list = draw(ht_list)
row_order(ht_list)
```

If you are interested in a small cluster, under the interactive mode,
you can use mouse to select this region by `selectArea()` function, and it will give you the index of rows
in the selected sub-region.

```{r, eval = FALSE}
# code not run
draw(ht_list)
pos = selectArea()
```

Since `EnrichedHeatmap` and `EnrichedHeamtapList` class are inherited from `Heamtap` and `HeamtapList` class
respectively, all advanced parameters in the latter two classes can be directly used in the former two classes.

E.g. to change graphic settings for the heatmap title:

```{r, eval = FALSE}
# code not run
EnrichedHeatmap(..., column_title_gp = ...)
```

To change graphic settings for legends:

```{r, eval = FALSE}
# code not run
EnrichedHeatmap(..., heatmap_legend_param = ...)
# or set is globally
ht_global_opt(...)
EnrichedHeatmap(...)
ht_global_opt(RESET = TRUE)
```

To set the width of the heatmaps if there are more than one heatmaps:

```{r, eval = FALSE}
# code not run
EnrichedHeatmap(..., width = ...) + EnrichedHeatmap(...)
```

For more advanced settings, please directly go to [the vignettes in the **ComplexHeamtap** package](http://www.bioconductor.org/packages/devel/bioc/vignettes/ComplexHeatmap/inst/doc/s1.introduction.html).

Together with above features, you can make very complex heatmaps. Following example is from a real-world dataset.
Some information is hidden because the data is not published yet, but still, you can see it is really easy
to correspond different sources of information.

<p><img src='complex.png' /></p>

## Summarize from a list of matrix

Let's assume you have a list of histone modification signals for different samples and you want
to visualize the mean pattern across samples. You can first normalize histone mark signals for each sample and then
calculate means values across all samples. In following example code, `hm_gr_list` is a list of `GRanges` objects
which contain positions of histone modifications, `tss` is a `GRanges` object containing positions of gene TSS.

```{r, eval = FALSE}
# code not run
mat_list = NULL
for(i in seq_along(hm_gr_list)) {
    mat_list[[i]] = normalizeToMatrix(hm_gr_list[[i]], tss, value_column = ...)
}
```

Applying `getSignalsFromList()` to `mat_list`, it gives a new normalized matrix which contains mean signals and can
be directly used in `EnrichedHeatmap()`.

```{r, eval = FALSE}
# code not run
mat = getSignalsFromList(mat_list)
EnrichedHeatmap(mat)
```

Next let's consider a second scenario: we want to see the correlation between histone modification and gene expression.
In this case, `fun` can have a second argument so that users can correspond histone signals to the expression of the
associated gene. In following code, `expr` is a matrix of expression, columns in `expr` correspond to elements in `hm_gr_list`,
rows in `expr` are same as `tss`.

```{r, eval = FALSE}
# code not run
mat = getSignalsFromList(mat_list, fun = function(x, i) cor(x, expr[i, ], method = "spearman"))
```

Then `mat` here can be used to visualize how gene expression is correlated to histone modification around TSS.

```{r, eval = FALSE}
# code not run
EnrichedHeatmap(mat)
```

## Use your own matrix

`normalizeToMatrix()` is used to normalize the associations between genomic signals to the targets. The returned value
is just a simple matrix but with several attributes attached. Sometimes, users may have their own way to generate such matrix.
It is easy to add the addtional attributes and send to `EnrichedHeamtap()` for visualization.

Following four attributes should be attached. Basically they are used for making the axes and labels.

```{r, eval = FALSE}
attr(mat, "upstream_index")
attr(mat, "target_index")
attr(mat, "downstream_index")
attr(mat, "extend")
```

To taks as an example, in following code, `mat2` is a simple matrix which only contains `dim` attributes.
`mat2` can be thought as a matrix obtained from other methods.

```{r}
mat1 = normalizeToMatrix(H3K4me3, tss, value_column = "coverage", 
    extend = 5000, mean_mode = "w0", w = 50)
mat2 = mat1
attributes(mat2) = NULL
dim(mat2) = dim(mat1)
mat2[1:4, 1:4]
```

As we already know, in `mat2`, upstream is extended to 5kb by 50bp window, which means the first 100 columns
correspond to the upstream. Similar the last 100 columns for downstream. Here the targets is TSS which can be thought
as with no width. So we can set column index attributes for upstream, target and downstream as follows:

```{r}
attr(mat2, "upstream_index") = 1:100
attr(mat2, "target_index") = integer(0)
attr(mat2, "downstream_index") = 101:200
attr(mat2, "extend") = c(5000, 5000)  # it must be a vector of length two
```

And don't forget to set `mat2` to `normalizedMatrix` class. And now `mat2` is a valid object for `EnrichedHeamtap()`.

```{r}
class(mat2) = c("normalizedMatrix", "matrix")
mat2
```

Above four attributes are enough for making the heatmaps, there are several more attributes which can give
better information when printing `mat2`.

```{r}
attr(mat2, "signal_name") = "H3K4me3"
attr(mat2, "target_name") = "TSS"
mat2
```

## Notice

If you generate the plot for the whole genome, I suggest you first save the figure as pdf format and then
convert to png by [`convert`](http://www.imagemagick.org/script/convert.php) software, instead of directly saving as png format.
A second solution is to set `use_raster` to `TRUE` to replace the heatmap bodies with raster images. Check [this document](http://www.bioconductor.org/packages/devel/bioc/vignettes/ComplexHeatmap/inst/doc/s2.single_heatmap.html#toc_7)

```{r, eval = FALSE}
# code not run
EnrichedHeatmap(mat, use_raster = TRUE, raster_device = ..., raster_device_param = ...)
```

If you meet following error when doing smoothing in `normalizeToMatrix()`:

```
Error: segfault from C stack overflow
```

You can either:

1. remove target regions for which there are few signals overlapped to them;
2. change to another `smooth_fun()` or change parameters in `locfit()`.

For solution 1, you can first calculate the matrix without smoothing and calculate
the percent of `NA` values in each row. Rows having high `NA` values can be removed.

```{r, eval = FALSE}
# code not run
mat = normalizeToMatrix(..., smooth = FALSE)
# the percent of NA values in each row
apply(mat, 1, function(x) sum(is.na(x)/length(x)))
```

## Session info

```{r}
sessionInfo()
```