---
title: "hicVennDiagram Vignette: overview"
author: "Jianhong Ou"
date: "`r BiocStyle::doc_date()`"
package: "`r BiocStyle::pkg_ver('hicVennDiagram')`"
vignette: >
  %\VignetteIndexEntry{hicVennDiagram Vignette: overview}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
output:
  html_document:
    theme: simplex
    toc: true
    toc_float: true
    toc_depth: 4
    fig_caption: true
---

```{r, echo=FALSE, results="hide", warning=FALSE}
suppressPackageStartupMessages({
    library(hicVennDiagram)
    library(GenomicRanges)
    library(TxDb.Hsapiens.UCSC.hg38.knownGene)
})
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
```

# Introduction
When comparing samples, it is common to perform the task of identifying 
overlapping loops among two or more sets of genomic interactions.
Traditionally, this is achieved through the use of visualizations such as
`vennDiagram` or `UpSet` plots.
However, it is frequently observed that the total count displayed in these
plots does not match the original counts for each individual list.
The reason behind this discrepancy is that a single overlap may encompass
multiple interactions for one or more samples.
This issue is extensively discussed in the realm of overlapping caller
for ChIP-Seq peaks.

The _hicVennDiagram_ aims to provide a easy to use tool for overlapping
interactions calculation and proper visualization methods. The _hicVennDiagram_
generates plots specifically crafted to eliminate the deceptive visual
representation caused by the counts method.

# Quick start

Here is an example using _hicVennDiagram_ with 3 files in `BEDPE` format.

## Installation
First, install _hicVennDiagram_ and other packages required to run 
the examples.

```{r installation, eval=FALSE}
library(BiocManager)
BiocManager::install("hicVennDiagram")
```
## Load library

```{r load_library}
library(hicVennDiagram)
library(ggplot2)
```


```{r quick_start}
# list the BEDPE files
file_folder <- system.file("extdata",
                           package = "hicVennDiagram",
                           mustWork = TRUE)
file_list <- dir(file_folder, pattern = ".bedpe", full.names = TRUE)
names(file_list) <- sub(".bedpe", "", basename(file_list))
basename(file_list)
venn <- vennCount(file_list)
## upset plot
## temp fix for https://github.com/krassowski/complex-upset/issues/195
upset_themes_fix <- lapply(ComplexUpset::upset_themes, function(.ele){
    lapply(.ele, function(.e){
        do.call(theme, .e[names(.e) %in% names(formals(theme))])
    })
})
upsetPlot(venn,
          themes = upset_themes_fix)
## venn plot
vennPlot(venn)
## use browser to adjust the text position, and shape colors.
browseVenn(vennPlot(venn))
```

# Details about `vennCount`
The `vennCount` function borrows the power of `InteractionSet:findOverlaps` to
calculate the overlaps and then summarizes the results for each category.
Users may want to try different combinations of `maxgap` and `minoverlap`
parameters to calculate the overlapping loops.
```{r vennCount}
venn <- vennCount(file_list, maxgap=50000, FUN = max) # by default FUN = min
upsetPlot(venn, label_all=list(
                          na.rm = TRUE,
                          color = 'black',
                          alpha = .9,
                          label.padding = unit(0.1, "lines")
                      ),
          themes = upset_themes_fix)
```

# Plot for overlapping peaks output by `ChIPpeakAnno`

```{r chippeakanno_findOverlapsOfPeaks, warning=FALSE}
library(ChIPpeakAnno)
bed <- system.file("extdata", "MACS_output.bed", package="ChIPpeakAnno")
gr1 <- toGRanges(bed, format="BED", header=FALSE)
gff <- system.file("extdata", "GFF_peaks.gff", package="ChIPpeakAnno")
gr2 <- toGRanges(gff, format="GFF", header=FALSE, skip=3)
ol <- findOverlapsOfPeaks(gr1, gr2)
overlappingPeaksToVennTable <- function(.ele){
    .venn <- .ele$venn_cnt
    k <- which(colnames(.venn)=="Counts")
    rownames(.venn) <- apply(.venn[, seq.int(k-1)], 1, paste, collapse="")
    colnames(.venn) <- sub("count.", "", colnames(.venn))
    vennTable(combinations=.venn[, seq.int(k-1)],
              counts=.venn[, k],
              vennCounts=.venn[, seq.int(ncol(.venn))[-seq.int(k)]])
}
venn <- overlappingPeaksToVennTable(ol)
vennPlot(venn)
## or you can simply try vennPlot(vennCount(c(bed, gff)))
upsetPlot(venn, themes = upset_themes_fix)
## change the font size of labels and numbers
updated_theme <- ComplexUpset::upset_modify_themes(
              ## get help by vignette('Examples_R', package = 'ComplexUpset')
              list('intersections_matrix'=
                       ggplot2::theme(
                           ## font size of label: gr1/gr2
                           axis.text.y=ggplot2::element_text(size=24),
                           ## font size of label `group`
                           axis.title.x=ggplot2::element_text(size=24)),
                   'overall_sizes'=
                       ggplot2::theme(
                           ## font size of x-axis 0-200
                           axis.text=ggplot2::element_text(size=12),
                           ## font size of x-label `Set size`
                           axis.title=ggplot2::element_text(size=18)),
                   'Intersection size'=
                       ggplot2::theme(
                           ## font size of y-axis 0-150
                           axis.text=ggplot2::element_text(size=20),
                           ## font size of y-label `Intersection size`
                           axis.title=ggplot2::element_text(size=16)
                       ),
                   'default'=ggplot2::theme_minimal())
              )
updated_theme <- lapply(updated_theme, function(.ele){
    lapply(.ele, function(.e){
        do.call(theme, .e[names(.e) %in% names(formals(theme))])
    })
})
upsetPlot(venn,
          label_all=list(na.rm = TRUE, color = 'gray30', alpha = .7,
                         label.padding = unit(0.1, "lines"),
                         size = 8 #control the font size of the individual num
                         ),
          base_annotations=list('Intersection size'=
                                    ComplexUpset::intersection_size(
                                        ## font size of counts in the bar-plot
                                        text = list(size=6)
                                        )),
          themes = updated_theme
          )
```

# GLEAM test

Genomic Loops Enrichment Analysis Method (GLEAM) do enrichment analysis like
GREAT (Genomic Regions Enrichment of Annotations Tool) for the interactions.

```{r}
pd <- system.file("extdata", package = "hicVennDiagram", mustWork = TRUE)
fs <- dir(pd, pattern = ".bedpe", full.names = TRUE)
fs <- fs[!grepl('group1', fs)] # make the test data smaller
set.seed(123)
background <- createGIbackground(fs)
gleamTest(fs, background = background, method = 'binom')
```


# Session Info
```{r sessionInfo, results='asis'}
sessionInfo()
```