---
title: "An introduction to the bambu package using NanoporeRNASeq data"
output: rmarkdown::html_vignette
vignette: >
    %\VignetteIndexEntry{NanoporeRNASeq}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
---


```{r, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE, tidy = TRUE,
    warning=FALSE, message=FALSE,
    comment = "##>"
)
```

## Introduction
*[NanoporeRNASeq](https://github.com/GoekeLab/NanoporeRNASeq)* contains RNA-Seq
data from the K562 and MCF7 cell lines that were generated by the SG-NEx project
(https://github.com/GoekeLab/sg-nex-data). Each cell line has three
replicates: one from direct RNA sequencing and two from cDNA sequencing.
The files contain reads aligned to the human genome (GRCh38),
chromosome 22 (positions 1-25500000).

## Accessing NanoporeRNASeq data
### Load the NanoporeRNASeq package
```{r setup}
library("NanoporeRNASeq")
```

### List the samples
```{r samples}
data("SGNexSamples")
SGNexSamples
```


### List the available BAM files
```{r bamfiles}
library(ExperimentHub)
# query ExperimentHub for the NanoporeRNASeq BAM files
NanoporeData <- query(ExperimentHub(), c("NanoporeRNA", "GRCh38", "Bam"))
# collect the six samples into a single BamFileList
bamFiles <- Rsamtools::BamFileList(NanoporeData[["EH3808"]],
    NanoporeData[["EH3809"]], NanoporeData[["EH3810"]], NanoporeData[["EH3811"]],
    NanoporeData[["EH3812"]], NanoporeData[["EH3813"]])
```
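
To check what was retrieved, the file list can be inspected directly. The chunk
below is a minimal sketch that is not evaluated here; it assumes the `bamFiles`
object created above and uses `path()` from *BiocGenerics* and `countBam()`
from *Rsamtools*.

```{r bamfiles-inspect, eval = FALSE}
library(Rsamtools)
# number of BAM files in the list (six are expected, one per sample)
length(bamFiles)
# local cache paths of the downloaded files
BiocGenerics::path(bamFiles)
# number of records and nucleotides in the first BAM file
countBam(bamFiles[[1]])
```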

### Get the annotation GRangesList
```{r annotation}
data("HsChr22BambuAnnotation")
HsChr22BambuAnnotation
```
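
The annotation object is a `GRangesList` with one element per transcript, so
the standard *GenomicRanges* accessors can be used to explore it. The chunk
below is a minimal sketch that is not evaluated here; it assumes that the
transcript-level metadata include a `GENEID` column, as is usual for
annotations prepared for *bambu*.

```{r annotation-explore, eval = FALSE}
library(GenomicRanges)
# number of annotated transcripts
length(HsChr22BambuAnnotation)
# exons of a single transcript, selected by transcript ID
HsChr22BambuAnnotation[["ENST00000215832"]]
# transcript-level metadata (assumed to include a GENEID column)
head(mcols(HsChr22BambuAnnotation))
# number of exons per transcript
head(elementNROWS(HsChr22BambuAnnotation))
```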

## Visualizing a gene of interest from a single BAM file
We can visualize the read coverage of one sample over a single transcript,
ENST00000215832 (MAPK1):
```{r, fig.width = 8, fig.height = 6}
library(ggbio)
# BSgenome package providing the GRCh38 genome sequence
library(BSgenome.Hsapiens.NCBI.GRCh38)
# exon ranges of the transcript of interest
range <- HsChr22BambuAnnotation$ENST00000215832
# plot annotation track
tx <- autoplot(range, aes(col = strand), group.selfish = TRUE)
# plot coverage track
coverage <- autoplot(bamFiles[[1]], aes(col = coverage), which = range)

# merge the tracks into one plot
tracks(annotation = tx, coverage = coverage,
        heights = c(1, 3)) + theme_minimal()
```

## Running Bambu with NanoporeRNASeq data
### Load the bambu package
```{r load bambu}
library(bambu)
# retrieve the genome sequence (FASTA) from ExperimentHub
genomeSequenceData <- query(ExperimentHub(), c("NanoporeRNA", "GRCh38", "FASTA"))
genomeSequence <- genomeSequenceData[["EH7260"]]
```

### Run bambu
Apply *bambu* to the BAM files, using the annotation and the genome sequence:
```{r, results = "hide"}
se <- bambu(reads = bamFiles,
            annotations = HsChr22BambuAnnotation,
            genome = genomeSequence)
```

*bambu* returns a `SummarizedExperiment` object:
```{r}
se
```
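
The returned object can be explored with the usual *SummarizedExperiment*
accessors. The chunk below is a minimal sketch that is not evaluated here; it
assumes the `se` object created above and that the assays include `counts`
(the exact set of assays and metadata columns may differ between *bambu*
versions). `transcriptToGeneExpression()` is the *bambu* helper for
summarizing transcript-level estimates at the gene level.

```{r se-explore, eval = FALSE}
library(SummarizedExperiment)
# names of the available assays
assayNames(se)
# transcript-level expression estimates (assumes a "counts" assay)
head(assays(se)$counts)
# transcript annotation, including any novel transcripts
head(rowData(se))
# genomic coordinates (exons) of each transcript
rowRanges(se)
# summarize transcript expression at the gene level
seGene <- transcriptToGeneExpression(se)
seGene
```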


### Visualizing gene examples
We can visualize the annotated and novel isoforms identified for a gene of
interest (here ENSG00000099968) using the plot functions from *bambu*:
```{r, fig.width = 8, fig.height = 10}
plotBambu(se, type = "annotation", gene_id = "ENSG00000099968")
```

```{r}
sessionInfo()
```