---
title: "specL automatic report"
author:
- name: Christian Panse
  affiliation:
    - &fgcz Functional Genomics Center Zurich - Swiss Federal Institute of Technology in Zurich
    - &sib Swiss Institute of Bioinformatics
  email: cp@fgcz.ethz.ch
- name: Witold E. Wolski
  affiliation:
    - *fgcz
    - *sib
  email: wew@fgcz.ethz.ch
date: "`r doc_date()`"
package: specL
abstract: |
  This file contains all the commands performing a default SWATH ion
  library generation at the FGCZ. This document is usually triggered by
  the B-Fabric system [@panse2022bridging] and is meant for training and
  reproducibility.
vignette: |
  %\VignetteIndexEntry{Automatic specL Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: specL.bib
output: 
  BiocStyle::html_document:
    toc_float: true
---

# Requirements

In the first step, the peptide identification result is generated by a standard shotgun proteomics experiment and has to be processed using the _bibliospec_ software [@pmid18428681].


The ion library is generated with the `r Biocpkg('specL')` package. The workflow is described in [@pmid25712692].

The following R package has to be installed on the compute box.

```{r library}
library(specL)
```

This file can be rendered by using the following code snippet.

```{r render, eval=FALSE}
library(rmarkdown)
library(BiocStyle)
# copy the report template shipped with the package to a temporary file
report_file <- tempfile(fileext = '.Rmd')
file.copy(system.file("doc", "report.Rmd",
                      package = "specL"),
          report_file)
rmarkdown::render(report_file,
                  output_format = 'html_document',
                  output_file = '/tmp/report_specL.html')
```


# Input

## Parameter

If no `INPUT` is defined, the report uses the `r Biocpkg("specL")` package's data and the following default parameters. 

```{r defineInput}
if(!exists("INPUT")){
  INPUT <- list(FASTA_FILE 
      = system.file("extdata", "SP201602-specL.fasta.gz",
                    package = "specL"),
    BLIB_FILTERED_FILE 
      = system.file("extdata", "peptideStd.sqlite",
                    package = "specL"),
    BLIB_REDUNDANT_FILE 
      = system.file("extdata", "peptideStd_redundant.sqlite",
                    package = "specL"),
    MIN_IONS = 5,
    MAX_IONS = 6,
    MZ_ERROR = 0.05,
    MASCOTSCORECUTOFF = 17,
    FRAGMENTIONMZRANGE = c(300, 1250),
    FRAGMENTIONRANGE = c(5, 200),
    NORMRTPEPTIDES = specL::iRTpeptides,
    OUTPUT_LIBRARY_FILE = tempfile(fileext ='.csv'),
    RDATA_LIBRARY_FILE = tempfile(fileext ='.RData'),
    ANNOTATE = TRUE
    )
} 
```
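
To run the report on your own data, an `INPUT` list can be defined before rendering. The following is a minimal sketch, not a definitive recipe: all file paths are hypothetical placeholders, the numeric values simply repeat the defaults above, and `report_file` is assumed to have been created as in the render snippet of the Requirements section.

```{r customInput, eval=FALSE}
# All paths below are hypothetical placeholders; replace them with your files.
INPUT <- list(
  FASTA_FILE = "/path/to/proteins.fasta",
  BLIB_FILTERED_FILE = "/path/to/filtered.blib",
  BLIB_REDUNDANT_FILE = "/path/to/redundant.blib",
  MIN_IONS = 5,
  MAX_IONS = 6,
  MZ_ERROR = 0.05,
  MASCOTSCORECUTOFF = 17,
  FRAGMENTIONMZRANGE = c(300, 1250),
  FRAGMENTIONRANGE = c(5, 200),
  NORMRTPEPTIDES = specL::iRTpeptides,
  OUTPUT_LIBRARY_FILE = "/path/to/library.csv",
  RDATA_LIBRARY_FILE = "/path/to/library.RData",
  ANNOTATE = TRUE
)

# render() evaluates the document in the calling environment by default,
# so the INPUT defined above is found by the exists("INPUT") check.
# report_file is assumed to exist already (see the render snippet).
rmarkdown::render(report_file,
                  output_format = 'html_document',
                  output_file = '/path/to/report_specL.html')
```

Because the report only checks whether `INPUT` exists, a user-defined list has to contain every element that the later chunks access.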

The library generation workflow was performed using the following parameters:

```{r cat, echo=FALSE, eval=FALSE}
  cat(
  " MASCOTSCORECUTOFF = ", INPUT$MASCOTSCORECUTOFF, "\n",
  " BLIB_FILTERED_FILE = ", INPUT$BLIB_FILTERED_FILE, "\n",
  " BLIB_REDUNDANT_FILE = ", INPUT$BLIB_REDUNDANT_FILE, "\n",
  " MZ_ERROR = ", INPUT$MZ_ERROR, "\n",
  " FRAGMENTIONMZRANGE = ", INPUT$FRAGMENTIONMZRANGE, "\n",
  " FRAGMENTIONRANGE = ", INPUT$FRAGMENTIONRANGE, "\n",
  " FASTA_FILE = ", INPUT$FASTA_FILE, "\n",
  " MAX_IONS = ", INPUT$MAX_IONS, "\n",
  " MIN_IONS = ", INPUT$MIN_IONS, "\n"
  )

```

```{r kableParameter, echo=FALSE, results='asis'}
library(knitr)
# kable(t(as.data.frame(INPUT)))
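# keep only the character and numeric (double) parameters;
# vectors are collapsed into a single comma-separated string
ii <- lapply(INPUT, function(x) {
  if (typeof(x) %in% c("character", "double")) {
    paste(x, collapse = ', ')
  } else {
    NULL
  }
})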


parameter <- as.data.frame(unlist(ii))
names(parameter) <- 'parameter.values'
kable(parameter, caption = 'INPUT parameters used')
```

## Define the fragment ions of interest

The following R helper function is used for composing the in-silico 
fragment ions using `r CRANpkg("protViz")`.

```{r defineFragmenIons}
fragmentIonFunction_specL <- function (b, y) {
  Hydrogen <- 1.007825
  Oxygen <- 15.994915
  Nitrogen <- 14.003074
  b1_ <- (b )
  y1_ <- (y )
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2 
  return( cbind(b1_, y1_, b2_, y2_) )
}
```
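
To illustrate what such a helper returns, the sketch below applies an equivalent function to a few made-up m/z values. The name `fragment_ion_demo` and the input values are assumptions for illustration only; they are not taken from real spectra.

```{r fragmentIonExample, eval=FALSE}
# Same logic as the helper above, repeated so the example is self-contained.
fragment_ion_demo <- function(b, y) {
  Hydrogen <- 1.007825
  b1_ <- b
  y1_ <- y
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2
  cbind(b1_, y1_, b2_, y2_)
}

# Hypothetical singly charged b- and y-ion m/z values (not real data).
b <- c(114.0913, 227.1754, 324.2282)
y <- c(361.2082, 264.1554, 147.1128)

ions <- fragment_ion_demo(b, y)
dim(ions)       # one row per fragment position, one column per ion series
colnames(ions)  # column names are taken from the variable names in cbind()
```

The result is a numeric matrix with one column per ion series, so adding a further series would amount to adding one more column to the `cbind()` call.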


## Read the sqlite files

```{r readSqliteFILTERED, warning=FALSE}
BLIB_FILTERED <- read.bibliospec(INPUT$BLIB_FILTERED_FILE) 

summary(BLIB_FILTERED)
```


```{r readSqliteREDUNDANT, warning=FALSE}
BLIB_REDUNDANT <- read.bibliospec(INPUT$BLIB_REDUNDANT_FILE) 
summary(BLIB_REDUNDANT)
```


## Protein (re)-annotation

After the peptide spectrum matches (PSMs) have been processed with
bibliospec, the protein information is lost and has to be restored from
the FASTA file. The `read.fasta` function is provided by the CRAN package
`r CRANpkg("seqinr")`.

```{r read.fasta}
if(INPUT$ANNOTATE){
  FASTA <- read.fasta(INPUT$FASTA_FILE, 
                    seqtype = "AA", 
                    as.string = TRUE)

  BLIB_FILTERED <- annotate.protein_id(BLIB_FILTERED, 
                                       fasta = FASTA)
}
```

## Peptides used for RT normalization

The following peptides are used for retention time (RT) normalization.
The last column indicates with `TRUE` or `FALSE` whether a peptide is
included in the data. The rows are ordered by RT value.

```{r checkIRTs, echo=FALSE, results='asis'}
library(knitr)
incl <-  INPUT$NORMRTPEPTIDES$peptide %in% sapply(BLIB_REDUNDANT, function(x){x$peptideSequence})
INPUT$NORMRTPEPTIDES$included <- incl

if (sum(incl) > 0){
  res <- INPUT$NORMRTPEPTIDES[order(INPUT$NORMRTPEPTIDES$rt),]
  # row.names(res) <- 1:nrow(res)
  kable(res, caption='peptides used for RT normalization.')
}
```
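
The inclusion check behind this table can be sketched in base R as follows. The reference peptides, RT values, and observed sequences are all made up for illustration; in the report they come from `INPUT$NORMRTPEPTIDES` and the redundant library.

```{r irtInclusionExample, eval=FALSE}
# Hypothetical reference peptides with reference RT values.
ref_peptides <- data.frame(
  peptide = c("PEPTIDEK", "SAMPLER", "ELVISK"),
  rt = c(42.3, -24.9, 12.4),
  stringsAsFactors = FALSE
)

# Hypothetical peptide sequences identified in the data.
observed <- c("ELVISK", "LIVINGK", "PEPTIDEK")

# Flag each reference peptide that occurs among the observed sequences,
# then order the rows by the reference RT value.
ref_peptides$included <- ref_peptides$peptide %in% observed
res <- ref_peptides[order(ref_peptides$rt), ]
res
```

Reference peptides that are flagged `FALSE` were not observed and therefore cannot contribute to the RT normalization.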

# Generate the ion library

```{r specL::genSwathIonLib, message=FALSE}
specLibrary <- specL::genSwathIonLib(
  data = BLIB_FILTERED,
  data.fit = BLIB_REDUNDANT,
  max.mZ.Da.error = INPUT$MZ_ERROR,
  topN = INPUT$MAX_IONS,
  fragmentIonMzRange = INPUT$FRAGMENTIONMZRANGE,
  fragmentIonRange = INPUT$FRAGMENTIONRANGE,
  fragmentIonFUN = fragmentIonFunction_specL,
  mascotIonScoreCutOFF = INPUT$MASCOTSCORECUTOFF,
  iRT = INPUT$NORMRTPEPTIDES
  )
```

## Library Generation Summary

The total number of PSMs with a Mascot e-value < 0.05
in your search is __`r length(BLIB_REDUNDANT)`__.
The number of unique precursors is __`r length(BLIB_FILTERED)`__.
The size of the generated ion library is __`r length(specLibrary@ionlibrary)`__.
That means that __`r round(length(specLibrary@ionlibrary)/length(BLIB_FILTERED) * 100, 2)`__ % 
of the unique precursors fulfilled the filtering criteria.


```{r summarySpecLibrary}
summary(specLibrary)
```

In the following two code snippets the first element of the ion library is displayed:

```{r showSpecLibrary}
#  slotNames(specLibrary@ionlibrary[[1]])
specLibrary@ionlibrary[[1]]
```

```{r plotSpecLibraryIons, fig.retina=3}
plot(specLibrary@ionlibrary[[1]])
```

```{r plotSpecLibrary, fig.retina=3}
plot(specLibrary)
```

The code snippet above plots an overview of the whole ion library.
Please note that the iRT peptides used for the normalization of RT do not have to be included in the resulting `specLibrary`.

# Output

```{r write.spectronaut, eval=TRUE}
write.spectronaut(specLibrary, file =  INPUT$OUTPUT_LIBRARY_FILE)
```

```{r save, eval=TRUE}
save(specLibrary, file = INPUT$RDATA_LIBRARY_FILE)
```

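The first chunk exports the ion library to `INPUT$OUTPUT_LIBRARY_FILE` for use with Spectronaut; the second saves the `specLibrary` object to `INPUT$RDATA_LIBRARY_FILE`.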
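
An object stored with `save()` can be restored later with `load()`. The sketch below uses a small stand-in list instead of the real library object; the names `specLibrary_demo`, `rdata_file`, and `restored` are assumptions made for this example.

```{r loadExample, eval=FALSE}
# Stand-in object; in the report this would be the specLibrary object.
rdata_file <- tempfile(fileext = ".RData")
specLibrary_demo <- list(note = "stand-in for the ion library")
save(specLibrary_demo, file = rdata_file)

# Load into a separate environment so that objects of the same name
# in the current workspace are not overwritten.
restored <- new.env()
loaded_names <- load(rdata_file, envir = restored)
loaded_names                      # names of the restored objects
restored$specLibrary_demo$note
```

Loading into a dedicated environment is a design choice, not a requirement; calling `load()` without `envir` restores the objects into the calling environment.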

# Remarks

For questions and suggestions for improvement, please contact the authors
of the `r BiocStyle::Biocpkg('specL')` package.
This R Markdown report file was written by WEW and
is maintained by CP [@specLBioInf].

# Session info

Here is the output of `sessionInfo()` on the system on which this
document was compiled:

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References