---
title: "specL automatic report"
author:
- name: Christian Panse
  affiliation:
  - &fgcz Functional Genomics Center Zurich - Swiss Federal Institute of Technology in Zurich
  - &sib Swiss Institute of Bioinformatics
  email: cp@fgcz.ethz.ch
- name: Witold E. Wolski
  affiliation:
  - *fgcz
  - *sib
  email: wew@fgcz.ethz.ch
date: "`r doc_date()`"
package: specL
abstract: |
  This file contains all the commands performing a default SWATH ion library generation at the FGCZ. This document is usually triggered by the B-Fabric system [@panse2022bridging] and is meant for training and reproducibility.
vignette: |
  %\VignetteIndexEntry{Automatic specL Workflow}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: specL.bib
output:
  BiocStyle::html_document:
    toc_float: true
---

# Requirements

In the first step, the peptide identification result is generated by a standard shotgun proteomics experiment and has to be processed using the _bibliospec_ software [@pmid18428681].

For generating the ion library, the `r Biocpkg('specL')` package is used. The workflow is described in [@pmid25712692].

The following R package has to be installed on the compute box.

```{r library}
library(specL)
```

This file can be rendered using the following code snippet.

```{r render, eval=FALSE}
library(rmarkdown)
library(BiocStyle)

# copy the report shipped with the package to a temporary file and render it
report_file <- tempfile(fileext = '.Rmd')
file.copy(system.file("doc", "report.Rmd", package = "specL"), report_file)
rmarkdown::render(report_file,
  output_format = 'html_document',
  output_file = '/tmp/report_specL.html')
```

# Input

## Parameter

If no `INPUT` is defined, the report uses the `r Biocpkg("specL")` package's data and the following default parameters.

```{r defineInput}
if(!exists("INPUT")){
  INPUT <- list(FASTA_FILE = system.file("extdata", "SP201602-specL.fasta.gz",
      package = "specL"),
    BLIB_FILTERED_FILE = system.file("extdata", "peptideStd.sqlite",
      package = "specL"),
    BLIB_REDUNDANT_FILE = system.file("extdata", "peptideStd_redundant.sqlite",
      package = "specL"),
    MIN_IONS = 5,
    MAX_IONS = 6,
    MZ_ERROR = 0.05,
    MASCOTSCORECUTOFF = 17,
    FRAGMENTIONMZRANGE = c(300, 1250),
    FRAGMENTIONRANGE = c(5, 200),
    NORMRTPEPTIDES = specL::iRTpeptides,
    OUTPUT_LIBRARY_FILE = tempfile(fileext ='.csv'),
    RDATA_LIBRARY_FILE = tempfile(fileext ='.RData'),
    ANNOTATE = TRUE
  )
}
```

The library generation workflow was performed using the following parameters:

```{r cat, echo=FALSE, eval=FALSE}
cat(
  " MASCOTSCORECUTOFF = ", INPUT$MASCOTSCORECUTOFF, "\n",
  " BLIB_FILTERED_FILE = ", INPUT$BLIB_FILTERED_FILE, "\n",
  " BLIB_REDUNDANT_FILE = ", INPUT$BLIB_REDUNDANT_FILE, "\n",
  " MZ_ERROR = ", INPUT$MZ_ERROR, "\n",
  " FRAGMENTIONMZRANGE = ", INPUT$FRAGMENTIONMZRANGE, "\n",
  " FRAGMENTIONRANGE = ", INPUT$FRAGMENTIONRANGE, "\n",
  " FASTA_FILE = ", INPUT$FASTA_FILE, "\n",
  " MAX_IONS = ", INPUT$MAX_IONS, "\n",
  " MIN_IONS = ", INPUT$MIN_IONS, "\n"
)
```

```{r kableParameter, echo=FALSE, results='asis'}
library(knitr)
# kable(t(as.data.frame(INPUT)))

# keep only character and numeric parameters and collapse vectors to one string
ii <- lapply(INPUT, function(x){
  if(typeof(x) %in% c("character", "double")){
    paste(x, collapse = ', ')
  }else{
    NULL
  }
})

parameter <- as.data.frame(unlist(ii))
names(parameter) <- 'parameter.values'
kable(parameter, caption = 'INPUT parameters used')
```

## Define the fragment ions of interest

The following R helper function is used for composing the in-silico fragment ions using `r CRANpkg("protViz")`.

```{r defineFragmenIons}
fragmentIonFunction_specL <- function (b, y) {
  # monoisotopic masses in Da
  Hydrogen <- 1.007825
  Oxygen <- 15.994915
  Nitrogen <- 14.003074

  # singly charged b and y ions are used as provided
  b1_ <- (b)
  y1_ <- (y)

  # doubly charged b and y ions
  b2_ <- (b + Hydrogen) / 2
  y2_ <- (y + Hydrogen) / 2

  return( cbind(b1_, y1_, b2_, y2_) )
}
```

## Read the sqlite files

```{r readSqliteFILTERED, warning=FALSE}
BLIB_FILTERED <- read.bibliospec(INPUT$BLIB_FILTERED_FILE)

summary(BLIB_FILTERED)
```

```{r readSqliteREDUNDANT, warning=FALSE}
BLIB_REDUNDANT <- read.bibliospec(INPUT$BLIB_REDUNDANT_FILE)

summary(BLIB_REDUNDANT)
```

## Protein (re)-annotation

After the peptide-spectrum matches (PSMs) have been processed with bibliospec, the protein information is lost and has to be restored from the FASTA file. The `read.fasta` function is provided by the CRAN package `r CRANpkg("seqinr")`.

```{r read.fasta}
if(INPUT$ANNOTATE){
  FASTA <- read.fasta(INPUT$FASTA_FILE,
    seqtype = "AA",
    as.string = TRUE)

  BLIB_FILTERED <- annotate.protein_id(BLIB_FILTERED,
    fasta = FASTA)
}
```

## Peptides used for RT normalization

The following peptides are used for retention time (RT) normalization. The last column indicates with `FALSE|TRUE` whether a peptide is included in the data. The rows are ordered by their RT values.

```{r checkIRTs, echo=FALSE, results='asis'}
library(knitr)

# flag the iRT peptides which are found in the redundant library
incl <- INPUT$NORMRTPEPTIDES$peptide %in% sapply(BLIB_REDUNDANT, function(x){x$peptideSequence})
INPUT$NORMRTPEPTIDES$included <- incl

if (sum(incl) > 0){
  res <- INPUT$NORMRTPEPTIDES[order(INPUT$NORMRTPEPTIDES$rt),]
  # row.names(res) <- 1:nrow(res)
  kable(res, caption='peptides used for RT normalization.')
}
```

# Generate the ion library

```{r specL::genSwathIonLib, message=FALSE}
specLibrary <- specL::genSwathIonLib(
  data = BLIB_FILTERED,
  data.fit = BLIB_REDUNDANT,
  max.mZ.Da.error = INPUT$MZ_ERROR,
  topN = INPUT$MAX_IONS,
  fragmentIonMzRange = INPUT$FRAGMENTIONMZRANGE,
  fragmentIonRange = INPUT$FRAGMENTIONRANGE,
  fragmentIonFUN = fragmentIonFunction_specL,
  mascotIonScoreCutOFF = INPUT$MASCOTSCORECUTOFF,
  iRT = INPUT$NORMRTPEPTIDES
)
```

## Library Generation Summary

The total number of PSMs with Mascot e-value < 0.05 in your search is __`r length(BLIB_REDUNDANT)`__. The number of unique precursors is __`r length(BLIB_FILTERED)`__.
The size of the generated ion library is __`r length(specLibrary@ionlibrary)`__.
That means that __`r round(length(specLibrary@ionlibrary)/length(BLIB_FILTERED) * 100, 2)`__ % of the unique precursors fulfilled the filtering criteria.

```{r summarySpecLibrary}
summary(specLibrary)
```

In the following two code snippets, the first element of the ion library is displayed and plotted:

```{r showSpecLibrary}
# slotNames(specLibrary@ionlibrary[[1]])
specLibrary@ionlibrary[[1]]
```

```{r plotSpecLibraryIons, fig.retina=3}
plot(specLibrary@ionlibrary[[1]])
```

The following code snippet plots an overview of the whole ion library.

```{r plotSpecLibrary, fig.retina=3}
plot(specLibrary)
```

Please note that the iRT peptides used for the normalization of RT do not have to be included in the resulting `specLibrary`.

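
The RT normalization mentioned above can be pictured as a linear mapping from measured retention times onto the iRT scale. The following is a minimal sketch in base R under stated assumptions: the peptide names, all numbers, and the helper `normalize_rt` are hypothetical and made up for illustration, and the code is not the implementation used inside `genSwathIonLib`.

```r
# Hypothetical reference table: iRT values of three standard peptides
# (made-up sequences and numbers, for illustration only).
irt_ref <- data.frame(
  peptide = c("PEPTIDEA", "PEPTIDEB", "PEPTIDEC"),
  irt     = c(-20, 30, 80)
)

# Hypothetical measured retention times (minutes) of the same peptides.
observed <- data.frame(
  peptide = c("PEPTIDEA", "PEPTIDEB", "PEPTIDEC"),
  rt      = c(10, 35, 60)
)

# Join both tables on the peptide sequence and fit a straight line irt ~ rt.
calib <- merge(irt_ref, observed, by = "peptide")
fit <- lm(irt ~ rt, data = calib)

# Hypothetical helper: map any measured RT onto the iRT scale using the fit.
normalize_rt <- function(rt, model) {
  unname(predict(model, newdata = data.frame(rt = rt)))
}

# Normalized RT of one standard and of a peptide that is not a standard.
normalize_rt(c(10, 47.5), fit)
```

In such a scheme the standards are needed only to estimate the line. Every other peptide is placed on the same scale by applying the fit to its measured RT, which is consistent with the remark that the iRT peptides themselves need not appear in the final library.
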
# Output

The following call exports the ion library with `write.spectronaut`.

```{r write.spectronaut, eval=TRUE}
write.spectronaut(specLibrary, file = INPUT$OUTPUT_LIBRARY_FILE)
```

The following call saves the result object to a file.

```{r save, eval=TRUE}
save(specLibrary, file = INPUT$RDATA_LIBRARY_FILE)
```

# Remarks

For questions and suggestions for improvement, please contact the authors of the `r BiocStyle::Biocpkg('specL')` package.

This R Markdown report was written by WEW and is maintained by CP [@specLBioInf].

# Session info

Here is the output of `sessionInfo()` on the system on which this document was compiled:

```{r sessionInfo, echo=FALSE}
sessionInfo()
```

# References