%\VignetteIndexEntry{RI correction extra} %\VignetteKeywords{preprocess, GC-MS, analysis} %\VignettePackage{TargetSearch} %\VignetteEngine{knitr::knitr} \documentclass[a4paper]{article} <>= BiocStyle::latex() @ \usepackage{booktabs} \definecolor{NoteBlue}{RGB}{45, 28, 156} \newcommand{\Note}[2][Note]{\textsl{\textcolor{NoteBlue}{#1: #2}}} \title{RI Correction Extra} \author{\'Alvaro Cuadros-Inostroza} \affil{Max Planck Institute for Molecular Plant Physiology, Potsdam, Germnay} \begin{document} \maketitle \begin{abstract} This vignette explains how to transform retention time (RT) to retention index (RI) when the RI markers are not spiked in the samples. The same principle can be applied to other uncommon scenarios. \end{abstract} \packageVersion{\Sexpr{BiocStyle::pkg_ver("TargetSearch")}} Report issues on \url{https://github.com/acinostroza/TargetSearch/issues} \newpage \tableofcontents \section{Motivation} The \Biocpkg{TargetSearch} package assumes that the retention index markers (RIM), such as n-alkanes or FAMEs (fatty acid methyl esters), are injected together with the biological samples. A different approach often used is to inject the RIMs separately, alternating between RIMs and biological samples. For example, a GC-MS run may look like the list shown in Table~\ref{tab:01}. \begin{table}[ht] \begin{tabular}{cc} \toprule Measurement Order & Sample Type \\ \midrule 1 & Alkanes \\ 2 & Biological \\ 3 & Biological \\ 4 & Biological \\ 5 & Biological \\ 6 & Alkanes \\ 7 & Biological \\ 8 & Biological \\ 9 & Biological \\ 10 & Biological \\ 11 & Alkanes \\ 12 & Biological \\ 13 & Biological \\ 14 & Biological \\ 15 & Biological \\ \bottomrule \end{tabular} \caption{An example of a GC-MS run. \emph{Alkanes} samples are retention index markers (RIMs).} \label{tab:01} \end{table} In the example (Table~\ref{tab:01}), samples 1, 6, 11 are RIMs and the rest are biological samples. The assumption is that the retention time shifts between consecutive runs are not significant, so sample \#1 is used to correct samples \#2-5, sample \#6 corrects samples \#7-10, and so on. This document shows how to perform RI correction in such case with \Biocpkg{TargetSearch}, using the available chromatograms of the \Biocpkg{TargetSearchData} package. \section{Retention Index correction} First, we load the required packages <>= library(TargetSearch) library(TargetSearchData) @ Specify the directory where the CDF files are. In this example we will use the CDF files of \Biocpkg{TargetSearchData} package. <>= cdfPath <- file.path(find.package("TargetSearchData"), "gc-ms-data") dir(cdfPath, pattern="cdf$") @ Create a \Robject{tsSample} object using all chromatograms from the previous directory. Also set the RI path to the current directory. Use the R commands \Rfunction{setwd()} and \Rfunction{getwd()} to set/get the working directory. <>= samples.all <- ImportSamplesFromDir(cdfPath) RIpath(samples.all) <- "." @ Import retention index marker times limits. We will use the ones defined in \Biocpkg{TargetSearchData}. <>= rimLimits <- ImportFameSettings(file.path(cdfPath,"rimLimits.txt")) rimLimits @ Run the retention index correction methods. Here we use an intensity threshold of 50, the peak detection method is "ppc", and the time window width is $2*15+1=31$ scan points (equal to 1.5 seconds in this example). \Note{We skip conversion to CDF-4 files for illustration purposes. In a normal workflow you would set \Rcode{writeCDF4path=TRUE} (the default)}. <>= RImatrix <- RIcorrect(samples.all, rimLimits, writeCDF4path=FALSE, Window=15, pp.method="ppc", IntThreshold=50) RImatrix @ Now you should make sure that the retention times (RT) of the RIMs (not the biological samples!!) are correct by using a chromatogram visualization tool such as LECO Pegasus, AMDIS, etc. Because there are no RIMs inyected in the biological samples, the RTs of the RIMs found in \Robject{RImatrix} will make no sense, so you don't need to check them. Up to now we have run the usual \Biocpkg{TargetSearch} workflow which you can find in the main vignette. To correct the RI of the biological samples, the following procedure can be used. First, we need a logical vector indicating which samples are RIMs and which biological. For example, you could create a vector like this. <>= isRIMarker <- c(T, F, F, F, F, T, F, F, F, F, T, F, F, F, F) @ where \Robject{isRIMarker} is \Rcode{TRUE} if the respective sample is a RIM and \Rcode{FALSE} otherwise. Note that here we have set components 1, 6, 11 to \Rcode{TRUE} just like the example in Table~\ref{tab:01}. Then we have to copy the RIM columns of \Robject{RImatrix} to their respective biological sample columns. In other words, copy column 1 to columns 2, 3, 4, 5; column 6 to columns 7, 8, 9, 10; and so on. There are many ways to achieve that, here I show two examples. <>= RImatrix2 <- RImatrix RImatrix2[, 2:5] <- RImatrix[,1] RImatrix2[, 7:10] <- RImatrix[,6] RImatrix2[, 12:15] <- RImatrix[,11] @ This code snippet is a straight forward method, but has the disadvantage that the column indexes must be filled manually. A more general approach could be. <>= RImatrix2 <- RImatrix z <- cumsum(as.numeric(isRIMarker)) for(i in unique(z)) RImatrix2[, z==i] <- RImatrix[, z==i][,1] RImatrix2 @ After the \Robject{RImatrix2} object is corrected, the RI files of the biological samples must be fixed as well. <>= fixRI(samples.all, rimLimits, RImatrix2, which(!isRIMarker)) @ Finally, we remove the standards, since we don't need them anymore. <>= samples <- samples.all[!isRIMarker] RImatrix <- RImatrix2[, !isRIMarker] @ After that, we can continue we the normal \Biocpkg{TargetSearch} workflow. If you prefer to use the GUI, run \Rfunction{TargetSearchGUI()} from this point and import the RI files by selecting the option \emph{Apex Data} (do not import the standard files). A script with all the commands used in this document is located in the \file{doc} directory of the \Biocpkg{TargetSearch} package. \section{Session info} Output of \Rfunction{sessionInfo()} on the system on which this document was compiled. <>= toLatex(sessionInfo()) @ \end{document}