% \VignetteEngine{knitr::knitr} % \VignetteIndexEntry{TPP_introduction_2D} % \VignettePackage{TPP} \documentclass[10pt,a4paper,twoside]{article} \usepackage[utf8]{inputenc} \usepackage{graphicx} \usepackage{forloop} \usepackage{mathtools} %\pagestyle{empty} <>= BiocStyle::latex() @ \bioctitle[TPP]{Introduction to the \Rpackage{TPP} package for analyzing Thermal Proteome Profiling data: 2D-TPP experiments} \author{Dorothee Childs, Nils Kurzawa\\ European Molecular Biology Laboratory (EMBL), \\ Heidelberg, Germany\\ \texttt{dorothee.childs@embl.de}} \date{\Rpackage{TPP} version \Sexpr{packageDescription("TPP")$Version} (Last revision 2016-10-25)} \begin{document} <>= knitr::opts_chunk$set(concordance = TRUE, eval = TRUE, cache = FALSE, resize.width="0.45\\textwidth", fig.align='center', tidy = FALSE, message=FALSE) @ \maketitle \begin{abstract} Thermal Proteome Profiling (TPP) combines the cellular thermal shift assay concept \cite{molina2013} with mass spectrometry based proteome-wide protein quantitation \cite{savitski2014tracking}. Thereby, drug-target interactions can be inferred from changes in the thermal stability of a protein upon drug binding, or upon downstream cellular regulatory events, in an unbiased manner. The package \Rpackage{TPP} facilitates this process by providing exectuable workflows that conduct all necessary data analysis steps. Recent advances in the field have lead to the development of so called 2D Thermal Proteome Profiling (2D-TPP) experiments \cite{becher2016}. Recent advances in the field have lead to the development of so called 2D Thermal Proteome Profiling (2D-TPP) experiments \cite{becher2016}. Similar as for the TPP-TR and the TPP-CCR analysis, the function analyze2DTPP executes the whole workflow from data import through normalization and curve fitting to statistical analysis. Nevertheless, all of these steps can also be invoked separately by the user. The corresponding functions can be recognized by their suffix tpp2d. Here, we first show how to start the whole analysis using analyze2DTPP. Afterwards, we demonstrate how to carry out single steps individually. For details about the analysis of 1D TR- or CCR experiments \cite{savitski2014tracking, franken2015}, please refer to the vignette \texttt{TPP\_introduction\_1D}. \end{abstract} \tableofcontents \section{Installation} To install the package, type the following commands into the \R{} console <>= source("http://bioconductor.org/biocLite.R") biocLite("TPP") @ The installed package can be loaded by <>= library("TPP") @ \subsection{Special note for Windows users} The \Rpackage{TPP} package uses the \Rpackage{openxlsx} package to produce Excel output \cite{openxlsx}. \Rpackage{openxlsx} requires a zip application to be installed on your system and to be included in the path. On Windows, such a zip application ist not installed by default, but is available, for example, via \href{http://cran.r-project.org/bin/windows/Rtools/}{Rtools}. Without the zip application, you can still use the 'TPP' package and access its results via the dataframes produced by the main functions. \clearpage \section{Analyzing 2D-TPP experiments} \subsection{Overview} Before you can start your analysis, you need to specify information about your experiments: The mandatory information comprises a unique experiment name, as well as the isobaric labels and corresponding temperature values for each experiment. The package retrieves this information from a configuration table that you need to specify before starting the analysis. This table can either be a data frame that you define in your R session, or a spreadsheet in .xlsx or .csv format. In a similar manner, the measurements themselves can either be provided as a list of data frames, or imported directlyfrom files during runtime. We demonstrate the functionality of the package using the dataset Panobinostat\_2DTPP\_smallExampleData. It contains an illustrative subset of a larger dataset which was obtained by 2D-TPP experiments on HepG$2$ cells treated with the histone deacetylase (HDAC) inhibitor panobinostat in the treatment groups and with vehicle in the control groups. The experiments were performed for different temperatures. The raw MS data were processed with the Python package isobarQuant, which provides protein fold changes relative to the protein abundance at the lowest temperature as input for the TPP package \cite{becher2016}. \subsection{Performing the analysis} Fist of all, we load an example data set: <>= data("panob2D_isobQuant_example") @ Using this command we load two objects: \begin{enumerate} \item \texttt{Panobinostat\_2DTPP\_smallExampleData}: a list of data frames that contain the measurements to be analyzed, \item \texttt{hdac2D\_config}: a configuration table with details about each experiment. \end{enumerate} <>= config_tpp2d <- panobinostat_2DTPP_config data_tpp2d <- panobinostat_2DTPP_data config_tpp2d data_tpp2d %>% str(1) @ The data object \texttt{Panobinostat\_2DTPP\_smallExampleData} is organized as a list of data frames which contain the experimental raw data of an 2D-TPP experiment. The names of the list elements correspond to the different multiplexed experiments. Each experimental dataset constains the following columns: <>= data_tpp2d$Experiment1 %>% colnames @ In ordern to perform the complete workflow we can now simply use: <>= tpp2dResults <- analyze2DTPP(configFile = config_tpp2d, data = data_tpp2d, fcStr = NULL, methods = "doseResponse", nCores = 2) tpp2dResults %>% mutate_if(is.character, factor) %>% summary @ Moreover, we can also invoke the single functions of the workflow manually. Therefore, we start with importing the data. Using the import function the data is subsequently imported and stored in a single dataframe containing all the required data columns and those that the user likes to take along through the analysis to be displayed together with the results of this workflow. <>= data2d <- tpp2dImport(configTable = config_tpp2d, data = data_tpp2d, fcStr = NULL) head(data2d) @ If we haven't computed fold changes from the raw "sumionarea" data, as it is the case in this example, we can invoke the function \textit{tpp2dComputeFoldChanges} in order to do so: <>= fcData2d <- tpp2dComputeFoldChanges(configTable = config_tpp2d, data = data2d) @ Thereon the function adds additional columns to our dataframe containing corresponding fold changes: <>= head(fcData2d) @ We can then normalize the data by performing a median normalization on the fold changes, in order to account for experiment specific noise. <>= normData2d <- tpp2dNormalize(configTable = config_tpp2d, data = fcData2d) head(normData2d) # we have to update our fcStr, if we want the normalized columns to be used in the following analysis fcStrUpdated <- "norm_rel_fc_" @ A configuration file for the TPP-CCR function can be then generated using the function \texttt{tpp2dCreateCCRConfigFile} <>= config_ccr <- tpp2dCreateCCRConfigFile(configTable = config_tpp2d) @ To run the TPP-CCR main function on our 2D-TPP data we now invoke: <>= ccr2dResults <- tpp2dCurveFit(configFile = config_ccr, data = normData2d, fcStr = fcStrUpdated) @ Now we can plot the curves for any of the proteins for which at least one CCR curve could be fitted. In this case we choose HDAC2: <>= goodCurves <- tpp2dPlotCCRGoodCurves(configTable = config_tpp2d, data = ccr2dResults, fcStr = fcStrUpdated) @ <>= goodCurves[["HDAC2"]] @ %\begin{center} %\includegraphics[width=0.75\textwidth]{figure/plotCurve-1} %\end{center} And we can also plot the single curves for each of the proteins with: <>= singleCurve <- tpp2dPlotCCRSingleCurves(configTable = config_tpp2d, data = ccr2dResults, fcStr = fcStrUpdated) singleCurve[["HDAC2"]][["54"]] @ %\begin{center} %\includegraphics[width=0.65\textwidth]{figure/plotSingleCurves-1} %\end{center} \bibliography{TPP_references} \clearpage \end{document}