---
title: "TSAR Workflow by Command"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{TSAR Workflow by Command}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

Pipeline analysis from raw data reading to graphic visualization

## 1. Introduction

The TSAR package provides a simple solution for qPCR data processing, computing thermal shift analysis from either raw fluorescence data or smoothed curves. Its functions give users a protocol for preliminary data checks as well as expansive analysis of large data sets. It also offers simple graphic presentation of the analysis, generating clear box plots and line graphs for the desired designs. Overall, the TSAR package offers a workflow that is easy to manage and visualize.

## 2. Installation

Load the package and other relevant packages. Using dplyr and ggplot2 along with TSAR is recommended for enhanced analysis.

```{r, include = FALSE}
knitr::opts_chunk$set(warning = FALSE, message = FALSE, comment = "#>")
```

Use the commands below to install the TSAR package:

`library(BiocManager)`

`BiocManager::install("CGAO123/TSAR", build_vignettes = TRUE)`

```{r setup, message=FALSE, warning=FALSE}
library(TSAR)
library(dplyr)
library(ggplot2)
```

## 3. Load Data

Read data in .txt or .csv format. Use `read.delim()` for tab-delimited files and `read.csv()` for comma-separated files. Other input formats are welcome as long as the data are stored in a data frame, with numeric columns for values used in calculations and character columns for everything else. Make sure extraneous lines are removed (e.g. with `skip = ` or `nrows = `). Ways to check include `View()`, opening the data file in Excel beforehand, or manually removing extraneous lines before reading the file.

The package uses the default variable names "Well.Position", "Temperature", "Fluorescence", and "Normalized". Consider renaming the columns of your data frame to match before proceeding to the following step.

```{r}
data("qPCR_data1")
raw_data <- qPCR_data1
```

## 4. Data Pre-Processing

Select the data of an individual well for pre-analysis screening, e.g. select well A1:

```{r}
test <- raw_data %>% filter(Well.Position == "A01")
```

Run an example analysis on one well to screen for potential errors and possible improvements to the model.

```{r}
# normalize fluorescence reading into scale between 0 and 1
test <- normalize(test, fluo = 5, selected = c(
    "Well.Position", "Temperature",
    "Fluorescence", "Normalized"
))
head(test)
gammodel <- model_gam(test, x = test$Temperature, y = test$Normalized)
test <- model_fit(test, model = gammodel)
```

Output the analysis result using `view_model()` to view the normalized data and the fitted model. Determine whether any noise needs to be removed (i.e. by subsetting to a temperature range). Determine which model is best (i.e. is the current data already smoothed, and does the fitted model suit it well?). Determine whether the Tm estimate is appropriate. \*The current model assumes derivative estimation of the Tm value.

```{r}
Tm_est(test)
view <- view_model(test)
view[[1]] + theme(aspect.ratio = 0.7, legend.position = "bottom")
view[[2]] + theme(aspect.ratio = 0.7, legend.position = "bottom")
```

Screen all wells for curve shape in `raw_data` and sift out corrupted data. This step is not required but may help remove data modeling errors.

```{r}
myApp <- weed_raw(raw_data, checkrange = c("A", "C", "1", "12"))
```

```{r, eval = FALSE}
shiny::runApp(myApp)
```

```{r}
raw_data <- remove_raw(raw_data, removerange = c("B", "H", "1", "12"))
screen(raw_data) + theme(
    aspect.ratio = 0.7,
    legend.position = "bottom",
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.4, "cm"),
    legend.title = element_text(size = 8)
) + guides(color = guide_legend(nrow = 2, byrow = TRUE))
```

## 5. 96-well Analysis Application

The TSAR package excels at mass analysis by propagating identical protocols to all 96 wells. `smoothed = TRUE` indicates that the current data are already smoothed and no separate modeling is needed. If modeling is needed, set `smoothed = FALSE`.
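
When a curve has not been smoothed, a model has to be fitted before the melting temperature can be read off its derivative. The chunk below is a rough, self-contained illustration of that idea on simulated data. It is not TSAR's internal code: the simulated curve, the `mgcv::gam()` smoother settings, and the finite-difference derivative are all assumptions made for the sketch.

```{r, eval = FALSE}
# Illustrative sketch only: simulated data, not TSAR's implementation.
library(mgcv)

set.seed(1)
# Simulate a noisy sigmoidal melt curve with a known midpoint (assumed 55 C).
true_tm <- 55
temperature <- seq(25, 95, by = 0.5)
normalized <- 1 / (1 + exp((true_tm - temperature) / 2)) +
    rnorm(length(temperature), sd = 0.01)
sim_curve <- data.frame(Temperature = temperature, Normalized = normalized)

# Fit a smooth of normalized fluorescence against temperature.
fit <- gam(Normalized ~ s(Temperature, bs = "cs", k = 20), data = sim_curve)

# Approximate the first derivative of the fitted curve on a fine grid.
temp_grid <- data.frame(Temperature = seq(25, 95, by = 0.1))
fitted_curve <- as.numeric(predict(fit, newdata = temp_grid))
first_deriv <- diff(fitted_curve) / diff(temp_grid$Temperature)
midpoints <- head(temp_grid$Temperature, -1) + 0.05

# Take Tm as the temperature where the fitted curve rises fastest.
tm_estimate <- midpoints[which.max(first_deriv)]
tm_estimate
```

The finite-difference step is a simplification chosen for brevity; on real data the grid spacing and smoother settings would need to be checked against the curve shape.
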
TSAR performs derivative analysis using a generalized additive model through the `mgcv` package, or Boltzmann analysis using `nlsLM()` from the `minpack.lm` package.

```{r, echo = FALSE}
x <- gam_analysis(raw_data,
    smoothed = TRUE, fluo_col = 5,
    selections = c(
        "Well.Position", "Temperature", "Fluorescence", "Normalized"
    )
)
x <- na.omit(x)
```

## 6. Intermediate Data Output

Read the analysis using the `read_tsar()` function and view the head and tail to confirm that the output is as expected. The output can also be saved locally in .csv or .txt format using `write_tsar()`. However, the pipeline to downstream analysis does not require the output to be saved locally.

```{r, echo = FALSE}
# look at only Tm result by well
output <- read_tsar(x, output_content = 0)
head(output)
tail(output)
```

Write the output data file:

`write_tsar(read_tsar(x, output_content = 2), name = "0923_tm_val", file = "csv")`

## 7. Complete Dataset with Ligand and Protein Information

For downstream analysis, the data need to be mapped to a specific ligand and compound. Users may supply this information through the default Excel template included in the package, or as a .txt or .csv table, specifying Ligand and Compound by well ID. Data with compound and ligand labels can also be stored locally in the same way as in the previous step. All data are kept, including wells with blank condition information (specified as NA). If these need to be removed, call `na.omit()`.

```{r, echo = FALSE}
# join protein and ligand information
data("well_information")
norm_data <- join_well_info(
    file_path = NULL, file = well_information,
    read_tsar(x, output_content = 2), type = "by_template"
)
norm_data <- na.omit(norm_data)
head(norm_data)
tail(norm_data)
```

Write the output into the working directory with `write_tsar()`:

`write_tsar(norm_data, name = "vitamin_tm_val_norm", file = "csv")`

## 8. Merge Data across Biological Replicates

Repeat steps 2 through 6 on the replicate data set. A five-step function call will complete all of the analysis.
If additional screening is desired, a two-step call will run the interactive selection window.

```{r}
data("qPCR_data2")
raw_data_rep <- qPCR_data2
raw_data_rep <- remove_raw(raw_data_rep, removerange = c("B", "H", "1", "12"))
```

```{r, eval = FALSE}
myApp <- weed_raw(raw_data_rep)
shiny::runApp(myApp)
```

```{r}
raw_data_rep <- remove_raw(raw_data_rep, removelist = "A12")
screen(raw_data_rep) + theme(
    aspect.ratio = 0.7,
    legend.position = "bottom",
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.4, "cm"),
    legend.title = element_text(size = 8)
) + guides(color = guide_legend(nrow = 2, byrow = TRUE))
analysis_rep <- gam_analysis(raw_data_rep, smoothed = TRUE)
output_rep <- read_tsar(analysis_rep, output_content = 2)
norm_data_rep <- join_well_info(
    file_path = NULL, file = well_information,
    output_rep, type = "by_template"
)
norm_data_rep <- na.omit(norm_data_rep)
```

Merge data by content. All data are labeled with their source file name and experiment date.

```{r}
norm_data <- na.omit(norm_data)
norm_data_rep <- na.omit(norm_data_rep)
tsar_data <- merge_norm(
    data = list(norm_data, norm_data_rep),
    name = c(
        "Vitamin_RawData_Thermal Shift_02_162.eds.csv",
        "Vitamin_RawData_Thermal Shift_02_168.eds.csv"
    ),
    date = c("20230203", "20230209")
)
```

## 9. Tm Estimation Shift Visualization

Use `condition_IDs()` and `well_IDs()` to select or remove conditions to visualize. Visualize Tm estimates by compound or ligand type as a box plot.

```{r}
condition_IDs(tsar_data)
well_IDs(tsar_data)
conclusion <- tsar_data %>%
    filter(condition_ID != "NA_NA") %>%
    filter(condition_ID != "CA FL_Riboflavin")
TSA_boxplot(conclusion,
    color_by = "Protein", label_by = "Ligand",
    separate_legend = FALSE
)
```

## 10. TSA Curve Visualization

Specify the control condition by assigning its `condition_ID` to the control variable. `TSA_compare_plot()` generates multiple line graphs for comparison.
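
The control has to be named exactly as it appears in `condition_ID`. Judging only from the IDs shown in this vignette (`"CA FL_DMSO"`, `"NA_NA"`), condition IDs look like protein and ligand names joined by an underscore; that pattern is an inference, not documented behaviour. Under that assumption, a toy data frame (hypothetical, standing in for merged TSAR output) shows how one might confirm a control ID before plotting:

```{r, eval = FALSE}
# Toy stand-in for merged output. The column names and the
# "Protein_Ligand" ID pattern are assumptions for illustration.
toy <- data.frame(
    Protein = c("CA FL", "CA FL", NA),
    Ligand = c("DMSO", "Riboflavin", NA)
)
toy$condition_ID <- paste(toy$Protein, toy$Ligand, sep = "_")

candidate_control <- "CA FL_DMSO"
# Stop early if the control is misspelled or was filtered out.
stopifnot(candidate_control %in% unique(toy$condition_ID))
```

With real data, the same membership check could presumably be run against the IDs listed by `condition_IDs()`.
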
```{r}
control_ID <- "CA FL_DMSO"

TSA_compare_plot(conclusion,
    y = "RFU",
    control_condition = control_ID
)
```

Select by condition or well IDs to view curves and estimated Tm values.

```{r}
error <- conclusion %>% filter(condition_ID == "CA FL_PyxINE HCl")
TSA_wells_plot(error, separate_legend = FALSE)
```

To further visualize the comparison, graph first derivatives grouped as needed (i.e. by well_ID, condition_ID, or other separately appended conditions). Below is an example command. Due to the size limit of the vignette, the graph is not displayed.

`view_deriv(conclusion, frame_by = "condition_ID")`

## 11. Session Info

### 11.1 Citation

```{r}
citation("TSAR")
citation()
citation("dplyr")
citation("ggplot2")
```

### 11.2 Session Info

```{r}
sessionInfo()
```