---
title: "TSAR Workflow by Command"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{TSAR Workflow by Command}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

Pipeline analysis from raw data reading to graphic visualization

## 1. Introduction
TSAR Package provides simple solution to qPCR data processing, computing 
thermal shift analysis given either raw fluorescent data or smoothed curves. 
The functions provide users with the protocol to conduct preliminary data checks
and also expansive analysis on large scale of data. Furthermore, it showcases 
simple graphic presentation of analysis, generating clear box plot and line 
graphs given input of desired designs. Overall, TSAR Package offers a workflow
easy to manage and visualize.

## 2. Installation
Load Package and other relevant packages. Usage of dplyr and ggplot2 
along with TSAR package is recommended for enhanced analysis.

```{r, include = FALSE}
knitr::opts_chunk$set(warning = FALSE, message = FALSE, comment = "#>")
```

Use commands below to install TSAR package:
library(BiocManager)
BiocManager::install("CGAO123/TSAR", build_vignettes = TRUE)

```{r setup, message=FALSE, warning=FALSE}
library(TSAR)
library(dplyr)
library(ggplot2)
```

## 3. Load Data

Read data in .txt or .csv format. Use read.delim function to input tab
delimited file; use read.csv to input comma separated files. Other
formats of input are welcomed as long as data is stored in data frame
structure as numeric type (calculation-required data) and characters 
(non-calculation-required data). Ensure excessive lines are removed (e.g.
skip = , nrows = ). Means to check these are View(), pre-opening data
file in excel, or manually removing all excessive data before input
reading. Package defaults variable names as "Well.Position",
"Temperature", "Fluorescence", "Normalized". Consider renaming data
frame before proceding to following step.

```{r}
data("qPCR_data1")
raw_data <- qPCR_data1
```

## 4. Data Pre-Processing

Select data of individual cell for pre-analysis screening. e.g. select
well A1

```{r}
test <- raw_data %>% filter(Well.Position == "A01")
```

Run example analysis on one well to screen potential errors and
enhancement of model.

```{r}
# normalize fluorescence reading into scale between 0 and 1
test <- normalize(test, fluo = 5, selected = c(
    "Well.Position", "Temperature",
    "Fluorescence", "Normalized"
))
head(test)
gammodel <- model_gam(test, x = test$Temperature, y = test$Normalized)
test <- model_fit(test, model = gammodel)
```

Output analysis result using `view_model()` to view normalized data and fitted
model. Determine if any noise need to be removed (i.e. subsetting by
temperature range). Determine which model is the best (i.e. is currrent
data already smoothed, does fitted model suit well.) Determine if
Tm-estimation is proper. \*current model assumes derivative estimation
of Tm value.

```{r}
Tm_est(test)
view <- view_model(test)
view[[1]] + theme(aspect.ratio = 0.7, legend.position = "bottom")
view[[2]] + theme(aspect.ratio = 0.7, legend.position = "bottom")
```

Screen all wells for curve shape on raw_data set and sift out corrupted
data. This step is not required but may help remove data modeling
errors.

```{r}
myApp <- weed_raw(raw_data, checkrange = c("A", "C", "1", "12"))
```

```{r, eval = FALSE}
shiny::runApp(myApp)
```

```{r}
raw_data <- remove_raw(raw_data, removerange = c("B", "H", "1", "12"))
screen(raw_data) + theme(
    aspect.ratio = 0.7,
    legend.position = "bottom",
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.4, "cm"),
    legend.title = element_text(size = 8)
) +
    guides(color = guide_legend(nrow = 2, byrow = TRUE))
```

## 5. 96-well Analysis Application

TSAR package excels in mass analysis by propagating identical protocols
to all 96 wells. `smoothed = T` infers current data is smoothed and no
separate modeling is needed. If modeling is needed, input argument as
`smoothed = F`.

TSAR package performs derivative analysis using a generalized additive model 
through package `mgcv` or boltzmann analysis using nlsLM from package 
`minpack.lm`.

```{r, echo = FALSE}
x <- gam_analysis(raw_data,
    smoothed = TRUE, fluo_col = 5,
    selections = c(
        "Well.Position", "Temperature",
        "Fluorescence", "Normalized"
    )
)
x <- na.omit(x)
```

## 6. Intermediate Data Output

Read analysis using read_tsar() function and view head and tail to
ensure appropriate output was achieved. Data output can also be saved
locally into .csv or .txt format using function wrtie_tsar. However,
pipeline to downstream analysis does not require output to be locally
saved.

```{r, echo = FALSE}
# look at only Tm result by well
output <- read_tsar(x, output_content = 0)
head(output)
tail(output)
```

write output data file
`write_tsar(read_tsar(x, output_content = 2),`
    `name = "0923_tm_val", file = "csv")`

## 7. Complete Dataset with Ligand and Protein Information

For downstream analysis, data need to be mapped towards specific ligand
and compound. Use may input by default excel template included in the
package or input as .txt or .csv table, specifying Ligand and Compound
by Well ID. Data with coumpound and ligand labels can also be stored
locally using the same mean as previous step. All data are kept including
wells with blank condition information (specified as NA). 
In case removal is needed, call function na.omit().

```{r, echo = FALSE}
# join protein and ligand information
data("well_information")
norm_data <- join_well_info(
    file_path = NULL,
    file = well_information,
    read_tsar(x, output_content = 2), type = "by_template"
)
norm_data <- na.omit(norm_data)
head(norm_data)
tail(norm_data)
```

Write output into the working directory with write_tsar
`write_tsar(norm_data, name = "vitamin_tm_val_norm", file = "csv")`


## 8. Merge Data across Biological Replicates

Repeat step 2 through 6 on replicate data set. A five step function call
will complete all analysis. If additional screening is desired, a two
step call will run the interactive window to allow selection of

```{r}
data("qPCR_data2")
raw_data_rep <- qPCR_data2

raw_data_rep <- remove_raw(raw_data_rep, removerange = c("B", "H", "1", "12"))
```

```{r, eval = FALSE}
myApp <- weed_raw(raw_data_rep)
shiny::runApp(myApp)
```

```{r}
raw_data_rep <- remove_raw(raw_data_rep, removelist = "A12")
screen(raw_data_rep) + theme(
    aspect.ratio = 0.7,
    legend.position = "bottom",
    legend.text = element_text(size = 6),
    legend.key.size = unit(0.4, "cm"),
    legend.title = element_text(size = 8)
) +
    guides(color = guide_legend(nrow = 2, byrow = TRUE))

analysis_rep <- gam_analysis(raw_data_rep, smoothed = TRUE)
output_rep <- read_tsar(analysis_rep, output_content = 2)
norm_data_rep <- join_well_info(
    file_path = NULL,
    file = well_information,
    output_rep, type = "by_template"
)
norm_data_rep <- na.omit(norm_data_rep)
```

Merge data by content. All data are marked its source file name and
experiment date.

```{r}
norm_data <- na.omit(norm_data)
norm_data_rep <- na.omit(norm_data_rep)
tsar_data <- merge_norm(
    data = list(norm_data, norm_data_rep),
    name = c(
        "Vitamin_RawData_Thermal Shift_02_162.eds.csv",
        "Vitamin_RawData_Thermal Shift_02_168.eds.csv"
    ),
    date = c("20230203", "20230209")
)
```

## 9. Tm Estimation Shift Visualization

Use condition_IDs() and well_IDs() to select or remove condition to
visualize. Visualize Tm estimation by compound or ligand type in the
format of box graph.

```{r}
condition_IDs(tsar_data)
well_IDs(tsar_data)
conclusion <- tsar_data %>%
    filter(condition_ID != "NA_NA") %>%
    filter(condition_ID != "CA FL_Riboflavin")
TSA_boxplot(conclusion,
    color_by = "Protein", label_by = "Ligand",
    separate_legend = FALSE
)
```

## 10. TSA Curve Visualization

Specify Control condition by assigning condition_ID to control.
TSA_compare_plot generated multiple line graphs for comparison.

```{r}
control_ID <- "CA FL_DMSO"

TSA_compare_plot(conclusion,
    y = "RFU",
    control_condition = control_ID
)
```

Select by condition or well IDs to view curves and estimated Tm values.

```{r}
error <- conclusion %>% filter(condition_ID == "CA FL_PyxINE HCl")
TSA_wells_plot(error, separate_legend = FALSE)
```

To further visualize the comparison, graph first derivatives grouped by needs 
(i.e. well_ID, condition_ID, or other separately appended conditions).
Below is an example command. Due to size limit of vignette, graph will not be
displayed.
`view_deriv(conclusion, frame_by = "condition_ID")`

## 11. Session Info

### 11.1 Citation
```{r}
citation("TSAR")
citation()
citation("dplyr")
citation("ggplot2")
```

### 11.2 Session Info
```{r}
sessionInfo()
```