---
title: "Step 3: Post-Processing"
author: "Tyler J Burns"
date: "October 2, 2017"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Final Post-Processing Steps for Scone}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, results = "markup", message = FALSE, warning = FALSE)
knitr::opts_chunk$set(fig.width=6, fig.height=4)
```

### The post-processing function:

This vignette covers what takes place after the SCONE output has been generated, as detailed in TheSconeWorkflow.Rmd. The first step is to merge the SCONE-generated columns into the original input data. The user has the option of log10-transforming the q values (plotted below as -log10), which makes them easier to visualize. The user also has the option to run t-SNE on the data, so that the resulting maps can be colored by SCONE-generated values. In this case, t-SNE is run with the Rtsne package, using the same markers that were used as input for the KNN generation.

```{r}
library(Sconify)
wand.final <- PostProcessing(scone.output = wand.scone,
                             cell.data = wand.combined,
                             input = input.markers)

wand.combined # input data
wand.scone # scone-generated data
wand.final # the data after post-processing

# tSNE map shows highly responsive population of interest
TsneVis(wand.final, "pSTAT5(Nd150)Di.IL7.change", "IL7 -> pSTAT5 change")

# tSNE map now colored by q value
TsneVis(wand.final, "pSTAT5(Nd150)Di.IL7.qvalue", "IL7 -> pSTAT5 -log10(qvalue)")

# tSNE map colored by KNN density estimation
TsneVis(wand.final, "density")
```

### Subsampling your data prior to running t-SNE:

If the dataset contains a large number of cells (>100K), t-SNE can become time-consuming and produce results that are less clean. For that reason, I provide a wrapper that subsamples the final data and runs t-SNE on the subsample, producing a new tibble that contains the subsampled data with two t-SNE dimensions added to it.

Note that the two added dimensions at the end of the tibble are named "bh-SNE11" and "bh-SNE21". This is because the columns "bh-SNE1" and "bh-SNE2" are already present in the data: t-SNE was run during the post-processing step in this example. Realistically, a user would apply this function to a much larger number of cells, in which case the user would have selected "tsne = FALSE" in the PostProcessing function described earlier in this vignette.

```{r}
wand.final.sub <- SubsampleAndTsne(dat = wand.final,
                                   input = input.markers,
                                   numcells = 500)
wand.final.sub
```
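
For readers who want to see the idea behind subsampling before t-SNE, here is a minimal sketch on simulated data. It is not the Sconify implementation. The helper `subsample.and.embed`, the toy data, the marker names, and the output column names `tsne1` and `tsne2` are all assumptions made for illustration. The only external call is `Rtsne()` from the Rtsne package mentioned above, used with its default settings.

```{r}
library(Rtsne)  # the t-SNE implementation named earlier in this vignette

set.seed(1)  # make the toy data and the subsample reproducible

# Toy stand-in for a large dataset: 2000 cells, 5 markers (hypothetical names)
toy.cells <- as.data.frame(matrix(rnorm(2000 * 5), ncol = 5))
colnames(toy.cells) <- paste0("marker", 1:5)
toy.markers <- colnames(toy.cells)

# Hypothetical helper, NOT part of Sconify: draw a random subset of rows,
# run t-SNE on the chosen marker columns, and append the two dimensions.
subsample.and.embed <- function(dat, markers, numcells) {
    # Sample row indices without replacement, capped at the number of rows
    keep <- sample(nrow(dat), min(numcells, nrow(dat)))
    sub <- dat[keep, , drop = FALSE]

    # Rtsne returns a list; the embedding coordinates are in the $Y matrix
    embedding <- Rtsne(as.matrix(sub[, markers]), dims = 2)$Y

    # Illustrative column names, chosen so they cannot clash with "bh-SNE1/2"
    sub$tsne1 <- embedding[, 1]
    sub$tsne2 <- embedding[, 2]
    sub
}

toy.sub <- subsample.and.embed(toy.cells, toy.markers, numcells = 500)
dim(toy.sub)  # rows = subsampled cells, columns = markers plus two t-SNE axes
```

Because only the subsampled rows are embedded, the cost of t-SNE depends on `numcells` rather than on the size of the full dataset, which is the motivation for subsampling large datasets described above.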