---
title: "ROSeq"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{ROSeq}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# ROSeq

Modeling expression ranks for noise-tolerant differential expression analysis of scRNA-Seq data

## Introduction

ROSeq is a rank-based approach to modeling gene expression from a filtered and normalized read count matrix. It takes the filtered and normalized read count matrix and the cell annotation (condition) as input and determines the genes that are differentially expressed between the contrasting groups of single cells. One of the input parameters is the number of cores to be used.

## Installation

The development version of the R package can be installed with the following R commands:

```r
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# The following initializes usage of Bioc devel
BiocManager::install(version='devel')
BiocManager::install("ROSeq")
```

The GitHub version of the R package can be installed with the following R commands:

```r
library(devtools)
install_github('krishan57gupta/ROSeq')
```

## Vignette tutorial

This vignette uses the Tung dataset, which is bundled with the package, to demonstrate a standard pipeline.

## Example

Load the required libraries before running the analysis.

```{r setup}
library(ROSeq)
library(edgeR)
library(limma)
```

### Loading the Tung dataset

```{r data, message=FALSE,warning = FALSE,include=TRUE, cache=FALSE}
samples<-list()
samples$count<-ROSeq::L_Tung_single$NA19098_NA19101_count
samples$group<-ROSeq::L_Tung_single$NA19098_NA19101_group
samples$count[1:5,1:5]
```

### Data preprocessing

#### Cell and gene filtering, then voom transformation after TMM normalization

The commands below can be used for cell/gene filtering, TMM normalization, and voom transformation.
Users are free to substitute an alternative preprocessing strategy with different filtering or normalization methods.

```{r preprocesing, message=FALSE,warning = FALSE,include=TRUE, cache=FALSE}
# Coerce the counts to numeric while preserving the gene names
gene_names<-rownames(samples$count)
samples$count<-apply(samples$count,2,function(x) as.numeric(x))
rownames(samples$count)<-gene_names
# Keep cells with more than 2000 detected genes
samples$count<-samples$count[,colSums(samples$count> 0) > 2000]
# Keep genes with a count above 2 in at least 3 cells
gkeep<-apply(samples$count,1,function(x) sum(x>2)>=3)
samples$count<-samples$count[gkeep,]
# TMM-normalize, then apply the voom transformation
samples$count<-limma::voom(ROSeq::TMMnormalization(samples$count))
```

### ROSeq analysis

Input: a gene expression matrix with genes in rows and cells in columns. The condition/group annotation of the cells also needs to be supplied. Users can set `numCores` based on the hardware specifications of their computer.

```{r main, message=FALSE,warning = FALSE, include=TRUE, cache=FALSE}
output<-ROSeq(countData=samples$count$E, condition = samples$group, numCores=1)
```

### Results are reported as pVals and pAdj

##### p_Vals : p-value (unadjusted)

##### p_Adj : adjusted p-value, based on the FDR method

```{r output, message=FALSE,warning = FALSE,include=TRUE, cache=FALSE}
output[1:5,]
```
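
As a follow-up to the results table, here is a minimal sketch of how FDR-adjusted p-values relate to raw p-values and how one might filter on them. Everything in it is illustrative: `mock_output`, the gene names, the p-values, and the 0.05 cutoff are hypothetical, and the column names `pVals` and `pAdj` are assumed from the heading above. `p.adjust(..., method = "fdr")` is base R's Benjamini-Hochberg adjustment; whether ROSeq computes its adjusted values in exactly this way is an assumption.

```r
# Hypothetical raw p-values for five genes (not real ROSeq output).
p_raw <- c(geneA = 0.001, geneB = 0.008, geneC = 0.04, geneD = 0.30, geneE = 0.75)

# Benjamini-Hochberg (FDR) adjustment from base R's stats package.
p_fdr <- p.adjust(p_raw, method = "fdr")

# Assemble a matrix shaped like the assumed result table (genes in rows).
mock_output <- cbind(pVals = p_raw, pAdj = p_fdr)

# Keep genes below the chosen FDR cutoff, ordered by adjusted p-value.
fdr_cutoff <- 0.05
significant <- mock_output[mock_output[, "pAdj"] < fdr_cutoff, , drop = FALSE]
significant <- significant[order(significant[, "pAdj"]), , drop = FALSE]
significant
```

With these made-up values, `geneA` and `geneB` pass the cutoff. The same subsetting could be applied to `output` from the analysis step, provided its adjusted p-value column is named as assumed here.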