---
title: "LinTInd - tutorial"
author: "Luyue Wang"
date: "2021/12/22"
output: html_document
vignette: >
  %\VignetteIndexEntry{LinTInd - tutorial}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

## Introduction

Single-cell RNA sequencing has become a common approach for tracing the developmental processes of cells. However, tracing lineages with exogenous barcodes is more direct than inferring them from expression profiles. As gene-editing technology matures, combining it with exogenous barcodes can generate more complex dynamic information for single cells. In this application note, we introduce an R package, LinTInd, for reconstructing a tree from alleles generated by the genome-editing tool CRISPR over a moderate time period, based on the order in which editing occurs. For scRNA-seq data, LinTInd can also quantify the similarity between each pair of clusters in three ways.

## Installation

Via GitHub

```
devtools::install_github("mana-W/LinTInd")
```

Via Bioconductor

```
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("LinTInd")
```

```{r load package, message=FALSE}
library(LinTInd)
```

## Import data

The input for LinTInd consists of three required files:

- sequence
- reference
- position of cutsites

and an optional file:

- celltype

```{r, message=FALSE}
# Locate the example files shipped with the package
data <- system.file("extdata", "CB_UMI", package = "LinTInd")
fafile <- system.file("extdata", "V3.fasta", package = "LinTInd")
cutsite <- system.file("extdata", "V3.cutSites", package = "LinTInd")
celltype <- system.file("extdata", "celltype.tsv", package = "LinTInd")

# Read each file (the path variables are overwritten by the loaded data)
data <- read.table(data, sep = "\t", header = TRUE)
ref <- ReadFasta(fafile)
cutsite <- read.table(cutsite, col.names = c("indx", "start", "end"))
celltype <- read.table(celltype, header = TRUE, stringsAsFactors = FALSE)
```

For the sequence file, only the column containing the read strings is required; the cell barcodes and UMIs are both optional.

```{r}
head(data, 3)
ref
cutsite
head(celltype, 3)
```

## Array identification and indel visualization

In the first step, we use `FindIndel()` to align the reads and find indels. The function `IndelForm()` then generates an array-form string for each read.

```{r find indels and generate array-form strings, message=FALSE}
scarinfo <- FindIndel(data = data, scarfull = ref, scar = cutsite, indel.coverage = "All", type = "test", cln = 1)
scarinfo <- IndelForm(scarinfo, cln = 1)
```

Then, for single-cell sequencing, we define a final array-form string for each cell with `IndelIdents()`. Three methods are provided:

- *"reads.num"* (default): find the array-form string supported by the most reads in a cell
- *"umi.num"*: find the array-form string supported by the most UMIs in a cell
- *"consensus"*: find the consensus sequence in each cell, and then generate array-form strings from the new reads

For bulk sequencing, this step generates a "cell barcode" for each read.

```{r IndelIdents, message=FALSE}
cellsinfo <- IndelIdents(scarinfo, method.use = "umi.num", cln = 1)
```

After defining the indels for each cell, we can use `IndelPlot()` to visualise them.

```{r IndelPlot}
IndelPlot(cellsinfo = cellsinfo)
```

## Indel extraction and similarity calculation

We can use the function `TagProcess()` to extract indels for cells/reads. The parameter *Cells* is optional.

```{r TagProcess}
tag <- TagProcess(cellsinfo$info, Cells = celltype)
```

If an annotation is provided for each cell, we can also use `TagDist()` to calculate the relationship between each pair of groups in three ways:

- *"Jaccard"* (default): calculate the weighted Jaccard similarity of indels between each pair of groups
- *"P"*: right-tailed test that compares the level of indel intersection with the hypothetical result generated from random editing; the former is expected to be significantly higher than the latter
- *"spearman"*: Spearman correlation of indels between each pair of groups

The heatmap of this result will be saved as a PDF file.

```{r TagDist}
tag_dist <- TagDist(tag, method = "Jaccard")
tag_dist
```

## Tree reconstruction

In the last part, we can use `BuildTree()` to generate an array generant tree.

```{r BuildTree}
treeinfo <- BuildTree(tag)
```

Finally, we can use the function `PlotTree()` to visualise the tree created before.

```{r PlotTree}
plotinfo <- PlotTree(treeinfo = treeinfo, data.extract = "TRUE", annotation = "TRUE")
plotinfo$p
```

## Session Info

```{r sessionInfo, echo=TRUE}
sessionInfo()
```
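
## Note on the weighted Jaccard similarity

To make the *"Jaccard"* option of `TagDist()` more concrete, the following is a minimal base-R sketch of the textbook weighted Jaccard similarity (sum of element-wise minima divided by sum of element-wise maxima). It is an illustration under stated assumptions, not the package's implementation: the helper `weighted_jaccard()`, the `counts` matrix, and the group and indel names are all hypothetical, and `TagDist()` may weight or normalise indels differently. The last line shows a pairwise Spearman correlation on the same toy matrix, again only as an illustration of the *"spearman"* idea.

```r
# Weighted Jaccard similarity between two non-negative count vectors:
# sum of element-wise minima divided by sum of element-wise maxima.
# (Hypothetical helper, not part of LinTInd.)
weighted_jaccard <- function(x, y) {
  stopifnot(length(x) == length(y), all(x >= 0), all(y >= 0))
  denom <- sum(pmax(x, y))
  if (denom == 0) return(NA_real_)  # neither group carries any indel
  sum(pmin(x, y)) / denom
}

# Made-up indel counts per group (rows = groups, columns = indels)
counts <- rbind(
  groupA = c(indel1 = 4, indel2 = 0, indel3 = 2),
  groupB = c(indel1 = 2, indel2 = 1, indel3 = 2),
  groupC = c(indel1 = 0, indel2 = 5, indel3 = 0)
)

# Pairwise similarity matrix over all groups
groups <- rownames(counts)
sim <- outer(
  seq_along(groups), seq_along(groups),
  Vectorize(function(i, j) weighted_jaccard(counts[i, ], counts[j, ]))
)
dimnames(sim) <- list(groups, groups)
sim

# Pairwise Spearman correlation between groups on the same toy counts
spearman <- cor(t(counts), method = "spearman")
```

In this toy matrix, `groupA` and `groupB` share most of their indel counts and therefore score higher than either does against `groupC`, which has no indel in common with `groupA`.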