%\VignetteIndexEntry{CRISPRseek Vignette} %\VignetteDepends{CRISPRseek} %\VignetteKeywords{CRISPRseek-Cas9} %\VignettePackage{CRISPRseek} \documentclass[12pt]{article} <>= BiocStyle::latex() @ \usepackage{hyperref} \usepackage{url} \usepackage[numbers]{natbib} \usepackage{graphicx} \usepackage[section]{placeins} \bibliographystyle{plainnat} \author{Lihua Julie Zhu, Michael Brodsky} \begin{document} \SweaveOpts{concordance=TRUE} \title{CRISPRseek user's guide} \maketitle \tableofcontents %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Introduction} CRISPR-Cas9 nucleases and their derivatives have rapidly become widely used tools for both genome modification and regulation of gene expression. They can create genetic changes with high efficiency in human stem cells, in model organisms such as mice and Drosophila and in a wide variety of other organisms. CRISPR-Cas9 nucleases create double strand DNA breaks that can facilitate a variety of genome modifications including short insertions and/or deletions (indels) or specific sequence changes introduced by homology directed repair with a DNA donor molecule. The high activity and relative ease of construction has made CRISPR-Cas9 nucleases a popular replacement for related technologies such as zinc finger nucleases and TALENs. Derivatives of CRISPR-Cas9 complexes include nickases, which only cleave one DNA strand, and gene expression regulators, which lack any DNA cleavage activity but can increase or decrease gene transcription. CRISPR-Cas9 nucleases are composed of an RNA-protein complex that can target a variable sequence (guide RNA, abbreviated as "gRNA") that is directly adjacent to a constant motif (the "PAM" sequence). In the most widely use version from the species Streptococcus pyogenes, the gRNA is composed of a variable region of 20 bases and the preferred PAM sequence is an adjacent 3 base sequence of NGG (or NAG with lower activity). One potential limitation for CRISPR-Cas9 nucleases is that they can cleave at some sequences that do not precisely match the sequence targeted by the gRNA sequence. Thus, an important consideration for the design and application of CRISPR-Cas9 nucleases is the identification of gRNA regions with low rates of off-target cleavage. Several computational analyses can assist with the construction and application of CRISPR-Cas9 nucleases with high on-target and low off-target cleavage. First, gRNA sequences can be evaluated for possible off-target sequences in the target genome. Second, sequences flanking possible off-target sequences can be reported to assist in the experimental analysis of off-target cleavage and to determine if these sequences are within critical regions for gene function such as exons. Third, specific arrangements of target sequences can be selected; one alternate approach to lower off-target rates is to introduce pairs of CRISPR-Cas9 nickases, which will only create double strand DNA breaks at genomic regions where the two sites have the proper spacing and orientation. Finally, in some applications, it is useful to use restriction enzyme sequences that overlap CRISPR-Cas9 target sites in order to monitor cleavage events. We developed \Biocpkg{CRISPRseek} package that identifies candidate CRISPR-Cas9 target sequences within a given input sequence using a variety of experimentally useful constraints and also reports and ranks potential off-target sequences for each recovered target sequence. CRISPRseek will automatically find potential target sequences that are/are not present as pairs that can be used as double nickases or that have/don't have overlapping restriction enzyme cut site(s). It will then search genome-wide for off-targets with a user defined maximum number of mismatches, calculate the score of each off-target based on mismatches in the off-target and a penalty weight matrix, filter off-targets with user-defined criteria, and annotate off-targets with flanking sequences, and whether located in exon or not. Several reports are generated including a summary report with gRNAs ranked by total topN off-target score, restriction enzyme cut sites and possible paired gRNAs. Detailed paired gRNAs information, restriction enzyme cut sites, and off-target sequences and scores are stored in separate files in the output directory specified by the user. In total, four tab delimited files are generated in the output directory: OfftargetAnalysis.xls (off-target details), Summary.xls (gRNA summary), REcutDetails.xls (restriction enzyme cut sites of each gRNA), and pairedgRNAs.xls (potential paired gRNAs). These reports provide a comprehensive set of information to identify, select and utilize Streptococcus pyogenes CRISPR-Cas9 nucleases and their derivatives. The package can also be readily modified to accept different gRNA and PAM sequence requirements for CRISPR-Cas9 complexes from other bacterial species that can be used to target alternative genomic sequences. The package can also be modified to incorporate improved weight matrices or scoring.method for scoring off-target sequences as new experimental and computational results become available for CRISPR-Cas9 nucleases for Streptococcus pyogenes and other species. \section{Examples of using CRISPRseek} In this guide, we will illustrate five different gRNA search scenarios with a human sequence. First load \Biocpkg{CRISPRseek}, \Biocannopkg{BSgenome.Hsapiens.UCSC.hg19} and \Biocannopkg{TxDb.Hsapiens.UCSC.hg19.knownGene}. Then specify the sequence file path as inputFilePath, a fasta/fastq file containing a genomic sequence, restriction enzyme pattern file as REpatternFile and output directory as outputDir. Once the analysis is done, analysis results will be saved in the output directory. To find BSgenome of other species, please refer to available.genomes in the \Biocpkg{BSgenome} package. For example, \Biocannopkg{BSgenome.Hsapiens.UCSC.hg19} for hg19, \Biocannopkg{BSgenome.Mmusculus.UCSC.mm10} for mm10, \Biocannopkg{BSgenome.Celegans.UCSC.ce6} for ce6, \Biocannopkg{BSgenome.Rnorvegicus.UCSC.rn5} for rn5, \linebreak \Biocannopkg{BSgenome.Drerio.UCSC.danRer7} for Zv9, and \Biocannopkg{BSgenome.Dmelanogaster.UCSC.dm3} for dm3 To create and use TxDb objects, please refer to the \Biocpkg{GenomicFeatures} package. For a list of existing TxDb objects, please search for annotation package starting with Txdb at \linebreak http://www.bioconductor.org/packages/release/BiocViews.html, such as \Biocannopkg{TxDb.Rnorvegicus.UCSC.rn5.refGene} for rat, \Biocannopkg{TxDb.Mmusculus.UCSC.mm10.knownGene} for mouse, \Biocannopkg{TxDb.Hsapiens.UCSC.hg19.knownGene} for human, \Biocannopkg{TxDb.Dmelanogaster.UCSC.dm3.ensGene} for Drosophila and \Biocannopkg{TxDb.Celegans.UCSC.ce6.ensGene} for C.elegans \Biocpkg{org.Hs.eg.db} is the gene ID mapping package for human. For a list of existing OrgDb packages, please search for OrgDb at \linebreak http://www.bioconductor.org/packages/release/BiocViews.html, such as \Biocannopkg{org.Rn.eg.db} for rat \Biocannopkg{org.Mm.eg.db} for mouse \Biocannopkg{org.Dm.eg.db} for Drosophila \Biocannopkg{org.Ce.eg.db} for C.elegans \begin{scriptsize} <<>>= library(CRISPRseek) library(BSgenome.Hsapiens.UCSC.hg19) library(TxDb.Hsapiens.UCSC.hg19.knownGene) library(org.Hs.eg.db) outputDir <- getwd() inputFilePath <- system.file('extdata', 'inputseq.fa', package = 'CRISPRseek') REpatternFile <- system.file('extdata', 'NEBenzymes.fa', package = 'CRISPRseek') @ \end{scriptsize} \subsection{Scenario 1: Finding paired gRNAs with restriction enzyme cut site(s)} Paired gRNAs in proper spacing and orientation give more specificity and gRNAs overlap with restriction enzyme cut sites facilitates cleavage monitoring. Calling the function \Rfunction{offTargetAnalysis} with findPairedgRNAOnly=TRUE and findgRNAsWithREcutOnly=TRUE results in searching, scoring and annotating gRNAs that are in paired configuration and at least one of the pairs overlap a restriction enzyme cut site. To be considered as a pair, gap between forward gRNA and the corresponding reverse gRNA needs to be (min.gap, max.gap) inclusive and the reverse gRNA must sit before the forward gRNA. The default (min.gap, max.gap) is (0,20). Please note that chromToSearch is set to chrX here for speed purpose, usually you would set it to all, which is the default. In order for a gRNA to be considered overlap with restriction enzyme cut site, the enzyme cut pattern must overlap with one of the gRNA positions specified in overlap.gRNA.positions, default position 17 and 18. Please note that max.mismatch allowed for off-target finding is set to 4 by default, set it to a larger number will significantly slow down the search. org.Hs.egSYMBOL is entrezID to gene symbol mapping in org.Hs.eg.db package for human. For detailed parameter settings using function \Rfunction{offTargetAnalysis}, please type help(offTargetAnalysis) \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, REpatternFile = REpatternFile, findPairedgRNAOnly = TRUE, BSgenomeName = Hsapiens, chromToSearch ="chrX", min.gap = 0, max.gap = 20, txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, orgAnn = org.Hs.egSYMBOL, max.mismatch = 0,overlap.gRNA.positions = c(17, 18), outputDir = outputDir,overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 2: Finding paired gRNAs with/without restriction enzyme cut site(s)} Calling the function \Rfunction{offTargetAnalysis} with findPairedgRNAOnly = TRUE and findgRNAsWithREcutOnly = FALSE results in searching, scoring and annotating gRNAs that are in paired configuration without requiring overlap any restriction enzyme cut site. The gRNAs will be annotated with restriction enzyme cut sites for users to review later. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile,findPairedgRNAOnly = TRUE, BSgenomeName = Hsapiens, chromToSearch = "chrX", txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, orgAnn = org.Hs.egSYMBOL, max.mismatch = 1, outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 3: Finding all gRNAs with restriction enzyme cut site(s)} Calling the function \Rfunction{offTargetAnalysis} with findPairedgRNAOnly=FALSE and findgRNAsWithREcutOnly = TRUE results in searching, scoring and annotating all gRNAs (paired and not paired) overlap restriction enzyme cut site(s) and off-targets. The gRNAs will be annotated with paired information for users to review later. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, BSgenomeName = Hsapiens, chromToSearch = "chrX", txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, orgAnn = org.Hs.egSYMBOL, max.mismatch = 1, outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 4: Finding all gRNAs} Calling the function \Rfunction{offTargetAnalysis} with findPairedgRNAOnly = FALSE and findgRNAsWithREcutOnly = FALSE results in searching, scoring and annotating all gRNAs and off-targets. The gRNAs will be annotated with paired information and restriction enzyme cut sites for users to review later. Please note that this search will be the slowest among all type of searches aforementioned. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile,findPairedgRNAOnly = FALSE, BSgenomeName = Hsapiens, chromToSearch = "chrX", txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, orgAnn = org.Hs.egSYMBOL, max.mismatch = 1, outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 5: Target and off-target analysis for user specified gRNAs} Calling the function \Rfunction{offTargetAnalysis} with findgRNAs = FALSE results in target and off-target searching, scoring and annotating for the input gRNAs. The gRNAs will be annotated with restriction enzyme cut sites for users to review later. However, paired information will not be available. \begin{scriptsize} <<>>= gRNAFilePath <- system.file('extdata', 'testHsap_GATA1_ex2_gRNA1.fa', package = 'CRISPRseek') results <- offTargetAnalysis(inputFilePath = gRNAFilePath, findgRNAsWithREcutOnly = FALSE, REpatternFile = REpatternFile, findPairedgRNAOnly = FALSE, findgRNAs = FALSE, BSgenomeName = Hsapiens, chromToSearch = 'chrX', txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, orgAnn = org.Hs.egSYMBOL, max.mismatch = 1, outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 6. Quick gRNA finding without target or off-target analysis} Calling the function \Rfunction{offTargetAnalysis} with chromToSearch = "" results in quick gRNA search without performing on-target and off-target analysis. Parameters findgRNAsWithREcutOnly and findPairedgRNAOnly can be tuned to indicate whether searching for gRNAs overlap restriction enzyme cut sites or not, and whether searching for gRNAs in paired configuration or not. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, REpatternFile = REpatternFile,findPairedgRNAOnly = TRUE, chromToSearch = "", outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 7. Quick gRNA finding with gRNA efficacy prediction } Calling the function \Rfunction{offTargetAnalysis} with max.mismatch = 0 results in quick gRNA search with gRNA efficacy prediction without off-target analysis. Parameter rule.set can be set to CRISPRscan published in 2015 or Root\_RuleSet2\_2016 to calculate efficacy using the rule set 2 published in 2016. By default, gRNA efficacy is predicted using the rule set 1 published in 2014. To use the rule set 2, first install python 2.7, then install the python packages: scikit-learn 0.16.1, pickle, pandas, numpy nd scipy. In a R session, type the following script to use python 2.7 since rule set 2 is implemented in python 2.7. Sys.setenv(PATH = paste("~/anaconda2/bin", Sys.getenv("PATH"), sep=":")) system("python --version") should output Python 2.7.15. In addition, parameters findgRNAsWithREcutOnly and findPairedgRNAOnly can be tuned to indicate whether searching for gRNAs overlap restriction enzyme cut sites or not, and whether searching for gRNAs in paired configuration or not. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, annotateExon = FALSE,findPairedgRNAOnly = TRUE, chromToSearch = "chrX", max.mismatch = 0, BSgenomeName = Hsapiens, outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} Here is an example on how to use CRISPRscan to calculate efficacy. Please remember to reset baseBeforegRNA, baseAfterPAM, rule.set, and featureWeightMatrixFile. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = TRUE, annotateExon = FALSE,findPairedgRNAOnly = TRUE, chromToSearch = "chrX", max.mismatch = 0, BSgenomeName = Hsapiens, rule.set = "CRISPRscan", baseBeforegRNA = 6, baseAfterPAM = 6, featureWeightMatrixFile = system.file("extdata", "Morenos-Mateo.csv", package = "CRISPRseek"), outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 8. Find potential gRNAs preferentially targeting one of two alleles without running time-consuming off-target analysis on all possible gRNAs.} Below is an example to search for all gRNAs that target at least one of the alleles. Two files are provided containing sequences that differ by a single nucleotide polymorphism (SNP). The results are saved in file scoresFor2InputSequences.xls in outputDir directory. \begin{scriptsize} <<>>= inputFile1Path <- system.file("extdata", "rs362331C.fa", package = "CRISPRseek") inputFile2Path <- system.file("extdata", "rs362331T.fa", package = "CRISPRseek") REpatternFile <- system.file("extdata", "NEBenzymes.fa", package = "CRISPRseek") seqs <- compare2Sequences(inputFile1Path, inputFile2Path, outputDir = outputDir , REpatternFile = REpatternFile, overwrite = TRUE) seqs @ \end{scriptsize} rs362331C.fa and rs362331T.fa are the names of the two input files. The output file will list all of the possible gRNA sequences for each of the two input sequences and provide a cleavage score for each of the two input sequences. To preferentially target one allele, select gRNA sequences that have the lowest score for the other allele. Selected gRNAs can then be examined for off-target sequences as described in Step 6. \subsection{Scenario 9. gRNA search and offTarget analysis of super long input sequence (longer than 200kb) } Calling the function \Rfunction{offTargetAnalysis} with annotatePaired = FALSE, enable.multicore = TRUE and set n.cores.max will improve the performance. We also suggest split the super long sequence into smaller chunks and perform offTarget analysis for each subsequence separately (Thank Alex Williams for sharing this use case at https://support.bioconductor.org/p/72994/). In addition, please remember to use repeat masked sequence as input. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, annotatePaired = FALSE, chromToSearch = "chrX", enable.multicore = TRUE, n.cores.max = 10, annotateExon = FALSE, max.mismatch = 0, BSgenomeName = Hsapiens, outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 10. Output cutting frerquency determiniation (CFD) score for offtargets} Calling the function \Rfunction{offTargetAnalysis} with scoring.method set to CFDscore and PAM.pattern as NNG or NGA will output CFD score using the algorithm by Doench et al., 2016, which models the effects of both mismatch position and mimatch type on cutting frequency. By default, scoring.method is set to Hsu-Zhang, which only models the effect of mismatch position. \begin{scriptsize} <<>>= results <- offTargetAnalysis(inputFilePath, annotatePaired = FALSE, scoring.method = "CFDscore", PAM.pattern = "NNG$|NGN$", chromToSearch = "chrX", annotateExon = FALSE, max.mismatch = 2, BSgenomeName = Hsapiens, outputDir = outputDir, overwrite = TRUE) @ \end{scriptsize} \subsection{Scenario 11. Design gRNAs for base editing systems} Cytosine or adenine base editors (CBEs or ABEs) can introduce specific DNA C-to-T or A-to-G alterations. Calling the function \Rfunction{offTargetAnalysis} with baseEditing set to TRUE, and targetBase, editingWindow and editingWindow.offtarget set appropriately. By default, targetBase is set to C, editingWindow to 4 to 8 and editingWindow.offtargets to 4 to 8 for the CBE system developed in the Liu Laboratory. \begin{scriptsize} <<>>= offTargetAnalysis(inputFilePath, findgRNAsWithREcutOnly = FALSE, findPairedgRNAOnly = FALSE, annotatePaired = FALSE, BSgenomeName = Hsapiens, chromToSearch = "chrX", txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, orgAnn = org.Hs.egSYMBOL, max.mismatch = 4, outputDir = outputDir, overwrite = TRUE, PAM.location = "5prime", PAM = "TGT", PAM.pattern = "^T[A|G]N", allowed.mismatch.PAM = 2, subPAM.position = c(1,2), baseEditing = TRUE, editingWindow = 10:20, targetBase = "A") @ \end{scriptsize} \subsection{Scenario 12. Design gRNAs and pegRNAs for the Prime Editor} Recently, the Liu Laboratory developed the Prime Editor (PE), which is more versatile and flexible with high efficacy and without the need to make a DSB or providing donor template. It can be used to make all possible 12 single base changes, 1-44 bp insertions, or 1-80 bp deletions. This editing system can be programmed to correct about 89 percent of human pathagenic variants. To design gRNAs and pegRNAs for PE, you need to set primeEditing to TRUE, set seq.length.changed, bp.after.target.end, PBS.length (default 13), RT.template.length (default 8 to 28), min.gap (default 40), max.gap (default 90), findPairedgRNAOnly (needs to be set to TRUE), paired.orientation (needs to be set to "PAMin"), target.start, target.end, and correct.seq accordingly. Please type help(offTargetAnalysis) for detailed description of these parameters and examples for designing PE. \subsection{Scenario 13. Predict indels and their frequencies for Cas9 genome editing systems} Calling the function \Rfunction{offTargetAnalysis} with predIndelFreq set to TRUE. The first command outputs the gRNA name,gRNAPlusPAM sequence, offtarget sequence, gene name, predicted offtarget score, genomic location, and predicted efficacy. The second command outputs the indel locations and frequencies for the first target site. The summary file contains Shannon entropy and the fraction of frameshift of mutational outcomes for each target. Currently, only the Lindel method by Chen et al., 2019 is available. Please type help(predictRelativeFreqIndels) for more details. \begin{scriptsize} <<>>= resultsIndelF <- offTargetAnalysis( inputFilePath, findgRNAsWithREcutOnly = FALSE, findPairedgRNAOnly = FALSE, annotatePaired = FALSE, BSgenomeName = Hsapiens, chromToSearch = "chrX", txdb = TxDb.Hsapiens.UCSC.hg19.knownGene, orgAnn = org.Hs.egSYMBOL, max.mismatch = 1, outputDir = outputDir, overwrite = TRUE, scoring.method = "CFDscore", PAM.pattern = "NNG$|NGN$", predIndelFreq = TRUE) if(exists("resultsIndelF")) { print(head(resultsIndelF$indelFreq[[1]])) mapply(write.table, resultsIndelF$indelFreq, file=paste0(names(resultsIndelF$indelFreq), '.xls'), sep = "\t", row.names = FALSE) } @ \end{scriptsize} \section{References} \begin{thebibliography}{5} \bibitem[Lihua Julie Zhu, 2015]{Zhu, 2015} Lihua Julie Zhu. Overview of guide RNA design tools for CRISPR-Cas9 genome editing technology. Frontiers in Biology. August 2015, Volume 10, Issue 4, pp 289-296 \bibitem[Mali P. et al., 2013]{Mali et al., 2013} Mali P. et al., CAS9 transcriptional activators for target specificity screening and paired nickases for cooperative genome engineering. Nat Biotechnol. 2013. 31(9):833-8 \bibitem[Hsu, P.D et al., 2013]{Hsu et al., 2013} Hsu, P.D. et al., DNA targeting specificity of rNA-guided Cas9 nucleases. Nat Biotechnol. 2013. 31:827-834. \bibitem[Doench JG, Hartenian E, Graham DB, Tothova Z, Hegde M, Smith I, Sullender M, Ebert BL, Xavier RJ, Root DE]{Doench et al., 2014} Doench et al., 2014 Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation. Nat Biotechnol. 2014 Sep 3. doi: 10.1038 nbt.3026 \bibitem[Lihua Julie Zhu, Benjamin R. Holmes, Neil Aronin and Michael Brodsky] {Zhu et al., 2014} Lihua Julie Zhu, Benjamin R. Holmes, Neil Aronin and Michael Brodsky. CRISPRseek: a Bioconductor package to identify target-specific guide RNAs for CRISPR-Cas9 genome-editing systems. Plos One Sept 23rd 2014 \bibitem[Moreno-Mateos, M., Vejnar, C., Beaudoin, J. et al., 2015] {Moreno-Mateos, M., Vejnar, C., Beaudoin, J. et al., 2015} Moreno-Mateos, M., Vejnar, C., Beaudoin, J. et al. CRISPRscan: designing highly efficient sgRNAs for CRISPR-Cas9 targeting in vivo. Nat Methods 12, 982–988 (2015) doi:10.1038/nmeth.3543 \bibitem[Doench JG et al., 2016]{Doench et al., 2016} Doench et al. Optimized sgRNA design to maximize activity and minimize off-target effects of CRISPR-Cas9. Nat Biotechnol. 2016 Jan 18. doi:10.1038/nbt.3437 \bibitem[Komor et al., 2016]{Komor et al., 2016}Komor, A. C., Kim, Y. B., Packer, M. S., Zuris, J. A. and Liu, D. R. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424 (2016). \bibitem[Gaudelli et al., 2017]{Gaudelli et al., 2017}Gaudelli, N. M. et al. Programmable base editing of A or T to G or C in genomic DNA without DNA cleavage. Nature 551, 464–471 (2017). \bibitem[Kim et al., 2018]{Kim et al., 2018}Kim et al., Deep learning improves prediction of CRISPR–Cpf1 guide RNA activityNat Biotechnol 36, 239–241 (2018). https://doi.org/10.1038/nbt.4061 \bibitem[Chen et al., 2019]{Che et al., 2019}Wei Chen, Aaron McKenna, Jacob Schreiber et al., Massively parallel profiling and predictive modeling of the outcomes of CRISPR/Cas9-mediated double-strand break repair, Nucleic Acids Research, Volume 47, Issue 15, 05 September 2019, Pages 7989–8003, https://doi.org/10.1093/nar/gkz487. \end{thebibliography} \section{Session Info} <<>>= sessionInfo() @ \end{document}