0 Important announcement

From Bioconductor version 3.16 onwards, NxtIRFcore will be replaced by SpliceWiz, which provides NxtIRFcore’s full functionality and more.

1 Installation

To install this package, start R (version “4.2.1”) and enter:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("NxtIRFdata")

Start using NxtIRFdata:

library(NxtIRFdata)

2 Obtaining the example NxtIRF genome and gene annotation files

Examples in SpliceWiz are demonstrated using an artificial genome and gene annotation. A synthetic reference, consisting of a genome sequence (FASTA) file and a gene annotation (GTF) file, is provided. It is based on the genes SRSF1, SRSF2, SRSF3, TRA2A, TRA2B, TP53 and NSUN5. These genes, each with an additional 100 flanking nucleotides, were used to construct an artificial “chromosome Z” (chrZ). Gene annotations, based on release-94 of Ensembl GRCh38 (hg38), were modified so that their genome coordinates correspond to this artificial chromosome.

These files can be accessed as follows:

example_fasta = chrZ_genome()
example_gtf = chrZ_gtf()
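As a quick sanity check on these files, the sketch below reads the FASTA header names and tallies the GTF feature types using base R only. `summarise_reference()` is a hypothetical helper written for this illustration and is not part of NxtIRFdata or SpliceWiz. It assumes that `chrZ_genome()` and `chrZ_gtf()` return file paths and that the GTF follows the usual tab-delimited layout with the feature type in the third column.

```r
# Hypothetical helper (not part of NxtIRFdata): summarise a FASTA/GTF pair
# using base R only.
summarise_reference <- function(fasta_path, gtf_path) {
    stopifnot(file.exists(fasta_path), file.exists(gtf_path))

    # FASTA header lines start with ">"; keep the name up to the first space
    fasta_lines <- readLines(fasta_path)
    headers <- grep("^>", fasta_lines, value = TRUE)
    seq_names <- sub("^>([^[:space:]]+).*$", "\\1", headers)

    # GTF lines are tab-delimited; lines starting with "#" are comments
    gtf_lines <- readLines(gtf_path)
    gtf_lines <- gtf_lines[nzchar(gtf_lines) & !grepl("^#", gtf_lines)]
    fields <- strsplit(gtf_lines, "\t", fixed = TRUE)
    # Assumption: the feature type (gene, transcript, exon, ...) is column 3
    features <- vapply(fields, `[`, character(1), 3)

    list(seq_names = seq_names, feature_counts = table(features))
}

# Usage with the example files, run only if NxtIRFdata is installed
if (requireNamespace("NxtIRFdata", quietly = TRUE)) {
    ref_summary <- summarise_reference(
        NxtIRFdata::chrZ_genome(), NxtIRFdata::chrZ_gtf()
    )
    print(ref_summary$seq_names)
    print(ref_summary$feature_counts)
}
```

If the files are as described above, the sequence names should contain a single entry for the artificial chromosome.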

3 Obtaining the example BAM file dataset for SpliceWiz

The set of 6 BAM files used in the SpliceWiz vignette / example code can be downloaded to a path of the user’s choice using the following function:

bam_paths = example_bams(path = tempdir())

Note that this downloads the BAM files but not their respective BAI files (BAM file indices). This is because SpliceWiz reads BAM files natively and does not require Rsamtools. BAI files are provided alongside the BAM files in their respective ExperimentHub entries for users who wish to view these files using Rsamtools.
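After downloading, it can be useful to confirm that every file arrived and to derive sample names from the file names. The sketch below does this with base R. `bam_sample_table()` is a hypothetical helper, not a function of NxtIRFdata or SpliceWiz, and it assumes only that `example_bams()` returns a character vector of paths ending in `.bam`.

```r
# Hypothetical helper (not part of NxtIRFdata or SpliceWiz): tabulate a
# character vector of BAM paths with sample names and basic file checks.
bam_sample_table <- function(bam_paths) {
    data.frame(
        # Sample name = file name without directory or ".bam" extension
        sample = sub("\\.bam$", "", basename(bam_paths)),
        path = bam_paths,
        exists = file.exists(bam_paths),
        # file.size() returns NA for files that do not exist
        size_bytes = file.size(bam_paths),
        stringsAsFactors = FALSE
    )
}

# Usage (downloads data, so it is left commented out here):
# bam_paths <- example_bams(path = tempdir())
# bam_sample_table(bam_paths)
```

Rows with `exists == FALSE` or a missing size point to an incomplete download.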

4 Obtaining the Mappability Exclusion BED files

NxtIRFdata retrieves the relevant records from AnnotationHub and makes a local copy of the BED file. This BED file supplies Mappability Exclusion information to SpliceWiz.

Note that the function get_mappability_exclusion() is intended to be called internally by SpliceWiz. Users interested in the format or nature of the Mappability BED file can call this function directly to examine its contents:

# To get the MappabilityExclusion for hg38 as a GRanges object
gr = get_mappability_exclusion(genome_type = "hg38", as_type = "GRanges")

# To get the MappabilityExclusion for hg38 as a locally-copied gzipped BED file
bed_path = get_mappability_exclusion(genome_type = "hg38", as_type = "bed.gz",
    path = tempdir())

# Other `genome_type` values include "hg19", "mm10", and "mm9"
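To examine the locally-copied BED file without additional packages, the following sketch reads a gzipped BED file and totals the excluded bases per chromosome. `summarise_bed()` is a hypothetical helper, not part of NxtIRFdata. It assumes that the file has no header line and that its first three columns are chromosome, start and end, as in the standard BED layout.

```r
# Hypothetical helper (not part of NxtIRFdata): total interval width per
# chromosome for a (optionally gzipped) BED file.
summarise_bed <- function(bed_path) {
    stopifnot(file.exists(bed_path))
    # gzfile() reads gzip-compressed as well as uncompressed files
    bed <- read.delim(gzfile(bed_path), header = FALSE,
        comment.char = "#", stringsAsFactors = FALSE)
    # Assumption: the first three columns are chrom, start, end
    names(bed)[1:3] <- c("chrom", "start", "end")
    # BED intervals are 0-based and half-open, so width = end - start
    bed$width <- bed$end - bed$start
    aggregate(width ~ chrom, data = bed, FUN = sum)
}

# Usage (downloads data, so it is left commented out here):
# bed_path <- get_mappability_exclusion(genome_type = "hg38",
#     as_type = "bed.gz", path = tempdir())
# summarise_bed(bed_path)
```

The half-open convention matters when comparing against the GRanges version, whose coordinates are 1-based and closed.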

5 Accessing NxtIRFdata via ExperimentHub

The data deposited in ExperimentHub can be accessed as follows:

library(ExperimentHub)
eh = ExperimentHub()
NxtIRF_hub = query(eh, "NxtIRF")

NxtIRF_hub

temp = eh[["EH6792"]]
temp

temp = eh[["EH6787"]]
temp

6 Information about the example BAM files

For more information about the example BAM files, refer to the NxtIRFdata package documentation:

?`NxtIRFdata-package`

SessionInfo

sessionInfo()
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.20-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] ExperimentHub_2.14.0 AnnotationHub_3.14.0 BiocFileCache_2.14.0
## [4] dbplyr_2.5.0         BiocGenerics_0.52.0  NxtIRFdata_1.12.0   
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1            dplyr_1.1.4                
##  [3] blob_1.2.4                  filelock_1.0.3             
##  [5] R.utils_2.12.3              Biostrings_2.74.0          
##  [7] bitops_1.0-9                fastmap_1.2.0              
##  [9] RCurl_1.98-1.16             GenomicAlignments_1.42.0   
## [11] XML_3.99-0.17               digest_0.6.37              
## [13] lifecycle_1.0.4             KEGGREST_1.46.0            
## [15] RSQLite_2.3.7               magrittr_2.0.3             
## [17] compiler_4.4.1              rlang_1.1.4                
## [19] sass_0.4.9                  tools_4.4.1                
## [21] utf8_1.2.4                  yaml_2.3.10                
## [23] rtracklayer_1.66.0          knitr_1.48                 
## [25] S4Arrays_1.6.0              bit_4.5.0                  
## [27] curl_5.2.3                  DelayedArray_0.32.0        
## [29] abind_1.4-8                 BiocParallel_1.40.0        
## [31] withr_3.0.2                 purrr_1.0.2                
## [33] R.oo_1.26.0                 grid_4.4.1                 
## [35] stats4_4.4.1                fansi_1.0.6                
## [37] SummarizedExperiment_1.36.0 cli_3.6.3                  
## [39] rmarkdown_2.28              crayon_1.5.3               
## [41] generics_0.1.3              httr_1.4.7                 
## [43] rjson_0.2.23                DBI_1.2.3                  
## [45] cachem_1.1.0                zlibbioc_1.52.0            
## [47] parallel_4.4.1              AnnotationDbi_1.68.0       
## [49] BiocManager_1.30.25         XVector_0.46.0             
## [51] restfulr_0.0.15             matrixStats_1.4.1          
## [53] vctrs_0.6.5                 Matrix_1.7-1               
## [55] jsonlite_1.8.9              IRanges_2.40.0             
## [57] S4Vectors_0.44.0            bit64_4.5.2                
## [59] jquerylib_0.1.4             glue_1.8.0                 
## [61] codetools_0.2-20            BiocVersion_3.20.0         
## [63] GenomeInfoDb_1.42.0         BiocIO_1.16.0              
## [65] GenomicRanges_1.58.0        UCSC.utils_1.2.0           
## [67] tibble_3.2.1                pillar_1.9.0               
## [69] rappdirs_0.3.3              htmltools_0.5.8.1          
## [71] GenomeInfoDbData_1.2.13     R6_2.5.1                   
## [73] evaluate_1.0.1              Biobase_2.66.0             
## [75] lattice_0.22-6              png_0.1-8                  
## [77] R.methodsS3_1.8.2           Rsamtools_2.22.0           
## [79] memoise_2.0.1               bslib_0.8.0                
## [81] SparseArray_1.6.0           xfun_0.48                  
## [83] MatrixGenerics_1.18.0       pkgconfig_2.0.3