Abstract
NxtIRFdata is a data package containing ready-made BED files of Mappability exclusion genomic regions. It also contains a fully-functioning example data set with a “mock” genome and genome annotation to demonstrate the functionalities of SpliceWiz, a powerful and interactive analysis and visualisation tool for alternative splicing.
From Bioconductor version 3.16 onwards, NxtIRFcore is replaced by SpliceWiz, which provides NxtIRFcore’s full functionality (plus more)!
To install this package, start R (version “4.2.1”) and enter:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("NxtIRFdata")
Start using NxtIRFdata:
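Attaching the package is the usual first step. The line below is a minimal sketch that assumes the installation above completed successfully.

```r
# Attach NxtIRFdata so that its exported functions are available
library(NxtIRFdata)
```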
Examples in SpliceWiz are demonstrated using an artificial genome and gene annotation. A synthetic reference, comprising genome sequence (FASTA) and gene annotation (GTF) files, is provided, based on the genes SRSF1, SRSF2, SRSF3, TRA2A, TRA2B, TP53 and NSUN5. These genes, each with an additional 100 flanking nucleotides, were used to construct an artificial “chromosome Z” (chrZ). Gene annotations, based on release-94 of Ensembl GRCh38 (hg38), were modified so that their genome coordinates correspond to this artificial chromosome.
These files can be accessed as follows:
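The sketch below shows one way to locate the two reference files. It assumes that the accessor functions `chrZ_genome()` and `chrZ_gtf()` are exported by NxtIRFdata and each return a file path; check the package documentation if your installed version differs.

```r
library(NxtIRFdata)

# Path to the FASTA file holding the artificial chrZ sequence
# (assumes chrZ_genome() returns a file path)
genome_path = chrZ_genome()

# Path to the GTF file with gene annotations in chrZ coordinates
# (assumes chrZ_gtf() returns a file path)
gtf_path = chrZ_gtf()

# Confirm that both files exist on disk
file.exists(genome_path)
file.exists(gtf_path)
```

The returned paths can then be passed to any function that expects a FASTA or GTF file.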
The set of 6 BAM files used in the SpliceWiz vignette / example code can be downloaded to a path of the user’s choice using the following function:
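A minimal sketch of the download step follows. It assumes the function is `example_bams()` with a `path` argument and that it returns the paths of the downloaded files; `tempdir()` is used here only as a convenient writable location, and an internet connection is needed on first use.

```r
library(NxtIRFdata)

# Download the example BAM files into a temporary directory;
# replace tempdir() with any writable path of your choice
bam_paths = example_bams(path = tempdir())

# Inspect the returned file paths
bam_paths
```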
Note that this downloads the BAM files but not their respective BAI (BAM index) files. This is because SpliceWiz reads BAM files natively and does not require Rsamtools. BAI files are provided alongside the BAM files in their respective ExperimentHub entries for users wishing to view these files using Rsamtools.
NxtIRFdata retrieves the relevant records from AnnotationHub and makes a local copy of the BED file. This BED file supplies Mappability Exclusion information to SpliceWiz.
Note that this function, get_mappability_exclusion(), is intended to be called internally by SpliceWiz. Users interested in the format or nature of the Mappability BED file can call it directly to examine the file's contents:
# To get the MappabilityExclusion for hg38 as a GRanges object
gr = get_mappability_exclusion(genome_type = "hg38", as_type = "GRanges")
# To get the MappabilityExclusion for hg38 as a locally-copied gzipped BED file
bed_path = get_mappability_exclusion(genome_type = "hg38", as_type = "bed.gz",
    path = tempdir())
# Other `genome_type` values include "hg19", "mm10", and "mm9"
The data deposited in ExperimentHub can be accessed as follows:
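The following sketch uses the general ExperimentHub query interface. The search term "NxtIRF" is an assumption about how the records are tagged; no specific record identifiers are assumed, so the first matching record is fetched purely as an illustration.

```r
library(ExperimentHub)

# Connect to ExperimentHub (requires internet access on first use)
eh = ExperimentHub()

# List the records matching the (assumed) search term "NxtIRF"
NxtIRF_hub = query(eh, "NxtIRF")
NxtIRF_hub

# Retrieve the first matching record by its ExperimentHub ID;
# this downloads the resource and caches it locally
first_id = names(NxtIRF_hub)[1]
resource = eh[[first_id]]
resource
```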
For more information about the example BAM files, refer to the NxtIRFdata package documentation:
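One way to open that documentation from an R session is shown below, assuming the help topic is named after the `example_bams` function.

```r
# Open the help page describing the example BAM files
# (the topic name "example_bams" is an assumption)
help("example_bams", package = "NxtIRFdata")
```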
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /home/biocbuild/bbs-3.20-bioc/R/lib/libRblas.so
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_GB LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/New_York
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] ExperimentHub_2.14.0 AnnotationHub_3.14.0 BiocFileCache_2.14.0
## [4] dbplyr_2.5.0 BiocGenerics_0.52.0 NxtIRFdata_1.12.0
##
## loaded via a namespace (and not attached):
## [1] tidyselect_1.2.1 dplyr_1.1.4
## [3] blob_1.2.4 filelock_1.0.3
## [5] R.utils_2.12.3 Biostrings_2.74.0
## [7] bitops_1.0-9 fastmap_1.2.0
## [9] RCurl_1.98-1.16 GenomicAlignments_1.42.0
## [11] XML_3.99-0.17 digest_0.6.37
## [13] lifecycle_1.0.4 KEGGREST_1.46.0
## [15] RSQLite_2.3.7 magrittr_2.0.3
## [17] compiler_4.4.1 rlang_1.1.4
## [19] sass_0.4.9 tools_4.4.1
## [21] utf8_1.2.4 yaml_2.3.10
## [23] rtracklayer_1.66.0 knitr_1.48
## [25] S4Arrays_1.6.0 bit_4.5.0
## [27] curl_5.2.3 DelayedArray_0.32.0
## [29] abind_1.4-8 BiocParallel_1.40.0
## [31] withr_3.0.2 purrr_1.0.2
## [33] R.oo_1.26.0 grid_4.4.1
## [35] stats4_4.4.1 fansi_1.0.6
## [37] SummarizedExperiment_1.36.0 cli_3.6.3
## [39] rmarkdown_2.28 crayon_1.5.3
## [41] generics_0.1.3 httr_1.4.7
## [43] rjson_0.2.23 DBI_1.2.3
## [45] cachem_1.1.0 zlibbioc_1.52.0
## [47] parallel_4.4.1 AnnotationDbi_1.68.0
## [49] BiocManager_1.30.25 XVector_0.46.0
## [51] restfulr_0.0.15 matrixStats_1.4.1
## [53] vctrs_0.6.5 Matrix_1.7-1
## [55] jsonlite_1.8.9 IRanges_2.40.0
## [57] S4Vectors_0.44.0 bit64_4.5.2
## [59] jquerylib_0.1.4 glue_1.8.0
## [61] codetools_0.2-20 BiocVersion_3.20.0
## [63] GenomeInfoDb_1.42.0 BiocIO_1.16.0
## [65] GenomicRanges_1.58.0 UCSC.utils_1.2.0
## [67] tibble_3.2.1 pillar_1.9.0
## [69] rappdirs_0.3.3 htmltools_0.5.8.1
## [71] GenomeInfoDbData_1.2.13 R6_2.5.1
## [73] evaluate_1.0.1 Biobase_2.66.0
## [75] lattice_0.22-6 png_0.1-8
## [77] R.methodsS3_1.8.2 Rsamtools_2.22.0
## [79] memoise_2.0.1 bslib_0.8.0
## [81] SparseArray_1.6.0 xfun_0.48
## [83] MatrixGenerics_1.18.0 pkgconfig_2.0.3