NanoporeRNASeq contains RNA-Seq data from the K562 and MCF7 cell lines, generated by the SG-NEx project (https://github.com/GoekeLab/sg-nex-data). Each cell line has three replicates: one direct RNA sequencing run and two cDNA sequencing runs. The files contain reads aligned to chromosome 22 (positions 1 to 25500000) of the human genome (GRCh38).
library("NanoporeRNASeq")
data("SGNexSamples")
SGNexSamples
##> DataFrame with 6 rows and 6 columns
##>                sample_id    Platform    cellLine    protocol cancer_type
##>              <character> <character> <character> <character> <character>
##> 1 K562_directcDNA_repl..      MinION        K562  directcDNA   Leukocyte
##> 2 K562_directcDNA_repl..     GridION        K562  directcDNA   Leukocyte
##> 3 K562_directRNA_repli..     GridION        K562   directRNA   Leukocyte
##> 4 MCF7_directcDNA_repl..      MinION        MCF7  directcDNA      Breast
##> 5 MCF7_directcDNA_repl..     GridION        MCF7  directcDNA      Breast
##> 6 MCF7_directRNA_repli..     GridION        MCF7   directRNA      Breast
##>                fileNames
##>              <character>
##> 1 NanoporeRNASeq/versi..
##> 2 NanoporeRNASeq/versi..
##> 3 NanoporeRNASeq/versi..
##> 4 NanoporeRNASeq/versi..
##> 5 NanoporeRNASeq/versi..
##> 6 NanoporeRNASeq/versi..
data("HsChr22BambuAnnotation")
HsChr22BambuAnnotation
##> GRangesList object of length 1500:
##> $ENST00000043402
##> GRanges object with 2 ranges and 2 metadata columns:
##>       seqnames            ranges strand | exon_rank exon_endRank
##>          <Rle>         <IRanges>  <Rle> | <integer>    <integer>
##>   [1]       22 20241415-20243110      - |         2            1
##>   [2]       22 20268071-20268531      - |         1            2
##>   -------
##>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
##>
##> $ENST00000086933
##> GRanges object with 3 ranges and 2 metadata columns:
##>       seqnames            ranges strand | exon_rank exon_endRank
##>          <Rle>         <IRanges>  <Rle> | <integer>    <integer>
##>   [1]       22 19148576-19149095      - |         3            1
##>   [2]       22 19149663-19149916      - |         2            2
##>   [3]       22 19150025-19150283      - |         1            3
##>   -------
##>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
##>
##> $ENST00000155674
##> GRanges object with 8 ranges and 2 metadata columns:
##>       seqnames            ranges strand | exon_rank exon_endRank
##>          <Rle>         <IRanges>  <Rle> | <integer>    <integer>
##>   [1]       22 17137511-17138357      - |         8            1
##>   [2]       22 17138550-17138738      - |         7            2
##>   [3]       22 17141059-17141233      - |         6            3
##>   [4]       22 17143098-17143131      - |         5            4
##>   [5]       22 17145024-17145117      - |         4            5
##>   [6]       22 17148448-17148560      - |         3            6
##>   [7]       22 17149542-17149745      - |         2            7
##>   [8]       22 17165209-17165287      - |         1            8
##>   -------
##>   seqinfo: 1 sequence from an unspecified genome; no seqlengths
##>
##> ...
##> <1497 more elements>
We can visualize the read coverage of one sample over a single transcript, ENST00000215832 (MAPK1).
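The plotting code in this section reads from `bamFiles`, which is not defined in the text shown here. The following is a minimal sketch of one way to build it from ExperimentHub. The search terms passed to `query()` are an assumption, so the result should be inspected to confirm that it lists exactly the six BAM files described in `SGNexSamples`.

```r
library(ExperimentHub)
library(Rsamtools)

# Search ExperimentHub for the BAM resources of this package.
# The search terms are an assumption; print `NanoporeData` to check
# which records were matched before using them.
NanoporeData <- query(ExperimentHub(), c("NanoporeRNASeq", "GRCh38", "BAM"))

# Retrieve every matched resource (downloaded and cached on first use).
bamList <- lapply(names(NanoporeData), function(id) NanoporeData[[id]])

# Combine the individual files into one BamFileList.
bamFiles <- do.call(BamFileList, bamList)
```

Retrieving the resources requires network access the first time; later calls reuse the local cache.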
library(ggbio)
# exon ranges of the transcript to plot
range <- HsChr22BambuAnnotation$ENST00000215832
# load the reference genome (needed only if a mismatch track is added)
library(BSgenome.Hsapiens.NCBI.GRCh38)
# plot annotation track
tx <- autoplot(range, aes(col = strand), group.selfish = TRUE)
# plot coverage track for the first sample
coverage <- autoplot(bamFiles[[1]], aes(col = coverage), which = range)
# merge the tracks into one plot
tracks(annotation = tx, coverage = coverage, heights = c(1, 3)) + theme_minimal()

Applying bambu to bamFiles
bambu returns a SummarizedExperiment object
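The call that produces `se` is not shown here. A minimal sketch follows, assuming that `bamFiles` is a BamFileList of the six samples and that the genome is supplied as the name of the BSgenome package loaded earlier. Passing the genome this way is an assumption; a FASTA file of the same assembly may be used instead.

```r
library(bambu)

# Assumption: `bamFiles` and `HsChr22BambuAnnotation` already exist in the
# session. The genome is given as a BSgenome package name, which must be
# installed; this choice is illustrative, not confirmed by the text.
se <- bambu(
    reads = bamFiles,
    annotations = HsChr22BambuAnnotation,
    genome = "BSgenome.Hsapiens.NCBI.GRCh38"
)
```

The result has one row per annotated or newly discovered transcript and one column per input BAM file.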
se
##> class: RangedSummarizedExperiment
##> dim: 1542 6
##> metadata(2): incompatibleCounts warnings
##> assays(4): counts CPM fullLengthCounts uniqueCounts
##> rownames(1542): BambuTx1 BambuTx2 ... ENST00000641933 ENST00000641967
##> rowData names(11): TXNAME GENEID ... txid eqClassById
##> colnames(6): 1eedff745d12ab_3844 1eedff44d8e8f4_3846 ...
##> 1eedffd3f3e14_3852 1eedff178a3d26_3854
##> colData names(1): name
We can visualize the annotated and novel isoforms identified for this example gene using the plotting functions in bambu
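The plotting call itself is not shown here. A minimal sketch follows, assuming that `plotBambu()` with `type = "annotation"` is the function meant and that the gene of interest is the one containing ENST00000215832. The gene identifier is looked up from the `TXNAME` and `GENEID` columns of `rowData(se)` instead of being hard-coded.

```r
library(bambu)
library(SummarizedExperiment)

# Find the gene that the example transcript belongs to.
txInfo <- rowData(se)
geneId <- unique(txInfo$GENEID[txInfo$TXNAME == "ENST00000215832"])

# Plot the annotated and novel transcript models of that gene together
# with their estimated expression in each sample.
plotBambu(se, type = "annotation", gene_id = geneId)
```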
##> [[1]]
##> TableGrob (4 x 1) "arrange": 4 grobs
##>   z     cells    name                grob
##> 1 1 (2-2,1-1) arrange      gtable[layout]
##> 2 2 (3-3,1-1) arrange      gtable[layout]
##> 3 3 (4-4,1-1) arrange      gtable[layout]
##> 4 4 (1-1,1-1) arrange text[GRID.text.285]
sessionInfo()
##> R version 4.6.0 RC (2026-04-17 r89917)
##> Platform: x86_64-pc-linux-gnu
##> Running under: Ubuntu 24.04.4 LTS
##>
##> Matrix products: default
##> BLAS: /home/biocbuild/bbs-3.24-bioc/R/lib/libRblas.so
##> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##>
##> locale:
##> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
##> [3] LC_TIME=en_GB LC_COLLATE=C
##> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
##> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
##> [9] LC_ADDRESS=C LC_TELEPHONE=C
##> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##>
##> time zone: America/New_York
##> tzcode source: system (glibc)
##>
##> attached base packages:
##> [1] stats4 stats graphics grDevices utils datasets methods
##> [8] base
##>
##> other attached packages:
##> [1] bambu_3.15.0
##> [2] SummarizedExperiment_1.43.0
##> [3] Biobase_2.73.1
##> [4] MatrixGenerics_1.25.0
##> [5] matrixStats_1.5.0
##> [6] BSgenome.Hsapiens.NCBI.GRCh38_1.3.1000
##> [7] BSgenome_1.81.0
##> [8] rtracklayer_1.73.0
##> [9] BiocIO_1.23.3
##> [10] ggbio_1.61.0
##> [11] ggplot2_4.0.3
##> [12] Rsamtools_2.29.0
##> [13] Biostrings_2.81.0
##> [14] XVector_0.53.0
##> [15] GenomicRanges_1.65.0
##> [16] IRanges_2.47.0
##> [17] S4Vectors_0.51.1
##> [18] Seqinfo_1.3.0
##> [19] NanoporeRNASeq_1.23.0
##> [20] ExperimentHub_3.3.0
##> [21] AnnotationHub_4.3.0
##> [22] BiocFileCache_3.3.0
##> [23] dbplyr_2.5.2
##> [24] BiocGenerics_0.59.0
##> [25] generics_0.1.4
##>
##> loaded via a namespace (and not attached):
##> [1] DBI_1.3.0 bitops_1.0-9 RBGL_1.89.0
##> [4] gridExtra_2.3 httr2_1.2.2 formatR_1.14
##> [7] rlang_1.2.0 magrittr_2.0.5 biovizBase_1.61.0
##> [10] otel_0.2.0 compiler_4.6.0 RSQLite_2.4.6
##> [13] GenomicFeatures_1.65.0 png_0.1-9 vctrs_0.7.3
##> [16] reshape2_1.4.5 ProtGenerics_1.45.0 stringr_1.6.0
##> [19] pkgconfig_2.0.3 crayon_1.5.3 fastmap_1.2.0
##> [22] backports_1.5.1 labeling_0.4.3 rmarkdown_2.31
##> [25] graph_1.91.0 UCSC.utils_1.9.0 purrr_1.2.2
##> [28] bit_4.6.0 xfun_0.57 cachem_1.1.0
##> [31] cigarillo_1.3.0 GenomeInfoDb_1.49.0 jsonlite_2.0.0
##> [34] blob_1.3.0 DelayedArray_0.39.0 BiocParallel_1.47.0
##> [37] parallel_4.6.0 cluster_2.1.8.2 VariantAnnotation_1.59.0
##> [40] R6_2.6.1 bslib_0.10.0 stringi_1.8.7
##> [43] RColorBrewer_1.1-3 rpart_4.1.27 xgboost_3.2.1.1
##> [46] jquerylib_0.1.4 Rcpp_1.1.1-1.1 knitr_1.51
##> [49] base64enc_0.1-6 BiocBaseUtils_1.15.0 Matrix_1.7-5
##> [52] nnet_7.3-20 tidyselect_1.2.1 rstudioapi_0.18.0
##> [55] dichromat_2.0-0.1 abind_1.4-8 yaml_2.3.12
##> [58] codetools_0.2-20 curl_7.1.0 lattice_0.22-9
##> [61] tibble_3.3.1 plyr_1.8.9 withr_3.0.2
##> [64] KEGGREST_1.53.0 S7_0.2.2 evaluate_1.0.5
##> [67] foreign_0.8-91 pillar_1.11.1 BiocManager_1.30.27
##> [70] filelock_1.0.3 checkmate_2.3.4 OrganismDbi_1.55.1
##> [73] RCurl_1.98-1.18 ensembldb_2.37.0 BiocVersion_3.24.0
##> [76] scales_1.4.0 glue_1.8.1 lazyeval_0.2.3
##> [79] Hmisc_5.2-5 tools_4.6.0 data.table_1.18.2.1
##> [82] GenomicAlignments_1.49.0 XML_3.99-0.23 grid_4.6.0
##> [85] tidyr_1.3.2 AnnotationDbi_1.75.0 colorspace_2.1-2
##> [88] restfulr_0.0.16 htmlTable_2.5.0 Formula_1.2-5
##> [91] cli_3.6.6 rappdirs_0.3.4 S4Arrays_1.13.0
##> [94] dplyr_1.2.1 AnnotationFilter_1.37.0 gtable_0.3.6
##> [97] sass_0.4.10 digest_0.6.39 SparseArray_1.13.0
##> [100] rjson_0.2.23 htmlwidgets_1.6.4 farver_2.1.2
##> [103] memoise_2.0.1 htmltools_0.5.9 lifecycle_1.0.5
##> [106] httr_1.4.8 bit64_4.8.0