0.1 Overview

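This article combines the two final benchmark runs behind the current BamScale benchmark summary: run_20260320_133141 (step1 and galignments workloads) and run_20260320_162359 (seqqual workload).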

Both runs were generated with the same server-first benchmark harness, the same balanced profile family, the same deterministic case order, and the same worker/thread budget policy. The seqqual workload is reported separately because it includes two tracks: the fair compatibility track and the optimized compact track.

0.2 Data Loading

0.3 Benchmark Provenance

| Run | Workloads | Profile | CPU | Logical cores | RAM (GB) | Successful cases |
| --- | --- | --- | --- | --- | --- | --- |
| run_20260320_133141 | step1, galignments | balanced | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz | 96 | 723.6 | 32 |
| run_20260320_162359 | seqqual | balanced | Intel(R) Xeon(R) Gold 6252 CPU @ 2.10GHz | 96 | 723.6 | 26 |

0.4 Methods Rationale

This benchmark suite covers three distinct access patterns, one per workload: step1, galignments, and seqqual.

Across all runs, the benchmark design emphasizes:

0.5 Input Files

The same four BAMs were used for both runs. Files were repeated to populate the 12-file multi-file benchmark set.

| file | source | selected_for_single | selected_for_multi | size_mb | has_index |
| --- | --- | --- | --- | --- | --- |
| /home/chiragp/.cache/R/ExperimentHub/134ab745547e62_2073 | chipseqDBData | TRUE | TRUE | 548.3 | TRUE |
| /home/chiragp/.cache/R/ExperimentHub/134ab7270c5da5_2072 | chipseqDBData | FALSE | TRUE | 320.4 | TRUE |
| /home/chiragp/.cache/R/ExperimentHub/134ab721bfa655_2071 | chipseqDBData | FALSE | TRUE | 305.2 | TRUE |
| /home/chiragp/.cache/R/ExperimentHub/134ab7231ef47d_2074 | chipseqDBData | FALSE | TRUE | 227.3 | TRUE |
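One way the 12-file set could be populated from four BAMs is to cycle through them until 12 entries exist. The sketch below illustrates that idea only. The cycling order is an assumption (the harness's actual selection logic is not shown here), and the short labels are just the trailing path suffixes from the table above.

```r
# Sizes (MB) of the four selected BAMs, copied from the input-file table.
# Names are the trailing path suffixes, used here only as short labels.
size_mb <- c("2073" = 548.3, "2072" = 320.4, "2071" = 305.2, "2074" = 227.3)

# Assumption: the multi-file set cycles through the four files until it has 12.
multi_set <- rep(names(size_mb), length.out = 12)

# How often each file appears, and the implied total size of the set.
print(table(multi_set))
total_mb <- sum(size_mb[multi_set])
print(total_mb)
```

Under this assumption each file appears three times and the total comes to about 4203.6 MB, close to the 4203.4 MB reported for the multi-file scenario. The small gap is consistent with rounding of the per-file sizes.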

0.6 Reference Counts

| run | scenario | workload | n_files | n_records | total_mb |
| --- | --- | --- | --- | --- | --- |
| run_20260320_133141 | multi | galignments | 12 | 77543925 | 4203.4 |
| run_20260320_133141 | multi | step1 | 12 | 132482148 | 4203.4 |
| run_20260320_133141 | single | galignments | 1 | 4670364 | 548.3 |
| run_20260320_133141 | single | step1 | 1 | 16675372 | 548.3 |
| run_20260320_162359 | multi | seqqual | 12 | 132482148 | 4203.4 |
| run_20260320_162359 | single | seqqual | 1 | 16675372 | 548.3 |

0.7 Best-Observed Summary

| Scenario | Workload | Track | Method family | Method | Plan | Median time (s) |
| --- | --- | --- | --- | --- | --- | --- |
| multi | galignments | fair | BamScale | BamScale (balanced budget) | 1x48 | 61.204 |
| multi | galignments | fair | GenomicAlignments | GenomicAlignments::readGAlignments + BiocParallel | 12x1 | 90.777 |
| multi | seqqual | fair | BamScale | BamScale (balanced budget) | 1x48 | 181.277 |
| multi | seqqual | fair | Rsamtools | Rsamtools::scanBam + BiocParallel | 12x1 | 134.667 |
| multi | seqqual | optimized | BamScale | BamScale (compact seqqual budget) | 12x4 | 175.451 |
| multi | step1 | fair | BamScale | BamScale (balanced budget) | 1x48 | 68.469 |
| multi | step1 | fair | Rsamtools | Rsamtools::scanBam + BiocParallel | 12x1 | 107.882 |
| single | galignments | fair | BamScale | BamScale | 1x48 | 4.047 |
| single | galignments | fair | GenomicAlignments | GenomicAlignments::readGAlignments | 1x1 | 10.568 |
| single | seqqual | fair | BamScale | BamScale | 1x48 | 26.295 |
| single | seqqual | fair | Rsamtools | Rsamtools::scanBam | 1x1 | 24.327 |
| single | seqqual | optimized | BamScale | BamScale (compact seqqual) | 1x48 | 15.026 |
| single | step1 | fair | BamScale | BamScale | 1x48 | 8.035 |
| single | step1 | fair | Rsamtools | Rsamtools::scanBam | 1x1 | 15.403 |
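Because the workloads return different record counts, it can help to normalize the single-file medians to records per second. The following is a minimal sketch that combines the n_records values from the Reference Counts table with the fair-track single-file medians above. The data frame and column names are illustrative, not part of the benchmark harness.

```r
# Fair-track, single-file medians (s) and record counts from the tables above.
single <- data.frame(
  workload     = c("galignments", "step1", "seqqual"),
  n_records    = c(4670364, 16675372, 16675372),
  bamscale_s   = c(4.047, 8.035, 26.295),
  comparator_s = c(10.568, 15.403, 24.327)
)

# Throughput in records per second, rounded to whole records.
single$bamscale_rec_per_s   <- round(single$n_records / single$bamscale_s)
single$comparator_rec_per_s <- round(single$n_records / single$comparator_s)

print(single)
```

Throughput ranks the methods the same way as the raw times within each workload. It only adds a scale that is comparable across workloads with different record counts.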

0.8 BamScale-versus-Comparator Fold Change

| Scenario | Workload | Track | Comparator family | Comparator method | Comparator plan | Comparator time (s) | BamScale method | BamScale plan | BamScale time (s) | Comparator/BamScale | BamScale % faster |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| single | galignments | fair | GenomicAlignments | GenomicAlignments::readGAlignments | 1x1 | 10.568 | BamScale | 1x48 | 4.047 | 2.611 | 61.705 |
| single | step1 | fair | Rsamtools | Rsamtools::scanBam | 1x1 | 15.403 | BamScale | 1x48 | 8.035 | 1.917 | 47.835 |
| multi | galignments | fair | GenomicAlignments | GenomicAlignments::readGAlignments + BiocParallel | 12x1 | 90.777 | BamScale (balanced budget) | 1x48 | 61.204 | 1.483 | 32.578 |
| multi | step1 | fair | Rsamtools | Rsamtools::scanBam + BiocParallel | 12x1 | 107.882 | BamScale (balanced budget) | 1x48 | 68.469 | 1.576 | 36.534 |
| single | seqqual | fair | Rsamtools | Rsamtools::scanBam | 1x1 | 24.327 | BamScale | 1x48 | 26.295 | 0.925 | -8.088 |
| single | seqqual | optimized | Rsamtools | Rsamtools::scanBam | 1x1 | 24.327 | BamScale (compact seqqual) | 1x48 | 15.026 | 1.619 | 38.232 |
| multi | seqqual | fair | Rsamtools | Rsamtools::scanBam + BiocParallel | 12x1 | 134.667 | BamScale (balanced budget) | 1x48 | 181.277 | 0.743 | -34.611 |
| multi | seqqual | optimized | Rsamtools | Rsamtools::scanBam + BiocParallel | 12x1 | 134.667 | BamScale (compact seqqual budget) | 12x4 | 175.451 | 0.768 | -30.285 |
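The two derived columns can be reproduced from the median times. The sketch below assumes that "Comparator/BamScale" is the ratio of comparator time to BamScale time and that "BamScale % faster" is the relative time saved, 100 * (1 - BamScale / comparator). These definitions are inferred from the tabulated values rather than taken from the harness source. Because the inputs here are already rounded to three decimals, the last digit can differ slightly from the table.

```r
# Median times (s) copied from the fold-change table above.
fc <- data.frame(
  scenario     = c("single", "single", "multi", "multi",
                   "single", "single", "multi", "multi"),
  workload     = c("galignments", "step1", "galignments", "step1",
                   "seqqual", "seqqual", "seqqual", "seqqual"),
  track        = c("fair", "fair", "fair", "fair",
                   "fair", "optimized", "fair", "optimized"),
  comparator_s = c(10.568, 15.403, 90.777, 107.882,
                   24.327, 24.327, 134.667, 134.667),
  bamscale_s   = c(4.047, 8.035, 61.204, 68.469,
                   26.295, 15.026, 181.277, 175.451)
)

# Ratio above 1 means BamScale is faster; below 1 means the comparator is faster.
fc$ratio <- fc$comparator_s / fc$bamscale_s

# Relative time saved by BamScale; negative values mean BamScale is slower.
fc$pct_faster <- 100 * (1 - fc$bamscale_s / fc$comparator_s)

print(transform(fc, ratio = round(ratio, 3), pct_faster = round(pct_faster, 3)))
```

Read this way, a ratio of 0.743 for the fair multi-file seqqual case means BamScale took about 35% longer than the comparator baseline would suggest as break-even, which is why that row shows a negative "% faster" value.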

0.9 Single-File step1

| Method family | Method | Plan | Median time (s) |
| --- | --- | --- | --- |
| BamScale | BamScale | 1x48 | 8.0350 |
| BamScale | BamScale | 1x24 | 8.9795 |
| BamScale | BamScale | 1x12 | 9.3775 |
| BamScale | BamScale | 1x4 | 11.0565 |
| Rsamtools | Rsamtools::scanBam | 1x1 | 15.4030 |
| BamScale | BamScale | 1x1 | 16.3500 |
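The thread-scaling behavior in this table can be summarized as speedup and parallel efficiency relative to the single-thread BamScale run. This is a minimal sketch, assuming that a plan of the form 1xN means one worker using N threads. That reading of the plan notation is an assumption, not something stated in the table.

```r
# BamScale single-file step1 medians (s) from the table above, by thread count.
# Assumption: plan "1xN" denotes one worker with N threads.
threads  <- c(1, 4, 12, 24, 48)
median_s <- c(16.3500, 11.0565, 9.3775, 8.9795, 8.0350)

# Speedup relative to the 1x1 run, and efficiency as speedup per thread.
speedup    <- median_s[1] / median_s
efficiency <- speedup / threads

scaling <- data.frame(
  threads,
  median_s,
  speedup    = round(speedup, 2),
  efficiency = round(efficiency, 3)
)
print(scaling)
```

On these numbers the time keeps falling as threads are added, but the gain from 1 to 48 threads is roughly twofold, so most of the benefit arrives at low thread counts.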

0.10 Single-File galignments

| Method family | Method | Plan | Median time (s) |
| --- | --- | --- | --- |
| BamScale | BamScale | 1x48 | 4.0470 |
| BamScale | BamScale | 1x24 | 4.1920 |
| BamScale | BamScale | 1x12 | 4.7585 |
| BamScale | BamScale | 1x4 | 6.6230 |
| GenomicAlignments | GenomicAlignments::readGAlignments | 1x1 | 10.5680 |
| BamScale | BamScale | 1x1 | 11.0405 |

0.11 Multi-File step1

0.12 Multi-File galignments

0.13 Single-File seqqual

| Method | Plan | Median time (s) |
| --- | --- | --- |
| BamScale (compact) | 1x48 | 15.0265 |
| BamScale (compact) | 1x24 | 16.4440 |
| BamScale (compact) | 1x12 | 17.0985 |
| BamScale (compact) | 1x4 | 19.8780 |
| Rsamtools::scanBam | 1x1 | 24.3275 |
| BamScale (compact) | 1x1 | 25.5300 |
| BamScale (compatible) | 1x48 | 26.2950 |
| BamScale (compatible) | 1x12 | 27.6200 |
| BamScale (compatible) | 1x24 | 28.3390 |
| BamScale (compatible) | 1x4 | 31.6780 |
| BamScale (compatible) | 1x1 | 35.6830 |

0.14 Multi-File seqqual

0.15 Compact-versus-Compatible seqqual

| Scenario | Workload | Plan | Compatible time (s) | Compact time (s) | Compatible/Compact |
| --- | --- | --- | --- | --- | --- |
| multi | seqqual | 1x48 | 181.277 | 186.321 | 0.973 |
| multi | seqqual | 2x24 | 573.752 | 337.596 | 1.700 |
| multi | seqqual | 4x12 | 404.183 | 241.524 | 1.673 |
| multi | seqqual | 8x6 | 361.750 | 210.290 | 1.720 |
| multi | seqqual | 12x4 | 270.889 | 175.451 | 1.544 |
| single | seqqual | 1x1 | 35.683 | 25.530 | 1.398 |
| single | seqqual | 1x4 | 31.678 | 19.878 | 1.594 |
| single | seqqual | 1x12 | 27.620 | 17.098 | 1.615 |
| single | seqqual | 1x24 | 28.339 | 16.444 | 1.723 |
| single | seqqual | 1x48 | 26.295 | 15.026 | 1.750 |
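The multi-file rows can be checked in two ways: that every plan spends the same total budget, and which plan is fastest per track. The sketch below assumes a plan string "AxB" means A workers with B threads each. This interpretation is inferred from the worker/thread budget policy mentioned in the overview and is not confirmed by the table itself.

```r
# Multi-file seqqual plans and median times (s) from the table above.
plans        <- c("1x48", "2x24", "4x12", "8x6", "12x4")
compatible_s <- c(181.277, 573.752, 404.183, 361.750, 270.889)
compact_s    <- c(186.321, 337.596, 241.524, 210.290, 175.451)

# Assumption: "AxB" is A workers times B threads per worker.
parts   <- do.call(rbind, strsplit(plans, "x", fixed = TRUE))
workers <- as.integer(parts[, 1])
threads <- as.integer(parts[, 2])
budget  <- workers * threads

# Compatible/Compact ratio; above 1 means the compact track is faster.
ratio <- compatible_s / compact_s

multi <- data.frame(plans, workers, threads, budget, ratio = round(ratio, 3))
print(multi)

# Fastest plan within each track.
best_compatible <- plans[which.min(compatible_s)]
best_compact    <- plans[which.min(compact_s)]
print(c(compatible = best_compatible, compact = best_compact))
```

Under this reading every multi-file plan uses a budget of 48, so the plans differ only in how that budget is split. The compatible track is fastest at 1x48 and the compact track at 12x4, which matches the plans listed for those tracks in the best-observed summary.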

0.16 Interpretation

The benchmark shows a consistent pattern across the final runs:

In practical terms, these results support the following guidance:

0.17 Session Information

## R version 4.6.0 RC (2026-04-17 r89917)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB              LC_COLLATE=C              
##  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/New_York
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] scales_1.4.0     ggplot2_4.0.3    tidyr_1.3.2      dplyr_1.2.1     
## [5] readr_2.2.0      BamScale_0.99.9  BiocStyle_2.40.0
## 
## loaded via a namespace (and not attached):
##  [1] SummarizedExperiment_1.42.0 gtable_0.3.6               
##  [3] xfun_0.57                   bslib_0.10.0               
##  [5] Biobase_2.72.0              lattice_0.22-9             
##  [7] tzdb_0.5.0                  vctrs_0.7.3                
##  [9] tools_4.6.0                 bitops_1.0-9               
## [11] generics_0.1.4              stats4_4.6.0               
## [13] parallel_4.6.0              tibble_3.3.1               
## [15] pkgconfig_2.0.3             Matrix_1.7-5               
## [17] RColorBrewer_1.1-3          S7_0.2.2                   
## [19] S4Vectors_0.50.0            cigarillo_1.2.0            
## [21] lifecycle_1.0.5             compiler_4.6.0             
## [23] farver_2.1.2                Rsamtools_2.28.0           
## [25] Biostrings_2.80.0           tinytex_0.59               
## [27] Seqinfo_1.2.0               codetools_0.2-20           
## [29] htmltools_0.5.9             sass_0.4.10                
## [31] yaml_2.3.12                 pillar_1.11.1              
## [33] crayon_1.5.3                jquerylib_0.1.4            
## [35] BiocParallel_1.46.0         DelayedArray_0.38.1        
## [37] cachem_1.1.0                magick_2.9.1               
## [39] abind_1.4-8                 ompBAM_1.16.0              
## [41] tidyselect_1.2.1            digest_0.6.39              
## [43] purrr_1.2.2                 bookdown_0.46              
## [45] labeling_0.4.3              fastmap_1.2.0              
## [47] grid_4.6.0                  cli_3.6.6                  
## [49] SparseArray_1.12.2          magrittr_2.0.5             
## [51] S4Arrays_1.12.0             dichromat_2.0-0.1          
## [53] withr_3.0.2                 bit64_4.8.0                
## [55] rmarkdown_2.31              XVector_0.52.0             
## [57] matrixStats_1.5.0           bit_4.6.0                  
## [59] otel_0.2.0                  hms_1.1.4                  
## [61] evaluate_1.0.5              knitr_1.51                 
## [63] GenomicRanges_1.64.0        IRanges_2.46.0             
## [65] rlang_1.2.0                 Rcpp_1.1.1-1.1             
## [67] glue_1.8.1                  BiocManager_1.30.27        
## [69] BiocGenerics_0.58.0         vroom_1.7.1                
## [71] jsonlite_2.0.0              R6_2.6.1                   
## [73] MatrixGenerics_1.24.0       GenomicAlignments_1.48.0