1 Background

The BiocBuildReporter package provides access to years of Bioconductor build system data, representing a comprehensive record of package builds.

The Bioconductor build system runs regularly, testing all packages to ensure they meet quality standards and work correctly across different platforms. This dataset captures the results of these builds, including the outcome of each build stage, package metadata, and propagation status.

This vignette demonstrates how to use BiocBuildReporter functions to explore and analyze this rich dataset.

The Bioconductor build report logs are processed by BiocBuildDB, which produces the parquet files that this package reads for analysis.

2 Installation and Loading

BiocBuildReporter is a Bioconductor package and can be installed through BiocManager::install().

if (!"BiocManager" %in% rownames(installed.packages()))
     install.packages("BiocManager")
BiocManager::install("BiocBuildReporter", dependencies=TRUE)

After the package is installed, it can be loaded into the R workspace with:

library(BiocBuildReporter)

We will also load the libraries needed to run this vignette:

library(BiocBuildReporter)
library(dplyr)
library(ggplot2)
library(tidyr)

3 Accessing Bioconductor Build Report Data

3.1 Getting All Available Tables

The simplest way to start is to download all available data tables. The get_all_bbs_tables() function retrieves all three parquet files containing Bioconductor build data:

# Download all available tables
# This will cache the tables for quick subsequent access
get_all_bbs_tables()

The function downloads three tables:

  1. build_summary: Results of each build stage for every package
  2. info: Package metadata including version, maintainer, and git information
  3. propagation_status: Information about package propagation to the community

3.2 Getting Individual Tables

You can also retrieve individual tables using get_bbs_table():

# Get the build summary table
build_summary <- get_bbs_table("build_summary")

# Get the info table
info <- get_bbs_table("info")

# Get the propagation status table
propagation_status <- get_bbs_table("propagation_status")

Once downloaded, subsequent calls to these functions will use cached data, making analysis much faster.

3.3 Remote Read vs Local Download

The package can either download the files and read them from a local BiocFileCache, or read them remotely. By default it downloads the files and uses the cached copies for analysis. To read remotely, pass useLocal=FALSE to get_bbs_table() or get_all_bbs_tables().

info <- get_bbs_table("info", useLocal=FALSE)

Bioconductor devel reports are generated daily, so cached data can go out of date quickly. When using locally cached data, you can re-download the files with updateLocal=TRUE. By default the cache is not updated.

info <- get_bbs_table("info", useLocal=TRUE, updateLocal=TRUE)

4 Package-Specific Queries

This section shows how to use the helper functions that BiocBuildReporter provides for package-specific analysis.

4.1 Package Release Information

The get_package_release_info() function retrieves version and git information for a package across all Bioconductor releases:

# Get release information for BiocFileCache
bfc_releases <- get_package_release_info("BiocFileCache")
bfc_releases
#> # A tibble: 11 × 5
#>    Package       Version git_branch   git_last_commit git_last_commit_date
#>    <chr>         <chr>   <chr>        <chr>           <dttm>              
#>  1 BiocFileCache 2.0.0   RELEASE_3_13 280a8f9         2021-05-19 16:26:36 
#>  2 BiocFileCache 2.2.1   RELEASE_3_14 cc91212         2022-01-20 13:21:33 
#>  3 BiocFileCache 2.4.0   RELEASE_3_15 2c00eee         2022-04-26 15:39:42 
#>  4 BiocFileCache 2.6.1   RELEASE_3_16 fdeb0ad         2023-02-17 11:39:09 
#>  5 BiocFileCache 2.8.0   RELEASE_3_17 d088b32         2023-04-25 14:53:05 
#>  6 BiocFileCache 2.10.2  RELEASE_3_18 c95edcc         2024-03-27 16:42:13 
#>  7 BiocFileCache 2.12.0  RELEASE_3_19 a655653         2024-04-30 14:56:59 
#>  8 BiocFileCache 2.14.0  RELEASE_3_20 66862c5         2024-10-29 15:18:07 
#>  9 BiocFileCache 2.16.2  RELEASE_3_21 22fec96         2025-08-25 12:44:02 
#> 10 BiocFileCache 3.0.0   RELEASE_3_22 81fd6e0         2025-10-29 15:37:29 
#> 11 BiocFileCache 3.1.0   devel        c4f8ba6         2025-10-29 15:37:29

This shows:

  • Package versions across different Bioconductor releases
  • Git branches (devel, RELEASE_3_22, etc.)
  • Git commit hashes
  • Last commit dates

This is useful for tracking when a package was updated in different Bioconductor releases. It is also useful for mapping which git commit corresponds to the currently available version in a given Bioconductor release.
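
As a minimal sketch of the commit lookup described above, the following filters a release table down to one branch. The toy data frame copies three rows from the output shown earlier so the example runs offline; with real data you would pass the result of get_package_release_info() instead. The helper name commit_for_release is hypothetical and not part of BiocBuildReporter.

```r
library(dplyr)

# Toy copy of three rows from the output above (not fetched from the package)
releases <- data.frame(
  Package = "BiocFileCache",
  Version = c("2.14.0", "2.16.2", "3.0.0"),
  git_branch = c("RELEASE_3_20", "RELEASE_3_21", "RELEASE_3_22"),
  git_last_commit = c("66862c5", "22fec96", "81fd6e0")
)

# Hypothetical helper: return the commit hash recorded for one branch.
# .env$branch makes sure the function argument is used, not a column.
commit_for_release <- function(tbl, branch) {
  tbl |>
    filter(git_branch == .env$branch) |>
    pull(git_last_commit)
}

commit_for_release(releases, "RELEASE_3_21")
```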

4.2 Package Build Results

The get_package_build_results() function retrieves the latest build status for a package on all available builders for a given Bioconductor branch:

# Get build results for BiocFileCache on branch RELEASE_3_22
get_package_build_results("BiocFileCache", branch="RELEASE_3_22")
#> # A tibble: 22 × 9
#>    package       node      stage   version status endedat             git_branch
#>    <chr>         <chr>     <chr>   <chr>   <chr>  <dttm>              <chr>     
#>  1 BiocFileCache kjohnson1 buildb… 3.0.0   OK     2025-12-21 11:31:00 RELEASE_3…
#>  2 BiocFileCache kjohnson1 builds… 3.0.0   OK     2025-12-19 01:06:04 RELEASE_3…
#>  3 BiocFileCache kjohnson1 checks… 3.0.0   OK     2025-12-19 21:04:05 RELEASE_3…
#>  4 BiocFileCache kjohnson1 install 3.0.0   OK     2025-12-18 20:29:23 RELEASE_3…
#>  5 BiocFileCache kjohnson3 buildb… 3.0.0   OK     2025-11-11 04:12:43 RELEASE_3…
#>  6 BiocFileCache kjohnson3 builds… 3.0.0   OK     2025-11-10 20:48:12 RELEASE_3…
#>  7 BiocFileCache kjohnson3 checks… 3.0.0   OK     2025-11-10 23:57:58 RELEASE_3…
#>  8 BiocFileCache kjohnson3 install 3.0.0   OK     2025-11-10 20:05:16 RELEASE_3…
#>  9 BiocFileCache lconway   buildb… 3.0.0   OK     2025-11-04 06:29:51 RELEASE_3…
#> 10 BiocFileCache lconway   builds… 3.0.0   OK     2025-11-03 21:17:15 RELEASE_3…
#> # ℹ 12 more rows
#> # ℹ 2 more variables: git_last_commit <chr>, git_last_commit_date <dttm>

This shows:

  • Node (builder machine name)
  • Build stage (install, buildsrc, checksrc, buildbin)
  • Package version
  • Build status (OK, WARNINGS, ERROR, TIMEOUT)
  • Date of completion
  • Git branch
  • Git commit hashes
  • Last commit dates

This is useful for showing the last known status of a package on a given release for all active builders.

4.3 Package Error Counts

The package_error_count() function provides statistics on how often a package has failed during builds:

# Get error counts for BiocFileCache
bfc_errors <- package_error_count("BiocFileCache")
bfc_errors
#> # A tibble: 408 × 6
#>    node      version    stage    count_total count_error git_branch  
#>    <chr>     <pckg_vrs> <chr>          <int>       <int> <chr>       
#>  1 machv2    2.0.0      buildbin           1           0 RELEASE_3_13
#>  2 machv2    2.0.0      buildsrc           1           0 RELEASE_3_13
#>  3 machv2    2.0.0      checksrc           1           0 RELEASE_3_13
#>  4 machv2    2.0.0      install            1           0 RELEASE_3_13
#>  5 nebbiolo1 2.0.0      buildsrc           1           0 RELEASE_3_13
#>  6 nebbiolo1 2.0.0      checksrc           1           0 RELEASE_3_13
#>  7 nebbiolo1 2.0.0      install            1           0 RELEASE_3_13
#>  8 tokay2    2.0.0      buildbin           1           0 RELEASE_3_13
#>  9 tokay2    2.0.0      buildsrc           1           0 RELEASE_3_13
#> 10 tokay2    2.0.0      checksrc           1           0 RELEASE_3_13
#> # ℹ 398 more rows

# Filter to a specific branch
bfc_errors_release <- package_error_count("BiocFileCache", branch = "RELEASE_3_22")
bfc_errors_release
#> # A tibble: 22 × 6
#>    node      version    stage    count_total count_error git_branch  
#>    <chr>     <pckg_vrs> <chr>          <int>       <int> <chr>       
#>  1 kjohnson1 3.0.0      buildbin           6           0 RELEASE_3_22
#>  2 kjohnson1 3.0.0      buildsrc           8           2 RELEASE_3_22
#>  3 kjohnson1 3.0.0      checksrc           6           2 RELEASE_3_22
#>  4 kjohnson1 3.0.0      install            8           0 RELEASE_3_22
#>  5 kjohnson3 3.0.0      buildbin           8           0 RELEASE_3_22
#>  6 kjohnson3 3.0.0      buildsrc          10           2 RELEASE_3_22
#>  7 kjohnson3 3.0.0      checksrc           8           0 RELEASE_3_22
#>  8 kjohnson3 3.0.0      install           10           0 RELEASE_3_22
#>  9 lconway   3.0.0      buildbin           4           0 RELEASE_3_22
#> 10 lconway   3.0.0      buildsrc           5           1 RELEASE_3_22
#> # ℹ 12 more rows

# Filter to a specific builder
bfc_errors_builder <- package_error_count("BiocFileCache", 
                                          builder = "nebbiolo2", 
                                          branch = "RELEASE_3_22")
bfc_errors_builder
#> # A tibble: 3 × 6
#>   node      version    stage    count_total count_error git_branch  
#>   <chr>     <pckg_vrs> <chr>          <int>       <int> <chr>       
#> 1 nebbiolo2 3.0.0      buildsrc          60           9 RELEASE_3_22
#> 2 nebbiolo2 3.0.0      checksrc          51           7 RELEASE_3_22
#> 3 nebbiolo2 3.0.0      install           60           0 RELEASE_3_22

This returns:

  • Node (builder machine name)
  • Package version
  • Build stage (install, buildsrc, checksrc, buildbin)
  • Total number of runs
  • Total number of errors
  • Git branch

For the devel branch, you can filter to the most recent version:

# Get devel errors
dev_errors <- package_error_count("BiocFileCache", branch = "devel")

# Filter to current devel version
dev_errors |> filter(version == max(version))
#> # A tibble: 11 × 6
#>    node      version    stage    count_total count_error git_branch
#>    <chr>     <pckg_vrs> <chr>          <int>       <int> <chr>     
#>  1 kjohnson3 3.1.0      buildbin          56           0 devel     
#>  2 kjohnson3 3.1.0      buildsrc          69          13 devel     
#>  3 kjohnson3 3.1.0      checksrc          56           2 devel     
#>  4 kjohnson3 3.1.0      install           69           0 devel     
#>  5 lconway   3.1.0      buildbin           9           0 devel     
#>  6 lconway   3.1.0      buildsrc          11           2 devel     
#>  7 lconway   3.1.0      checksrc           9           1 devel     
#>  8 lconway   3.1.0      install           11           0 devel     
#>  9 nebbiolo1 3.1.0      buildsrc         108          14 devel     
#> 10 nebbiolo1 3.1.0      checksrc          94           8 devel     
#> 11 nebbiolo1 3.1.0      install          108           0 devel

This is useful for comparing how often a package failed during a given stage with how often it attempted that stage. It gives an overview of the frequency of failures.
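
The comparison of failures to attempts can be made explicit by dividing count_error by count_total. The sketch below does this on a toy copy of the nebbiolo2 rows shown above, so it runs without downloading anything; the column name error_pct is introduced here for illustration only.

```r
library(dplyr)

# Toy copy of the nebbiolo2 rows shown above
errors <- data.frame(
  node = "nebbiolo2",
  stage = c("buildsrc", "checksrc", "install"),
  count_total = c(60L, 51L, 60L),
  count_error = c(9L, 7L, 0L)
)

# Share of runs that failed at each stage, as a percentage
error_rates <- errors |>
  mutate(error_pct = round(100 * count_error / count_total, 1)) |>
  arrange(desc(error_pct))

error_rates
```

The same mutate() call should work on the tibble returned by package_error_count(), since it only relies on the count_total and count_error columns.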

4.4 Package Failures Over Time

The package_failures_over_time() function gives an overview of how long a package has been failing on a given builder:

# Get failure events for BiocFileCache on nebbiolo1 and
# group events in a 24 hour period
package_failures_over_time("BiocFileCache", "nebbiolo1", 24)
#> # A tibble: 21 × 7
#>    version    episode first_failure       last_failure        n_failures stages 
#>    <pckg_vrs>   <int> <dttm>              <dttm>                   <int> <chr>  
#>  1 3.1.0           12 2025-12-19 02:17:53 2025-12-19 02:17:53          1 checks…
#>  2 3.1.0           11 2025-12-15 02:57:57 2025-12-16 21:29:20          3 builds…
#>  3 3.1.0           10 2025-12-13 02:32:33 2025-12-13 02:32:33          1 checks…
#>  4 3.1.0            9 2025-12-08 21:30:14 2025-12-11 21:27:29          4 builds…
#>  5 3.1.0            8 2025-12-05 21:26:25 2025-12-05 21:26:25          1 builds…
#>  6 3.1.0            7 2025-12-01 21:30:34 2025-12-03 21:39:35          3 builds…
#>  7 3.1.0            6 2025-11-25 21:30:29 2025-11-26 21:28:08          2 builds…
#>  8 3.1.0            5 2025-11-22 02:27:43 2025-11-22 02:27:43          1 checks…
#>  9 3.1.0            4 2025-11-18 02:32:56 2025-11-18 02:32:56          1 checks…
#> 10 3.1.0            3 2025-11-15 02:26:25 2025-11-15 02:26:25          1 checks…
#> # ℹ 11 more rows
#> # ℹ 1 more variable: statuses <chr>

This shows:

  • Package version
  • Sequential number of the failure episode
  • Time of the first failure in the episode
  • Time of the last failure in the episode
  • Number of failures during that episode
  • Stages that failed
  • Statuses of those stages

This is useful for tracking the length of failure events to determine whether a failure is intermittent or consistent. The grouping window is given as an argument; in this example we used 24 hours. The window accounts for branches having different build cadences and allows sequential builds to be grouped together.
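
One way to picture the grouping window is the sketch below. It is an assumption about how such grouping could work, not the package's actual implementation: a new episode starts whenever the gap since the previous failure exceeds the window. The function group_episodes and the timestamps are hypothetical.

```r
# Hypothetical failure timestamps (UTC), not taken from the build data
failures <- as.POSIXct(
  c("2025-12-08 21:30:00", "2025-12-09 21:28:00",
    "2025-12-10 21:31:00", "2025-12-13 02:32:00"),
  tz = "UTC"
)

# Assign an episode number to each failure: the counter increases
# whenever the gap to the previous failure exceeds window_hours
group_episodes <- function(times, window_hours = 24) {
  times <- sort(times)
  gap_hours <- as.numeric(diff(times), units = "hours")
  cumsum(c(TRUE, gap_hours > window_hours))
}

group_episodes(failures, 24)
group_episodes(failures, 48)
```

With a 24 hour window, the third failure lands a few minutes more than a full day after the second and therefore starts a new episode. Widening the window to 48 hours keeps the first three failures together, which illustrates why the window should match the build cadence of the branch.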

5 Exploratory Data Analysis

This section provides examples of utilizing the report tables for broader analysis and queries.

5.1 Package Growth Over Time

Let’s explore how the number of Bioconductor packages has grown over time:

# Get info table
info <- get_bbs_table("info")

# Count unique packages by branch
package_counts <- info |>
  group_by(git_branch) |>
  summarise(
    n_packages = n_distinct(Package),
    .groups = "drop"
  ) |>
  arrange(desc(n_packages))

# Display the counts
package_counts
#> # A tibble: 12 × 2
#>    git_branch   n_packages
#>    <chr>             <int>
#>  1 devel              3060
#>  2 RELEASE_3_22       2885
#>  3 RELEASE_3_21       2859
#>  4 RELEASE_3_19       2816
#>  5 RELEASE_3_20       2807
#>  6 RELEASE_3_18       2778
#>  7 RELEASE_3_17       2723
#>  8 RELEASE_3_16       2670
#>  9 RELEASE_3_15       2622
#> 10 RELEASE_3_14       2528
#> 11 RELEASE_3_13       2475
#> 12 master               33

# Visualize package counts by branch
ggplot(package_counts, aes(x = reorder(git_branch, n_packages), y = n_packages)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(
    title = "Number of Packages by Bioconductor Branch",
    x = "Branch",
    y = "Number of Packages"
  ) +
  theme_minimal()
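
Ordering by package count is not the same as ordering by release: in the counts above, RELEASE_3_19 has more packages than RELEASE_3_20. To look at growth chronologically, one option is to sort by the release number parsed from git_branch. The sketch below runs on a toy copy of four rows and assumes every release branch is named RELEASE_3_<minor>; any other branch (devel, master) is placed last. The column name minor is introduced here for illustration.

```r
library(dplyr)

# Toy copy of four rows from the counts above
package_counts <- data.frame(
  git_branch = c("devel", "RELEASE_3_22", "RELEASE_3_19", "RELEASE_3_20"),
  n_packages = c(3060L, 2885L, 2816L, 2807L)
)

# Extract the minor release number; non-release branches sort last (Inf)
by_release <- package_counts |>
  mutate(
    minor = if_else(
      grepl("^RELEASE_3_", git_branch),
      suppressWarnings(as.numeric(sub("^RELEASE_3_", "", git_branch))),
      Inf
    )
  ) |>
  arrange(minor)

by_release$git_branch
```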

5.2 Build Status Distribution

Understanding the distribution of build statuses helps identify overall system health:

# Get build summary table
build_summary <- get_bbs_table("build_summary")

# Count build statuses
status_counts <- build_summary |>
  count(status) |>
  arrange(desc(n))

status_counts
#> # A tibble: 4 × 2
#>   status          n
#>   <chr>       <int>
#> 1 OK       13091938
#> 2 WARNINGS   641941
#> 3 ERROR      417673
#> 4 TIMEOUT     17687

# Visualize status distribution
ggplot(status_counts, aes(x = reorder(status, n), y = n)) +
  geom_col(aes(fill = status)) +
  # Names must match the status values in the data ("WARNINGS", not "WARNING")
  scale_fill_manual(values = c(
    "OK" = "green3",
    "WARNINGS" = "orange",
    "ERROR" = "red",
    "TIMEOUT" = "darkred"
  )) +
  coord_flip() +
  labs(
    title = "Distribution of Build Statuses",
    x = "Status",
    y = "Count"
  ) +
  theme_minimal() +
  theme(legend.position = "none")

5.3 Platform-Specific Analysis

Different platforms may have different build characteristics:

# Analyze build status by platform (node)
platform_status <- build_summary |>
  group_by(node, status) |>
  summarise(count = n(), .groups = "drop") |>
  group_by(node) |>
  mutate(
    total = sum(count),
    percentage = count / total * 100
  ) |>
  ungroup()

# Show error rates by platform
error_rates <- platform_status |>
  filter(status %in% c("ERROR", "TIMEOUT")) |>
  group_by(node) |>
  summarise(
    error_count = sum(count),
    total = first(total),
    error_rate = sum(percentage),
    .groups = "drop"
  ) |>
  arrange(desc(error_rate))

head(error_rates, 10)
#> # A tibble: 10 × 4
#>    node      error_count   total error_rate
#>    <chr>           <int>   <int>      <dbl>
#>  1 kakapo1           144     953      15.1 
#>  2 riesling1          12      88      13.6 
#>  3 amarone           226    1679      13.5 
#>  4 biocgpu           272    2808       9.69
#>  5 taishan         55374  944243       5.86
#>  6 kunpeng2        50358  907145       5.55
#>  7 kjohnson3       88229 2437076       3.62
#>  8 nebbiolo1       64831 2106213       3.08
#>  9 kjohnson2         251    8628       2.91
#> 10 palomino3       11216  436957       2.57
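
The highest error rates above belong to nodes with very few builds, where a handful of failures moves the rate a lot. A possible refinement, sketched here on a toy copy of five rows, is to keep only nodes above a minimum build volume before ranking. The cutoff min_builds is a hypothetical value chosen for illustration.

```r
library(dplyr)

# Toy copy of five rows from the error rates above
error_rates <- data.frame(
  node = c("kakapo1", "riesling1", "taishan", "kjohnson3", "nebbiolo1"),
  error_count = c(144L, 12L, 55374L, 88229L, 64831L),
  total = c(953L, 88L, 944243L, 2437076L, 2106213L)
)

# Hypothetical cutoff; choose one that suits your analysis
min_builds <- 10000

# Drop low-volume nodes, then recompute and rank the error rate
busy_nodes <- error_rates |>
  filter(total >= min_builds) |>
  mutate(error_rate = 100 * error_count / total) |>
  arrange(desc(error_rate))

busy_nodes
```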

5.4 Build Stage Analysis

Identify which build stage fails most often:

# Analyze failures by stage
stage_failures <- build_summary |>
  filter(status %in% c("ERROR", "TIMEOUT")) |>
  count(stage, status) |>
  arrange(desc(n))

stage_failures
#> # A tibble: 8 × 3
#>   stage    status       n
#>   <chr>    <chr>    <int>
#> 1 buildsrc ERROR   219184
#> 2 checksrc ERROR   116956
#> 3 install  ERROR    79831
#> 4 checksrc TIMEOUT   9484
#> 5 buildsrc TIMEOUT   8114
#> 6 buildbin ERROR     1702
#> 7 buildbin TIMEOUT     57
#> 8 install  TIMEOUT     32

# Visualize
ggplot(stage_failures, aes(x = stage, y = n, fill = status)) +
  geom_col() +
  scale_fill_manual(values = c("ERROR" = "red", "TIMEOUT" = "darkred")) +
  labs(
    title = "Build Failures by Stage",
    x = "Build Stage",
    y = "Number of Failures",
    fill = "Status"
  ) +
  theme_minimal()

5.5 Most Problematic Packages

Identify packages with the highest error counts:

# Find packages with most errors
package_errors <- build_summary |>
  filter(status %in% c("ERROR", "TIMEOUT")) |>
  count(package, status) |>
  group_by(package) |>
  summarise(
    total_errors = sum(n),
    .groups = "drop"
  ) |>
  arrange(desc(total_errors))

# Top 10 packages with most errors
head(package_errors, 10)
#> # A tibble: 10 × 2
#>    package    total_errors
#>    <chr>             <int>
#>  1 lapmix             2214
#>  2 netZooR            1751
#>  3 hypeR              1689
#>  4 ChemmineOB         1594
#>  5 XNAString          1526
#>  6 Repitools          1514
#>  7 ccfindR            1480
#>  8 gpuMagic           1468
#>  9 slalom             1399
#> 10 Harshlight         1395
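
The ranking above uses raw counts, so packages that have been built for more releases or on more builders tend to rank higher. A rate per package can be computed with the same pattern used for the monthly error rate in the Temporal Analysis section of this vignette. The sketch below uses hypothetical build records with placeholder package names so that it runs offline.

```r
library(dplyr)

# Hypothetical build records; package names are placeholders
builds <- data.frame(
  package = c("pkgA", "pkgA", "pkgA", "pkgA", "pkgB", "pkgB"),
  status  = c("OK", "ERROR", "OK", "OK", "ERROR", "TIMEOUT")
)

# Failures as a percentage of all recorded build stages per package
package_rates <- builds |>
  group_by(package) |>
  summarise(
    total = n(),
    errors = sum(status %in% c("ERROR", "TIMEOUT")),
    error_rate = 100 * errors / total,
    .groups = "drop"
  ) |>
  arrange(desc(error_rate))

package_rates
```

Replacing builds with the build_summary table should give the equivalent per-package rates, since only the package and status columns are used.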

5.6 Maintainer Analysis

Analyze package maintenance patterns:

# Get unique packages per maintainer
maintainer_packages <- info |>
  group_by(Maintainer) |>
  summarise(
    n_packages = n_distinct(Package),
    packages = paste(unique(Package), collapse = ", "),
    .groups = "drop"
  ) |>
  arrange(desc(n_packages))

# Top maintainers by number of packages
head(maintainer_packages, 10)
#> # A tibble: 10 × 3
#>    Maintainer                      n_packages packages                          
#>    <chr>                                <int> <chr>                             
#>  1 Bioconductor Package Maintainer         72 GSE62944, hgu2beta7, TENxBrainDat…
#>  2 Aaron Lun                               62 celldex, chipseqDBData, DropletTe…
#>  3 Hervé Pagès                             25 pasillaBamSubset, RNAseqData.HNRN…
#>  4 VJ Carey                                25 harbChIP, leeBamViews, MAQCsubset…
#>  5 Laurent Gatto                           24 depmap, RforProteomics, CTdata, h…
#>  6 Marcel Ramos                            24 curatedTCGAData, SingleCellMultiM…
#>  7 Michael Love                            20 airway, fission, macrophage, null…
#>  8 Jianhong Ou                             17 ATACseqQC, ATACseqTFEA, ChIPpeakA…
#>  9 Guangchuang Yu                          16 ChIPseeker, clusterProfiler, DOSE…
#> 10 Mike Smith                              16 BeadArrayUseCases, HD2013SGI, min…

# Distribution of packages per maintainer
ggplot(maintainer_packages, aes(x = n_packages)) +
  geom_histogram(binwidth = 1, fill = "steelblue", color = "white") +
  labs(
    title = "Distribution of Packages per Maintainer",
    x = "Number of Packages",
    y = "Number of Maintainers"
  ) +
  theme_minimal()

5.7 Temporal Analysis

Analyze build patterns over time:

# Analyze build patterns over time
build_summary <- build_summary |>
  mutate(
    date = as.Date(startedat),
    month = format(startedat, "%Y-%m")
  )

# Build activity by month
monthly_builds <- build_summary |>
  count(month) |>
  mutate(month_date = as.Date(paste0(month, "-01")))

ggplot(monthly_builds, aes(x = month_date, y = n)) +
  geom_line(color = "steelblue", linewidth = 1) +
  geom_point(color = "steelblue") +
  labs(
    title = "Build Activity Over Time",
    x = "Month",
    y = "Number of Builds"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))


# Error rate over time
monthly_errors <- build_summary |>
  group_by(month) |>
  summarise(
    total = n(),
    errors = sum(status %in% c("ERROR", "TIMEOUT")),
    error_rate = errors / total * 100,
    .groups = "drop"
  ) |>
  mutate(month_date = as.Date(paste0(month, "-01")))

ggplot(monthly_errors, aes(x = month_date, y = error_rate)) +
  geom_line(color = "red", linewidth = 1) +
  geom_point(color = "red") +
  labs(
    title = "Build Error Rate Over Time",
    x = "Month",
    y = "Error Rate (%)"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

6 Bioconductor Report Overview

This section demonstrates the BiocBuildReporter helper functions for broader analysis.

6.1 Get Bioconductor Build Report

The get_build_report() function retrieves the Bioconductor Build Report for any given day. You may optionally filter further by Bioconductor git branch or build machine name.

# Retrieve the build report for all packages on December 29, 2025,
# filtered to the RELEASE_3_22 branch and the Linux build machine "nebbiolo2"
get_build_report("2025-12-29", branch="RELEASE_3_22", builder="nebbiolo2")
#> # A tibble: 3,602 × 12
#>    package node     stage version status startedat           endedat            
#>    <chr>   <chr>    <fct> <pckg_> <chr>  <dttm>              <dttm>             
#>  1 ABSSeq  nebbiol… inst… 1.64.0  OK     2025-12-29 20:05:54 2025-12-29 20:06:15
#>  2 ABSSeq  nebbiol… buil… 1.64.0  OK     2025-12-29 21:11:23 2025-12-29 21:12:29
#>  3 ABarray nebbiol… inst… 1.78.0  OK     2025-12-29 20:16:38 2025-12-29 20:16:47
#>  4 ABarray nebbiol… buil… 1.78.0  OK     2025-12-29 21:11:23 2025-12-29 21:11:36
#>  5 ACE     nebbiol… inst… 1.28.0  OK     2025-12-29 20:18:52 2025-12-29 20:19:11
#>  6 ACE     nebbiol… buil… 1.28.0  OK     2025-12-29 21:11:23 2025-12-29 21:12:19
#>  7 ACME    nebbiol… inst… 2.66.0  OK     2025-12-29 20:14:48 2025-12-29 20:14:54
#>  8 ACME    nebbiol… buil… 2.66.0  OK     2025-12-29 21:11:23 2025-12-29 21:11:44
#>  9 ADAM    nebbiol… inst… 1.26.0  OK     2025-12-29 20:41:39 2025-12-29 20:42:11
#> 10 ADAM    nebbiol… buil… 1.26.0  OK     2025-12-29 21:11:23 2025-12-29 21:12:07
#> # ℹ 3,592 more rows
#> # ℹ 5 more variables: command <chr>, report_md5 <chr>, git_branch <chr>,
#> #   git_last_commit <chr>, git_last_commit_date <dttm>

This shows:

  • Package name
  • Node (builder machine name)
  • Build stage (install, buildsrc, checksrc, buildbin)
  • Package version
  • Status of stage (OK, WARNINGS, ERROR, TIMEOUT)
  • Time started
  • Time completed
  • Command utilized to initiate run
  • Report md5 sum
  • Git branch
  • Git commit hashes
  • Last commit dates

This function is meant to programmatically reproduce the daily report for any given day, builder, and branch.
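
Because each row carries both a start and an end time, the report can also be used to estimate how long a stage took. The sketch below computes durations on a toy copy of two rows from the output above. The stage column is omitted because it is truncated in the printed output, and the time zone is set to UTC only to make the toy data unambiguous; the column name seconds is introduced here for illustration.

```r
library(dplyr)

# Toy copy of the timestamps from rows 1 and 3 of the report above
report <- data.frame(
  package = c("ABSSeq", "ABarray"),
  startedat = as.POSIXct(
    c("2025-12-29 20:05:54", "2025-12-29 20:16:38"), tz = "UTC"
  ),
  endedat = as.POSIXct(
    c("2025-12-29 20:06:15", "2025-12-29 20:16:47"), tz = "UTC"
  )
)

# Elapsed time per row, in seconds
durations <- report |>
  mutate(seconds = as.numeric(difftime(endedat, startedat, units = "secs")))

durations
```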

6.2 Get List of Failing Packages

The get_failing_packages() function will return a list of all the currently failing packages for a given branch and build machine name:

# Return all failing packages for the RELEASE_3_22 branch
# on build machine nebbiolo2
get_failing_packages("RELEASE_3_22", "nebbiolo2")
#> # A tibble: 657 × 6
#>    git_branch   package                  version    node      stages    statuses
#>    <chr>        <chr>                    <pckg_vrs> <chr>     <chr>     <chr>   
#>  1 RELEASE_3_22 AHMassBank               1.10.0     nebbiolo2 checksrc  ERROR   
#>  2 RELEASE_3_22 AMARETTO                 1.26.0     nebbiolo2 buildsrc  ERROR   
#>  3 RELEASE_3_22 ANCOMBC                  2.12.0     nebbiolo2 buildsrc… ERROR   
#>  4 RELEASE_3_22 ANF                      1.32.0     nebbiolo2 buildsrc  ERROR   
#>  5 RELEASE_3_22 APAlyzer                 1.24.0     nebbiolo2 checksrc  ERROR   
#>  6 RELEASE_3_22 ASURAT                   1.14.0     nebbiolo2 buildsrc… ERROR   
#>  7 RELEASE_3_22 AUCell                   1.32.0     nebbiolo2 buildsrc  ERROR   
#>  8 RELEASE_3_22 AWAggregator             1.0.0      nebbiolo2 buildsrc  ERROR   
#>  9 RELEASE_3_22 AWAggregatorData         1.0.0      nebbiolo2 buildsrc  ERROR, …
#> 10 RELEASE_3_22 AlphaMissense.v2023.hg19 3.18.2     nebbiolo2 buildsrc  ERROR   
#> # ℹ 647 more rows

This shows:

  • Git branch
  • Package name
  • Package version
  • Node (builder machine name)
  • Build stages (install, buildsrc, checksrc, buildbin)
  • Build statuses (OK, WARNINGS, ERROR, TIMEOUT)

This gives a quick list of all failing packages. If querying the currently active Bioconductor branches (see get_latest_branches()), maintainers of these packages should be contacted to fix their packages to avoid deprecation and removal from Bioconductor.

7 Conclusion

The BiocBuildReporter package provides powerful tools for analyzing Bioconductor build system data. This vignette demonstrated:

  1. Data Access: Using get_bbs_table() and get_all_bbs_tables() to retrieve build data
  2. Package-Specific Queries: Using get_package_release_info(), get_package_build_results(), package_error_count() and package_failures_over_time() to analyze individual packages
  3. Exploratory Analysis: Examining package growth, build statuses, platform differences, and temporal patterns
  4. Bioconductor Report Overview: Using get_build_report() and get_failing_packages()

This dataset can help package developers, maintainers, and the Bioconductor community monitor build health across packages, releases, and builders.

For more information about specific functions, see their documentation with ?function_name.


sessionInfo()
#> R Under development (unstable) (2026-03-05 r89546)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.23-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] tidyr_1.3.2              ggplot2_4.0.2            dplyr_1.2.0             
#> [4] BiocBuildReporter_0.99.2 BiocStyle_2.39.0        
#> 
#> loaded via a namespace (and not attached):
#>  [1] utf8_1.2.6          rappdirs_0.3.4      sass_0.4.10        
#>  [4] generics_0.1.4      RSQLite_2.4.6       digest_0.6.39      
#>  [7] magrittr_2.0.4      evaluate_1.0.5      grid_4.6.0         
#> [10] RColorBrewer_1.1-3  bookdown_0.46       fastmap_1.2.0      
#> [13] blob_1.3.0          jsonlite_2.0.0      DBI_1.3.0          
#> [16] tinytex_0.58        BiocManager_1.30.27 purrr_1.2.1        
#> [19] scales_1.4.0        httr2_1.2.2         jquerylib_0.1.4    
#> [22] cli_3.6.5           rlang_1.1.7         dbplyr_2.5.2       
#> [25] bit64_4.6.0-1       withr_3.0.2         cachem_1.1.0       
#> [28] yaml_2.3.12         otel_0.2.0          tools_4.6.0        
#> [31] memoise_2.0.1       filelock_1.0.3      curl_7.0.0         
#> [34] assertthat_0.2.1    vctrs_0.7.1         R6_2.6.1           
#> [37] magick_2.9.1        BiocFileCache_3.1.0 lifecycle_1.0.5    
#> [40] bit_4.6.0           arrow_23.0.1.1      pkgconfig_2.0.3    
#> [43] pillar_1.11.1       bslib_0.10.0        gtable_0.3.6       
#> [46] Rcpp_1.1.1          glue_1.8.0          xfun_0.56          
#> [49] tibble_3.3.1        tidyselect_1.2.1    knitr_1.51         
#> [52] dichromat_2.0-0.1   farver_2.1.2        htmltools_0.5.9    
#> [55] labeling_0.4.3      rmarkdown_2.30      compiler_4.6.0     
#> [58] S7_0.2.1