quantbayes
================

<!-- badges: start -->

<!-- badges: end -->

`quantbayes` provides a minimal Bayesian transform for evidence
sufficiency from a binary matrix of zero and one entries. The method is
simple, portable, and independent of any rule set.

# Installation

``` r
# Install the development version from GitHub
# (requires the remotes package: install.packages("remotes"))
remotes::install_github("switzerland-omics/quantbayes")
```

# Data layout

`quantbayes` expects a matrix where:

- rows are variants
- columns are evidence rules
- entries are 0 or 1
- NA is treated as 0
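
The layout above can be sketched with a small hand-built matrix. The variant IDs and rule names below are made up for illustration, and the explicit `NA` replacement only mirrors the stated convention; it is not necessarily how `quantbayes` handles missing values internally.

``` r
# Hypothetical toy input: 3 variants (rows) x 4 evidence rules (columns).
x_toy <- matrix(
  c(1, 0, 1, NA,
    0, 0, 1, 1,
    1, 1, 1, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("var_a", "var_b", "var_c"),               # made-up variant IDs
    c("rule_1", "rule_2", "rule_3", "rule_4")   # made-up rule names
  )
)

# Mirror the stated convention: a missing entry counts as absent evidence.
x_toy[is.na(x_toy)] <- 0

# Sanity check: every entry is now 0 or 1.
stopifnot(all(x_toy %in% c(0, 1)))

# Number of evidence rules met per variant.
rowSums(x_toy)
```

A matrix in this shape is what the later examples pass around as `x`.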

Example from the built-in dataset:

``` r
library(quantbayes)
head(core_test_data)
```

Convert to matrix:

``` r
x <- as.matrix(core_test_data[, -1])
rownames(x) <- core_test_data[[1]]
```

# Run quantbayes in one step

``` r
res <- quant_es_core(x)

res$global
head(res$variants)
```

`res$global` contains the posterior summaries for the whole panel.
`res$variants` contains, for each variant, theta, its credible interval,
and its percentile.

# Default plots

``` r
plots <- quant_es_plots(res, x)

plots$p_global
plots$p_overlay
plots$p_matrix
plots$p_p_hat
plots$p_theta_ci
plots$p_combined
```

In the order listed, these show the density, an overlay of the top
candidates, the evidence matrix, the observed proportions, the credible
intervals, and a combined panel.

# Highlight variants of interest

``` r
highlight_demo <- list(
  list(id = rownames(x)[1], colour = "#ee4035", size = 4),
  list(id = rownames(x)[5], colour = "#2f4356", size = 4)
)

plots2 <- quant_es_plots(res, x, highlight_points = highlight_demo)
plots2$p_overlay
```

# Custom palettes

``` r
pal10 <- colorRampPalette(c("black", "grey"))(10)
pal20 <- colorRampPalette(c("skyblue", "navy"))(20)

plots_custom <- quant_es_plots(
  res,
  x,
  palette10 = pal10,
  palette20 = pal20
)

plots_custom$p_overlay
```

Any plot returned by `quant_es_plots` is a standard ggplot, so users can
layer themes or labels.

``` r
plots$p_overlay + ggplot2::theme_minimal()
```

# File-based input

`quantbayes` can read flat files of binary values:

``` r
tmp <- tempfile(fileext = ".tsv")
write.table(core_test_data, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

res_file <- quant_es_from_binary_table(tmp)
res_file$global
```
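
For orientation, the manual equivalent of loading such a file might look like the sketch below. It assumes a tab-separated file whose first column holds variant IDs and whose remaining columns are 0/1 rules, as in `core_test_data`. The toy data frame and its column names are hypothetical, and `quant_es_from_binary_table()` may handle details differently.

``` r
# Hypothetical toy table in the same shape as the input described above.
toy <- data.frame(
  variant_id = c("var_a", "var_b"),
  rule_1 = c(1, 0),
  rule_2 = c(NA, 1)
)

tmp_toy <- tempfile(fileext = ".tsv")
write.table(toy, tmp_toy, sep = "\t", quote = FALSE, row.names = FALSE)

# Read the file back and rebuild the variants x rules matrix.
tab <- read.delim(tmp_toy, stringsAsFactors = FALSE)
x_file <- as.matrix(tab[, -1])
rownames(x_file) <- tab[[1]]
x_file[is.na(x_file)] <- 0   # NA counted as 0, per the data layout rules

x_file
```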

# Save plots

``` r
outdir <- "quantbayes_output"
if (!dir.exists(outdir)) dir.create(outdir)

ggplot2::ggsave(
  file.path(outdir, "overlay.png"),
  plots$p_overlay,
  width = 6,
  height = 4,
  dpi = 120
)
```

# Save tables

``` r
write.csv(
  res$variants,
  file.path(outdir, "variants_results.csv"),
  row.names = FALSE
)

write.csv(
  as.data.frame(res$global),
  file.path(outdir, "global_summary.csv"),
  row.names = FALSE
)
```

# Full workflow in six lines

``` r
data(core_test_data)

x <- as.matrix(core_test_data[, -1])
rownames(x) <- core_test_data[[1]]

res <- quant_es_core(x)
plots <- quant_es_plots(res, x)
plots$p_combined
```

------------------------------------------------------------------------

# Core results example

## Clinical genetics example

After whole genome sequencing, a proprietary candidate selection tool
identified potential causal variants. A clinical laboratory requires
verifiable evidence to support or refute these findings. Each candidate
variant was evaluated using a minimal and independent evidence set that
records whether supporting evidence is present or absent under a
Qualifying Variant Evidence Standard.

``` r
res_df <- as.data.frame(res$variants)
global_df <- as.data.frame(res$global)

res_df <- res_df[order(res_df$theta_mean, decreasing = TRUE), ]

head(res_df)
head(global_df)
```

Example output:

| variant_id          | k   | m   | theta_mean | theta_lower | theta_upper | percentile |
|---------------------|-----|-----|------------|-------------|-------------|------------|
| 2-54234474-G-A_AR   | 18  | 24  | 0.7308     | 0.5487      | 0.8793      | 99.875     |
| 6-72183475-CG-N_AR  | 18  | 24  | 0.7308     | 0.5487      | 0.8793      | 99.875     |
| 1-14682421-G-A_AD   | 17  | 24  | 0.6923     | 0.5061      | 0.8505      | 99.375     |
| 7-751853912-C-T_AR  | 17  | 24  | 0.6923     | 0.5061      | 0.8505      | 99.375     |
| X-224319469-CT-C_XR | 16  | 24  | 0.6538     | 0.4650      | 0.8203      | 97.000     |
| X-414698664-CT-C_XD | 16  | 24  | 0.6538     | 0.4650      | 0.8203      | 97.000     |
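
The `theta_mean` column can be reproduced by hand under one assumption: a uniform Beta(1, 1) prior on theta with a binomial likelihood, which gives a Beta(k + 1, m - k + 1) posterior. The tabulated means match this model, for example (18 + 1) / (24 + 2) ≈ 0.7308, but the README does not state the prior, so treat the sketch as an illustration rather than the package's implementation. The 95% equal-tailed interval is likewise an assumption.

``` r
# Assumed model (not confirmed by the package documentation):
#   theta ~ Beta(1, 1),  k | theta ~ Binomial(m, theta)
k <- c(18, 17, 16)   # rules met, taken from the table
m <- 24              # rules evaluated

a <- k + 1           # posterior shape 1
b <- m - k + 1       # posterior shape 2

theta_mean  <- a / (a + b)
theta_lower <- qbeta(0.025, a, b)   # assumed 95% equal-tailed interval
theta_upper <- qbeta(0.975, a, b)

round(data.frame(k, m, theta_mean, theta_lower, theta_upper), 4)
```

The resulting interval bounds can be compared against the `theta_lower` and `theta_upper` columns to check whether the assumed interval type fits.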

------------------------------------------------------------------------

- **Scenario:** autosomal recessive disease XXX with a primary candidate
  variant
- **Variant:** 2-54234474-G-A_AR

Estimated values from `quantbayes`:

- Posterior evidence sufficiency: **0.731**
- Credible interval: **0.549 to 0.879**
- Percentile: **99.88**

These values reflect the relative strength of evidence for this variant
within the tested panel.
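
One reading of the percentile is an average-rank percentile of theta across all evaluated variants. This is an inference from the numbers, not a documented definition: two variants tied at the top of a panel of 400 would share rank 399.5, which gives 99.875. The sketch below checks that arithmetic on made-up values; only the two top values (0.7308) come from the example output.

``` r
# Made-up panel of 400 values: 398 placeholders below two tied top variants.
theta <- c(rep(0.5, 398), 0.7308, 0.7308)

# Average rank for ties, scaled to a 0-100 percentile (assumed definition).
percentile <- 100 * rank(theta, ties.method = "average") / length(theta)

# Percentile of the two tied top variants.
tail(percentile, 2)
```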

Across the panel:

- Total evaluated variants: **400**
- Global evidence sufficiency: **0.52**
- Credible interval: **0.38 to 0.65**
- Mean theta across all variants: **0.52**
- Median theta across all variants: **0.54**

------------------------------------------------------------------------

# Vignette

The full vignette includes highlighting, palettes, file input, and plot
saving:

    vignette("quantbayes")
