---
title: "Provide WikipathwaysDb databases for AnnotationHub"
author: "Kozo Nishida"
graphics: no
package: AHWikipathwaysDbs
output:
    BiocStyle::html_document:
        toc_float: true
vignette: >
    %\VignetteIndexEntry{Provide WikipathwaysDb databases for AnnotationHub}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
    %\VignetteDepends{AnnotationHub}
---

```{r style, echo = FALSE, results = 'asis', message=FALSE}
BiocStyle::markdown()
```

# Fetch Wikipathways databases from `AnnotationHub`

The `AHWikipathwaysDbs` package provides the metadata for all Wikipathways
tibble databases in `r Biocpkg("AnnotationHub")`. First we load/update the
`AnnotationHub` resource.

```{r load-lib, message = FALSE}
library(AnnotationHub)
ah <- AnnotationHub()
```

Next we list all Wikipathways entries from `AnnotationHub`.

```{r list-wikipathwaysdb}
query(ah, "wikipathways")
```

We can inspect the metadata that `AnnotationHub` records for these resources,
which are hosted in the Bioconductor S3 bucket, with `mcols()`.

```{r confirm-metadata}
mcols(query(ah, "wikipathways"))
```

Next we restrict the query to the Wikipathways tibble for the species
*Homo sapiens*.

```{r query-hsa}
qr <- query(ah, c("wikipathways", "Homo sapiens"))
qr
```
The result contains a tibble.
We retrieve it with `[[`.

```{r load-hsatbl}
hsatbl <- qr[[1]]
hsatbl
```

Each row describes one metabolite and the Wikipathways pathway that
contains it.
For each metabolite the tibble gives the name, HMDB_ID, KEGG_ID, ChEBI_ID,
DrugBank_ID, PubChem_CID, ChemSpider_ID, KNApSAcK_ID, Wikidata_ID, CAS and
InChI Key, as well as information on the pathway to which it belongs.
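
As a small illustration of how a table with this layout can be searched, the
sketch below builds a toy data frame and looks up the pathways that contain a
given metabolite. The rows and identifiers are invented for the example, only
a few columns are included, and the column names `name` and `HMDB_ID` are
assumptions based on the description above; the real tibble may name them
differently. The same indexing should work on a tibble.

```{r toy-lookup}
# Toy stand-in for the Wikipathways tibble. Rows and IDs are invented,
# and the column names are assumptions based on the description above.
toy_tbl <- data.frame(
    name = c("metabolite A", "metabolite B", "metabolite A"),
    HMDB_ID = c("ID_001", "ID_002", "ID_001"),
    pathway_name = c("pathway X", "pathway X", "pathway Y"),
    stringsAsFactors = FALSE
)

# Pathways that contain the metabolite with a given identifier
toy_pathways <- unique(toy_tbl$pathway_name[toy_tbl$HMDB_ID == "ID_001"])
toy_pathways

# Number of metabolite rows per pathway
toy_counts <- table(toy_tbl$pathway_name)
toy_counts
```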

To get the metabolites defined for the *Amino Acid metabolism* pathway, we
can subset the tibble by `pathway_name`.

```{r get-metabolites}
hsatbl[hsatbl$pathway_name == "Amino Acid metabolism", ]
```

# Creating Wikipathways tibbles

This section describes the automated way to create Wikipathways
tibble databases using
[GPML XML files](https://wikipathways-data.wmcloud.org/20210410/gpml/).

## Creating Wikipathways tibble databases

To create the databases we use the `createWikipathwaysMetabolitesDb` function.
This function downloads the zip archive of Wikipathways GPML XML files.
The XML files are then combined into one table per species, and each table is
converted to a tibble.

The function takes no arguments.
It cannot build a tibble for a single species; it always builds tibbles for
all species in Wikipathways.

```{r create-rda, eval = FALSE}
library(AHWikipathwaysDbs)
scr <- system.file("scripts/make-data.R", package = "AHWikipathwaysDbs")
source(scr)
createWikipathwaysMetabolitesDb()
```

The resulting tibbles are saved in rda format in the current working
directory.
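
An rda file can be read back with base R's `load()`. The sketch below is a
self-contained illustration using a toy data frame written to a temporary
directory; the file name and object name are hypothetical, since the actual
names are determined by `make-data.R`.

```{r load-rda-sketch}
# Hypothetical file and object names; the real ones are set by make-data.R.
toy_species_tbl <- data.frame(
    name = "metabolite A",
    pathway_name = "pathway X",
    stringsAsFactors = FALSE
)
rda_path <- file.path(tempdir(), "toy_species_tbl.rda")
save(toy_species_tbl, file = rda_path)

# load() restores objects under their saved names. Loading into a fresh
# environment avoids overwriting objects in the workspace.
rda_env <- new.env()
loaded_names <- load(rda_path, envir = rda_env)
loaded_names

# Retrieve the restored object by the name that load() reported
restored_tbl <- get(loaded_names[1], envir = rda_env)
restored_tbl
```

Because `load()` returns the names of the restored objects, this pattern works
without knowing the object name in advance.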