---
title: "Using the GEOfastq Package"
output: rmarkdown::html_vignette
vignette: >
    %\VignetteIndexEntry{Using the GEOfastq Package}
    %\VignetteEngine{knitr::rmarkdown}
    %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>"
)
```

```{r setup}
library(GEOfastq)
```

# Installation

`GEOfastq` can be installed from Bioconductor as follows:

```{r, eval = FALSE}
if(!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("GEOfastq")
```

# Overview of GEOfastq

The NCBI [Gene Expression Omnibus](https://www.ncbi.nlm.nih.gov/geo/) (GEO)
offers a convenient interface to explore high-throughput experimental data such
as RNA-seq. GEO deposits RNA-seq data as sra files to the Sequence Read Archive
(SRA) which can be converted to fastq files using `fastq-dump`. This conversion
process can be quite slow and it is usually more convenient to download fastq
files for a GEO accession generated by the European Nucleotide Archive (ENA).
`GEOfastq` crawls GEO to retrieve metadata and ENA fastq urls, and then
downloads them.


# Getting Started using GEOfastq

To get fastq data for a GEO series, we first retrieve the metadata for a GEO
accession:


```{r}
gse_name <- 'GSE133758'
gse_text <- crawl_gse(gse_name)
```

Next, we extract the sample accessions for this study and retrieve the GEO
metadata and ENA fastq url for an example:


```{r}
gsm_names <- extract_gsms(gse_text)
gsm_name <- gsm_names[182]
srp_meta <- crawl_gsms(gsm_name)
```

Now that we have retrieved the necessary metadata, we are ready to download the
fastq files for this sample:

```{r}
data_dir <- tempdir()

# example using smaller file
srp_meta <- data.frame(
        run  = 'SRR014242',
        row.names = 'SRR014242',
        gsm_name = 'GSM315559',
        ebi_dir = get_dldir('SRR014242'), stringsAsFactors = FALSE)

res <- get_fastqs(srp_meta, data_dir)
```

# Session info

The following package and versions were used in the production of this vignette.

```{r echo=FALSE}
sessionInfo()
```