--- title: "Importing Data" author: "Shian Su" output: html_document vignette: > %\VignetteIndexEntry{Importing Data} %\VignetteEngine{knitr::rmarkdown} \usepackage[utf8]{inputenc} --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) # preload to avoid loading messages library(NanoMethViz) ``` ```{r} library(NanoMethViz) ``` In order to use this package, your data must be converted from the output of methylation calling software to a special tabix format. Due to the use of the Unix `sort` function, this can only currently be done on a Linux or MacOS system. We currently support output from * Nanopolish * f5c The conversion can be done using the `create_tabix_file()` function. We provide example data of nanopolish output within the package, we can look inside to see how the data looks coming out of nanopolish ```{r} methy_calls <- system.file(package = "NanoMethViz", c("sample1_nanopolish.tsv.gz", "sample2_nanopolish.tsv.gz")) # have a look at the first 10 rows of methy_data methy_calls_example <- read.table( methy_calls[1], sep = "\t", header = TRUE, nrows = 6) methy_calls_example ``` We then create a temporary path to store a converted file, this will be deleted once you exit your R session. Once `create_tabix_file()` is run, it will create a tabix file along with its index. Because we have a small amount of data, we can read in a small portion of it to see how it looks, do not do this with large datasets as it decompresses all the data and will take very long to run. ```{r, message=F} methy_tabix <- file.path(tempdir(), "methy_data.bgz") samples <- c("sample1", "sample2") # you should see messages when running this yourself create_tabix_file(methy_calls, methy_tabix, samples) # don't do this with actual data # we have to use gzfile to tell R that we have a gzip compressed file methy_data <- read.table( gzfile(methy_tabix), col.names = methy_col_names(), nrows = 6) methy_data ``` Now `methy_tabix` will be the path to a tabix object that is ready for use with NanoMethViz. Please head over to the "Introduction" vignette to see how to use this data for visualisation!