% -*- mode: noweb; noweb-default-code-mode: R-mode; -*-
%\VignetteIndexEntry{xmapbridge primer}
%\VignetteKeywords{Visualization, Genome Browser, Affymetrix }
%\VignetteDepends{}
%\VignettePackage{xmapbridge}
%documentclass[12pt, a4paper]{article}
\documentclass[12pt]{article}

\usepackage{amsmath,pstricks}
% With MikTeX 2.9, using pstricks also requires using auto-pst-pdf or running
% pdflatex will fail. Note that using auto-pst-pdf requires to set environment
% variable MIKTEX_ENABLEWRITE18=t on Windows, and to set shell_escape = t in
% texmf.cnf for Tex Live on Unix/Linux/Mac.
\usepackage{auto-pst-pdf}
\usepackage{hyperref}
\usepackage[authoryear,round]{natbib}
\usepackage{color}
\definecolor{NoteGrey}{rgb}{0.96,0.97,0.98}

\textwidth=6.2in
\textheight=9.5in
@ \parskip=.3cm
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-1in

\newcommand{\scscst}{\scriptscriptstyle}
\newcommand{\scst}{\scriptstyle}
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\code}[1]{{\textit{#1}}}

\author{Tim Yates, Crispin J Miller}

\begin{document}

\title{xmapbridge: Graphically Displaying Numeric Data in the X:Map Genome Browser}
\maketitle
\tableofcontents

\section{Introduction}

\code{xmapbridge} is a package that allows numeric data to be displayed in the web-based genome browser, X:Map. X:Map uses the Google Maps API to provide a real-time scrollable view of the genome. It supports a number of species, and can be accessed at \url{http://xmap.picr.man.ac.uk}.

\addvspace{.3cm}

To do this, it communicates with X:Map via a small Java application called \code{XMapBridge}. \code{XMapBridge} sits on your local machine, interprets the graph data generated by \code{xmapbridge} and uses it to create plots, which are then passed to the genome browser to be displayed via HTTP.
\begin{figure}[htp]
\centering
\includegraphics{bridge}
\caption{\small{The XMapBridge in operation}}
\label{fig:1}
\end{figure}

\addvspace{.3cm}

Importantly, all graph rendering is performed on the local machine, so none of the underlying graph data is uploaded to the X:Map server to generate your graphs.

\addvspace{.3cm}

Graph plotting in R is done using calls to the functions \Rfunction{xmap.plot} and \Rfunction{xmap.points}, which have parameters that aim to be similar to those used by the standard \Rfunction{plot} methods in R. These result in data being written to a set of files (in a specific directory structure) that contain the data to be displayed, as well as some additional meta-data describing each of the graphs.

\addvspace{.3cm}

When the \code{XMapBridge} application is running (figure \ref{fig:1}), it monitors the same directory structure used by \code{xmapbridge} and uses the information written out from the Bioconductor session to generate its plots. These are then passed via HTTP to the genome browser for display.

\addvspace{.3cm}

\fcolorbox{black}{NoteGrey} {
\begin{minipage}{13.5cm}
Note that the \code{XMapBridge} must be running on the same machine as your web browser. Only one instance of the XMapBridge can be run on a machine at any one time, and the X:Map webpage will try to connect to localhost to load javascript and graph data. If you do not see the ``Connect'' button when \code{XMapBridge} is running, the most likely cause is one of these two restrictions.
\end{minipage}
}

\section{Quickstart}

The steps required to get the \code{xmapbridge} and \code{XMapBridge} programs working are as follows:

\begin{enumerate}
\item Install Java 5 or later (if it's not already installed).
\item Install \code{XMapBridge} (see the section `Installation of the XMapBridge software' at the end of this vignette for more details).
\item Install \code{xmapbridge} (this package).
\end{enumerate}

\addvspace{.3cm}

By default, the \code{XMapBridge} Java application and the \code{xmapbridge} R package use a folder called \code{.xmb\_cache} in your home directory.

\addvspace{.3cm}

If this folder does not exist when the R functions are first called (and the environment variable described below is not set), then \code{xmapbridge} will offer to create the folder for you.

\addvspace{.3cm}

If this folder does not exist when the \code{XMapBridge} is first run (and there is no environment variable set), then the application will run in demo mode, showing three demo graphs for you to see how it works on your system (figure \ref{fig:2}).

\begin{figure}[htp]
\centering
\includegraphics{figure}
\caption{\small{The process to show the graphs in the browser: (1a) click on the connect button, (1b) this changes into a drop-down select box, (1c) select the graph of choice}}
\label{fig:2}
\end{figure}

\addvspace{.3cm}

If you don't want to use this default directory, then you need to create an environment variable \code{XMAP\_BRIDGE\_CACHE} which contains a path to the folder you want to use as a cache folder instead. The details of how to set this variable on multiple operating systems can be found in the \code{README.TXT} file which was included in the \code{XMapBridge} download.

\addvspace{.3cm}

The function \Rfunction{xmap.plot} generates a project folder structure, adds a graph to this structure, and adds the plot data in the appropriate directory for \code{XMapBridge} to find it. So, for example:

<<>>=
#We are actually writing everything to a tmp directory behind the scenes.
path <- tempdir()
old.path <- Sys.getenv( "XMAP_BRIDGE_CACHE" )
Sys.setenv( XMAP_BRIDGE_CACHE=path )
#We put things back later
@

<<>>=
library( xmapbridge )
set.seed( 100 )
x <- runif( 100, 1100000,1200000 )
y <- 10 * sin( 2 * pi * ( x-1000000 ) / 10000 )
@

<<>>=
xmap.plot( x, y, species="homo_sapiens", chr="1", type="line" )
@

<<>>=
cat( xmap.debug( xmap.plot( x, y, species="homo_sapiens", chr="1", type="line" ), newlines=TRUE ) )
@

\begin{figure}[htp]
\centering
\includegraphics{figure1}
\caption{\small{X:Map displaying a simple graph}}
\label{fig:3}
\end{figure}

The call to \Rfunction{xmap.plot} above will first ask whether you want to create the default directory (if it's not already there) and then generate a line plot between positions 1,100,000bp and 1,200,000bp on human chromosome 1. As you can see from the output, the plot actually starts at location 1,103,015bp, because the x positions are drawn at random by \Rfunction{runif}. (In the following section, more complex examples show how to generate graphs for real (Affymetrix exon) array data.)

\addvspace{.3cm}

If you re-start \code{XMapBridge} (assuming it was in demo mode before), it should now find this directory and the browser (once refreshed or pointed at the X:Map website) should offer you your new graph via the drop-down menu. When you then select it, you should see it displayed, as in figure \ref{fig:3}. The next time you use the package, this directory will already exist, so you won't get prompted again unless you delete or move the cache directory.

\addvspace{.3cm}

\fcolorbox{black}{NoteGrey} {
\begin{minipage}{13.5cm}
The XMapBridge Java application scans your cache folder every few seconds to see if anything has been updated; however, these changes can take some time to ``bubble'' up to the javascript in the browser. There should be no need to refresh the X:Map page and reload the bridge connection applet, but it may take up to 30 seconds for the list of graphs to be updated.
When the list is updated, the drop-down selection box reverts to its initial position, removing any currently viewed graph.
\end{minipage}
}

\section{Plotting Graphs, the simple way}

As we saw above, the simplest way to plot a graph is to call \Rfunction{xmap.plot}. As well as x,y coordinates, you also need to specify the chromosome and species. Labels can be provided and the type of plot can also be specified. Later we'll see how to set transparency and colours for the plots.

\addvspace{.3cm}

Here we explore how to generate some plots for Affymetrix Exon arrays. We first load an example dataset (see \code{?exon.data} for more details on the origins of this dataset):

\addvspace{.3cm}

<<>>=
data(xmapbridge)
@

\addvspace{.3cm}

The data frame \code{exon.data} contains information on 1747 probesets targeting 80 different genes, for a triplicate comparison between two cell lines, mcf7 and mcf10a. To generate our plots, we first take our array data and split it into different pieces, one for each gene:

\addvspace{.3cm}

<<>>=
l <- split(exon.data,exon.data$"gene")
@

\addvspace{.3cm}

Then, for each gene we need to extract the appropriate x locations (and which chromosome it is on). Initially, we just do it for the first gene in the list:

\addvspace{.3cm}

<<>>=
g <- l[[1]]
x <- g$"seq_region_end" + (g$"seq_region_end" - g$"seq_region_start")/2
y <- g[,2]
chr <- unique(g$"name")
@

\addvspace{.3cm}

Now we can plot the data for one of our replicates:

\addvspace{.3cm}

<<>>=
xmap.plot(x,y,chr,species="homo_sapiens",type="area",col="#ff000066")
@

<<>>=
cat( xmap.debug( xmap.plot(x,y,chr,species="homo_sapiens",type="area",col="#ff000066"), newlines=TRUE ) )
@

\addvspace{.3cm}

Currently, \code{scatter, line, bar, step, area and steparea} plots are supported.
These are pretty self-explanatory; the following should generate plots similar to those in figure \ref{fig:4}:

\begin{figure}[htp]
\centering
\includegraphics{diffstyles.png}
\caption{\small{Different styles of plot: (a) scatter, (b) line, (c) bar, (d) area and (e) steparea}}
\label{fig:4}
\end{figure}

\addvspace{.3cm}

<<>>=
xmap.plot(x,y,chr,species="homo_sapiens",type="scatter",col="#ff000066")
@

<<>>=
cat( xmap.debug( xmap.plot(x,y,chr,species="homo_sapiens",type="scatter",col="#ff000066"), newlines=TRUE ) )
@

<<>>=
xmap.plot(x,y,chr,species="homo_sapiens",type="line",col="#ff000066")
@

<<>>=
cat( xmap.debug( xmap.plot(x,y,chr,species="homo_sapiens",type="line",col="#ff000066"), newlines=TRUE ) )
@

<<>>=
xmap.plot(x,y,chr,species="homo_sapiens",type="bar",col="#ff000066")
@

<<>>=
cat( xmap.debug( xmap.plot(x,y,chr,species="homo_sapiens",type="bar",col="#ff000066"), newlines=TRUE ) )
@

<<>>=
xmap.plot(x,y,chr,species="homo_sapiens",type="area",col="#ff000066")
@

<<>>=
cat( xmap.debug( xmap.plot(x,y,chr,species="homo_sapiens",type="area",col="#ff000066"), newlines=TRUE ) )
@

<<>>=
xmap.plot(x,y,chr,species="homo_sapiens",type="steparea",col="#ff000066")
@

<<>>=
cat( xmap.debug( xmap.plot(x,y,chr,species="homo_sapiens",type="steparea",col="#ff000066"), newlines=TRUE ) )
@

\addvspace{.3cm}

\section{Multiple plots on a graph}

\addvspace{.3cm}

\code{XMapBridge} can generate a graph with multiple plots on it. This is done by calling \Rfunction{xmap.points} after the initial \Rfunction{xmap.plot} call.
Each call to \Rfunction{xmap.points} will add a new \code{Plot} to the current \code{Graph}:

\addvspace{.3cm}

<<>>=
#Pick an interesting gene
g <- l[["ENSG00000128394"]]
x <- g$"seq_region_end" + (g$"seq_region_end" - g$"seq_region_start")/2
chr <- unique(g$"name")
y <- g[,2]
@

<<>>=
xmap.plot(x,y,chr,species="homo_sapiens",type="area",col="#ff000033",ylim=c(0,16))
@

<<>>=
cat( xmap.debug( xmap.plot(x,y,chr,species="homo_sapiens",type="area",col="#ff000033",ylim=c(0,16)), newlines=TRUE ) )
@

<<>>=
xmap.points(x,y,chr,type="line",col="#ff0000ff")
@

<<>>=
cat( xmap.debug( xmap.points(x,y,chr,type="line",col="#ff0000ff"), newlines=TRUE ) )
@

<<>>=
#We were plotting the 1st array (2nd column in g), now plot the 4th (5th column in g)
y <- g[,5]
@

<<>>=
xmap.points(x,y,chr,type="area",col="#0000ff33")
@

<<>>=
cat( xmap.debug( xmap.points(x,y,chr,type="area",col="#0000ff33"), newlines=TRUE ) )
@

<<>>=
xmap.points(x,y,chr,type="line",col="#0000ffff")
@

<<>>=
cat( xmap.debug( xmap.points(x,y,chr,type="line",col="#0000ffff"), newlines=TRUE ) )
@

\begin{figure}[htp]
\centering
\includegraphics{multiple.png}
\caption{\small{Multiple plots on the same graph}}
\label{fig:5}
\end{figure}

\addvspace{.3cm}

This should look like figure \ref{fig:5}. (For each array we first generate the shaded area, and then add the edge as an additional line plot.)

\addvspace{.3cm}

\fcolorbox{black}{NoteGrey} {
\begin{minipage}{13.5cm}
Note: As you can see from the output, each of these calls to \Rfunction{xmap.points} returns a new \code{Plot} object that resides within the same \code{Graph} and \code{Project} as the plot returned from the initial \Rfunction{xmap.plot} call.
\end{minipage}
}

\addvspace{.3cm}

\section{More fine-grained control}

A quick look at the XMapBridge GUI will show that all of the graphs and plots generated so far have been placed in a hierarchy, beginning with a `Project' and with a series of graphs and plots underneath that.

\addvspace{.3cm}

This is fine if we are happy for all of the plots generated during a single session to be grouped into a single project in this way, but we can exercise finer-grained control if required.

\addvspace{.3cm}

In the next example, we will create a new project and then iterate along our list of genes, \code{l}, generating a new graph for each gene. Each of these graphs will be placed in the new project we've created.

\addvspace{.3cm}

First, we set up some colours and names for our individual plots:

\addvspace{.3cm}

<<>>=
cols.bg <- c(rep("#ff000011",3),rep("#0000ff11",3))
cols.fg <- c(rep("#ff0000ff",3),rep("#0000ffff",3))
names <- c("7.1","7.2","7.3","10a.1","10a.2","10a.3")
@

\addvspace{.3cm}

(We will explain how colours are defined later on.)

\addvspace{.3cm}

Then we create a new project and store the resultant id in the variable \code{pid}:

<<>>=
pid <- xmap.project.new("mcf7 vs mcf10a")
@

\addvspace{.3cm}

Next, we write a function that can plot a single gene. It takes a data frame (i.e. one of the elements of \code{l}) and the project id:

\addvspace{.3cm}

<<>>=
plot.a.gene <- function(g,pid) {
  x <- g$"seq_region_end" + (g$"seq_region_end" - g$"seq_region_start")/2
  y <- g[,2:7]
  chr <- unique(g$"name")
  gene <- unique(g$"gene")
  xlim <- range(x)
  gph <- xmap.graph.new(projectid=pid, name=gene, desc="vignette example",
                        min=0, max=16, chr=chr, start=xlim[1], stop=xlim[2],
                        ylab="value", species="homo_sapiens")
  for(i in 1:6) {
    xmap.points(graphid=gph,x=x,y=y[,i],type="area",col=cols.bg[i],xlab="")
    xmap.points(graphid=gph,x=x,y=y[,i],type="line",col=cols.fg[i],xlab=names[i])
  }
}
@

\addvspace{.3cm}

Note how the function \Rfunction{xmap.graph.new} is used to create a new graph. It takes a projectid as a parameter, which specifies which project the graph should be placed in, and returns a graphid. This graphid is then passed to \Rfunction{xmap.points} to specify which graph to add the plots to.
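The same project, graph and plot pattern can be sketched in isolation. The chunk below is illustrative only and is not evaluated: the project and graph names, the region and the data values are made up, and the variable names \code{sketch.pid}, \code{sketch.gid}, \code{pos} and \code{val} are hypothetical. It uses only the functions and arguments that already appear in \code{plot.a.gene} above.

<<eval=FALSE>>=
library( xmapbridge )

#Hypothetical project name; the returned id identifies the project
sketch.pid <- xmap.project.new("sketch project")

#Create one empty graph inside that project (made-up region on chromosome 1)
sketch.gid <- xmap.graph.new(projectid=sketch.pid, name="sketch graph",
                             desc="minimal example", min=0, max=1, chr="1",
                             start=1100000, stop=1200000,
                             ylab="value", species="homo_sapiens")

#Made-up data: 11 evenly spaced positions with values rising from 0 to 1
pos <- seq(1100000, 1200000, by=10000)
val <- seq(0, 1, length.out=length(pos))

#Add a single opaque blue line plot to the graph identified by sketch.gid
xmap.points(graphid=sketch.gid, x=pos, y=val, type="line",
            col="#0000ffff", xlab="ramp")
@

Creating the graph explicitly, rather than through \Rfunction{xmap.plot}, is what allows the graph to be placed in a chosen project and given its own name and axis range.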

\addvspace{.3cm}

Finally, we can generate a graph for each of our genes using \Rfunction{lapply} across \code{l}:

<<>>=
lapply(l,plot.a.gene,pid=pid)
@

\addvspace{.3cm}

\section{Specifying Graph Colours}

By default, the \code{XMapBridge} allocates unique colours to each plot in the graph; however, note how in the previous example the colour of each plot is set as an RGB value with an extra alpha value specifying the opacity. The format for these colours is \code{"\#RRGGBBAA"}, with each of the Red, Green, Blue and Alpha components being a hexadecimal number from 00 to FF.

\addvspace{.3cm}

So, for example, a colour value of \code{"\#00339955"} for the first plot gives a (slightly green) blue colour with the opacity set to 33\% (hexadecimal 55 is 85 out of a maximum of 255), so that the underlying images from the X:Map website (and the underlying plots on this same graph) can be seen through the current plot (see figure \ref{fig:5}).

\addvspace{.3cm}

<<>>=
library( RColorBrewer )
@

<<>>=
#A light grey almost transparent step area
xmap.plot( x, y, type="steparea", col="#11111111", ylab="intensity", chr="4", species="homo_sapiens" )
@

<<>>=
cat( xmap.debug( xmap.plot( x, y, type="steparea", col="#11111111", ylab="intensity", chr="4", species="homo_sapiens" ), newlines=TRUE ) )
@

<<>>=
#With an opaque orange edge
xmap.points( x, y, type="step", col="#F7941DFF" )
@

<<>>=
cat( xmap.debug( xmap.points( x, y, type="step", col="#F7941DFF" ), newlines=TRUE ) )
@

\addvspace{.3cm}

The function \Rfunction{xmap.col} then provides an easy way to set alpha values:

\addvspace{.3cm}

<<>>=
#Five levels of red, chosen by the RColorBrewer
reds <- brewer.pal( 5, "Reds" )
reds <- xmap.col( reds, alpha=0x55 )
for( i in 1:5 ) {
  xmap.points( x, y / i, type="area", col=reds[i] )
}
@

\begin{figure}[htp]
\centering
\includegraphics{coloursandalpha}
\caption{\small{Specifying colours and alpha values}}
\label{fig:6}
\end{figure}

\fcolorbox{black}{NoteGrey} {
\begin{minipage}{13.5cm}
Note that \Rfunction{xmap.plot} also allows colours to be specified as an integer. In this case, the alpha value goes first; the format is \code{0xAARRGGBB}. \Rfunction{xmap.col} returns integers in this form.
\end{minipage}
}

<<>>=
#put the environment variable back. Not strictly necessary, but why not?
Sys.setenv(XMAP_BRIDGE_CACHE=old.path)
@

\section{Installation and configuration details}

\subsection{Installation of the XMapBridge software}

The XMapBridge Java application can be downloaded from the X:Map website at \url{http://xmap.picr.man.ac.uk/download/bridge} as a simple ZIP file. When extracted, there are comprehensive installation instructions to be found in the file \code{README.TXT}.

\subsection{Configuration of the xmapbridge package}

\code{xmapbridge} and \code{XMapBridge} can be told to use a location other than the default for the directory they share. This is specified by an environment variable with the name \code{XMAP\_BRIDGE\_CACHE}, which points to the folder you wish to use. Again, comprehensive instructions for doing this on the main operating systems can be found in the \code{README.TXT} file that is part of the \code{XMapBridge} download.

\end{document}