% $Id: ChromHeatMap.Rnw 2076 2009-10-06 10:04:19Z tfrayner $
%
% \VignetteIndexEntry{Plotting expression data with ChromHeatMap}
% \VignetteKeywords{expression, plotting, chromosome, idiogram, cytoband}
% \VignettePackage{ChromHeatMap}
\documentclass[a4paper]{article}
\SweaveOpts{eps=false}

\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\textit{#1}}}
\newcommand{\Rpackage}[1]{{\textbf{#1}}}

\begin{document}

\title{ChromHeatMap}
\author{Tim F. Rayner}

\maketitle

\begin{center}
Cambridge Institute of Medical Research
\end{center}

\section{Introduction}

The \Rpackage{ChromHeatMap} package provides functions for visualising
expression data in a genomic context, by generating heat map images in
which data is plotted along a given chromosome for all the samples in
a data matrix.

These functions rely on the existence of a suitable
\Rpackage{AnnotationDbi} package which provides chromosome location
information for the probe- or gene-level identifiers used in your data
set. The data themselves must be in either an ExpressionSet, or a data
matrix with row names corresponding to probe or gene identifiers and
columns corresponding to samples. While the \Rpackage{ChromHeatMap}
package was originally designed for use with microarray data, given an
appropriate \Rpackage{AnnotationDbi} package it can also be used to
visualise data from next-generation sequencing experiments.

The output heatmap can include sample clustering, and data can either
be plotted for each strand separately, or both strands combined onto a
single heat map. An idiogram showing the cytogenetic banding pattern
of the chromosome will be plotted for supported organisms (at the time
of writing: \textit{Homo sapiens}, \textit{Mus musculus} and
\textit{Rattus norvegicus}; please contact the maintainer to request
additions). Once a heat map has been plotted, probes or genes of
interest can be identified interactively.
These identifiers may then be mapped back to gene symbols and other
annotation via the \Rpackage{AnnotationDbi} package.

\section{Data preparation}

Expression data in the form of a data matrix must initially be mapped
onto its corresponding chromosome coordinates. This is done using the
\Rfunction{makeChrStrandData} function:

<<>>=
library('ALL')
data('ALL')
selSamples <- ALL$mol.biol %in% c('ALL1/AF4', 'E2A/PBX1')
ALLs <- ALL[, selSamples]

library('ChromHeatMap')
chrdata <- makeChrStrandData(exprs(ALLs), lib='hgu95av2')
@

The output \Robject{chrdata} object here contains the expression data
indexed by coordinate. Note that the \Rfunction{makeChrStrandData}
function is based on the \Rfunction{Makesense} function in the
\Rpackage{geneplotter} package, with the internal call to
\Rfunction{lowess} removed to avoid smoothing the data (which is
undesirable in this case).

The \Rfunction{makeChrStrandData} function is used specifically
because it incorporates information on both the start and end
chromosome coordinates for each locus. This allows the
\Rfunction{plotChrMap} function to accurately represent target widths
on the chromosome plot.

\section{Plotting the heat map}

Once the data has been prepared, a single call to
\Rfunction{plotChrMap} will generate the chromosome heat map. There
are many options available for this plot, and only a couple of them
are illustrated here. Here we generate a whole-chromosome plot
(chromosome 19), with both strands combined into a single heat map:

\begin{center}
<<>>=
groupcol <- ifelse( ALLs$mol.biol == 'ALL1/AF4', 'red', 'green' )
plotChrMap(chrdata, 19, strands='both', RowSideColors=groupcol)
@
\end{center}

Chromosomes can be subset by cytoband or by start/end coordinates
along the chromosome.
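As a minimal sketch of coordinate-based subsetting (not evaluated
here), the ``start'' and ``end'' options to \Rfunction{plotChrMap} can
be used to restrict the plot to one region of the chromosome. The
coordinates below are arbitrary values chosen purely for illustration,
and are assumed to be given in base pairs:

<<eval=FALSE>>=
## Illustrative only: plot an arbitrary 10 Mb window of chromosome 19.
## The start/end values are assumptions, not regions of known interest.
plotChrMap(chrdata, 19, strands='both', start=1e7, end=2e7,
           RowSideColors=groupcol)
@
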
The following illustrates how one might plot the strands separately
(this is the default behavior):

\begin{center}
<<>>=
plotmap <- plotChrMap(chrdata, 1, cytoband='q23', interval=50000,
                      srtCyto=0, cexCyto=1.2)
@
\end{center}

Other options include subsetting of samples, adding a color key to
indicate sample subsets, deactivating the sample-based clustering and
so on. See the help pages for \Rfunction{plotChrMap} and
\Rfunction{drawMapDendro} for details. Note that the default colors
provided by the \Rfunction{heat.colors} function are not especially
attractive or informative; consider using custom-defined colors, for
example by using the \Rpackage{RColorBrewer} package.

The output of the \Rfunction{plotChrMap} function can subsequently be
used with the \Rfunction{grabChrMapProbes} function, which enables the
user to identify the probes or genes responsible for heatmap bands of
interest.

Note that the \Rfunction{layout} and \Rfunction{par} options for the
current graphics device are \emph{not} reset following generation of
the image. This is so that the \Rfunction{grabChrMapProbes} function
can accurately identify the region of interest when the user
interactively clicks on the diagram.

\section{Interactive probe/gene identification}

Often it will be of interest to determine exactly which probes or
genes are shown to be up- or down-regulated by the
\Rfunction{plotChrMap} heat map. This can be done using the
\Rfunction{grabChrMapProbes} function. This takes the output of the
\Rfunction{plotChrMap} function, asks the user to mouse-click the
heatmap on either side of the bands of interest, and returns a
character vector of the locus identifiers in that region. These can
then be passed to the \Rpackage{AnnotationDbi} function
\Rfunction{mget} to identify which genes are being differentially
expressed.

<<>>=
probes <- grabChrMapProbes( plotmap )
genes <- unlist(mget(probes, envir=hgu95av2SYMBOL, ifnotfound=NA))
@

Note that due to the way the expression values are plotted, genes
which lie very close to each other on the chromosome may have been
averaged to give a signal that could be usefully plotted at screen
resolution. In such cases the locus identifiers will be returned
concatenated, separated by semicolons
(e.g. ``\texttt{37687\_i\_at;37688\_f\_at;37689\_s\_at}''). Typically
this is easily solved by zooming in on a region of interest, using
either the ``cytoband'' or ``start'' and ``end'' options to
\Rfunction{plotChrMap}. See also the ``interval'' option for another
approach to this problem.

\section{Session information}

The versions of R and of the packages loaded when generating this
vignette were:

<<>>=
sessionInfo()
@

\end{document}