\name{geneData}
\Rdversion{1.1}
\alias{geneData}
\title{
View the expression data for selected genes
}
\description{
This function outputs and visualizes the expression data for selected
genes. Potential output files include: a tab-delimited text file, a
heatmap in PDF format, and a scatter plot in PDF format.
}
\usage{
geneData(genes, exprs, ref = NULL, samp = NULL, outname = "array",
    txt = TRUE, heatmap = FALSE, scatterplot = FALSE, samp.mean = FALSE,
    pdf.size = c(7, 7), cols = NULL, scale = "row", limit = NULL,
    label.groups = TRUE, ...)
}
\arguments{
  \item{genes}{
character, either a vector of gene IDs of interest or a 2-column matrix,
where the first column specifies the gene IDs used in \code{exprs} and
the second column gives another type of ID to use in the output data
files.
}
  \item{exprs}{
an expression matrix or matrix-like data structure, with genes as rows
and samples as columns.
}
  \item{ref}{
a numeric vector of column numbers for the reference condition or
phenotype (i.e. the control group) in the exprs data matrix. Defaults to
NULL, in which case all columns are considered target experiments.
}
  \item{samp}{
a numeric vector of column numbers for the target condition or phenotype
(i.e. the experiment group) in the exprs data matrix. Defaults to NULL,
in which case all columns other than ref are considered target
experiments.
}
  \item{outname}{
a character string to be used as the prefix of the output data files.
Defaults to "array".
}
  \item{txt}{
boolean, whether to output the selected gene data as a tab-delimited
text file. Defaults to TRUE.
}
  \item{heatmap}{
boolean, whether to plot a heatmap of the selected gene data as a PDF
file. Defaults to FALSE.
}
  \item{scatterplot}{
boolean, whether to make a scatter plot of the selected gene data as a
PDF file. Defaults to FALSE.
}
  \item{samp.mean}{
boolean, whether to take the mean of the gene data over the ref and samp
groups when making the scatter plot. Defaults to FALSE, i.e.
make scatter plots for the first two ref-samp pairs and label them
differently on the same graph panel.
}
  \item{pdf.size}{
a numeric vector specifying the width and height of the PDF graphics
region in inches. Defaults to c(7, 7).
}
  \item{cols}{
a character vector specifying the colors used for the heatmap image
blocks. Defaults to NULL, i.e. a green-red spectrum is generated
automatically based on the gene data.
}
  \item{scale}{
character indicating whether the values should be centered and scaled in
the row direction, in the column direction, or not at all for the
heatmap. The default is "row"; other options are "column" and "none".
}
  \item{limit}{
numeric value specifying the maximal absolute value of gene data to
visualize in the heatmap. Values beyond this limit are reset to it.
Defaults to NULL, i.e. all gene data values are plotted as they are.
This argument allows optimal differentiation between most gene data
values when extreme positive/negative values exist and squeeze the
normal-value region. limit = 3 is recommended when the gene data is
scaled by row.
}
  \item{label.groups}{
boolean, whether to label the two sample groups, i.e. ref and samp,
differently using side color bars along the heatmap area. Defaults to
TRUE.
}
  \item{\dots}{
other arguments to be passed to the internal \code{heatmap2} function.
}
}
\details{
This function integrates the three most common presentation methods for
gene expression data: tab-delimited text file, heatmap and scatter plot.
A heatmap is ideal for visualizing relative changes with gene-wise
standardized (or row-scaled) data. The heatmap is generated by calling
an improved version of the \code{heatmap.2} function from the gplots
package. A scatter plot is ideal for visualizing modest or small but
consistent changes over a gene set between the two states under
comparison.
Although \code{geneData} is designed to be a standalone function, it is
frequently used in tandem with the \code{essGene} function to present
the changes of the essential genes in significant gene sets.
}
\value{
The function invisibly returns 1 on successful execution.
}
\references{
Luo, W., Friedman, M., Shedden, K., Hankenson, K. and Woolf, P.
GAGE: Generally Applicable Gene Set Enrichment for Pathway Analysis.
BMC Bioinformatics 2009, 10:161
}
\author{
Weijun Luo
}
\seealso{
\code{\link{essGene}} extracts the essential member genes in a gene set;
\code{\link{gage}} is the main function for GAGE analysis.
}
\examples{
data(gse16873)
cn=colnames(gse16873)
hn=grep('HN',cn, ignore.case =TRUE)
dcis=grep('DCIS',cn, ignore.case =TRUE)

#kegg test for 1-directional changes
data(kegg.gs)
gse16873.kegg.p <- gage(gse16873, gsets = kegg.gs,
    ref = hn, samp = dcis)
rownames(gse16873.kegg.p$greater)[1:3]

#extract the essential genes from the top 3 up-regulated gene sets
gs=unique(unlist(kegg.gs[rownames(gse16873.kegg.p$greater)[1:3]]))
essData=essGene(gs, gse16873, ref =hn, samp =dcis)
head(essData)

#column indices passed as the ref and samp groups for essData
ref1=1:6
samp1=7:12

#generate a text file for the data table, and pdf files for the
#heatmap and scatter plot, for each of the top 3 gene sets
for (gs in rownames(gse16873.kegg.p$greater)[1:3]) {
    outname = gsub(" |:|/", "_", substr(gs, 10, 100))
    geneData(genes = kegg.gs[[gs]], exprs = essData, ref = ref1,
        samp = samp1, outname = outname, txt = TRUE, heatmap = TRUE,
        Colv = FALSE, Rowv = FALSE, dendrogram = "none", limit = 3,
        scatterplot = TRUE)
}
}
\keyword{multivariate}
\keyword{manip}