%\VignetteIndexEntry{quick views of eSet instances}
%\VignetteDepends{Biobase, ALL}
%\VignetteKeywords{Data representation, Analysis}
%\VignettePackage{Biobase}
%
% NOTE -- ONLY EDIT THE .Rnw FILE!!!  The .tex file is
% likely to be overwritten.
%
\documentclass[12pt]{article}

\usepackage{amsmath,fullpage}
\usepackage[authoryear,round]{natbib}
\usepackage{hyperref}

\textwidth=6.2in
\textheight=8.5in
%\parskip=.3cm
\oddsidemargin=.1in
\evensidemargin=.1in
\headheight=-.3in

\newcommand{\scscst}{\scriptscriptstyle}
\newcommand{\scst}{\scriptstyle}

\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textit{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
\newcommand{\Rfunarg}[1]{{\texttt{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}

\textwidth=6.2in

\bibliographystyle{plainnat}

\begin{document}
%\setkeys{Gin}{width=0.55\textwidth}

\title{quick view tools for eSets}
\author{VJ Carey}
\maketitle

\tableofcontents

\section{Introduction}

When a course introduces a large number of datasets over a short
period of time, the relationship between data content and software
infrastructure can be hard to master.  This document introduces
several experimental approaches to getting rapid access to the key
elements of objects whose classes derive from \Rclass{eSet}.

We will work with the \Robject{ALL} data for demonstration.

<<>>=
## search() reports attached packages as "package:<name>"
if (!("package:Biobase" %in% search())) library(Biobase)
if (!("package:ALL" %in% search())) library(ALL)
if (!("ALL" %in% objects())) data(ALL)
<<>>=
library(Biobase)
library(ALL)
data(ALL)
ALL
@

\section{An alternative to the current show method}

It would be useful for a display of the dataset to report the package
from which the dataset was loaded.
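
The package of origin can be recovered from the listing that
\Rfunction{data} returns.  The chunk below is a minimal sketch,
assuming that the \Rpackage{ALL} package is installed; the object name
\Robject{dsList} is illustrative only and is not used elsewhere in this
document.

<<eval=FALSE>>=
## data() returns a "packageIQR" object; its 'results' component is a
## character matrix with columns Package, LibPath, Item and Title.
dsList = data(package = "ALL")$results
colnames(dsList)
## Rows whose Item is "ALL", together with the package supplying them
dsList[dsList[, "Item"] == "ALL", c("Package", "Item"), drop = FALSE]
@

Matching a dataset name against the \texttt{Item} column and reading
off the \texttt{Package} column is the lookup that
\Rfunction{dataSource} performs, using the listing for the packages on
the search path.
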
<<>>=
dataSource = function(dsn) {
  ## Accept either a dataset name (character) or the unquoted object.
  if (!is(dsn, "character")) dsn = try(deparse(substitute(dsn)))
  if (inherits(dsn, "try-error")) stop("can't parse dsn arg")
  ## data()$results lists datasets of the packages on the search path.
  dd = data()$results
  if (is.na(match(dsn, dd[,"Item"]))) return(NULL)
  paste("package:", dd[ dd[,"Item"] == dsn, "Package" ], sep="")
}
<<>>=
setGeneric("peek", function(x, maxattr=10) standardGeneric("peek"))

## One function serves both signatures (maxattr supplied or missing).
## It is registered directly as the method for each signature, so the
## deparse(substitute(x)) calls behave as in two separate definitions.
peekESet = function(x, maxattr=10) {
  ds = dataSource(deparse(substitute(x)))
  if (!is.null(ds)) ds = paste(" [from ", ds, "]", sep="")
  else ds = ""
  cat(deparse(substitute(x)), ds, ":\n", sep="")
  cat("Platform annotation: ", annotation(x), "\n")
  cat("primary assay results are:\n")
  print(dim(x))
  cat("sample attributes are:\n")
  vn = rownames(varMetadata(x))
  ## truncate each label description to 50 characters
  ld = substr(varMetadata(x)$labelDescription, 1, 50)
  dd = data.frame("labelDescription[truncated]"=ld)
  rownames(dd) = vn
  if ((ndd <- nrow(dd)) <= maxattr) show(dd)
  else {
    cat("first", maxattr, "of", ndd, "attributes:\n")
    show(dd[seq_len(maxattr), , drop=FALSE])
  }
  cat("----------\n")
  cat("use varTable to see values/freqs of all sample attributes\n")
  cat("----------\n")
}
setMethod("peek", c("eSet", "numeric"), peekESet)
setMethod("peek", c("eSet", "missing"), peekESet)

setGeneric("varTable", function(x, full=FALSE, max=Inf)
  standardGeneric("varTable"))

setMethod("varTable", c("eSet", "missing", "ANY"),
  function(x, full=FALSE, max=Inf) varTable(x, FALSE, max))

setMethod("varTable", c("eSet", "logical", "ANY"),
  function(x, full=FALSE, max=Inf) {
    ## distinct values of each sample attribute, as character
    ans = lapply(names(pData(x)), function(z) table(x[[z]]))
    tans = lapply(ans, names)
    ## seq_len avoids the 1:0 trap when there are no attributes
    kp = seq_len(min(max, length(tans)))
    ## lapply (not sapply) so that the result is always a list
    if (!full) ans = lapply(tans[kp], selectSome, 3)
    else ans = tans[kp]
    names(ans) = names(pData(x))[kp]
    ans
  })

setGeneric("varNames", function(x) standardGeneric("varNames"))
setMethod("varNames", "eSet", function(x) names(pData(x)))
@

We use \texttt{peek} to get a concise view:

<<>>=
peek(ALL)
@

\section{Sample characterization}

Getting a handle on sample characterization starts with a survey of
the variable names.

<<>>=
varNames(ALL)
@

In addition, we need to know the values that each variable takes.
Listing them all can be cumbersome, so \texttt{varTable} has
parameters that control how much detail is reported.

<<>>=
varTable(ALL, max=4)
@

Here only the first 4 attributes are shown; by default, all attributes
are reported.  Note that the list of values for each attribute is
truncated and is given in character mode.  We can show the full range
of values using the \texttt{full} parameter.

<<>>=
varTable(ALL, full=TRUE, max=4)
@

\end{document}