%\VignetteIndexEntry{PDInfo Package Building Affymetrix Mapping Chips} %\VignetteKeywords{SNP, Mapping} %\VignettePackage{pdInfoBuilder} \documentclass{article} \newcommand{\Rfunction}[1]{{\texttt{#1}}} \newcommand{\Rmethod}[1]{{\texttt{#1}}} \newcommand{\Rcode}[1]{{\texttt{#1}}} \newcommand{\Robject}[1]{{\texttt{#1}}} \newcommand{\Rpackage}[1]{{\textsf{#1}}} \newcommand{\Rclass}[1]{{\textit{#1}}} \begin{document} \title{Building a PDInfo Package for an Affymetrix Mapping Chip} \date{November, 2006} \author{Seth Falcon \and Benilton Carvalho} \maketitle <>= options(width=60) options(continue=" ") options(prompt="R> ") @ \section{IMPORTANT!} Users are asked to download the packages for Affymetrix SNP chips from the BioConductor website! The contents of this vignette explain the essence of the builder for SNP chips. \section{What you will need} Aside from the \Rpackage{pdInfoBuilder} package and its dependencies, you will need the following files in order to follow along and build a ``PDInfo'' package for the Affymetrix Mapping 250K NSP chip. \begin{itemize} \item \verb+Mapping250K_Nsp.cdf+: The binary CDF file from Affymetrix containing feature-level annotation. \item \verb+Mapping250K_Nsp_annot.csv+: Delimited text file from Affymetrix containing featureSet-level annotation. \item \verb+Mapping250K_Nsp_probe_tab+: Delimited text file from Affymetrix containing probe sequence information for all PM probes. \end{itemize} \section{Building a PDInfo package} Assuming the source files described above are in your current working directory, the following commands will produce a PDInfo package. <>= library("pdInfoBuilder") cdfFile <- "Mapping250K_Nsp.cdf" csvAnno <- "Mapping250K_Nsp_annot.csv" csvSeq <- "Mapping250K_Nsp_probe_tab" pkg <- new("AffySNPPDInfoPkgSeed", version="0.1.5", author="Seth Falcon", email="sfalcon@fhcrc.org", biocViews="AnnotationData", genomebuild="NCBI Build 35, May 2004", cdfFile=cdfFile, csvAnnoFile=csvAnno, csvSeqFile=csvSeq) makePdInfoPackage(pkg, destDir=".") @ We are able to complete the above step on a dual-CPU dual-core 3GHz Xeon box with 8GB RAM in about 20 minutes. The only step that requires a significant amount of RAM is generating the coded sequence matrix. I haven't yet looked carefully at it, but I think it should run on a 4GB system. \end{document}