%\VignetteIndexEntry{BETR Vignette}
%\VignetteDepends{betr, Biobase}
%\VignetteKeywords{Expression, TimeCourse}
%\VignettePackage{betr}
\documentclass{article}

\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Rmethod}[1]{{\texttt{#1}}}
\newcommand{\Rcode}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\textsf{#1}}}
\newcommand{\Rclass}[1]{{\textit{#1}}}
\newcommand{\betr}{\Rpackage{betr }}

\usepackage{graphicx}

\begin{document}

\title{An Introduction to the BETR Package}
\date{February, 2011}
\author{Martin Aryee\footnote{aryee@jhu}}
\maketitle

\begin{center}
Department of Oncology\\Johns Hopkins School of Medicine\\Baltimore, MD, USA
\end{center}

<<>>=
options(width=60)
options(continue=" ")
options(prompt="R> ")
@

\section{Introduction}
The \betr package implements the BETR (Bayesian Estimation of Temporal Regulation) algorithm to rank differentially expressed genes in microarray time-course data. It calculates the probability of differential expression for each feature (gene) in a data set. The algorithm takes the correlations between time points into account, resulting in increased sensitivity compared to analyzing each time point in isolation. It can be used to make comparisons between two conditions (e.g. two treatments) or within a single condition where the first baseline measurement is compared to subsequent time points (e.g. a developmental time-course).

A key step in the algorithm involves fitting two different models to each gene's data. The first assumes no differential expression, whereas the second allows for correlated differential expression between time points. By considering which of the two models better fits the data, we can calculate the probability of differential expression for each gene. The algorithm is described in detail in \cite{Aryee2009}.

\section{Example}
We will use the simulated 1000-gene dataset included in the \betr package for this demonstration.
The data represent an experiment comparing gene expression in \textit{StrainA} and \textit{StrainB}. There are 3 replicates from each strain and each replicate is sampled at 4 time points. The samples are hybridized to one-color arrays, giving a total of 24 arrays.

<<>>=
library(betr)
library(Biobase)
data("timeEset")
pData(timeEset)[1:15,]
@

With the \betr package loaded, we run the \Rfunction{betr} function to calculate the probability of differential expression for each gene.

<<>>=
prob <- betr(eset=timeEset,
             cond=pData(timeEset)$strain,
             timepoint=pData(timeEset)$time,
             replicate=pData(timeEset)$rep,
             alpha=0.05)
head(prob)
@

The parameters used include:
\begin{itemize}
\item \Rcode{eset}: an \Rclass{ExpressionSet} object containing the experimental data
\item \Rcode{cond}: a vector indicating the experimental condition (strain in this example) for each sample
\item \Rcode{timepoint}: a vector indicating the time point for each sample
\item \Rcode{replicate}: a vector indicating which replicate each sample belongs to
\item \Rcode{alpha}: the acceptable false positive rate
\end{itemize}

The function returns a vector containing the probability of differential expression for each feature (row) in the data set. Genes with the strongest evidence for differential expression will have values close to 1.

\bibliography{betrVignette}
\bibliographystyle{plain}

\section{Details}
This document was written using:

<<>>=
sessionInfo()
@

\end{document}